Reliability Generalization Meta-Analysis: A comparison of statistical analytic strategies

López-Ibáñez, Carmen; Blázquez-Rincón, Desirée; Sánchez-Meca, Julio

Conference Object

Author(s) / Creator(s)

López-Ibáñez, Carmen

Blázquez-Rincón, Desirée

Sánchez-Meca, Julio

Abstract / Description

Background: An important psychometric property of a test is reliability, defined as the replicability of its scores. A common mistake is to treat reliability as inherent to the test rather than as a property of the sample data (Sánchez-Meca, López-Pina, & López-López, 2009; Sánchez-Meca, López-Pina, & López-López, 2012). The Reliability Generalization (hereafter RG) meta-analytic approach was proposed to address this issue (Vacha-Haase, 1998). RG analyzes the variability of reliability coefficients across the different applications of a test, in order to investigate the extent to which the reliability of the test scores can be generalized across applications (Sánchez-Meca et al., 2012). Specifically, an RG study collects both the reliability coefficients reported in different studies of the same test and the characteristics of those studies, which serve as predictors of the variability of the reliability coefficients (the dependent variable) (Sánchez-Meca et al., 2012). One of the main objectives of an RG study is therefore to obtain an average reliability coefficient.

Feldt and Charter (2006) presented six procedures for obtaining it. Each can be applied unweighted or weighted by sample size, which gives twelve procedures for averaging reliability coefficients (Sánchez-Meca et al., 2012). The first averages the alpha coefficients directly, without transforming them. The second is defined by Feldt and Charter (2006) as the value that doubles the average of the standard errors of measurement. The third transforms the alpha coefficients to Fisher's Z, computes the weighted average, and transforms the result back to the alpha metric (assuming that the alpha value is equivalent to that obtained with parallel forms). The fourth, proposed by Hakstian and Whalen (1976), applies a cube-root transformation that normalizes the distribution. The fifth uses the reliability index, that is, the square root of the reliability coefficient. Finally, the sixth applies Fisher's Z transformation to the reliability index and then transforms the result back, as in procedure 3. To examine the differences among these methods, Sánchez-Meca et al. (2012) carried out a simulation study that tested each procedure in its weighted and unweighted form. They found differences among them: in terms of both the mean squared error and the bias of the estimator, procedures 2 and 4 yielded the best results. They also observed better results when the coefficients were weighted by the sample size of the empirical studies than when they were unweighted.

Objectives: This study aims to determine whether these differences also appear when the procedures are applied to real RG meta-analyses. We also included a seventh transformation, proposed by Bonett (2002), which takes the natural logarithm of the complement of the coefficient, ln(1 − alpha). We expect to find differences among the methods for pooling reliability coefficients and in their corresponding 95% confidence intervals (Sánchez-Meca, López-López, & López-Pina, 2013).

Method: All RG meta-analyses, published or not, that reported the database with the individual reliability coefficients were selected for this study. The search is being conducted through the Google Scholar and Scopus search engines. Because Cronbach's alpha is the reliability coefficient most commonly reported in empirical studies, we focused on meta-analyses that reported this type of reliability coefficient. To compare the results of the procedures, we established two comparison measures: the differences between the average alpha values obtained with the different procedures, and the width of the confidence interval around the average reliability coefficient. The confidence intervals were calculated under four approaches: the fixed-effect (FE) model (Hedges & Olkin, 1985; Konstantopoulos & Hedges, 2009), the random-effects (RE) model (Hedges & Vevea, 1998; Raudenbush, 2009), the varying-coefficient (VC) model advocated by Bonett (2008, 2009, 2010), and the improved method proposed by Knapp and Hartung (2003) under the random-effects model.

Conclusion: To be as comprehensive as possible, the search for RG meta-analyses to be included in this study will finish on December 31, 2018. Once the literature search is finished, the results of applying the different methods for averaging reliability coefficients and for constructing confidence intervals will be compared. Finally, the results will be discussed and recommendations will be made for meta-analysts interested in conducting RG meta-analyses.

References:

Bonett, D. G. (2002). Sample size requirements for testing and estimating coefficient alpha. Journal of Educational and Behavioral Statistics, 27(4), 335-340.

Bonett, D. G. (2010). Varying coefficient meta-analytic methods for alpha reliability. Psychological Methods, 15(4), 368-385.

Feldt, L. S., & Charter, R. A. (2006). Averaging internal consistency reliability coefficients. Educational and Psychological Measurement, 66(2), 215-227.

Hakstian, A. R., & Whalen, T. E. (1976). A k-sample significance test for independent alpha coefficients. Psychometrika, 41(2), 219-231.

López-Pina, J. A., Sánchez-Meca, J., & López-López, J. A. (2012). Métodos para promediar coeficientes alfa en los estudios de generalización de la fiabilidad. Psicothema, 24, 161-166.

Sánchez-Meca, J., López-López, J. A., & López-Pina, J. A. (2013). Some recommended statistical analytic practices when reliability generalization studies are conducted. British Journal of Mathematical and Statistical Psychology, 66, 402-425.

Sánchez-Meca, J., López-Pina, J. A., & López, J. A. (2009). Generalización de la fiabilidad: un enfoque metaanalítico aplicado a la fiabilidad. Fisioterapia, 31(6), 262-270.

Vacha-Haase, T. (1998). Reliability generalization: Exploring variance in measurement error affecting score reliability across studies. Educational and Psychological Measurement, 58(1), 6-20.
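
Illustration: the averaging procedures described in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' code. The function name pooled_alpha, the TRANSFORMS table, and the example data are hypothetical. The cube-root transformation is assumed to act on 1 − alpha, following the usual reading of Hakstian and Whalen (1976), and Bonett's transformation is taken as ln(1 − alpha). Procedure 2 is left out because the abstract does not state its formula.

```python
import math

# Each entry maps a procedure name to a (forward, inverse) pair of functions.
# Assumption: the cube-root transformation acts on 1 - alpha.
# Procedure 2 of Feldt and Charter (2006) is omitted: the abstract does not
# give its formula.
TRANSFORMS = {
    "raw alpha": (lambda a: a, lambda t: t),
    "Fisher Z of alpha": (math.atanh, math.tanh),
    "cube root": (lambda a: (1 - a) ** (1 / 3), lambda t: 1 - t ** 3),
    "reliability index": (math.sqrt, lambda t: t ** 2),
    "Fisher Z of reliability index": (
        lambda a: math.atanh(math.sqrt(a)),
        lambda t: math.tanh(t) ** 2,
    ),
    "log complement": (lambda a: math.log(1 - a), lambda t: 1 - math.exp(t)),
}


def pooled_alpha(alphas, sizes=None, method="raw alpha"):
    """Average alpha coefficients on a transformed scale, then back-transform.

    If sizes is None the average is unweighted; otherwise each coefficient
    is weighted by the sample size of its study.
    """
    if not all(0 < a < 1 for a in alphas):
        raise ValueError("every alpha must lie strictly between 0 and 1")
    weights = [1] * len(alphas) if sizes is None else list(sizes)
    if len(weights) != len(alphas):
        raise ValueError("alphas and sizes must have the same length")
    forward, inverse = TRANSFORMS[method]
    mean_t = sum(w * forward(a) for a, w in zip(alphas, weights)) / sum(weights)
    return inverse(mean_t)


if __name__ == "__main__":
    # Made-up coefficients and sample sizes, for illustration only.
    alphas = [0.80, 0.85, 0.90]
    sizes = [50, 100, 150]
    for name in TRANSFORMS:
        unweighted = pooled_alpha(alphas, method=name)
        weighted = pooled_alpha(alphas, sizes, method=name)
        print(f"{name}: unweighted={unweighted:.4f}, weighted={weighted:.4f}")
```

Running the script prints one unweighted and one weighted average per procedure, which makes the first comparison measure named in the Method section (differences between the average alpha values) easy to inspect. Confidence intervals are not covered here, because they depend on the sampling variance assumed under each model.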

Persistent Identifier

https://doi.org/10.23668/psycharchives.2474

Date of first publication

2019-05-30

Is part of

Research Synthesis 2019 incl. Pre-Conference Symposium Big Data in Psychology, Dubrovnik, Croatia

Publisher

ZPID (Leibniz Institute for Psychology Information)

Citation

López-Ibáñez, C., Blázquez-Rincón, D., & Sánchez-Meca, J. (2019). Reliability Generalization Meta-Analysis: A comparison of statistical analytic strategies. ZPID (Leibniz Institute for Psychology Information). https://doi.org/10.23668/psycharchives.2474

2_Dbk Carmen.pdf

Adobe PDF - 2.16MB

MD5: 8802c9e0700a092ec16bb14e0d22cb93

Sharing Level 0 (Public Use) CC-BY-SA 4.0

Description: Conference Talk

PsychArchives acquisition timestamp

2019-06-11T14:18:18Z
Made available on

2019-06-11T14:18:18Z
Persistent Identifier

https://hdl.handle.net/20.500.12034/2100
Language of content

eng

Dewey Decimal Classification number(s)

150
DRO type

conferenceObject

Visible tag(s)

ZPID Conferences and Workshops