Conference Object

Reliability Generalization Meta-Analysis: A comparison of statistical analytic strategies

Author(s) / Creator(s)

López-Ibáñez, Carmen
Blázquez-Rincón, Desirée
Sánchez-Meca, Julio

Abstract / Description

Background: Reliability, defined as the replicability of scores, is an important psychometric property of a test. A common error is to interpret reliability as inherent to the test rather than as a property of the sample data (Sánchez-Meca, López-Pina, & López-López, 2009; Sánchez-Meca, López-Pina, & López-López, 2012). The Reliability Generalization (hereafter RG) meta-analytic approach has proven useful for addressing this issue (Vacha-Haase, 1998). RG analyzes the variability of reliability coefficients across the different applications of a test, in order to investigate the extent to which the reliability of test scores can be generalized to different applications (Sánchez-Meca et al., 2012). Specifically, an RG study takes the reliability coefficients found in different studies of the same test as the dependent variable and the characteristics of those studies as predictors of their variability (Sánchez-Meca et al., 2012). One of the main objectives of RG studies is therefore to obtain an average reliability coefficient. Feldt and Charter (2006) presented six procedures for obtaining it. Each can be applied unweighted or weighted by sample size, which gives twelve procedures for averaging reliability coefficients (Sánchez-Meca et al., 2012). The first averages the alpha coefficients directly, without transforming them. The second, as defined by Feldt and Charter (2006), is the value that reproduces the average of the standard errors of measurement. The third transforms the coefficients to Fisher's Z, obtains the weighted average, and transforms the result back to an alpha coefficient (assuming that the alpha value is equivalent to that obtained by parallel forms). The fourth, proposed by Hakstian and Whalen (1976), applies a cube-root transformation that normalizes the distribution. The fifth uses the reliability index, that is, the square root of the reliability coefficient. Finally, the sixth applies Fisher's Z transformation to the reliability index and then back-transforms the result, as in procedure 3. To examine the differences among these methods, Sánchez-Meca et al. (2012) carried out a simulation study in which they tested each procedure in its weighted and unweighted form. They found differences among them: in terms of both the mean squared error and the bias of the estimator, procedures 2 and 4 yielded the best results. They also observed better results when the coefficients were weighted by the sample size of the empirical studies than when they were unweighted.
Objectives: This study aims to determine whether these differences are also found when the procedures are applied to real RG meta-analyses. We also included a seventh transformation, proposed by Bonett (2002), which consists of calculating the natural logarithm of the complement of the coefficient, ln(1 − α). We expect to find differences among the methods for pooling reliability coefficients and among their corresponding 95% confidence intervals (Sánchez-Meca, López-López, & López-Pina, 2013).
Method: All RG meta-analyses, published or not, that reported the database with the individual reliability coefficients were selected for this study. The search is being conducted through the Google Scholar and Scopus search engines. Because Cronbach's alpha is the reliability coefficient most commonly reported by empirical studies, we focused on meta-analyses that reported this type of reliability. To compare the results of the procedures, we established two comparison measures: the differences between the average alpha values obtained with the different procedures, and the width of the confidence interval around the average reliability coefficient.
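
The seven averaging procedures described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `average_alpha`, the dictionary keys, and the example values are hypothetical, and the transformation formulas are one reading of the verbal descriptions. The sketch assumes every alpha lies strictly between 0 and 1, takes the standard error of measurement for procedure 2 as sqrt(1 − α) on a unit-variance scale, and uses the sample sizes themselves as weights.

```python
import math

# Each entry maps a (hypothetical) label to a pair of functions:
# (transform applied to each alpha, back-transform applied to the mean).
TRANSFORMS = {
    "1_raw": (lambda a: a, lambda t: t),
    # Standard error of measurement, assuming unit score variance.
    "2_sem": (lambda a: math.sqrt(1 - a), lambda t: 1 - t ** 2),
    "3_fisher_z": (math.atanh, math.tanh),
    # Cube-root transformation attributed to Hakstian and Whalen (1976).
    "4_cube_root": (lambda a: (1 - a) ** (1 / 3), lambda t: 1 - t ** 3),
    # Reliability index: square root of the reliability coefficient.
    "5_index": (math.sqrt, lambda t: t ** 2),
    "6_fisher_z_index": (lambda a: math.atanh(math.sqrt(a)),
                         lambda t: math.tanh(t) ** 2),
    # Log of the complement, attributed to Bonett (2002).
    "7_log_complement": (lambda a: math.log(1 - a),
                         lambda t: 1 - math.exp(t)),
}


def average_alpha(alphas, sample_sizes=None):
    """Return the average alpha under each of the seven procedures.

    If sample_sizes is None the average is unweighted; otherwise each
    transformed coefficient is weighted by its study's sample size.
    """
    if any(not 0 < a < 1 for a in alphas):
        raise ValueError("this sketch assumes 0 < alpha < 1")
    weights = sample_sizes if sample_sizes is not None else [1] * len(alphas)
    if len(weights) != len(alphas):
        raise ValueError("alphas and sample_sizes must have equal length")
    total = sum(weights)
    results = {}
    for name, (forward, backward) in TRANSFORMS.items():
        mean_t = sum(forward(a) * w for a, w in zip(alphas, weights)) / total
        results[name] = backward(mean_t)
    return results


if __name__ == "__main__":
    # Made-up coefficients and sample sizes, for illustration only.
    for label, value in average_alpha([0.70, 0.80, 0.90], [50, 120, 300]).items():
        print(f"{label}: {value:.4f}")
```

Calling `average_alpha(alphas)` without sample sizes gives the unweighted versions, so the two calls together cover the weighted and unweighted variants of each procedure.
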
The confidence intervals were calculated under four approaches: the fixed-effect (FE) model (Hedges & Olkin, 1985; Konstantopoulos & Hedges, 2009), the random-effects (RE) model (Hedges & Vevea, 1998; Raudenbush, 2009), the varying-coefficient (VC) model advocated by Bonett (2008, 2009, 2010), and the improved method proposed by Knapp and Hartung (2003) under the random-effects model.
Conclusion: To be as comprehensive as possible, the search for RG meta-analyses to be included in this study will finish on December 31, 2018. Once the literature search is finished, the results of applying the different methods for averaging reliability coefficients and for constructing confidence intervals will be compared. Finally, the results will be discussed and recommendations will be made for meta-analysts interested in conducting RG meta-analyses.
References:
Bonett, D. G. (2002). Sample size requirements for testing and estimating coefficient alpha. Journal of Educational and Behavioral Statistics, 27(4), 335-340.
Bonett, D. G. (2010). Varying coefficient meta-analytic methods for alpha reliability. Psychological Methods, 15(4), 368-385.
Feldt, L. S., & Charter, R. A. (2006). Averaging internal consistency reliability coefficients. Educational and Psychological Measurement, 66(2), 215-227.
Hakstian, A. R., & Whalen, T. E. (1976). A k-sample significance test for independent alpha coefficients. Psychometrika, 41(2), 219-231.
López-Pina, J. A., Sánchez-Meca, J., & López-López, J. A. (2012). Métodos para promediar coeficientes alfa en los estudios de generalización de la fiabilidad. Psicothema, 24, 161-166.
Sánchez-Meca, J., López-López, J. A., & López-Pina, J. A. (2013). Some recommended statistical analytic practices when reliability generalization studies are conducted. British Journal of Mathematical and Statistical Psychology, 66, 402-425.
Sánchez-Meca, J., López-Pina, J. A., & López, J. A. (2009). Generalización de la fiabilidad: un enfoque metaanalítico aplicado a la fiabilidad. Fisioterapia, 31(6), 262-270.
Vacha-Haase, T. (1998). Reliability generalization: Exploring variance in measurement error affecting score reliability across studies. Educational and Psychological Measurement, 58(1), 6-20.

Persistent Identifier

Date of first publication

2019-05-30

Is part of

Research Synthesis 2019 incl. Pre-Conference Symposium Big Data in Psychology, Dubrovnik, Croatia

Publisher

ZPID (Leibniz Institute for Psychology Information)

Citation

López-Ibáñez, C., Blázquez-Rincón, D., & Sánchez-Meca, J. (2019). Reliability Generalization Meta-Analysis: A comparison of statistical analytic strategies. ZPID (Leibniz Institute for Psychology Information). https://doi.org/10.23668/psycharchives.2474
  • Author(s) / Creator(s)
    López-Ibáñez, Carmen
  • Author(s) / Creator(s)
    Blázquez-Rincón, Desirée
  • Author(s) / Creator(s)
    Sánchez-Meca, Julio
  • PsychArchives acquisition timestamp
    2019-06-11T14:18:18Z
  • Made available on
    2019-06-11T14:18:18Z
  • Date of first publication
    2019-05-30
  • Persistent Identifier
    https://hdl.handle.net/20.500.12034/2100
  • Persistent Identifier
    https://doi.org/10.23668/psycharchives.2474
  • Language of content
    eng
    en_US
  • Publisher
    ZPID (Leibniz Institute for Psychology Information)
    en_US
  • Is part of
    Research Synthesis 2019 incl. Pre-Conference Symposium Big Data in Psychology, Dubrovnik, Croatia
    en_US
  • Dewey Decimal Classification number(s)
    150
  • Title
    Reliability Generalization Meta-Analysis: A comparison of statistical analytic strategies
    en_US
  • DRO type
    conferenceObject
    en_US
  • Visible tag(s)
    ZPID Conferences and Workshops