Conference Object

Multilevel Models in Meta-Analysis: A Systematic Review of Their Application and Suggestions

Author(s) / Creator(s)

Fernández-Castilla, Belén
Beretvas, S. Natasha
Onghena, Patrick
Van den Noortgate, Wim

Abstract / Description

Introduction: Meta-analysis can be conceptualized as a multilevel analysis in which effect sizes are nested within studies. Effect sizes vary due to sampling variance at Level 1, and possibly also due to systematic differences across studies at Level 2. Multilevel models and software can therefore be used to perform meta-analysis. An advantage of the multilevel framework is its flexibility: additional levels can be added to deal with dependent effect sizes within and between studies. Primary studies commonly report multiple effect sizes extracted from the same sample, and studies might belong to different higher-level clusters, such as countries or research groups. Both scenarios generate dependency among effect sizes. To account for this dependency appropriately (and thereby avoid inflated Type I error rates), additional levels can be added that explicitly model the variation among effect sizes within and/or between studies. Besides hierarchical models, models that are not purely hierarchical have also been proposed for meta-analysis, such as cross-classified random-effects models (CCREMs; Fernández-Castilla et al., 2018). Although multilevel models are very flexible, we suspect that applied researchers do not take advantage of all the possibilities these models offer. In fact, most published meta-analyses are restricted to three-level models, even though some meta-analytic data require other model specifications, such as four- or five-level models or CCREMs. The goal of this study is therefore to describe how multilevel models are typically applied in meta-analysis and to illustrate how, in some meta-analyses, more sophisticated models could have been applied that account better for the (non-)hierarchical data structure.

Method: In June 2018, we searched for meta-analyses that applied multilevel models with more than one random component. We examined the meta-analyses that cited Cheung (2014), Hox and de Leeuw (2003), Konstantopoulos (2011), Raudenbush and Bryk (1985), or Van den Noortgate, López-López, Marín-Martínez, and Sánchez-Meca (2013, 2014). We also searched six electronic databases using the strings “three-level meta-analysis” OR “multilevel meta-analysis” OR “multilevel meta-analytic review”. No date restriction was imposed. Meta-analyses were included if (a) effect sizes were combined using a multilevel model with more than one random component; (b) the meta-analysis was reported in a journal article, conference presentation, or dissertation; and (c) the meta-analysis was written in English, Spanish, or Dutch.

Results: The initial search yielded 1,286 studies. After applying the inclusion criteria, we retained 178 meta-analyses. Of these, 162 fitted a three-level model, 9 fitted a four-level model, 5 applied CCREMs, and 2 reported a five-level model. We distinguished five situations in which a model other than the three-level model would have been more appropriate given the (non-)hierarchical data structure:

1. A fourth level could have been added to model dependency within studies. For instance, Fischer and Boer (2011) specified a three-level model in which effect sizes (Level 1) were nested within studies (Level 2), which were nested within countries (Level 3). There were several effect sizes within studies, but this within-study variance was ignored. It would therefore have been appropriate to add a level to model between-outcomes (within-study) variance.

2. A fourth level could have been specified to deal with more sources of within-study dependency. For instance, in O’Mara, Marsh, and Craven (2006), there were several interventions within studies, which is why a three-level model was specified: sampling variance (Level 1), between-interventions variance (Level 2), and between-studies variance (Level 3). However, there were 200 interventions and 460 effect sizes in total, meaning that each intervention led to multiple effect sizes and that the dependency between these outcomes (within interventions) was not taken into account. A more appropriate model would have been a four-level model: sampling variance (Level 1), between-outcomes variance (Level 2), between-comparisons variance (Level 3), and between-studies variance (Level 4).

3. A fourth level could have been added to take into account dependency across studies. In Klomp and Valckx (2014), a three-level model was fitted because there were multiple outcomes within studies. In this case, some studies made use of the same large dataset, so a fourth level could have been added to model between-datasets variance.

4. A five-level model could have been applied to model additional within-study and between-study dependencies. In Rabl, Jayasinghe, Gerhart, and Kühlmann (2014), a three-level model was fitted in which effect sizes were nested within studies, which were nested within countries. There were several effect sizes within studies, so a level could have been added to model within-study variance. Furthermore, some studies used the same dataset, so another level could have been specified to estimate the between-datasets variance. Including these two additional levels would have led to a five-level model.

5. CCREMs could have been applied instead of three-level models. In Fischer, Hanke, and Sibley (2012), effect sizes were modeled as nested within studies, which were nested within countries. However, studies were not completely nested within countries. Rather, studies and countries were two cross-classified factors: within one study, effect sizes could come from different countries, and effect sizes from the same country could belong to different studies. A CCREM would therefore have accounted better for this cross-classified data structure.

Discussion: This systematic review shows that researchers using multilevel models typically apply three-level models to account for dependent effect sizes, although alternative model specifications, such as four- or five-level models or CCREMs, might be more appropriate given the nature of the data. We have given some examples of how alternative models could have been used for meta-analysis, and we encourage researchers to carefully consider the underlying data structure before selecting a specific multilevel model. Omitting levels in a multilevel analysis might increase the probability of committing a Type I error. Proper specification of the model is therefore essential for obtaining appropriate estimates of the combined effect size, standard errors, and variance components.

References:

Cheung, M. W. L. (2014). Modeling dependent effect sizes with three-level meta-analyses: A structural equation modeling approach. Psychological Methods, 19, 211-229.
Fernández-Castilla, B., Maes, M., Declercq, L., Jamshidi, L., Beretvas, S. N., Onghena, P., & Van den Noortgate, W. (2018). A demonstration and evaluation of the use of cross-classified random-effects models for meta-analysis. Behavior Research Methods, 1-19.
Fischer, R., & Boer, D. (2011). What is more important for national well-being: Money or autonomy? A meta-analysis of well-being, burnout, and anxiety across 63 societies. Journal of Personality and Social Psychology, 101, 164-184.
Fischer, R., Hanke, K., & Sibley, C. G. (2012). Cultural and institutional determinants of social dominance orientation: A cross-cultural meta-analysis of 27 societies. Political Psychology, 33, 437-467.
Hox, J. J., & de Leeuw, E. D. (2003). Multilevel models for meta-analysis. In S. P. Reise & N. Duan (Eds.), Multilevel modeling: Methodological advances, issues, and applications (pp. 90–111). Mahwah, NJ: Erlbaum.
Klomp, J., & Valckx, K. (2014). Natural disasters and economic growth: A meta-analysis. Global Environmental Change, 26, 183-195.
Konstantopoulos, S. (2011). Fixed effects and variance components estimation in three-level meta-analysis. Research Synthesis Methods, 2, 61-76.
O’Mara, A. J., Marsh, H. W., & Craven, R. G. (2006, July). A comprehensive multilevel model meta-analysis of self-concept interventions. In Fourth International Biennial SELF Research Conference, Ann Arbor.
Rabl, T., Jayasinghe, M., Gerhart, B., & Kühlmann, T. M. (2014). A meta-analysis of country differences in the high-performance work system–business performance relationship: The roles of national culture and managerial discretion. Journal of Applied Psychology, 99, 1011-1041.
Raudenbush, S. W., & Bryk, A. S. (1985). Empirical Bayes meta-analysis. Journal of Educational Statistics, 10, 75-98.
Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2013). Three-level meta-analysis of dependent effect sizes. Behavior Research Methods, 45, 576-594.
Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2014). Meta-analysis of multiple outcomes: A multilevel approach. Behavior Research Methods, 47, 1274-1294.
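
Illustrative sketch (not part of the original abstract): the Discussion encourages researchers to inspect the underlying data structure before choosing a model. One rough way to do this is to check whether one grouping factor is strictly nested within another or whether the two are cross-classified, as in the fifth situation described in the Results. The Python sketch below does this for a table of effect sizes. The function name `nesting_report`, the column names, and the toy records are hypothetical and are not taken from any of the reviewed meta-analyses.

```python
from collections import defaultdict


def nesting_report(rows, lower, upper):
    """Classify the relation between two grouping factors.

    Returns ("nested", {}) when every `lower` unit (e.g. a study) maps to
    exactly one `upper` unit (e.g. a country). Otherwise returns
    ("cross-classified", crossing), where `crossing` lists the `lower`
    units that appear under more than one `upper` unit.
    """
    uppers_per_lower = defaultdict(set)
    for row in rows:
        uppers_per_lower[row[lower]].add(row[upper])
    crossing = {
        unit: sorted(uppers)
        for unit, uppers in uppers_per_lower.items()
        if len(uppers) > 1
    }
    return ("cross-classified" if crossing else "nested"), crossing


# Hypothetical toy data: one row per effect size.
effects = [
    {"es_id": 1, "study": "A", "country": "NZ"},
    {"es_id": 2, "study": "A", "country": "DE"},  # study A spans two countries
    {"es_id": 3, "study": "B", "country": "DE"},
    {"es_id": 4, "study": "C", "country": "BR"},
]

structure, crossing = nesting_report(effects, "study", "country")
print(structure, crossing)
```

In this toy example, study A contributes effect sizes from two countries, so studies are not nested within countries and a purely hierarchical specification would misrepresent the data. The same check could be applied to other pairs of factors, such as studies and shared datasets, under the assumption that each effect size is labeled with both.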

Date of first publication

2019-05-31

Is part of

Research Synthesis 2019 incl. Pre-Conference Symposium Big Data in Psychology, Dubrovnik, Croatia

Publisher

ZPID (Leibniz Institute for Psychology Information)

Citation

Fernández-Castilla, B., Beretvas, S. N., Onghena, P., & Van den Noortgate, W. (2019, May 31). Multilevel Models in Meta-Analysis: A Systematic Review of Their Application and Suggestions. ZPID (Leibniz Institute for Psychology Information). https://doi.org/10.23668/psycharchives.2478
  • PsychArchives acquisition timestamp
    2019-06-14T08:34:24Z
  • Made available on
    2019-06-14T08:34:24Z
  • Persistent Identifier
    https://hdl.handle.net/20.500.12034/2104
  • Persistent Identifier
    https://doi.org/10.23668/psycharchives.2478
  • Language of content
    eng
  • Dewey Decimal Classification number(s)
    150
  • DRO type
    conferenceObject
  • Visible tag(s)
    ZPID Conferences and Workshops