Multilevel Models in Meta-Analysis: A Systematic Review of Their Application and Suggestions

Fernández-Castilla, Belén; Beretvas, S. Natasha; Onghena, Patrick; Van den Noortgate, Wim

Conference Object

Multilevel Models in Meta-Analysis: A Systematic Review of Their Application and Suggestions

Author(s) / Creator(s)

Fernández-Castilla, Belén

Beretvas, S. Natasha

Onghena, Patrick

Van den Noortgate, Wim

Abstract / Description

Introduction: Meta-analysis can be conceptualized as a multilevel analysis: effect sizes are nested within studies. Effect sizes vary due to sampling variance at Level 1, and possibly also due to systematic differences across studies at Level 2. Therefore, multilevel models and software can be used to perform meta-analysis. An advantage of using the multilevel framework for doing meta-analyses is the flexibility of multilevel models. For instance, additional levels can be added to deal with dependent effect sizes within and between studies. In primary studies, it is common to report multiple effect sizes extracted from the same sample. Also, studies might belong to different higher-level clusters, as countries or research groups. These two scenarios generate dependency among effect sizes, and for appropriately accounting for this dependency (and therefore avoid inflated Type I errors), additional levels can be added that explicitly account for the variation among effect sizes within and/or between studies. Besides hierarchical models, other non-purely hierarchical models have been also proposed for meta-analysis, such as Cross-Classified Random Effects models (CCREMs, Fernández-Castilla et al., 2018). Although multilevel models are very flexible, we suspect that applied researchers do not take advantage of all possibilities that these models offer. In fact, most published meta-analyses are restricted to three-level models despite some meta-analytic data require other model specifications, such as four- or five- level models or CCREMs. Therefore, the goal of this study is to describe how multilevel models are typically applied in meta-analysis and to illustrate how, in some meta-analyses, more sophisticated models could have been applied that accounts better for the (non) hierarchical data structure. Method: Meta-analyses that applied multilevel models with more than one random component were searched in June, 2018. We looked at the meta-analyses that cited the studies of Cheung (2014), Hox and De Leeuw (2003), Konstantopoulos (2011), Raudenbush and Bryk (1985), and Van den Noortgate, López-López, Marín-Martínez, & Sánchez-Meca (2013, 2014). We also searched in six electronic databases, using the strings “three-level meta-analysis” OR “multilevel meta-analysis” OR “multilevel meta-analytic review”. No date restriction was imposed. Meta-analysis were included if: a) effect sizes were combined using a multilevel model with more than one random component; b) The meta-analysis was included in a journal article, conference presentation or a dissertation; c) The meta-analysis was written in English, Spanish or Dutch. Results: The initial search resulted in 1,286 studies. After applying the inclusion criteria, we finally retrieved 178 meta-analyses. From these, 162 meta-analysis fitted a three-level model, 9 fitted a four-level model, 5 applied CCREMs, and 2 reported a five-level model. We could distinguish five situations in which other models different from the three-level model would have been more appropriate given the (non) hierarchical data structure: 1. A fourth level could have been added to model dependency within studies. For instance, Fischer and Boer (2011) specified a three-level model, were effect sizes (Level 1) were nested within studies (Level 2), nested within countries (Level 3). There were several effect sizes within studies, but this within-study variance was ignored. Therefore, it would have been appropriate to add an additional level to model between-outcomes (within-study) variance. 2. A fourth level could have been specified to deal with more sources of within-study dependencies. For instance, in O’Mara (2006), there were several interventions within studies, and that is why a three-level model was specified: Sampling variance (Level 1), between-interventions variance (Level 2), and between-studies variance (Level 3). However, there were 200 interventions and 460 effect sizes in total, meaning that each intervention led to multiple effect sizes, and that the dependency between these outcomes (within interventions) was not taken into account. A more appropriate model would have been a four-level model: Sampling variance (Level 1), between-outcomes variance (Level 2), between-comparisons variance (Level 3) and between-studies variance (Level 4). 3. A fourth level could have been added to take into account dependency across studies. In the study of Klomp and Valckx (2014), a three-level model was fitted because there were multiple outcomes within studies. In this case, some studies made use of the same big dataset, so a fourth level could have been added to model between-datasets variance. 4. A five-level model could have been applied to model additional within-study and between-study dependencies. In Rabl, Jayasinghe, Gerhart, and Kühlmann (2014), a three-level model was fitted, where effect sizes were nested within studies, nested within countries. There were several effect sizes within studies, so an additional level could have been added to model within-study variance. Furthermore, some studies used the same dataset, so another level could have been specified to estimate the between-datasets variance. The inclusion of these two additional levels would have led to a five-level model. 5. CCREM’s could have been applied instead of three-level models. In the study of Fisher, Hanke and Sibley (2012), effect sizes were nested within studies, nested within countries. However, studies were not completely nested within countries, but rather studies and countries were two cross-classified factors: in one study, effect sizes could come from different countries, and effect sizes from the same country could belong to different studies. Therefore, a CCREM model would have accounted better for this cross-classified data structure. Discussion: This systematic review shows how researchers using multilevel model typically apply three-level models to account for dependent effect sizes, although alternative model specifications, such as four- or five- level models or CCREMs, might be more correct given the nature of the data. We have given some examples of how alternative models could have been used for meta-analysis, and we encourage researchers to carefully consider the underlying data structure before selecting a specific multilevel model. Omitting levels in a multilevel analysis might increase the possibility of committing a Type I error. Therefore, the proper specification of the model is the only way to guarantee appropriate estimates of the combined effect size, standard errors, and variance components. References: Cheung, M. W. L. (2014). Modeling dependent effect sizes with three-level meta-analyses: A structural equation modeling approach. Psychological Methods, 19, 211-229. Fernández-Castilla, B., Maes, M., Declercq, L., Jamshidi, L., Beretvas, S. N., Onghena, P., & Van den Noortgate, W. (2018). A demonstration and evaluation of the use of cross-classified random-effects models for meta-analysis. Behavior Research Methods, 1-19. Fischer, R., & Boer, D. (2011). What is more important for national well-being: money or autonomy? A meta-analysis of well-being, burnout, and anxiety across 63 societies. Journal of Personality and Social Psychology, 101, 164-184. Fischer, R., Hanke, K., & Sibley, C. G. (2012). Cultural and institutional determinants of social dominance orientation: A cross‐cultural meta‐analysis of 27 societies. Political Psychology, 33, 437-467. Hox, J. J., & de Leeuw, E. D. (2003). Multilevel models for meta-analysis. In S. P. Reise & N. Duan (Eds.), Multilevel modeling: Methodological advances, issues, and applications (pp. 90–111). Mahwah, NJ: Erlbaum. Klomp, J., & Valckx, K. (2014). Natural disasters and economic growth: A meta-analysis. Global Environmental Change, 26, 183-195. Konstantopoulos, S. (2011). Fixed effects and variance components estimation in three-level meta-analysis. Research Synthesis Methods, 2, 61-76. O’Mara, A. J., Marsh, H. W., & Craven, R. G. (July, 2006). A Comprehensive Multilevel Model Meta-Analysis of Self-Concept Interventions. In Fourth International Biennial SELF Research Conference, Ann Arbor. Rabl, T., Jayasinghe, M., Gerhart, B., & Kühlmann, T. M. (2014). A meta-analysis of country differences in the high-performance work system–business performance relationship: The roles of national culture and managerial discretion. Journal of Applied Psychology, 99, 1011-1041. Raudenbush, S. W., & Bryk, A. S. (1985). Empirical Bayes meta-analysis. Journal of Educational Statistics, 10, 75-98. Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2013). Three-level meta-analysis of dependent effect sizes. Behavior Research Methods, 45, 576-594. Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2014). Meta-analysis of multiple outcomes: A multilevel approach. Behavior Research Methods, 47, 1274-1294.

Persistent Identifier

https://doi.org/10.23668/psycharchives.2478

Date of first publication

2019-05-31

Is part of

Research Synthesis 2019 incl. Pre-Conference Symposium Big Data in Psychology, Dubrovnik, Croatia

Publisher

ZPID (Leibniz Institute for Psychology Information)

Citation

Fernández-Castilla, B., Beretvas, S. N., Onghena, P., & Van Den Noortgate, W. (2019, May 31). Multilevel Models in Meta-Analysis: A Systematic Review of Their Application and Suggestions. ZPID (Leibniz Institute for Psychology Information). https://doi.org/10.23668/psycharchives.2478

1_190529_Belén_Reserach Synthesis Methods.pdf

Adobe PDF - 365.59KB

MD5: adf351c0cf24e50b99558f092635b17f

Sharing Level 0 (Public Use) CC-BY-SA 4.0

Download

Description: Conference Talk

There are no other versions of this object.

Author(s) / Creator(s)

Fernández-Castilla, Belén
Author(s) / Creator(s)

Beretvas, S. Natasha
Author(s) / Creator(s)

Onghena, Patrick
Author(s) / Creator(s)

Van den Noortgate, Wim
PsychArchives acquisition timestamp

2019-06-14T08:34:24Z
Made available on

2019-06-14T08:34:24Z
Date of first publication

2019-05-31
Abstract / Description

Introduction: Meta-analysis can be conceptualized as a multilevel analysis: effect sizes are nested within studies. Effect sizes vary due to sampling variance at Level 1, and possibly also due to systematic differences across studies at Level 2. Therefore, multilevel models and software can be used to perform meta-analysis. An advantage of using the multilevel framework for doing meta-analyses is the flexibility of multilevel models. For instance, additional levels can be added to deal with dependent effect sizes within and between studies. In primary studies, it is common to report multiple effect sizes extracted from the same sample. Also, studies might belong to different higher-level clusters, as countries or research groups. These two scenarios generate dependency among effect sizes, and for appropriately accounting for this dependency (and therefore avoid inflated Type I errors), additional levels can be added that explicitly account for the variation among effect sizes within and/or between studies. Besides hierarchical models, other non-purely hierarchical models have been also proposed for meta-analysis, such as Cross-Classified Random Effects models (CCREMs, Fernández-Castilla et al., 2018). Although multilevel models are very flexible, we suspect that applied researchers do not take advantage of all possibilities that these models offer. In fact, most published meta-analyses are restricted to three-level models despite some meta-analytic data require other model specifications, such as four- or five- level models or CCREMs. Therefore, the goal of this study is to describe how multilevel models are typically applied in meta-analysis and to illustrate how, in some meta-analyses, more sophisticated models could have been applied that accounts better for the (non) hierarchical data structure. Method: Meta-analyses that applied multilevel models with more than one random component were searched in June, 2018. We looked at the meta-analyses that cited the studies of Cheung (2014), Hox and De Leeuw (2003), Konstantopoulos (2011), Raudenbush and Bryk (1985), and Van den Noortgate, López-López, Marín-Martínez, & Sánchez-Meca (2013, 2014). We also searched in six electronic databases, using the strings “three-level meta-analysis” OR “multilevel meta-analysis” OR “multilevel meta-analytic review”. No date restriction was imposed. Meta-analysis were included if: a) effect sizes were combined using a multilevel model with more than one random component; b) The meta-analysis was included in a journal article, conference presentation or a dissertation; c) The meta-analysis was written in English, Spanish or Dutch. Results: The initial search resulted in 1,286 studies. After applying the inclusion criteria, we finally retrieved 178 meta-analyses. From these, 162 meta-analysis fitted a three-level model, 9 fitted a four-level model, 5 applied CCREMs, and 2 reported a five-level model. We could distinguish five situations in which other models different from the three-level model would have been more appropriate given the (non) hierarchical data structure: 1. A fourth level could have been added to model dependency within studies. For instance, Fischer and Boer (2011) specified a three-level model, were effect sizes (Level 1) were nested within studies (Level 2), nested within countries (Level 3). There were several effect sizes within studies, but this within-study variance was ignored. Therefore, it would have been appropriate to add an additional level to model between-outcomes (within-study) variance. 2. A fourth level could have been specified to deal with more sources of within-study dependencies. For instance, in O’Mara (2006), there were several interventions within studies, and that is why a three-level model was specified: Sampling variance (Level 1), between-interventions variance (Level 2), and between-studies variance (Level 3). However, there were 200 interventions and 460 effect sizes in total, meaning that each intervention led to multiple effect sizes, and that the dependency between these outcomes (within interventions) was not taken into account. A more appropriate model would have been a four-level model: Sampling variance (Level 1), between-outcomes variance (Level 2), between-comparisons variance (Level 3) and between-studies variance (Level 4). 3. A fourth level could have been added to take into account dependency across studies. In the study of Klomp and Valckx (2014), a three-level model was fitted because there were multiple outcomes within studies. In this case, some studies made use of the same big dataset, so a fourth level could have been added to model between-datasets variance. 4. A five-level model could have been applied to model additional within-study and between-study dependencies. In Rabl, Jayasinghe, Gerhart, and Kühlmann (2014), a three-level model was fitted, where effect sizes were nested within studies, nested within countries. There were several effect sizes within studies, so an additional level could have been added to model within-study variance. Furthermore, some studies used the same dataset, so another level could have been specified to estimate the between-datasets variance. The inclusion of these two additional levels would have led to a five-level model. 5. CCREM’s could have been applied instead of three-level models. In the study of Fisher, Hanke and Sibley (2012), effect sizes were nested within studies, nested within countries. However, studies were not completely nested within countries, but rather studies and countries were two cross-classified factors: in one study, effect sizes could come from different countries, and effect sizes from the same country could belong to different studies. Therefore, a CCREM model would have accounted better for this cross-classified data structure. Discussion: This systematic review shows how researchers using multilevel model typically apply three-level models to account for dependent effect sizes, although alternative model specifications, such as four- or five- level models or CCREMs, might be more correct given the nature of the data. We have given some examples of how alternative models could have been used for meta-analysis, and we encourage researchers to carefully consider the underlying data structure before selecting a specific multilevel model. Omitting levels in a multilevel analysis might increase the possibility of committing a Type I error. Therefore, the proper specification of the model is the only way to guarantee appropriate estimates of the combined effect size, standard errors, and variance components. References: Cheung, M. W. L. (2014). Modeling dependent effect sizes with three-level meta-analyses: A structural equation modeling approach. Psychological Methods, 19, 211-229. Fernández-Castilla, B., Maes, M., Declercq, L., Jamshidi, L., Beretvas, S. N., Onghena, P., & Van den Noortgate, W. (2018). A demonstration and evaluation of the use of cross-classified random-effects models for meta-analysis. Behavior Research Methods, 1-19. Fischer, R., & Boer, D. (2011). What is more important for national well-being: money or autonomy? A meta-analysis of well-being, burnout, and anxiety across 63 societies. Journal of Personality and Social Psychology, 101, 164-184. Fischer, R., Hanke, K., & Sibley, C. G. (2012). Cultural and institutional determinants of social dominance orientation: A cross‐cultural meta‐analysis of 27 societies. Political Psychology, 33, 437-467. Hox, J. J., & de Leeuw, E. D. (2003). Multilevel models for meta-analysis. In S. P. Reise & N. Duan (Eds.), Multilevel modeling: Methodological advances, issues, and applications (pp. 90–111). Mahwah, NJ: Erlbaum. Klomp, J., & Valckx, K. (2014). Natural disasters and economic growth: A meta-analysis. Global Environmental Change, 26, 183-195. Konstantopoulos, S. (2011). Fixed effects and variance components estimation in three-level meta-analysis. Research Synthesis Methods, 2, 61-76. O’Mara, A. J., Marsh, H. W., & Craven, R. G. (July, 2006). A Comprehensive Multilevel Model Meta-Analysis of Self-Concept Interventions. In Fourth International Biennial SELF Research Conference, Ann Arbor. Rabl, T., Jayasinghe, M., Gerhart, B., & Kühlmann, T. M. (2014). A meta-analysis of country differences in the high-performance work system–business performance relationship: The roles of national culture and managerial discretion. Journal of Applied Psychology, 99, 1011-1041. Raudenbush, S. W., & Bryk, A. S. (1985). Empirical Bayes meta-analysis. Journal of Educational Statistics, 10, 75-98. Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2013). Three-level meta-analysis of dependent effect sizes. Behavior Research Methods, 45, 576-594. Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2014). Meta-analysis of multiple outcomes: A multilevel approach. Behavior Research Methods, 47, 1274-1294.

en_US
Citation

Fernández-Castilla, B., Beretvas, S. N., Onghena, P., & Van Den Noortgate, W. (2019, May 31). Multilevel Models in Meta-Analysis: A Systematic Review of Their Application and Suggestions. ZPID (Leibniz Institute for Psychology Information). https://doi.org/10.23668/psycharchives.2478

en
Persistent Identifier

https://hdl.handle.net/20.500.12034/2104
Persistent Identifier

https://doi.org/10.23668/psycharchives.2478
Language of content

eng

en_US
Publisher

ZPID (Leibniz Institute for Psychology Information)

en_US
Is part of

Research Synthesis 2019 incl. Pre-Conference Symposium Big Data in Psychology, Dubrovnik, Croatia

en_US
Dewey Decimal Classification number(s)

150
Title

Multilevel Models in Meta-Analysis: A Systematic Review of Their Application and Suggestions

en_US
DRO type

conferenceObject

en_US
Visible tag(s)

ZPID Conferences and Workshops