Quantifying Replication Value: A formula-based approach to study selection in replication research.
Author(s) / Creator(s)
Isager, Peder Mortvedt
Abstract / Description
Background: The concept of replication is a central value of empirical science. At the same time, scientists do not regard every replication as equally valuable. Even though replications are a cornerstone of empirical science (Bertamini & Munafò, 2012; Falk, 1998; Jasny, Chin, Chong, & Vignieri, 2011; Koole & Lakens, 2012; Moonesinghe, Khoury, & Janssens, 2007; Rosenthal, 1990; Schmidt, 2009), most researchers will agree that conducting 20 direct replications of the classic and extremely robust Stroop color-naming task (Stroop, 1935) would not be the best way to spend one's grant money. This raises an important question: when is a replication of an empirical finding sufficiently valuable to the scientific community that it should be performed? Given limited resources, one could also ask: which among currently published findings are the most valuable to replicate? Some discussion of the circumstances under which replication efforts are more or less beneficial has already occurred in the wake of increased replication efforts in psychology (Brandt et al., 2014; Coles, Tiokhin, Scheel, Isager, & Lakens, 2018), and suggestions have recently been put forward for how to select target studies for replication (Field, Hoekstra, Bringmann, & van Ravenzwaaij, 2018; Kuehberger & Schulte-Mecklenbeck, 2018). However, a comprehensive evaluation of the factors that could be used to quantify the replication value of a study is currently lacking, which is becoming increasingly important now that more replication studies are funded, performed, and published. Objectives: We propose a quantitative approach to help researchers, editors, and funders evaluate and compare the replication value of original findings.
Our approach rests on two fundamental assumptions: (1) that close replication (LeBel, Berger, Campbell, & Loving, 2017; LeBel, McCarthy, Earp, Elson, & Vanpaemel, 2018) is, in principle, a worthwhile endeavor, and (2) that there are more original observations worth replicating than we currently have the resources to replicate. To help researchers determine which among many findings might be the most promising candidates for replication, we outline a formula-based approach that can relatively easily be used to quantify the replication value of original findings. We propose that two components determine the replication value of empirical findings: (i) the impact of the effect, and (ii) the corroboration of the effect. Impact indicates the influence that the effect has had on scientific theory, research activities, or on society. All else being equal, findings that have had more impact are more important to replicate than findings that have had less impact. Corroboration indicates the empirical observations bearing on the finding, as well as the quality of these observations. As the corroboration of a particular finding increases, it becomes less important to replicate this finding, relative to a finding with little corroboration. The purpose of a replication value formula is to clarify how one intends to weigh the factors one considers important against one another, and to standardize parts of the study selection procedure. Because these formulas can be calculated quickly (and sometimes even automatically), they can be powerful tools for exploring a large set of studies to discover original findings that are particularly replication-worthy, assuming that the input to the formula is meaningful. Their ultimate goal is to make sure that resources spent on replication are efficiently utilized and that all relevant options for study choice can be considered when a replication effort is initiated.
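The abstract does not specify the formula itself, so the following is a purely illustrative sketch under assumed proxies: impact proxied by citations per year, corroboration proxied by the number of existing close replications, and a hypothetical RV score that rises with impact and falls with corroboration. The names (`Study`, `replication_value`, `rank_by_rv`) and the example numbers are invented for illustration, not taken from the authors' work:

```python
# Hypothetical sketch of formula-based study selection.
# Assumed proxies: impact = citations per year; corroboration = count of
# existing close replications. RV = impact / (1 + corroboration).
from dataclasses import dataclass


@dataclass
class Study:
    title: str
    citations: int        # total citation count (impact proxy)
    years_since_pub: int
    n_replications: int   # existing close replications (corroboration proxy)


def replication_value(study: Study) -> float:
    """Toy RV score: impact divided by (1 + corroboration)."""
    impact = study.citations / max(study.years_since_pub, 1)
    return impact / (1 + study.n_replications)


def rank_by_rv(studies: list[Study]) -> list[Study]:
    """Sort studies from most to least replication-worthy."""
    return sorted(studies, key=replication_value, reverse=True)


# Invented example data: a highly cited but heavily corroborated classic
# versus a recent, uncorroborated finding.
studies = [
    Study("Stroop (1935)", citations=20000, years_since_pub=84, n_replications=100),
    Study("Recent finding A", citations=300, years_since_pub=2, n_replications=0),
]
ranking = rank_by_rv(studies)
# The well-corroborated classic ranks below the uncorroborated recent finding,
# matching the intuition that replicating robust effects yet again has low value.
```

Even this toy version shows the intended use of such formulas: they are cheap to compute over a large literature, so they can surface candidate studies for closer qualitative evaluation rather than replace that evaluation.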
Research Questions: 1) What factors are considered important for determining the replication value of a particular finding or result? 2) Can we create a formula that yields meaningful quantitative estimates of the relative replication value of empirical findings, based on metrics related to the impact and the corroboration of the finding? Approach & Preliminary results: To assess whether our conceptualization of replication value is in line with evaluations of replication value in the broader community of researchers, and to better understand how replicating researchers justify their study selection decisions, we conducted a literature review of justifications of study selection in 85 replication reports. The literature review suggests that researchers use many different information sources to assess replication value that could be subsumed under the categories of impact and corroboration (e.g., citation impact, theoretical importance, imprecise estimates, lack of prior replication). However, it is also clear that some types of information cannot easily be quantified (e.g., theoretical importance), and that factors other than the value of replication matter for the evaluation process as well (e.g., feasibility). We are currently in the process of constructing one version of a replication value formula that captures the impact and corroboration of a finding. Once a formula has been derived, we aim to evaluate whether the candidate studies returned by the formula track researchers' qualitative judgements of relative replication value. We will pursue this through two lines of inquiry. First, we will calculate replication value for a large number of studies in the psychological literature and evaluate the face-validity of the recommendations produced, as well as inspect formula recommendations for examples where the true replication value is known to be very high or very low (e.g., Stroop, 1935).
Second, we will design a study to assess whether a formula-based replication value estimate can predict researchers' intuitive evaluations of replication value. Preliminary assessment of face-validity for data in the Curate Science database suggests that the formula yields sensible estimates of replication value for cases where the true replication value is known. By the time of the conference, we expect to have completed a comprehensive evaluation of formula performance for both the Curate Science database and a large sample of published studies from the psychological literature. In addition, we will be able to present the planned experimental design for the study that will compare formula recommendations to researchers' intuitive judgements.
References:
Bertamini, M., & Munafò, M. R. (2012). Bite-Size Science and Its Undesired Side Effects. Perspectives on Psychological Science, 7(1), 67–71. https://doi.org/10.1177/1745691611429353
Brandt, M. J., IJzerman, H., Dijksterhuis, A., Farach, F. J., Geller, J., Giner-Sorolla, R., … Van’t Veer, A. (2014). The replication recipe: What makes for a convincing replication? Journal of Experimental Social Psychology, 50, 217–224.
Coles, N., Tiokhin, L., Scheel, A., Isager, P., & Lakens, D. (2018). The Costs and Benefits of Replication Studies. https://doi.org/10.17605/osf.io/c8akj
Falk, R. (1998). Replication-A Step in the Right Direction: Commentary on Sohn. Theory & Psychology, 8(3), 313–321. https://doi.org/10.1177/0959354398083002
Field, S., Hoekstra, R., Bringmann, L., & van Ravenzwaaij, D. (2018). When and Why to Replicate: As Easy as 1, 2, 3? Open Science Framework. https://doi.org/10.17605/osf.io/3rf8b
Jasny, B. R., Chin, G., Chong, L., & Vignieri, S. (2011). Again, and Again, and Again ... Science, 334(6060), 1225–1225. https://doi.org/10.1126/science.334.6060.1225
Koole, S. L., & Lakens, D. (2012). Rewarding Replications: A Sure and Simple Way to Improve Psychological Science. Perspectives on Psychological Science, 7(6), 608–614. https://doi.org/10.1177/1745691612462586
Kuehberger, A., & Schulte-Mecklenbeck, M. (2018). Selecting target papers for replication. Behavioral and Brain Sciences, 41. https://doi.org/10.1017/S0140525X18000742
LeBel, E. P., Berger, D., Campbell, L., & Loving, T. J. (2017). Falsifiability is not optional. Journal of Personality and Social Psychology, 113(2), 254–261. https://doi.org/10.1037/pspi0000106
LeBel, E. P., McCarthy, R. J., Earp, B. D., Elson, M., & Vanpaemel, W. (2018). A Unified Framework to Quantify the Credibility of Scientific Findings. Advances in Methods and Practices in Psychological Science, 1(3), 389–402. https://doi.org/10.1177/2515245918787489
Moonesinghe, R., Khoury, M. J., & Janssens, A. C. J. W. (2007). Most Published Research Findings Are False—But a Little Replication Goes a Long Way. PLoS Medicine, 4(2), e28. https://doi.org/10.1371/journal.pmed.0040028
Rosenthal, R. (1990). Replication in behavioral research. Journal of Social Behavior & Personality, 5(4), 1–30.
Schmidt, S. (2009). Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Review of General Psychology, 13(2), 90–100. https://doi.org/10.1037/a0015108
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18(6), 643–662. https://doi.org/10.1037/h0054651
Persistent Identifier
https://doi.org/10.23668/psycharchives.2392
Date of first publication
2019-03-13
Is part of
Open Science 2019, Trier, Germany
Publisher
ZPID (Leibniz Institute for Psychology Information)
Citation
Isager, P. M. (2019, March 13). Quantifying Replication Value: A formula-based approach to study selection in replication research. ZPID (Leibniz Institute for Psychology Information). https://doi.org/10.23668/psycharchives.2392
e_Isager_1_ZPID Replication Value presentation.pdf (Adobe PDF, 453.09 KB, MD5: 782939b4ef57b459b763e6848a316254). Description: Conference Talk
PsychArchives acquisition timestamp
2019-04-01T15:22:10Z
Made available on
2019-04-01T15:22:10Z
Persistent Identifier
https://hdl.handle.net/20.500.12034/2024
Language of content
eng
Dewey Decimal Classification number(s)
150
DRO type
conferenceObject
Visible tag(s)
ZPID Conferences and Workshops