Conference Object

Quantifying Replication Value: A formula-based approach to study selection in replication research.

Author(s) / Creator(s)

Isager, Peder Mortvedt

Abstract / Description

Background: The concept of replication is a central value of empirical science. At the same time, scientists do not regard every replication as equally valuable. Even though replications are a cornerstone of empirical science (Bertamini & Munafò, 2012; Falk, 1998; Jasny, Chin, Chong, & Vignieri, 2011; Koole & Lakens, 2012; Moonesinghe, Khoury, & Janssens, 2007; Rosenthal, 1990; Schmidt, 2009), most researchers will agree that conducting 20 direct replications of the classic and extremely robust Stroop color-naming task (Stroop, 1935) would not be the best way to spend one's grant money. This raises an important question: when is a replication of an empirical finding sufficiently valuable to the scientific community that it should be performed? Given limited resources, one could also ask: which among currently published findings are the most valuable to replicate? Some discussion of the circumstances under which replication efforts are more or less beneficial has already occurred in the wake of increased replication efforts in psychology (Brandt et al., 2014; Coles, Tiokhin, Scheel, Isager, & Lakens, 2018), and suggestions have recently been put forward for how to select target studies for replication (Field, Hoekstra, Bringmann, & van Ravenzwaaij, 2018; Kuehberger & Schulte-Mecklenbeck, 2018). However, a comprehensive evaluation of the factors that could be used to quantify the replication value of a study is still lacking, and such an evaluation is becoming increasingly important now that more replication studies are funded, performed, and published.

Objectives: We propose a quantitative approach to help researchers, editors, and funders evaluate and compare the replication value of original findings. Our approach rests on two fundamental assumptions: (1) that close replication (LeBel, Berger, Campbell, & Loving, 2017; LeBel, McCarthy, Earp, Elson, & Vanpaemel, 2018) is in principle a worthwhile endeavor, and (2) that there are more original observations worth replicating than we currently have the resources to replicate. To help researchers determine which among many findings might be the most promising candidates for replication, we outline a formula-based approach that can be used to quantify the replication value of original findings with relatively little effort. We propose that two components determine the replication value of an empirical finding: (i) the impact of the effect and (ii) the corroboration of the effect. Impact denotes the influence the effect has had on scientific theory, research activities, or society. All else being equal, findings that have had more impact are more important to replicate than findings that have had less impact. Corroboration denotes the empirical observations bearing on the finding, as well as the quality of those observations. As the corroboration of a finding increases, it becomes less important to replicate that finding relative to a finding with little corroboration. The purpose of a replication value formula is to make explicit how one intends to weigh the factors one considers important against one another, and to standardize parts of the study selection procedure. Because such formulas can be calculated quickly (and sometimes even automatically), they can be powerful tools for exploring a large set of studies to discover original findings that are particularly replication-worthy, assuming that the input to the formula is meaningful (a toy illustration of this idea is sketched below). The ultimate goal of such formulas is to ensure that resources spent on replication are used efficiently and that all relevant options for study choice can be considered when a replication effort is initiated.

Research Questions: 1) What factors are considered important for determining the replication value of a particular finding or result? 2) Can we create a formula that yields meaningful quantitative estimates of the relative replication value of empirical findings, based on metrics related to the impact and the corroboration of each finding?

Approach & Preliminary Results: To assess whether our conceptualization of replication value is in line with evaluations of replication value in the broader community of researchers, and to better understand how replicating researchers justify their choice of study, we conducted a literature review of the justifications for study selection given in 85 replication reports. The review suggests that researchers draw on many different information sources to assess replication value, most of which can be subsumed under the categories of impact and corroboration (e.g., citation impact, theoretical importance, imprecise estimates, lack of prior replication). However, it is also clear that some types of information cannot easily be quantified (e.g., theoretical importance), and that factors other than the value of replication matter for the evaluation process as well (e.g., feasibility). We are currently constructing one version of a replication value formula that captures the impact and corroboration of a finding. Once a formula has been derived, we aim to evaluate whether the candidate studies returned by the formula track researchers' qualitative judgements of relative replication value. We will pursue this through two lines of inquiry. First, we will calculate replication value for a large number of studies in the psychological literature and evaluate the face validity of the recommendations produced, as well as inspect formula recommendations for examples where the true replication value is known to be very high or very low (e.g., Stroop, 1935). Second, we will design a study to assess whether a formula-based replication value estimate can predict researchers' intuitive evaluations of replication value. A preliminary assessment of face validity for data in the Curate Science database suggests that the formula yields sensible estimates of replication value for cases where the true replication value is known. At the time of the conference, we expect to have completed a comprehensive evaluation of formula performance for both the Curate Science database and a large sample of published studies from the psychological literature. In addition, we will be able to present the planned experimental design for the study that will compare formula recommendations to researchers' intuitive judgements.
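
To make the impact/corroboration weighting concrete, here is a minimal, hypothetical sketch of what such a formula might look like. The abstract does not specify the formula actually under construction; the proxies used here (citation rate for impact, count of prior close replications for corroboration) and the functional form are illustrative assumptions only.

# Hypothetical sketch only: the authors' actual replication value formula is
# still under construction and is not specified in this abstract. Assumed
# proxies: citation rate (impact) and number of prior close replications
# (corroboration); the functional form is an illustrative choice.

def replication_value(citations: int, years_since_publication: int,
                      prior_replications: int) -> float:
    """Toy estimate: value rises with impact, falls with corroboration."""
    citation_rate = citations / max(years_since_publication, 1)  # impact proxy
    return citation_rate / (1 + prior_replications)  # corroboration discount

# A highly cited, never-replicated finding scores high ...
print(replication_value(citations=400, years_since_publication=10,
                        prior_replications=0))   # 40.0
# ... while a heavily corroborated classic such as Stroop (1935) scores low.
print(replication_value(citations=8000, years_since_publication=84,
                        prior_replications=50))  # ~1.87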

References:
Bertamini, M., & Munafò, M. R. (2012). Bite-Size Science and Its Undesired Side Effects. Perspectives on Psychological Science, 7(1), 67–71. https://doi.org/10.1177/1745691611429353
Brandt, M. J., IJzerman, H., Dijksterhuis, A., Farach, F. J., Geller, J., Giner-Sorolla, R., … Van't Veer, A. (2014). The replication recipe: What makes for a convincing replication? Journal of Experimental Social Psychology, 50, 217–224.
Coles, N., Tiokhin, L., Scheel, A., Isager, P., & Lakens, D. (2018). The Costs and Benefits of Replication Studies. https://doi.org/10.17605/osf.io/c8akj
Falk, R. (1998). Replication - A Step in the Right Direction: Commentary on Sohn. Theory & Psychology, 8(3), 313–321. https://doi.org/10.1177/0959354398083002
Field, S., Hoekstra, R., Bringmann, L., & van Ravenzwaaij, D. (2018). When and Why to Replicate: As Easy as 1, 2, 3? Open Science Framework. https://doi.org/10.17605/osf.io/3rf8b
Jasny, B. R., Chin, G., Chong, L., & Vignieri, S. (2011). Again, and Again, and Again ... Science, 334(6060), 1225. https://doi.org/10.1126/science.334.6060.1225
Koole, S. L., & Lakens, D. (2012). Rewarding Replications: A Sure and Simple Way to Improve Psychological Science. Perspectives on Psychological Science, 7(6), 608–614. https://doi.org/10.1177/1745691612462586
Kuehberger, A., & Schulte-Mecklenbeck, M. (2018). Selecting target papers for replication. Behavioral and Brain Sciences, 41. https://doi.org/10.1017/S0140525X18000742
LeBel, E. P., Berger, D., Campbell, L., & Loving, T. J. (2017). Falsifiability is not optional. Journal of Personality and Social Psychology, 113(2), 254–261. https://doi.org/10.1037/pspi0000106
LeBel, E. P., McCarthy, R. J., Earp, B. D., Elson, M., & Vanpaemel, W. (2018). A Unified Framework to Quantify the Credibility of Scientific Findings. Advances in Methods and Practices in Psychological Science, 1(3), 389–402. https://doi.org/10.1177/2515245918787489
Moonesinghe, R., Khoury, M. J., & Janssens, A. C. J. W. (2007). Most Published Research Findings Are False—But a Little Replication Goes a Long Way. PLoS Medicine, 4(2), e28. https://doi.org/10.1371/journal.pmed.0040028
Rosenthal, R. (1990). Replication in behavioral research. Journal of Social Behavior & Personality, 5(4), 1–30.
Schmidt, S. (2009). Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Review of General Psychology, 13(2), 90–100. https://doi.org/10.1037/a0015108
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18(6), 643–662. https://doi.org/10.1037/h0054651

Persistent Identifier

https://hdl.handle.net/20.500.12034/2024
https://doi.org/10.23668/psycharchives.2392

Date of first publication

2019-03-13

Is part of

Open Science 2019, Trier, Germany

Publisher

ZPID (Leibniz Institute for Psychology Information)

Citation

Isager, P. M. (2019, March 13). Quantifying Replication Value: A formula-based approach to study selection in replication research. ZPID (Leibniz Institute for Psychology Information). https://doi.org/10.23668/psycharchives.2392
  • PsychArchives acquisition timestamp
    2019-04-01T15:22:10Z
  • Made available on
    2019-04-01T15:22:10Z
  • Language of content
    eng
  • Dewey Decimal Classification number(s)
    150
  • DRO type
    conferenceObject
  • Visible tag(s)
    ZPID Conferences and Workshops