Conference Object

Is AMSTAR2 an appropriate tool to assess the quality of systematic reviews in psychology?

Author(s) / Creator(s)

Kedzior-De Santis, Karina Karolina

Abstract / Description

Background: Systematic reviews are frequently used in psychology to guide future research and to summarise the empirical evidence for decision making. However, the quality of such reviews is not always acceptable (Kedzior & Seehoff, 2018), leading to poor reproducibility of conclusions and of the outcomes of statistical meta-analyses (Lakens et al., 2016). One method of assessing the quality of systematic reviews is ‘A MeaSurement Tool to Assess Systematic Reviews’ (AMSTAR) (Shea et al., 2007). AMSTAR is an 11-item scale designed to evaluate the quality of various aspects of systematic reviews, including the literature search, the data coding, the risk of bias assessment, and the data synthesis. Although AMSTAR is frequently used, its psychometric properties have been criticised (Wegewitz et al., 2016) and a new version of the instrument (AMSTAR2) was developed (Shea et al., 2017). AMSTAR2 consists of 16 items, seven of which are critical for high quality.
Objective: The objective of the current study is to investigate whether AMSTAR2 is a better tool than AMSTAR for assessing the quality of systematic reviews. For this purpose we compare the scores on both scales applied to the same systematic reviews in one specific field (the effects of Tai Chi on psychological well-being in Parkinson’s Disease, PD).
Research question: Is AMSTAR2 an appropriate tool to assess the quality of systematic reviews in psychology?
Method: The literature search, selection of systematic reviews, and quality assessment using AMSTAR and AMSTAR2 were done by each author independently, and any inconsistencies were resolved by consensus during discussion. Inclusion and exclusion criteria. We searched for systematic reviews (with or without meta-analysis) regarding the effects of Tai Chi on symptoms of PD. The exclusion criteria for the current study were: 1) narrative (non-systematic) review, 2) primary study. Search strategy. The search strategy is described elsewhere (Kedzior & Kaplan, 2018). Briefly, the electronic literature search of PubMed and PsycInfo (on 14.02.2018; Title/Abstract: ‘Parkinson’s Disease’ AND Tai Chi AND review) identified k=21 studies. Inclusion criteria were met by k=10 systematic reviews, which were included in the current study. Coding procedures. The data in the k=10 systematic reviews were coded using a self-developed form, and the review quality was assessed using AMSTAR (in March 2018) and AMSTAR2 (in June 2018). AMSTAR scores range from 0 (minimum quality) to 11 (maximum quality). AMSTAR2 ratings range from high quality (no critical weaknesses) to critically low quality (more than one critical weakness).
Results: Overall quality assessment. The k=10 systematic reviews on Tai Chi in PD had a mean (±SD) AMSTAR score of 7±2 (range: 3-9, mode: 9, score<6 in 3/10 reviews). Therefore, most reviews (70%) had acceptable to high quality on AMSTAR. However, the AMSTAR2 evaluation showed that the same reviews had 1-5 of the 7 possible critical weaknesses. Therefore, all reviews had low to critically low quality according to AMSTAR2. Agreement between AMSTAR and AMSTAR2. The inspection of individual items revealed high agreement between both scales regarding the assessment of most items, including the review protocol, the literature search, the duplicate data extraction, the data coding and synthesis, the risk of bias assessment, the publication bias assessment, and the conflict of interest in the review. Our results also confirm that the quality of the items has improved in AMSTAR2. For example, two double-barrelled items on AMSTAR (Item 2 regarding the duplicate study selection and data coding, and Item 5 regarding the list of included and excluded studies) are listed as four separate items on AMSTAR2 (Items 5-6 and Items 7-8, respectively). Disagreement between AMSTAR and AMSTAR2. The disagreement between the scales is due to the interpretation of the overall scores (too lenient in AMSTAR and too conservative in AMSTAR2) as well as the focus on critical items that may not have been routinely required or reported in past reviews. Such items include the presence of the review protocol and the list of excluded studies with justification for exclusion. Since all k=10 systematic reviews had at least one critical weakness (they did not have an a priori protocol and/or did not report the list of excluded studies), they were classified as having low to critically low quality on AMSTAR2.
Conclusions and implications: AMSTAR2 may not be a valid tool for assessing the quality of past systematic reviews because some critical items required for high quality were not routinely included in journal requirements in the past. However, AMSTAR2 provides excellent guidelines for conducting future systematic reviews and should be incorporated in journal guidelines for authors. Providing an AMSTAR2 evaluation of their own systematic reviews (including the locations where specific items are addressed in the review) could help authors to conduct high-quality reviews and could help journal editors and readers to quickly assess the quality of such reviews.
References:
Kedzior, K., & Kaplan, I. (2018). Scientific quality of systematic reviews on the effects of Tai Chi on well-being in Parkinson’s disease (PD). Systematic Reviews (submitted).
Kedzior, K. K., & Seehoff, H. (2018). Common problems with meta-analysis in published reviews on major depressive disorders (MDD): a systematic review. Paper presented at the Research Synthesis Conference 2018 (June 10-12, 2018, Trier, Germany).
Lakens, D., Hilgard, J., & Staaks, J. (2016). On the reproducibility of meta-analyses: six practical recommendations. BMC Psychology, 4(1), 24.
Shea, B. J., Grimshaw, J. M., Wells, G. A., Boers, M., Andersson, N., Hamel, C., Porter, A. C., Tugwell, P., Moher, D., & Bouter, L. M. (2007). Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. BMC Medical Research Methodology, 7(1), 1-7.
Shea, B. J., Reeves, B. C., Wells, G., Thuku, M., Hamel, C., Moran, J., Moher, D., Tugwell, P., Welch, V., Kristjansson, E., & Henry, D. A. (2017). AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ, 358, j4008.
Wegewitz, U., Weikert, B., Fishta, A., Jacobs, A., & Pieper, D. (2016). Resuming the discussion of AMSTAR: What can (should) be made better? BMC Medical Research Methodology, 16(1), 111.
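Illustrative sketch (not part of the original abstract): the two scoring rules described under Coding procedures can be written as a small Python example, which may help to show why the same review can look acceptable on AMSTAR yet low on AMSTAR2. The function names are hypothetical. The AMSTAR cut-off of 6 is an assumption inferred from the "score<6" remark in the Results. The "moderate" category and the handling of non-critical weaknesses follow the published AMSTAR 2 guidance (Shea et al., 2017) rather than anything stated in the abstract. The example inputs are invented and are not the study's data.

```python
def amstar_label(score: int, threshold: int = 6) -> str:
    """Label an AMSTAR sum score (0-11).

    The threshold of 6 is an assumption inferred from the abstract,
    which treats scores below 6 as less than acceptable.
    """
    if not 0 <= score <= 11:
        raise ValueError("AMSTAR scores range from 0 to 11")
    return "acceptable to high" if score >= threshold else "below acceptable"


def amstar2_rating(critical: int, non_critical: int = 0) -> str:
    """Overall AMSTAR2 rating from counts of weaknesses.

    AMSTAR2 has 16 items, 7 of them critical, so at most 7 critical
    and 9 non-critical weaknesses are possible.
    """
    if not 0 <= critical <= 7:
        raise ValueError("there are 7 critical items")
    if not 0 <= non_critical <= 9:
        raise ValueError("there are 9 non-critical items")
    if critical > 1:
        return "critically low"   # more than one critical weakness
    if critical == 1:
        return "low"              # exactly one critical weakness
    if non_critical > 1:
        return "moderate"         # per Shea et al. (2017), not the abstract
    return "high"                 # no critical weaknesses


# Hypothetical reviews: (AMSTAR score, critical weaknesses, non-critical weaknesses)
examples = [(9, 1, 3), (7, 2, 4), (3, 5, 6)]
for score, critical, non_critical in examples:
    print(score, amstar_label(score), "|", amstar2_rating(critical, non_critical))
```

Under these assumptions, a single critical weakness (for example, a missing a priori protocol) is enough to cap a review at "low" on AMSTAR2 regardless of how many of the remaining items it satisfies, which mirrors the pattern of disagreement reported in the Results.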

Persistent Identifier

Date of first publication

2019-05-30

Is part of

Research Synthesis 2019 incl. Pre-Conference Symposium Big Data in Psychology, Dubrovnik, Croatia

Publisher

ZPID (Leibniz Institute for Psychology Information)

Citation

Kedzior-De Santis, K. K. (2019, May 30). Is AMSTAR2 an appropriate tool to assess the quality of systematic reviews in psychology? ZPID (Leibniz Institute for Psychology Information). https://doi.org/10.23668/psycharchives.2479
  • Author(s) / Creator(s)
    Kedzior-De Santis, Karina Karolina
  • PsychArchives acquisition timestamp
    2019-06-14T08:46:07Z
  • Made available on
    2019-06-14T08:46:07Z
  • Date of first publication
    2019-05-30
  • Persistent Identifier
    https://hdl.handle.net/20.500.12034/2105
  • Persistent Identifier
    https://doi.org/10.23668/psycharchives.2479
  • Language of content
    eng
    en_US
  • Publisher
    ZPID (Leibniz Institute for Psychology Information)
    en_US
  • Is part of
    Research Synthesis 2019 incl. Pre-Conference Symposium Big Data in Psychology, Dubrovnik, Croatia
    en_US
  • Dewey Decimal Classification number(s)
    150
  • Title
    Is AMSTAR2 an appropriate tool to assess the quality of systematic reviews in psychology?
    en_US
  • DRO type
    conferenceObject
    en_US
  • Visible tag(s)
    ZPID Conferences and Workshops