Article Version of Record

Which robust regression technique is appropriate under violated assumptions? A simulation study

Author(s) / Creator(s)

Kim, Jaejin
Li, Johnson Ching-Hong

Abstract / Description

Ordinary least squares (OLS) regression is widely employed for statistical prediction and theoretical explanation in psychology studies. However, OLS regression has a critical drawback: it becomes less accurate in the presence of outliers and non-random error distributions. Several robust regression methods have been proposed as alternatives, but each has its own strengths and limitations. Consequently, researchers are often at a loss as to which robust regression method to use for their studies. This study uses a Monte Carlo experiment to compare different types of robust regression methods with OLS regression based on relative efficiency (RE), bias, root mean squared error (RMSE), Type I error, power, coverage probability of the 95% confidence intervals (CIs), and the width of the CIs. The results show that, with sufficient samples per predictor (n = 100), the robust regression methods are as efficient as OLS regression. When errors are heteroscedastic or follow non-normal distributions, i.e., mixed-normal, symmetric and heavy-tailed (SH), asymmetric and relatively light-tailed (AL), or asymmetric and heavy-tailed (AH), the robust method GM-estimation seems to consistently outperform OLS regression.

Keyword(s)

robust regression; OLS regression; outliers; Type I error; power

Date of first publication

2023-12-22

Journal title

Methodology

Volume

19

Issue

4

Page numbers

323–347

Publisher

PsychOpen GOLD

Publication status

publishedVersion

Review status

peerReviewed

Citation

Kim, J. & Li, J. C. (2023). Which robust regression technique is appropriate under violated assumptions? A simulation study. Methodology, 19(4), 323–347. https://doi.org/10.5964/meth.8285
  • PsychArchives acquisition timestamp
    2024-03-19T11:02:03Z
  • Made available on
    2024-03-19T11:02:03Z
  • ISSN
    1614-2241
  • Persistent Identifier
    https://hdl.handle.net/20.500.12034/9788
  • Persistent Identifier
    https://doi.org/10.23668/psycharchives.14329
  • Language of content
    eng
  • Is version of
    https://doi.org/10.5964/meth.8285
  • Is related to
    https://doi.org/10.23668/psycharchives.13979
  • Is related to
    https://doi.org/10.23668/psycharchives.13980
  • Dewey Decimal Classification number(s)
    150
  • DRO type
    article