As always, there are some very good comments here.
Rather than recommending a specific technique in lieu of stepwise regression, I will merely observe that before choosing a technique, one must be sure to understand the purpose of the model being built. Many researchers plunge ahead with a "multivariable model" without a clear understanding of the model's purpose.
For example, in some medical research papers, the primary objective is to determine whether a certain group of patients is at higher risk of some event (say, cardiovascular mortality) than those not in the group, and a multivariable model may be used to loosely "adjust" for some potential confounders. In that case, we are less concerned about the model's overall fit and properties, and more concerned about whether the additional covariates in the model are those which would confound the primary relationship of interest - since the primary goal is to answer the question "Is (Group X) at higher risk of (Outcome Y)?"
In other medical research papers, the primary objective is to develop a comprehensive "risk score" - in which case the principal concern is ensuring strong model fit and good predictive accuracy, and the individual factors' effects may matter less than the overall model fit. With the primary goal being accurate prediction of the outcome, we must structure our modeling in such a way as to achieve that goal (although reasonable people will disagree on how best to achieve that!).
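To make the contrast between the two purposes concrete, here is a minimal sketch under stated assumptions: the data are simulated, age is a single hypothetical confounder, the outcome is continuous, and ordinary least squares via NumPy stands in for the logistic or survival models a real paper would more likely use. All variable names and effect sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data: older patients are more likely to be in the group,
# and age also raises the outcome, so age confounds group -> outcome.
age = rng.normal(60, 10, n)
group = (rng.random(n) < 1 / (1 + np.exp(-(age - 60) / 10))).astype(float)
outcome = 0.5 * group + 0.1 * age + rng.normal(0, 1, n)  # assumed true group effect: 0.5


def ols(X, y):
    """Least-squares coefficients, intercept first."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


# Purpose 1: "Is Group X at higher risk of Outcome Y?"
# The quantity of interest is the group coefficient, with and without
# adjustment for the confounder.
crude = ols(group, outcome)[1]
adjusted = ols(np.column_stack([group, age]), outcome)[1]

# Purpose 2: a "risk score".
# The quantity of interest is predictive accuracy on data not used for fitting.
train, test = slice(0, n // 2), slice(n // 2, n)
coef = ols(np.column_stack([group[train], age[train]]), outcome[train])
pred = coef[0] + coef[1] * group[test] + coef[2] * age[test]
test_mse = np.mean((outcome[test] - pred) ** 2)

print(f"crude group effect:    {crude:.2f}")
print(f"adjusted group effect: {adjusted:.2f}")
print(f"held-out MSE:          {test_mse:.2f}")
```

In the first use, the model is judged by whether the group coefficient is free of confounding; in the second, by held-out error, regardless of how any single coefficient is interpreted.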
------------------------------
Andrew D. Althouse, PhD
Supervisor of Statistical Projects
UPMC Heart & Vascular Institute
Presbyterian Hospital, Office C701
Phone: 412-802-6811
Email: althousead@upmc.edu
Original Message:
Sent: 07-13-2016 12:59
From: Sunita Ghosh
Subject: Stepwise Regression
I'm often asked about the use of stepwise (backward or forward) regression. These methods are very popular for building a model by successively adding or removing variables, especially when the researcher is not sure which variables to keep or remove. However, statisticians generally do not recommend them. I would be very interested to know your thoughts on this method and on what should be used as an alternative.
Thanks,
Sunita
------------------------------
Sunita Ghosh
Research Scientist
Alberta Health Services Cancer Care
------------------------------