Hi folks,
Many of us remember our "introduction to regression" classes, in which the instructor had many rules/reminders/tips on how to perform sound regression.
These rules/reminders/tips typically concerned checking model assumptions or practical rules of thumb.
Examples include:
- Check if your errors are normally distributed.
- Check for homoscedasticity.
- Are your coefficients practically significant? Or just statistically significant?
- Have at least X events for each predictive variable. (https://en.wikipedia.org/wiki/One_in_ten_rule)
- Look for outliers in the error terms.
- If you have interaction terms, include the base terms as well. (Unless you have a good reason not to.)
- Compare your chosen model to the "kitchen sink" model.
- Stop adding variables when ________.
- Remove some variables when ________.
No doubt that I'm missing quite a few!
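To make a couple of these concrete, here is a rough sketch of two items from the list: the one-in-ten rule and a residual outlier check. It is only an illustration under stated assumptions. The function names (max_predictors, flag_outliers) are made up for this post, the "events" count is taken to be the size of the rarer outcome class, the model is a single-predictor least-squares line, and the cutoff of 3 standardized residuals is just a common convention, not a rule from any particular textbook.

```python
import math


def max_predictors(n_events, n_nonevents, events_per_variable=10):
    """One-in-ten rule of thumb: about one predictor per 10 events.

    Assumption: "events" means the size of the rarer outcome class.
    """
    limiting = min(n_events, n_nonevents)
    return limiting // events_per_variable


def standardized_residuals(x, y):
    """Fit y = a + b*x by ordinary least squares and scale the residuals.

    Residuals are divided by the residual standard error (n - 2 degrees
    of freedom). This is a crude standardization that ignores leverage.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rse = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    return [r / rse for r in residuals]


def flag_outliers(x, y, threshold=3.0):
    """Return indices whose standardized residual exceeds the threshold."""
    z = standardized_residuals(x, y)
    return [i for i, zi in enumerate(z) if abs(zi) > threshold]


# Hypothetical data: 45 events among 1000 subjects.
print(max_predictors(45, 955))  # 4

# Hypothetical data: an exact line with one grossly shifted observation.
x = list(range(20))
y = [2 * xi + 1 for xi in x]
y[10] += 50
print(flag_outliers(x, y))
```

A flagged point is only a prompt to look more closely at that observation, not a reason to delete it.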
While sticking to the most basic regression methods (e.g., linear, logistic, survival), would people mind listing their favorite regression rules of thumb? No rule is too trivial or too pithy! (I'm interested in the breadth of advice given to beginners, to better understand the priorities teachers have for beginning students.)
Thank you!
------------------------------
Glen Wright Colopy
DPhil Oxon
Data Scientist at Cenduit LLC, Durham, NC
------------------------------