Hi everyone,
How does one construct bootstrap prediction intervals for a linear regression model of the form Y = alpha + beta*X + error when "case resampling" is used (i.e., X is assumed random)? "Case resampling" involves creating bootstrap samples by sampling the (Y,X) observations with replacement.
I ask because several sources I have read argue that "case resampling" should be preferred in the presence of heteroskedasticity and/or residual correlation, both of which are concerns in my situation.
As I understand it, constructing a prediction interval for a new value of Y should use a double bootstrap loop.
The first loop would be straightforward: generate B bootstrap samples using "case resampling" and use them to produce B point forecasts of the new value of Y.
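For concreteness, the first loop might look like the following minimal Python sketch. The function name `case_bootstrap_forecasts` and its arguments are hypothetical, and NumPy's `polyfit` is used here only as a convenient least-squares fit for the straight-line model; any equivalent fitting routine would do.

```python
import numpy as np

def case_bootstrap_forecasts(x, y, x_new, n_boot=2000, seed=0):
    """Refit Y = alpha + beta*X on resampled (Y, X) pairs.

    Returns an array of n_boot point forecasts at x_new.
    All names here are illustrative assumptions, not an established API.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    forecasts = np.empty(n_boot)
    for b in range(n_boot):
        # Case resampling: draw n (Y, X) pairs with replacement.
        idx = rng.integers(0, n, size=n)
        # polyfit returns coefficients from highest power down: [beta, alpha].
        beta, alpha = np.polyfit(x[idx], y[idx], deg=1)
        forecasts[b] = alpha + beta * x_new
    return forecasts
```

The spread of the returned forecasts reflects only the uncertainty in the fitted line at `x_new`, not the variability of a single new observation, which is the gap the second loop is meant to fill.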
The second loop is what I am trying to figure out: how do we factor in the additional variability associated with a single new observation without "defeating" the "robustness" provided by "case resampling"? (The second loop would add a resampled residual to each point forecast produced by the first loop. But resampling residuals implicitly assumes that the fitted model is correct, which is something we did not necessarily have to assume with "case resampling".)
While "case resampling" in the first loop could ignore heteroskedasticity and/or residual correlation, can "residual resampling" in the second loop really ignore these issues?
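To make the concern concrete, the naive second step described above could be sketched in Python as follows. The function name `naive_prediction_interval` and its arguments are hypothetical, and the percentile interval is just one simple choice. Note that the sketch treats the residuals as exchangeable, which is exactly the assumption in question.

```python
import numpy as np

def naive_prediction_interval(forecasts, residuals, level=0.95, seed=0):
    """Add one resampled residual to each bootstrap point forecast.

    forecasts: point forecasts from the case-resampling loop.
    residuals: residuals from the model fitted to the original data.
    Assumes residuals are exchangeable (no heteroskedasticity or
    correlation), which is the questionable part of this approach.
    """
    rng = np.random.default_rng(seed)
    forecasts = np.asarray(forecasts, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    # Draw one residual, with replacement, per bootstrap forecast.
    noise = rng.choice(residuals, size=forecasts.shape[0], replace=True)
    simulated = forecasts + noise
    # Percentile interval of the simulated new observations.
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(simulated, [tail, 1.0 - tail])
    return lower, upper
```

Because every residual is equally likely to be attached to every forecast, the resulting interval has the same noise distribution at every value of X, regardless of how the error variance actually behaves.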
Any thoughts or references on how to best deal with this would be greatly appreciated.
Thanks,
Isabella
-------------------------------------------
Isabella Ghement
Ghement Statistical Consulting Company Ltd.
E-mail: Isabella@ghement.ca
-------------------------------------------