Hi everyone,
I am working on a consulting problem that requires simulating time series data via bootstrapping, and I would like feedback from the consulting list on the suitability of the method I plan to use for these simulations.
The simulations will revolve around the following ideas:
1) We have an original time series y which includes missing values, left-censored values and outliers.
2) We will fit a regression model of the form "linear trend + seasonal component" to the y data. (Presumably, this type of model will accommodate the left-censored values and the outliers.)
3) We will extract the residuals from this model fit and use some form of bootstrap to create B replicate sets of residuals.
4) We need to create bootstrap time series of the form "linear trend + bootstrap residuals" and feed them into the simulations. (The simulations will eventually look at the power to detect a linear trend based on the bootstrap time series.)
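As a rough illustration of step 4) and the power calculation it feeds, the sketch below adds a linear trend to each replicate of residuals and reports the share of replicates in which a trend test rejects. The Mann-Kendall test (normal approximation, no correction for ties or serial correlation) is only a stand-in for whatever trend test the simulations will actually use, and the function names and toy residuals are hypothetical.

```python
import math

def bootstrap_series(intercept, slope, boot_resid):
    """Step 4: add a linear trend to one replicate of residuals (None = missing)."""
    return [None if r is None else intercept + slope * t + r
            for t, r in enumerate(boot_resid)]

def mann_kendall_p(y):
    """Two-sided Mann-Kendall p-value; missing values (None) are skipped.

    Assumption: normal approximation with continuity correction, no tie or
    autocorrelation adjustment. A stand-in for the real trend test.
    """
    obs = [v for v in y if v is not None]
    n = len(obs)
    s = sum((obs[j] > obs[i]) - (obs[j] < obs[i])
            for i in range(n) for j in range(i + 1, n))
    var = n * (n - 1) * (2 * n + 5) / 18
    z = (s - (s > 0) + (s < 0)) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))

def estimated_power(intercept, slope, replicates, alpha=0.05):
    """Share of bootstrap series in which the trend test rejects at level alpha."""
    hits = sum(mann_kendall_p(bootstrap_series(intercept, slope, r)) < alpha
               for r in replicates)
    return hits / len(replicates)

# Toy residual replicates standing in for the output of step 3 (hypothetical values).
replicates = [
    [0.1, -0.2, None, 0.3, -0.1, 0.0, 0.2, None, -0.3, 0.1],
    [-0.1, 0.2, None, -0.3, 0.1, 0.0, -0.2, None, 0.3, -0.1],
]
power = estimated_power(intercept=0.0, slope=1.0, replicates=replicates)
```

In practice the replicates would come from step 3) and the slope would be varied over the trend magnitudes of interest.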
For step 2), I plan to use a nonparametric regression method (e.g., rank regression) to obtain the residuals required by step 3). Because the original time series has missing values, I will fit the model to the non-missing values only and then construct a set of augmented residuals: the residuals from the complete data, with a missing value inserted at every time point where the original series had a missing observation. (Is there a better way to do this?)
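To make the augmented-residual idea concrete, here is a minimal sketch. It assumes missing values are coded as None, uses a Theil-Sen fit (median of pairwise slopes) as a simple stand-in for rank regression, and leaves out the seasonal component and any handling of left-censored values. The function names and toy data are hypothetical.

```python
from statistics import median

def theil_sen(t, y):
    """Median of pairwise slopes, with a median-based intercept."""
    slopes = [(y[j] - y[i]) / (t[j] - t[i])
              for i in range(len(t)) for j in range(i + 1, len(t))]
    slope = median(slopes)
    intercept = median(yi - slope * ti for ti, yi in zip(t, y))
    return intercept, slope

def augmented_residuals(y):
    """Fit on non-missing points only; keep None wherever y was missing."""
    observed = [(t, v) for t, v in enumerate(y) if v is not None]
    t_obs = [t for t, _ in observed]
    y_obs = [v for _, v in observed]
    intercept, slope = theil_sen(t_obs, y_obs)
    resid = [None if v is None else v - (intercept + slope * t)
             for t, v in enumerate(y)]
    return intercept, slope, resid

# Toy series with two gaps; the observed points lie exactly on y = 1 + 2t.
y = [1.0, 3.0, None, 7.0, 9.0, None, 13.0]
intercept, slope, resid = augmented_residuals(y)
```

The time index comes from the original positions, so the gaps still affect the fitted trend correctly even though the missing points are excluded from the fit.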
For step 3), I could use either the ARMA bootstrap or the maximum entropy bootstrap to take into account the potential serial correlation of the residuals. (The block bootstrap could also be an option.) As far as I know, neither the ARMA bootstrap nor the maximum entropy bootstrap was designed to accommodate missing values. Is it OK to apply either of these methods to the residuals corresponding to the complete observations and then insert missing values at the positions of the missing observations? If not, what other approach would be suitable in this situation? For both methods, there is a concern that missing values, outliers and censored values may distort the results. Ultimately, however, it is important to create time series that include all of the special features of the original time series.
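The "bootstrap the complete residuals, then re-insert the missing values" idea can be sketched as follows, using a moving block bootstrap because it needs nothing beyond the standard library. This is an illustration under assumptions, not a recommendation: missing residuals are coded as None, the block length is arbitrary, and the function names are hypothetical. Note that concatenating the complete residuals treats observations on either side of a gap as adjacent, which is exactly the concern raised above.

```python
import random

def moving_block_bootstrap(x, block_len, rng):
    """Resample overlapping blocks of x and trim to the original length."""
    n = len(x)
    starts = range(n - block_len + 1)
    out = []
    while len(out) < n:
        s = rng.choice(starts)
        out.extend(x[s:s + block_len])
    return out[:n]

def bootstrap_with_gaps(resid, block_len, B, seed=0):
    """Bootstrap the non-missing residuals, then restore None at the gaps."""
    rng = random.Random(seed)
    complete = [r for r in resid if r is not None]
    replicates = []
    for _ in range(B):
        boot = iter(moving_block_bootstrap(complete, block_len, rng))
        replicates.append([None if r is None else next(boot) for r in resid])
    return replicates

# Toy augmented residuals (hypothetical values) with two missing positions.
resid = [0.4, -0.1, None, 0.2, -0.5, 0.3, None, 0.1, -0.2, 0.0]
replicates = bootstrap_with_gaps(resid, block_len=3, B=5)
```

Each replicate has the same length and the same missing-value pattern as the original residual series, so it can be fed directly into step 4).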
Thank you in advance for any insights you may be able to provide.
Kind regards,
Isabella
-------------------------------------------
Isabella Ghement
Ghement Statistical Consulting Co.
-------------------------------------------