Hi all,
I am trying to calculate confidence intervals for predicted values from a GLM with a log link. A simplified version of the model is built as below (using the gam library for na.gam.replace):
library(gam)  # provides na.gam.replace
mdl <- glm(FEES ~ VAR1 + CAT2 + IND3,
           family = quasi(link = "log", variance = "mu"), data = test, na.action = na.gam.replace)
I can think of two ways to calculate the CI for the fitted values.
1) Get the SE on the response scale and build a symmetric interval around the fitted value:
yyy <- predict(mdl, type="response", se.fit=TRUE)
lower <- yyy$fit-1.645*yyy$se.fit
upper <- yyy$fit+1.645*yyy$se.fit
2) Get the SE on the link (linear predictor) scale, build the interval there, and transform back with exp. Of course, this won't produce symmetric CIs.
yy <- predict(mdl, type="link", se.fit=TRUE)
ll <- exp(yy$fit-1.645*yy$se.fit)
lu <- exp(yy$fit+1.645*yy$se.fit)
The results differ, but not by much. In the particular example I created, both the lower and the upper bounds from 2) are larger than those from 1), i.e. the whole interval shifts towards larger values.
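For anyone who wants to reproduce the comparison, here is a minimal sketch on simulated data rather than my real data. The data frame `sim`, its coefficients and the dispersion of 2 are made up for illustration, and there are no missing values, so na.gam.replace is not needed. As far as I can tell, predict() with type="response" obtains its SE from the link-scale SE by the delta method (for a log link, SE_response = fitted mean * SE_link), which would explain the shift: exp(x) >= 1 + x, so the back-transformed bounds can never be smaller than the symmetric ones.

```r
set.seed(1)
n <- 200

# Hypothetical data with the same variable names as in the model above
sim <- data.frame(
  VAR1 = rnorm(n),
  CAT2 = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  IND3 = rbinom(n, 1, 0.4)
)
mu <- exp(1 + 0.3 * sim$VAR1 + 0.2 * (sim$CAT2 == "b") - 0.1 * sim$IND3)

# Positive response with mean mu and variance 2 * mu (assumed dispersion of 2)
sim$FEES <- rgamma(n, shape = mu / 2, scale = 2)

fit <- glm(FEES ~ VAR1 + CAT2 + IND3,
           family = quasi(link = "log", variance = "mu"), data = sim)

z <- qnorm(0.95)  # about 1.645, i.e. a two-sided 90% interval

# Approach 1: symmetric interval on the response scale
p_resp <- predict(fit, type = "response", se.fit = TRUE)
ci1 <- cbind(lower = p_resp$fit - z * p_resp$se.fit,
             upper = p_resp$fit + z * p_resp$se.fit)

# Approach 2: interval on the link scale, transformed back with exp
p_link <- predict(fit, type = "link", se.fit = TRUE)
ci2 <- cbind(lower = exp(p_link$fit - z * p_link$se.fit),
             upper = exp(p_link$fit + z * p_link$se.fit))

# Delta-method relation between the two standard errors (log link)
se_check <- all.equal(unname(p_resp$se.fit),
                      unname(exp(p_link$fit) * p_link$se.fit))
print(se_check)

# Differences between the two sets of bounds (approach 2 minus approach 1)
print(summary(ci2 - ci1))
```

The two approaches agree closely whenever the link-scale SE is small, because exp(x) is then close to 1 + x. Approach 2 also keeps the lower bound positive, which approach 1 does not guarantee.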
So which one is the more appropriate way to do it?
Also, I think these are CIs for the mean predicted value. How do I construct an interval for a single predicted value?
Any comments and suggestions are highly appreciated!
Have a nice weekend,
Ru
-------------------------------------------
Ru Sun
Ernst & Young LLP
-------------------------------------------