Hi everyone,
Recently, I came across a seemingly simple issue in the context of a Poisson regression model with a log link, and there is a second, related issue I am also seeking advice on.
Issue No. 1: Poisson regression model with a log link
The Poisson regression model with a log link has a count response variable y and a predictor variable x, which is log-transformed before being included in the model. If mu denotes the mean of y at a given x, then mu is modelled as:
log(mu) = beta0 + beta1 * log(x).
After fitting the model, let's say the estimates of beta0 and beta1 are b0 = 0.99 and b1 = 0.51, with a very small p-value for testing the null hypothesis
Ho: beta1 is equal to 0 versus
Ha: beta1 is different from 0,
which I will report as p < 0.001.
While the relationship between log(mu) and log(x) is linear, that's not what I really care about. What I really care about is the relationship between mu and x (which involves getting rid of the logs on both sides of the above stated model equation).
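To make the three scales concrete, here is a minimal sketch that evaluates the fitted curve from the estimates quoted above (b0 = 0.99, b1 = 0.51). The grid of x values and the helper names log_mu and mu_hat are assumptions made up for illustration; no data are involved, only the back-transformation mu = exp(b0) * x^b1.

```python
import math

# Estimates quoted in the post; the x grid below is a hypothetical choice.
b0, b1 = 0.99, 0.51
xs = [1.0, 2.0, 5.0, 10.0, 50.0]

def log_mu(x):
    """Linear predictor on the log scale: log(mu) = b0 + b1 * log(x)."""
    return b0 + b1 * math.log(x)

def mu_hat(x):
    """Back-transformed mean on the original scale: mu = exp(b0) * x**b1."""
    return math.exp(b0) * x ** b1

# One row per x: (x, log(x), log(mu), mu), i.e. the quantities on the axes of the three panels.
rows = [(x, math.log(x), log_mu(x), mu_hat(x)) for x in xs]

for x, lx, lm, m in rows:
    print(f"x={x:6.1f}  log(x)={lx:6.3f}  log(mu)={lm:6.3f}  mu={m:7.3f}")
```

Plotting log(mu) against log(x) from these rows gives a straight line with slope b1, while plotting mu against log(x) or against x gives curves, all generated by the same two estimates.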
The graph below shows the linear relationship between the estimated log(mu) and log(x) in the left panel, the relationship between mu and log(x) in the middle panel and the relationship between mu and x in the right panel. This last panel captures the relationship I really care about.

For the panel on the left, let's say that I feel comfortable stating that there is a statistically significant relationship between the estimated log(mu) and log(x), and then quoting the p-value (p < 0.001) in brackets after my statement:
1) There is a statistically significant linear relationship between the estimated log(mu) and log(x) (p < 0.001);
Ignoring the whole controversy about statistical significance for now, what I want to know is whether the same p-value is applicable to statements like the ones below:
2) There is a statistically significant non-linear relationship between the estimated mu and log(x) (p < 0.001);
3) There is a statistically significant non-linear relationship between the estimated mu and x (p < 0.001);
If 1), 2) and 3) make sense, does the reverse also hold should p come out to be > 0.05? In other words, if the estimated linear relationship between log(mu) and log(x) is statistically non-significant, can we claim the same is the case for the other two non-linear relationships and quote the exact same p-value?
A variation of this question would refer to a Gaussian regression, where mu = beta0 + beta1*log(X) and the relationship of interest would be the one between mu and X (rather than mu and log(X)). In this setting, could we use the same p-value p when talking about the relationship between mu and log(X) and the relationship between mu and X?
Dr. Ben Bolker mentioned on Twitter something about p-values being invariant to monotone transformations but I couldn't find any reference to a result like this and I am also unsure when this result would apply - if we back-transform mu after modelling log(mu)? if we back-transform X after modelling either mu or log(mu) as a function of log(X)?
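One way to write out what such an invariance would mean for the Poisson model in Issue No. 1 is sketched below. It uses only the model equation log(mu) = beta0 + beta1 * log(x) and is not a citation of the result mentioned on Twitter.

```latex
\begin{align*}
\log \mu = \beta_0 + \beta_1 \log x
  &\iff \mu = e^{\beta_0}\, x^{\beta_1}, \qquad x > 0, \\
H_0 : \beta_1 = 0
  &\iff \mu = e^{\beta_0} \text{ for every } x > 0 .
\end{align*}
```

Read this way, the null hypothesis says that mu does not depend on x, whichever pair of axes is used to display the fitted curve. Note that this is a hypothesis about the presence of a relationship within the assumed model, not a test of linear versus non-linear shape.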
I know we use this type of reasoning in other models too - for example, in binary logistic regression, we find a statistically significant linear relationship between the log odds of an event and X and then we claim that a statistically significant non-linear relationship exists between the probability of an event and X.
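As a small illustration of the logistic analogy, the sketch below uses hypothetical coefficients (a0 = -2.0, a1 = 0.8, chosen only for illustration and not taken from any fitted model) to show equal steps on the log-odds scale turning into unequal steps on the probability scale.

```python
import math

# Hypothetical coefficients for illustration only; not from any fitted model.
a0, a1 = -2.0, 0.8

def log_odds(x):
    """Linear predictor: log odds = a0 + a1 * x."""
    return a0 + a1 * x

def prob(x):
    """Inverse logit: probability implied by the log odds at x."""
    return 1.0 / (1.0 + math.exp(-log_odds(x)))

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

# Change per unit step in x on each scale.
steps_log_odds = [log_odds(b) - log_odds(a) for a, b in zip(xs, xs[1:])]
steps_prob = [prob(b) - prob(a) for a, b in zip(xs, xs[1:])]

for x, d_lo, d_p in zip(xs, steps_log_odds, steps_prob):
    print(f"x: {x:.0f} -> {x + 1:.0f}  change in log odds = {d_lo:.3f}  change in probability = {d_p:.3f}")
```

The log-odds steps are all equal to a1, while the probability steps vary with x; both are driven by the same single slope coefficient.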
Ultimately, I don't necessarily want to add p-values to statements about the relationships I really care about (especially if those p-values change their value when they are subjected to transformation) - what I want is to make sure I don't make invalid statements and don't draw inaccurate conclusions.
Issue No. 2: Gaussian regression with log(X)
Let's say I have a Gaussian regression model of the form mu = beta0 + beta1 * log(X), and I then plot mu against the untransformed X. What language would you use to describe the resulting model? The relationship between mu and X is non-linear, but I am not sure whether it can be described as a back-transformed semi-log model (with the log applied to the predictor). Something better?
Similarly for the model log(mu) = beta0 + beta1 * log(X): If we plot mu versus the untransformed X, what language would you use to describe the resulting model? Back-transformed double-log model? Something better?
Thanks in advance for your insights and if the answer is obvious, go easy on me. (:
Isabella
------------------------------
Isabella R. Ghement, Ph.D.
Ghement Statistical Consulting Company Ltd.
E-mail: isabella@ghement.ca
Tel: 604-767-1250
Web: www.ghement.ca
------------------------------