Would like some advice on model setup

1. Would like some advice on model setup

Gabriel Farkas
Posted 02-14-2011 14:19
Hi all,

I was wondering if anyone on this mailing list might be able to give me some advice on a certain problem I've been tasked with. I have a particularly complex and challenging study design, and I'm trying to determine the best way to parameterize and set up the model.

The study design involves each subject getting a number of measurements repeated over the course of 1 or 2 periods. In Period 1, all subjects are observed the same number of times. At each measurement, the presence/absence of a particular Outcome Variable of interest is obtained (as a binary Yes/No), along with several (continuous) Factors believed to be predictive of the Outcome. Note that the presence of the Outcome at time t does not automatically imply it will be present at time t+1 or later times.

If a subject has exhibited presence of the Outcome frequently enough in Period 1 to meet a specified threshold, they proceed to Period 2. Otherwise, they are discontinued and don't have any observations in Period 2. For subjects who proceed to Period 2, there is not a set number of observations like there is in Period 1, but a small range of possible numbers of observations. However, even the subject(s) who end(s) up with the most observations in Period 2 will still have far fewer observations than the number of times they were measured in Period 1.

So, there are two questions I would like to investigate. First, without considering the Outcome at all, I would like to examine whether there are any significant differences in the values of Factor_1 through Factor_j, in general, between observations in Period 1 and Period 2, taking into account the multiple observations for the same subject. Might this be as simple as a GLM with MANOVA, with each of the Factors as a response and Period as the explanatory variable? (If I didn't have to account for the multiple observations for the same subject, and there was only 1 observation per subject in each Period, I would probably consider something like a paired t-test.)

Second, I would like to examine whether there are any significant differences in how Factor_1 through Factor_j model the Outcome between observations in Period 1 and Period 2, again taking into account that there are multiple observations for the same subject. If it were just one period, I think it would be fairly straightforward: something like a logistic regression with Factor_1 through Factor_j as the explanatory variables, the Outcome Variable as the response, and a stratified study design (stratified by subject), looking at which of the Factors are predictive. However, here I'm not interested in which of the Factors have a significant relationship with the Outcome, but rather in what (if any) are the significant differences between Period 1 and Period 2 in how the Factors model the Outcome. In other words, I might get a certain value for the coefficient for Factor_2 based on Period 1, and another value for the coefficient for Factor_2 based on Period 2, and would like to know whether the difference between them is meaningful. One method I was thinking about was a proportional hazards model, with conditional logistic regression, stratifying at the subject level. Another idea was to set up a GEE, but I was having trouble coming up with a model that properly accounted for everything.
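To make the GEE idea concrete, here is a rough sketch of one way it might be written, not a worked-out solution. It assumes a long-format dataset (one row per observation) hypothetically named mydata, an Outcome coded 0/1 in a variable called outcome, and just two Factors for brevity; these names are illustrative only. The period-by-factor interaction terms are what would carry the Period 1 versus Period 2 difference in each coefficient. It does not address the fact that only subjects meeting the threshold reach Period 2.

```sas
/* Sketch only: mydata, outcome, factor_1 and factor_2 are hypothetical names. */
proc genmod data=mydata descending;   /* descending: model the probability that outcome = 1 */
  class subject period;
  /* period*factor_k tests whether the factor_k coefficient differs by period */
  model outcome = period factor_1 factor_2
                  period*factor_1 period*factor_2
        / dist=binomial link=logit type3;
  /* GEE with an exchangeable working correlation within subject */
  repeated subject=subject / type=exch;
run;
```

With this parameterization, the type3 tests for the interaction terms would be the place to look for a period difference in how each Factor relates to the Outcome.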

Any advice you might be able to offer would be most appreciated!

Best Regards,
Gabriel Farkas
2. RE:Would like some advice on model setup

Richard Browne
Posted 02-14-2011 14:58
-------------------------------------------
Richard Browne
Texas Scottish Rite Hospital for Children
-------------------------------------------
You must think we're smart.
3. RE:Would like some advice on model setup

Susanne Aref
Posted 02-16-2011 14:38
For the first question:
I am a big proc mixed fan and do not use glm if I can avoid it. I am not sure whether this is something you can use, or whether I am missing the boat completely.
If you just want to compare periods for each factor, accounting for each of the other factors, you could do this for the subjects who have observations in both periods:
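/* factor_k is the factor being compared; the "..." stands for the remaining factors */
proc mixed;
  class subject period time;
  model factor_k = factor_1 ... factor_k-1 factor_k+1 ... factor_j period time / ddfm=sat;
  random subject;                                      /* random subject effect */
  repeated / subject=subject*period type=ar(1);        /* AR(1) within each subject-by-period block */
run;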

This assumes the repeated measurement times are equally spaced; otherwise you need another covariance type. You need the ddfm=sat option on the model line (I used sat rather than kr because I have yet to see them differ and sat runs faster), but you may have to adjust it by supplying the proper denominator degrees of freedom with ddf= if the subject covariance parameter is estimated to be 0.
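If the times are not equally spaced, one candidate for "another type" is the spatial power structure, in which the correlation between two measurements decays with the actual time gap between them. The following is a minimal sketch under some assumptions that are not in the original post: a dataset hypothetically named mydata, a numeric copy of the time variable named time_num, and only three factors, with factor_1 as the response.

```sas
/* Sketch only: mydata, time_num and the three-factor list are hypothetical. */
proc mixed data=mydata;
  class subject period time;
  model factor_1 = factor_2 factor_3 period time / ddfm=sat;
  random subject;
  /* sp(pow): correlation = rho**|t1 - t2|, using the numeric time_num as the coordinate */
  repeated / subject=subject*period type=sp(pow)(time_num);
run;
```

With equally spaced times this structure reduces to the same correlation pattern as ar(1), so it can be read as the unequal-spacing analogue of the model above.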

Another repeated structure is the cs, easiest incorporated like this:

proc mixed;
  class subject period time;
  model factor_k = factor_1 ... factor_k-1 factor_k+1 ... factor_j period time;
  random subject subject*period;
run;

which is equivalent to using type=cs in the repeated statement. You won't need or want the ddfm option here.

Is it of interest to see whether the factors differ between the subjects who are only in Period 1 and those who are in both periods?
Best

Susanne

-------------------------------------------
Susanne Aref
Aref Consulting Group LLC
-------------------------------------------
