Another colleague has confirmed that GENLIN is likely to be usable for some of your analysis. If you have SPSS, tweak the simulation so that it more closely fits your situation.
Ryan Black said.
Art,
A GEE/generalized linear model can be fit using the GENLIN procedure in SPSS. GENLIN can fit various generalized linear models that account for correlation among repeated measures.
I provide a simple example below, which assumes the data arise from a logistic regression equation with correlation among repeated observations (induced here by a subject-level random intercept with variance 0.3). Note that data for 1000 subjects are generated, each of whom is measured 50 times. Also note that a time-varying covariate, x1, is included in the model.
**I wrote the code below very fast. Apologies if there are any typos.
HTH,
Ryan
*Generate Data.
set seed 98765432.
new file.
inp pro.
comp ID = -99.
comp x1 = -99.
comp b0 = -99.
comp b1 = -99.
comp rand_eff = -99.
comp time = -99.
leave ID to time.
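loop ID = 1 to 1000.
* True intercept and slope on the logit scale.
comp b0 = -0.5.
comp b1 = 1.0.
* Subject-level random intercept with variance 0.3.
comp rand_eff = sqrt(.3)*rv.normal(0,1).
loop time = 1 to 50.
* Time-varying covariate, redrawn at each occasion.
comp x1 = rv.normal(2,1).
comp eta = b0 + b1*x1 + rand_eff.
comp p = exp(eta) / (1 + exp(eta)).
comp y = rv.bernoulli(p).
end case.
end loop.
end loop.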
end file.
end inp pro.
exe.
Delete variables b0 b1 rand_eff eta p.
* Fit the GEE logistic model with an exchangeable working correlation.
GENLIN y (REFERENCE=FIRST) WITH x1
  /MODEL x1 INTERCEPT=YES
    DISTRIBUTION=BINOMIAL LINK=LOGIT
  /REPEATED SUBJECT=ID WITHINSUBJECT=time SORT=YES CORRTYPE=EXCHANGEABLE ADJUSTCORR=YES
    COVB=ROBUST MAXITERATIONS=100 PCONVERGE=1e-006(ABSOLUTE) UPDATECORR=1
  /PRINT CPS DESCRIPTIVES MODELINFO FIT SUMMARY SOLUTION.
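One possible way to tweak the simulation toward data with different numbers of repeats per subject is sketched here. It thins the simulated file at random, then refits the same GEE and also requests exponentiated coefficients (odds ratios) and the working correlation matrix. The seed, the 30 percent drop rate, and the assumption that records go missing completely at random are illustrative choices only; none of them come from the real data.

```spss
* Sketch only: drop about 30 percent of records at random to unbalance the data.
set seed 24680.
select if (rv.uniform(0,1) > 0.30).
exe.
* Refit the GEE, adding odds ratios and the working correlation matrix.
GENLIN y (REFERENCE=FIRST) WITH x1
  /MODEL x1 INTERCEPT=YES
    DISTRIBUTION=BINOMIAL LINK=LOGIT
  /REPEATED SUBJECT=ID WITHINSUBJECT=time SORT=YES CORRTYPE=EXCHANGEABLE
    COVB=ROBUST
  /PRINT CPS MODELINFO SOLUTION (EXPONENTIATED) WORKINGCORR.
```

SELECT IF permanently removes the dropped records from the active dataset, so save a copy of the full simulated file first if you want to keep it. Also keep in mind that GEE estimates are population-averaged, so the x1 coefficient should come out somewhat below the subject-specific value of 1.0 used to generate the data.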
-------------------------------------------
Arthur Kendall
Social Research Consultants
-------------------------------------------
Original Message:
Sent: 05-08-2011 11:37
From: Arthur Kendall
Subject: modeling a binomial outcome measured multiple days
I have not done such an analysis. I checked the documentation for version 19. It appears that the GENLIN procedure in SPSS would do at least some of what you want. It does handle repeated measures with different numbers of repeats. It also has several kinds of "links".
Does your variable about testing have three values: 1) ordered, 2) not ordered, 3) don't know?
I suggest you think about whether there are a series of questions and whether it is necessary that one model answer all of the questions.
Two very experienced colleagues who are not on this list had some reactions when I forwarded the post to them:
Bruce Weaver said.
The data Nancy describes have a multilevel structure, with daily data at level 1 and "one-time" (or patient-level) data at level 2. I'm still on v18, but I believe the new multilevel GENLIN procedure in v19 can perform multilevel regression with a binomial error distribution and a variety of link functions (e.g., a logit link if you want multilevel logistic regression and odds ratios, or an identity link if you want risk differences, etc.). Someone who has v19 may be able to comment further.
Rich Ulrich said.
I'll offer a massive simplification of the statistical problem. I notice that there are 1000 patients, and 502 transfusion events. Since some patients had more than one, there are fewer than 500 patients with a transfusion.... Presumably the others were selected by criteria which do need to be noted. The model should be divided into two questions - "Transfusion: Yes/no"; and "When". Or perhaps the main interest will be satisfied by the first question alone. Or, the answers to the first question should be the starting point for looking at the second question. The question of "When" might be simplified, also, by deciding to model the occurrence of "First transfusion". Whether it is worth looking at additional transfusions could depend on the amount of data available.
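As a rough sketch of how Rich's two-question split might be set up from a daily file, the syntax below collapses the data to one row per patient. The variable names ID, day, and transfused (coded 0/1) are hypothetical, since the thread does not give the actual names.

```spss
* Hypothetical variables: ID, day, transfused coded 0 or 1.
* Record the day only on days with a transfusion; otherwise tx_day stays missing.
if (transfused = 1) tx_day = day.
* Collapse to one row per patient.
AGGREGATE
  /OUTFILE=*
  /BREAK=ID
  /any_tx = MAX(transfused)
  /first_tx_day = MIN(tx_day)
  /n_tx = SUM(transfused).
```

Here any_tx addresses "Transfusion: Yes/no", first_tx_day gives the day of the first transfusion (system-missing for patients who were never transfused), and n_tx counts the days with a transfusion. Because OUTFILE=* replaces the active dataset, save the daily file before running this.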