Thanks for your detailed answer, Keith! Setting aside your power question for the time being, here is what I am thinking.
Patients in your study get either treatment A or treatment B for their disease. Not knowing anything about the disease you are studying, I wonder how quickly after they receive treatment these patients die. I also wonder whether, before they die, these patients can receive a single treatment once (e.g., A), a single treatment repeatedly over time (e.g., A, A, A) or various sequences of combined treatments A and B over time (e.g., A, B, B, A, B, etc.).
In any event, it seems to me that receiving a treatment (or sequence of treatments) only tells half of the story. The other half has to do with whether the treatment (or sequence of treatments) had the capacity to prolong these patients' lives from the first time it was administered. Say 90% of the patients in a particular census tract received treatment A once during the study, but all of these patients died within a month of receiving it - is that enough information for someone to make clinical decisions? And if the remaining 10% of the people in that census tract - who received treatment B - died within 5 years of treatment initiation, then one would have to wonder whether future patients (with similar covariates) should be switched to treatment B.
Perhaps there is some geographic factor that differentiates the two treatments (e.g., treatment A is less expensive and is prescribed to those in poorer regions of New York). But if no information is available on how the treatments help prolong life, it might be hard to compare percentages of people who (last) received treatment A or treatment B.
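To make the 90%/10% scenario above concrete, here is a minimal sketch in Python. It is only an illustration under made-up assumptions: the tract has 100 deaths, every treatment A patient survives exactly 1 month and every treatment B patient exactly 60 months (the upper bounds of "within a month" and "within 5 years"), and the names `records` and `summarize` are hypothetical, not anything from Keith's data.

```python
from statistics import median

# Hypothetical census tract with 100 deaths.
# Each record is (last treatment received, months from treatment start to death).
# The 90/10 split mirrors the scenario above; the survival times are invented.
records = [("A", 1)] * 90 + [("B", 60)] * 10


def summarize(records):
    """Return {treatment: (share of deaths, median months survived)}."""
    total = len(records)
    summary = {}
    for trt in sorted({t for t, _ in records}):
        months = [m for t, m in records if t == trt]
        summary[trt] = (len(months) / total, median(months))
    return summary


summary = summarize(records)
for trt, (share, med) in summary.items():
    print(f"Treatment {trt}: {share:.0%} of deaths, median survival {med} months")
```

Looking at the shares alone, one would only see that treatment A predominates in this tract; the survival column is what would suggest that treatment B might deserve a closer look.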
Lance is much better qualified than I am to provide guidance on the power calculations. But I think the study findings should be amenable to clinical decision-making - unless perhaps I don't understand other aspects of this study.
------------------------------
Isabella Ghement
Ghement Statistical Consulting Company Ltd.
Original Message:
Sent: 01-20-2016 10:16
From: Keith Goldfeld
Subject: Power analysis for census tract level analysis
Isabella -
Thanks for your note. I provided a little more detail in response to Lance - but can repeat some of it here.
Why would variation in disease prevalence at the census tract level be interesting to uncover from a clinical perspective? Is that type of variation something that can be factored in when treating the disease? How else would knowledge of that variation affect clinical or epidemiological practice?
We are looking at deaths from a particular disease and are interested in whether the deceased received treatment A or treatment B prior to death. We know that in different parts of the city more folks get treatment A than B. Our goal is to identify clusters where treatment A predominates so that we can compare them to clusters where treatment B predominates. Ideally, these clusters would be as similar as possible except for their outcomes with respect to A and B.
Is the disease itself common or rare?
Deaths are pretty rare - less than 1% of the population. But NYC has a large population, so there are a considerable number of cases citywide. And what we are really interested in is not so much the number of deaths as the comparison of the numbers of those with treatment A vs B, where neither treatment is rare.
Can you clarify if you'll have access to multiple years' worth of data and also if you are planning to include any other explanatory variables in your model?
Yes, we will have multiple years - maybe as many as 10. Even though 10 is a lot, we believe that the patterns of A vs B are fairly stable, though of course we will be finding that out.
- Keith
------------------------------
Keith Goldfeld
NYU School of Medicine