Discussion: View Thread

Power -- ITT and attrition

  • 1.  Power -- ITT and attrition

    Posted 08-18-2011 11:30
    Hi Everyone,
    I have a question about sample size estimation for a proposal.  Suppose we are looking at a longitudinal study that will examine the efficacy of an intervention vs. a control group.  The primary outcome is a count measure and thus will rely on GEE modeling.  If one were to propose an intent-to-treat analysis, thus incorporating an appropriate approach to imputation, is it common to inflate the initial sample size calculation to accommodate expected attrition?  Thanks for your insight/experience in this area.
    AH  

    -------------------------------------------
    Alexandra Hanlon
    -------------------------------------------


  • 2.  RE:Power -- ITT and attrition

    Posted 08-18-2011 13:57
    Absolutely necessary.
    As a grant reader, that is part of how I judge whether an investigator has thought through the proposal and its implementation.

    Imputation does not eliminate this problem at all, and intention to treat may exacerbate the problem in a longitudinal study.

    -------------------------------------------
    Raymond Hoffmann
    Professor
    Medical College of Wisconsin
    -------------------------------------------


  • 3.  RE:Power -- ITT and attrition

    Posted 08-18-2011 14:21
    Hi,
    It is a common practice. However, depending on the magnitude of the attrition, some degree of bias may be introduced into the computed estimate. One cannot assume the data are missing completely at random, and increasing the sample size will only give you better precision around a potentially biased estimate. Every effort should be made to limit the amount of missing data due to attrition. That includes double-checking that you have the most appropriate endpoint and that the length of the study is appropriate (e.g., you would not want an endpoint at, say, day 21 if subjects are likely to drop out around day 14).
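
    For illustration, here is a minimal R sketch (all of the numbers are hypothetical) of how informative dropout biases a complete-case estimate no matter how large the sample gets:

        # Informative (MNAR) dropout: subjects with higher counts drop out
        # more often, so the complete-case mean stays biased as n grows.
        set.seed(42)
        for (n in c(200, 2000, 20000)) {
          y <- rpois(n, lambda = 2)              # true mean count = 2
          p_drop <- plogis(-2 + 0.8 * y)         # dropout probability rises with y
          observed <- runif(n) > p_drop
          cat(sprintf("n = %5d   complete-case mean = %.2f (truth = 2.00)\n",
                      n, mean(y[observed])))
        }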
    -------------------------------------------
    Kari Kastango
    Sr. Biostatistician
    Sunovion Pharmaceuticals Inc.
    -------------------------------------------


  • 4.  RE:Power -- ITT and attrition

    Posted 08-18-2011 16:36

    As I said in a private message, this approach is commonly taken and is not a bad idea.  As you can see from later responses, there is not universal agreement on whether or not to do it or how to do it.  I suggest being conservative and overestimating the required sample size a little.  I don't understand what Kari means by a biased estimate.  Are we trying to somehow get an unbiased estimate of the sample size?  I think not.  Estimating the required sample size is always somewhat of a crap shoot, as there are so many things we need to guess at to determine a sample size that yields a specified power.  That is why I am a big proponent of the adaptive two-stage design with sample size reestimation at the first stage.  Once you are into the trial, you have a better idea of the dropout rate, and probably of the effect size and the variability of the estimate of the statistic of interest (often a mean difference between groups).  No one wants to grossly overestimate the sample size, but a mild overestimate is better than a gross underestimate that leaves you short on power and at risk of not being able to detect a true difference.  At least with the adaptive approach the guesses are not wild after the first stage.
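
    As a rough sketch in R (every input here is a hypothetical guess, and a real adaptive design must prespecify the reestimation rule and protect the type I error), the fixed-design plan inflates n for expected dropout, and the two-stage design then revises the guesses at the interim:

        # Stage-1 plan: guess the SD and the dropout rate, then inflate n.
        delta <- 4; sd_guess <- 10; drop_guess <- 0.15
        n_complete <- ceiling(power.t.test(delta = delta, sd = sd_guess,
                                           power = 0.80)$n)      # per arm, complete data
        n_stage1 <- ceiling(n_complete / (1 - drop_guess))       # inflated for dropout

        # Interim look: swap in the observed SD and dropout rate, recompute.
        sd_obs <- 12; drop_obs <- 0.22                           # hypothetical interim values
        n_final <- ceiling(power.t.test(delta = delta, sd = sd_obs,
                                        power = 0.80)$n / (1 - drop_obs))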

    I don't understand some of the complicated discussion about missingness and ITT.  If a subject drops out and you collect no data on them, I don't see how you can use that patient in the analysis.  If you have some data, then you should use the patient.  But all you are trying to do is guess how many patients will enter the trial yet not go far enough for you to collect data on them.  This does not require complicated analysis and/or simulation.  The important thing is to be a little conservative.  The job is easier if you are able to take the adaptive approach.
    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------


  • 5.  RE:Power -- ITT and attrition

    Posted 08-18-2011 17:25

    Clearly the Kastango comment regarding biased estimation refers to bias in the outcome estimate resulting from differential attrition.
    -------------------------------------------
    Bradley Huitema
    Western Michigan University
    -------------------------------------------


  • 6.  RE:Power -- ITT and attrition

    Posted 08-18-2011 17:30
    Okay, but that gets away from the question at hand, which was sample size estimation.

    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------


  • 7.  RE:Power -- ITT and attrition

    Posted 08-18-2011 17:42

    In a private message, Edith Zhang points out that a sensitivity analysis is often done at the end of the trial to be assured that attrition did not create a bias that could change the conclusion about efficacy.  She posited that the extreme sensitivity analysis, in which all dropouts in the treatment group are counted as failures, might be needed to show unequivocal efficacy.  I think this is extreme, and certainly if you achieve a favorable outcome even under that extreme scenario you would win the day with the FDA.  But if the result leaves the superiority in question (assuming this is a superiority trial), I think I would try to argue for approval based on the original result and some less extreme sensitivity analyses.

    This is all very interesting discussion, but I really do think the question only pertained to sample size adjustment for dropout and not to what analysis should be used in the event of dropouts.  I think for Alexandra's sake we should talk about what she wanted to know, and these other topics could be posed separately for those interested.
    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------


  • 8.  RE:Power -- ITT and attrition

    Posted 08-18-2011 17:57

    You've raised an interesting point: "...dropout and not what analysis should be used in the event of dropouts..."

    Typically, when we plan a study that we expect to present to the FDA, we develop (to the extent possible) a sample size for the actual statistical test we plan to use in the analysis.  It is definitely not a requirement of any kind.  Software such as EAST has simulation features for power and sample size when using group sequential methods.



    -------------------------------------------
    Chris Barker, Ph.D.
    President - San Francisco Bay Area Chapter of the American Statistical Association
    www.barkerstats.com
    -------------------------------------------


  • 9.  RE:Power -- ITT and attrition

    Posted 08-18-2011 14:38
    In my opinion, ITT sample size calculations are almost always done incorrectly (and I am guilty of that crime, also).  Typically, if one does a "usual" calculation, whether for a longitudinal or a fixed-time study (it doesn't matter), the common practice is to inflate the required n to allow for dropout, attrition, etc.  However, this approach assumes that the dropouts will not be included in the analysis.  In ITT, the dropouts are supposed to be included, even if their outcome is assumed to be a treatment failure.

    Here is a more correct approach, illustrated with a binary response endpoint.  In your power calculation, suppose you assume that in the control group (C) the response rate is 30% and that in the intervention group (I) the rate is 50%.  nQuery says that for 80% power you need n = 93 per arm (chi-square, alpha = 0.05).  You then assume that 20% of the I subjects drop out and revert to control, so that for those 20% the true response rate is 30% (it could be higher if they benefit from some of the I dose that they got).  Then the true response rate in the I group is the weighted percentage: (0.2 x 30%) + (0.8 x 50%) = 46%.  You now have to redo the calculation comparing 46% to 30%, which yields n = 144 per arm.  (This example was made simple by assuming that only I subjects drop out; you could do the same kind of weighted calculation if C subjects also dropped out.  You just need to make assumptions about dropout rates and response rates for the dropouts.)
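
    For what it's worth, base R's power.prop.test (a normal-approximation calculation, so it tracks the nQuery chi-square numbers closely) reproduces both answers:

        power.prop.test(p1 = 0.30, p2 = 0.50, power = 0.80)$n    # about 93 per arm
        p2_itt <- 0.2 * 0.30 + 0.8 * 0.50                        # diluted rate = 0.46
        power.prop.test(p1 = 0.30, p2 = p2_itt, power = 0.80)$n  # about 144 per arm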

    Of course, you can make other types of assumptions about what the response rate is for a dropout. Also, it's a bit more difficult to do this with continuous outcomes.

    Moral of the story: just increasing the sample size to account for attrition does not deal with ITT.  In fact, what one is doing is stating the sample size for the per-protocol analysis.  One must factor in the weighted outcome corresponding to what happens to dropouts in each arm.
    -------------------------------------------
    ===================================
    Martin L Lesser, PhD, EMT-CC
    Director and Investigator,
       Biostatistics Unit,
       Feinstein Institute for Medical Research
    Professor, Dep't of Molecular Medicine,
       Hofstra North Shore-LIJ School of
          Medicine
    Chair, IRB Committee "B"
     
    Mailing Address:
    Biostatistics Unit
    Feinstein Institute for Medical Research
    North Shore - LIJ Health System
    350 Community Drive
    Manhasset, NY  11030
    Phone: 516-562-0300
    FAX: 516-562-0344
     
    -------------------------------------------


  • 10.  RE:Power -- ITT and attrition

    Posted 08-18-2011 15:11

    Agree with Dr. Lesser completely.  In cases where the treatment of loss to follow-up is more complex, as with continuous outcomes, simulation is frequently the way to understand the impact.  You can set up a simulation of subjects with different dropout patterns, together with the final analysis you plan to do, however you impute (or multiply impute) subjects, to understand the impact on power.

    This is certainly much more work (but you can simulate data that are missing not at random, i.e., missing as a function of the outcome, etc.).
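
    A minimal R sketch of the idea (the effect size, dropout model, and crude single-imputation rule below are all made up for illustration):

        set.seed(7)
        n <- 120; delta <- 0.5; nsim <- 2000
        rejected <- replicate(nsim, {
          trt <- rep(0:1, each = n)
          y <- rnorm(2 * n, mean = delta * trt)
          miss <- runif(2 * n) < plogis(-1.5 + 0.7 * y)  # MNAR: dropout depends on outcome
          y_obs <- ifelse(miss, NA, y)
          # crude single imputation by the arm mean of the completers:
          y_imp <- ifelse(miss,
                          ave(y_obs, trt, FUN = function(v) mean(v, na.rm = TRUE)),
                          y)
          t.test(y_imp[trt == 1], y_imp[trt == 0])$p.value < 0.05
        })
        mean(rejected)  # simulated power under this dropout-plus-imputation scenario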

    Then, of course, you may be more than halfway to doing a flexible sample size to escape the problems with fixed sample sizes and power!

    -------------------------------------------
    Scott Berry, PhD
    Statistical Scientist
    Berry Consultants
    -------------------------------------------


  • 11.  RE:Power -- ITT and attrition

    Posted 08-18-2011 16:25

    Drop out may not be the only problem to consider.

    If you are preparing a survival analysis, Frank Harrell's Hmisc/Design library in R has a function, spower(), based on papers by Lakatos and others, to simulate power in a two-sample test of survival.  You can vary drop-in/drop-out rates, compliance, etc.

    Here is a link to the help file and the citations therein.

    http://pinard.progiciels-bpi.ca/libR/library/Hmisc/html/spower.html
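
    If memory serves, a minimal call looks like the following (the event rates, censoring, and sample sizes are made up, and the full drop-in/drop-out machinery is documented on the help page above, so treat this as a sketch rather than a recipe):

        library(Hmisc)
        rcontrol <- function(n) rexp(n, rate = log(2) / 3)        # control: median 3 years
        rinterv  <- function(n) rexp(n, rate = 0.7 * log(2) / 3)  # intervention: HR = 0.7
        rcens    <- function(n) runif(n, 1, 5)                    # censoring times
        spower(rcontrol, rinterv, rcens, nc = 200, ni = 200, nsim = 200)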


    -------------------------------------------
    Chris Barker, Ph.D.
    President - San Francisco Bay Area Chapter of the American Statistical Association
    www.barkerstats.com
    -------------------------------------------


  • 12.  RE:Power -- ITT and attrition

    Posted 08-18-2011 20:45

    If Alexandra's need for a sample size is to get a rough idea of the budget, then determining the per-protocol sample size and bulking it up by X% may be an adequate ballpark estimate. If the sample size is to go in the proposal (which seems implied by the initial post), then I echo Scott's comments. Sample size is tied to the statistical analysis and to the treatment effect, both of which are affected by missing data. Simulating missing-data scenarios at the time of the sample size calculation is a pain, but it has benefits: (1) It surfaces the potential for bias due to differential dropout. (2) It allows the statistician to engage the rest of the study team about reasonable assumptions on the amount of dropout, the reasons for dropout, the implications for the treatment effect, and, as Kari mentioned, how to reduce the likelihood of dropout. (3) Addressing such topics in the proposal may make it more credible and fundable (as Raymond suggested). (4) It also accelerates figuring out the statistical analysis method to be used at the end of the study, and therefore determining whether that method will be feasible to perform.

    -------------------------------------------
    Dick Bittman
    Bittman Biostat, Inc.
    -------------------------------------------


  • 13.  RE:Power -- ITT and attrition

    Posted 08-18-2011 22:01
    I completely disagree with this strange justification.  Alexandra may very well be computing sample size both for budgeting and for the protocol of a clinical trial, and sample size is always tied to the primary hypothesis test or confidence interval to be generated.  Increasing the sample size by X% is meant to recover, in the face of dropout, the sample size required by a fixed design based on a random sample and a specified power for a test (or half-width of a confidence interval).  Of course, the key word is random sample.  The type of missing data may be such that it destroys the balance provided by random samples.  That may dictate doing a sensitivity analysis, as has been mentioned, and it also raises the issue of the adequacy of simply boosting the sample size.  Nevertheless, the question raised was whether or not this boosting of the sample size is common practice, and from my experience in the pharmaceutical industry I believe it is.  Also, determining the sample size and addressing missing data in the analysis are, to my way of thinking, two distinctly different issues.  So I think the discussion of the data analysis is a little off topic.  I agree with all the points you raise about the analysis; I just think for this thread we should stay on topic.

    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------


  • 14.  RE:Power -- ITT and attrition

    Posted 08-19-2011 09:40

    I wrote this yesterday around 2pm EDT, then hit "send," but it does not seem to have gone out.

    ==========================

    So far, this is all reasonable advice, but let me contribute this general sermon:

    When the statistical plan gets complex like this, only a custom simulation study will handle the sample-size questions in a way that will tell savvy reviewers that a solid professional statistician truly collaborated in designing the study and writing the proposal. Yes, this can be a lot of work. Better investigators will understand and support it; the best will expect it. Those who just want a quickie, pedestrian stat considerations section for a complex problem will be nothing but trouble in the long run and the statistician should try to politely disengage--for the sake of both parties.

    I find it is usually possible to link each key research question to a single estimate and statistical interval (or plot of posterior distribution) that will quantify the "oomph" that answers the question directly (using that wonderful term coined by Stephen Ziliak and Deirdre McCloskey; see Porter reference below).

    This will come from using a given statistical model and method, frequentist, likelihood, or Bayesian.

    Accordingly, the stat planning (including sample-size) questions become:

      (1) How biased, if at all, is the given estimate? Is the degree of bias large enough to matter, or will the hypothesized "oomph" of the statistical effect overwhelm it?

      (2) How large might the sampling error (e.g., standard error) be? Given the hypothesized "oomph" factor, is this large enough to really matter?

    Note that in frequentist-land, (1) and (2) can be combined via the mean squared error (MSE) of the estimate versus the "true" point value, since the MSE is the squared bias plus the variance. For example, if the true adjusted odds ratio is conjectured to be 2.50, does it really matter if its MSE will almost surely be less than 0.10 for a total sample size of N versus less than 0.07 for a total sample size of 2N? 0.10 vs. 0.07 may not matter much, but N versus 2N may matter a lot in terms of cost and feasibility.

      (3) How well will the associated statistical interval inform us about the true value for the parameter? In other words, how tight will the interval (or posterior distribution) tend to be? What is the distribution of the key limit (lower or upper) of the interval?

      (4) Answering (3) can include the estimated statistical "power." For example (frequentist), if you hypothesize that the given parameter, theta, exceeds some null value, theta_0, then the power is just the probability that the lower confidence limit for theta exceeds theta_0. For equivalence testing, this is the probability that the entire interval will fall inside the prescribed equivalence region. (A small simulation sketch follows below.)
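
    To make (4) concrete, here is a minimal R sketch (the effect size and per-arm n are hypothetical) that estimates power as the probability that the lower 95% confidence limit for the mean difference exceeds theta_0 = 0:

        set.seed(11)
        n <- 64; delta <- 0.5; theta0 <- 0; nsim <- 4000
        lower <- replicate(nsim, {
          x <- rnorm(n, mean = delta)   # treatment arm
          y <- rnorm(n, mean = 0)       # control arm
          t.test(x, y)$conf.int[1]      # lower 95% limit for the mean difference
        })
        mean(lower > theta0)            # close to 0.80 for these settings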

    Note: My emphasis on using estimates and intervals (and, thus, avoiding p-values unless "forced" to use them) is hardly new or original. Associated with this is my lack of interest in basing sample-size analyses on classical power computations, which are too often nothing more than statistical fables designed to appease today's typical grant or article reviewer.

    Three quick reads; their content is as good as their titles:

    Porter, T. M. (2008). Signifying little. Science, 320(5881):1292. http://www.sciencemag.org/content/320/5881/1292.full.pdf. This is a book review of The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives, by Stephen T. Ziliak and Deirdre N. McCloskey (University of Michigan Press, Ann Arbor, 2008; 348 pp.).

    Connor, J. T. (2004). The value of a p-valueless paper. Am J Gastroenterol, 99(9):1638-40.
    http://www.nature.com/ajg/journal/v99/n9/full/ajg2004321a.html

    Bacchetti, P., Deeks, S. G., and McCune, J. M. (2011). Breaking free of sample size dogma to perform innovative translational research. Science Translational Medicine, 3(87):87ps24.
    http://stm.sciencemag.org/content/3/87/87ps24.short

    Such thinking needs to be stressed anew in our teaching, but that's another sermon.

    Thanks for listening.
    -------------------------------------------
    Ralph O'Brien
    Case Western Reserve University
    -------------------------------------------


  • 15.  RE:Power -- ITT and attrition

    Posted 08-19-2011 09:53

    Thank you.  I appreciate all of your thoughtful, detailed, and constructive input.  As a newcomer to the group, I find this discussion extremely helpful and thought-provoking.

    -------------------------------------------
    Alexandra Hanlon
    -------------------------------------------


  • 16.  RE:Power -- ITT and attrition

    Posted 08-19-2011 10:26

    Thanks, Ralph, for the sermon.  It is interesting to think of sample size in terms of the MSE of the estimate as well as the width of the confidence interval.  But all the canned sample-size packages that I know about do power vs. sample size and CI half-width vs. sample size, including your own programs and their offshoots in SAS (PROC POWER and PROC GLMPOWER).

    I still think that, especially for complex problems, the sample size calculations can be improved by the better "guesses" available when the adaptive sample size reestimation approach is used.  I advocate it primarily as a two-stage procedure, but it can also be done with reassessments at three or more stages.

    I will give up my complaint about talking about issues of estimator bias and analysis of the data since the conversation is so interesting and Alexandra seems to be enjoying all of the discussion.
    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------