
  • 1.  biology experiment analysis

    Posted 05-09-2013 08:47
    I would appreciate your input on what appears to be a straightforward analysis, but I do want to make sure!!

    - growing 10K worm eggs in each of 18 different solutions (tx groups)
    - this can be done in triplicate (3 results possible from each solution.)

    The idea is that on every other day, starting on day 1 (days 1, 3, 5, etc. to 23, 25), they pull samples of 500 eggs from each solution
    and, looking at the first 100, get the number dead vs alive.  Their proposed plan was to do this same approach on each day.

    My suggested revision is to treat this as survival analysis (time to death) and as such revise the experiment to
    - on day 1, yes, pull 100 eggs and get the number of dead & alive (with 3 samples, get avg number dead.  Could also use max number dead - any ideas on which would be better?)
    - on day 3, pull 100 - avg number dead from day 1, and look at the number dead in the remaining sample
    - on day 5, pull 100 - (avg number dead on day 1 + avg number dead on day 3), and look at the number dead in the remaining sample
    - and so on for each later sampling day.

    In this way the samples would approximate/reflect the effect of the deaths on the population.

    The end result would be 100 observations per treatment with x(1)= avg number dead as of day 1, x(2)=avg number dead as of day 3, etc
    and any remaining alive at day 25 censored at day 25.  From there the survival analysis could be applied and
    comparisons of each treatment to the others done.

    Any suggestions for improvement?

    Thank-you for your time!

    -------------------------------------------
    Douglas Criger
    Manager, Biostatistics
    -------------------------------------------


  • 2.  RE:biology experiment analysis

    Posted 05-09-2013 14:39
    Survival analysis is tricky here. So tricky that you might want to rethink things. For example, none of the eggs in the sample that you pull out on day 5 come with a death certificate giving the time and date of their death. If there are 40 live and 60 dead, then the 40 live observations are right censored at t=5 and the 60 dead are left censored at t=5. So you need software that can handle both left and right censoring.
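
    To make the censoring concrete, here is a minimal sketch of the likelihood for the day-5 example. It assumes an exponential time-to-death model, which is only an illustrative choice and not something the experiment implies; the function name and data layout are hypothetical. Each live egg contributes S(t), each dead egg contributes 1 - S(t).

```python
import math

def current_status_loglik(rate, observations):
    """Log-likelihood under an ASSUMED exponential time-to-death model.

    observations: iterable of (day, n_dead, n_alive) tuples.
    Dead eggs are left censored (death occurred before `day`);
    live eggs are right censored (death occurs after `day`).
    """
    total = 0.0
    for day, n_dead, n_alive in observations:
        surv = math.exp(-rate * day)         # S(t) = P(still alive at day t)
        total += n_alive * math.log(surv)    # right-censored contribution
        total += n_dead * math.log1p(-surv)  # left-censored: log(1 - S(t))
    return total

# The example from the post: 60 dead and 40 alive on day 5.
obs = [(5, 60, 40)]

# With a single time point the maximum sits where S(5) = 0.40.
rate_hat = -math.log(0.40) / 5
```

    With 13 sampling days the same function would simply sum over 13 tuples per solution, and the rate would be found numerically rather than in closed form.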

    If it were me, I'd just compute a single proportion (0.60 in this example) and model those proportions using a nonlinear regression model. You still have 13 time points for each of the 18 solutions, so you should be able to fit the curve without any problems. Maybe it isn't quite as flexible as a survival model, but it is easy to defend and a whole lot simpler.
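
    As a rough sketch of the proportion approach, the code below fits a two-parameter logistic curve to the proportion dead at the 13 sampling days by least squares. The logistic form, the grid search, and the parameter names t50 and scale are all assumptions made for illustration, and the proportions are synthetic values generated from the curve itself, not experimental results.

```python
import math

def logistic(day, t50, scale):
    """Assumed curve: proportion dead by `day`; t50 is the day with 50% dead."""
    return 1.0 / (1.0 + math.exp(-(day - t50) / scale))

def fit_logistic(days, props):
    """Least-squares fit by a coarse grid search over t50 and scale."""
    best = None
    for i in range(2, 51):            # t50 from 1.0 to 25.0 in steps of 0.5
        t50 = i * 0.5
        for j in range(1, 21):        # scale from 0.5 to 10.0 in steps of 0.5
            scale = j * 0.5
            sse = sum((p - logistic(d, t50, scale)) ** 2
                      for d, p in zip(days, props))
            if best is None or sse < best[0]:
                best = (sse, t50, scale)
    return best                       # (sum of squared errors, t50, scale)

days = list(range(1, 26, 2))          # days 1, 3, ..., 25: 13 time points
# Synthetic proportions from a known curve, used only to exercise the fit.
props = [logistic(d, 11.0, 3.0) for d in days]

sse, t50_hat, scale_hat = fit_logistic(days, props)
```

    For real data a proper optimizer such as scipy.optimize.curve_fit would be a more usual choice than a grid search; the grid is used here only to keep the sketch free of dependencies.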

    If you run things in triplicate, you can either aggregate the proportion across the replicate samples or you can use a mixed model estimating both between and within replicate variation. Taking the maximum across replicates is not a serious alternative, in my opinion.
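
    The aggregation option can be sketched as follows, assuming 100 eggs are examined in each of the three replicate samples. The function names and the counts are hypothetical. The between-replicate variance is shown only as a quick check on how much the replicates disagree; it is not a substitute for the mixed model.

```python
from statistics import variance

def pooled_proportion(dead_counts, n_per_sample=100):
    """Proportion dead after pooling all replicate samples for one day."""
    return sum(dead_counts) / (n_per_sample * len(dead_counts))

def between_replicate_variance(dead_counts, n_per_sample=100):
    """Sample variance of the per-replicate proportions."""
    props = [d / n_per_sample for d in dead_counts]
    return variance(props)

# Hypothetical dead counts from three replicates of one solution on one day.
counts = [58, 60, 62]
p_pooled = pooled_proportion(counts)
var_between = between_replicate_variance(counts)
```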

    -------------------------------------------
    Stephen Simon
    Independent Statistical Consultant
    P. Mean Consulting
    -------------------------------------------