  • 1.  assumptions in using the log-rank test

    Posted 04-19-2012 17:21

    Is there an issue with using the log-rank test to compare two survival curves if there is a large proportion (>50%) of zero survival times in each curve?  I thought this test was similar to a Cochran-Mantel-Haenszel test where the strata are the survival times - in which case, I can't see the problem with a lot of zeros.
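The stratified 2x2 view described in the question can be sketched in a few lines of Python. This is a minimal illustration, not part of the original post: the function name `logrank_as_cmh` and the toy data are assumptions, and the statistic is the Mantel-Haenszel chi-square without a continuity correction. Each distinct failure time contributes one table, so time zero is simply the first stratum.

```python
def logrank_as_cmh(times, events, groups):
    """Log-rank test written as a CMH test stratified by failure time.

    times  : observed times (0 allowed)
    events : 1 = failure, 0 = censored
    groups : 0 or 1
    Returns (tables, chi2); each table is
    (time, failed_g1, survived_g1, failed_g0, survived_g0).
    """
    data = list(zip(times, events, groups))
    tables = []
    o_minus_e = 0.0
    var = 0.0
    for t in sorted({ti for ti, ei, _ in data if ei == 1}):
        # Risk set: everyone whose observed time is at least t.
        at_risk = [(ei if ti == t else 0, gi) for ti, ei, gi in data if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for _, gi in at_risk if gi == 1)
        d = sum(f for f, _ in at_risk)
        d1 = sum(f for f, gi in at_risk if gi == 1)
        tables.append((t, d1, n1 - d1, d - d1, (n - n1) - (d - d1)))
        # Observed minus expected failures in group 1 (hypergeometric mean).
        o_minus_e += d1 - d * n1 / n
        # Hypergeometric variance; a table with n == 1 carries no information.
        if n > 1:
            var += d * n1 * (n - n1) * (n - d) / (n * n * (n - 1))
    return tables, o_minus_e ** 2 / var


# Toy data (hypothetical): half of each group fails at time zero.
times = [0, 0, 0, 2, 5, 0, 0, 1, 3, 4]
events = [1] * 10
groups = [1] * 5 + [0] * 5
tables, chi2 = logrank_as_cmh(times, events, groups)
for row in tables:
    print(row)
print("chi-square (1 df):", chi2)
```

The first printed table is the one for time zero; it enters the sum exactly like the table for any later failure time.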

    Thank you for your time!
    -------------------------------------------
    Colleen Kelly
    Principal Consultant
    Kelly Statistical Consulting
    -------------------------------------------


  • 2.  RE:assumptions in using the log-rank test

    Posted 04-19-2012 17:43
    Can you explain why you have so many zeros?

    -------------------------------------------
    Daniel Scharfstein
    Johns Hopkins School of Hygiene & Public Health
    -------------------------------------------


  • 3.  RE:assumptions in using the log-rank test

    Posted 04-19-2012 18:02
    I think there probably is, and a survival curve like that doesn't sound like a real survival curve. Kaplan-Meier is a nonparametric estimate of a continuous survival function. What you have is a distribution with a point mass p at zero along with a subsurvival function (1-p)S(t) for t>0. You could estimate S(t) separately with a Kaplan-Meier curve fitted to the positive times and use the fraction of observations at zero as an estimate of p. Now if one or both of the "survival" curves have this structure, the log-rank test applied to the entire data set would have a problem. But I think you could first test p1=p2 and, if you can't reject that null hypothesis, apply the log-rank test to the continuous pieces of the subsurvival functions. So you first do a binomial test comparing p1 with p2. If that test rejects, declare the two curves different. Otherwise apply the log-rank test to compare S1(t) with S2(t). I think some adjustment would be needed to account for the two tests, the first an unconditional binomial test and the second a log-rank test conditional on p.
    Does that sound reasonable? I have not seen this in the literature, but I think it would make sense.
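The two-step procedure proposed above might look roughly like this in Python. It is a sketch under stated assumptions, not the poster's own code: a pooled two-proportion z-test stands in for the "binomial test" of p1=p2, the function names and toy data are hypothetical, and no adjustment for running two tests is attempted, since the post leaves that question open.

```python
import math


def two_proportion_pvalue(x1, n1, x2, n2):
    """Two-sided pooled z-test of p1 = p2 (assumed stand-in for the binomial test).

    Undefined when the pooled proportion is exactly 0 or 1.
    """
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))


def logrank_pvalue(times, events, groups):
    """Two-group log-rank test; returns the chi-square (1 df) p-value."""
    data = list(zip(times, events, groups))
    o_minus_e, var = 0.0, 0.0
    for t in sorted({ti for ti, ei, _ in data if ei == 1}):
        n = sum(1 for ti, _, _ in data if ti >= t)
        n1 = sum(1 for ti, _, gi in data if ti >= t and gi == 1)
        d = sum(1 for ti, ei, _ in data if ti == t and ei == 1)
        d1 = sum(1 for ti, ei, gi in data if ti == t and ei == 1 and gi == 1)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * n1 * (n - n1) * (n - d) / (n * n * (n - 1))
    # Upper tail of a chi-square with 1 df, via the normal distribution.
    return math.erfc(math.sqrt(o_minus_e ** 2 / var / 2.0))


def two_stage(times, events, groups):
    """Step 1: compare the fractions failing at zero. Step 2: log-rank on t > 0."""
    rows = list(zip(times, events, groups))
    size = [sum(1 for _, _, g in rows if g == k) for k in (0, 1)]
    zeros = [sum(1 for t, e, g in rows if g == k and t == 0 and e == 1)
             for k in (0, 1)]
    pos_t, pos_e, pos_g = zip(*[r for r in rows if r[0] > 0])
    return {
        "p_zero": two_proportion_pvalue(zeros[0], size[0], zeros[1], size[1]),
        "p_logrank_positive": logrank_pvalue(pos_t, pos_e, pos_g),
    }


# Toy data (hypothetical): both groups have half their failures at zero.
times = [0, 0, 0, 2, 4, 6, 0, 0, 0, 3, 5, 7]
events = [1] * 12
groups = [0] * 6 + [1] * 6
print(two_stage(times, events, groups))
```

Reporting the two p-values separately shows which component, the mass at zero or the positive-time curve, drives any difference.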

    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------


  • 4.  RE:assumptions in using the log-rank test

    Posted 04-19-2012 19:14
    I do not see a problem using the logrank test (to get a p-value) in this situation if there is a sufficient sample size. The p-value derived from the logrank statistic assumes asymptotic normality of the statistic, so the sample size needs to be large enough for that. If many failures occur at zero, then the value of the logrank statistic would be the same as if all of those failures were moved to some unused positive time (before the next failure or censored observation).
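The invariance claim in the last sentence is easy to check numerically. The following is a minimal sketch with hypothetical data and a hypothetical helper name, `logrank_parts`: it computes the observed-minus-expected count and its variance, then repeats the calculation after moving every zero to 0.5, a time earlier than any other failure or censoring in the toy data.

```python
def logrank_parts(times, events, groups):
    """Return (O - E, V) for group 1 in a two-group log-rank test."""
    data = list(zip(times, events, groups))
    o_minus_e, var = 0.0, 0.0
    for t in sorted({ti for ti, ei, _ in data if ei == 1}):
        n = sum(1 for ti, _, _ in data if ti >= t)
        n1 = sum(1 for ti, _, gi in data if ti >= t and gi == 1)
        d = sum(1 for ti, ei, _ in data if ti == t and ei == 1)
        d1 = sum(1 for ti, ei, gi in data if ti == t and ei == 1 and gi == 1)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * n1 * (n - n1) * (n - d) / (n * n * (n - 1))
    return o_minus_e, var


# Toy data (hypothetical); events == 0 marks a censored observation.
times = [0, 0, 0, 0, 1, 3, 5, 0, 0, 2, 4, 6, 8, 9]
events = [1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1]
groups = [0] * 7 + [1] * 7

# Move every failure at zero to 0.5, before the earliest positive time (1).
moved = [0.5 if t == 0 else t for t in times]

original = logrank_parts(times, events, groups)
shifted = logrank_parts(moved, events, groups)
print(original, shifted)
print("identical:", original == shifted)
```

The statistic depends only on the ordering of the times and on the ties among them, so the two results agree.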

    I don't know what the context is, but imagine a situation where two groups seeking employment at a company are being compared. E.g., we are wondering if the employment duration is better for people who have some specific qualification vs. those without it. At the company some people are hired, some are not hired (employment duration = 0), and those who are hired may be terminated (fired, let go) some time later. If the failure event is the combination of being terminated and not being hired, then a number of people will fail at day zero (they didn't get the job) and some others will fail later (they lost the job).

    There are probably other examples. It would be nice to hear the specific situation that prompted this discussion.

    Best wishes,

    Nayak



    -------------------------------------------
    Nayak Polissar
    Principal Statistician
    The Mountain-Whisper-Light Statistics
    -------------------------------------------


  • 5.  RE:assumptions in using the log-rank test

    Posted 04-19-2012 20:18

    You may be right about this; I am not sure. But I like my approach better because there are really two pieces to the survival curve: the discrete component at 0 and the continuous subsurvival function. Doing it my way, if there is a difference, you can see which component it is due to.
    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------


  • 6.  RE:assumptions in using the log-rank test

    Posted 04-20-2012 00:19
    I quite agree, Michael.

         You have to think about and model the process, not rely on asymptotics to salvage a model that really doesn't fit.  For something as ultra-skewed as this, it is unlikely that there would be sufficient numbers for asymptotics to actually apply.  Certainly not in the biomedical field that I work in.
         This problem is analogous to a ZIP (zero-inflated Poisson) process.  I used something analogous to your idea to compare two interventions for reducing sexual risk for AIDS, where there was an excess of zero counts of risky intercourse while the rest of the distribution fit a negative binomial (non-constant hazard Poisson).  My goal too was to understand how the treatment and covariates affected each group.
         Of course in the biomedical field, if we had large numbers die the first day (time 0), we'd know there was a really severe problem!  I have seen this in some ER data where there were clearly two distinct populations: those so severely ill they cannot be saved and those whose outcome can be mediated by treatment.  They had to be treated as separate groups.
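For readers unfamiliar with the ZIP analogy, a zero-inflated Poisson mixes a point mass at zero with an ordinary Poisson count, much as the survival problem mixes a point mass at time zero with a continuous curve. The sketch below is illustrative only: the parameter names `pi_zero` and `lam` and the values used are assumptions, and it shows the plain ZIP form rather than the negative binomial variant mentioned in the post.

```python
import math


def zip_pmf(k, pi_zero, lam):
    """P(Y = k) for a zero-inflated Poisson.

    pi_zero : probability of a structural (excess) zero
    lam     : mean of the Poisson component
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    excess = pi_zero if k == 0 else 0.0
    return excess + (1 - pi_zero) * poisson


# Illustrative values: half the population are structural zeros.
for k in range(5):
    print(k, zip_pmf(k, pi_zero=0.5, lam=2.0))
```

The probability at zero combines the structural zeros with the zeros the Poisson component would produce anyway, which is why the two parts are modeled separately.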

    Ray

    -------------------------------------------
    Raymond Hoffmann
    Professor
    Medical College of Wisconsin
    -------------------------------------------