ASA Connect

Survival Analysis. Proportion of Censored Cases.

1. Survival Analysis. Proportion of Censored Cases.

0 Recommend
Evelio Velis
Posted 02-23-2016 15:34
Hello everyone!

We are performing a survival analysis for a group of ovarian cancer patients (n=500). We are trying to identify significant differences between ethnic groups regarding survival time.

The exact "date of death" is known for only 35 patients (7%). For the remaining 93%, only the "date of last contact" has been recorded, so those patients are treated as "censored" in the analysis.

Here is my question:

Should we be concerned about the proportion of censored cases (93%)?

Is there any known "rule" regarding the maximum proportion of censored cases we should include in the analysis?

Any reference I could use?

Thanks a lot. I would really appreciate your advice/comments.

Evelio Velis, MD, PhD
Director and Associate Professor

Master of Science in Health Services Administration

MSHSA-MPH Dual Degree Programs
College of Nursing and Health Sciences
Barry University
11300 NE 2nd Avenue
Miami, FL 33161-6695
P: 305-899-4089 F: 305-899-3543
www.barry.edu
2. RE: Survival Analysis. Proportion of Censored Cases.

1 Recommend
Colleen Kelly
Posted 02-24-2016 01:41
In general, you should worry about 93% censored observations. If all of these patients were censored at follow-up time X (say 1 year), then you would not have much information in the data, and would not be able to calculate the median survival, for example. All you would know is that the median is greater than 1 year. But, the amount of information in your data will depend on the distribution of the "date of last contact". If the length of follow-up varies widely over patients, you will have more information than the simple example above.
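
To make the fixed-cutoff scenario concrete, here is a minimal Kaplan-Meier sketch in plain Python. The ten patients, their times, and the function names `kaplan_meier` and `median_survival` are made up for illustration and have nothing to do with the actual ovarian cancer data. Everyone without an observed death is censored at 12 months, so the estimated curve never reaches 0.5 and the median cannot be estimated.

```python
def kaplan_meier(times, events):
    """Return (time, n_at_risk, n_deaths, survival) at each distinct death time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    table = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t (deaths and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            table.append((t, n_at_risk, deaths, survival))
        n_at_risk -= leaving
    return table


def median_survival(table):
    """First death time at which S(t) <= 0.5, or None if the curve never gets there."""
    for t, _, _, s in table:
        if s <= 0.5:
            return t
    return None


# Hypothetical data: 1 = death observed, 0 = censored.
# All survivors are censored at the same 12-month cutoff.
times_fixed = [3, 8, 12, 12, 12, 12, 12, 12, 12, 12]
events_fixed = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

km_fixed = kaplan_meier(times_fixed, events_fixed)
for row in km_fixed:
    print(row)
print("median:", median_survival(km_fixed))
```

The `n_at_risk` column is the same quantity one would print under a Kaplan-Meier figure to show how many patients support each part of the curve.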

--
Colleen Kelly Ph.D., PStat
Kelly Statistical Consulting
cell: 760-846-6763
www.kellystatisticalconsulting.com

3. RE: Survival Analysis. Proportion of Censored Cases.

1 Recommend
Andrew Althouse
Posted 02-24-2016 09:22
I do not believe that there is a rule of thumb regarding the proportion of censored cases for a survival analysis - plenty of medical research papers publish survival analyses with as little as 10 percent of their cohort having reached the primary endpoint. However, a few things that stand out to me from your description:

1) You will probably start your analysis by producing Kaplan-Meier curves to illustrate the survival in your ethnic groups. You should make sure to display the N remaining within each group along the bottom of your Figure so the reader can place the estimates in proper context.

2) From there, I expect that you will create a Cox proportional-hazards model with the ethnic groups as your primary predictor variable of interest, and then possibly adjust for potential confounding variables (age at diagnosis, stage of cancer, tumor size, etc). One (very, very loose) rule of thumb in multivariate modeling is to allow no more than one variable/parameter in your model for every 10 "events" - so with only 35 known events, you're likely to be very limited in the number of covariates that you may consider. If you are testing for differences between ethnic groups and wish to adjust for age, cancer stage, etc., you're going to run out of room for additional parameters very quickly, so you should carefully choose which confounding variables to include (or whether they are even necessary at all - for example, if the distribution of "age at diagnosis" is nearly identical in the different ethnic groups, you may choose to omit that from your model, as it should not confound the relationship between ethnic group and survival).

3) You do not mention how many ethnic groups you are considering or the proportion of patients within each group, but this is one cause for concern that I see. For example, suppose that one of your ethnic groups of interest includes only 20 patients, with 3 deaths. It will be very hard to know whether there is a true difference in survival between that ethnic group and the other ethnic groups within your study population, particularly if you attempt to adjust for confounding variables. The term "power" gets thrown around recklessly far too much, but this would be a case where even if there is a true difference in survival between that group and the remainder of your cohort, you are likely insufficiently powered to determine whether that difference is likely to be real.
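
As a quick arithmetic check on the "one parameter per 10 events" rule of thumb, the sketch below counts the parameter budget for 35 events against a hypothetical model. The number of ethnic groups and stage levels is not given in the original post, so the entries in `candidate_terms` are assumptions chosen only to show the bookkeeping.

```python
def max_parameters(n_events, events_per_parameter=10):
    """Loose upper bound on model parameters under an events-per-parameter rule."""
    return n_events // events_per_parameter


n_events = 35
budget = max_parameters(n_events)

# Hypothetical model terms and the number of parameters each would use.
# A categorical variable with k levels needs k - 1 indicator variables.
candidate_terms = {
    "ethnic_group (assumed 4 levels)": 3,
    "age_at_diagnosis (continuous)": 1,
    "cancer_stage (assumed 4 levels)": 3,
}
needed = sum(candidate_terms.values())

print("parameter budget:", budget)
print("parameters needed:", needed)
print("within budget:", needed <= budget)
```

Under these assumed levels, the ethnic-group term alone uses the whole budget before any confounder is added.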

Finally, I will slightly redirect a statement of Colleen's from above: I don't think it's of tremendous concern that the authors cannot estimate a "median survival" (although I will share a humorous anecdote with you here - I have previously worked with a group of surgeons that wished to estimate "median" survival in a large cohort of patients when far less than 50 percent of their patients had died, and they just absolutely could not understand why it was impractical to estimate the median survival. It was like trying to explain that 2+2=4 to people that just kept asking "But why isn't it 5?"). Sure, it would be nice to state that the median survival in (Group A) is significantly different from the median survival in (Group B), but as noted above, many papers all around the medical literature use survival analyses with far less than half of their cohort meeting the primary endpoint. However, Colleen's statement does get at the point that with so few of your patients having reached the endpoint, it is likely that even if there ARE differences in survival between ethnic groups, you may not have sufficient data to determine that. You run the risk of performing this analysis and showing a "non-significant" result even if there is an elevated risk in one ethnic group vs. the others simply because you have so few events, so make sure to interpret your findings in proper context.
------------------------------
Andrew D. Althouse, PhD
Supervisor of Statistical Projects
UPMC Heart & Vascular Institute
Presbyterian Hospital, Office C701
Phone: 412-802-6811
Email: althousead@upmc.edu

4. RE: Survival Analysis. Proportion of Censored Cases.

2 Recommend
William Meeker
Posted 02-25-2016 08:52
I would be more concerned about the meaning of "Date of Last Contact" and its relationship to the "Date of Death."

Think of this as a competing risk problem where the beginning of time is date of diagnosis and there are two possible endpoints: death and loss of contact. If the times of those two events are independent, then the standard methods of comparison with censored data would be valid (although the power of any test would be limited because of the small number of deaths). For example, if the date of last contact is the date that the data was analyzed for all subjects, then there would be no problem.

If, on the other hand, there is a strong correlation between the time to no contact and the time of death random variables, then things are more complicated, as censoring is informative. For example, if the date of last contact results from an unreported death or other reason related to health, the two random variables would not be independent. Furthermore, if the nature (i.e., strength) of the correlation depends on the covariates, then seriously misleading comparisons between groups could result.

It is important to know why contact was terminated before death.
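
A small simulation can illustrate the difference between the two situations. Everything here is hypothetical: exponential death times, a uniform independent censoring time in scenario A, and in scenario B a made-up mechanism where 70% of patients drop out of contact shortly before they die (at 80% of their death time), so their deaths are never recorded. The helper `km_survival_at` is a plain Kaplan-Meier estimate written for this sketch, not a library function.

```python
import math
import random


def km_survival_at(times, events, t_eval):
    """Kaplan-Meier estimate of S(t_eval); deaths precede censorings at tied times."""
    data = sorted(zip(times, events), key=lambda pair: (pair[0], -pair[1]))
    n_at_risk = len(data)
    survival = 1.0
    for time, event in data:
        if time > t_eval:
            break
        if event:
            survival *= 1.0 - 1.0 / n_at_risk
        n_at_risk -= 1
    return survival


rng = random.Random(2016)  # fixed seed so the sketch is reproducible
n = 5000
rate = 0.1      # assumed death rate per year
t_eval = 5.0    # evaluate survival at 5 years
true_survival = math.exp(-rate * t_eval)
death_times = [rng.expovariate(rate) for _ in range(n)]

# Scenario A: censoring time is independent of the death time.
times_a, events_a = [], []
for t in death_times:
    c = rng.uniform(0.0, 10.0)
    times_a.append(min(t, c))
    events_a.append(1 if t <= c else 0)

# Scenario B: informative censoring, contact is lost shortly before death.
times_b, events_b = [], []
for t in death_times:
    if rng.random() < 0.7:
        times_b.append(0.8 * t)
        events_b.append(0)
    else:
        times_b.append(t)
        events_b.append(1)

est_a = km_survival_at(times_a, events_a, t_eval)
est_b = km_survival_at(times_b, events_b, t_eval)
print("true S(5):", round(true_survival, 3))
print("independent censoring:", round(est_a, 3))
print("informative censoring:", round(est_b, 3))
```

In this setup the independent-censoring estimate should land near the true value, while the informative-censoring estimate is far too optimistic, because the patients lost to contact are exactly the ones about to die.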

William Q. Meeker
Department of Statistics
2109 Snedecor Hall
Iowa State University
Ames, Iowa 50011
Phone: 515-294-5336
Fax: 515-294-4040
Home Fax: 515-232-1323
www.public.iastate.edu/~wqmeeker

5. RE: Survival Analysis. Proportion of Censored Cases.

0 Recommend
James Breneman
Posted 02-24-2016 12:07
I would suggest using Weibull analysis for your data as well, and then comparing your non-parametric results to the Weibull distribution results. After looking at a number of medical datasets (mostly small, n < 500), I have found that Weibull analysis does a good job of modeling the "failure" distribution (1 - survival). The basic tenets of the Weibull distribution are met by most medical examples (weakest link theory), and Weibull analysis handles "censored" data easily. Most of the major stat packages do Weibull analysis. Just a parametric approach that seems to work very well for me.
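
For readers who want to see how a Weibull fit uses censored observations, here is a minimal maximum-likelihood sketch in plain Python. The data are invented, and `fit_weibull` and `weibull_loglik` are names made up for this example; in practice one would use a stat package. Observed deaths contribute the log density, censored patients contribute only the log survival term, and the shape parameter is found by bisection on the profile score equation.

```python
import math


def weibull_loglik(shape, scale, times, events):
    """Right-censored Weibull log-likelihood (event = 1 death, 0 censored)."""
    total = 0.0
    for t, e in zip(times, events):
        if e:
            # log density without the survival term
            total += math.log(shape / scale) + (shape - 1.0) * math.log(t / scale)
        total -= (t / scale) ** shape  # log survival term, shared by all patients
    return total


def fit_weibull(times, events, lo=0.01, hi=100.0, iterations=200):
    """Maximum-likelihood shape and scale; assumes the root lies in [lo, hi]."""
    c = max(times)
    scaled = [t / c for t in times]  # rescale to avoid overflow in t ** shape
    d = sum(events)
    mean_log_event = sum(math.log(s) for s, e in zip(scaled, events) if e) / d

    def score(shape):
        powers = [s ** shape for s in scaled]
        weighted = sum(p * math.log(s) for p, s in zip(powers, scaled)) / sum(powers)
        return weighted - 1.0 / shape - mean_log_event

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    scale = c * (sum(s ** shape for s in scaled) / d) ** (1.0 / shape)
    return shape, scale


# Hypothetical follow-up times in months; 1 = death observed, 0 = censored.
times = [2, 5, 7, 11, 14, 18, 24, 24, 30, 36, 36, 36]
events = [1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0]

shape_hat, scale_hat = fit_weibull(times, events)
print("shape:", shape_hat, "scale:", scale_hat)
print("log-likelihood:", weibull_loglik(shape_hat, scale_hat, times, events))
```

With very few events the fitted shape and scale will be imprecise, so the parametric curve is best read alongside the Kaplan-Meier estimate rather than in place of it.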
------------------------------
James Breneman
SAE Fellow
Pratt & Whitney Fellow- Statistics, Risk Analysis
Professional Statistician

6. RE: Survival Analysis. Proportion of Censored Cases.

0 Recommend
James Schlosser
Posted 02-25-2016 10:39
One other issue you may want to examine is whether the censoring patterns over time are similar among the ethnic groups. If not, it may indicate bias in following up on the status of the patients.
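
One simple way to start looking at this is a per-group summary of how many patients are censored and how long the censored patients were followed. The sketch below uses invented records and a hypothetical helper `censoring_summary`; the group labels "A" and "B" are placeholders.

```python
from collections import defaultdict
from statistics import median


def censoring_summary(records):
    """records: iterable of (group, follow_up_time, event) with event 1 = death."""
    by_group = defaultdict(list)
    for group, time, event in records:
        by_group[group].append((time, event))

    summary = {}
    for group, rows in by_group.items():
        censored_times = [t for t, e in rows if e == 0]
        summary[group] = {
            "n": len(rows),
            "prop_censored": len(censored_times) / len(rows),
            "median_followup_censored": median(censored_times) if censored_times else None,
        }
    return summary


# Hypothetical records: (group, months of follow-up, event indicator).
records = [
    ("A", 30, 0), ("A", 28, 0), ("A", 12, 1), ("A", 33, 0),
    ("B", 6, 0), ("B", 4, 0), ("B", 9, 1), ("B", 5, 0),
]

summary = censoring_summary(records)
for group, stats in sorted(summary.items()):
    print(group, stats)
```

In this made-up example both groups have the same proportion censored, but the censored patients in group B were followed for a much shorter time, which is the kind of imbalance worth investigating.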
------------------------------
James Schlosser
Business Analyst
jim_schlosser@yahoo.com

7. RE: Survival Analysis. Proportion of Censored Cases.

0 Recommend
Jonathan Siegel
Posted 02-26-2016 07:10
In general, the key quantity of concern in designing survival analysis trials is the number of events rather than the percent censored. For low-risk events, it is simply necessary to have a high proportion censored at trial end in order to have a trial with a reasonable duration, especially in a commercial pharmaceutical setting where products have patent lives placing caps on feasible trials. Under the assumptions of proportional hazards, events are the basis for evaluating design power. I would therefore focus on the 35 events.

There are three basic questions:

1. Is the number of events sufficient to detect a meaningful difference? This is a power calculation, which you should do before selecting a sample size. 35 events are very likely insufficient for a meaningful comparison between groups.

2. Is the number of events sufficient to be consistent with "large numbers" assumptions? Proportional hazards assumes that sample and event sizes are large enough for "large numbers" (normal statistics) assumptions to work. When events are rare, a larger sample size is needed for this to be the case. 20 events per group (40 total) is a rule of thumb minimum here, so this may be an issue.

3. Is the number of events sufficient to check the proportional hazards assumption? In a rigorous trial, the assumption is checked where possible, typically by dividing the study duration into intervals and calculating the hazards in each interval. This requires a much larger number of events. It's often not possible to check the proportional hazards assumption in a rare event context.
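
For question 1, a rough sense of scale can be had from Schoenfeld's approximation for a two-group log-rank or Cox comparison under proportional hazards. This is a sketch under assumptions that are not in the original post: only two groups, equal allocation by default, a two-sided 5% test, and an illustrative hazard ratio of 2. The function names `required_events` and `approximate_power` are made up for this example.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def required_events(hazard_ratio, alpha=0.05, power=0.80, prop_group1=0.5):
    """Schoenfeld approximation: events needed to detect a given hazard ratio."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_beta = _STD_NORMAL.inv_cdf(power)
    p1, p2 = prop_group1, 1.0 - prop_group1
    return math.ceil((z_alpha + z_beta) ** 2 / (p1 * p2 * math.log(hazard_ratio) ** 2))


def approximate_power(n_events, hazard_ratio, alpha=0.05, prop_group1=0.5):
    """Approximate power of a two-sided test with a fixed number of events."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    p1, p2 = prop_group1, 1.0 - prop_group1
    signal = math.sqrt(n_events * p1 * p2) * abs(math.log(hazard_ratio))
    return _STD_NORMAL.cdf(signal - z_alpha)


# Illustrative inputs only: the true hazard ratio and group sizes are unknown here.
print("events needed for HR = 2:", required_events(2.0))
print("power with 35 events, HR = 2:", round(approximate_power(35, 2.0), 3))
print("power with 35 events, HR = 2, 20/80 split:",
      round(approximate_power(35, 2.0, prop_group1=0.2), 3))
```

Under these assumptions, even a doubling of the hazard would need roughly twice as many events as the 35 observed to reach 80% power, and unequal group sizes or more than two groups make the picture worse.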

Both Hosmer's and Lee's books on applied survival analysis would be good references, as well as Collett's Modelling Survival Data in Medical Research.

Final notes: When data are not randomized, such as when evaluating differences between baseline factors, it is preferable to use an estimation approach and evaluate e.g. the hazard ratio between the factors, rather than performing a formal hypothesis test. The proportional hazards model has been criticized, and more complicated and arguably better approaches to estimation have been proposed (the accelerated failure model provides a better causal interpretation), but this is an advanced topic.

Jonathan Siegel

Associate Director Clinical Statistics

Bayer HealthCare
------------------------------
Jonathan Siegel
Senior Research Biostatistician

8. RE: Survival Analysis. Proportion of Censored Cases.

0 Recommend
Marc Bourdeau
Posted 02-26-2016 09:10
To Jonathan Siegel.

You write about an «accelerated failure model». I have not seen an accelerated failure model used in the biomedical context. Could you give a reference?

Thank you for your elaborate note!
------------------------------
Marc Bourdeau
École Polytechnique de Montréal

9. RE: Survival Analysis. Proportion of Censored Cases.

0 Recommend
Jonathan Siegel
Posted 02-29-2016 08:04
Dear Marc,

For a criticism of the proportional hazards model and advocacy of the accelerated failure model for estimation, see e.g. Aalen et al., Does Cox analysis of a randomized survival study yield a causal treatment effect? Lifetime Data Analysis 21:579-593 (2015). This critique has not of course resulted in displacement of the proportional hazards model in general biomedical use.

Additional note: Agree high censoring will result in being unable to estimate the medians. However, this is not generally a problem, as quantities like hazard ratios are used instead in this context. Medians have also been criticized. They only describe the relationship between the groups at a single point. When the trial does not cover the whole of the survival distribution, one must rely on assumptions to extrapolate that one point to the entirety of the survival behavior.
------------------------------
Jonathan Siegel
Associate Director Clinical Statistics
Bayer HealthCare

10. RE: Survival Analysis. Proportion of Censored Cases.

0 Recommend
Eric Siegel
Posted 02-26-2016 21:16
I have a couple of questions:

(1) Are the dates of death recorded in one column and the dates of last contact recorded in a different column? Or are they recorded in the same column with a vital-status indicator nearby?

(2) What is the median follow-up on the patients who are still alive? And what are typical literature values for median survival in ovarian cancer?
------------------------------
Eric Siegel, MS
Research Associate
Department of Biostatistics
Univ. Arkansas Medical Sciences

11. RE: Survival Analysis. Proportion of Censored Cases.

0 Recommend
Eric Siegel
Posted 02-26-2016 21:21
There is a third question that I want to ask:

(3) Among the patients who do not have a date of death, what is the median time from today back to their date of last contact?
------------------------------
Eric Siegel, MS
Research Associate
Department of Biostatistics
Univ. Arkansas Medical Sciences

12. RE: Survival Analysis. Proportion of Censored Cases.

0 Recommend
Evelio Velis
Posted 03-02-2016 16:40
Hello everyone!

I would like to thank all of you for your very useful comments and suggestions. I've been trying to respond to some of your questions individually.

Thanks a LOT. Evelio Velis.
------------------------------
Evelio Velis
Director and Associate Professor
Barry University/College of Health Sciences

