ASA Connect


Adjusted Survival Curves

  • 1.  Adjusted Survival Curves

    Posted 10-17-2016 21:07

    In a survival analysis setting using Cox regression, one often constructs plots of adjusted survival curves. These plots involve setting some of the variables in the Cox regression model to their "average" value. For example, if the model includes the variables Treatment and Gender, Gender is set to its average value (e.g., proportion of Males in the data) so that we can plot the gender-adjusted survival curves for patients in the Novel Treatment and those in the Standard Treatment groups.

    Here is what I don't understand about this method: 

    1. Does it assume that we are comparing the survival behaviour of patients with an "average" value for Gender across the two treatment groups?  What does it even mean for patients to have an "average" value for Gender? 

    2. Does it assume that we are comparing the survival behaviour of patients placed on the two treatments by "averaging out" the effect of Gender?

    3.  What if Gender had 3 levels (e.g., Male, Female, Undeclared)? If we code the Gender effect via two dummy variables (say, D1 and D2, with D1 comparing Female vs Male and D2 comparing Undeclared vs Male), do we set the values of D1 and D2 to the proportion of Females and Undeclared values in the data?  And how do we interpret the gender-adjusted curves (i.e., to what kind of patients do they apply)?

    These may be "stupid questions", but I would appreciate any insights.

    Thanks,

    Isabella

    ------------------------------
    Isabella Ghement
    Ghement Statistical Consulting Company Ltd.
    ------------------------------


  • 2.  RE: Adjusted Survival Curves

    Posted 10-18-2016 00:09
    Edited by David Norris 10-18-2016 01:30

    What statistical software allows or encourages you to 'adjust to' interpolations between the values of a categorical variable? Frank Harrell's R package 'rms' will typically choose modes of categorical variables as default 'adjust-to' values.

    Sometimes in mathematics, the formalism seems to give rise to apparently meaningless constructs that later turn out to be fantastically useful. The most obvious example of this is √(-1), but there are others like non-Euclidean geometries.

    In statistics, on the other hand, I would suggest that this does not happen! Our formalisms might suggest seductive ideas we feel compelled to 'interpret', but we may be better off resisting such temptations. Just because binary variables can be 'coded' as being from {0,1}, and that set is a subset of a larger set like [0,1] or even all of ℜ, this need not induce a search for individuals with a 'gender' of 0.5 or π or -14. Statistics is not physics.

    Kind regards,

    ------------------------------
    David C. Norris, MD
    David Norris Consulting, LLC
    Seattle, WA



  • 3.  RE: Adjusted Survival Curves

    Posted 10-18-2016 01:06

    Hi David, 

    Thanks for your thoughtful response.   I agree with you that "averaging" binary variables doesn't make sense from an interpretation point of view (hence my original questions). I may be off about this, but I think the "effects" package in R uses this "averaging" argument to produce effect plots (not in a survival setting, but in a similar situation where some model variables have to be set to 'typical' values in order to visualize the effects of other variables in the model on the response). 

    In my own setting, I actually have to produce adjusted survival curves based on N = 20 imputed data sets and a Cox regression model which includes 5 categorical predictors. I guess I could build adjusted survival curves for the levels of each predictor variable in turn while setting the values of the other predictors in the model to their modal values, as in the rms package. (The other option would be to define some 'prototypical' patients and compute adjusted survival curves for those. But there are so many options that it's hard to come up with a select few.) The adjusted survival curves would then be "averaged" across imputed data sets in an appropriate manner.

    I just wanted to get a sense of how other people tend to go about a task like this and make sure that the end result is interpretable.

    All the best, 

    Isabella

    ------------------------------
    Isabella Ghement
    Ghement Statistical Consulting Company Ltd.



  • 4.  RE: Adjusted Survival Curves

    Posted 10-18-2016 02:43

    Hi Isabella, 

    One potential solution to the (valid, in my view) problems you raise is to adjust survival curves using inverse probability weights, instead of via a conditionally adjusted proportional hazards model:

    "Adjusted survival curves with inverse probability weights" (PubMed)

    After implementing IPW, the curves can be interpreted as standardized to the total population.
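
    To make the IPW idea concrete, here is a minimal sketch. It is a simplified illustration and not necessarily the exact estimator of the cited paper. The data, the field names (time, event, trt, sex) and the helper functions ipw_weights and weighted_km are all hypothetical. Each subject is weighted by the inverse of the estimated probability of receiving their own treatment given sex, and a weighted Kaplan-Meier curve is then computed per arm.

```python
from collections import defaultdict

# Hypothetical data: sex is a confounder that is unevenly distributed across the arms.
records = [
    {"time": 4, "event": 1, "trt": "novel", "sex": "M"},
    {"time": 9, "event": 0, "trt": "novel", "sex": "M"},
    {"time": 12, "event": 1, "trt": "novel", "sex": "M"},
    {"time": 15, "event": 1, "trt": "novel", "sex": "F"},
    {"time": 3, "event": 1, "trt": "standard", "sex": "M"},
    {"time": 6, "event": 1, "trt": "standard", "sex": "F"},
    {"time": 10, "event": 0, "trt": "standard", "sex": "F"},
    {"time": 14, "event": 1, "trt": "standard", "sex": "F"},
]


def ipw_weights(rows):
    """Weight each subject by 1 / P(own treatment | sex), using within-stratum proportions."""
    n_stratum, n_cell = defaultdict(int), defaultdict(int)
    for r in rows:
        n_stratum[r["sex"]] += 1
        n_cell[(r["sex"], r["trt"])] += 1
    return [n_stratum[r["sex"]] / n_cell[(r["sex"], r["trt"])] for r in rows]


def weighted_km(times, events, weights):
    """Weighted Kaplan-Meier estimate, returned as a list of (event_time, survival) pairs."""
    curve, surv = [], 1.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        # Weighted number at risk just before t, and weighted number failing at t.
        at_risk = sum(w for ti, w in zip(times, weights) if ti >= t)
        failed = sum(w for ti, e, w in zip(times, events, weights) if ti == t and e)
        surv *= 1.0 - failed / at_risk
        curve.append((t, surv))
    return curve


weights = ipw_weights(records)
for arm in ("novel", "standard"):
    idx = [i for i, r in enumerate(records) if r["trt"] == arm]
    curve = weighted_km(
        [records[i]["time"] for i in idx],
        [records[i]["event"] for i in idx],
        [weights[i] for i in idx],
    )
    print(arm, [(t, round(s, 3)) for t, s in curve])
```

    With these weights, each arm's weighted sample has the sex distribution of the whole cohort, which is what "standardized to the total population" refers to. In practice the treatment probabilities would usually come from a fitted model (for example a logistic regression) rather than from raw cell proportions.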

    Best, 

    Ashley

    ------------------------------
    Ashley Naimi
    University of Pittsburgh



  • 5.  RE: Adjusted Survival Curves

    Posted 10-18-2016 07:08
    Hi All,

    One can construct adjusted survival curves based on average survival experience and not on average covariate values. See "Adjusting Survival Curves for Confounders: A Review and a New Method" by Nieto and Coresh.

    Vatsala.

    ------------------------------
    Vatsala Karwe, Ph.D.



  • 6.  RE: Adjusted Survival Curves

    Posted 10-18-2016 07:47
    These are insightful questions; far too many people plot such curves and assume that they
    are "adjusted curves" for some population. They are not, and the problem has been
    rediscovered multiple times. See chapter 10 of Therneau and Grambsch, Modeling Survival
    Data, Springer 2000 for references and a lengthy explanation of the issues. In a nutshell,
    mean(f(x)) != f(mean(x)); the left side is what you want and the right is what the method
    you describe computes (for a fairly complex function f). See the "Adjusted Survival Curves"
    vignette in the R package 'survival' for a discussion of how to create population average
    curves correctly. (If you are using R you can view vignettes within the package,
    otherwise http://mirror.las.iastate.edu/CRAN/web/packages/survival/vignettes/adjcurve.pdf
    or other CRAN mirrors.)
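
    The inequality mean(f(x)) != f(mean(x)) can be checked numerically with a toy example. The sketch below assumes a constant baseline hazard and a single binary covariate with a Cox-type effect; the cohort, the coefficient and the baseline rate are made-up values chosen only for illustration.

```python
import math


def survival(t, x, beta=1.0, base_rate=0.1):
    """S(t | x) = exp(-base_rate * t * exp(beta * x)): constant baseline hazard, Cox-type effect."""
    return math.exp(-base_rate * t * math.exp(beta * x))


# Hypothetical cohort: 40 males (x = 1) and 60 females (x = 0).
cohort = [1] * 40 + [0] * 60
mean_x = sum(cohort) / len(cohort)


def curve_at_mean_covariate(t):
    """f(mean(x)): the curve for a fictitious subject who is '40% male'."""
    return survival(t, mean_x)


def population_average_curve(t):
    """mean(f(x)): the average of the individual curves over the cohort."""
    return sum(survival(t, x) for x in cohort) / len(cohort)


for t in (1, 5, 10, 20):
    print(t, round(curve_at_mean_covariate(t), 4), round(population_average_curve(t), 4))
```

    The two columns differ at every t > 0, and in this toy setup the sign of the gap changes over time, so the curve at the mean covariate is not a uniformly conservative or optimistic stand-in for the population average.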

    Terry Therneau
    Mayo Clinic


  • 7.  RE: Adjusted Survival Curves

    Posted 12-12-2016 15:38

    Thank you very much to everyone who responded to my initial inquiry about adjusted survival curves.  

    Terry, I finally had a chance to review your excellent CRAN vignette regarding adjusted survival curves and wanted to follow up with a few questions.  

    Let's say I am interested in constructing an adjusted "conditional" survival curve using the methodology in the vignette.  The way I understand this methodology, it implies two steps: 1) model and 2) balance.  

    For 1), a Cox proportional hazards regression model is fitted to the sample data. For 2), the model is used to obtain a survival curve for each individual in the sample (i.e., for each configuration of predictor variables represented in the sample) and then the adjusted "conditional" survival curve is obtained by performing a simple averaging of the obtained survival curves.  

    Now, let's say that the Cox proportional hazards regression model includes 3 predictors (i.e., X1, X2 and X3) such that X1 and X2 are both categorical with two levels each (i.e., levels a and b for X1; levels A and B for X2) and X3 is continuous.  Let's also say that we are interested in a variation of the adjusted "conditional" survival curves that would enable us to visualize the effect of each of these predictor variables in turn on survival (rather than the combined effect of all three predictor variables on overall survival).

    For the effect of X1, we could separate our sample into two sub-samples:  i) subjects where X1 = a and ii) subjects where X1 = b.   We could then use the Cox proportional hazards regression model to separately construct survival curves for all configurations of values for X2 and X3 present in each of the two sub-samples.  Simple averaging of those survival curves across the subjects in each sub-sample would yield two adjusted "conditional" survival curves  - one for each level of X1.          

    For the effect of X2, we would proceed in a similar fashion as described for X1, except that the two sub-samples would correspond to subjects where X2 = A and X2 = B, respectively, and the construction of survival curves would pertain to all configurations of values of X1 and X3 present in the two sub-samples. 

    For the effect of X3, things are trickier if we only care about some pre-specified values of X3 (rather than all observed values of X3 present in the sample).  For example, if X3 stands for Age, the pre-specified values of Age might be 30, 40 and 50 years.  For a relatively small study, it's entirely possible that none of the subjects in the study actually have any of those ages (or that only some of those ages are observed). So the question is: How would we proceed in this situation, especially when it comes to looking at the effects of X1 and X2 on survival?

    For this situation, could we use a slightly different reasoning where, instead of averaging survival curves over all subjects in the sample with various observed configurations of predictor variables, we would average survival curves over "prototypical" subjects with idealized configurations of predictor variables?  In the context of the example given, we could define the following "prototypical" subjects:

        X1 = a,  X2 = A, X3 = 30  (where X3 = Age)

        X1 = a,  X2 = B, X3 = 30

        X1 = b, X2 = A, X3 = 30

        X1 = b, X2 = B, X3 = 30

        X1 = a,  X2 = A, X3 = 40     

        etc. 

    Each of these idealized configurations of predictor variables would yield a single survival curve.  Eliciting the effect of X1 would mean averaging the survival curves corresponding to subjects with (i) X1 = a, for whom X2 can be either A or B and X3 can be 30, 40 or 50, and (ii) X1 = b, for whom X2 can be either A or B and X3 can be 30, 40 or 50.  This would yield average survival curves across "prototypical" subjects with either X1 = a or X1 = b, for whom X2 can be either A or B and X3 can be 30, 40 or 50 (rather than average survival curves in the entire cohort of patients with either X1 = a or X1 = b).

    Does this make sense?  And if it doesn't make sense, is there a better way to deal with adjusted survival curves in situations where one of the predictor variables is continuous and we are interested in learning more about its effect on survival by comparing adjusted survival curves at some of its pre-specified values?  

    Thanks a lot,

    Isabella  

    ------------------------------
    Isabella R. Ghement, Ph.D.
    Ghement Statistical Consulting Company Ltd.
    E-mail: isabella@ghement.ca
    Web: www.ghement.ca
    Tel: 604-767-1250



  • 8.  RE: Adjusted Survival Curves

    Posted 12-13-2016 13:16
    The key idea is that you create a population for the "other" covariates. So to compare
    those with X1=a and X1=b, create a data set "pop" containing some population for the other
    variables (X2 and X3). Then make two data sets temp1 = cbind(pop, X1='a') and temp2 =
    cbind(pop, X1='b'), and get the population survival curve for each of them using
    survexp(). Plot them together. The population can be any distribution you want, such
    that you can tell a sensible story about what the curves represent: the distribution found
    in the data set, one that is more balanced, less balanced, only old people, .... In
    response to your question of using "only prototypical" subjects as the population the
    answer is that of course you can. You simply need to be able to say what it represents.

    For your continuous variable X3, the first question is what population to use
    for X1 and X2, call it pop12, and the second is what curves you want to draw. An
    effective graph will choose a small number of "representative" values from X3. Too many
    and the plot is too busy, too few and it's not sufficiently varied. But each curve will be
    the same process: temp = cbind(pop12, X3 = some value); survexp(coxfit, newdata=temp).
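
    The recipe above (pick a population for the other covariates, fix the covariate of interest, average the predicted curves) can be sketched outside R as well. The Python sketch below is not the survexp() implementation; the coefficients in COEF, the baseline cumulative hazard and the population pop are hypothetical stand-ins for a fitted Cox model and a chosen reference population.

```python
import math

# Assumed (hypothetical) fitted Cox model: log hazard ratios for X1 = b, X2 = B,
# and per year of age (X3, centered at 50).
COEF = {"X1_b": 0.7, "X2_B": -0.4, "X3": 0.03}


def baseline_cumhaz(t):
    """Placeholder baseline cumulative hazard; a real analysis would use the fitted one."""
    return 0.05 * t


def linear_predictor(row):
    return (COEF["X1_b"] * (row["X1"] == "b")
            + COEF["X2_B"] * (row["X2"] == "B")
            + COEF["X3"] * (row["X3"] - 50))


def predicted_survival(row, t):
    """Model-based survival probability at time t for one covariate configuration."""
    return math.exp(-baseline_cumhaz(t) * math.exp(linear_predictor(row)))


def population_curve(pop, times, **fixed):
    """Set the covariates in `fixed` for every member of pop, then average the individual curves."""
    rows = [{**person, **fixed} for person in pop]
    return [sum(predicted_survival(r, t) for r in rows) / len(rows) for t in times]


# Reference population for the "other" covariates (X2 and X3); any defensible choice works.
pop = [
    {"X2": "A", "X3": 35},
    {"X2": "A", "X3": 62},
    {"X2": "B", "X3": 48},
    {"X2": "B", "X3": 71},
]
times = [0, 1, 2, 5, 10]
curve_a = population_curve(pop, times, X1="a")  # everyone set to X1 = a
curve_b = population_curve(pop, times, X1="b")  # the same people set to X1 = b
print(curve_a)
print(curve_b)
```

    For the continuous covariate the same function applies: with a population pop12 holding X1 and X2, a call such as population_curve(pop12, times, X3=40) gives the curve for one representative age. Because both curves are averaged over the same population, the difference between them reflects only the covariate that was fixed.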

    Terry Therneau





  • 9.  RE: Adjusted Survival Curves

    Posted 12-19-2016 14:25

    Terry, thank you so much for your clear and thoughtful response!   You have a gift for simplifying difficult concepts and relating them in a compelling way. I have learned a lot from your answers, as well as the answers of the other contributors on this thread.  

    Best wishes for the holidays,

    Isabella  

    ------------------------------
    Isabella R. Ghement, PhD
    Ghement Statistical Consulting Company Ltd.



  • 10.  RE: Adjusted Survival Curves

    Posted 10-18-2016 12:27
    It's not at all a stupid question. It does remind me, though, of the
    quote that the average person has one ovary and one testicle.

    You could conceptualize your prediction as a prediction of a cohort of
    patients rather than an individual patient. And you can certainly
    conceptualize the concept of being, for example, 40% male much more
    easily than conceptualizing the concept of an individual being 40% male.
    Or if you don't like that analogy, then think of it as the survival
    probability of an individual who was randomly selected from a cohort
    that is 40% male.

    It's not a perfect analogy because of the non-linear nature of survival
    models, but anyone who fusses about this is a nitpicker, in my opinion.

    Steve Simon, www.pmean.com




  • 11.  RE: Adjusted Survival Curves

    Posted 10-19-2016 11:18

    Steve Simon's answer pretty much nails it. The interpretation is not about an individual but about groups of individuals. Adjusting by gender (either 2 or 3 categories) would attempt to estimate survival behavior differences between the treatment groups where these groups have (hypothetically) similar gender breakdown (assuming that the form of the model, a linearized equation, is appropriate enough). It doesn't necessarily apply to a particular individual.

    ------------------------------
    Andres Azuero
    UAB



  • 12.  RE: Adjusted Survival Curves

    Posted 10-20-2016 09:46

    For survival models, the non-linear nature of the estimation is somewhat of a nitpick, as Steve Simon says, but for the logistic model the effect is huge, and one can end up with corrected probabilities that are quite a bit lower than the average of the probabilities.  This is why I prefer my method of adjustment, to the total population.
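
    A small numerical check of the logistic case, using made-up linear predictors for a cohort split evenly between a low-risk and a moderate-risk group. The values are hypothetical and were chosen so that the probabilities lie at or below 0.5; the direction of the gap depends on where the linear predictors fall.

```python
import math


def expit(eta):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical linear predictors: half the cohort at -4, half at 0.
linear_predictors = [-4.0] * 50 + [0.0] * 50

mean_eta = sum(linear_predictors) / len(linear_predictors)
prob_at_mean = expit(mean_eta)  # probability at the average covariates
mean_of_probs = sum(expit(e) for e in linear_predictors) / len(linear_predictors)

print(round(prob_at_mean, 3), round(mean_of_probs, 3))
```

    Here the probability evaluated at the average linear predictor is less than half of the average of the individual probabilities, which is the kind of discrepancy described above.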

     

     

     

    David A. Schoenfeld, Ph.D.

    Professor of Medicine, Harvard Medical School

    Professor in the Department of Biostatistics, Harvard School of Public Health

    50 Staniford Street

    Boston, Massachusetts 02114

    phone: 617-726-6111

    dschoenfeld@partners.org

     







  • 13.  RE: Adjusted Survival Curves

    Posted 10-18-2016 12:30

    Isabella,

    It's been a while since I've used Cox regression to construct plots of adjusted survival curves. But what I remember is this: when the plot procedure sets a variable such as Gender to its average value, that is a default setting, and one can override the default with something more sensible. I don't remember how to do that now, but it might be as simple as changing the dummy-variable coding from Effects parameterization to Reference parameterization.

    ---Eric







  • 14.  RE: Adjusted Survival Curves

    Posted 10-19-2016 13:41

    Averaging covariates isn't the right way to do it.  The problem is that the exponential is a concave function and the corrected survival curves will be lower than the survival curves for the whole population. This problem is worse with corrected values using the logistic model.

    I discuss this issue in my paper "Statistical design and analysis issues for the ARDS Clinical Trials Network: The Coordinating Center perspective" pp 286.

     

    Basically, if you had two treatment groups, what you want to estimate is the survival curves that would result if both treatments had been applied to the SAME population. For a linear model, using the average values of covariates would work, even considering the absurdity of a patient with average gender.  However, for a nonlinear model you need to predict the survival curve for each patient on each of the treatments, using the model you estimated, and then average those curves to get "corrected" survival curves.
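
    In symbols, the averaging step can be written as follows for a Cox model. The notation is an assumption introduced here for illustration: k indexes the treatment, z_i is the covariate vector of patient i (i = 1, ..., n), \hat H_0 is the estimated baseline cumulative hazard, and \hat\beta_k and \hat\gamma are the estimated treatment and covariate coefficients.

```latex
% Predicted curve for patient i if given treatment k, and the "corrected" curve for treatment k
\hat S_i^{(k)}(t) = \exp\!\left\{ -\hat H_0(t)\, \exp\!\left( \hat\beta_k + \hat\gamma^{\top} z_i \right) \right\},
\qquad
\bar S^{(k)}(t) = \frac{1}{n} \sum_{i=1}^{n} \hat S_i^{(k)}(t).

% The covariate-averaging shortcut instead plugs in \bar z = \frac{1}{n} \sum_i z_i:
\tilde S^{(k)}(t) = \exp\!\left\{ -\hat H_0(t)\, \exp\!\left( \hat\beta_k + \hat\gamma^{\top} \bar z \right) \right\},
\qquad \text{and in general } \tilde S^{(k)}(t) \neq \bar S^{(k)}(t).
```

    The sum runs over all n patients for every treatment k, so each corrected curve refers to the same population.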

     

     

     

    David A. Schoenfeld, Ph.D.

    Professor of Medicine, Harvard Medical School

    Professor in the Department of Biostatistics, Harvard School of Public Health

    50 Staniford Street

    Boston, Massachusetts 02114

    phone: 617-726-6111

    dschoenfeld@partners.org

     
