ASA Connect

  • 1.  Time data as discrete values

    Posted 01-02-2018 09:22
    Dear ASA Colleagues,

    Marketing has given me some data to analyze for possible claims. The data are given in hours (a continuous variable); however, the responses were limited to 1, 2, 4, 8, 12, 24, and 30 hours (discrete values). My question: can I analyze the data using discrete distributions, since the recorded values are discrete, instead of continuous distributions? Has anyone been in a similar situation? Looking forward to your comments.

    Best Regards,

    Shawn Currie


  • 2.  RE: Time data as discrete values

    Posted 01-03-2018 17:12

    This sounds like a study where subjects provide results at each point in one day up to 12 hours, go to sleep, and then are measured at 24 and 30 hours the following day. The measurement is a discrete event such as smelling a fragrance. The goal is to find a mean or median time to the event, such as the duration of a fragrance. One approach is an ordinal logistic model where each time point is a class compared to a reference. Another approach, used for discrete survival analysis, is a complementary log-log model. If you are only interested in a claim about one product, it is an intercept-only model. If you are comparing two products, then there is one binary independent variable.

     

    The problem is that these methods assume equal intervals and would reduce the specific times to seven ordered points. Can anyone think of an approach for unequal intervals in discrete survival analysis?
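As a hedged sketch of the complementary log-log approach: the usual first step is expanding the data into "person-period" form, one record per subject per at-risk interval. Fitting a binomial GLM with a cloglog link and a separate dummy variable per time point then makes no equal-spacing assumption, which may partly address the unequal-interval concern. The subject IDs, times, and grid below are hypothetical, borrowed from the original post:

```python
# Hypothetical sketch: expand event-time data into "person-period" form,
# the layout used for discrete-time survival models (e.g. a binomial GLM
# with a complementary log-log link). Assumes event times fall on the grid.

GRID = [1, 2, 4, 8, 12, 24, 30]  # the scheduled assessment times (hours)

def person_period(subjects):
    """subjects: list of (id, event_time, observed) tuples.
    Returns one row per subject-interval: (id, interval_time, event)."""
    rows = []
    for sid, t, observed in subjects:
        for g in GRID:
            if g < t:
                rows.append((sid, g, 0))            # still at risk, no event yet
            elif g == t:
                rows.append((sid, g, 1 if observed else 0))
                break                               # subject leaves the risk set
    return rows

data = [("A", 4, True),    # event detected at the 4-hour assessment
        ("B", 12, True),
        ("C", 30, False)]  # censored: no event by the last assessment

rows = person_period(data)
# Regressing the event column on time-point dummies with a cloglog link
# gives the discrete-time proportional-hazards model; because each time
# point gets its own parameter, the intervals need not be equal.
```

The per-interval dummies are what carry the unequal spacing: the model estimates a separate baseline hazard for each assessment rather than assuming anything about the gaps between them.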



    ------------------------------
    Georgette Asherman
    ------------------------------



  • 3.  RE: Time data as discrete values

    Posted 01-03-2018 17:15
    Is time an independent variable or a result?  If the former, they may have simply chosen convenient times to take data.  If the latter, the values might merely be rounded.  Either way, it seems appropriate to treat them as continuous because, for example, 12 is truly halfway between 8 and 16.  Treating them as ordinal or character values will throw away potentially valuable information.

    Keep in mind that residual plots and distributions may look somewhat choppy, but that shouldn't really be a problem.  You might also get some deceptively significant diagnostic tests.

    ------------------------------
    Emil M Friedman, PhD
    emilfriedman@gmail.com
    http://www.statisticalconsulting.org
    ------------------------------



  • 4.  RE: Time data as discrete values

    Posted 01-04-2018 20:21
    There are a number of situations in clinical trials that have some but not complete analogy to yours. Clinic visits have to be scheduled at discrete intervals, and blood and similar samples have to be taken at discrete timepoints, but the questions clinical trials collect this data to answer are typically continuous in character, e.g. time to progression or the parameters of a PK model.

    Continuous models are often used all the same, but they suffer from bias when used with data that are essentially discrete in character.

    In these situations, I often use the traditional continuous approach, but do sensitivity models (e.g. for power calculations) taking into account the discreteness of the data, and have attempted to train myself to look for situations where discreteness bias renders the apparent results misleading and unreliable.

    There are a couple of rules of thumb:

    A. Bias of a point estimate analyzed as continuous is generally manageable when the interval between assessments is small compared to the estimate, but discreteness may need to be taken into account when it is large.

    B. The first assessment acts as a limit of quantitation (LOQ), reflecting left-censoring bias. When clinical trials report point estimates at the first assessment, the actual result could be anywhere before it. This point is often not understood in the medical community. Even leading clinical journals regularly present estimates at the first assessment as meaningful, e.g., as evidence that efficacy in the two arms is similar.

    C. Confidence limits and other estimates of variation can similarly be highly prone to bias. An example is cases where reported confidence intervals are very tight around the point estimate, leading people to interpret the point estimate as highly reliable. Such an interpretation is often misleading and an artifact of discreteness bias. It may reflect e.g. scheduling variation of visits around a target date, and may have nothing to do with the actual variation in the parameter being estimated.
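Rule A can be illustrated with a small, purely hypothetical simulation: if continuous event times are recorded only at the next scheduled assessment, the naive mean is biased upward, and the coarser the assessment grid, the larger the bias. The exponential event times and the mean of 6 hours below are invented for illustration; the grid is the one from the original question.

```python
# Illustrative simulation of discreteness bias (rule A): true event times
# are continuous, but each is recorded as the next scheduled assessment.
import random

random.seed(0)
ASSESSMENTS = [1, 2, 4, 8, 12, 24, 30]  # hours, as in the original question

def observed(t):
    """First assessment time at or after the true event time t,
    or None if the event happens after the last visit (right-censored)."""
    for a in ASSESSMENTS:
        if t <= a:
            return a
    return None

# Hypothetical true event times: exponential with mean 6 hours
true_times = [random.expovariate(1 / 6.0) for _ in range(10_000)]
recorded = [observed(t) for t in true_times]
seen = [r for r in recorded if r is not None]

naive_mean = sum(seen) / len(seen)
true_mean = sum(true_times) / len(true_times)
bias = naive_mean - true_mean
# bias comes out positive: the discrete readout overstates the mean,
# by roughly half the width of the interval the event tends to fall in
```

With this grid the gaps widen from 1 hour to 12 hours, so events landing late in the schedule pick up much more rounding than early ones, which is exactly the "small interval vs. large interval" distinction in rule A.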

    Jonathan Siegel
    Associate Director Clinical Statistics

    Sent from my iPhone




  • 5.  RE: Time data as discrete values

    Posted 01-05-2018 09:01

    Back when I wore my molecular pharmacologist hat and lab coat (before allergies to mice squashed that career choice and I turned myself into a statistician), I frequently ran experiments with a similar design in the lab. Imaginatively, pharmacologists call these experiments "time courses".

    We exposed animals or cultured cells to a treatment and then measured the "zero-time" response immediately before or after exposure and subsequent "time course" responses at various specified times after exposure. We looked at the data gathered at each time point as a measurement of the magnitude of response of an ongoing process and, as Emil Friedman said in his response to your posting, we most often interpreted the data to be continuous.

    By default, we applied repeated-measures ANOVA for most datasets: repeated measures because we repeatedly measured the response in the same animals or in aliquots of the same population of cultured cells, and ANOVA because we generally compared more than two treatments. The choice of parametric versus nonparametric repeated-measures ANOVA was based on what we knew or didn't know about the distribution of the data (practically speaking, whether, because of some methodological constraint, we had a small sample size or had to gather ordinal or ranked data).  Concisely, if cultured cells (large numbers of cells) were being used, we applied parametric repeated-measures ANOVA, whereas, if animals (limited sample size) were being used, we applied nonparametric repeated-measures ANOVA, such as Friedman's test.
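Friedman's test (the nonparametric repeated-measures ANOVA mentioned above) is available in SciPy as scipy.stats.friedmanchisquare; as a sketch of what it computes, the tie-free statistic can be coded from scratch. The animals, time points, and response values here are entirely made up:

```python
# Illustrative sketch of the Friedman statistic for repeated measures.
# Each row ("block") is one animal measured at every time point; ranks
# are taken within each animal, so animal-to-animal level differences
# drop out. No ties are handled in this simplified version.

def friedman_statistic(blocks):
    """blocks: list of per-subject measurement lists (no ties assumed).
    Returns the Friedman chi-square statistic (df = k - 1 treatments)."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        order = sorted(range(k), key=lambda j: row[j])  # indices, smallest first
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3.0 * n * (k + 1)

# Four hypothetical animals, responses at 1, 4, and 8 hours
data = [[2.1, 5.3, 7.8],
        [1.9, 4.8, 6.5],
        [2.4, 5.1, 7.0],
        [2.0, 4.5, 6.9]]
stat = friedman_statistic(data)  # compare to chi-square with k - 1 = 2 df
```

Because every animal here responds in the same rank order across the three time points, the statistic takes its maximum value for this design; in practice one would use the SciPy routine, which also handles ties and returns the p-value directly.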

    This is a pretty straightforward, accessible approach to analyzing the type of data that you described, which is important if you have to communicate your analysis to non-researchers or non-statisticians.

    A second possibility that might work for your data:  Wearing my current policy researcher hat, I recently analyzed a dataset with repeated measures (one measurement per quarter of a fiscal year over several fiscal years) on individual hospitals in a sample of hospitals. The data were dichotomous (something either happened or it didn't), and I needed to include covariates in the model, so I applied a generalized estimating equations (GEE) logistic regression analysis to the data.  The GEE accounted for the unknown correlation between the repeated measurements on individual hospitals.  You would have to select a distribution and link function appropriate to your data.  Applying GEE was intellectually satisfying to me, but the results of this analysis were not easy to communicate to my client.



    ------------------------------
    Linda A. Landon, PhD, ELS

    Research Communiqué
    Business, Marketing, & Policy Research

    www.researchcommunique.com
    LandonPhD@ResearchCommunique.com
    573-797-4517

    PhD, Molecular Pharmacology
    Graduate Certificate, Applied Statistics
    Board-Certified Editor in the Life Sciences
    ------------------------------