ASA Connect

  • 1.  Question regarding p-value interpretation in Group Sequential Designs

    Posted 03-25-2016 08:51
    Edited by George Skountrianos 03-28-2016 08:58

    Hello everyone. I have a general question that I would appreciate any input on:

    Suppose we set up a typical group sequential design in which stopping (for efficacy or futility) at the first interim analysis requires the result to be significant at a nominal p-value of 0.002. Now further suppose that we observe a p-value of 0.025 at this stage. Two questions:

    1. Per the GS boundaries we technically cannot stop, but how would we interpret this observed p-value? You can imagine a project team asking "yes, this result is not significant at the 0.002 level, but why can't we say it is significant at the 0.025 level?"

    2. If the project team chooses to stop at this stage for reasons other than statistical ones (e.g. it is not operationally feasible to continue the project), can we use the p-value of 0.025?

    So my overall question is: how do we interpret these observed interim p-values when they fall between the stopping bounds?

    Thank you!

    George

    ------------------------------
    George Skountrianos
    Statistician
    Hollister Incorporated
    ------------------------------



  • 2.  RE: Question regarding p-value interpretation in Group Sequential Designs

    Posted 03-28-2016 07:30

    If the project team commits, before looking at the data, to making the first look the last look, and the subsequent analysis yields a p-value of 0.025, then one can stop the trial and say the result is significant at p=0.025. But if that decision is made after looking at the data, it is not valid, because it would be a biased decision.

    ------------------------------
    Anthiyur Kannappan



  • 3.  RE: Question regarding p-value interpretation in Group Sequential Designs

    Posted 04-04-2016 16:23

    Suppose (we are taking a frequentist perspective here) that you repeated the clinical trial 1000 times, each time following the same procedure: if the p-value is 0.025 or less at the interim analysis, you stop and declare efficacy; if it isn't, you continue. Let's also suppose the null hypothesis is true.

    Then 2.5% of the time, an average of 25 times out of the 1000, you stop and declare success.

    What happens if the p-value is above 0.025 at the interim? In 25 of your 1000 trials you won't have to worry about it. But what about the other 975?

    You have spent all of your alpha at the interim, so there is none left for a final analysis. Any test you perform after the interim will give the trial as a whole a higher false positive rate than your nominal value.
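
    The inflation described above can be illustrated with a small simulation. This is only a rough sketch under assumptions that are not stated in the post: the test is one-sided, the interim happens at 50% of the information, and the final z-statistic combines the interim data with an independent second half. The function name and its parameters are made up for the illustration.

```python
import random
from statistics import NormalDist


def simulate_two_looks(alpha=0.025, info_fraction=0.5, n_trials=200_000, seed=2016):
    """Simulate trials under the null, testing at the same nominal alpha
    at the interim look and again at the final look.

    Returns (share stopped at interim, share rejecting at either look).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    w_interim = info_fraction ** 0.5          # weight of interim data in final z
    w_rest = (1 - info_fraction) ** 0.5       # weight of post-interim data
    stopped_interim = 0
    rejected_final = 0
    for _ in range(n_trials):
        z_interim = rng.gauss(0.0, 1.0)
        if z_interim >= z_crit:
            stopped_interim += 1              # "success" declared at the interim
            continue
        # Trial continues: the final statistic pools both stages of data.
        z_final = w_interim * z_interim + w_rest * rng.gauss(0.0, 1.0)
        if z_final >= z_crit:
            rejected_final += 1               # extra false positive at the final look
    interim = stopped_interim / n_trials
    overall = (stopped_interim + rejected_final) / n_trials
    return interim, overall


interim_rate, overall_rate = simulate_two_looks()
print(f"false positives at interim: {interim_rate:.4f}")
print(f"false positives overall:    {overall_rate:.4f}")
```

    Under these assumptions the interim rate should sit near the nominal 2.5%, while the overall rate comes out noticeably higher, which is the point being made: once the full alpha is used at the interim, any later test adds false positives on top of it.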

    Any course of action you take will violate an ethical constraint. If you continue and test, you are conducting a study that has a higher false positive rate than you represented to the scientific community. If you continue and don't test, you are subjecting your patients to a potentially dangerous novel treatment for the supposed benefit of science without obtaining any actual scientific benefit from their risk, which is not fair to your patients. And if you either don't test or don't continue at all, you are conducting a study which will likely have significantly less power (a higher false negative rate) than you represented (or than is generally considered ethical for clinical trials), which is not fair to the general future patient community, nor to your investors.

    ------------------------------
    Jonathan Siegel
    Senior Research Biostatistician



  • 4.  RE: Question regarding p-value interpretation in Group Sequential Designs

    Posted 03-28-2016 10:05

    First of all, the use of a sequence of p-values as test cut-offs in a group sequential design was, and still is, motivated by the desire to maintain an upper bound on the overall false positive probability with repeated testing, and the particular numerical p-value cut-offs depend on the particular design. This leads to many silly things, since a given data set, either interim or final, can lead to different decisions, depending on how one generated the data and what one intended to do.

    A simple, much cited example of this lunacy (Berger and Wolpert, 1984) is a dataset consisting of 9 responses in a sample of size 12, with the aim to perform a one-sided test of Pr(response) = theta = .50 versus theta > .50. The p-value = .0730 if the data were obtained by simply sampling 12 subjects, implying a binomial distribution, but the p-value = .0327 if the data were obtained by sampling until 3 failures (non-responses) were observed, since this implies a negative binomial distribution. If one has the data but does not know what was intended, one cannot compute a p-value. If, instead, one takes a Bayesian approach and assumes that theta follows a beta(.50, .50) prior, then one can compute the posterior probability Pr(theta > .50 | 9 responses in 12 subjects) = .96. Of course, a posterior 96% credible interval for theta is .471 to .924, so the data do not provide strong evidence of anything.
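
    For anyone who wants to check the arithmetic in this example, here is a minimal sketch that recomputes the two p-values and the posterior probability from first principles. It assumes a Beta(0.5, 0.5) prior, which is the reading consistent with the quoted posterior probability, and it approximates the beta tail area by simple numerical integration rather than a library routine. The function names are illustrative only.

```python
from math import comb


def binomial_p_value(successes, n, theta0=0.5):
    """One-sided P(X >= successes) for X ~ Binomial(n, theta0)."""
    return sum(
        comb(n, k) * theta0**k * (1 - theta0) ** (n - k)
        for k in range(successes, n + 1)
    )


def negative_binomial_p_value(successes, failures, theta0=0.5):
    """One-sided P(X >= successes), where X counts the responses observed
    before the failures-th non-response (sampling stops at that failure)."""
    lower_tail = sum(
        comb(k + failures - 1, k) * theta0**k * (1 - theta0) ** failures
        for k in range(successes)
    )
    return 1.0 - lower_tail


def beta_upper_tail(a, b, cut=0.5, steps=20_000):
    """P(theta > cut) for theta ~ Beta(a, b) via the midpoint rule.
    Intended for a, b > 1, where the density is bounded on [0, 1]."""
    def kernel(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)  # unnormalised beta density

    def area(lo, hi):
        h = (hi - lo) / steps
        return h * sum(kernel(lo + (i + 0.5) * h) for i in range(steps))

    return area(cut, 1.0) / area(0.0, 1.0)


responses, non_responses = 9, 3
p_binom = binomial_p_value(responses, responses + non_responses)
p_negbin = negative_binomial_p_value(responses, non_responses)
# Beta(0.5, 0.5) prior (an assumption) updated with 9 responses, 3 non-responses.
post_prob = beta_upper_tail(0.5 + responses, 0.5 + non_responses)

print(f"binomial p-value:          {p_binom:.4f}")
print(f"negative binomial p-value: {p_negbin:.4f}")
print(f"Pr(theta > .50 | data):    {post_prob:.3f}")
```

    The same 9 responses and 3 non-responses go into all three calculations; only the assumed sampling plan (or, for the Bayesian calculation, the prior) differs, which is exactly why the two p-values disagree.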

    In your example, the people look at the interim data and decide to violate the design.  This says that the actual design is something of the form "We will follow a formal group sequential design provided by a statistician, unless we don't like the decision that it dictates, in which case we will decide whatever we like."   Additionally, it is very hard to interpret what actually is going on, since the data have been reduced to a single p-value, so a great deal of information is missing.

    Given the above actual design, which allows the people to decide anything they like based on the interim data, computing or interpreting an empirical p-value is a fool's errand.

    ------------------------------
    Peter Thall
    Professor
    Univ. of Texas-MD Anderson Cancer Center



  • 5.  RE: Question regarding p-value interpretation in Group Sequential Designs

    Posted 03-28-2016 18:40

    The short answer is, we interpret them as inconclusive.

    ------------------------------
    Eric Siegel, MS
    Research Associate
    Department of Biostatistics
    Univ. Arkansas Medical Sciences