I'm in the middle of a situation where it looks (at least to me) as if an Institutional Review Board (IRB) is micromanaging the details of a clinical study. They asked the principal investigator (PI) the following question:
"The IRB would like to know, how you set the parameters for the power calculation, such as effect size, alpha level. For effect size, you need to have some data to justify or should choose a conservative one. And also which statistical test was the calculation based on."
The PI sent the comments to me, of course.
I wrote a long response about how a Type I error rate of 5% and a Type II error rate of (about) 10% are pretty much standard in the research community. I pointed out the problem with using "effect sizes" and that the sample size justification needs to be based on the minimum clinically important difference instead. I also noted that an IRB demanding a conservative value represents a one-sided perspective on the issue because sample sizes that are too large raise just as many ethical issues as sample sizes that are too small.
So now the IRB asks the question "Please indicate whether or not the PI is confident to observe a difference of at least 15 on the XXXX between two groups and a difference of at least 25 on the YYYY between two groups."
The original power calculation suggested a sample of 30 patients total (15 per group), based on Russell V. Lenth's power and sample size calculator for the independent-samples t-test. Here is what we wrote in the original protocol: "If the standard deviation for the XXXX score is 11, then we would have 89% power for detecting a shift of 15. If the standard deviation of YYYY is 18, then we would have 90% power for detecting a difference of 25."
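For anyone who wants to check numbers like these by hand, here is a rough sketch in Python. It is not what Lenth's calculator does: it uses a normal approximation rather than the noncentral t distribution, and it assumes a two-sided test at alpha = 0.05, which is my assumption here and not something stated in the protocol excerpt. The function names are made up for illustration. Because of those assumptions, the output should not be expected to match the protocol's figures, and the normal approximation tends to be somewhat optimistic for small samples.

```python
from math import sqrt
from statistics import NormalDist


def standardized_effect(diff, sd):
    """Difference in means divided by the common standard deviation."""
    return diff / sd


def approx_power(diff, sd, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, two-group comparison of means.

    alpha = 0.05 is an assumed default, not a value taken from the protocol.
    """
    z = NormalDist()
    se = sd * sqrt(2.0 / n_per_group)   # standard error of the difference in means
    shift = diff / se                   # expected test statistic under the alternative
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    # Probability of rejecting in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    for label, diff, sd in [("XXXX", 15, 11), ("YYYY", 25, 18)]:
        d = standardized_effect(diff, sd)
        power = approx_power(diff, sd, n_per_group=15)
        print(f"{label}: standardized effect = {d:.2f}, approximate power = {power:.2f}")
```

The standardized effect printed for each outcome is the figure that gets compared against benchmarks such as 0.8. Changing `n_per_group` or `alpha` shows how sensitive the power is to each choice.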
Now at this point, I'm tempted to say, "Who the heck do you think you are to question the PI's judgment about what is likely to be seen in a study like this?" Instead, I am just suggesting that the PI respond, "Differences of this size have been seen in similar research studies, but we won't know for sure in our setting until we do the research," after making sure, of course, that there are indeed a couple of studies that actually do have differences of 15 and 25 on the two outcome measures.
But this is bothering me a lot. I'm at a loss as to why the IRB is doing this. In the hundreds of studies that I've helped get through the IRB, I have never had anyone question the sample size justification in such detail. Are they upset because the effect size is so much larger than 0.8? Are they convinced that a sample of 30 patients total is guaranteed to be underpowered?
I'm curious if anyone else has had a similar encounter with an IRB. Any advice on how to handle this would also be appreciated.
-------------------------------------------
Stephen Simon
Independent Statistical Consultant
P. Mean Consulting
-------------------------------------------