The practice of simultaneously rejecting the null hypothesis and estimating population parameters conflates two different statistical goals. (1) Parameter estimation (estimating a true population value) is the goal of population survey research. (2) Hypothesis testing is the goal of much science and engineering, where the aim is to falsify a null hypothesis rather than estimate an alternative parameter. Hypothesis testing only justifies a decision about the existence of an effect, not confidence that the reported interval contains the true effect. Treating the interval as an accurate estimator is a conceptual error.
Original Message:
Sent: 03-02-2026 07:21
From: Jonathan Shuster
Subject: Important Meta-analysis paper preprint
Thanks for this reply. But the key irrefutable mathematical facts against the inverse-variance-weighted random-effects methods for a set of randomized clinical trials are that (1) they make contradictory assumptions and (2) they apply linear weighted distribution theory in an illegitimate manner (the weights are not constants to a high degree of accuracy). Any statistician who uses these methods in a situation that might involve public health policy, or any reviewer who allows their use, is potentially risking scientifically unsupportable inferences.
Fixed-effects methods are legitimate for the narrow hypothesis that the true effect sizes are zero for all studies. The resulting confidence intervals, which today are the crux of our inferences, cannot be trusted under the more general and realistic random-effects scenario.
------------------------------
Jonathan Shuster
Original Message:
Sent: 03-01-2026 08:46
From: Eugene Komaroff
Subject: Important Meta-analysis paper preprint
My previous post should not be taken as disparagement of meta-analysis, which is a subset of the systematic review methodology as practiced in medical research (Cochrane Reviews), psychosocial research (Campbell Collaboration), and educational research (What Works Clearinghouse), among others. The reviewers enhance their qualitative systematic review with a statistical meta-analysis if the data permit.
Consider a meta-analysis of sample means where the statistic of interest is the pooled mean. The standard error (SE) in statistical theory is the standard deviation of an infinitely large theoretical sampling distribution of means that is indexed by sample size (n). Therefore, SE is actually a function [SE(n)], not a fixed number. The sample estimates of SE (n) are obtained from each study in the meta-analysis by dividing the sample standard deviation (sd) by the square root of the sample size [se = sd/sqrt(n)]. Because standard error is a ratio, the value is affected by both the numerator (sample standard deviation) and the denominator (sample size). Regarding David's post, even if all the studies in a meta-analysis had the same sample size, the estimated standard errors would still differ because the sample standard deviations vary (hopefully randomly, or only due to sampling error).
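The point that se = sd/sqrt(n) is an estimate that varies from study to study, even at a fixed n, can be illustrated with a minimal simulation sketch (all numbers here are made up for illustration):

```python
import math
import random

# Sketch: even with identical sample sizes and the same population,
# estimated standard errors differ because sample SDs vary.
random.seed(1)

def sample_se(n, mu=0.0, sigma=1.0):
    """Draw one study's sample and return its estimated se = sd / sqrt(n)."""
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return sd / math.sqrt(n)

# Three hypothetical studies, all with n = 50: the se estimates still differ.
ses = [sample_se(50) for _ in range(3)]
```

Running this yields three distinct se estimates despite equal sample sizes, which is the behavior described above.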
The sample estimates of the standard error are fundamental to the concept of precision in meta-analysis. Precision (1/se²) is the weight that multiplies the corresponding sample mean when computing the pooled (overall) mean. For instance, if one study had se² = 10 and another had se² = 100, the mean with the larger precision (1/se² = 0.10) contributes more to the pooled mean than the mean that is multiplied by the smaller precision (0.01). It is important not to confuse precision with accuracy, as often happens in discussions of confidence intervals. Precision and accuracy are known as reliability and validity in psychometrics. Measurements can be precise but not accurate. For instance, one can get the same number by repeatedly stepping on a weight scale: the measurements are precise, but they are not accurate unless the scale is calibrated to zero. Precise measurements may not be accurate, but accurate measurements must be precise.
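The inverse-variance weighting described above can be sketched in a few lines, using the hypothetical se² = 10 and se² = 100 studies (the study means are made up for illustration):

```python
# Fixed-effect pooled mean with inverse-variance weights (sketch).
means = [4.0, 8.0]          # hypothetical study means
variances = [10.0, 100.0]   # squared standard errors: se^2 = 10 and 100

weights = [1.0 / v for v in variances]  # precision = 1/se^2 -> 0.10 and 0.01
pooled = sum(w * m for w, m in zip(weights, means)) / sum(weights)
# The low-variance study (precision 0.10) dominates, so the pooled
# mean lands much closer to 4.0 than to 8.0.
```

The pooled value here is (0.10·4.0 + 0.01·8.0)/0.11, i.e., the higher-precision study contributes ten times the weight of the other.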
What I described above was a fixed-effects meta-analysis. A random-effects meta-analysis incorporates a between-studies variance component, which is added to each study's sampling variance and therefore deflates the precision weights. This component represents heterogeneity of the sample means in the analysis; it becomes an issue when it exceeds what is expected from sampling error alone. This is similar to the workaround for the pooled standard deviation in a two-sample t-test when the assumption of homogeneity of variance is violated. The variance component is an explicit acknowledgement that the means in the meta-analysis may have been estimates of different parameters due to differences in, for example, demographics, interventions, exposures, study designs, outcomes, measurements, and statistical methods. Nevertheless, it is possible to reduce the component by adding study-level covariates to the model. Although some have criticized this as adding apples and oranges, others have argued that there is something to be gained by studying the fruit salad.
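The effect of the between-studies variance component on the weights can be sketched as follows (the tau² value and study variances are hypothetical; in practice tau² would be estimated, e.g., by a method such as DerSimonian–Laird):

```python
# Sketch: adding a between-study variance component tau^2 to each se^2
# shrinks the weights and makes them more nearly equal across studies.
variances = [10.0, 100.0]   # se^2 from two hypothetical studies
tau2 = 50.0                 # hypothetical between-studies variance component

fixed_w = [1.0 / v for v in variances]            # fixed-effects weights
random_w = [1.0 / (v + tau2) for v in variances]  # random-effects weights

# Largest-to-smallest weight ratio: 100/10 = 10.0 under fixed effects,
# but only 150/60 = 2.5 under random effects -- small studies count more.
fixed_ratio = max(fixed_w) / min(fixed_w)
random_ratio = max(random_w) / min(random_w)
```

This equalizing of weights is exactly why the random-effects pooled mean can differ materially from the fixed-effects one when heterogeneity is large.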
Note that, above, I stated "hopefully randomly," which acknowledges Jonathan's scholarly mathematical/statistical concern about the weights not being random. However, the data are assumed to be random samples when conducting inferential analyses (e.g., t-tests, ANOVAs, regression). This assumption is clearly violated when compelling evidence establishes that the observed data could not have arisen from chance variation alone.
------------------------------
Eugene Komaroff
komaroffeugene@gmail.com
Original Message:
Sent: 02-25-2026 13:46
From: David Zucker
Subject: Important Meta-analysis paper preprint
Hi,
I think the issue Jon is raising basically boils down to a concern that the study size may be informative, i.e., that there is a correlation between the study size and the effect size. I think it would be clearer to frame the issue in terms of informative study size rather than saying that the problem is that the weights are nonrandom.
There are two main sources of randomness in the weights used in a standard meta-analysis: (a) randomness arising from the fact that the weights depend on quantities, such as the between-study variance of the treatment effect and the within-study variance of the outcome variable, that are generally unknown and have to be estimated, and (b) randomness arising from variation in sample size from study to study. The standard meta-analysis assumes both that the randomness due to parameter estimation is negligible and that the study size is noninformative. If the number of studies and the sample sizes within studies are both large, the assumption of negligible parameter-estimation error is probably reasonable, so the main source of randomness is the variation in study size.
The fact that the study sizes are random is not in and of itself the problem. If the study sizes are noninformative, the standard meta-analysis is valid despite the randomness; the problem arises when the study sizes are informative. There is literature on methods for clustered data with informative cluster sizes that can be brought to bear here. One message that comes out of what Jon is saying is that researchers conducting a meta-analysis should examine whether there is a material degree of correlation between the study size and the treatment effect, and if there is, they should try to identify its source.
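The diagnostic suggested above can be sketched very simply: compute the correlation between study size and estimated effect across the studies in the meta-analysis (the sizes and effects below are hypothetical illustration data, not from any real review):

```python
import math

# Sketch: check whether study size is informative, i.e., correlated
# with the estimated treatment effect. Hypothetical data only.
sizes   = [40, 60, 120, 250, 500]
effects = [0.90, 0.75, 0.50, 0.42, 0.35]  # larger studies, smaller effects

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(sizes, effects)
# A material negative correlation would flag informative study size
# (a pattern also consistent with small-study effects / publication bias).
```

With only a handful of studies the estimate of r is of course noisy, so this is a screening step to prompt investigation of the source of any correlation, not a formal test.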
Regards,
David
------------------------------
David Zucker
Department of Statistics and Data Science
Hebrew University of Jerusalem
Original Message:
Sent: 02-24-2026 10:53
From: Eugene Komaroff
Subject: Important Meta-analysis paper preprint
Hi Jon. Thank you for your reply. I fully agree that patient-level meta-analysis with stratification is the optimal approach for controlling and adjusting for Simpson's paradox, which can arise from treatment-by-patient interactions. Additionally, if there were any discrepancy between the "average" treatment effect observed in a large meta-analysis and the "average" treatment effect from a substantial patient-level randomized controlled trial, I would place greater trust in the patient-level results.
Eugene
------------------------------
Eugene Komaroff
komaroffeugene@gmail.com
Original Message:
Sent: 02-23-2026 15:19
From: Jonathan Shuster
Subject: Important Meta-analysis paper preprint
Thanks for your response. Selection bias is a problem that I address (but do not resolve) in the discussion. Patient-level data would be very helpful, but with a lot of experience with Simpson's paradox, I do not believe in study-level covariates. But yes, these methods can be extended to apply to patient-level data; I do mention stratification as well. With mandatory registries for clinical trials, I think the problem of missing studies has been resolved. The big drawback of all meta-analysis lies in the early abandonment of unpromising therapies; this suggests bias, given the large number of studies done.
Best wishes,
Jon
------------------------------
Jonathan Shuster
Original Message:
Sent: 02-23-2026 10:15
From: Eugene Komaroff
Subject: Important Meta-analysis paper preprint
Hi Jonathan. My distrust of overall effect sizes from meta-analysis (MA) stems not from the statistical methods but from the GIGO (garbage in, garbage out) principle. An overall effect size can be highly misleading due to selection bias and compromised research quality and integrity in the collection and presentation of the evidence (https://www.cochrane.org/evidence/why-our-evidence-trusted). In any event, unlike fixed-effects MA, random-effects MA permits the inclusion of study-level covariates to reduce the nuisance caused by the between-studies variance (see attached). Do you think your ratio estimation method can be extended to accommodate study-level covariates?
Eugene
------------------------------
Eugene Komaroff
komaroffeugene@gmail.com
Original Message:
Sent: 02-20-2026 10:22
From: Eileen Beachell
Subject: Important Meta-analysis paper preprint
Thank you, thank you, thank you! There are so many issues with meta-analysis. The large data sets with 0% error, etc. I'm so glad that someone has taken the time to bring forward the many concerns with this strategy. I often think about the measurement errors (lab instruments and techniques) associated with each data set and the combined errors. Thank you for your post. Eileen
------------------------------
Eileen Beachell
Original Message:
Sent: 02-19-2026 13:46
From: Jonathan Shuster
Subject: Important Meta-analysis paper preprint
The paper, linked below, is now in press at Statistics in Biopharmaceutical Research. If you have any interest in meta-analysis, which sits at the apex of most evidence pyramids, or are seeking a fertile area for further research, you should pay close attention to this paper. If you are a peer reviewer of a report of a meta-analysis of clinical trials, I hope you prioritize the math over tradition to be sure the conclusions are evidence-based. Start by reading the second and third paragraphs of the introduction, which motivate the importance of this article. Next, Section 2 demonstrates that the current mainstream methods (inverse variance weighting) represent a misuse of linear combination theory, rendering these estimates potentially seriously biased. Mainstream methodology can lead to unsupportable conclusions about a therapy. Section 3 produces an asymptotically valid methodology, but it defines its target population differently from the mainstream. The Shuster (2023) reference in the link showed, in two examples from major medical journals, that unsupportable public health conclusions affecting patient safety were reached.
Mainstream Meta-Analysis of Clinical Trials Produces Strongly Inconsistent Estimators
------------------------------
Jonathan Shuster
------------------------------