I4R Discussion Paper Series #102

2024

Felix Holzmeister (University of Innsbruck), Magnus Johannesson, Robert Böhm (University of Vienna, University of Copenhagen), Anna Dreber, Jürgen Huber (University of Innsbruck), Michael Kirchler (University of Innsbruck)

Heterogeneity in Effect Size Estimates: Empirical Evidence and Practical Implications

A typical empirical study involves choosing a sample, a research design, and an analysis path. Variation in such choices across studies leads to heterogeneity in results, which introduces an additional layer of uncertainty not accounted for in reported standard errors and confidence intervals. We provide a framework for studying heterogeneity in the social sciences and divide heterogeneity into population heterogeneity, design heterogeneity, and analytical heterogeneity. We estimate each type's heterogeneity from multi-lab replication studies, prospective meta-analyses of studies varying experimental designs, and multi-analyst studies. Our results suggest that population heterogeneity tends to be relatively small, whereas design and analytical heterogeneity are large. A conservative interpretation of the estimates suggests that incorporating the uncertainty due to heterogeneity would approximately double sample standard errors and confidence intervals. We illustrate that heterogeneity of this magnitude, unless properly accounted for, has severe implications for statistical inference, with strongly increased rates of false scientific claims.
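As a rough illustration of why a doubling of standard errors matters for inference, the sketch below assumes a simple additive variance model (total variance = sampling variance + heterogeneity variance) and a normally distributed estimate. This model, the function names, and the value chosen for `tau` are assumptions made for illustration only; they are not taken from the paper and need not match the authors' estimation approach.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def total_se(sampling_se: float, tau: float) -> float:
    """Standard error once a heterogeneity SD `tau` is added in quadrature.

    Assumption: heterogeneity is independent of sampling error, so the
    variances add.
    """
    return math.sqrt(sampling_se ** 2 + tau ** 2)


def false_positive_rate(inflation: float, z_crit: float = 1.959963984540054) -> float:
    """Two-sided rejection rate under a true null effect when the reported
    SE ignores heterogeneity.

    `inflation` is total SE divided by reported SE. Under the null, the
    reported z-statistic then has standard deviation `inflation` rather than 1.
    """
    return 2.0 * (1.0 - normal_cdf(z_crit / inflation))


# Hypothetical numbers: a heterogeneity SD of sqrt(3) times the sampling SE
# is the value that exactly doubles the total standard error.
sampling_se = 1.0
tau = math.sqrt(3.0) * sampling_se
inflation = total_se(sampling_se, tau) / sampling_se
rate = false_positive_rate(inflation)

print(f"SE inflation factor: {inflation:.2f}")
print(f"False positive rate at nominal 5%: {rate:.3f}")
```

Under these assumptions, a test with a nominal 5% significance level would reject a true null hypothesis in roughly one third of cases, which gives a sense of the scale of the inflation in false claims that the abstract describes.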