Split samples and design sensitivity in observational studies

Ruth Heller*, Paul R. Rosenbaum, Dylan S. Small

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

53 Scopus citations


An observational or nonrandomized study of treatment effects may be biased by failure to control for some relevant covariate that was not measured. The design of an observational study is known to strongly affect its sensitivity to biases from covariates that were not observed. For instance, the choice of an outcome to study, or the decision to combine several outcomes in a test for coherence, can materially affect the sensitivity to unobserved biases. Decisions that shape the design are, therefore, critically important, but they are also difficult decisions to make in the absence of data. We consider the possibility of randomly splitting the data from an observational study into a smaller planning sample and a larger analysis sample, where the planning sample is used to guide decisions about design. After reviewing the concept of design sensitivity, we evaluate sample splitting in theory, by numerical computation, and by simulation, comparing it to several methods that use all of the data. Sample splitting is remarkably effective, much more so in observational studies than in randomized experiments: splitting 1,000 matched pairs into 100 planning pairs and 900 analysis pairs often materially improves the design sensitivity. An example from genetic toxicology is used to illustrate the method.
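The splitting step described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' procedure: the simulated pair differences, the outcome names, and the helper functions `split_pairs` and `choose_outcome` are hypothetical. The selection rule used here (largest standardized mean difference in the planning sample) is a stand-in for whatever design decision the planning sample is meant to guide.

```python
import random
import statistics


def split_pairs(n_pairs, n_planning, seed=0):
    """Randomly partition pair indices into a planning set and an analysis set."""
    rng = random.Random(seed)
    indices = list(range(n_pairs))
    rng.shuffle(indices)
    return sorted(indices[:n_planning]), sorted(indices[n_planning:])


def choose_outcome(diffs_by_outcome, planning_idx):
    """Pick the outcome with the largest |mean| / sd of pair differences,
    computed on the planning pairs only (an assumed, illustrative rule)."""
    def score(name):
        d = [diffs_by_outcome[name][i] for i in planning_idx]
        return abs(statistics.mean(d)) / statistics.stdev(d)
    return max(diffs_by_outcome, key=score)


# Hypothetical simulated data: treated-minus-control differences for
# 1,000 matched pairs on two candidate outcomes.
data_rng = random.Random(1)
n_pairs = 1000
diffs_by_outcome = {
    "outcome_a": [data_rng.gauss(0.1, 1.0) for _ in range(n_pairs)],
    "outcome_b": [data_rng.gauss(0.5, 1.0) for _ in range(n_pairs)],
}

# 100 planning pairs guide the design; 900 analysis pairs are held back.
planning_idx, analysis_idx = split_pairs(n_pairs, 100, seed=2)
chosen = choose_outcome(diffs_by_outcome, planning_idx)

# Only the analysis pairs, on the chosen outcome, would enter the final
# test and sensitivity analysis.
analysis_diffs = [diffs_by_outcome[chosen][i] for i in analysis_idx]
print(chosen, len(planning_idx), len(analysis_diffs))
```

The point of the design is that the planning pairs are discarded after the choice is made, so the decision about which outcome to study does not reuse the data on which the final inference is based.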

Original language: English
Pages (from-to): 1090-1101
Number of pages: 12
Journal: Journal of the American Statistical Association
Issue number: 487
State: Published - 2009
Externally published: Yes


Funders (funder number):
National Science Foundation
United States-Israel Binational Science Foundation (2008049, SES-0849370)


    • Coherence
    • Multiple comparisons
    • Permutation test
    • Sensitivity analysis

