TY - JOUR
T1 - Split samples and design sensitivity in observational studies
AU - Heller, Ruth
AU - Rosenbaum, Paul R.
AU - Small, Dylan S.
N1 - Funding Information:
Ruth Heller is Assistant Professor, Faculty of Industrial Engineering and Management, Technion-Israel Institute of Technology, Haifa, Israel. Paul R. Rosenbaum is Professor, Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104. Dylan S. Small is Associate Professor, Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104. The work of Heller, Rosenbaum, and Small is supported by BSF grant No. 2008049, and the work of Rosenbaum is supported by grant SES-0849370 from the Measurement, Methodology and Statistics Program of the U.S. National Science Foundation.
PY - 2009
Y1 - 2009
AB - An observational or nonrandomized study of treatment effects may be biased by failure to control for some relevant covariate that was not measured. The design of an observational study is known to strongly affect its sensitivity to biases from covariates that were not observed. For instance, the choice of an outcome to study, or the decision to combine several outcomes in a test for coherence, can materially affect the sensitivity to unobserved biases. Decisions that shape the design are, therefore, critically important, but they are also difficult decisions to make in the absence of data. We consider the possibility of randomly splitting the data from an observational study into a smaller planning sample and a larger analysis sample, where the planning sample is used to guide decisions about design. After reviewing the concept of design sensitivity, we evaluate sample splitting in theory, by numerical computation, and by simulation, comparing it to several methods that use all of the data. Sample splitting is remarkably effective, much more so in observational studies than in randomized experiments: splitting 1,000 matched pairs into 100 planning pairs and 900 analysis pairs often materially improves the design sensitivity. An example from genetic toxicology is used to illustrate the method.
KW - Coherence
KW - Multiple comparisons
KW - Permutation test
KW - Sensitivity analysis
UR - http://www.scopus.com/inward/record.url?scp=70349763600&partnerID=8YFLogxK
U2 - 10.1198/jasa.2009.tm08338
DO - 10.1198/jasa.2009.tm08338
M3 - Article
AN - SCOPUS:70349763600
SN - 0162-1459
VL - 104
SP - 1090
EP - 1101
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 487
ER -