TY - JOUR
T1 - Association Mapping and Significance Estimation via the Coalescent
AU - Kimmel, Gad
AU - Karp, Richard M.
AU - Jordan, Michael I.
AU - Halperin, Eran
N1 - Funding Information:
G.K., E.H. and R.M.K. were supported by National Science Foundation grant IIS-0513599. E.H. was also supported by National Science Foundation grant IIS-0713254. M.I.J. was supported by a grant from Microsoft Research and by an appointment as a Miller Research Professor in the Miller Institute for Basic Research in Science.
PY - 2008/12/12
Y1 - 2008/12/12
N2 - The central questions asked in whole-genome association studies are how to locate associated regions in the genome and how to estimate the significance of these findings. Researchers usually do this by testing each SNP separately for association and then applying a suitable correction for multiple-hypothesis testing. However, SNPs are correlated by the unobserved genealogy of the population, and a more powerful statistical methodology would attempt to take this genealogy into account. Leveraging the genealogy in association studies is challenging, however, because the inference of the genealogy from the genotypes is a computationally intensive task, in particular when recombination is modeled, as in ancestral recombination graphs. Furthermore, if large numbers of genealogies are imputed from the genotypes, the power of the study might decrease if these imputed genealogies create an additional multiple-hypothesis testing burden. Indeed, we show in this paper that several existing methods that aim to address this problem suffer either from low power or from a very high false-positive rate; their performance is generally not better than the standard approach of separate testing of SNPs. We suggest a new genealogy-based approach, CAMP (coalescent-based association mapping), that takes into account the trade-off between the complexity of the genealogy and the power lost due to the additional multiple hypotheses. Our experiments show that CAMP yields a significant increase in power relative to that of previous methods and that it can more accurately locate the associated region.
UR - http://www.scopus.com/inward/record.url?scp=57049117902&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2008.10.017
DO - 10.1016/j.ajhg.2008.10.017
M3 - Article
C2 - 19026399
AN - SCOPUS:57049117902
SN - 0002-9297
VL - 83
SP - 675
EP - 683
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 6
ER -