TY - JOUR
T1 - Analysis of Latino populations from GALA and MEC studies reveals genomic loci with biased local ancestry estimation
AU - Pasaniuc, Bogdan
AU - Sankararaman, Sriram
AU - Torgerson, Dara G.
AU - Gignoux, Christopher
AU - Zaitlen, Noah
AU - Eng, Celeste
AU - Rodriguez-Cintron, William
AU - Chapela, Rocio
AU - Ford, Jean G.
AU - Avila, Pedro C.
AU - Rodriguez-Santana, Jose
AU - Chen, Gary K.
AU - Le Marchand, Loic
AU - Henderson, Brian
AU - Reich, David
AU - Haiman, Christopher A.
AU - Gonzàlez Burchard, Esteban
AU - Halperin, Eran
N1 - Funding Information: Research reported in this publication was supported in part by the National Cancer Institute of the National Institutes of Health under award R03CA162200 (to B.P.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The research was also supported in part by the German-Israeli Foundation (GIF) grant number 109433.2/2010 and by the Israeli Science Foundation (grant 04514831).
PY - 2013/6/1
Y1 - 2013/6/1
N2 - Motivation: Local ancestry analysis of genotype data from recently admixed populations (e.g. Latinos, African Americans) provides key insights into population history and disease genetics. Although methods for local ancestry inference have been extensively validated in simulations (under many unrealistic assumptions), no empirical study of local ancestry accuracy in Latinos exists to date. Hence, interpreting findings that rely on local ancestry in Latinos is challenging. Results: Here, we use 489 nuclear families from the mainland USA, Puerto Rico and Mexico in conjunction with 3204 unrelated Latinos from the Multiethnic Cohort study to provide the first empirical characterization of local ancestry inference accuracy in Latinos. Our approach for identifying errors does not rely on simulations but on the observation that local ancestry in families follows Mendelian inheritance. We measure the rate of local ancestry assignments that lead to Mendelian inconsistencies in local ancestry in trios (MILANC), which provides a lower bound on errors in the local ancestry estimates. We show that MILANC rates observed in simulations underestimate the rate observed in real data, and that MILANC varies substantially across the genome. Second, across a wide range of methods, we observe that loci with large deviations in local ancestry also show enrichment in MILANC rates. Therefore, local ancestry estimates at such loci should be interpreted with caution. Finally, we reconstruct ancestral haplotype panels to be used as reference panels in local ancestry inference and show that ancestry inference is significantly improved by incorporating these reference panels.
AB - Motivation: Local ancestry analysis of genotype data from recently admixed populations (e.g. Latinos, African Americans) provides key insights into population history and disease genetics. Although methods for local ancestry inference have been extensively validated in simulations (under many unrealistic assumptions), no empirical study of local ancestry accuracy in Latinos exists to date. Hence, interpreting findings that rely on local ancestry in Latinos is challenging. Results: Here, we use 489 nuclear families from the mainland USA, Puerto Rico and Mexico in conjunction with 3204 unrelated Latinos from the Multiethnic Cohort study to provide the first empirical characterization of local ancestry inference accuracy in Latinos. Our approach for identifying errors does not rely on simulations but on the observation that local ancestry in families follows Mendelian inheritance. We measure the rate of local ancestry assignments that lead to Mendelian inconsistencies in local ancestry in trios (MILANC), which provides a lower bound on errors in the local ancestry estimates. We show that MILANC rates observed in simulations underestimate the rate observed in real data, and that MILANC varies substantially across the genome. Second, across a wide range of methods, we observe that loci with large deviations in local ancestry also show enrichment in MILANC rates. Therefore, local ancestry estimates at such loci should be interpreted with caution. Finally, we reconstruct ancestral haplotype panels to be used as reference panels in local ancestry inference and show that ancestry inference is significantly improved by incorporating these reference panels.
UR - http://www.scopus.com/inward/record.url?scp=84878275481&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btt166
DO - 10.1093/bioinformatics/btt166
M3 - Article
AN - SCOPUS:84878275481
SN - 1367-4803
VL - 29
SP - 1407
EP - 1415
JO - Bioinformatics
JF - Bioinformatics
IS - 11
ER -