TY - JOUR
T1 - GePMI
T2 - A statistical model for personal intestinal microbiome identification
AU - Wang, Zicheng
AU - Lou, Huazhe
AU - Wang, Ying
AU - Shamir, Ron
AU - Jiang, Rui
AU - Chen, Ting
N1 - Publisher Copyright:
© 2018, The Author(s).
PY - 2018/12/1
Y1 - 2018/12/1
N2 - Human gut microbiomes consist of a large number of microbial genomes, which vary by diet and health conditions and from individual to individual. In the present work, we asked whether such variation or similarity could be measured and, if so, whether the results could be used for personal microbiome identification (PMI). To address this question, we herein propose a method to estimate the significance of similarity among human gut metagenomic samples based on reference-free, long k-mer features. Using these features, we find that pairwise similarities between the metagenomes of any two individuals obey a beta distribution and that a p value derived accordingly well characterizes whether two samples are from the same individual or not. We develop a computational framework called GePMI (Generating inter-individual similarity distribution for Personal Microbiome Identification) and apply it to several human gut metagenomic datasets (>300 individuals and >600 samples in total). From the results of GePMI, most of the human gut microbiomes can be identified (auROC = 0.9470, auPRC = 0.8702). Even after antibiotic treatment or fecal microbiota transplantation, the individual k-mer signature still maintains a certain specificity.
AB - Human gut microbiomes consist of a large number of microbial genomes, which vary by diet and health conditions and from individual to individual. In the present work, we asked whether such variation or similarity could be measured and, if so, whether the results could be used for personal microbiome identification (PMI). To address this question, we herein propose a method to estimate the significance of similarity among human gut metagenomic samples based on reference-free, long k-mer features. Using these features, we find that pairwise similarities between the metagenomes of any two individuals obey a beta distribution and that a p value derived accordingly well characterizes whether two samples are from the same individual or not. We develop a computational framework called GePMI (Generating inter-individual similarity distribution for Personal Microbiome Identification) and apply it to several human gut metagenomic datasets (>300 individuals and >600 samples in total). From the results of GePMI, most of the human gut microbiomes can be identified (auROC = 0.9470, auPRC = 0.8702). Even after antibiotic treatment or fecal microbiota transplantation, the individual k-mer signature still maintains a certain specificity.
UR - http://www.scopus.com/inward/record.url?scp=85052816896&partnerID=8YFLogxK
U2 - 10.1038/s41522-018-0065-2
DO - 10.1038/s41522-018-0065-2
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
C2 - 30210803
AN - SCOPUS:85052816896
SN - 2055-5008
VL - 4
JO - npj Biofilms and Microbiomes
JF - npj Biofilms and Microbiomes
IS - 1
M1 - 20
ER -