TY - JOUR
T1 - Improving the identification of phenotypic abnormalities and sexual dimorphism in mice when studying rare event categorical characteristics
AU - Karp, Natasha A.
AU - Heller, Ruth
AU - Yaacoby, Shay
AU - White, Jacqueline K.
AU - Benjamini, Yoav
N1 - Publisher Copyright: © 2017 Karp et al.
PY - 2017/2
Y1 - 2017/2
N2 - Biological research frequently involves the study of phenotyping data. Many of these studies focus on rare event categorical data, and functional genomics studies typically study the presence or absence of an abnormal phenotype. With the growing interest in the role of sex, there is a need to assess the phenotype for sexual dimorphism. The identification of abnormal phenotypes for downstream research is challenged by the small sample size, the rare event nature, and the multiple testing problem, as many variables are monitored simultaneously. Here, we develop a statistical pipeline to assess statistical and biological significance while managing the multiple testing problem. We propose a two-step pipeline to initially assess for a treatment effect, in our case example genotype, and then test for an interaction with sex. We compare multiple statistical methods and use simulations to investigate the control of the type-one error rate and power. To maximize the power while addressing the multiple testing issue, we implement filters to remove data sets where the hypotheses to be tested cannot achieve significance. A motivating case study utilizing a large scale high-throughput mouse phenotyping data set from the Wellcome Trust Sanger Institute Mouse Genetics Project, where the treatment is a gene ablation, demonstrates the benefits of the new pipeline on the downstream biological calls.
AB - Biological research frequently involves the study of phenotyping data. Many of these studies focus on rare event categorical data, and functional genomics studies typically study the presence or absence of an abnormal phenotype. With the growing interest in the role of sex, there is a need to assess the phenotype for sexual dimorphism. The identification of abnormal phenotypes for downstream research is challenged by the small sample size, the rare event nature, and the multiple testing problem, as many variables are monitored simultaneously. Here, we develop a statistical pipeline to assess statistical and biological significance while managing the multiple testing problem. We propose a two-step pipeline to initially assess for a treatment effect, in our case example genotype, and then test for an interaction with sex. We compare multiple statistical methods and use simulations to investigate the control of the type-one error rate and power. To maximize the power while addressing the multiple testing issue, we implement filters to remove data sets where the hypotheses to be tested cannot achieve significance. A motivating case study utilizing a large scale high-throughput mouse phenotyping data set from the Wellcome Trust Sanger Institute Mouse Genetics Project, where the treatment is a gene ablation, demonstrates the benefits of the new pipeline on the downstream biological calls.
KW - Gene–phenotype map
KW - Mouse models
KW - Multiple testing
KW - Rare events
KW - Sexual dimorphism
UR - http://www.scopus.com/inward/record.url?scp=85021448132&partnerID=8YFLogxK
U2 - 10.1534/genetics.116.195388
DO - 10.1534/genetics.116.195388
M3 - Article
AN - SCOPUS:85021448132
SN - 0016-6731
VL - 205
SP - 491
EP - 501
JO - Genetics
JF - Genetics
IS - 2
ER -