TY - JOUR
T1 - Using stochastic approximation techniques to efficiently construct confidence intervals for heritability
AU - Schweiger, Regev
AU - Fisher, Eyal
AU - Rahmani, Elior
AU - Shenhav, Liat
AU - Rosset, Saharon
AU - Halperin, Eran
N1 - Publisher Copyright:
© Copyright 2018, Mary Ann Liebert, Inc.
PY - 2018/7
Y1 - 2018/7
N2 - Estimation of heritability is an important task in genetics. The use of linear mixed models (LMMs) to determine narrow-sense single-nucleotide polymorphism (SNP)-heritability and related quantities has received much recent attention, due of its ability to account for variants with small effect sizes. Typically, heritability estimation under LMMs uses the restricted maximum likelihood (REML) approach. The common way to report the uncertainty in REML estimation uses standard errors (SEs), which rely on asymptotic properties. However, these assumptions are often violated because of the bounded parameter space, statistical dependencies, and limited sample size, leading to biased estimates and inflated or deflated confidence intervals (CIs). In addition, for larger data sets (e.g., tens of thousands of individuals), the construction of SEs itself may require considerable time, as it requires expensive matrix inversions and multiplications. Here, we present FIESTA (Fast confidence IntErvals using STochastic Approximation), a method for constructing accurate CIs. FIESTA is based on parametric bootstrap sampling, and, therefore, avoids unjustified assumptions on the distribution of the heritability estimator. FIESTA uses stochastic approximation techniques, which accelerate the construction of CIs by several orders of magnitude, compared with previous approaches as well as to the analytical approximation used by SEs. FIESTA builds accurate CIs rapidly, for example, requiring only several seconds for data sets of tens of thousands of individuals, making FIESTA a very fast solution to the problem of building accurate CIs for heritability for all data set sizes.
AB - Estimation of heritability is an important task in genetics. The use of linear mixed models (LMMs) to determine narrow-sense single-nucleotide polymorphism (SNP)-heritability and related quantities has received much recent attention, due of its ability to account for variants with small effect sizes. Typically, heritability estimation under LMMs uses the restricted maximum likelihood (REML) approach. The common way to report the uncertainty in REML estimation uses standard errors (SEs), which rely on asymptotic properties. However, these assumptions are often violated because of the bounded parameter space, statistical dependencies, and limited sample size, leading to biased estimates and inflated or deflated confidence intervals (CIs). In addition, for larger data sets (e.g., tens of thousands of individuals), the construction of SEs itself may require considerable time, as it requires expensive matrix inversions and multiplications. Here, we present FIESTA (Fast confidence IntErvals using STochastic Approximation), a method for constructing accurate CIs. FIESTA is based on parametric bootstrap sampling, and, therefore, avoids unjustified assumptions on the distribution of the heritability estimator. FIESTA uses stochastic approximation techniques, which accelerate the construction of CIs by several orders of magnitude, compared with previous approaches as well as to the analytical approximation used by SEs. FIESTA builds accurate CIs rapidly, for example, requiring only several seconds for data sets of tens of thousands of individuals, making FIESTA a very fast solution to the problem of building accurate CIs for heritability for all data set sizes.
KW - confidence intervals
KW - heritability
KW - stochastic approximation
UR - http://www.scopus.com/inward/record.url?scp=85050305542&partnerID=8YFLogxK
U2 - 10.1089/cmb.2018.0047
DO - 10.1089/cmb.2018.0047
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
AN - SCOPUS:85050305542
SN - 1066-5277
VL - 25
SP - 794
EP - 808
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 7
ER -