TY - JOUR
T1 - Optimal control of false discovery criteria in the two-group model
AU - Heller, Ruth
AU - Rosset, Saharon
N1 - Publisher Copyright: © 2020 Royal Statistical Society
PY - 2021/2
Y1 - 2021/2
AB - The highly influential two-group model in testing a large number of statistical hypotheses assumes that the test statistics are drawn independently from a mixture of a high probability null distribution and a low probability alternative. Optimal control of the marginal false discovery rate (mFDR), in the sense that it provides maximal power (expected true discoveries) subject to mFDR control, is known to be achieved by thresholding the local false discovery rate (locFDR), the probability of the hypothesis being null given the set of test statistics, with a fixed threshold. We address the challenge of controlling optimally the popular false discovery rate (FDR) or positive FDR (pFDR) in the general two-group model, which also allows for dependence between the test statistics. These criteria are less conservative than the mFDR criterion, so they make more rejections in expectation. We derive their optimal multiple testing (OMT) policies, which turn out to be thresholding the locFDR with a threshold that is a function of the entire set of statistics. We develop an efficient algorithm for finding these policies, and use it for problems with thousands of hypotheses. We illustrate these procedures on gene expression studies.
KW - false discovery rate
KW - infinite linear programming
KW - large-scale inference
KW - multiple testing
KW - positive FDR
UR - http://www.scopus.com/inward/record.url?scp=85097009377&partnerID=8YFLogxK
U2 - 10.1111/rssb.12403
DO - 10.1111/rssb.12403
M3 - Article
AN - SCOPUS:85097009377
SN - 1369-7412
VL - 83
SP - 133
EP - 155
JO - Journal of the Royal Statistical Society. Series B: Statistical Methodology
JF - Journal of the Royal Statistical Society. Series B: Statistical Methodology
IS - 1
ER -