TY - JOUR
T1 - Identifying differentially expressed genes using false discovery rate controlling procedures
AU - Reiner, Anat
AU - Yekutieli, Daniel
AU - Benjamini, Yoav
N1 - Funding Information: This research has been partially supported by the F.I.R.S.T. grant from the Israeli Academy of Sciences and Humanities. Y. Benjamini has been partially supported by an NIH grant and a U.S.–Israel Binational Science Foundation grant.
PY - 2003/2/12
Y1 - 2003/2/12
N2 - Motivation: DNA microarrays have recently been used for the purpose of monitoring expression levels of thousands of genes simultaneously and identifying those genes that are differentially expressed. The probability that a false identification (type I error) is committed can increase sharply when the number of tested genes gets large. Correlation between the test statistics attributed to gene co-regulation and dependency in the measurement errors of the gene expression levels further complicates the problem. In this paper we address this very large multiplicity problem by adopting the false discovery rate (FDR) controlling approach. In order to address the dependency problem, we present three resampling-based FDR controlling procedures that account for the test statistics distribution, and compare their performance to that of the naïve application of the linear step-up procedure in Benjamini and Hochberg (1995). The procedures are studied using simulated microarray data, and their performance is examined relative to their ease of implementation. Results: Comparative simulation analysis shows that all four FDR controlling procedures control the FDR at the desired level, and retain substantially more power than the family-wise error rate controlling procedures. In terms of power, using resampling of the marginal distribution of each test statistic substantially improves the performance over the naïve one. The highest power is achieved, at the expense of a more sophisticated algorithm, by the resampling-based procedures that resample the joint distribution of the test statistics and estimate the level of FDR control.
AB - Motivation: DNA microarrays have recently been used for the purpose of monitoring expression levels of thousands of genes simultaneously and identifying those genes that are differentially expressed. The probability that a false identification (type I error) is committed can increase sharply when the number of tested genes gets large. Correlation between the test statistics attributed to gene co-regulation and dependency in the measurement errors of the gene expression levels further complicates the problem. In this paper we address this very large multiplicity problem by adopting the false discovery rate (FDR) controlling approach. In order to address the dependency problem, we present three resampling-based FDR controlling procedures that account for the test statistics distribution, and compare their performance to that of the naïve application of the linear step-up procedure in Benjamini and Hochberg (1995). The procedures are studied using simulated microarray data, and their performance is examined relative to their ease of implementation. Results: Comparative simulation analysis shows that all four FDR controlling procedures control the FDR at the desired level, and retain substantially more power than the family-wise error rate controlling procedures. In terms of power, using resampling of the marginal distribution of each test statistic substantially improves the performance over the naïve one. The highest power is achieved, at the expense of a more sophisticated algorithm, by the resampling-based procedures that resample the joint distribution of the test statistics and estimate the level of FDR control.
UR - http://www.scopus.com/inward/record.url?scp=0037433040&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btf877
DO - 10.1093/bioinformatics/btf877
M3 - Article
AN - SCOPUS:0037433040
SN - 1367-4803
VL - 19
SP - 368
EP - 375
JO - Bioinformatics
JF - Bioinformatics
IS - 3
ER -