A multi-lab experimental assessment reveals that replicability can be improved by using empirical estimates of genotype-by-lab interaction

Iman Jaljuli*, Neri Kafkafi, Eliezer Giladi, Ilan Golani, Illana Gozes, Elissa J. Chesler, Molly A. Bogue, Yoav Benjamini

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

The utility of mouse and rat studies critically depends on their replicability in other laboratories. A widely advocated approach to improving replicability is through the rigorous control of predefined animal or experimental conditions, known as standardization. However, this approach limits the generalizability of the findings to only the standardized conditions and is a potential cause of, rather than solution to, what has been called a replicability crisis. Alternative strategies include estimating the heterogeneity of effects across laboratories, either through designs that vary testing conditions or by direct statistical analysis of laboratory variation. We previously evaluated our statistical approach for estimating the interlaboratory replicability of a single-laboratory discovery. Those results, however, were from a well-coordinated, multi-lab phenotyping study and did not extend to the more realistic setting in which laboratories operate independently of each other. Here, we sought to test our statistical approach in a realistic prospective experiment, in mice, using 152 results from 5 independent published studies deposited in the Mouse Phenome Database (MPD). In independent replication experiments at 3 laboratories, we found that 53 of the results were replicable, so the other 99 were considered non-replicable. Of the 99 non-replicable results, 59 were statistically significant (at 0.05) in their original single-lab analysis, putting the probability that a single-lab statistical discovery was made even though it is non-replicable at 59.6%. We then introduced the dimensionless “Genotype-by-Laboratory” (GxL) factor: the ratio between the standard deviation of the GxL interaction and the standard deviation within groups.
Adjusting the single-lab analysis with the GxL factor reduced the number of single-lab statistical discoveries and, with it, the probability that a non-replicable result is discovered in the single lab, to 12.1%. Such a reduction naturally leads to reduced power to make replicable discoveries, but this reduction was small (from 87% to 66%), indicating the small price paid for the large improvement in replicability. Tools and data needed for the above GxL adjustment are publicly available at the MPD and will become increasingly useful as the range of assays and testing conditions in this resource increases.
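The arithmetic and the GxL factor described above can be illustrated with a short Python sketch. The abstract does not state the adjustment formula, so the sketch assumes one plausible form: the squared standard error of a two-genotype comparison is inflated by the interaction variance of both groups, which amounts to adding 2γ² under the square root, where γ is the GxL factor. The function names and the example numbers (means, standard deviations, group sizes) are hypothetical and chosen only for illustration.

```python
import math


def gxl_factor(sd_interaction: float, sd_within: float) -> float:
    """Dimensionless GxL factor: SD of the GxL interaction over the within-group SD."""
    return sd_interaction / sd_within


def t_statistic(mean_a: float, mean_b: float, sd_within: float,
                n_a: int, n_b: int, gamma: float = 0.0) -> float:
    """Two-genotype t-like statistic from a single lab.

    ASSUMPTION (not stated in the abstract): the adjustment adds the
    interaction variance of both groups to the squared standard error,
    i.e. SE = sd_within * sqrt(1/n_a + 1/n_b + 2 * gamma**2).
    With gamma = 0 this is the usual pooled-variance statistic.
    """
    se = sd_within * math.sqrt(1 / n_a + 1 / n_b + 2 * gamma ** 2)
    return (mean_a - mean_b) / se


# Share of non-replicable results that were significant in the original
# single-lab analysis, using the counts reported in the abstract.
share_significant = 59 / 99
print(f"{share_significant:.1%}")

# Hypothetical example: the same observed difference, with and without adjustment.
gamma = gxl_factor(sd_interaction=1.0, sd_within=2.0)
naive = t_statistic(12.0, 10.0, sd_within=2.0, n_a=10, n_b=10)
adjusted = t_statistic(12.0, 10.0, sd_within=2.0, n_a=10, n_b=10, gamma=gamma)
print(f"gamma={gamma:.2f} naive={naive:.3f} adjusted={adjusted:.3f}")
```

Under this assumed form, a larger GxL factor shrinks the statistic for the same observed difference, which is consistent with the abstract's report of fewer single-lab discoveries after adjustment. The appropriate degrees of freedom for a p-value are deliberately left out of the sketch.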

Original language: English
Article number: e3002082
Journal: PLoS Biology
Volume: 21
Issue number: 5
State: Published - May 2023

Funding

Funders                                              Funder number
NIH-NSF                                              DA045401
National Science Foundation                          BSF-NSF 2016746
National Institutes of Health                        DA028420
United States-Israel Binational Science Foundation
