TY - JOUR
T1 - Fast calculation of p-values for one-sided Kolmogorov-Smirnov type statistics
AU - Moscovich, Amit
N1 - Publisher Copyright: © 2023 Elsevier B.V.
PY - 2023/9
Y1 - 2023/9
AB - A novel method for computing exact p-values of one-sided statistics from the Kolmogorov-Smirnov family is presented. It covers the Higher Criticism statistic, one-sided weighted Kolmogorov-Smirnov statistics, and the one-sided Berk-Jones statistics. In addition to p-values, the method can also be used for power analysis, finding alpha-level thresholds, and the construction of confidence bands for the empirical distribution function. With its quadratic runtime and numerical stability, the method easily scales to sample sizes in the hundreds of thousands and takes less than a second to run on a sample size of 25,000. This allows practitioners working on large data sets to use exact finite-sample computations instead of approximation schemes. The method is based on a reduction to the boundary-crossing probability of a pure jump stochastic process. FFT convolutions of two different sizes are then used to efficiently propagate the probabilities of the non-crossing paths. This approach has applications beyond statistics, for example in financial risk modeling.
KW - Boundary crossing
KW - Continuous goodness-of-fit
KW - Higher criticism
KW - Hypothesis testing
KW - Stochastic process
UR - http://www.scopus.com/inward/record.url?scp=85158071224&partnerID=8YFLogxK
U2 - 10.1016/j.csda.2023.107769
DO - 10.1016/j.csda.2023.107769
M3 - Article
AN - SCOPUS:85158071224
SN - 0167-9473
VL - 185
JO - Computational Statistics and Data Analysis
JF - Computational Statistics and Data Analysis
M1 - 107769
ER -