Nonparametric Predictive Inference For Reproducibility of One-Way Layout Tests
Date
2024-09
Publisher
Durham University
Abstract
The reproducibility of research findings is of major interest in many disciplines. Reproducibility
of a statistical test means that, if the experiment were repeated under the same conditions, it
would lead to the same conclusion with regard to rejection of the null hypothesis. The probability
that the repeated test would reach the same conclusion as the original test is called the
reproducibility probability (RP). The concept of test reproducibility is inherently a predictive
inference problem. This thesis investigates the reproducibility of One-Way Layout tests
using Nonparametric Predictive Inference (NPI). NPI is a predictive approach based on few
modelling assumptions. It considers multiple future observations that are exchangeable with
the data observations, which makes it suitable for inference about reproducibility.
In NPI, the uncertainty about reproducibility is quantified through lower and upper
reproducibility probabilities.
This thesis considers reproducibility of tests for general alternatives, including the
Kruskal-Wallis test and the one-way ANOVA test, as well as the Jonckheere-Terpstra test for the
ordered alternative hypothesis. It also considers reproducibility probabilities for tests for
umbrella alternatives, specifically the Mack-Wolfe test and the Esra-Fikri test, as well as for
slippage tests, namely the Mosteller test. Deriving the exact NPI lower and upper reproducibility
probabilities is not trivial for some tests and is computationally challenging for large sample
sizes. To address these difficulties, two NPI-based approaches are implemented, namely
NPI sampling of orderings and NPI-bootstrap. The NPI reproducibility is low
when the test statistic is close to the threshold between rejecting and not rejecting the null
hypothesis. If the test statistic is close to the rejection threshold for tests with directional
alternatives, reproducibility tends to be lower for rejection of the null hypothesis than for
non-rejection. This may be problematic, particularly because rejection of the null hypothesis
is often the main goal of statistical experiments.
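As a rough illustration of the bootstrap idea mentioned above, the following Python sketch estimates a reproducibility probability for the Kruskal-Wallis test. It is not the implementation used in the thesis. It assumes the data lie in a known finite range (`lower`, `upper`), have no ties, and that each future value is drawn uniformly from one of the n+1 intervals created by the current data, after which it is added to the data before the next draw. The function names, the example data and the chi-square critical value for three groups are all assumptions made for this sketch. It returns a single point estimate, not the NPI lower and upper reproducibility probabilities.

```python
import bisect
import math
import random


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic without tie correction (assumes no ties)."""
    pooled = sorted((x, g) for g, xs in enumerate(groups) for x in xs)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    for rank, (_, g) in enumerate(pooled, start=1):
        rank_sums[g] += rank
    weighted = sum(r * r / len(xs) for r, xs in zip(rank_sums, groups))
    return 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)


def npi_bootstrap_sample(data, lower, upper, rng):
    """Draw len(data) future values, one at a time.

    Assumption of this sketch: lower < min(data) and upper > max(data).
    Each draw picks one of the intervals between consecutive points with
    equal probability, samples uniformly within it, and then adds the
    new value to the set of points.
    """
    points = sorted(data)
    future = []
    for _ in range(len(data)):
        cuts = [lower] + points + [upper]
        i = rng.randrange(len(cuts) - 1)  # all intervals equally likely
        value = rng.uniform(cuts[i], cuts[i + 1])
        future.append(value)
        bisect.insort(points, value)
    return future


def estimate_rp(groups, lower, upper, critical_value, n_boot=1000, seed=0):
    """Fraction of bootstrapped future experiments with the same test conclusion."""
    rng = random.Random(seed)
    original_reject = kruskal_wallis_h(groups) >= critical_value
    same = 0
    for _ in range(n_boot):
        future = [npi_bootstrap_sample(g, lower, upper, rng) for g in groups]
        if (kruskal_wallis_h(future) >= critical_value) == original_reject:
            same += 1
    return original_reject, same / n_boot


# Hypothetical data for three groups, assumed to lie in the range [0, 10].
groups = [
    [1.2, 2.3, 3.1, 4.8],
    [2.9, 3.7, 5.5, 6.1],
    [6.4, 7.2, 8.0, 9.3],
]
# Large-sample approximation: 0.95 quantile of chi-square with 2 degrees
# of freedom, which has the closed form -2 * ln(0.05).
critical_value = -2.0 * math.log(0.05)
reject, rp = estimate_rp(groups, 0.0, 10.0, critical_value)
print(f"original test rejects H0: {reject}, estimated RP: {rp:.3f}")
```

Repeating the estimate with different seeds gives a sense of its variability. Moving the original statistic closer to the critical value would be expected to lower the estimate, in line with the behaviour described in the abstract.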
Keywords
Nonparametric Predictive Inference, Reproducibility, One-Way Layout Tests