Methodology (stat.ME)
Mon, 08 May 2023
1. Replication of "null results" -- Absence of evidence or evidence of absence?
Authors: Samuel Pawel, Rachel Heyard, Charlotte Micheloud, Leonhard Held
Abstract: In several large-scale replication projects, statistically non-significant results in both the original and the replication study have been interpreted as a "replication success". Here we discuss the logical problems with this approach: Non-significance in both studies does not ensure that the studies provide evidence for the absence of an effect and "replication success" can virtually always be achieved if the sample sizes are small enough. In addition, the relevant error rates are not controlled. We show how methods, such as equivalence testing and Bayes factors, can be used to adequately quantify the evidence for the absence of an effect and how they can be applied in the replication setting. Using data from the Reproducibility Project: Cancer Biology we illustrate that many original and replication studies with "null results" are in fact inconclusive, and that their replicability is lower than suggested by the non-significance approach. We conclude that it is important to also replicate studies with statistically non-significant results, but that they should be designed, analyzed, and interpreted appropriately.
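As a rough illustration of the equivalence-testing idea mentioned in this abstract, the sketch below contrasts a plain non-significance check with a two one-sided tests (TOST) procedure under a normal approximation. It is not the authors' analysis: the function names, the equivalence margin, and the two effect estimates are hypothetical values chosen only to show that a non-significant result can be either inconclusive or genuine evidence for the absence of an effect.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def two_sided_p(estimate, se):
    """Two-sided p-value for H0: effect = 0 (normal approximation)."""
    z = abs(estimate / se)
    return 2.0 * (1.0 - STD_NORMAL.cdf(z))


def tost_p(estimate, se, margin):
    """TOST p-value for H0: |effect| >= margin (normal approximation).

    Equivalence is claimed only if BOTH one-sided tests reject,
    so the overall p-value is the larger of the two.
    """
    # One-sided test of H0: effect <= -margin
    p_lower = 1.0 - STD_NORMAL.cdf((estimate + margin) / se)
    # One-sided test of H0: effect >= +margin
    p_upper = STD_NORMAL.cdf((estimate - margin) / se)
    return max(p_lower, p_upper)


# Hypothetical studies (illustrative numbers, not taken from any project).
MARGIN = 0.3  # assumed equivalence margin on the effect-size scale
studies = {
    "small study": {"estimate": 0.10, "se": 0.50},
    "large study": {"estimate": 0.02, "se": 0.05},
}

for name, s in studies.items():
    p_null = two_sided_p(s["estimate"], s["se"])
    p_equiv = tost_p(s["estimate"], s["se"], MARGIN)
    print(f"{name}: p(effect = 0) = {p_null:.3f}, p(TOST) = {p_equiv:.3g}")
```

Both hypothetical studies are non-significant at the 5% level, but only the large one rejects the TOST null hypothesis; the small one is inconclusive, which is the distinction the abstract draws.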
2. Neural Likelihood Surfaces for Spatial Processes with Computationally Intensive or Intractable Likelihoods
Authors: Julia Walchessen, Amanda Lenzi, Mikael Kuusela
Abstract: In spatial statistics, fast and accurate parameter estimation coupled with a reliable means of uncertainty quantification can be a challenging task when fitting a spatial process to real-world data because the likelihood function might be slow to evaluate or intractable. In this work, we propose using convolutional neural networks (CNNs) to learn the likelihood function of a spatial process. Through a specifically designed classification task, our neural network implicitly learns the likelihood function, even in situations where the exact likelihood is not explicitly available. Once trained on the classification task, our neural network is calibrated using Platt scaling, which improves the accuracy of the neural likelihood surfaces. To demonstrate our approach, we compare maximum likelihood estimates and approximate confidence regions constructed from the neural likelihood surface with their counterparts from the exact or approximate likelihood for two different spatial processes: a Gaussian Process, which has a computationally intensive likelihood function for large datasets, and a Brown-Resnick Process, which has an intractable likelihood function. We also compare the neural likelihood surfaces to the exact and approximate likelihood surfaces for the Gaussian Process and Brown-Resnick Process, respectively. We conclude that our method provides fast and accurate parameter estimation with a reliable method of uncertainty quantification in situations where standard methods are either undesirably slow or inaccurate.
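The calibration step named in this abstract, Platt scaling, can be sketched generically as fitting a one-dimensional logistic regression on a classifier's raw scores. The abstract does not describe the network, the classification task, or the calibration details, so everything below is an assumption for illustration: the simulated "overconfident" scores stand in for CNN outputs, and the plain gradient-descent fit and all function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_loss(scores, labels, a=1.0, b=0.0):
    """Mean negative log-likelihood of labels under sigmoid(a * score + b)."""
    total = 0.0
    for s, y in zip(scores, labels):
        p = min(max(sigmoid(a * s + b), 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(scores)


def fit_platt(scores, labels, lr=0.1, steps=500):
    """Fit Platt scaling parameters (a, b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0  # start from the uncalibrated classifier
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def simulate_overconfident_scores(n, inflation=3.0, seed=0):
    """Toy held-out set: labels follow sigmoid(true_logit), but the reported
    score is the true logit multiplied by `inflation` (overconfidence)."""
    rng = random.Random(seed)
    scores, labels = [], []
    for _ in range(n):
        true_logit = rng.gauss(0.0, 1.0)
        labels.append(1 if rng.random() < sigmoid(true_logit) else 0)
        scores.append(inflation * true_logit)
    return scores, labels


scores, labels = simulate_overconfident_scores(1000)
a, b = fit_platt(scores, labels)
print(f"fitted a = {a:.3f}, b = {b:.3f}")
print(f"log loss before: {log_loss(scores, labels):.4f}")
print(f"log loss after:  {log_loss(scores, labels, a, b):.4f}")
```

In this toy setup the fitted slope should land near 1/3, undoing the artificial inflation, and the calibrated log loss should drop below the uncalibrated one. In practice the parameters would be fitted on held-out data rather than the training set.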
3. Peak-Persistence Diagrams for Estimating Shapes and Functions from Noisy Data
Authors: Woo Min Kim, Sutanoy Dasgupta, Anuj Srivastava
Abstract: Estimating signals underlying noisy data is a significant problem in statistics and engineering. Numerous estimators are available in the literature, depending on the observation model and estimation criterion. This paper introduces a framework that estimates the shape of the unknown signal and the signal itself. The approach utilizes a peak-persistence diagram (PPD), a novel tool that explores the dominant peaks in the potential solutions and estimates the function's shape, which includes the number of internal peaks and valleys. It then imposes this shape constraint on the search space and estimates the signal from partially-aligned data. This approach balances two previous solutions: averaging without alignment and averaging with complete elastic alignment. From a statistical viewpoint, it achieves an optimal estimator under a model with both additive noise and phase or warping noise. We also present a computationally-efficient procedure for implementing this solution and demonstrate its effectiveness on several simulated and real examples. Notably, this geometric approach outperforms the current state-of-the-art in the field.
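To make the observation model in this abstract concrete, the sketch below simulates curves with both additive noise and phase (warping) noise and shows why plain averaging without alignment, one of the two baseline solutions the abstract mentions, tends to flatten peaks. It does not implement the peak-persistence diagram itself. The two-peak signal, the power-law warps, and all parameter values are assumptions made for illustration.

```python
import math
import random


def true_signal(t):
    """Hypothetical two-peak signal on [0, 1]."""
    peak = lambda c: math.exp(-((t - c) ** 2) / (2 * 0.05 ** 2))
    return peak(0.3) + peak(0.7)


def simulate_curves(n_curves, grid, noise_sd=0.05, warp_spread=0.5, seed=0):
    """Draw curves f(gamma_i(t)) + noise with random warps gamma_i(t) = t ** a_i.

    Each gamma_i is an increasing map of [0, 1] onto itself, so it shifts
    peak locations (phase noise) without changing peak heights.
    """
    rng = random.Random(seed)
    curves = []
    for _ in range(n_curves):
        a = math.exp(rng.uniform(-warp_spread, warp_spread))
        curves.append(
            [true_signal(t ** a) + rng.gauss(0.0, noise_sd) for t in grid]
        )
    return curves


def cross_sectional_mean(curves):
    """Pointwise average across curves, with no alignment step."""
    n = len(curves)
    return [sum(values) / n for values in zip(*curves)]


grid = [i / 200 for i in range(201)]
curves = simulate_curves(100, grid)
mean_curve = cross_sectional_mean(curves)

peak_true = max(true_signal(t) for t in grid)
peak_mean = max(mean_curve)
print(f"true peak height:       {peak_true:.3f}")
print(f"unaligned mean's peak:  {peak_mean:.3f}")
```

Because the warps move each curve's peaks to different locations, the unaligned average smears them out and its maximum falls well below the true peak height. This attenuation is what alignment-based estimators aim to avoid.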