Methodology (stat.ME)
Mon, 29 May 2023
1. Bayesian estimation of clustered dependence structures in functional neuroconnectivity
Authors: Hyoshin Kim, Sujit Ghosh, Emily C. Hector
Abstract: Motivated by the need to model joint dependence between regions of interest in functional neuroconnectivity for efficient inference, we propose a new sampling-based Bayesian clustering approach for covariance structures of high-dimensional Gaussian outcomes. The key technique is based on a Dirichlet process that clusters covariance sub-matrices into independent groups of outcomes, thereby naturally inducing sparsity in the whole brain connectivity matrix. A new split-merge algorithm is employed to improve the mixing of the Markov chain sampler and is shown empirically to recover both uniform and Dirichlet partitions with high accuracy. We investigate the empirical performance of the proposed method through extensive simulations. Finally, the proposed approach is used to group regions of interest into functionally independent groups in the Autism Brain Imaging Data Exchange participants with autism spectrum disorder and attention-deficit/hyperactivity disorder.
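To illustrate how a partition of outcomes induces sparsity in a covariance matrix, here is a minimal sketch, not the authors' sampler. It draws cluster labels from a Chinese restaurant process (the partition distribution implied by a Dirichlet process) and builds the 0/1 pattern of a block-structured covariance in which outcomes in different groups are independent. The function names, the concentration value `alpha = 1.0`, and the number of outcomes are hypothetical choices made only for this example.

```python
import random


def crp_partition(n, alpha, rng):
    """Sample cluster labels for n items from a Chinese restaurant process."""
    labels = []
    counts = []  # counts[k] = current size of cluster k
    for _ in range(n):
        # An existing cluster k is chosen with weight counts[k];
        # a new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return labels


def sparsity_mask(labels):
    """Entry (i, j) is 1 when outcomes i and j share a cluster, else 0."""
    n = len(labels)
    return [[1 if labels[i] == labels[j] else 0 for j in range(n)]
            for i in range(n)]


rng = random.Random(0)          # fixed seed so the sketch is reproducible
labels = crp_partition(8, alpha=1.0, rng=rng)
mask = sparsity_mask(labels)    # zeros mark covariances fixed at 0
```

Every zero in `mask` corresponds to a pair of outcomes assumed independent, so fewer and larger clusters give a denser connectivity matrix.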
2. A flexible Clayton-like spatial copula with application to bounded support data
Authors: Moreno Bevilacqua, Eloy Alvarado, Christian Caamaño-Carrillo
Abstract: The Gaussian copula is a powerful tool that has been widely used to model spatially and/or temporally correlated data with arbitrary marginal distributions. However, this kind of model can be too restrictive since it expresses a reflection symmetric dependence. In this paper, we propose a new spatial copula model that makes it possible to obtain random fields with arbitrary marginal distributions and a type of dependence that can be reflection symmetric or not. In particular, we propose a new random field with uniform marginal distributions that can be viewed as a spatial generalization of the classical Clayton copula model. It is obtained through a power transformation of a specific instance of a beta random field, which in turn is obtained using a transformation of two independent Gamma random fields. For the proposed random field, we study the second-order properties and provide analytic expressions for the bivariate distribution and its correlation. Finally, in the reflection symmetric case, we study the associated geometrical properties. As an application of the proposed model, we focus on spatial modeling of data with bounded support. Specifically, we focus on spatial regression models with marginal distribution of the beta type. In a simulation study, we investigate the use of the weighted pairwise composite likelihood method for the estimation of this model. Finally, the effectiveness of our methodology is illustrated by analyzing point-referenced vegetation index data using the Gaussian copula as benchmark. Our developments have been implemented in an open-source package for the R statistical environment.
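The chain of transformations described above (two Gammas to a beta, then a power transformation to a uniform) can be illustrated at a single location with two standard facts: if X ~ Gamma(a, 1) and Y ~ Gamma(b, 1) are independent then X / (X + Y) ~ Beta(a, b), and if B ~ Beta(a, 1) then B^a ~ Uniform(0, 1). The sketch below is a scalar, non-spatial illustration only. It does not reproduce the paper's random field or its dependence structure, and the shape value `a = 2.5` and the function names are hypothetical.

```python
import random


def beta_from_gammas(a, b, rng):
    """Draw Beta(a, b) as X / (X + Y) with independent unit-scale Gammas."""
    x = rng.gammavariate(a, 1.0)
    y = rng.gammavariate(b, 1.0)
    return x / (x + y)


def uniform_from_beta(a, rng):
    """Beta(a, 1) has CDF t**a on (0, 1), so raising it to the power a gives a uniform."""
    return beta_from_gammas(a, 1.0, rng) ** a


rng = random.Random(42)   # fixed seed for reproducibility
a = 2.5                   # hypothetical shape parameter
draws = [uniform_from_beta(a, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((u - mean) ** 2 for u in draws) / len(draws)
```

For a uniform variable on (0, 1) the mean is 1/2 and the variance is 1/12, so `mean` and `var` should land close to those values.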
3. MLE for the parameters of bivariate interval-valued models
Authors: S. Yaser Samadi, L. Billard, Jiin-Huarng Guo, Wei Xu
Abstract: With contemporary data sets becoming too large to analyze the data directly, various forms of aggregated data are becoming common. The original individual data are points, but after aggregation, the observations are interval-valued (e.g.). While some researchers simply analyze the set of averages of the observations by aggregated class, it is easily established that this approach ignores much of the information in the original data set. The initial theoretical work for interval-valued data was that of Le-Rademacher and Billard (2011), but those results were limited to estimation of the mean and variance of a single variable only. This article seeks to redress the limitation of their work by deriving the maximum likelihood estimator for the all-important covariance statistic, a basic requirement for numerous methodologies, such as regression, principal components, and canonical analyses. Asymptotic properties of the proposed estimators are established. The Le-Rademacher and Billard results emerge as special cases of our wider derivations.
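A small worked example shows why analyzing only the class averages loses information. Under the assumption (made here for illustration, not taken from the abstract) that points are uniformly spread within each interval [a_i, b_i], the variance of the pooled data equals the variance of the midpoints plus the average within-interval variance (b_i - a_i)^2 / 12. The intervals below are made-up numbers, and this is a single-variable sketch, not the bivariate estimator derived in the article.

```python
def interval_moments(intervals):
    """Mean and variance of an equal-weight mixture of uniforms on each interval."""
    n = len(intervals)
    mean = sum((a + b) / 2.0 for a, b in intervals) / n
    # Second moment of Uniform(a, b) is (a^2 + a*b + b^2) / 3.
    second = sum((a * a + a * b + b * b) / 3.0 for a, b in intervals) / n
    return mean, second - mean * mean


def midpoint_variance(intervals):
    """Variance obtained when each interval is replaced by its midpoint."""
    n = len(intervals)
    mids = [(a + b) / 2.0 for a, b in intervals]
    m = sum(mids) / n
    return sum((x - m) ** 2 for x in mids) / n


intervals = [(0.0, 2.0), (1.0, 3.0), (2.0, 6.0)]   # hypothetical data
mean, total_var = interval_moments(intervals)
between = midpoint_variance(intervals)
within = sum((b - a) ** 2 / 12.0 for a, b in intervals) / len(intervals)
```

Here `between` is the part that a midpoint-only analysis sees, and `within` is the spread inside the intervals that it discards; the two add up to `total_var`.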
4. Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models
Authors: Alexandre Mösching, Housen Li, Axel Munk
Abstract: Hidden Markov models (HMMs) are characterized by an unobservable (hidden) Markov chain and an observable process, which is a noisy version of the hidden chain. Decoding the original signal (i.e., hidden chain) from the noisy observations is one of the main goals in nearly all HMM-based data analyses. Existing decoding algorithms such as the Viterbi algorithm have computational complexity at best linear in the length of the observed sequence, and sub-quadratic in the size of the state space of the Markov chain. We present Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer procedure which decodes the hidden sequence in polylogarithmic computational complexity in the length of the sequence, and cubic in the size of the state space, hence particularly suited for large-scale HMMs with relatively few states. The procedure also suggests an effective way of data storage as specific cumulative sums. In essence, the estimated sequence of states sequentially maximizes local likelihood scores among all local paths with at most three segments. The maximization is performed only approximately using an adaptive search procedure. The resulting sequence is admissible in the sense that all transitions occur with positive probability. To complement formal results justifying our approach, we present Monte Carlo simulations which demonstrate the speedups provided by QATS in comparison to Viterbi, along with a precision analysis of the returned sequences. An implementation of QATS in C++ is provided in the R package QATS and is available from GitHub.
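For context on the baseline, the following is a plain textbook sketch of the Viterbi algorithm in log space. It is not QATS and is not taken from the QATS package. This straightforward version visits every observation once and compares all pairs of states at each step, which is where the linear dependence on sequence length comes from. The two-state model and its probabilities are hypothetical values chosen for the example.

```python
import math


def viterbi(obs, log_init, log_trans, log_emit):
    """Return the most likely hidden state path for an observation sequence."""
    m = len(log_init)
    # score[s] = best log-probability of any path ending in state s
    score = [log_init[s] + log_emit[s][obs[0]] for s in range(m)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        pointers = []
        score = []
        for s in range(m):
            best = max(range(m), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[s][o])
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(m), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical two-state HMM with sticky transitions and noisy emissions.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
log_emit = [[math.log(0.9), math.log(0.1)],
            [math.log(0.1), math.log(0.9)]]
path = viterbi([0, 0, 1, 1, 1], log_init, log_trans, log_emit)
```

With these values the decoded `path` follows the observations, switching from state 0 to state 1 once.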