arXiv daily

Methodology (stat.ME)

Thu, 20 Jul 2023

1. Distributional Regression for Data Analysis

Authors: Nadja Klein

Abstract: Flexible modeling of how an entire distribution changes with covariates is an important yet challenging generalization of mean-based regression that has seen growing interest over the past decades in both the statistics and machine learning literature. This review outlines selected state-of-the-art statistical approaches to distributional regression, complemented with alternatives from machine learning. Topics covered include the similarities and differences between these approaches, extensions, properties and limitations, estimation procedures, and the availability of software. In view of the increasing complexity and availability of large-scale data, this review also discusses the scalability of traditional estimation methods, current trends, and open challenges. Illustrations are provided using data on childhood malnutrition in Nigeria and Australian electricity prices.

2. Multilevel latent class analysis with covariates: Analysis of cross-national citizenship norms with a two-stage approach

Authors: Roberto Di Mari, Zsuzsa Bakk, Jennifer Oser, Jouni Kuha

Abstract: This paper focuses on the substantive application of multilevel LCA to the evolution of citizenship norms in a diverse array of democratic countries. To do so, we present a two-stage approach to fit multilevel latent class models: in the first stage (measurement model construction), unconditional class enumeration is done separately on both low- and high-level latent variables, estimating only a part of the model at a time -- hence keeping the remaining part fixed -- and then updating the full measurement model; in the second stage (structural model construction), individual and/or group covariates are included in the model. By separating the two parts -- first stage and second stage of model building -- the measurement model is stabilized and is allowed to be determined only by its indicators. Moreover, this two-stage approach makes the inclusion/exclusion of a covariate a relatively simple task to handle. Our proposal amends common practice in applied social science research, where simple (low-level) LCA is done to obtain a classification of low-level units, and this is then related to (low- and high-level) covariates simply by including group fixed effects. Our analysis identifies latent classes that score either consistently high or consistently low on all measured items, along with two theoretically important classes that place distinctive emphasis on items related to engaged citizenship and duty-based norms.

3. A criterion and incremental design construction for simultaneous kriging predictions

Authors: Helmut Waldl, Werner G. Müller

Abstract: In this paper, we further investigate the problem of selecting a set of design points for universal kriging, which is a widely used technique for spatial data analysis. Our goal is to select the design points in order to make simultaneous predictions of the random variable of interest at a finite number of unsampled locations with maximum precision. Specifically, we consider as response a correlated random field given by a linear model with an unknown parameter vector and a spatial error correlation structure. We propose a new design criterion that aims at simultaneously minimizing the variation of the prediction errors at various points. We also present various efficient techniques for incrementally building designs for that criterion that scale well to high dimensions. Thus, the method is particularly suitable for big data applications in areas of spatial data analysis such as mining, hydrogeology, natural resource monitoring, and environmental sciences or, equivalently, for any computer simulation experiment. The effectiveness of the proposed designs is demonstrated through numerical examples.
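
The idea of building a design incrementally for several prediction points at once can be sketched as follows. This is only an illustration under loudly stated assumptions, not the authors' criterion or algorithm: it uses simple kriging (known zero mean, unit variance) instead of universal kriging, an exponential correlation function, and the plain sum of prediction variances over the target locations as a stand-in criterion. The names `corr`, `solve`, `total_variance`, and `greedy_design` are hypothetical.

```python
import math


def corr(s, t, length=1.0):
    """Exponential correlation between two points (an assumed model)."""
    return math.exp(-math.dist(s, t) / length)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def total_variance(design, targets, nugget=1e-9):
    """Sum of simple-kriging prediction variances over all target points."""
    if not design:
        return float(len(targets))  # prior variance is 1 at every target
    # Correlation matrix of the design, with a tiny nugget for stability.
    k = [[corr(p, q) + (nugget if i == j else 0.0)
          for j, q in enumerate(design)]
         for i, p in enumerate(design)]
    total = 0.0
    for t in targets:
        c = [corr(p, t) for p in design]
        w = solve(k, c)  # kriging weights
        total += 1.0 - sum(wi * ci for wi, ci in zip(w, c))
    return total


def greedy_design(candidates, targets, n_points):
    """Add, one at a time, the candidate that most reduces the criterion."""
    design, remaining = [], list(candidates)
    for _ in range(n_points):
        best = min(remaining,
                   key=lambda p: total_variance(design + [p], targets))
        design.append(best)
        remaining.remove(best)
    return design


# Usage: choose 4 design points from a 5 x 5 grid for three target locations.
grid = [(i / 4, j / 4) for i in range(5) for j in range(5)]
targets = [(0.1, 0.2), (0.6, 0.6), (0.9, 0.3)]
design = greedy_design(grid, targets, 4)
print(design, total_variance(design, targets))
```

Each greedy step re-evaluates the criterion from scratch, which is easy to read but costly; an implementation aimed at large problems would presumably update the inverse correlation matrix incrementally.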

4. Unbiased analytic non-parametric correlation estimators in the presence of ties

Authors: Landon Hurley

Abstract: An inner-product Hilbert space formulation is defined over a domain of all permutations with ties upon the extended real line. We demonstrate that this formulation resolves the common first and second order biases found in the pervasive Kendall and Spearman non-parametric correlation estimators, while presenting as unbiased minimum variance (Gauss-Markov) estimators. We conclude by showing upon finite samples that a strictly sub-Gaussian probability distribution is to be preferred for the Kemeny $\tau_{\kappa}$ and $\rho_{\kappa}$ estimators, allowing for the construction of expected Wald test statistics which are analytically consistent with the Gauss-Markov properties upon finite samples.
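
One familiar symptom of ties in Kendall's estimator can be shown in a few lines. The sketch below is not the Kemeny $\tau_{\kappa}$ estimator of the abstract; it only computes the classical tau-a and tau-b statistics, and the function name `kendall_tau` is hypothetical. With tied values, tau-a of a sample with itself falls below 1, whereas tau-b rescales by the number of untied pairs.

```python
import math
from itertools import combinations


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau(x, y):
    """Return (tau_a, tau_b) for two equal-length sequences."""
    s = 0          # concordant minus discordant pairs
    untied_x = 0   # pairs not tied in x
    untied_y = 0   # pairs not tied in y
    for i, j in combinations(range(len(x)), 2):
        sx, sy = sign(x[i] - x[j]), sign(y[i] - y[j])
        s += sx * sy
        untied_x += sx != 0
        untied_y += sy != 0
    n_pairs = len(x) * (len(x) - 1) // 2
    tau_a = s / n_pairs
    tau_b = s / math.sqrt(untied_x * untied_y)
    return tau_a, tau_b


# Usage: a sample with one tied pair, correlated with itself.
x = [1, 1, 2, 3]
print(kendall_tau(x, x))
```

Here five of the six pairs are untied, so tau-a equals 5/6 even though the two rankings are identical, while tau-b equals 1.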

5. Studentising non-parametric correlation estimators

Authors: Landon Hurley

Abstract: Studentisation upon rank-based linear estimators is generally considered an unnecessary topic, due to the domain restriction upon $S_{n}$, which exhibits constant variance. This assertion is, however, functionally inconsistent with general analytic practice. We introduce a general unbiased and minimum variance estimator upon the Beta-Binomially distributed Kemeny Hilbert space, which allows for permutation ties to exist and be uniquely measured. As individual permutation samples now exhibit unique random variance, a sample-dependent variance estimator must now be introduced into the linear model. We derive and prove the Slutsky conditions to enable $t_{\nu}$-distributed Wald test statistics to be constructed, while stably exhibiting Gauss-Markov properties upon finite samples. Simulations demonstrate convergent decisions upon the two orthonormal Slutsky-corrected Wald test statistics, verifying the projective geometric duality which exists upon the affine-linear Kemeny metric.