Methodology (stat.ME)
Tue, 16 May 2023
1. Errors-in-variables Fréchet Regression with Low-rank Covariate Approximation
Authors: Kyunghee Han, Dogyoon Song
Abstract: Fréchet regression has emerged as a promising approach for regression analysis involving non-Euclidean response variables. However, its practical applicability has been hindered by its reliance on ideal scenarios with abundant and noiseless covariate data. In this paper, we present a novel estimation method that tackles these limitations by leveraging the low-rank structure inherent in the covariate matrix. Our proposed framework combines the concepts of global Fréchet regression and principal component regression, aiming to improve the efficiency and accuracy of the regression estimator. By incorporating the low-rank structure, our method enables more effective modeling and estimation, particularly in high-dimensional and errors-in-variables regression settings. We provide a theoretical analysis of the proposed estimator's large-sample properties, including a comprehensive rate analysis of bias, variance, and additional variations due to measurement errors. Furthermore, our numerical experiments provide empirical evidence that supports the theoretical findings, demonstrating the superior performance of our approach. Overall, this work introduces a promising framework for regression analysis of non-Euclidean variables, effectively addressing the challenges associated with limited and noisy covariate data, with potential applications in diverse fields.
2. Smooth hazards with multiple time scales
Authors: Angela Carollo, Paul H. C. Eilers, Hein Putter, Jutta Gampe
Abstract: Hazard models are the most commonly used tool to analyse time-to-event data. If more than one time scale is relevant for the event under study, models are required that can incorporate the dependence of a hazard along two (or more) time scales. Such models should be flexible enough to capture the joint influence of several time scales, and nonparametric smoothing techniques are obvious candidates. P-splines offer a flexible way to specify such hazard surfaces, and estimation is achieved by maximizing a penalized Poisson likelihood. Standard observation schemes, such as right-censoring and left-truncation, can be accommodated in a straightforward manner. The model can be extended to proportional hazards regression with a baseline hazard varying over two scales. Generalized linear array model (GLAM) algorithms allow efficient computations, which are implemented in a companion R-package.
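Note: the penalized Poisson likelihood mentioned in this abstract can be illustrated with a deliberately simplified sketch. It assumes a single time scale split into bins, one log-hazard value per bin (no B-spline basis), and a second-order difference penalty; the function names and the smoothing weight are hypothetical and are not taken from the paper or its R package.

```python
import math


def second_differences(values):
    """Return second-order differences v[i] - 2*v[i+1] + v[i+2]."""
    return [values[i] - 2.0 * values[i + 1] + values[i + 2]
            for i in range(len(values) - 2)]


def penalized_poisson_loglik(events, exposure, log_hazard, smoothing):
    """Poisson log-likelihood (up to a constant) minus a roughness penalty.

    events[i]     : number of events observed in bin i
    exposure[i]   : total time at risk in bin i
    log_hazard[i] : candidate log-hazard in bin i
    smoothing     : penalty weight (assumed >= 0)
    """
    # Expected events in bin i are exposure[i] * exp(log_hazard[i]).
    # Terms that do not depend on log_hazard are dropped.
    loglik = sum(y * eta - r * math.exp(eta)
                 for y, r, eta in zip(events, exposure, log_hazard))
    # Penalize wiggliness of the log-hazard via squared second differences.
    roughness = sum(d * d for d in second_differences(log_hazard))
    return loglik - 0.5 * smoothing * roughness


flat = penalized_poisson_loglik([2, 0, 1], [10, 10, 10], [0.0, 0.0, 0.0], 1.0)
bumpy = penalized_poisson_loglik([2, 0, 1], [10, 10, 10], [0.0, 1.0, 0.0], 1.0)
```

A constant or linear log-hazard incurs no penalty, so a larger smoothing weight pulls the estimate toward a log-linear hazard. The two-scale setting described in the abstract would instead penalize the coefficients of a tensor-product spline basis along each time scale.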
3. Sparse-group SLOPE: adaptive bi-level selection with FDR-control
Authors: Fabio Feser, Marina Evangelou
Abstract: In this manuscript, a new high-dimensional approach for simultaneous variable and group selection is proposed, called sparse-group SLOPE (SGS). SGS achieves false discovery rate control at both variable and group levels by incorporating the SLOPE model into a sparse-group framework and exploiting grouping information. A proximal algorithm is implemented for fitting SGS that works for both Gaussian and Binomial distributed responses. Through the analysis of both synthetic and real datasets, the proposed SGS approach is found to outperform other existing lasso- and SLOPE-based models for bi-level selection and prediction accuracy. Further, model selection and noise estimation approaches for selecting the tuning parameter of the regularisation model are proposed and explored.
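Note: as background for this abstract, the sketch below computes the sorted-L1 penalty that defines plain SLOPE, together with the Benjamini-Hochberg-style weight sequence commonly paired with it. It covers only the variable-level penalty, not the combined variable-and-group penalty of SGS, and the function names are hypothetical.

```python
from statistics import NormalDist


def bh_lambda_sequence(num_variables, fdr_level):
    """Non-increasing weights lambda_i = Phi^{-1}(1 - i*q/(2p)), i = 1..p."""
    normal = NormalDist()
    return [normal.inv_cdf(1.0 - i * fdr_level / (2.0 * num_variables))
            for i in range(1, num_variables + 1)]


def sorted_l1_penalty(coefficients, weights):
    """Sum of weights[i] times the i-th largest absolute coefficient."""
    magnitudes = sorted((abs(b) for b in coefficients), reverse=True)
    return sum(w * m for w, m in zip(weights, magnitudes))


weights = bh_lambda_sequence(num_variables=5, fdr_level=0.1)
penalty = sorted_l1_penalty([0.0, -2.0, 0.5, 0.0, 1.0], weights)
```

Because the largest coefficients are matched with the largest weights, the penalty adapts to the number of nonzero coefficients. With all weights equal it reduces to the ordinary lasso penalty.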