Methodology (stat.ME)
Thu, 27 Jul 2023
1. Causal rule ensemble method for estimating heterogeneous treatment effect with consideration of main effects
Authors: Mayu Hiraishi, Ke Wan, Kensuke Tanioka, Hiroshi Yadohisa, Toshio Shimokawa
Abstract: This study proposes a novel framework based on the RuleFit method to estimate the Heterogeneous Treatment Effect (HTE) in a randomized clinical trial. To achieve this, we adopt the S-learner meta-algorithm as the basis of the proposed framework. The proposed method incorporates rule terms for both the main effect and the treatment effect, which allows the HTE to be expressed in an interpretable rule form. Because the proposed model includes a main effect term, each selected rule represents an HTE that excludes other effects. Numerical simulations confirmed performance equivalent to that of other ensemble learning methods, and a real data application demonstrated how the proposed method can be interpreted.
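The S-learner idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' RuleFit-based estimator: the single rule `rule(x)`, the simulated coefficients, and the function names are all hypothetical. One model is fitted jointly on the rule and the treatment indicator, and the HTE is read off as the difference between its predictions under treatment and control. Because the model is saturated in two binary variables, its least-squares fit reduces to the four cell means.

```python
import random
from statistics import mean


def rule(x):
    """Hypothetical rule r(x) = 1 if x > 0.5 else 0 (an assumption for illustration)."""
    return 1 if x > 0.5 else 0


def simulate(n=2000, seed=0):
    """Simulated trial: main effect 2*r(x), treatment effect 0.5 + 3*r(x)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        x = rng.random()
        w = i % 2  # alternate arms; x is random, so w is independent of x
        r = rule(x)
        y = 1.0 + 2.0 * r + w * (0.5 + 3.0 * r) + rng.gauss(0.0, 0.1)
        rows.append((x, w, y))
    return rows


def fit_s_learner(rows):
    """Fit ONE model on (rule, treatment) jointly.

    The model y ~ b0 + b1*r + w*(t0 + t1*r) is saturated in binary r and w,
    so its least-squares fit equals the mean outcome in each (r, w) cell.
    """
    cells = {}
    for x, w, y in rows:
        cells.setdefault((rule(x), w), []).append(y)
    return {key: mean(values) for key, values in cells.items()}


def hte(model, x):
    """Predicted outcome under treatment minus predicted outcome under control."""
    r = rule(x)
    return model[(r, 1)] - model[(r, 0)]


model = fit_s_learner(simulate())
print(round(hte(model, 0.9), 2), round(hte(model, 0.1), 2))
```

Keeping the main-effect term `b1*r` in the model is what lets the treatment-by-rule difference be read as a treatment effect rather than a mixture of prognostic and treatment effects.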
2. Identifying regime switches through Bayesian wavelet estimation: evidence from flood detection in the Taquari River Valley
Authors: Flávia Castro Motta, Michel Helcias Montoril
Abstract: Two-component mixture models have proved to be a powerful tool for modeling heterogeneity in several cluster analysis contexts. However, most methods based on these models assume a constant behavior for the mixture weights, which can be restrictive and unsuitable for some applications. In this paper, we relax this assumption and allow the mixture weights to vary according to the index (e.g., time) to make the model more adaptive to a broader range of data sets. We propose an efficient MCMC algorithm to jointly estimate both component parameters and dynamic weights from their posterior samples. We evaluate the method's performance by running Monte Carlo simulation studies under different scenarios for the dynamic weights. In addition, we apply the algorithm to a time series that records the level reached by a river in southern Brazil. The Taquari River is a water body whose frequent floods have caused various kinds of damage to riverside communities. Implementing a dynamic mixture model allows us to properly describe the flood regimes for the areas most affected by these phenomena.
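As a rough illustration of a two-component mixture with index-dependent weights, the sketch below evaluates the density (1 - alpha_t) f_low(y) + alpha_t f_high(y) and the resulting probability that an observation belongs to the high regime. The logistic form of `dynamic_weight`, the Gaussian components, and the parameter values are assumptions made for illustration only; the paper estimates the weights with Bayesian wavelet methods and MCMC, which this sketch does not attempt.

```python
import math


def normal_pdf(y, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    z = (y - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def dynamic_weight(t, t0=50.0, scale=5.0):
    """Hypothetical weight alpha_t in (0, 1): a logistic curve in the index t."""
    return 1.0 / (1.0 + math.exp(-(t - t0) / scale))


def mixture_density(y, alpha, low, high):
    """Two-component mixture density with weight alpha on the high regime."""
    return (1.0 - alpha) * normal_pdf(y, *low) + alpha * normal_pdf(y, *high)


def high_regime_probability(y, alpha, low, high):
    """Probability that y was generated by the high-regime component."""
    return alpha * normal_pdf(y, *high) / mixture_density(y, alpha, low, high)


LOW = (2.0, 1.0)   # (mean, sd) of the low regime; illustrative values
HIGH = (6.0, 1.0)  # (mean, sd) of the high regime; illustrative values

# The same observed level is classified differently as alpha_t changes over time.
for t in (20, 50, 80):
    alpha_t = dynamic_weight(t)
    print(t, round(high_regime_probability(4.0, alpha_t, LOW, HIGH), 4))
```

With a constant weight, the classification of a given level would be the same at every time point; letting the weight move with the index is what allows the regime assignment to change over time.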
3. Insufficient Gibbs Sampling
Authors: Antoine Luciano, Christian P. Robert, Robin J. Ryder
Abstract: In some applied scenarios, the availability of complete data is restricted, often due to privacy concerns, and only aggregated, robust and inefficient statistics derived from the data are accessible. These robust statistics are not sufficient, but they demonstrate reduced sensitivity to outliers and offer enhanced data protection due to their higher breakdown point. In this article, operating within a parametric framework, we propose a method to sample from the posterior distribution of parameters conditioned on different robust and inefficient statistics: specifically, the pairs (median, MAD) or (median, IQR), or one or more quantiles. Leveraging a Gibbs sampler and the simulation of latent augmented data, our approach facilitates simulation according to the posterior distribution of parameters belonging to specific families of distributions. We demonstrate its applicability on the Gaussian, Cauchy, and translated Weibull families.
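The robust summaries named in the abstract above can be computed in a few lines. The sketch below covers only the median and the MAD and shows their insensitivity to a single outlier; it does not implement the authors' Gibbs sampler. The data values are invented for illustration, and the MAD is the raw median absolute deviation, without any consistency constant.

```python
from statistics import mean, median


def mad(data):
    """Raw median absolute deviation: median of |x - median(data)|."""
    centre = median(data)
    return median([abs(x - centre) for x in data])


clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = [1.0, 2.0, 3.0, 4.0, 500.0]  # one gross outlier

for name, data in (("clean", clean), ("contaminated", contaminated)):
    print(name, "mean:", mean(data), "median:", median(data), "MAD:", mad(data))
```

The pair (median, MAD) is unchanged by the outlier while the mean is not. The same property means the pair discards information about the sample, which is why conditioning on it requires a dedicated sampling scheme.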
4. Graphical lasso for extremes
Authors: Phyllis Wan, Chen Zhou
Abstract: In this paper we estimate the sparse dependence structure in the tail region of a multivariate random vector, potentially of high dimension. The tail dependence is modeled via a graphical model for extremes embedded in the Huesler-Reiss distribution (Engelke and Hitz, 2020). We propose the extreme graphical lasso procedure to estimate the sparsity in the tail dependence, similar to the Gaussian graphical lasso method in high dimensional statistics. We prove its consistency in identifying the graph structure and estimating model parameters. The efficiency and accuracy of the proposed method are illustrated in simulated and real examples.
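For orientation, the Gaussian graphical lasso that the abstract uses as its point of comparison is commonly written as the penalized likelihood problem below. Here S denotes the sample covariance matrix, Θ the precision matrix, and λ ≥ 0 the tuning parameter; this notation is assumed rather than taken from the paper, and some formulations also penalize the diagonal entries. The abstract does not state the objective of the extreme graphical lasso, so it is not reproduced here.

```latex
\hat{\Theta}
  = \arg\min_{\Theta \succ 0}
    \left\{ \operatorname{tr}(S\Theta) - \log\det\Theta
            + \lambda \sum_{i \neq j} \lvert \Theta_{ij} \rvert \right\}
```

Zero off-diagonal entries of the estimated precision matrix correspond to missing edges in the Gaussian graph, which is the sense in which the penalty estimates sparsity.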