Methodology (stat.ME)
Mon, 14 Aug 2023
1. Righteous Way to Find Heterogeneous Effects of Multiple Treatments for Any Outcome Variable
Authors: Myoungjae Lee
Abstract: With a categorical treatment D=0,1,...,J, the ubiquitous practice is to create dummy variables D(1),...,D(J) and apply OLS of an outcome Y on D(1),...,D(J) and covariates X. With m(d,X) being the X-heterogeneous effect of D(d) given X, this paper shows that, for "saturated models", the OLS D(d) slope is consistent for a sum of weighted averages of m(1,X),...,m(J,X), where the sum of the weights for m(d,X) is one whereas the sum of the weights for the other X-heterogeneous effects is zero. Hence, if all m(1,X),...,m(J,X) are constant with m(d,X)=b(d), then the OLS D(d) slope is consistent for b(d); otherwise, OLS is inconsistent in saturated models, as heterogeneous effects of other categories "interfere". For unsaturated models, in general, OLS is inconsistent even for binary D. Instead, one can run OLS of Y on D(d)-E{D(d)|X, D=0,d} using only the subsample D=0,d, finding the effect of D(d) separately for each d=1,...,J. This subsample OLS is consistent for the "overlap-weight" average of m(d,X). Although we parametrize E{D(d)|X, D=0,d} for practicality, using Y-E(Y|X, D=0,d) or its variation instead of Y makes the OLS robust to misspecifications in E{D(d)|X, D=0,d}.
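The subsample OLS described in this abstract can be illustrated with a small simulation. The sketch below is not the paper's code: the data-generating process, the logistic model for E{D(d)|X, D=0,d}, the no-intercept regression, and the helper names `fit_logit` and `subsample_ols` are all assumptions made for illustration. It compares the subsample slope with the overlap-weight average of m(d,X), taking the weights to be pi(X)(1-pi(X)) with pi(X)=E{D(d)|X, D=0,d}.

```python
import numpy as np

def fit_logit(Z, t, n_iter=25):
    """Logistic regression by Newton-Raphson; Z must include an intercept column."""
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ beta)))
        grad = Z.T @ (t - p)                          # score vector
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z   # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def subsample_ols(y, D, x, d):
    """No-intercept OLS slope of Y on D(d) - pi_hat(X), using only rows with D in {0, d}."""
    keep = (D == 0) | (D == d)
    t = (D[keep] == d).astype(float)                  # dummy D(d) in the subsample
    Z = np.column_stack([np.ones(keep.sum()), x[keep]])
    pi_hat = 1.0 / (1.0 + np.exp(-(Z @ fit_logit(Z, t))))
    r = t - pi_hat                                    # residualized treatment dummy
    return (r @ y[keep]) / (r @ r)

# Hypothetical data-generating process: three categories, one covariate.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
expo = np.exp(np.outer(x, np.array([0.0, 0.5, -0.5])))  # multinomial logit in x
probs = expo / expo.sum(axis=1, keepdims=True)
cum = np.cumsum(probs, axis=1)
D = (rng.random(n)[:, None] > cum[:, :2]).sum(axis=1)    # draws a category in {0, 1, 2}

m = {1: 1.0 + x, 2: 2.0 - 0.5 * x}                       # assumed heterogeneous effects m(d, X)
y = x + (D == 1) * m[1] + (D == 2) * m[2] + rng.normal(size=n)

estimate, target = {}, {}
for d in (1, 2):
    keep = (D == 0) | (D == d)
    pi = probs[keep, d] / (probs[keep, 0] + probs[keep, d])  # true E{D(d) | X, D in {0, d}}
    w = pi * (1.0 - pi)                                      # overlap weights
    target[d] = (w * m[d][keep]).sum() / w.sum()
    estimate[d] = subsample_ols(y, D, x, d)
    print(f"d={d}: subsample OLS {estimate[d]:.3f}, overlap-weight average {target[d]:.3f}")
```

In this setup the logistic model is correctly specified, so the two printed numbers should be close for each d. The robustification via Y-E(Y|X, D=0,d) mentioned in the abstract is not implemented here.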
2. A novel two-sample test within the space of symmetric positive definite matrix distributions and its application in finance
Authors: Žikica Lukić, Bojana Milošević
Abstract: This paper introduces a novel two-sample test for a broad class of orthogonally equivalent positive definite symmetric matrix distributions. Our test is the first of its kind and we derive its asymptotic distribution. To estimate the test power, we use a warp-speed bootstrap method and consider the most common matrix distributions. We provide several real data examples, including data for the main cryptocurrencies and stock data of major US companies. The real data examples demonstrate the applicability of our test in a context closely related to algorithmic trading. Our findings address the need in the literature for such a test, given the popularity of matrix distributions in many applications.
3. Maintaining the validity of inference from linear mixed models in stepped-wedge cluster randomized trials under misspecified random-effects structures
Authors: Yongdong Ouyang, Monica Taljaard, Andrew B Forbes, Fan Li
Abstract: Linear mixed models are commonly used in analyzing stepped-wedge cluster randomized trials (SW-CRTs). A key consideration for analyzing a SW-CRT is accounting for the potentially complex correlation structure, which can be achieved by specifying a random effects structure. Common random effects structures for a SW-CRT include random intercept, random cluster-by-period, and discrete-time decay. Recently, more complex structures, such as the random intervention structure, have been proposed. In practice, specifying appropriate random effects can be challenging. Robust variance estimators (RVEs) may be applied to linear mixed models to provide consistent estimators of standard errors of fixed effect parameters in the presence of random-effects misspecification. However, there has been no empirical investigation of RVEs for SW-CRTs. In this paper, we first review five RVEs (both standard and small-sample bias-corrected RVEs) that are available for linear mixed models. We then describe a comprehensive simulation study to examine the performance of these RVEs for SW-CRTs with a continuous outcome under different data generators. For each data generator, we investigate whether the use of an RVE with either the random intercept model or the random cluster-by-period model is sufficient to provide valid statistical inference for fixed effect parameters, when these working models are subject to misspecification. Our results indicate that the random intercept and random cluster-by-period models with RVEs performed similarly. The CR3 RVE, coupled with the number of clusters minus two degrees of freedom correction, consistently gave the best coverage results, but could be slightly anti-conservative when the number of clusters was below 16. We summarize the implications of our results for linear mixed model analysis of SW-CRTs in practice.
4. Path-specific causal decomposition analysis with multiple correlated mediator variables
Authors: Melissa J. Smith, Leslie A. McClure, D. Leann Long
Abstract: A causal decomposition analysis allows researchers to determine whether the difference in a health outcome between two groups can be attributed to a difference in each group's distribution of one or more modifiable mediator variables. With this knowledge, researchers and policymakers can focus on designing interventions that target these mediator variables. Existing methods for causal decomposition analysis either focus on one mediator variable or assume that each mediator variable is conditionally independent given the group label and the mediator-outcome confounders. In this paper, we propose a flexible causal decomposition analysis method that can accommodate multiple correlated and interacting mediator variables, which are frequently seen in studies of health behaviors and studies of environmental pollutants. We extend a Monte Carlo-based causal decomposition analysis method to this setting by using a multivariate mediator model that can accommodate any combination of binary and continuous mediator variables. Furthermore, we state the causal assumptions needed to identify both joint and path-specific decomposition effects through each mediator variable. To illustrate the reduction in bias and confidence interval width of the decomposition effects under our proposed method, we perform a simulation study. We also apply our approach to examine whether differences in smoking status and dietary inflammation score explain any of the Black-White differences in incident diabetes using data from a national cohort study.