
Machine Learning (stat.ML)
Wed, 05 Jul 2023
1. Using Random Effects Machine Learning Algorithms to Identify Vulnerability to Depression
Authors: Runa Bhaumik, Jonathan Stange
Abstract: Background: Reliable prediction of clinical progression over time can improve outcomes in depression. Little work has integrated the various risk factors for depression to determine which combinations of factors are most useful for identifying the individuals at greatest risk. Method: This study demonstrates that data-driven machine learning (ML) methods such as RE-EM (Random Effects/Expectation Maximization) trees and MERF (Mixed Effects Random Forest) can reliably identify the variables with the greatest utility for classifying subgroups at greatest risk for depression. A sample of 185 young adults completed measures of depression risk, including rumination, worry, negative cognitive styles, cognitive and coping flexibility, and negative life events, along with symptoms of depression. We trained RE-EM tree and MERF algorithms and compared them with traditional linear mixed models (LMMs) in predicting depressive symptoms prospectively and concurrently under cross-validation. Results: The RE-EM tree and MERF methods modeled complex interactions, identified subgroups of individuals, and predicted depression severity comparably to LMMs. Further, the machine learning models determined that brooding, negative life events, negative cognitive styles, and perceived control were the most relevant predictors of future depression levels. Conclusions: Random effects machine learning models have high potential clinical utility and can be leveraged in interventions to reduce vulnerability to depression.
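To make the method concrete, here is a minimal sketch of the alternating scheme behind MERF, assuming random intercepts only and using scikit-learn's RandomForestRegressor for the fixed-effects part; the full EM algorithm additionally estimates variance components and shrinks the per-subject effects, which this toy loop omits, and all names are illustrative rather than taken from the paper.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def fit_merf_intercepts(X, y, groups, n_iters=10):
        """Toy MERF-style fit: alternate between a forest on the fixed part
        and per-subject random intercepts estimated from the residuals."""
        ids = np.unique(groups)
        b = {g: 0.0 for g in ids}                  # random intercept per subject
        forest = RandomForestRegressor(n_estimators=200, random_state=0)
        for _ in range(n_iters):
            # remove current random effects, refit the fixed (forest) part
            y_fixed = y - np.array([b[g] for g in groups])
            forest.fit(X, y_fixed)
            # update each subject's intercept from that subject's residuals
            resid = y - forest.predict(X)
            for g in ids:
                b[g] = resid[groups == g].mean()
        return forest, b

RE-EM trees follow the same alternation with a single regression tree in place of the forest.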
2. Monte Carlo Sampling without Isoperimetry: A Reverse Diffusion Approach
Authors: Xunpeng Huang, Hanze Dong, Yifan Hao, Yian Ma, Tong Zhang
Abstract: The efficacy of modern generative models commonly hinges on the accuracy of score estimation along the diffusion path, with a focus on diffusion models and their ability to generate high-quality data samples. This study explores the potential of posterior sampling through reverse diffusion. An examination of the sampling literature reveals that score estimation can be transformed into a mean estimation problem via decomposition of the transition kernel. By estimating the mean of an auxiliary distribution, the reverse diffusion process gives rise to a novel posterior sampling algorithm that diverges from traditional gradient-based Markov chain Monte Carlo (MCMC) methods. We provide a convergence analysis in total variation distance and demonstrate that the isoperimetric dependency of the proposed algorithm is lower than that of conventional MCMC techniques, which justifies its superior performance for high-dimensional sampling within a given error tolerance. Our analytical framework offers fresh perspectives on the complexity of score estimation at various time points, as characterized by the properties of the auxiliary distribution.
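As a rough illustration of the pipeline (transition-kernel decomposition, mean estimation for an auxiliary distribution, then a reverse-diffusion step), the following sketch assumes an Ornstein-Uhlenbeck forward process and uses crude self-normalized importance sampling as a stand-in for the paper's mean estimator; the proposal N(0, 9I), the step sizes, and all names are this sketch's assumptions, not the authors' algorithm.

    import numpy as np

    def reverse_diffusion_sampler(log_target, dim, n_steps=100, n_inner=2000, T=5.0, seed=0):
        """Toy posterior-sampling-by-reverse-diffusion loop for the OU process
        dX = -X dt + sqrt(2) dW, whose marginal is N(a*x0, s2*I) at time t."""
        rng = np.random.default_rng(seed)
        dt = T / n_steps
        x = rng.standard_normal(dim)   # start near the OU stationary Gaussian
        for k in range(n_steps, 0, -1):
            t = k * dt
            a, s2 = np.exp(-t), 1.0 - np.exp(-2.0 * t)
            # auxiliary posterior q(x0 | x_t) ∝ target(x0) * N(x_t; a*x0, s2*I);
            # estimate its mean by self-normalized IS from the proposal N(0, 9I)
            x0 = 3.0 * rng.standard_normal((n_inner, dim))
            logw = np.array([log_target(z) for z in x0])
            logw += np.sum(x0 ** 2, axis=1) / 18.0                  # / proposal
            logw -= np.sum((x - a * x0) ** 2, axis=1) / (2.0 * s2)  # likelihood
            w = np.exp(logw - logw.max())
            w /= w.sum()
            m = w @ x0                                   # estimated posterior mean
            score = (a * m - x) / s2                     # Tweedie's formula
            # Euler step of the reverse-time SDE
            x = x + (x + 2.0 * score) * dt + np.sqrt(2.0 * dt) * rng.standard_normal(dim)
        return x

    # e.g. a bimodal target, where gradient-based MCMC tends to mix slowly:
    # log_target = lambda z: np.logaddexp(-np.sum((z - 3.0)**2) / 2, -np.sum((z + 3.0)**2) / 2)
    # print(reverse_diffusion_sampler(log_target, dim=2))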
3. DiffFlow: A Unified SDE Framework for Score-Based Diffusion Models and Generative Adversarial Networks
Authors: Jingwei Zhang, Han Shi, Jincheng Yu, Enze Xie, Zhenguo Li
Abstract: Generative models can be categorized into two types: explicit generative models, which define explicit density forms and allow exact likelihood inference, such as score-based diffusion models (SDMs) and normalizing flows; and implicit generative models, which directly learn a transformation from the prior to the data distribution, such as generative adversarial nets (GANs). While both types of models have shown great success, each suffers from limitations that hinder it from achieving fast sampling and high sample quality simultaneously. In this paper, we propose a unified theoretical framework for SDMs and GANs. We show that: i) the learning dynamics of both SDMs and GANs can be described by a novel SDE named the Discriminator Denoising Diffusion Flow (DiffFlow), whose drift is a weighted combination of scores of the real data and the generated data; ii) by adjusting the relative weights of the score terms, we obtain a smooth transition between SDMs and GANs while the marginal distribution of the SDE remains invariant to the change of weights; iii) we prove the asymptotic optimality and a maximum likelihood training scheme for the DiffFlow dynamics; iv) under our unified framework, we introduce several instantiations of DiffFlow that provide new algorithms beyond GANs and SDMs with exact likelihood inference and the potential to achieve a flexible trade-off between high sample quality and fast sampling speed.
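For intuition, a toy 1-D version of such an SDE can be simulated directly: the drift below mixes the (here exactly known) score of the real-data distribution with a score of the current generated distribution, and the diffusion coefficient sqrt(2*(lam_d - lam_g)) is one simple choice that keeps the data distribution stationary across the interpolation; the Gaussian fit standing in for the discriminator and all parameter names are this sketch's assumptions, not the paper's parametrization.

    import numpy as np

    rng = np.random.default_rng(0)

    # toy 1-D setup: real data ~ N(2, 1); particles start from N(-2, 1)
    score_data = lambda x: -(x - 2.0)        # exact score of N(2, 1)
    x = rng.normal(-2.0, 1.0, size=5000)

    lam_d, lam_g, dt = 1.0, 0.5, 0.01        # relative weights of the two scores
    for _ in range(2000):
        # score of the current generated distribution, crudely estimated by a
        # Gaussian fit; in the framework a discriminator would supply this term
        score_gen = -(x - x.mean()) / x.var()
        drift = lam_d * score_data(x) - lam_g * score_gen
        # lam_g = 0 recovers Langevin/score-based dynamics; lam_d = lam_g gives
        # a noiseless GAN-style flow along grad log(p_data / p_gen)
        x = x + drift * dt + np.sqrt(2.0 * (lam_d - lam_g) * dt) * rng.standard_normal(x.size)

    print(x.mean(), x.var())   # -> approximately 2 and 1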
4. Absorbing Phase Transitions in Artificial Deep Neural Networks
Authors: Keiichi Tamai, Tsuyoshi Okubo, Truong Vinh Truong Duy, Naotake Natori, Synge Todo
Abstract: Theoretical understanding of the behavior of infinitely wide neural networks has developed rapidly for various architectures, thanks to the celebrated mean-field theory. However, a clear, intuitive framework is lacking for extending this understanding to finite networks, which are of greater practical and realistic importance. In the present contribution, we demonstrate that the behavior of properly initialized neural networks can be understood in terms of universal critical phenomena in absorbing phase transitions. More specifically, we study the order-to-chaos transition in fully-connected feedforward and convolutional neural networks, and show that (i) there is a well-defined transition from the ordered state to the chaotic state even for finite networks, and (ii) differences in architecture are reflected in differences in the universality class of the transition. Remarkably, finite-size scaling can also be successfully applied, indicating that an intuitive phenomenological argument can lead to a semi-quantitative description of the signal propagation dynamics.
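A standard numerical probe of this order-to-chaos transition, sketched here under the usual mean-field initialization for bias-free tanh networks (a generic illustration, not the paper's experimental setup): propagate two nearly identical inputs through the same random network and check whether their distance dies out, the analogue of falling into an absorbing state, or grows.

    import numpy as np

    def perturbation_growth(sigma_w, width=512, depth=50, eps=1e-3, seed=0):
        """Distance between two nearby inputs after `depth` layers of a random
        tanh network: it decays in the ordered phase and grows in the chaotic one."""
        rng = np.random.default_rng(seed)
        x1 = rng.standard_normal(width)
        x2 = x1 + eps * rng.standard_normal(width)
        for _ in range(depth):
            W = sigma_w * rng.standard_normal((width, width)) / np.sqrt(width)
            x1, x2 = np.tanh(W @ x1), np.tanh(W @ x2)
        return np.linalg.norm(x1 - x2)

    # mean-field theory places the critical point at sigma_w = 1 for bias-free
    # tanh networks; finite width blurs but does not destroy the transition
    for s in (0.8, 1.0, 1.5):
        print(s, perturbation_growth(s))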
5. Conditional independence testing under model misspecification
Authors: Felipe Maia Polo, Yuekai Sun, Moulinath Banerjee
Abstract: Conditional independence (CI) testing is fundamental and challenging in modern statistics and machine learning. Many modern methods for CI testing rely on powerful supervised learning methods to learn regression functions or Bayes predictors as an intermediate step. Although these methods are guaranteed to control Type-I error when the supervised learning methods accurately estimate the regression functions or Bayes predictors, their behavior is less understood when that estimation fails due to model misspecification. In a broader sense, model misspecification can arise even when universal approximators (e.g., deep neural nets) are employed. We therefore study the performance of regression-based CI tests under model misspecification. Namely, we propose new approximations and upper bounds for the testing errors of three regression-based tests that depend on the misspecification errors. Moreover, we introduce the Rao-Blackwellized Predictor Test (RBPT), a novel regression-based CI test robust against model misspecification. Finally, we conduct experiments with artificial and real data, showcasing the usefulness of our theory and methods.
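The abstract does not spell out the construction of RBPT, so as background here is a minimal sketch of one standard regression-based CI test it is set against, the Generalized Covariance Measure of Shah and Peters (2020), with purely illustrative linear regressions; the misspecification issue the paper studies arises precisely when these intermediate regressions are estimated poorly.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from scipy.stats import norm

    def gcm_test(X, Y, Z):
        """GCM test of X independent of Y given Z: if the two regressions are
        consistent, the normalized mean of the residual products is
        asymptotically standard normal under the null hypothesis."""
        rx = X - LinearRegression().fit(Z, X).predict(Z)   # residual of X | Z
        ry = Y - LinearRegression().fit(Z, Y).predict(Z)   # residual of Y | Z
        prod = rx * ry
        stat = np.sqrt(len(prod)) * prod.mean() / prod.std()
        return 2.0 * (1.0 - norm.cdf(abs(stat)))           # two-sided p-value

    # under H0: X and Y depend only on Z
    rng = np.random.default_rng(0)
    Z = rng.standard_normal((500, 1))
    X = Z[:, 0] + rng.standard_normal(500)
    Y = Z[:, 0] + rng.standard_normal(500)
    print(gcm_test(X, Y, Z))   # large p-value expected

Replacing the linear regressions with a misspecified or flexible learner is exactly the regime whose Type-I error behavior the paper analyzes.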