
Machine Learning (stat.ML)
Thu, 22 Jun 2023
1.Generalized Low-Rank Update: Model Parameter Bounds for Low-Rank Training Data Modifications
Authors:Hiroyuki Hanada, Noriaki Hashimoto, Kouichi Taji, Ichiro Takeuchi
Abstract: In this study, we have developed an incremental machine learning (ML) method that efficiently obtains the optimal model when a small number of instances or features are added or removed. This problem holds practical importance in model selection, such as cross-validation (CV) and feature selection. Among the class of ML methods known as linear estimators, there exists an efficient model update framework called the low-rank update that can effectively handle changes in a small number of rows and columns within the data matrix. However, for ML methods beyond linear estimators, there is currently no comprehensive framework available to obtain knowledge about the updated solution within a specific computational complexity. In light of this, our study introduces a method called the Generalized Low-Rank Update (GLRU) which extends the low-rank update framework of linear estimators to ML methods formulated as a certain class of regularized empirical risk minimization, including commonly used methods such as SVM and logistic regression. The proposed GLRU method not only expands the range of its applicability but also provides information about the updated solutions with a computational complexity proportional to the amount of dataset changes. To demonstrate the effectiveness of the GLRU method, we conduct experiments showcasing its efficiency in performing cross-validation and feature selection compared to other baseline methods.
2.Robust Statistical Comparison of Random Variables with Locally Varying Scale of Measurement
Authors:Christoph Jansen, Georg Schollmeyer, Hannah Blocher, Julian Rodemann, Thomas Augustin
Abstract: Spaces with locally varying scale of measurement, like multidimensional structures with differently scaled dimensions, are pretty common in statistics and machine learning. Nevertheless, it is still understood as an open question how to exploit the entire information encoded in them properly. We address this problem by considering an order based on (sets of) expectations of random variables mapping into such non-standard spaces. This order contains stochastic dominance and expectation order as extreme cases when no, or respectively perfect, cardinal structure is given. We derive a (regularized) statistical test for our proposed generalized stochastic dominance (GSD) order, operationalize it by linear optimization, and robustify it by imprecise probability models. Our findings are illustrated with data from multidimensional poverty measurement, finance, and medicine.
3.Mitigating Discrimination in Insurance with Wasserstein Barycenters
Authors:Arthur Charpentier, François Hu, Philipp Ratz
Abstract: The insurance industry is heavily reliant on predictions of risks based on characteristics of potential customers. Although the use of said models is common, researchers have long pointed out that such practices perpetuate discrimination based on sensitive features such as gender or race. Given that such discrimination can often be attributed to historical data biases, an elimination or at least mitigation is desirable. With the shift from more traditional models to machine-learning based predictions, calls for greater mitigation have grown anew, as simply excluding sensitive variables in the pricing process can be shown to be ineffective. In this article, we first investigate why predictions are a necessity within the industry and why correcting biases is not as straightforward as simply identifying a sensitive variable. We then propose to ease the biases through the use of Wasserstein barycenters instead of simple scaling. To demonstrate the effects and effectiveness of the approach we employ it on real data and discuss its implications.
4.Inferring the finest pattern of mutual independence from data
Authors:G. Marrelec, A. Giron
Abstract: For a random variable $X$, we are interested in the blind extraction of its finest mutual independence pattern $\mu ( X )$. We introduce a specific kind of independence that we call dichotomic. If $\Delta ( X )$ stands for the set of all patterns of dichotomic independence that hold for $X$, we show that $\mu ( X )$ can be obtained as the intersection of all elements of $\Delta ( X )$. We then propose a method to estimate $\Delta ( X )$ when the data are independent and identically (i.i.d.) realizations of a multivariate normal distribution. If $\hat{\Delta} ( X )$ is the estimated set of valid patterns of dichotomic independence, we estimate $\mu ( X )$ as the intersection of all patterns of $\hat{\Delta} ( X )$. The method is tested on simulated data, showing its advantages and limits. We also consider an application to a toy example as well as to experimental data.