Methodology (stat.ME)
Fri, 01 Sep 2023
1. Optimal Scaling transformations to model non-linear relations in GLMs with ordered and unordered predictors
Authors: S. J. W. Willems, A. J. van der Kooij, J. J. Meulman
Abstract: In Generalized Linear Models (GLMs) it is assumed that the predictor variables have a linear effect on the outcome. However, this assumption is often too strict, because in many applications predictors have a nonlinear relation with the outcome. Optimal Scaling (OS) transformations combined with GLMs can deal with this type of relation. Transformations of the predictors have been integrated in GLMs before, e.g., in Generalized Additive Models. However, the OS methodology has several benefits. For example, the levels of categorical predictors are quantified directly, so that they can be included in the model without defining dummy variables. This approach enhances the interpretation and visualization of the effect of different levels on the outcome. Furthermore, monotonicity restrictions can be applied to the OS transformations so that the original ordering of the category values is preserved. This improves the interpretation of the effect and may prevent overfitting. The scaling level can be chosen for each predictor individually, so that models can include mixed scaling levels. In this way, a suitable transformation can be found for each predictor in the model. The implementation of OS in logistic regression is demonstrated using three datasets that contain a binary outcome variable and a set of categorical and/or continuous predictor variables.
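The monotonicity restriction mentioned in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it shows only one ingredient, projecting unrestricted category quantifications onto a nondecreasing sequence with weighted pool-adjacent-violators (isotonic) regression. The function names `pava` and `monotone_quantification`, and the use of category means of a numeric working response as the unrestricted quantifications, are assumptions made for illustration. The alternation with the GLM fit and the standardization of the transformed predictor are omitted.

```python
from collections import defaultdict


def pava(values, weights):
    """Weighted pool-adjacent-violators: nondecreasing least-squares fit."""
    blocks = []  # each block holds [weighted mean, total weight, length]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge the last two blocks while they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fitted = []
    for mean, _, length in blocks:
        fitted.extend([mean] * length)
    return fitted


def monotone_quantification(codes, target):
    """Map ordered category codes to nondecreasing quantifications.

    codes:  ordered category code per observation (hypothetical input)
    target: numeric working response per observation (hypothetical input)
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, t in zip(codes, target):
        sums[c] += t
        counts[c] += 1
    cats = sorted(sums)
    means = [sums[c] / counts[c] for c in cats]    # unrestricted quantifications
    weights = [counts[c] for c in cats]            # category frequencies
    return dict(zip(cats, pava(means, weights)))   # order-preserving version


codes = [1, 1, 2, 2, 3, 3]
target = [0.0, 0.0, 2.0, 2.0, 1.0, 1.0]
quantifications = monotone_quantification(codes, target)
print(quantifications)  # {1: 0.0, 2: 1.5, 3: 1.5}
```

In this toy example the unrestricted category means are 0, 2 and 1, which break the ordering between categories 2 and 3. The restricted fit ties those two categories at their weighted mean of 1.5, so the original category order is preserved.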
2. Unidimensionality in Rasch Models: Efficient Item Selection and Hierarchical Clustering Methods Based on Marginal Estimates
Authors: Gerhard Tutz
Abstract: A strong tool is proposed for selecting, from a given set of items, those that share a common trait. The selection method is based on marginal estimates and exploits the fact that the estimates of the standard deviation of the mixing distribution are rather stable if the items come from a Rasch model with a common trait. If, however, the item set is enlarged by adding items that do not share the latent trait, the estimated standard deviations become distinctly smaller. A method is proposed that successively enlarges the set of items considered to be Rasch items by examining the estimated standard deviations of the mixing distribution. The selection procedure is shown to be very reliable on average, and a criterion is proposed that makes it possible to identify, for concrete item sets, items that should not be considered Rasch items. An extension of the method makes it possible to investigate which groups of items might share a common trait. The corresponding hierarchical clustering procedure is considered an exploratory tool but works well on average.