AbiOmics: An End-to-End Pipeline to Train Machine Learning Models for Discrimination of Plant Abiotic Stresses Using Transcriptomic Profiling Data
Park, M.; Oh, Y.; Choi, W.; Jo, Y. D.
Abstract
Abiotic stresses are primary constraints on global crop productivity, reducing yields by up to 80%. Traditional phenotypic sensing detects stress only after physiological symptoms emerge and often fails to discriminate between specific stressor types. Transcriptomic profiling offers a high-dimensional alternative that captures rapid and sensitive molecular shifts. In this study, we developed AbiOmics, the first end-to-end machine learning pipeline specifically designed to identify and discriminate among multiple stressors. This approach represents a previously undocumented method for stress specification using large-scale transcriptomic data. Using a curated collection of 1,243 transcriptomes from Arabidopsis samples treated with four major abiotic stresses (salt, cold, heat, and drought), we identified 320 stress-specific marker genes. A single-layer perceptron model trained on these features achieved 91% accuracy in five-fold cross-validation and 93% accuracy on an independent test set. The model also demonstrated an unprecedented capacity to generalize to multi-stress conditions, identifying concurrent signatures in combinatorial salt-and-heat treatments. By integrating marker identification with SHAP-based biological interpretation, AbiOmics provides a rigorously validated diagnostic tool superior to conventional sensing. This framework establishes a high-confidence labeling strategy for AI-driven crop management and precision breeding to mitigate climate change impacts.
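The classification step described in the abstract (a single-layer perceptron over marker-gene expression, evaluated by five-fold cross-validation) can be illustrated with a minimal sketch. This is not the AbiOmics implementation. The abstract does not state the output activation, optimizer, or hyperparameters, so the softmax output, plain stochastic gradient descent, learning rate, epoch count, and unstratified folds below are all assumptions. The data are synthetic (12 toy "genes" rather than the 320 markers), and every function name is hypothetical.

```python
import math
import random

STRESSES = ["salt", "cold", "heat", "drought"]

def make_synthetic(n_per_class=40, n_genes=12, seed=0):
    """Toy expression matrix (assumption): each stress up-shifts its own block of genes."""
    rng = random.Random(seed)
    block = n_genes // len(STRESSES)
    X, y = [], []
    for label in range(len(STRESSES)):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for g in range(label * block, (label + 1) * block):
                row[g] += 3.0  # synthetic stress-specific marker signal
            X.append(row)
            y.append(label)
    return X, y

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, b, x):
    """Single layer: one linear unit per class, followed by softmax."""
    scores = [sum(w * xi for w, xi in zip(W[k], x)) + b[k] for k in range(len(W))]
    return softmax(scores)

def train(X, y, n_classes, epochs=30, lr=0.05, seed=0):
    """Fit weights with per-sample SGD on the cross-entropy loss (assumed optimizer)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            p = predict_proba(W, b, X[i])
            for k in range(n_classes):
                # Gradient of cross-entropy w.r.t. the class-k score.
                grad = p[k] - (1.0 if k == y[i] else 0.0)
                b[k] -= lr * grad
                for j in range(n_features):
                    W[k][j] -= lr * grad * X[i][j]
    return W, b

def accuracy(W, b, X, y):
    """Fraction of samples whose most probable class matches the label."""
    hits = 0
    for x, label in zip(X, y):
        p = predict_proba(W, b, x)
        hits += int(p.index(max(p)) == label)
    return hits / len(X)

def cross_validate(X, y, n_classes, k=5, seed=0):
    """Plain (unstratified) k-fold CV: train on k-1 folds, score on the held-out fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in idx if i not in held]
        W, b = train([X[i] for i in train_idx], [y[i] for i in train_idx], n_classes)
        scores.append(accuracy(W, b, [X[i] for i in held_out], [y[i] for i in held_out]))
    return scores

X, y = make_synthetic()
fold_scores = cross_validate(X, y, n_classes=len(STRESSES))
mean_accuracy = sum(fold_scores) / len(fold_scores)
print(f"mean 5-fold accuracy on toy data: {mean_accuracy:.2f}")
```

Because the toy classes are deliberately well separated, the accuracy printed here says nothing about the 91% and 93% figures reported above. The sketch only shows how a one-layer model and a five-fold split fit together.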