The EFFECT benchmark suite: measuring cancer sensitivity prediction performance - without the bias

This paper is a preprint and has not been certified by peer review.


Authors

Szalai, B.; Gaspar, I.; Kaszas, V.; Mero, L.; Sztilkovics, M.; Szalay, K. Z.

Abstract

Creating computational biology models that are applicable in industry is much more difficult than it appears: there is a major gap between a model that looks good on paper and one that performs well in the drug discovery process. To help close this gap, we introduce the Evaluation Framework For predicting Efficiency of Cancer Treatment (EFFECT) benchmark suite, based on the DepMap and GDSC data sets, to facilitate the creation of well-applicable machine learning models capable of predicting gene essentiality and/or drug sensitivity in in vitro cancer cell lines. We show that standard evaluation metrics such as Pearson correlation are easily misled by inherent biases in the data. To assess model performance properly, we therefore propose cell line/perturbation-exclusive data splits, perturbation-wise evaluation, and the application of our Bias Detector framework, which identifies model predictions not explicable by data bias alone. Testing the EFFECT suite on a few popular ML models shows that while library-standard non-linear models achieve measurable performance on splits representing precision medicine and target identification tasks, the bias-corrected correlations are rather low, indicating that even simple knock-out/drug sensitivity prediction remains an unsolved task. For this reason, we intend the proposed framework to serve as a unified test and evaluation pipeline for machine learning models predicting cancer sensitivity data, facilitating unbiased benchmarking and supporting teams in improving on the state of the art.
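To make the two evaluation ideas in the abstract concrete, below is a minimal sketch (not the authors' actual EFFECT pipeline; the function names, column names, and data layout are assumptions for illustration) of a cell line/perturbation-exclusive split and a perturbation-wise Pearson evaluation over a table of (cell line, perturbation, sensitivity) records:

```python
import numpy as np
import pandas as pd


def exclusive_split(pairs, group_col, test_frac=0.2, seed=0):
    """Split (cell line, perturbation) records so that every value of
    `group_col` (e.g. each cell line ID) appears in exactly one fold.

    A naive random split over pairs leaks information: the same cell line
    (or the same perturbation) shows up in both train and test, which can
    inflate global Pearson correlation without real generalization.
    """
    rng = np.random.default_rng(seed)
    groups = pairs[group_col].unique()
    rng.shuffle(groups)
    n_test = max(1, int(round(test_frac * len(groups))))
    test_groups = set(groups[:n_test])
    test_mask = pairs[group_col].isin(test_groups)
    return pairs[~test_mask], pairs[test_mask]


def perturbation_wise_pearson(df, pred_col="pred", true_col="sensitivity"):
    """Mean of per-perturbation Pearson correlations.

    Evaluating within each perturbation prevents a model from scoring well
    merely by ranking perturbations by their average effect size, one of
    the data biases the abstract warns about.
    """
    corrs = df.groupby("perturbation").apply(
        lambda g: g[pred_col].corr(g[true_col])
    )
    return corrs.mean()
```

Splitting on `cell_line` mimics a precision-medicine task (predict sensitivity for unseen cell lines); splitting on `perturbation` instead mimics a target-identification task (predict the effect of unseen knock-outs or drugs).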
