Development and validation of a reliable DNA copy-number-based machine learning algorithm (CopyClust) for breast cancer integrative cluster classification

This paper is a preprint and has not been certified by peer review.

Authors

Young, C. C.; Eason, K.; Manzano Garcia, R.; Moulange, R.; Mukherjee, S.; Chin, S.-F. C.; Caldas, C.; Rueda, O. M.

Abstract

The Integrative Clusters (IntClusts) provide a framework for classifying breast cancer tumors into 10 distinct genomic subtypes based on DNA copy number and gene expression. Current classifiers achieve only low accuracy when gene expression data are unavailable, which warrants new approaches to IntClust classification from copy number alone. A novel XGBoost-driven classification algorithm, CopyClust, was trained on genomic features from METABRIC and validated on TCGA, achieving an improvement of nine percentage points or more in overall IntClust subtype classification accuracy.
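
To make the reported metric concrete, the following minimal Python sketch shows how overall classification accuracy and a percentage-point difference between two classifiers can be computed. It is an illustration only, not the authors' evaluation code: the function names (`overall_accuracy`, `percentage_point_gain`) and the toy label lists are hypothetical and were invented for this example, and the numbers they produce say nothing about CopyClust's actual performance.

```python
from typing import Sequence


def overall_accuracy(reference: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of samples whose predicted label equals the reference label."""
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")
    if len(reference) == 0:
        raise ValueError("at least one sample is required")
    correct = sum(r == p for r, p in zip(reference, predicted))
    return correct / len(reference)


def percentage_point_gain(acc_new: float, acc_old: float) -> float:
    """Absolute difference between two accuracies, expressed in percentage points."""
    return (acc_new - acc_old) * 100.0


# Toy data (hypothetical): one sample per IntClust subtype, labels 1..10.
reference = [f"IntClust{i}" for i in range(1, 11)]

# Hypothetical baseline classifier: 6 of 10 labels correct.
baseline = reference[:6] + ["IntClust1"] * 4

# Hypothetical candidate classifier: 8 of 10 labels correct.
candidate = reference[:8] + ["IntClust1"] * 2

acc_baseline = overall_accuracy(reference, baseline)
acc_candidate = overall_accuracy(reference, candidate)
gain = percentage_point_gain(acc_candidate, acc_baseline)

print(f"baseline accuracy:  {acc_baseline:.2f}")
print(f"candidate accuracy: {acc_candidate:.2f}")
print(f"gain: {gain:.1f} percentage points")
```

A percentage-point gain is an absolute difference between two accuracies, so it should not be read as a relative (percent) improvement.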
