Cross-modal Generation of Hit-like Molecules via Foundation Model Encoding of Gene Expression Signatures
Cheng, J.; Pan, X.; Yang, K.; Cao, S.; Liu, B.; Yuan, Y.
Abstract
Designing hit-like molecules from gene expression signatures accounts for multiple targets and complex biological effects, which enables the discovery of multi-target drugs for complex diseases. Traditional methods that rely on similarity searching against a database are limited by the size and quality of that database. Multimodal deep learning offers a way to overcome this bottleneck, and the recent development of foundation models provides a new perspective for interpreting gene expression levels. We therefore propose GexMolGen (Gene Expression-based Molecule Generator), a method built on the foundation model scGPT that generates hit-like molecules from gene expression differences. Taking the desired and control gene expression profiles as inputs, GexMolGen designs molecules that can induce the required transcriptome profile. The molecules it generates exhibit high similarity to known gene inhibitors. Overall, GexMolGen explores the chemical and biological relationships in the drug discovery process. The source code and demo are available at https://github.com/Bunnybeibei/GexMolGen.
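To make the input concrete, the following is a minimal sketch of a gene expression difference (desired profile minus control profile) and of the database similarity search that the abstract names as the traditional baseline. It is not GexMolGen's implementation and does not use scGPT. The function names, the choice of cosine similarity, and the toy genes, compounds, and values are all assumptions made for illustration only.

```python
import math


def expression_signature(desired, control):
    """Per-gene difference between the desired and control profiles.

    Hypothetical helper: profiles are dicts mapping gene name -> expression level.
    """
    if desired.keys() != control.keys():
        raise ValueError("profiles must cover the same genes")
    return {gene: desired[gene] - control[gene] for gene in desired}


def cosine_similarity(a, b):
    """Cosine similarity between two signatures over their shared genes."""
    genes = sorted(a.keys() & b.keys())
    dot = sum(a[g] * b[g] for g in genes)
    norm_a = math.sqrt(sum(a[g] ** 2 for g in genes))
    norm_b = math.sqrt(sum(b[g] ** 2 for g in genes))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero signature has no direction to compare
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, reference_signatures):
    """Baseline database search: rank reference compounds by similarity to the query."""
    scored = [
        (name, cosine_similarity(query, signature))
        for name, signature in reference_signatures.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy values, invented for illustration; not taken from any dataset.
control = {"gene_a": 1.0, "gene_b": 2.0, "gene_c": 0.5}
desired = {"gene_a": 0.5, "gene_b": 3.0, "gene_c": 0.5}
signature = expression_signature(desired, control)

# Hypothetical reference database of compound-induced signatures.
reference = {
    "compound_x": {"gene_a": -1.0, "gene_b": 2.0, "gene_c": 0.0},
    "compound_y": {"gene_a": 1.0, "gene_b": -2.0, "gene_c": 0.0},
}
ranking = rank_by_similarity(signature, reference)
```

A search of this kind can only return compounds that are already in the reference database, which is the limitation that motivates a generative approach.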