Industry-Scale Orchestrated Federated Learning for Drug Discovery

This paper is a preprint and has not been certified by peer review.

Authors

Martijn Oldenhof, Gergely Ács, Balázs Pejó, Ansgar Schuffenhauer, Nicholas Holway, Noé Sturm, Arne Dieckmann, Oliver Fortmeier, Eric Boniface, Clément Mayer, Arnaud Gohier, Peter Schmidtke, Ritsuya Niwayama, Dieter Kopecky, Lewis Mervin, Prakash Chandra Rathi, Lukas Friedrich, András Formanek, Peter Antal, Jordon Rahaman, Adam Zalewski, Wouter Heyndrickx, Ezron Oluoch, Manuel Stößel, Michal Vančo, David Endico, Fabien Gelus, Thaïs de Boisfossé, Adrien Darbier, Ashley Nicollet, Matthieu Blottière, Maria Telenczuk, Van Tien Nguyen, Thibaud Martinez, Camille Boillet, Kelvin Moutet, Alexandre Picosson, Aurélien Gasser, Inal Djafar, Antoine Simon, Ádám Arany, Jaak Simm, Yves Moreau, Ola Engkvist, Hugo Ceulemans, Camille Marini, Mathieu Galtier

Abstract

To apply federated learning to drug discovery, we developed a novel platform in the context of the European Innovative Medicines Initiative (IMI) project MELLODDY (grant n°831472), which comprised 10 pharmaceutical companies, academic research labs, large industrial companies and startups. The MELLODDY platform was the first industry-scale platform to enable the creation of a global federated model for drug discovery without sharing the confidential data sets of the individual partners. The federated model was trained on the platform by aggregating the gradients of all contributing partners in a cryptographically secure way following each training iteration. The platform was deployed on an Amazon Web Services (AWS) multi-account architecture running Kubernetes clusters in private subnets. Organisationally, the roles of the different partners were codified as different rights and permissions on the platform and administered in a decentralized way. The MELLODDY platform generated new scientific discoveries, which are described in a companion paper.
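
As an illustration of what aggregating gradients "in a secure way" can look like, the following is a minimal sketch of additive pairwise masking, one common approach to secure aggregation. The abstract does not specify the scheme used on the MELLODDY platform, so everything here is an assumption made for illustration: the function names, the fixed-point encoding, the modulus, and the seeded random generator that stands in for a real pairwise key agreement between partners.

```python
import random

MODULUS = 2**32   # assumption: masked arithmetic is done modulo this value
SCALE = 10**6     # assumption: fixed-point scale for encoding floats as integers


def encode(gradient):
    """Encode a list of floats as fixed-point integers modulo MODULUS."""
    return [round(g * SCALE) % MODULUS for g in gradient]


def decode(values):
    """Map integers modulo MODULUS back to signed floats."""
    half = MODULUS // 2
    return [((v - MODULUS) if v >= half else v) / SCALE for v in values]


def pairwise_masks(num_partners, dim, seed=0):
    """Hypothetical stand-in for pairwise key agreement.

    Produces one shared random mask per pair (i, j) with i < j. In a real
    protocol each pair would derive its mask privately; a single seeded
    generator is used here only to keep the sketch short.
    """
    rng = random.Random(seed)
    return {
        (i, j): [rng.randrange(MODULUS) for _ in range(dim)]
        for i in range(num_partners)
        for j in range(i + 1, num_partners)
    }


def mask_gradient(partner, gradient, masks, num_partners):
    """Add masks shared with higher-indexed peers, subtract those shared with lower-indexed peers."""
    masked = encode(gradient)
    for other in range(num_partners):
        if other == partner:
            continue
        pair = (min(partner, other), max(partner, other))
        sign = 1 if partner < other else -1
        masked = [(m + sign * r) % MODULUS for m, r in zip(masked, masks[pair])]
    return masked


def aggregate(masked_gradients):
    """Sum the masked vectors; the pairwise masks cancel, leaving the sum of gradients."""
    total = [0] * len(masked_gradients[0])
    for vec in masked_gradients:
        total = [(t + v) % MODULUS for t, v in zip(total, vec)]
    return decode(total)


# Usage: three partners, each holding a private two-dimensional gradient.
gradients = [[0.5, -1.25], [1.0, 2.0], [-0.25, 0.75]]
masks = pairwise_masks(num_partners=3, dim=2)
masked = [mask_gradient(p, g, masks, 3) for p, g in enumerate(gradients)]
summed = aggregate(masked)
```

In this sketch the aggregator only ever sees the masked vectors, yet their sum equals the sum of the unmasked gradients because every mask is added by one partner and subtracted by another. A production protocol would additionally need to handle partner dropout and authenticated key exchange, which are omitted here.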

1 comment

scicastboard

Dear Dr. Oldenhof, thank you for sharing your interesting work on the MELLODDY platform, which demonstrates federated learning applied to drug discovery on an industry scale. If we understood correctly, the platform involves a three-layer architecture, uses TensorFlow Federated and an AWS multi-account setup, and has been deployed in production for three years with ten major pharmaceutical partners. The results show improvements in the predictive performance of collaboratively trained models compared to single-partner models, with the potential to support drug discovery by enhancing decision-making regarding candidate drug molecules.


The ScienceCast Moderators/Review Panel has a few questions on some aspects of the work:

  1. Could you provide more details on the specific algorithms and techniques used for federated learning within the MELLODDY platform? How do these algorithms compare to other state-of-the-art approaches in terms of model performance and security?
  2. How does the platform handle potential scalability issues, such as adding more partners or accommodating a larger volume of data? 
  3. In the security analysis, you focus on membership inference attacks. Have you considered evaluating the platform's vulnerability to other types of attacks, such as model inversion, property inference, or model extraction attacks? 
  4. Are there any specific domains within drug discovery where the platform performs exceptionally well or falls short? Can you provide more insight into the factors contributing to the observed performance improvements, especially in assays related to pharmacokinetics and toxicology?
  5. You mentioned exploring sparse secure aggregation, partner weighting, and post-processing tools like model fusion as potential future work. Can you elaborate on how these techniques might enhance the platform's performance, security, or collaborative capabilities?

We look forward to learning more about the MELLODDY platform and its potential impact on the pharmaceutical industry.
ScienceCast Board
