Federated learning in multimodal healthcare data [preprint]

Data silos are a huge issue in machine learning, particularly in healthcare and biotech. They can arise from distributed servers or cloud localities, policy barriers like GDPR, or people simply not wanting to move or share data.

Today we are sharing a really exciting preprint led by Ben Danek and Faraz Faghri from our team that shows how we can work across silos to maximize the value of federated healthcare datasets in a precision medicine context.

Check out this proof of concept for multimodal federated learning in Parkinson’s disease here.



From a technical perspective, this work illustrates the (relative) ease of use and high performance of modern, fully federated machine learning methods in this space. In our paper, we evaluate the performance of federated machine learning algorithms for Parkinson’s disease diagnosis using multi-omics data. The top-performing federated algorithm scores 87.6% AUC-PR on withheld data, within 2% of the top-performing central machine learning algorithm. We find that federated learning can enable collaborative model training on datasets siloed by policy boundaries and cloud service providers.
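
For readers new to the idea, here is a minimal sketch of federated averaging (FedAvg), one common federated learning scheme. It is an illustration only and not the pipeline or the algorithms evaluated in the preprint: the logistic regression model, the synthetic data, and every function name and parameter below are assumptions made for this example. The point it shows is that each silo trains on its own data and only model weights are shared and averaged.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def local_update(weights, X, y, lr=0.1, epochs=5):
    """Train logistic regression on one silo's data, starting from the global weights."""
    w = list(weights)
    n = len(y)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(dot(w, xi)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def federated_average(client_weights, client_sizes):
    """Average client models, weighting each by its local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def train_fedavg(silos, n_features, rounds=20):
    """Run FedAvg: raw data never leaves a silo, only weights are exchanged."""
    global_w = [0.0] * n_features
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in silos]
        global_w = federated_average(updates, [len(y) for _, y in silos])
    return global_w


def make_synthetic_silos(n_silos=3, n_per_silo=200, n_features=5, seed=0):
    """Hypothetical stand-in for siloed datasets (purely synthetic, not patient data)."""
    rng = random.Random(seed)
    true_w = [rng.gauss(0, 1) for _ in range(n_features)]
    silos = []
    for _ in range(n_silos):
        X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_per_silo)]
        y = [1.0 if dot(true_w, xi) > 0 else 0.0 for xi in X]
        silos.append((X, y))
    return silos


def accuracy(w, silos):
    """Fraction of correctly classified samples across all silos."""
    correct = total = 0
    for X, y in silos:
        for xi, yi in zip(X, y):
            correct += int((dot(w, xi) > 0) == (yi == 1.0))
            total += 1
    return correct / total


if __name__ == "__main__":
    silos = make_synthetic_silos()
    w = train_fedavg(silos, n_features=5)
    print("accuracy across silos:", accuracy(w, silos))
```

In a real deployment each call to `local_update` would run inside its own silo, and a production framework would handle communication, security, and evaluation on withheld data.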


Check out the PDF of the preprint on bioRxiv for more details. More to come soon on the topic of federated learning in healthcare data.
