Make open research FAIR and interoperable.

We work hard to carry out effective and transparent research across federated silos. In this space, we keep running into the same issues:

  1. Data harmonization is tough, but common data elements (CDEs) and a little help from AI make it easier, providing a central gold standard for federated learning and meta-analyses.

  2. Data is scattered, and so are analysis code and the lab notebooks used by our friends in the wet lab.

  3. Finding and understanding your data and tools is a challenge in and of itself across multidisciplinary teams.

  4. Data wrangling is 80% of the labor in data science!
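
To make the harmonization idea concrete, here is a minimal sketch of mapping messy cohort column names onto a CDE dictionary. The tiny `CDES` dictionary and the fuzzy-matching approach are illustrative assumptions, not FAIRkit's actual method; the real catalogue holds around 110K human-in-the-loop CDEs.

```python
import difflib

# Hypothetical mini-dictionary of common data elements (CDEs);
# the real FAIRkit catalogue holds ~110K of these.
CDES = {
    "age_at_baseline": "Participant age in years at first visit",
    "sex": "Biological sex at birth",
    "moca_total": "Montreal Cognitive Assessment total score",
}

def map_to_cde(raw_column, cdes=CDES, cutoff=0.6):
    """Suggest the closest CDE for a raw column name, or None."""
    key = raw_column.strip().lower().replace(" ", "_")
    matches = difflib.get_close_matches(key, cdes, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Columns as they arrive from three different cohorts:
for col in ["Age at Baseline", "SEX", "MoCA total", "zip_code"]:
    print(col, "->", map_to_cde(col))
```

In practice this kind of lexical matching is only a first pass; the human-in-the-loop step is what turns candidate mappings into a trustworthy gold standard.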

We are prototyping tooling, called FAIRkit, to address these issues. The platform aggregates prior resources from our team and leverages AI assistants to catalogue, connect, and explain biomedical research data.

FAIRkit’s prototype includes:

  • 110K human-in-the-loop CDEs: a gold standard for scoring interoperability and prioritizing disparate datasets.

  • Tooling for data findability across silos, i.e., metadata for your data.

  • Automated aggregation, indexing, and explanation of code and lab notes to maximize reproducibility and avoid reinventing the wheel!
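
One way to picture the scoring that a CDE gold standard enables: treat a dataset's interoperability as the share of its fields that map onto a known CDE. This is a hypothetical sketch of the idea, not FAIRkit's actual scoring function, and the CDE and cohort names are made up.

```python
def interoperability_score(columns, cde_names):
    """Fraction of a dataset's columns that match a known CDE
    (case-insensitive exact match, for illustration)."""
    known = {name.strip().lower() for name in cde_names}
    mapped = sum(1 for col in columns if col.strip().lower() in known)
    return mapped / len(columns) if columns else 0.0

# Illustrative CDE names and one cohort's fields:
cdes = ["age_at_baseline", "sex", "moca_total"]
cohort = ["sex", "moca_total", "handedness", "site_id"]
print(interoperability_score(cohort, cdes))  # 2 of 4 fields map -> 0.5
```

A score like this gives teams a simple way to rank which datasets are closest to the gold standard and which need the most harmonization work first.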

This platform combines years of code from projects like DIVER and GenCDE to support FAIR guidelines and gold-standard open science research.

Check out details here … FAIRkit = GenCDE + DIVER.

Shout-outs in particular to Mat Koretsky, Alan Long, and Owen Bianchi!
