Make open research FAIR and interoperable.
We work hard to carry out effective and transparent research across federated silos. In this space, we run into the same issues:
Data harmonization is tough, but common data elements (CDEs) and a little help from AI make it easier, providing a central gold standard for federated learning or meta-analyses (see the sketch after this list).
Data is scattered, as is code for analyses and the lab notebooks used by our friends in the wet lab.
Finding and understanding your data and tools is a challenge in and of itself across multidisciplinary teams.
Data wrangling is famously around 80% of the labor in data science!
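To make the CDE idea concrete, here is a minimal sketch in Python: it scores how well a dataset's messy column names map onto a tiny, made-up CDE dictionary using plain fuzzy string matching. Every name here is illustrative, and the matching step is a stand-in for the AI-assisted harmonization, not FAIRkit's actual pipeline.

```python
# Illustrative sketch: score how well a dataset's columns map onto a CDE dictionary.
# Fuzzy string matching stands in for the AI-assisted step; all names are made up.
from difflib import SequenceMatcher

# Hypothetical CDE dictionary: canonical element name -> description
CDES = {
    "participant_id": "Unique study participant identifier",
    "age_at_baseline": "Participant age in years at the baseline visit",
    "diagnosis": "Primary clinical diagnosis code",
}

def best_cde_match(column: str, cde_names: list[str]) -> tuple[str, float]:
    """Return the closest CDE name and a 0-1 similarity score for one column."""
    scored = [(name, SequenceMatcher(None, column.lower(), name).ratio()) for name in cde_names]
    return max(scored, key=lambda pair: pair[1])

def interoperability_score(columns: list[str], cde_names: list[str], threshold: float = 0.8) -> float:
    """Fraction of columns that map to some CDE above the similarity threshold."""
    hits = sum(1 for col in columns if best_cde_match(col, cde_names)[1] >= threshold)
    return hits / len(columns) if columns else 0.0

# Toy dataset with messy column names: two map cleanly to CDEs, one does not.
columns = ["ParticipantID", "AgeAtBaseline", "dx_code"]
print(interoperability_score(columns, list(CDES)))  # prints ~0.67 (2 of 3 columns matched)
```

A score like this is one simple way to rank disparate datasets by how much harmonization work they need before joining a federated analysis.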
We are prototyping tooling, called FAIRkit, to address these issues. The platform aggregates prior resources from our team and leverages AI assistants to catalogue, connect, and explain biomedical research data.
FAIRkit’s prototype includes:
110K human-in-the-loop CDEs, a gold standard for scoring interoperability and prioritizing disparate datasets.
Tooling for data findability across silos, i.e., metadata for your data (sketched just after this list).
Automated aggregation, indexing, and explanation of code and lab notes to maximize reproducibility and avoid reinventing the wheel!
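For the "metadata for your data" piece, here is a toy sketch of what a cross-silo catalogue can look like: the datasets stay where they live, and only lightweight, searchable descriptions get indexed. Again, every class, field, and name below is illustrative, not FAIRkit's actual schema.

```python
# Toy sketch of a cross-silo metadata catalogue: data stays in place,
# only lightweight descriptions are indexed and searched. All names are illustrative.
from dataclasses import dataclass, field

@dataclass
class DatasetRecord:
    name: str
    silo: str               # where the data actually lives, e.g. an institute or cloud bucket
    description: str
    keywords: list[str] = field(default_factory=list)

class MetadataCatalog:
    """In-memory stand-in for a searchable metadata index."""
    def __init__(self) -> None:
        self.records: list[DatasetRecord] = []

    def register(self, record: DatasetRecord) -> None:
        self.records.append(record)

    def search(self, term: str) -> list[DatasetRecord]:
        term = term.lower()
        return [
            r for r in self.records
            if term in r.description.lower() or any(term in k.lower() for k in r.keywords)
        ]

catalog = MetadataCatalog()
catalog.register(DatasetRecord(
    name="cohort_a_clinical",
    silo="institute-a",
    description="Clinical visit data harmonized to CDEs",
    keywords=["clinical", "CDE", "longitudinal"],
))
for hit in catalog.search("cde"):
    print(hit.name, "->", hit.silo)  # prints: cohort_a_clinical -> institute-a
```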
This platform combines years of code from projects like DIVER and GenCDE to support the FAIR guidelines and gold-standard open science research.
Check out details here … FAIRkit = GenCDE + DIVER.
Shout outs in particular to Mat Koretsky, Alan Long and Owen Bianchi!