AI-READI: Driving Inclusive Innovation in the Future of T2DM Research

DATE
December 12, 2024

Funded by the National Institutes of Health (NIH) (MD, U.S.) Bridge2AI program, the AI-READI dataset advances the understanding of type 2 diabetes mellitus (T2DM), exploring both the pathogenesis and salutogenesis of the disease.

Diabetes mellitus is a chronic metabolic disorder, and type 2 accounts for 90% of cases. The main cause of T2DM is insulin resistance. There is currently no cure for T2DM; instead, treatment involves managing blood sugar levels to prevent further complications such as cardiovascular disease.

Reported in the journal Nature Metabolism, the Artificial Intelligence Ready and Equitable Atlas for Diabetes Insight (AI-READI) is a multimodal data generation project (DGP). It aims to address significant gaps in diabetes research by creating an AI-ready dataset containing information that can offer a more holistic view of the disease.

The AI-READI Dataset: A Comprehensive Introduction

Concerns have been raised regarding gender and race/ethnicity biases within the datasets used to train AI models, largely due to a lack of diversity in the AI workforce. For instance, it was reported that women make up only 22% of global AI talent, and fewer than 25% of AI employees come from racial and ethnic minority backgrounds.

To address this issue, Bridge2AI is ensuring that the AI-READI database will include equal representation of participants across four race/ethnicity groups: Asian, Black, White, and Hispanic. Equal representation in this dataset is particularly important because certain populations, such as African Americans and Asian Americans, are at a higher risk of developing T2DM.

Moreover, the project also intends to reach specific groups that have previously been disregarded in research, particularly American Indian and Alaska Native communities, which are minorities that suffer from a disproportionately high burden of T2DM. The AI-READI project is currently recruiting 4,000 participants from three study sites (Seattle; San Diego; and Birmingham, Alabama) and is planned to span four years (2022-2026). The recruitment process uses ICD-10 codes, an international classification system for diseases, along with demographic data obtained from electronic health records. This ensures standardization and consistency of data from all participants.

All participants follow a study protocol that includes a pre-visit questionnaire, on-site data collection, and at-home data collection. On-site data collection includes the collection of blood samples, which are to be taken at the University of Alabama (AL, U.S.).

“We see data supporting heterogeneity among type 2 diabetes patients: that people aren't all dealing with the same thing. And because we're getting such large, granular datasets, researchers will be able to explore this deeply.”

Dr Cecilia Lee, Professor of Ophthalmology, University of Washington School of Medicine.

Building a Fair Future

The AI-READI blueprint will be shared through the FAIRhub platform, allowing other DGPs to adopt and use it, ultimately increasing the amount of data available for research. The data on the FAIRhub platform will be made accessible annually as either a controlled-access set or a public-access set, an arrangement that protects participants’ privacy.

In fact, more than 110 research organizations worldwide have already downloaded data from the AI-READI project.

The project’s commitment to inclusivity is apparent and is further highlighted by the one-year internship offered by Bridge2AI. Led by AI-READI investigators, the internship specifically targets the recruitment of women and under-represented groups. The goal is to inspire similar projects to do the same.

Revolutionizing Diabetes Research: Progress and Challenges Ahead

AI-READI offers a robust framework to address a major challenge in healthcare AI: the lack of high-quality, readily available data required to develop reliable and effective AI systems. The potential uses of this AI-ready dataset are vast, from assisting the development of personalized treatment to identifying new risk factors for diabetes.

While these early developments are promising, it is important to note that the project is still in its initial stages, with a few years remaining until completion. Despite the project’s efforts to ensure inclusivity, the study locations are limited to the US, restricting geographic diversity. This is particularly relevant considering that the estimated prevalence of T2DM is highest in Brazil and Mexico. The restriction therefore limits the generalizability of the results, reducing their applicability to a global population.

Nevertheless, the preliminary outcomes are encouraging and suggest positive progress. Notable findings include data highlighting a link between disease state and pollution particles, as well as the observed heterogeneity of T2DM among participants.

This project is seen as a transformative initiative at the intersection of big data, AI, and health research. By offering a diverse dataset, AI-READI may improve the understanding of the pathogenesis and salutogenesis of T2DM.