The power of AI begins with the data
In this opinion piece, Nirosha Lederer (Aetion, DC, USA) and Yashoda Sharma (Digital Medicine Society, CT, USA) explore the analytical validation of digital health measurement products, including AI/machine learning (ML). Yashoda and Nirosha provide their opinion on the issue of technology outpacing the development of regulation, and highlight the need for more industry collaboration if we are to deploy digital health products more inclusively.
The power of AI and ML comes from applications that allow us to sort through large volumes of data faster and more efficiently to identify relationships among different data points. With this comes the opportunity to revolutionize and streamline care delivery and the medical research paradigm. For example, experience plays a large role in the quality of care and the accuracy of physicians’ decision-making. AI/ML can aid the clinical decision-making of earlier-career physicians by making recommendations grounded in the accumulated experience and knowledge of others, narrowing the expertise gap between them and physicians with greater tenure. In developing and implementing clinical trials, AI/ML can refine inclusion and exclusion criteria, increase and improve participant engagement for enrollment and retention, accelerate study team efficiency, and enhance data collection.
However, the ability to realize the potential of these AI/ML tools depends not only on the quality of algorithm development, but also on the underlying data used to train the algorithms that power the AI/ML model. Traditional healthcare data sources reflect data collected in the context of care delivery within the four walls of the health system – for example, electronic health record (EHR) data collected as part of clinical workflows, or claims data intended for billing purposes. It is important to understand what these data do and do not represent. For example, one study found that a risk-stratification algorithm assigned equal risk scores to black patients who were sicker than white patients because it used healthcare cost as a proxy for health need. Furthermore, the lack of diversity and representation in AI/ML development can lead to biased algorithms that reduce the reliability, generalizability and usability of these tools in sub-populations, and subsequently further exacerbate care disparities and discrimination.
AI/ML are deeply interwoven with digital health technologies, which have become ubiquitous – from simple personal fitness trackers to complex disease monitoring, treatment and intervention products. Digital health products collect clinical and other forms of health data from individuals during their daily activities. The continuous and customized data collected via remote monitoring devices, fitness trackers and other wearables, smartphones, patient-reported outcomes, genomic data and publicly available population-level data will help individuals take more control over their health, help clinicians make more informed assessments and interventions, and contribute to public health initiatives and policy making. Many of these novel data streams alone provide far more in-depth data than the EHR, and when combined, they offer a deeper picture of the multiple factors contributing to an individual’s health. At the same time, digital health technologies present opportunities for healthcare to become more equitable and inclusive: they give individuals who may not be able to access traditional healthcare services the tools to manage their care remotely, and they allow us to study how medical products and interventions work in these historically understudied populations. It is more important than ever for clinicians, patients, sponsors, medical device developers and all other stakeholders to have confidence that these products accurately collect measures that provide true value.
Leaders in the digital health ecosystem have a responsibility to ensure that the data collected by digital health measurement products, and then used to train AI, are fit-for-purpose and representative of the populations who stand to benefit. We have three recommendations to begin addressing the challenges facing AI/ML when it comes to advancing the use of digital health measurement products.
First, start with the fundamentals of ethics, which come down to two tenets: 1) weighing the benefits and risks associated with developing and deploying digital health measurement products and 2) ensuring that the benefits outweigh the risks for all people – not unlike developing a drug or traditional device. Ethical training is necessary for everyone involved in generating data, including digital health product developers and deployers, patients (the data source), and clinicians and others who will make decisions based on the data (the end users). AI developers then have the responsibility of applying these tenets to the design phase of the AI lifecycle, starting with identifying the problem to solve and then interrogating how and when the data were collected, cleaned and applied to that problem. Comprehensive digital health ethics training has to become a cornerstone of healthcare as we integrate digital technologies and evolve with digital medicine.
Second, inclusion is needed throughout the digital health measurement product development and deployment processes. Digital health technologies are the result of two industries that have historically existed in silos – technology and healthcare (clinical and research) – yet their value comes from knowledge exchange among all the stakeholders in both. For the technology industry, each step in the product development lifecycle offers opportunities to be inclusive, from identifying the problem the product is attempting to solve and selecting the appropriate measures, through software development and user testing, to post-market performance. Once the product is built, quality evaluation through verification, analytical validation and clinical validation should be conducted with a lens of inclusivity, to ensure that the data collected are fit-for-purpose.
The second industry, healthcare, leads with product deployment. Developing the product inclusively is only one side of the coin: if the people who need the product cannot access it or use it successfully, then it is irrelevant how it was developed. Careful consideration of, and detailed information on, users’ life experiences and how these affect their use of the product will be critical for the quality of data collection.
Finally, more diversity is needed among those who play a role in how the data are collected and used in AI/ML model development and training. Clinical staff, specifically medical doctors, do not adequately represent their patient populations; this has implications for public health as well as for clinical trial participation. Similarly, greater diversity is needed among AI developers – diversity across demographic parameters and social determinants of health. Doctors and clinical staff exert influence over the data source (patients or clinical trial participants), while those who develop and train AI/ML models shape the usability and application of the data. These groups should therefore represent the intersecting characteristics that determine how people experience the healthcare system and how their previous life experiences affect their health. Data collected by digital health products are affected by a variety of factors, from the amount of melanin in a person’s skin to their digital access and literacy, and data integrity will require the knowledge to work within these parameters.
The potential of AI/ML to advance healthcare, especially with the use of data collected from digital health measurement products, is vast. However, we cannot rely on the AI/ML models alone; more attention has to be given to the data sources, as they are the main limitation on AI’s power to transform healthcare. We do this by first ensuring that ethical practices are in place at every level of the digital health ecosystem, then carrying that forward to develop and deploy digital health products more inclusively, while increasing the diversity of those with power over how data will be collected and interpreted. As the excitement over AI intensifies, let us not forget the old adage of garbage in, garbage out. The clean-up starts now.
Disclaimer:
The opinions expressed in this feature are those of the authors and do not necessarily reflect the views of Future Medicine AI Hub or Future Science Group.