Assessing the impact of predictive AI in healthcare settings
By Andrew Sansom | 11 Oct 2023
Models built on machine learning in healthcare can be victims of their own success, researchers propose.
Experts at the Icahn School of Medicine at Mount Sinai and the University of Michigan assessed how implementing predictive models affects the subsequent performance of those and other models. They found that using the models to adjust how care is delivered can alter the baseline assumptions the models were “trained” on, often leading to problems.
“We wanted to explore what happens when a machine learning model is deployed in a hospital and allowed to influence physician decisions for the overall benefit of patients,” said first and corresponding author Akhil Vaid, MD, clinical instructor of Data-driven and Digital Medicine (D3M), part of the Department of Medicine at Icahn Mount Sinai.
“For example, we sought to understand the broader consequences when a patient is spared from adverse outcomes like kidney damage or mortality. AI models possess the capacity to learn and establish correlations between incoming patient data and corresponding outcomes, but use of these models, by definition, can alter these relationships. Problems arise when these altered relationships are captured back into medical records.”
The study simulated critical care scenarios at two major healthcare institutions, the Mount Sinai Health System in New York and Beth Israel Deaconess Medical Center in Boston, analysing 130,000 critical care admissions. The researchers investigated three key scenarios:
- Model retraining after initial use – Current practice suggests retraining models to address performance degradation over time. Retraining can improve performance initially by adapting to changing conditions, but the Mount Sinai study shows it can paradoxically cause further degradation by disrupting the learned relationship between presentation and outcome (a toy simulation of this feedback loop follows this list).
- Creating a new model after one has already been in use – Acting on a model’s predictions can spare patients adverse outcomes such as sepsis. But because death can follow sepsis, a sepsis model effectively works to prevent both, so any future model built to predict death will inherit the same disrupted relationships. And since the exact relationships between all possible outcomes are unknown, any data from patients whose care was influenced by machine learning may be unsuitable for training further models.
- Concurrent use of two predictive models – If two models make simultaneous predictions, acting on one set of predictions renders the other obsolete. Predictions should therefore be based on freshly gathered data, which can be costly or impractical.
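To make the retraining scenario concrete, below is a minimal simulation of the feedback loop, in the spirit of the study but not drawn from it: the synthetic cohort, the logistic-regression model, and the assumption that an intervention halves a flagged patient’s risk are all illustrative choices, not details from the paper.

```python
# A minimal, self-contained sketch of the retraining feedback loop.
# NOT the study's actual simulation: cohort, intervention effect, and
# model are toy assumptions chosen only to illustrate the mechanism.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss

rng = np.random.default_rng(0)

def draw_cohort(n):
    """Synthetic patients: one presentation feature drives true outcome risk."""
    x = rng.normal(size=(n, 1))
    p_true = 1.0 / (1.0 + np.exp(-2.0 * x[:, 0]))  # true adverse-outcome risk
    return x, p_true

# 1. Train on pre-deployment records, where outcomes are unmitigated.
x0, p0 = draw_cohort(50_000)
model_v1 = LogisticRegression().fit(x0, rng.binomial(1, p0))

# 2. Deploy: clinicians intervene on flagged patients, which (by assumption)
#    halves their chance of the adverse outcome. The mitigated outcomes are
#    what gets written back into the record.
x1, p1 = draw_cohort(50_000)
flagged = model_v1.predict_proba(x1)[:, 1] > 0.5
y1 = rng.binomial(1, np.where(flagged, 0.5 * p1, p1))

# 3. Naively retrain on the post-deployment records: the presentation/outcome
#    relationship the model relies on has been weakened by its own success.
model_v2 = LogisticRegression().fit(x1, y1)

# 4. Evaluate both models on a fresh cohort with no interventions applied.
x2, p2 = draw_cohort(50_000)
y2 = rng.binomial(1, p2)
for name, m in [("original ", model_v1), ("retrained", model_v2)]:
    pred = m.predict_proba(x2)[:, 1]
    high_risk = p2 > 0.5                      # patients who truly need flagging
    recall = ((pred > 0.5) & high_risk).sum() / high_risk.sum()
    print(f"{name}: flags {recall:.0%} of truly high-risk patients, "
          f"Brier score {brier_score_loss(y2, pred):.3f}")
```

On this toy data the retrained model flags far fewer of the truly high-risk patients and its probabilities are poorly calibrated, because it has learned from outcomes its predecessor already helped avert, which is exactly the degradation mechanism described above.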
“Our findings reinforce the complexities and challenges of maintaining predictive model performance in active clinical use,” noted co-senior author Karandeep Singh, MD, from the University of Michigan. “Model performance can fall dramatically if patient populations change in their makeup. However, agreed-upon corrective measures may fall apart completely if we don’t pay attention to what the models are doing – or more properly, what they are learning from.”
Co-senior author Girish Nadkarni, MD, MPH, a professor of medicine at Icahn Mount Sinai, said: “We should not view predictive models as unreliable. Instead, it’s about recognising that these tools require regular maintenance, understanding and contextualisation. Neglecting their performance and impact monitoring can undermine their effectiveness. We must use predictive models thoughtfully, just like any other medical tool.
“Learning health systems must pay heed to the fact that indiscriminate use of, and updates to, such models will cause false alarms, unnecessary testing, and increased costs.”
Turning to recommendations, Dr Vaid suggested that health systems implement a way to track individuals impacted by machine-learning predictions, and that relevant government agencies issue guidance on the matter.
“These findings are equally applicable outside of healthcare settings and extend to predictive models in general,” he added. “As such, we live in a model-eat-model world where any naively deployed model can disrupt the function of current and future models, and eventually render itself useless.”
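One way to read that tracking recommendation in engineering terms is to attach provenance to every prediction that reaches a clinician, so retraining pipelines can identify records whose outcomes a model may have influenced. The sketch below is hypothetical; none of the names or fields come from the paper.

```python
# A hypothetical sketch of the tracking Dr Vaid recommends: tag each
# encounter with any model prediction that may have influenced care, so
# later training pipelines can find and exclude influenced records.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class PredictionProvenance:
    """One record per model prediction surfaced to a clinician."""
    patient_id: str
    model_name: str           # e.g. an illustrative "sepsis-risk" model
    model_version: str        # exact version that produced the score
    risk_score: float
    shown_to_clinician: bool  # was the prediction actually surfaced?
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def influenced_patients(log: list[PredictionProvenance]) -> set[str]:
    """Patients whose care may reflect a model's influence."""
    return {rec.patient_id for rec in log if rec.shown_to_clinician}
```

A retraining pipeline could then exclude, or separately handle, subsequent records from these patients rather than treating all data as untouched observations.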
The paper, ‘Implications of the use of artificial intelligence predictive models in healthcare settings: A simulation study’, is published in the 9 October online issue of Annals of Internal Medicine.