AI Case Study

KenSci improves on prediction models of mortality risk six months to one year out through machine learning, leading to better palliative care

End-of-life care is challenging: its cost is estimated at $205B in the US, and many patients undergo multiple medical procedures with limited control over their strong preference to die at home surrounded by their loved ones. Using supervised machine learning trained on patient electronic medical records and patient claims, KenSci Research was able to improve on the leading predictions of mortality risk six months to one year out.

Industry

Healthcare

Healthcare Providers And Services

Project Overview

KenSci researched models to help predict mortality six to twelve months out using "machine learning models to issue mortality scores as well as integrate explanations associated with these predictions. The machine learning model is deployed as a single layer binary classification layer that can be accessed by a cloud based app from any browser. This system uses data from the hospital systems’ claims live feed or Electronic Health Record (EHR) data feed. New data can be continuously pulled into the cloud which is then transformed into a standardized schema. The standardized schema allows the capability of adding new data sources as well as enabling the transfer of data sources in the system with relative ease."
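The idea of transforming several feeds into one standardized schema can be sketched as follows. This is only an illustration under stated assumptions: the source does not publish its schema, so the `StandardRecord` fields, the raw keys (`member_id`, `mrn`, `los_hours`, and so on), and the `MAPPERS` registry are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class StandardRecord:
    """Hypothetical standardized schema; the real schema is not published."""
    patient_id: str
    source: str                          # which feed the record came from
    age: Optional[int]
    length_of_stay_days: Optional[float]
    total_cost_usd: Optional[float]


def from_claims(row: dict) -> StandardRecord:
    """Map a raw claims-feed row (hypothetical keys) to the standard schema."""
    return StandardRecord(
        patient_id=str(row["member_id"]),
        source="claims",
        age=row.get("age"),
        length_of_stay_days=row.get("los_days"),
        total_cost_usd=float(row["paid_amount"]),
    )


def from_ehr(row: dict) -> StandardRecord:
    """Map a raw EHR-feed row (hypothetical keys) to the standard schema."""
    return StandardRecord(
        patient_id=str(row["mrn"]),
        source="ehr",
        age=row.get("age_years"),
        length_of_stay_days=float(row["los_hours"]) / 24.0,  # hours -> days
        total_cost_usd=row.get("charges"),
    )


# Adding a new data source means registering one more mapper here;
# everything downstream only ever sees StandardRecord.
MAPPERS: Dict[str, Callable[[dict], StandardRecord]] = {
    "claims": from_claims,
    "ehr": from_ehr,
}


def standardize(source: str, row: dict) -> StandardRecord:
    """Convert one raw row from the named feed into the standard schema."""
    return MAPPERS[source](row)
```

With this layout, a model trained on `StandardRecord` fields does not need to change when a feed is added or swapped, which is one plausible reading of the "relative ease" the quote describes.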

To address the problem of interpretability, "KenSci has invested significantly in developing models which not only have a high predictive power, but are also interpretable."

"Care providers can thus meaningfully consider the risk recommendation and objectively decide to accept or ignore such recommendations given their own expertise and understanding of the larger picture. This is one way to make the underlying machine learning system more assistive and trustworthy."

Reported Results

"The results that we obtained were better than the baseline that is currently used in the healthcare industry. We also created explanation based prediction models based on the LIME (Locally Interpretable Model-agnostic Explanations) to surface model insights.

Technology

"For both data sets, the following models were tried for the
binary classification problem: Adaboost, Random Forests,
Support Vector Machines, Naive Bayes, Bayes Net, Extreme
Gradient Boosting, CART and GLM (Generalized Linear
Models). The best results were obtained from Extreme Gradient
Boosting and we report the results from this model in
all the cases described here. The output from the model is a
scaled risk score between zero and one. We use a threshold
function such that if the score is above the threshold then it
is flagged as prediction for end of life otherwise it is flagged
as surviving."
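The threshold step described above can be sketched in a few lines. The source does not state the threshold value, so the default of 0.5 and the label strings here are assumptions for illustration only.

```python
def flag_end_of_life(risk_score: float, threshold: float = 0.5) -> str:
    """Turn a scaled risk score in [0, 1] into a binary flag.

    Scores strictly above the threshold are flagged as a prediction for
    end of life; all other scores are flagged as surviving.
    The 0.5 default is an assumed value, not one reported by the source.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk score must lie between zero and one")
    return "end_of_life" if risk_score > threshold else "surviving"
```

Lowering the threshold flags more patients for palliative-care review at the price of more false alarms, and raising it does the opposite, so in practice the value would be tuned against the care team's tolerance for each kind of error.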

Function

R And D

Core Research And Development

Background

"In the US a large percentage of elderly patients on Medicare pass away in acute care hospitals. For many, the last six months of life are characterized by complex medical procedures, repeated emergency visits, and frequent hospital stays. This corresponds to significant costs to the taxpayers with marginal benefits to the patients and their families. In most cases, one gets a very small window of time to prepare for and care for the patient about to expire. Thus, many patients and their families do not have a sense of control over the last stages of their lives. In surveys, a majority of Americans, around 70 percent, expressed their wish to ideally die at home; living their normal way of life, surrounded by their loved ones." "The challenge of predicting 6- to 12-month mortality risk is fairly complex. It's a $205 billion problem just in the US."

Machine learning models can often make better predictions, but there is reluctance to use them in the medical setting without some level of interpretability.

Benefits

Data

"The data from Health System A came from a patient population with a history of heart failure (HF), and included 4,888 patients with a variety of electronic medical records data including:

* Demographic features
* Patient length of stay
* Overall cost related features
* Specific cost related features (in-patient, out-patient, home health, hospice, skilled nursing facility)
* Readmissions information
* Counts of procedures performed, tracked through the Healthcare Common Procedure Coding System (including things like ambulance rides, equipment and prosthetics)

The data from Health System B consists of patients with any type of illness and includes 48,365 patients. Only claims data was available for Health System B.

Assisting a physician in transitioning a patient to palliative care based on insights gained from a 6- to 12-month mortality prediction is a very complex endeavor. Data like demographics and co-morbidity provide good results but additional data sources such as physician input or variations in prescriptions can often provide significant additional information. At the end of the day there is never an ideal 'dream' dataset in machine learning. EMRs tend to contain less than 10 percent of the information about a person."