Lenus Health's Lead Data Scientist on the importance of explainable AI

The Lenus Engineering team is dedicated to investigating why a model makes the decisions it does, and strives to constantly innovate in this space.

Artificial intelligence (AI) and machine learning (ML) are often used as a black box, where features from training data go in and a prediction comes out. However, in a clinical setting, model explainability is critical to ensure trust and safety in model predictions. Transparency is crucial if ML is to aid clinical decision making in an effective and ethical manner.

Algorithms can be classed into two broad categories: black box and glass box. A glass box algorithm is one that is inherently interpretable and allows the user to follow the decision path made by the model. An example of a glass box algorithm would be a simple decision tree classifier or a linear regression. A black box algorithm, on the other hand, is inherently uninterpretable: the user cannot explicitly follow the model's decision from input to output.
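As a minimal sketch of what "following the decision path" means in practice, the snippet below trains a shallow decision tree and prints the if/else rules it learned. The dataset and depth are illustrative choices, not a real clinical model.

```python
# A glass box model: a shallow decision tree whose learned rules can be
# read directly. Dataset and hyperparameters are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Keep the tree shallow so every decision path stays human-readable.
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# export_text prints the full set of if/else rules the model learned,
# so a reviewer can trace any prediction from input to output.
print(export_text(model, feature_names=list(X.columns)))
```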

Black box algorithms instead rely on additional libraries such as SHAP to derive both global and local model explainability. Global explainability describes what is important to the model overall (averaged across the entire training population), whereas local explainability describes what is important to an individual prediction. While glass box algorithms are appealing for their full transparency, interpretability does not always equal explainability, due to variable confounding.
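The sketch below shows, in hedged form, how the shap library can produce both kinds of explanation for a black box model. The gradient boosting classifier and the dataset are stand-ins for whatever clinical model is being interrogated, and the exact plotting calls may vary with the shap version in use.

```python
# Global and local explanations for a black box model using SHAP.
# Model and dataset are placeholders, not a Lenus Health model.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer(X)

# Global explainability: mean absolute SHAP value per feature across the
# whole dataset shows what matters to the model on average.
shap.plots.bar(shap_values)

# Local explainability: the contribution of each feature to a single
# prediction (here, the first row of the dataset).
shap.plots.waterfall(shap_values[0])
```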

Understanding confounding is one of the most important considerations when developing models for clinical decision making. Confounding variables are ones which affect both the independent and dependent variables, and they appear commonly in healthcare data. A well-known example of this issue is a model predicting survival risk in pneumonia patients (R. Caruana, 2015). The model predicted that asthmatic patients with pneumonia had a greater survival probability, which at face value does not make sense. Interrogation of the model found that the hospital’s treatment policy differed for patients presenting with both pneumonia and asthma: these patients were typically admitted directly to ICU, and as a result had better survival rates. That being the case, it would be wrong to conclude that having asthma increases a patient’s chance of surviving pneumonia.
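The following synthetic simulation (invented numbers, not the Caruana study data) illustrates how such a confounder can flip a model's apparent conclusion. Here asthma patients are routed to more aggressive care, which improves survival; a model that never sees the treatment variable credits asthma itself with the benefit.

```python
# Synthetic illustration of confounding: asthma -> ICU care -> survival.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000

asthma = rng.binomial(1, 0.15, n)
# Hospital policy: asthmatic pneumonia patients are far more likely to go to ICU.
icu = rng.binomial(1, 0.05 + 0.85 * asthma)
# Survival improves with ICU care and worsens slightly with asthma itself.
p_survive = 1 / (1 + np.exp(-(0.2 + 1.5 * icu - 0.3 * asthma)))
survived = rng.binomial(1, p_survive)

# Model without the confounding treatment variable: asthma looks protective.
naive = LogisticRegression().fit(asthma.reshape(-1, 1), survived)
print("asthma coefficient (no ICU feature):", naive.coef_[0][0])   # positive

# Model that includes the treatment variable: the asthma coefficient turns negative.
adjusted = LogisticRegression().fit(np.column_stack([asthma, icu]), survived)
print("asthma coefficient (with ICU feature):", adjusted.coef_[0][0])
```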

An additional challenge which explainable AI seeks to solve is identifying biases. Many recent health prediction models have been shown to perform poorly on women and on racial and ethnic minorities. Ground truth diagnosis labelling, for example, has been shown to be gender-skewed due to differences in disease presentation between males and females. Similarly, during the model training phase, approximation errors have been shown to disproportionately affect under-represented groups.

Biases in Electronic Health Record (EHR) data may arise from variations in patient populations, availability of care, and so on. When dealing with these data, we need to make sure we account for biases in the training data and that we don’t introduce additional biases into models. Biases caused by data loss are also prevalent when building models from EHR data. Additionally, strict inclusion/exclusion criteria can result in a dataset that is not representative of the population; an example is a technological competency requirement for taking part in a trial. Any model developed on such data would not represent the true population, because a particular demographic (in this example, much of the elderly population) has been excluded.

These biases often get embedded into the model and can have detrimental effects when deployed in the wild. For example, missing data in one healthcare setting may have a different meaning to missing data in another. Explainable AI allows the end-user to interrogate the model and surface any underlying biases.
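One simple, hedged way to start surfacing such biases is to compare a model's performance across demographic subgroups rather than reporting a single aggregate score. The helper below is illustrative: the group labels, metric, and toy predictions are invented for the example.

```python
# Sketch of a subgroup audit: report recall (sensitivity) per group so
# under-served groups are not hidden behind one aggregate number.
import pandas as pd
from sklearn.metrics import recall_score

def recall_by_group(y_true, y_pred, groups):
    """Recall computed separately for each subgroup."""
    df = pd.DataFrame({"y_true": y_true, "y_pred": y_pred, "group": groups})
    return df.groupby("group").apply(
        lambda g: recall_score(g["y_true"], g["y_pred"])
    )

# Toy example: a model that looks acceptable overall but misses far more
# positive cases in group B than in group A.
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
groups = ["A"] * 6 + ["B"] * 6
print(recall_by_group(y_true, y_pred, groups))
# A    0.75
# B    0.25
```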

The Lenus Engineering team is dedicated to investigating why a model makes the decisions it does, and we strive to constantly innovate in this space. It is paramount that any model we develop does not have the potential to exacerbate existing health disparities, and that it is fair, ethical, and explainable.