
AI Case Study

Novartis researchers train algorithm to identify different cell types to spot cancer in scans

Novartis has collaborated with PathAI to train a machine learning algorithm to categorise cell types and recognise cancerous ones. The system was trained on 400 pathology images of breast and lung cancer tissue from the Institute of Pathology at the University Hospital Basel. Given an image, it had to identify cancer, identify cell types and predict the patient's probability of surviving five years. The system successfully categorised cells into five types: lymphocyte, tumor cell, macrophage, plasma cell and fibroblast. The next step is to test whether the system can pick up information in pathology images that pathologists have missed or are not able to recognise.



Pharmaceuticals And Biotech

Project Overview

"Many researchers are pursuing this possibility, but Novartis pathologists think AI might have an additional role to play. They hypothesize that pathology slides could contain information that helps explain why some patients respond to therapy when other seemingly similar patients do not.

To explore this idea, pathologists and data scientists from Novartis have joined forces with tech startup PathAI. They are training an AI system developed by PathAI to learn to see the same patterns pathologists see and then building on that to determine if the system can detect hidden but informative patterns too subtle or complex for pathologists to discern. The effort is part of a larger effort at Novartis to leverage data and digital technologies in ways that could help drug developers get the right drugs to the right patients faster.

In a first phase of testing, the collaborative team has trained the PathAI system to look at slides from untreated patients and distinguish tumor from normal tissue. The system can also identify different cell types on a slide reliably. For a pathologist, these feats are akin to finding a needle in a haystack and then labeling every piece of straw.

The ability to label every cell is becoming increasingly important as cancer therapies evolve to include medicines that target not only cancer cells but also immune cells. If computers can analyze an entire slide at once and quantify cell types and locations, they could potentially reveal patterns that predict how well a patient might fare on a given therapy.

“Hopefully we can figure out which features correlate with survival or response to a drug,” says Meg McLaughlin, a pathologist and Director of the Oncology Pathology and Biomarkers group in the Oncology Translational Research team at the Novartis Institutes for BioMedical Research (NIBR).

With a recent explosion of experimental immuno-oncology options alongside therapies that target cancer-driving mutations, one of the biggest challenges for drug hunters is matching the most appropriate therapy to individual patients. While genomic information helps drive smart decisions, valuable clues in pathology slides could also help. “We want to create a platform that enables the field of pathology to support the accelerating pace of drug development,” says Andrew Beck, a pathologist, computer scientist and CEO of PathAI, located in Boston, Massachusetts, in the US.

In collaboration with the Institute of Pathology at the University Hospital Basel in Switzerland, the Novartis team gained access to 400 pathology images from breast and lung cancer tissues along with anonymized information about the patients’ diagnoses and survival times.

The challenge for PathAI’s platform? Given an image, identify cancer, identify cell types and predict the patient’s probability of surviving five years.

One way to approach the challenge is to feed a set of untrained AI algorithms a subset of the data and see what it learns. Unlike a trained pathologist, the machine approaches the problem with no knowledge of cells or cancer.

To give the untrained algorithms more knowledge about the training data, PathAI decided to feed them even more rich data. A team of consulting pathologists marks up the slides, giving the algorithms more information to work with. It’s a bit like annotations in a hefty piece of literature that highlight and explain critical passages.

For example, when training the algorithms to distinguish cell types, PathAI diced the training slides into about 10 000 smaller images and had pathologists label the cell types in each slice. “We had to think really hard about how we annotate the images,” says McLaughlin. “That step determines to a large extent what you get out of the AI model in the end.”

After training, the PathAI platform lets users see pathology images through the machine’s eyes. Regions of the slides determined to be cancer glow bright red in a field of green surrounding tissue. Different cell types stand out in vivid colors like candies in a dish. The existing platform is for research use only, but PathAI aims to build applications that could be used by doctors in the future."
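The dicing step described in the overview, in which training slides are cut into smaller images for pathologists to label, can be illustrated with a minimal sketch. This is not PathAI's implementation: the function name `tile_boxes`, the tile size and the image dimensions are all assumptions chosen for illustration, and the source does not say how the slides were actually divided.

```python
from typing import Iterator, Tuple

# (left, top, right, bottom) in pixels; a hypothetical convention for this sketch.
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int) -> Iterator[Box]:
    """Yield non-overlapping boxes that together cover a width x height image.

    Tiles on the right and bottom edges are clipped to the image bounds,
    so they may be smaller than `tile` pixels on a side.
    """
    if width <= 0 or height <= 0 or tile <= 0:
        raise ValueError("width, height and tile must be positive")
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield (left, top, min(left + tile, width), min(top + tile, height))


if __name__ == "__main__":
    # Illustrative numbers only: a 1000 x 600 pixel image cut into 256-pixel tiles.
    boxes = list(tile_boxes(width=1000, height=600, tile=256))
    print(len(boxes), "tiles; first:", boxes[0], "last:", boxes[-1])
```

Each box could then be cropped out of the slide image and handed to an annotator. A real pipeline would probably also skip empty background tiles, a step this sketch leaves out.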

Reported Results

"The PathAI system has also learned to recognize cell types and overlays the slide with indications of five different kinds: lymphocyte (green), tumor cell (red), macrophage (yellow), plasma cell (black) and fibroblast (purple).

Now that the researchers have shown that the PathAI system has the potential to see what pathologists see, they want to find out if there’s information in those images that isn’t obvious to pathologists."
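The five cell types and overlay colours reported above, together with the idea of quantifying cell types on a slide, can be sketched in a few lines. The colour mapping is taken from the reported results; the function name `summarize_cells` and the per-cell label list are hypothetical, and the sketch makes no claim about how the PathAI platform stores or counts its predictions.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Cell type -> overlay colour, as listed in the reported results.
OVERLAY_COLORS: Dict[str, str] = {
    "lymphocyte": "green",
    "tumor cell": "red",
    "macrophage": "yellow",
    "plasma cell": "black",
    "fibroblast": "purple",
}


def summarize_cells(labels: List[str]) -> Dict[str, Tuple[int, float]]:
    """Return {cell type: (count, fraction of all cells)} for one slide.

    `labels` is a hypothetical list holding one predicted cell type per cell.
    """
    unknown = set(labels) - set(OVERLAY_COLORS)
    if unknown:
        raise ValueError(f"unknown cell types: {sorted(unknown)}")
    counts = Counter(labels)
    total = len(labels)
    return {
        name: (counts[name], counts[name] / total if total else 0.0)
        for name in OVERLAY_COLORS
    }


if __name__ == "__main__":
    # Made-up labels for illustration; these are not data from the study.
    example = ["tumor cell"] * 3 + ["lymphocyte"]
    for name, (count, fraction) in summarize_cells(example).items():
        print(f"{name:12s} {OVERLAY_COLORS[name]:7s} {count:3d} {fraction:.2f}")
```

Per-slide summaries like these are the kind of quantity that could be compared against survival or drug response, although the source does not describe which features the team actually examined.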




"For 150 years, pathologists have been looking through microscopes at tissue samples mounted on slides to diagnose cancer. Each assessment is weighty: Does this patient have cancer or not?

The job of a pathologist is daunting. A single slide could contain hundreds of thousands of cells. Only a handful might be cancer. Inaccurate diagnosis rates range from 3-9% of cases, according to a recent review.

Enter artificial intelligence (AI), an extra set of unbiased, indefatigable artificial eyes that could help catch errors."


