AI Case Study

Pfizer identifies new potential cancer treatments using IBM Watson

Pfizer has adopted IBM Watson to aid its cancer treatment research after the platform identified a treatment combination that Pfizer's own research team was also investigating separately. The platform is now being used to accelerate immuno-oncology drug discovery research.



Pharmaceuticals And Biotech

Project Overview

From Forbes: "Pfizer decided to collaborate with Watson after a potential treatment target emerged during a pilot program in the last year and a half. Pfizer was working on two products 'outside of cancer' and fed information about that research into Watson, which later suggested a 'very strong combination outcome' as a possible cancer treatment. Meanwhile, Pfizer’s own drug research team suggested a similar combination, giving the drug maker’s research team confidence it was headed in the right direction. By partnering with IBM’s Watson for Drug Discovery, Pfizer hopes to more quickly analyze and test hypotheses from “massive volumes of disparate data sources” that include more than 30 million sources of laboratory and data reports as well as medical literature. Watson will also be able to combine such a massive database with Pfizer’s own proprietary research information."

Reported Results

Watson's suggestion validated the research team's own findings, but no specific details were given about the drug or its development timeline.


According to the article in Clinical Therapies, the general process for IBM Watson applications in pharmaceutical research is as follows: "Once relevant datasets are collected and Watson has been provided with dictionaries that enable it to recognize terms, a set of annotators is applied to the data... In addition to extracting individual entities such as genes, Watson’s annotators identify the relationships among genes, drugs, and diseases. These annotators typically learn from patterns in the text where they occur and then extrapolate more generally for a given type of entity... Deep natural language-processing and machine-learning techniques were developed so that Watson could teach itself about these concepts and comprehend the subject at a more meaningful level."
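To make the dictionary-and-annotator idea concrete, the following is a minimal sketch, not Watson's actual implementation. It assumes small hand-written dictionaries (the `DICTIONARIES` contents and both function names are hypothetical) and substitutes plain dictionary lookup plus sentence-level co-occurrence for the learned annotators the article describes.

```python
import re
from itertools import product

# Hypothetical term dictionaries; a real system would load curated vocabularies.
DICTIONARIES = {
    "gene": {"TP53", "BRCA1", "EGFR"},
    "drug": {"erlotinib", "olaparib"},
    "disease": {"lung cancer", "breast cancer"},
}


def annotate_entities(sentence):
    """Return (entity_type, term) pairs for dictionary terms found in the sentence."""
    found = []
    for entity_type, terms in DICTIONARIES.items():
        for term in sorted(terms):
            # Case-insensitive whole-word match on the dictionary term.
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, sentence, re.IGNORECASE):
                found.append((entity_type, term))
    return found


def annotate_relationships(text):
    """Pair up entities of different types that co-occur in the same sentence."""
    relationships = set()
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        entities = annotate_entities(sentence)
        for (type_a, term_a), (type_b, term_b) in product(entities, repeat=2):
            if type_a < type_b:  # keep each cross-type pair only once
                relationships.add(((type_a, term_a), (type_b, term_b)))
    return relationships


if __name__ == "__main__":
    sample = "Erlotinib inhibits EGFR in lung cancer. BRCA1 is studied separately."
    for relation in sorted(annotate_relationships(sample)):
        print(relation)
```

Co-occurrence is only a crude stand-in for relationship extraction: it records that two entities appear together, not how they are related, which is the gap the article's pattern-learning annotators are meant to close.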


R And D

Product Development


As the Clinical Therapies research article points out, one of the challenges of pharmaceutical research is that the "volume of published science grows at a rate of ~9% annually, doubling the volume of science output nearly every 9 years. The ability to absorb only a fraction of available information results in many lost opportunities to further research. Drug discovery depends on identifying novel and effective targeting strategies that produce better clinical outcomes for patients. Harnessing volumes of information about how disease processes originate and progress and how drugs affect animals and humans could yield novel treatment strategies."
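The link between an annual growth rate and a doubling time can be checked with a short calculation. This is a sketch that assumes steady compound growth, which the quoted article does not state explicitly; the function name `doubling_time` is introduced here for illustration.

```python
import math


def doubling_time(annual_growth_rate):
    """Years for a quantity to double under steady compound annual growth."""
    return math.log(2) / math.log(1 + annual_growth_rate)


if __name__ == "__main__":
    for rate in (0.08, 0.09):
        years = doubling_time(rate)
        print(f"{rate:.0%} growth -> doubles in about {years:.1f} years")
```

Under that assumption, growth of 9% per year doubles the volume in about 8 years and growth of 8% in about 9 years, so the quoted figures are in the same range.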



According to Forbes, IBM Watson is intended to process "more than 30 million sources of laboratory and data reports as well as medical literature." The research article describes how this works: "hundreds of external, public, licensed, and private sources of content that may contain relevant data are aggregated. In the case of Watson, IBM aggregates these data into a single repository called the Watson corpus. A unique Watson corpus is established for each domain to which Watson is applied."
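The aggregation step described here can be pictured as merging records from several sources into one domain-specific repository. The sketch below is illustrative only: the `Document` and `Corpus` classes, the source labels, and the duplicate-text check are assumptions made for the example, not details of how IBM builds a Watson corpus.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    source: str  # illustrative label such as "public", "licensed", or "private"
    doc_id: str
    text: str


@dataclass
class Corpus:
    """A single repository for one domain, keyed by '<source>:<doc_id>'."""

    domain: str
    documents: dict = field(default_factory=dict)

    def add_source(self, source, records):
        """Merge (doc_id, text) records from one source; skip duplicate texts.

        Returns the number of documents actually added.
        """
        seen_texts = {doc.text for doc in self.documents.values()}
        added = 0
        for doc_id, text in records:
            normalized = " ".join(text.split())  # collapse runs of whitespace
            if normalized in seen_texts:
                continue
            self.documents[f"{source}:{doc_id}"] = Document(source, doc_id, normalized)
            seen_texts.add(normalized)
            added += 1
        return added


if __name__ == "__main__":
    corpus = Corpus(domain="oncology")
    corpus.add_source("public", [("a1", "Gene X  regulates growth."),
                                 ("a2", "Drug Y binds Gene X.")])
    corpus.add_source("private", [("p1", "Gene X regulates growth."),
                                  ("p2", "Internal assay result.")])
    print(sorted(corpus.documents))
```

Creating one `Corpus` instance per domain mirrors the article's point that a separate corpus is established for each domain to which Watson is applied.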