
AI Case Study

University of California San Francisco researchers use principal component analysis to compare medical treatment outcomes across different studies

Researchers from the University of California San Francisco use principal component analysis to compare treatment outcomes measured across a variety of domains in different studies of rats with traumatic brain injuries. Using the method, they discovered a synergy between two drug treatments, and the method itself shows promise for future applications.


Public And Social Sector

Education And Academia

Project Overview

According to Nature, the researchers "used the data that had been generated from the previous three studies, and applied AI techniques to explore which treatment might work best for TBI in rodent models. One challenge they had to overcome was the common practice of assessing brain damage on scales, in which a number along a given scale represents a symptom or the severity of a symptom... [The] team used an artificial neural network to perform a calculation that converts these scaled measurements into simple numbers, such that it identifies the relationship between the extent of damage and all the biological factors that may determine drug response.

The calculations and modeling that the machine performed enabled the researchers to first identify what characteristics would comprise 'better'—things such as lesion size, motor ability, memory capacity and others. Once the machine calculated this score, the team was then able to measure and rank the possible combinations from the three therapies tested in the previous studies to identify which combination best stacked up to achieve the characteristics scored by the computer as 'better.'"

Reported Results

As reported in Nature, the researchers "found that a mix of an anti-inflammatory agent called minocycline and LM11A-31, a molecule that helps to support and maintain nervous tissue, was the best drug regimen to promote recovery following TBI among the rats included in the original three studies. The team also discovered that waiting until at least a week after injury to initiate physical therapy would enhance the recovery from injury. Ferguson explains that although the three preclinical studies had collected and curated the data correctly, the authors had too many potential variables to look at to make sense of the data." From the research paper: "More generally, the outcomes suggest that the analytic methods used can be applied to complex data sets to reveal significant interaction patterns that are reproducible across studies and diverse outcome measures."


According to the research paper: "Data were curated into a common database aligning common data elements for use in an unbiased data-driven workflow including unsupervised non-linear principal component analysis (NL-PCA) and subsequent hypothesis testing... we developed an unbiased, data-driven approach to capture multi-scalar outcome measures. In this method, variables across endpoints were curated and placed together into NL-PCA, in which an alternating least squares algorithm drove optimal-scaling transformations and variance-maximizing dimensionality reduction to harmonize categorical, ordinal, and numeric variables in a single non-linear framework. Next, a linear mixed model (LMM) was applied to the NL-PCA outcome (PC scores) to test multidimensional therapeutic impact, free from outcome selection bias."
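
The workflow quoted above (harmonize outcome variables measured on different scales, reduce them to principal component scores, then test treatment effects on those scores) can be illustrated with a minimal sketch. This is not the authors' code. It substitutes ordinary linear PCA on standardized variables for NL-PCA with optimal scaling, and a plain comparison of group means for the linear mixed model. The data are synthetic, and the variable names (lesion_size, motor_score, memory_score), the group labels (vehicle, drug_A, drug_B, drug_A+drug_B), and all effect sizes are hypothetical.

```python
import random
import statistics

random.seed(0)

# Hypothetical treatment groups with an assumed latent "recovery" effect.
# These numbers are invented for illustration, not taken from the study.
GROUP_EFFECT = {"vehicle": 0.0, "drug_A": 1.5, "drug_B": 1.5, "drug_A+drug_B": 3.0}
N_PER_GROUP = 25


def simulate_subject(effect):
    """Return [lesion_size, motor_score, memory_score] for one synthetic subject."""
    latent = effect + random.gauss(0, 0.5)
    lesion_size = 10.0 - 2.0 * latent + random.gauss(0, 0.5)   # numeric, lower is better
    motor = round(1.0 + latent + random.gauss(0, 0.3))
    motor_score = float(min(5, max(0, motor)))                 # ordinal scale 0-5
    memory_score = 50.0 + 5.0 * latent + random.gauss(0, 1.0)  # numeric, higher is better
    return [lesion_size, motor_score, memory_score]


groups, raw = [], []
for name, effect in GROUP_EFFECT.items():
    for _ in range(N_PER_GROUP):
        groups.append(name)
        raw.append(simulate_subject(effect))


def standardize(column):
    """Rescale a variable to mean 0 and standard deviation 1."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(v - mean) / sd for v in column]


# Put every variable on a common scale, then go back to one row per subject.
columns = [standardize(list(col)) for col in zip(*raw)]
rows = [list(r) for r in zip(*columns)]


def first_principal_component(data, n_iter=200):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    n, p = len(data), len(data[0])
    corr = [[sum(r[i] * r[j] for r in data) / n for j in range(p)] for i in range(p)]
    w = [1.0] * p
    for _ in range(n_iter):
        w = [sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
    return w


loadings = first_principal_component(rows)
# The sign of a principal component is arbitrary; orient it so that a
# higher score means better memory performance.
if loadings[2] < 0:
    loadings = [-x for x in loadings]

# PC1 score per subject: one number summarizing all outcome variables.
scores = [sum(w * x for w, x in zip(loadings, row)) for row in rows]

# Stand-in for the mixed model: compare mean PC1 score per treatment group.
group_means = {
    name: statistics.fmean(s for s, g in zip(scores, groups) if g == name)
    for name in GROUP_EFFECT
}
ranking = sorted(group_means, key=group_means.get, reverse=True)

print("PC1 loadings:", [round(x, 3) for x in loadings])
for name in ranking:
    print(f"{name:>15}: mean PC1 score = {group_means[name]:.2f}")
```

In this sketch a negative loading on lesion_size alongside positive loadings on the two performance scores means a single PC1 score captures "better" across all three outcomes at once. In the paper's workflow the PC scores are passed to a linear mixed model rather than compared by group mean, and ordinal or categorical variables are quantified by optimal scaling rather than treated as plain numbers.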


R And D

Core Research And Development


From the research paper: "In the context of traumatic brain injury (TBI), no one outcome measure is likely to reflect the full complexity and diverse nature of recovery, and the true results of therapeutic interventions after TBI likely lie in the relationship between numerous outcome variables. Indeed, in recent studies of TBI and spinal cord injury, pooled multicenter and multispecies data coupled to data-driven multidimensional analysis has revealed information with significant potential for therapeutic translation, which would not have been easily identified through univariate analysis of a single end-point."



According to the research paper, the "three studies analyzed contained an aggregate N of 202 rats, with more than 30 recorded variables per subject."
