AI Case Study

Moody's Analytics finds that AI models improve loan risk prediction over its proprietary RiskCalc model

Moody's Analytics compared its RiskCalc model with three AI models to determine which is better at predicting the probability that a private firm will default on its loans. The AI models (a neural network, a random forest ensemble, and boosting) outperform the RiskCalc model by 2 to 3 percentage points, but their outcomes are harder to explain.


Financial Services

Investment Banking And Investment Services

Project Overview

"The RiskCalc model delivers robust performance in predicting private firm defaults. But how does it compare to other machine learning techniques? We use the three popular machine learning methods to develop new models using the RiskCalc sample as a training set. We seek to answer the following questions: Do the machine learning models outperform the RiskCalc model’s GAM framework in default prediction? What are the challenges we face when using the machine learning methods for credit risk modeling? Which model is most robust? Which model is easiest to use? And what can we learn from the alternative models?"

Reported Results

Moody's found that "the machine learning models outperform the RiskCalc (GAM) model by 2 to 3 percentage points for both datasets. The accuracy ratio improves by 8 to 10 percentage points when we add loan behavioral information, regardless of the modeling approach. Credit line usage and loan payment information complement financial ratios and significantly enhance the models’ ability to predict defaults... the machine learning approaches are better equipped to capture the non-linear relationships common to credit risk. At the same time, the predictions made by the approaches are sometimes difficult to explain due to their complex “black box” nature. These machine learning models are also sensitive to outliers, resulting in an overfitting of the data and counterintuitive predictions".
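The accuracy ratio cited above is a rank-based measure of how well a model separates defaulters from non-defaulters. The sketch below assumes the common definition AR = 2 × AUC − 1, which the excerpt itself does not spell out. The function names and the toy scores are hypothetical and are not taken from the Moody's study.

```python
def auc(scores, defaults):
    """Chance that a random defaulter is scored riskier than a random
    non-defaulter, counting ties as half. Assumes higher score = riskier."""
    pos = [s for s, d in zip(scores, defaults) if d == 1]
    neg = [s for s, d in zip(scores, defaults) if d == 0]
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_ratio(scores, defaults):
    """Accuracy ratio under the assumed definition AR = 2 * AUC - 1."""
    return 2.0 * auc(scores, defaults) - 1.0


# Toy data (hypothetical): predicted default probabilities and outcomes.
toy_scores = [0.9, 0.8, 0.3, 0.2, 0.1]
toy_defaults = [1, 0, 1, 0, 0]
toy_ar = accuracy_ratio(toy_scores, toy_defaults)
```

Under this definition, a model that ranks every defaulter above every non-defaulter scores 1, and a model that ranks at random scores about 0. A gain of 2 to 3 percentage points is therefore a shift of 0.02 to 0.03 on this scale.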


The three AI models compared with Moody's RiskCalc model are neural networks, random forests, and boosting.





"When a business applies for a loan, the lender must evaluate whether the business can reliably repay the loan principal and interest. Lenders commonly use measures of profitability and leverage to assess credit risk. A profitable firm generates enough cash to cover interest expense and principal due. However, a more-leveraged firm has less equity available to weather economic shocks. Given two loan applicants – one with high profitability and high leverage, and the other with low profitability and low leverage – which firm has lower credit risk? The complexity of answering this question multiplies when banks incorporate the many other dimensions they examine during credit risk assessment. These additional dimensions typically include other financial information such as liquidity ratio, or behavioral information such as loan/trade credit payment behavior. Summarizing all of these various dimensions into one score is challenging, but machine learning techniques help achieve this goal.

Moody’s Analytics RiskCalc model produces expected default probabilities for private firms by estimating the impact of a set of risk drivers. It utilizes a generalized additive model (GAM) framework, in which non-linear transformations of each risk driver are assigned weights and combined into a single score. A link function then maps the combined score to a probability of default."
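The GAM structure described above (transform each risk driver, weight the results, sum them, then apply a link function) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two drivers, the shapes of the transformations, the weights, the intercept, and the choice of a logistic link. None of it is taken from RiskCalc.

```python
import math


def logistic(z):
    """Assumed link function: maps a real-valued score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical non-linear transformations, one per risk driver.
def transform_profitability(roa):
    # Higher profitability lowers risk, with a saturating effect.
    return -math.tanh(5.0 * roa)


def transform_leverage(leverage):
    # Risk rises faster as leverage increases.
    return leverage ** 2


TRANSFORMS = {
    "profitability": transform_profitability,
    "leverage": transform_leverage,
}
WEIGHTS = {"profitability": 1.0, "leverage": 2.0}  # illustrative values only
INTERCEPT = -3.0                                   # illustrative value only


def default_probability(drivers):
    """Combine the weighted, transformed drivers into one score,
    then map the score to a probability with the link function."""
    score = INTERCEPT + sum(
        WEIGHTS[name] * TRANSFORMS[name](value)
        for name, value in drivers.items()
    )
    return logistic(score)


# The two applicants from the passage: high profit and high leverage
# versus low profit and low leverage (made-up values).
firm_a = default_probability({"profitability": 0.15, "leverage": 0.80})
firm_b = default_probability({"profitability": 0.02, "leverage": 0.20})
```

Because each driver enters through its own transformation, its contribution to the score can be inspected in isolation. This is one reason additive models of this kind tend to be easier to explain than the "black box" methods they are compared with.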



Two datasets were used: "The first dataset comes from the Moody’s Analytics Credit Research Database (CRD) which is also the validation sample for the RiskCalc US 4.0 corporate model. It utilizes only firm information and financial ratios. The second dataset adds behavioral information, which includes credit line usage, loan payment behavior, and other loan type data. This information comes from the loan accounting system (LAS), collected as part of the CRD".