top of page

AI Case Study

PayPal improves customer churn and retention metrics with machine learning

PayPal has leveraged H2O’s machine learning technology for identifying customer churn. With H2O's system, modelling customer data that previously required 6 to 72 hours only required 5 to 10 minutes. As a result, the company benefited from insightful analytics that helped operational teams identify when and why a customer will churn to improve marketing campaigns.


Financial Services


Project Overview

"Paypal needed to redefine the metrics the company associated with churn, so the executive team could run KPIs and better understand the long-term health of the business. In addition, the operational teams, including marketing and product teams, needed more accurate, actionable information to help them run campaigns to retain or reactivate customers. To get the information each internal stakeholder needed quickly, Paypal’s Senior Data Scientist, Julian Bharadwaj, and his team began developing a predictive model to show when a customer would churn, or not. During this process, the team jumped straight into using random forest and GBM with H2O, running through R.

As the models were developed, Bharadwaj looked at transaction and behavioral variables as well as demographic data for customers who had churned. The first two turned out to be critical indicators of churn; the latter was not very useful, so the team dropped it. Using H2O made it fast and relatively easy to do that: the models could be modified across multiple parameters and run multiple times very quickly, so Bharadwaj could ensure the validity of the output.

Now in production, Paypal uses H2O on Hadoop to run a predictive modeling factory – large-scale, rapid modeling – that helps Paypal run more sophisticated and e ective marketing programs that reduce churn.

Immediately, Paypal began seeing great results. Initially, modeling on a laptop or a really big virtual machine took a long time when using R and ODBC on hundreds of thousands of rows of customer data. In fact, it wasn’t unusual for modeling to take around 6 hours for a subset of customers, and scoring on the entire customer base would take close to 72 hours. When Bharadwaj switched to using H2O on Hadoop, those times diminished to 10 minutes and 5 minutes respectively to train and score on Paypal’s entire customer base.

“When we started, it was a long, drawn-out process of testing,” said Bharadwaj. “We used Python, then moved over to R as a platform, and then realized that for the volume of data we had, and for the complexity of the models we were fitting, those solutions took a long time. We had to gure out ways to do it quickly, and that’s when we started exploring H2O.” Bharadwaj continued, “What took me 6-7 hours, now took me less than 30 minutes on just development hardware”."

Reported Results

"The company claims the following results:

* Improved churn metrics and accuracy of information delivered to both executive and operational teams
* Increased speed at which models could be run, giving teams immediately actionable data
* Created more sophisticated and e ective programs to reduce churn built around the output of the H2O machine learning algorithms"




General Operations


"Paypal offers a global payments platform in 203 markets. The company boasts 173 million active customer accounts which resulted in 4 billion payments processed in 2014. For Paypal, consumer churn can have a big impact on its bottom line. Previously, the company looked at the problem in speci c increments of time, noting that a customer who hadn’t used its platform in that time period must have churned. Paypal would run a report that showed the churn date for all the customers that fell into this category as the date the report was run. The report also showed which features were the last ones to be used for all customers who churned during that time period. While this information was useful, it wasn’t fully accurate, and, as such, the timeliness and e ectiveness of Paypal’s marketing e orts to win-back customers was less than ideal.

In reality, a customer’s churn date needed to be closer to when they last interacted with the Paypal platform, not simply when a churn report is run. With machine learning, the data scientists at Paypal could predict if a customer will stay with the platform or if that customer will churn and when.

Additionally, because different customer segments may have di erent reactions to the platform features that caused them to churn, using machine learning would enable the scientists to get more speci c feature importance results by customer rather than an aggregate."


Revenue - Churn risk reduction,Revenue - Customer retention


customer data

bottom of page