AI Case Study
Lyft delivered a relative increase of over 40% in fraudulent users detected, at the same precision, by using neural networks
Lyft has increased the number of users it flags as potentially fraudulent by implementing a convolutional-recurrent neural network architecture. The new model improves on its predecessors by learning behaviour patterns that may indicate fraud directly from streams of user activity.
"At Lyft, fraud decision-making is split between business rules handcrafted by analysts and machine learning models developed by research scientists. These business rules and machine learning models form the backbone of our detection system that trigger pre-authorizations and identity challenges targeted at blocking fraudsters. To improve our fraud decisioning, we started focusing on more modern, powerful modeling methods in the past year. More recently, we’ve shifted our attention to neural networks that were even more powerful and could gracefully work with far richer streaming data sources. In the end, we settled on a neural network that maps user actions to feature embeddings and a convolutional-recurrent architecture with an attention mechanism."
"At its core, any counter-fraud measure worth its salt is designed to irrevocably drive the fraudster’s operational costs up to the point where the fraud vector becomes economically unsustainable... To be effective against an adaptive adversary, it’s important to develop robust features that look at things fraudsters find hard to control or change."
"On a per-model basis, we found that the addition of the behavior fingerprinting module to our production structured features-only neural network produced a relative lift in recall of over 40% for the same precision with respect to all fraudulent users." In other words, the share of fraudulent users the model catches (recall) rose by over 40% in relative terms, while the share of flagged users who are actually fraudulent (precision) stayed the same.
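To make the recall and precision figures concrete, here is a small worked example. All counts are hypothetical and chosen only so that the arithmetic is easy to follow; they are not Lyft data.

```python
def precision(true_positives, false_positives):
    """Share of flagged users who are actually fraudulent."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives, false_negatives):
    """Share of all fraudulent users that the model flags."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical population: 1,000 fraudulent users in total.
total_fraud = 1000

# Hypothetical baseline model: flags 625 users, 500 of them truly fraudulent.
base_tp, base_fp = 500, 125
# Hypothetical improved model: flags 875 users, 700 of them truly fraudulent.
new_tp, new_fp = 700, 175

base_precision = precision(base_tp, base_fp)
new_precision = precision(new_tp, new_fp)

base_recall = recall(base_tp, total_fraud - base_tp)
new_recall = recall(new_tp, total_fraud - new_tp)

# Relative lift compares the gain in recall to the baseline recall.
relative_lift = (new_recall - base_recall) / base_recall

print(f"precision: {base_precision:.2f} -> {new_precision:.2f}")
print(f"recall:    {base_recall:.2f} -> {new_recall:.2f}")
print(f"relative lift in recall: {relative_lift:.0%}")
```

In this illustration precision is unchanged, so the absolute number of false positives grows in proportion to the number of users flagged. Equal precision means the false-alarm share among flagged users is constant, not that the count of false alarms is.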
"Our behavior fingerprinting neural network is implemented as a stack of the embedding layer, ConvNet, and RNN in that order on Tensorflow through the Keras interface. We concatenate the RNN’s output with the structured features and pass it through fully-connected layers that returns a softmax multi-class output that determines the probability assigned to each possible fraud user segment.
The 1D convolutional network (ConvNet) forms the second component of our neural network. True to its namesake, the 1D ConvNet is built around the idea of 1D convolutions where trainable filters are convolved with the sequence of embedded user actions. Convolutional filters with learnable parameters help extract similar local features across multiple locations and encode subsequences of user actions that together form more meaningful local interactions. For instance, a single ride request action doesn’t mean much by itself. But when considered together with repeated ride cancellations and requests that precede it, the subsequence of repetitive user actions paints a much more suspicious picture of the user. Intuitively, when the inputs from the preceding ConvNet are passed into the RNN in our neural network, it determines how much of the information about user action subsequences should be retained for future consideration. It allows us to efficiently encode a temporal relation between earlier subsequence embeddings with later ones."
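The stack described in the quote above (embedding, 1D ConvNet, RNN, concatenation with structured features, fully-connected layers, softmax) can be sketched as a forward pass in plain Python. This is an illustrative sketch under stated assumptions, not Lyft's Keras/TensorFlow code: every dimension, weight, and function name is hypothetical, weights are random rather than trained, biases and the attention mechanism are omitted, and a plain tanh recurrence stands in for whatever RNN cell the production model uses.

```python
import math
import random

random.seed(0)  # reproducible random weights for the illustration


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def conv1d(sequence, filters, width):
    """Slide each filter over windows of `width` consecutive embedded actions."""
    outputs = []
    for start in range(len(sequence) - width + 1):
        # Flatten the window of embedding vectors into one vector.
        window = [x for vec in sequence[start:start + width] for x in vec]
        # ReLU over each filter's dot product with the window.
        outputs.append([max(0.0, v) for v in matvec(filters, window)])
    return outputs


def simple_rnn(sequence, w_in, w_rec):
    """Carry a hidden state across the sequence; return the final state."""
    hidden = [0.0] * len(w_rec)
    for x in sequence:
        from_input = matvec(w_in, x)
        from_state = matvec(w_rec, hidden)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
    return hidden


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(action_ids, structured, p):
    embedded = [p["embedding"][a] for a in action_ids]       # embedding layer
    local = conv1d(embedded, p["filters"], p["width"])       # 1D ConvNet
    summary = simple_rnn(local, p["w_in"], p["w_rec"])       # RNN
    combined = summary + structured                          # concatenation
    dense = [max(0.0, v) for v in matvec(p["w_dense"], combined)]
    return softmax(matvec(p["w_out"], dense))                # segment probabilities


# Hypothetical sizes: 5 action types, 4-d embeddings, 6 filters of width 3,
# 5 RNN units, 2 structured features, 8 dense units, 3 fraud segments.
params = {
    "embedding": rand_matrix(5, 4),
    "filters": rand_matrix(6, 3 * 4),
    "width": 3,
    "w_in": rand_matrix(5, 6),
    "w_rec": rand_matrix(5, 5),
    "w_dense": rand_matrix(8, 5 + 2),
    "w_out": rand_matrix(3, 8),
}

probs = forward([0, 1, 2, 1, 2, 3, 4], [0.3, 1.0], params)
print(probs)
```

Each convolution window covers three consecutive actions, which is how a subsequence such as request, cancel, request can be scored as a unit rather than as three unrelated events.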
Activity logs from Lyft users: "the activity log is a temporally ordered sequence of user actions taken on our app along with their various metadata. These user actions range from ride request button presses to map-magnifying screen pinching. The action metadata include the duration of action, the time elapsed since the previous action, and the force applied on the phone screen by the user."
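One way to picture such an activity log is as an ordered list of records, each carrying the metadata named in the quote. The sketch below is an assumption for illustration: the class name, field names, units, action names, and ID mapping are all hypothetical and do not come from Lyft.

```python
from dataclasses import dataclass


@dataclass
class UserAction:
    name: str                # hypothetical action label, e.g. "ride_request"
    duration_s: float        # duration of the action, in seconds (assumed unit)
    since_previous_s: float  # time elapsed since the previous action
    touch_force: float       # force applied on the screen (assumed 0 to 1 scale)


def encode_actions(log, vocabulary):
    """Map each action name to an integer ID, using 0 for unseen actions."""
    return [vocabulary.get(action.name, 0) for action in log]


# A temporally ordered, made-up log: repeated requests and cancellations.
log = [
    UserAction("ride_request", 0.2, 0.0, 0.6),
    UserAction("ride_cancel", 0.1, 4.5, 0.7),
    UserAction("ride_request", 0.2, 1.2, 0.6),
    UserAction("map_pinch", 0.8, 3.0, 0.3),
]

vocabulary = {"ride_request": 1, "ride_cancel": 2, "map_pinch": 3}
action_ids = encode_actions(log, vocabulary)
print(action_ids)
```

Integer IDs of this kind are the usual input to an embedding layer, which looks up one learned vector per action type.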