AI Case Study

Instacart predicts availability of 200 million grocery items every 30 minutes using machine learning

Instacart has built a model that predicts the in-store availability of over 200 million grocery items and refreshes those predictions every 30 minutes. Instacart's personal shoppers fulfil customers' orders by picking the requested items at retail partners such as Aldi, Costco, Kroger, Safeway, and Wegmans. Each time a shopper looks for an item, it is recorded as found or not found, which builds up a history for that item. From an item's history and past orders, the model predicts how likely the item is to be found on the shop floor. The team says it is constantly evolving the model.

Industry

Consumer Goods And Services

Food Beverage And Drugs

Project Overview

"Instacart’s machine learning team has built tools to figure that out!
Our marketplace’s scale lets us build sophisticated prediction models. Our community of over 70,000 personal shoppers scans millions of items per day across 15,000 physical stores and delivers them to the customers. These stores belong to our grocery retail partners like Aldi, Costco, Kroger, Safeway, and Wegmans.

Every time an Instacart shopper scans an item into their cart or marks an item as “not found”, we get information that helps us make granular predictions of an item’s in-store availability. This helps us set accurate expectations for out-of-stock items and recommend appropriate replacements for items likely to be out-of-stock.

As a quick overview of how Instacart works, customers place orders online to be fulfilled from one of our grocery retail partners. A personal shopper engaged by Instacart picks items in the store and delivers them in as little as an hour. We have millions of grocery products listed on our website. Each product at a particular store is defined as an “item” and we want to know the availability of each item.

If a personal shopper cannot find an item in the store, we label the item as “not found”. A not-found item is bad for every stakeholder in our marketplace — customers don’t get what they want, retail partners lose out on revenue, shoppers spend more time searching for them, and Instacart fails to deliver the best customer experience.

Not-founds occur primarily due to two reasons:

1. Availability: Instacart doesn’t own the logistics supply chain for products listed on its platform (our retail partners do), which makes it difficult for us as a third party to know whether a store has an item at a given time. We get regular updates (typically once a day) from our retail partners on the availability of all items. But items can sell out quickly within a day. We realized that we needed more granular data throughout the day — we needed to know the real-time availability of each item.

2. Find-ability: Sometimes, due to our exhaustive product catalog, shoppers aren’t able to find every item available in the store. It could be because the items are moved to the front of the store for a seasonal promotion or items are paired to drive more sales. For example, chips are placed next to salsa instead of their usual aisle. Recommending easy-to-find items saves shoppers’ time and cuts down on replacements for customers.

Hence, to infer real-time availability and capture find-ability, we built an item availability model that constantly predicts the availability of 200 million grocery items every 30 minutes.

We currently use item availability predictions in many ways across the product. One such use case is to decide items that customers are able to order. We hide items with very low availability scores and low relevance in search. We also use these predictions to route shoppers to stores with better availability of ordered items."
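
The two uses described above (hiding unlikely items in search and routing shoppers to better-stocked stores) can be sketched in a few lines of Python. This is only an illustration: the thresholds, the field names and the functions `visible_items` and `best_store` are hypothetical, and treating item availabilities as independent is a simplifying assumption. The quoted description does not say how Instacart implements either decision.

```python
from math import prod

# Assumed cut-offs; the source gives no actual values.
AVAILABILITY_THRESHOLD = 0.1
RELEVANCE_THRESHOLD = 0.3


def visible_items(candidates):
    """Drop search candidates that are both very unlikely to be in stock
    and of low relevance to the query (one reading of the rule above)."""
    return [
        c for c in candidates
        if not (c["availability"] < AVAILABILITY_THRESHOLD
                and c["relevance"] < RELEVANCE_THRESHOLD)
    ]


def best_store(order, availability_by_store):
    """Pick the store with the highest chance that every ordered item is
    found, assuming item availabilities are independent."""
    def chance_all_found(store):
        scores = availability_by_store[store]
        # Items with no prediction for this store count as unavailable.
        return prod(scores.get(item, 0.0) for item in order)

    return max(availability_by_store, key=chance_all_found)


candidates = [
    {"name": "tortilla chips", "availability": 0.92, "relevance": 0.80},
    {"name": "seasonal salsa", "availability": 0.05, "relevance": 0.10},
    {"name": "mild salsa", "availability": 0.05, "relevance": 0.90},
]
shown = [c["name"] for c in visible_items(candidates)]

stores = {
    "store_a": {"milk": 0.9, "eggs": 0.5},
    "store_b": {"milk": 0.8, "eggs": 0.8},
}
chosen = best_store(["milk", "eggs"], stores)
```

In this toy data only the seasonal salsa is hidden, because it scores low on both availability and relevance, and the order is routed to `store_b`, where the combined chance of finding both items is higher.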

Reported Results

Results undisclosed

Technology

"As we set out to build the model, we formulated it as a classification problem where every ordered item is a training example. In order to capture an item’s availability and find-ability, the model is trained to predict if the item was found by the shopper. Making this model work is a challenging problem, both from the perspective of training a model with good performance as well as the scale at which it needs to perform."

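The classification framing in the quote above can be illustrated with a small, dependency-free Python sketch: each ordered item is one training example, labelled 1 if the shopper found it and 0 if it was marked "not found". The two features, the toy data and the use of plain logistic regression are all assumptions made for illustration; the source does not disclose Instacart's features or model type.

```python
import math

# Hypothetical training examples, one per ordered item.
# Features (assumed): [historical found rate, days since last found]
# Label: 1 = found by the shopper, 0 = marked "not found".
EXAMPLES = [
    ([0.95, 0.1], 1),
    ([0.90, 0.2], 1),
    ([0.85, 0.5], 1),
    ([0.40, 2.0], 0),
    ([0.30, 3.0], 0),
    ([0.20, 1.5], 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_found_probability(weights, bias, features):
    """Estimated probability that the shopper finds the item."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(examples, epochs=2000, learning_rate=0.5):
    """Fit logistic regression with batch gradient descent on log loss."""
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in examples:
            error = predict_found_probability(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [
            w - learning_rate * g / len(examples)
            for w, g in zip(weights, grad_w)
        ]
        bias -= learning_rate * grad_b / len(examples)
    return weights, bias


weights, bias = train(EXAMPLES)
likely = predict_found_probability(weights, bias, [0.95, 0.1])
unlikely = predict_found_probability(weights, bias, [0.20, 3.0])
```

After training, `likely` is above 0.5 and `unlikely` is below it, so the output can be read as an availability score for an item at a store. At the scale described in the quote, the same idea would need far richer features and a scoring pipeline able to refresh hundreds of millions of predictions every 30 minutes.
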
Function

Supply Chain

Logistics

Background

"Ever wished there was a way to know if your favorite Ben and Jerry’s ice cream flavor is currently available in a grocery store near you?"

Benefits

Data

Ordered items labelled by shoppers as found or not found