AI Case Study
Flipkart attempts to resolve address problem in India using machine learning
Flipkart, India's leading e-commerce company, is trying to fix the 'address' problem facing many developing nations. Addresses are not standardised: the same place is spelled in different ways, and the number of lines varies. Flipkart is using machine learning to standardise addresses to
Consumer Goods And Services
"Flipkart, one of the biggest e-commerce companies in India, is turning to Machine Learning and Artificial Intelligence to solve the complicated address puzzle.
We spoke to senior Flipkart data scientist Ravindra Babu on how the company uses new age technologies to build an ‘address intelligence’.
The Address Conundrum:
One of the basic problems Flipkart faces is variation in how the same place name is spelled.
“There are many problems that are native to India. Address is one of them. In a given household, say there are six members in a family. Each of them will write their address in a different way. There is no uniformity, although the house members are literate. Say Marathahalli in Bangalore can be written in two ways: Maratha Halli or Marathahalli. So there are these kinds of challenges,” he said.
“So, if you want to involve a model and you expect a machine to understand these complications, that’s where the challenge lies. With any AI problem we come across, the first thing we do is go through the data and get a feel for it, rather than thinking of a model right away. We also consider how much pre-processing is required and how much variability the data entails.”
“On average, 9-10 words are sufficient for a shipment to reach a particular place (example: person’s name, TCS, near ITPL Bangalore). However, some people include details like cubicle number, extension number, and directions to reach the place from the nearest bus stop. We don’t restrict users while specifying addresses because we don’t want to give a bad experience. This can make an address almost 200 words long. These kinds of addresses will be difficult for a machine to understand.”
Flipkart is also working to get the addresses marked on the map. It has partnered with map service providers like MapMyIndia for geo-tagging locations. But the challenge lies in building a solution that ties address data to geo-location information while also coping with misspellings.
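To make the spelling problem concrete, here is a minimal sketch of how variants such as "Maratha Halli" and "Marathahalli" could be mapped to one canonical name. It is not Flipkart's method, which the article does not describe; the locality list, the function names and the 0.8 similarity cutoff are all assumptions chosen for illustration, and the matching uses Python's standard `difflib` module.

```python
import difflib
import re

# Hypothetical reference list of canonical locality names (not Flipkart data).
KNOWN_LOCALITIES = ["marathahalli", "whitefield", "koramangala"]


def normalise(text: str) -> str:
    """Lowercase the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def match_locality(raw: str, cutoff: float = 0.8):
    """Return the closest known locality, or None if nothing is similar enough.

    The cutoff is an assumed value; a real system would tune it on data.
    """
    key = normalise(raw)
    hits = difflib.get_close_matches(key, KNOWN_LOCALITIES, n=1, cutoff=cutoff)
    return hits[0] if hits else None


for variant in ["Maratha Halli", "Marathahalli", "Marathalli"]:
    print(variant, "->", match_locality(variant))
```

Stripping spaces and punctuation handles the "one word or two" variation directly, while the similarity cutoff absorbs small misspellings. A production system would presumably need a much larger gazetteer and a learned similarity measure.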
How the system works:
Ravindra Babu observed that without a model in place, delivery hub staff would have to go through each shipment manually and assign it to field executives, which makes the process tedious and time-consuming.
Flipkart first tries to understand how the addresses are given. The idea is to make sense of these addresses without asking the customers to type the addresses differently. The company uses a learning model to identify whether an address is 'monkey typed', that is, made up of randomly typed alphanumeric characters.
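The article does not say what kind of model detects monkey-typed addresses. As a rough illustration only, the sketch below learns the set of character bigrams that occur in a small sample of plausible addresses and flags a new address when too few of its bigrams have been seen before. The sample addresses, function names and 0.5 threshold are assumptions, and this simple bigram check stands in for whatever learning model Flipkart actually uses.

```python
# Hypothetical training sample of plausible addresses (not Flipkart data).
SAMPLE_ADDRESSES = [
    "12 main road marathahalli bangalore",
    "flat 4 lake view apartments whitefield bangalore",
    "near itpl whitefield main road bangalore",
]


def char_bigrams(text: str):
    """Return character bigrams taken within each word, ignoring punctuation."""
    out = []
    for word in text.lower().split():
        word = "".join(ch for ch in word if ch.isalnum())
        out.extend(word[i:i + 2] for i in range(len(word) - 1))
    return out


def train(addresses):
    """'Learn' the set of bigrams that appear in the sample addresses."""
    return {bg for addr in addresses for bg in char_bigrams(addr)}


def seen_fraction(address: str, known) -> float:
    """Fraction of the address's bigrams that were seen during training."""
    bgs = char_bigrams(address)
    if not bgs:
        return 0.0
    return sum(bg in known for bg in bgs) / len(bgs)


def looks_monkey_typed(address: str, known, threshold: float = 0.5) -> bool:
    """Flag the address when too few of its bigrams look familiar."""
    return seen_fraction(address, known) < threshold


known = train(SAMPLE_ADDRESSES)
print(looks_monkey_typed("asdfgh qwerty 12345", known))
print(looks_monkey_typed("lake view road whitefield", known))
```

With only three training addresses the check is far too strict for real use; it is meant to show the shape of the idea, which is to score how address-like a string is rather than to parse it.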
The process also involves leveraging the expertise of field executives who have a better understanding of local addresses.
Before a model is deployed, the data is corrected, validated and monitored. This ensures that the data is consistent, since different executives may have labelled the same data differently.
With a solution in place, the technology automatically provides a sub-area name. This makes the entire process faster and simpler.
Babu said the team has also improved geo-tagging, but other challenges remain. Some people write the wrong pin code, and in some cases several areas share the same pin code. The scientists use a separate model to solve this problem. The idea is to have the machine suggest the right pin code for the user's area.
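The pin code suggestion described above could, in its simplest form, be a lookup against past deliveries. The sketch below is an assumption-laden illustration rather than the separate model the article mentions: the history records, locality names and pin codes are made up, and the rule is a plain majority vote per locality.

```python
from collections import Counter, defaultdict

# Hypothetical delivery history as (locality, pin code) pairs. All values are made up.
HISTORY = [
    ("area a", "000001"),
    ("area a", "000001"),
    ("area a", "000002"),  # assumed to be a mistyped entry
    ("area b", "000002"),
]


def build_index(history):
    """Count how often each pin code has been used for each locality."""
    index = defaultdict(Counter)
    for locality, pin in history:
        index[locality][pin] += 1
    return index


def suggest_pin(locality: str, entered_pin: str, index):
    """Return the most common pin for the locality if it differs from the
    entered one; return None when there is nothing to suggest."""
    counts = index.get(locality)
    if not counts:
        return None  # unknown locality: no basis for a suggestion
    best, _ = counts.most_common(1)[0]
    return best if best != entered_pin else None


index = build_index(HISTORY)
print(suggest_pin("area a", "000002", index))
```

A majority vote only works when the locality itself has been identified correctly, so in practice a step like this would sit behind the spelling and sub-area handling described earlier.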
"India’s complex address system has continued to be a big riddle for technology companies. Earlier this year, Google launched Plus Codes to make it easier for users to find and share their addresses. While Google is still working on that, the address problem is far more severe for e-commerce companies.
These companies grapple with a wide variety of fraudulent activity, where some users manipulate addresses to claim discounts or commit fraud. This has been a major pain point for e-commerce companies, which bank on last-mile logistics and ultimately bleed revenue.
But fraud is just one aspect of the complicated address challenge. For instance, in many developed countries, latitude and longitude information of each address is also available. In many developing countries such as India, such accuracy in information is not available or is only partial."
Proof of concept; results not yet available
"Flipkart uses both conventional machine learning models and deep learning solutions for various purposes, including fixing the address problem"
Text, Address data, Geo location