AI Case Study

DXC Technology automates triage of support tickets using machine learning

DXC Technology, a global IT services company, has used Amazon Web Services (AWS) to develop a knowledge management (KM) article prediction mechanism that decreases incident resolution time. By applying machine learning to that process, the company aims to increase the efficiency of IT support. With the KM article prediction mechanism, the company achieved both greater efficiency in supporting its clients and better support for its own team, which uses the program to solve internal issues rather than clients' issues.

Industry

Technology

Software and IT Services

Project Overview

"DXC uses machine learning on AWS to automatically identify a KM article, which in turn can be automated with the orchestration runbook for ticket resolution to make IT support more efficient.

First: Build a data lake on Amazon S3
DXC customers submit incident tickets to IT Service Management (ITSM) tools. Tickets can be user generated or machine generated. The ticket data is then pushed to, or pulled into, Amazon S3 buckets. Amazon S3 provides low-cost, highly durable object storage that can hold data in any format.
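
As a minimal sketch of this ingestion step, assuming each ticket arrives as a Python dictionary and is written to S3 as one JSON object: the key layout, the "id" field, and the `store_ticket` helper are hypothetical, and `s3_client` is expected to be a boto3 S3 client (only its `put_object` call is used).

```python
import json
from datetime import datetime, timezone


def ticket_key(ticket_id: str, received_at: datetime) -> str:
    """Build a date-partitioned object key (hypothetical layout)."""
    return f"tickets/{received_at:%Y/%m/%d}/{ticket_id}.json"


def store_ticket(s3_client, bucket: str, ticket: dict) -> str:
    """Write one ticket to S3 as JSON and return the object key.

    s3_client is assumed to be boto3.client("s3"); the "id" field is an
    assumed ticket attribute, not a documented ITSM schema.
    """
    key = ticket_key(ticket["id"], datetime.now(timezone.utc))
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(ticket).encode("utf-8"),
        ContentType="application/json",
    )
    return key
```

Partitioning keys by date is one common choice because it lets a monthly training run read only the prefixes it needs; the source does not say how DXC lays out its buckets.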

Second: Choose the right machine learning tool and algorithm
Typically, the problem is how to classify text. AWS offers customers several options for text classification. DXC evaluated the following AWS services:

Amazon SageMaker with its built-in algorithm called BlazingText.
Amazon Comprehend custom classification.
The Amazon Comprehend custom classification API was a good choice because it is built from the ground up for text classification. With Amazon Comprehend, we didn’t have to pick an algorithm, tune it, and re-train our model looking for the highest accuracy; the API did this automatically. We plan to re-evaluate it when it supports synchronous calls (today it provides only batch-mode classification).

Amazon SageMaker BlazingText implements the fastText algorithm and keeps the right balance between scalability and accuracy.

Third: Train the model
Training data preparations:
Training the model is the most important part of the ML process. Training of supervised models requires labeled data. The DXC team wanted to label a significant amount of historical data for this purpose. In the pre-processing step, the text data was tokenized using NLTK (Python library) and stored in CSV format in Amazon S3 for the training. The training is done once a month with the historical data.
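
The pre-processing step can be sketched as follows. This is an illustration under stated assumptions, not DXC's code: a regular-expression tokenizer stands in for NLTK so the example runs without extra downloads, the KM article ID `KM0042` and the function names are hypothetical, and the output uses the `__label__` prefix line format that BlazingText documents for supervised training. The source does not show how its CSV files are laid out.

```python
import re
from typing import List

# Matches runs of word characters, or any single punctuation character.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word and punctuation tokens.

    Stand-in for an NLTK tokenizer such as nltk.word_tokenize.
    """
    return TOKEN_PATTERN.findall(text.lower())


def to_training_line(km_article_id: str, description: str) -> str:
    """Format one labeled ticket as '__label__<id> token token ...'."""
    return f"__label__{km_article_id} " + " ".join(tokenize(description))


print(to_training_line("KM0042", "Cannot connect to VPN."))
# prints: __label__KM0042 cannot connect to vpn .
```

Whatever tokenizer is used for training should also be applied to ticket text at inference time, so the model sees the same token boundaries in both places.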

The tokenized training data is used as input to the training job.

Training job with hyperparameter optimization (HPO)

We use the automatic model tuning feature of Amazon SageMaker to automate and accelerate the search of hyperparameters for the BlazingText algorithm.

Initially, we set static hyperparameters that we don’t need to change across training jobs, and we also define ranges for the hyperparameters that need optimizations.
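
The split between static and tuned hyperparameters can be sketched as plain dictionaries in the shape that boto3's `create_hyper_parameter_tuning_job` request uses. The hyperparameter names are BlazingText's own, but every value, bound, and resource limit below is illustrative; the source does not state which settings DXC chose.

```python
# Hyperparameters held fixed across every training job (illustrative values).
static_hyperparameters = {
    "mode": "supervised",  # BlazingText text-classification mode
    "early_stopping": "True",
    "patience": "4",
    "min_epochs": "5",
}

# Ranges searched by automatic model tuning (illustrative bounds).
# The API expects MinValue and MaxValue as strings.
parameter_ranges = {
    "ContinuousParameterRanges": [
        {"Name": "learning_rate", "MinValue": "0.005", "MaxValue": "0.1"},
    ],
    "IntegerParameterRanges": [
        {"Name": "epochs", "MinValue": "5", "MaxValue": "15"},
        {"Name": "word_ngrams", "MinValue": "1", "MaxValue": "3"},
        {"Name": "vector_dim", "MinValue": "32", "MaxValue": "300"},
    ],
}

tuning_job_config = {
    "Strategy": "Bayesian",
    "HyperParameterTuningJobObjective": {
        "Type": "Maximize",
        "MetricName": "validation:accuracy",
    },
    "ResourceLimits": {
        "MaxNumberOfTrainingJobs": 10,
        "MaxParallelTrainingJobs": 2,
    },
    "ParameterRanges": parameter_ranges,
}
```

In a real request, `tuning_job_config` would be passed as `HyperParameterTuningJobConfig`, and `static_hyperparameters` as `StaticHyperParameters` inside `TrainingJobDefinition`. A hyperparameter should appear in only one of the two places.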

Fourth: Orchestrate data preparation, model training, and model deployment on Amazon SageMaker using AWS Step Functions
We orchestrated this ML workflow using AWS Step Functions and scheduled it using an Amazon CloudWatch Events rule.

AWS Step Functions performs the following steps:
It checks that the Amazon S3 bucket exists where input data for training is present.
It pre-processes the data set for model training.
It starts the training job in Amazon SageMaker with the required parameters.
It keeps checking the status of the training job.
After the training is successful, it validates the model.
After the model is validated, it deploys the model as Amazon SageMaker endpoints. (If the model endpoint exists, then it updates the model endpoint.)
All of these steps are developed as AWS Lambda functions.
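
The steps above can be sketched as an Amazon States Language definition, built here as a Python dictionary. The state names, Lambda function names, ARN placeholders, 60-second polling interval, and the assumption that the status-checking function returns a `status` field are all hypothetical; only the sequence of steps comes from the description above.

```python
import json

# Placeholder ARN prefix; real function ARNs are deployment-specific.
LAMBDA = "arn:aws:lambda:REGION:ACCOUNT_ID:function:"


def task(function_name, next_state):
    """Return a Task state that runs one Lambda function, then moves on."""
    return {"Type": "Task", "Resource": LAMBDA + function_name, "Next": next_state}


state_machine = {
    "StartAt": "CheckInputBucket",
    "States": {
        "CheckInputBucket": task("check-input-bucket", "PreprocessData"),
        "PreprocessData": task("preprocess-data", "StartTrainingJob"),
        "StartTrainingJob": task("start-training-job", "WaitForTraining"),
        # Polling loop: wait, check the status, then branch on the result.
        "WaitForTraining": {"Type": "Wait", "Seconds": 60, "Next": "CheckTrainingStatus"},
        "CheckTrainingStatus": task("check-training-status", "TrainingFinished"),
        "TrainingFinished": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.status", "StringEquals": "Completed", "Next": "ValidateModel"},
                {"Variable": "$.status", "StringEquals": "InProgress", "Next": "WaitForTraining"},
            ],
            "Default": "TrainingFailed",
        },
        "ValidateModel": task("validate-model", "DeployEndpoint"),
        "DeployEndpoint": {"Type": "Task", "Resource": LAMBDA + "deploy-endpoint", "End": True},
        "TrainingFailed": {"Type": "Fail", "Error": "TrainingJobFailed"},
    },
}

definition_json = json.dumps(state_machine, indent=2)
```

The Wait and Choice states implement "keeps checking the status" without holding a Lambda function open for the duration of the training job.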

Note: During AWS re:Invent 2018, a new feature was released that allowed Step Functions to be directly integrated with Amazon SageMaker. This feature can be used to develop some of the steps described earlier without writing Lambda functions. However, the feature was not available when DXC developed this solution.

Fifth: Call the inference
As soon as new ITSM tickets are ingested into an Amazon S3 bucket, an AWS Lambda function is triggered to call the inference using Amazon SageMaker endpoints.

Sixth: Build a CI/CD pipeline to automate the solution deployment
DXC developed a CI/CD pipeline using Ansible, Jenkins, and AWS CloudFormation templates to automate the deployment of the whole solution.

Seventh: Enable it for the support team
After the predictions are generated, they can be accessed through API endpoints by Incident Identifier or by Incident Description. Incident Descriptions are better suited to real-time resolution of issues, and a ticket may not even need to be created: checking the description of an issue against the Amazon SageMaker endpoint returns a KM article identifier that can be referred to offline, which might lead to the resolution of the issue.

In the case where a ticket has been created, a Service Desk Agent can use a chatbot that calls the API, or use the API directly, by providing the Incident Identifier. The output for the Incident Identifier is a KM article identifier, which can be quickly referred to offline for incident resolution, reducing the incident resolution time.
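
The two access paths can be sketched as a single lookup function. Everything here is an assumption made for illustration: the query shape, the function name, the `stored_predictions` mapping (standing in for wherever predictions are persisted), and the `classify` callable (standing in for a call to the model endpoint).

```python
def lookup_km_article(query, stored_predictions, classify):
    """Resolve a KM article ID from an incident ID or a free-text description.

    stored_predictions: mapping of incident ID -> KM article ID.
    classify: callable that takes a description and returns a KM article ID.
    Returns None when an incident ID has no stored prediction.
    """
    if "incident_id" in query:
        # A ticket exists: return the prediction generated when it was ingested.
        return stored_predictions.get(query["incident_id"])
    if "description" in query:
        # No ticket yet: classify the description in real time.
        return classify(query["description"])
    raise ValueError("query needs 'incident_id' or 'description'")
```

A chatbot or self-service page would call the description path, while a Service Desk Agent working an open ticket would call the identifier path.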

Further integration with runbook automation will result in the automation of ticket resolution with little or no human effort."

Reported Results

"To summarize, the KM article prediction mechanism realized the following benefits:

1. Improved the support team’s efficiency. The support team can almost instantly know which KM article to consult to solve the ticket.
2. This prediction mechanism also can be used as a self-service tool where users can enter ticket descriptions and get back the KM article to solve their own issue. This will also reduce the number of tickets.
3. Integration of this mechanism with runbook automation will help automate resolution of tickets too."

Background

"DXC Technology is a global IT services leader providing end-to-end services on Digital Transformation to businesses and governments. It also provides service management to its clients on premises and in the cloud. The incident tickets raised as part of the process need to be resolved quickly to meet its service level agreements (SLAs). DXC has goals to reduce human effort, reduce incident resolution time, enhance knowledge management, and enhance consistency of incident resolution. With these goals in mind, DXC developed a knowledge management (KM) article prediction mechanism."
