AI Case Study
Ohio State University researchers using NLP to identify security flaws on Twitter
Researchers from Ohio State University, Leidos, and FireEye LLC train a convolutional neural network (CNN) to identify tweets referring to software security vulnerabilities and gauge their severity.
Internet Services Consumer
The research is "the first study of whether natural language processing techniques can be used to analyze users’ opinions about the severity of software vulnerabilities reported online." A threat described in a tweet is deemed severe according to three criteria: "(1) does the author believe that their followers should be worried about the threat? (2) is the vulnerability easily exploitable? and (3) could the threat affect a large number of users? If one or more of these criteria are met, then we consider the threat to be severe."
"Software vulnerabilities are flaws in computer systems that leave users open to attack; vulnerabilities are generally unknown at the time a piece of software is first published, but are gradually identified over time. As new vulnerabilities are discovered and verified they are assigned CVE numbers (unique identifiers), and entered into the National Vulnerability Database (NVD). To help prioritize response efforts, vulnerabilities in the NVD are assigned severity scores using the Common Vulnerability Scoring System (CVSS). As the rate of discovered vulnerabilities has increased in recent years, the need for efficient identification and prioritization has become more crucial."
"Without much hyperparameter tuning on the development set, the convolutional neural network consistently achieves higher precision at the same level of recall as compared to logistic regression."
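To make "precision at the same level of recall" concrete, the following is a minimal sketch of how such a comparison could be computed from classifier scores. The function name `precision_at_recall`, the toy scores, and the labels are illustrative assumptions and are not taken from the paper; the sketch also assumes all scores are distinct.

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision at the highest score threshold whose recall reaches target_recall.

    scores: classifier confidence per example (assumed distinct, hypothetical data)
    labels: 1 for a positive example, 0 for a negative one
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank examples from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    for k, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= target_recall:
            # k examples are predicted positive at this threshold.
            return true_pos / k
    raise ValueError("target_recall must be at most 1.0")


# Toy comparison of two hypothetical models on the same four examples.
labels = [1, 0, 1, 0]
model_a_scores = [0.9, 0.8, 0.7, 0.6]
model_b_scores = [0.9, 0.4, 0.8, 0.3]
precision_a = precision_at_recall(model_a_scores, labels, target_recall=1.0)
precision_b = precision_at_recall(model_b_scores, labels, target_recall=1.0)
```

In this toy case model B ranks both positives ahead of every negative, so it reaches full recall with higher precision than model A, which is the kind of comparison the quoted result describes.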
"Specifically, given a named entity and tweet, our goal is to estimate the probability the tweet describes a cybersecurity threat towards the entity, p_threat, and also the probability that the threat is severe." The researchers used logistic regression as a baseline and a 1D CNN for threat severity prediction.
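As a rough illustration of how a 1D CNN can turn a tweet into a probability, here is a minimal forward-pass sketch in plain Python: filters slide over token embeddings, each filter's maximum activation is kept, and a sigmoid output layer produces the probability. The layer sizes, random weights, and function names are assumptions for illustration only; the source does not describe the authors' actual architecture or training setup.

```python
import math
import random


def conv1d_max_pool(embeddings, filters, bias):
    """Apply each filter along the token axis, then ReLU and max-over-time pooling."""
    width = len(filters[0])  # number of consecutive tokens a filter spans
    pooled = []
    for filt, b in zip(filters, bias):
        best = 0.0  # starting at 0 applies ReLU before taking the max
        for start in range(len(embeddings) - width + 1):
            window = embeddings[start:start + width]
            act = b
            for w_row, x_row in zip(filt, window):
                act += sum(w * x for w, x in zip(w_row, x_row))
            best = max(best, act)
        pooled.append(best)
    return pooled


def predict_probability(embeddings, filters, bias, out_weights, out_bias):
    """Sigmoid over a linear layer on the pooled filter activations."""
    features = conv1d_max_pool(embeddings, filters, bias)
    logit = out_bias + sum(w * f for w, f in zip(out_weights, features))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical sizes and untrained random weights, for demonstration only.
rng = random.Random(0)
n_tokens, dim, width, n_filters = 6, 4, 3, 2
embeddings = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_tokens)]
filters = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
           for _ in range(n_filters)]
bias = [0.0] * n_filters
out_weights = [rng.uniform(-1, 1) for _ in range(n_filters)]
probability = predict_probability(embeddings, filters, bias, out_weights, 0.0)
```

A trained model would learn the filter and output weights from annotated tweets. The logistic regression baseline corresponds to the final sigmoid layer applied directly to hand-built features instead of pooled filter activations.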
"6,000 tweets annotated with opinions toward threat severity... To collect tweets describing cybersecurity events for annotation, we tracked the keywords “ddos” and “vulnerability” from Dec 2017 to July 2018 using the Twitter API... For threat existence classification, we randomly split our dataset of 6,000 tweets into a training set of 4,000 tweets, a development set of 1,000 tweets, and test set of 1,000 tweets. For the threat severity classifier, we only used data from 2nd phase of annotation. This dataset consists of 1,966 tweets that were judged by the mechanical turk workers."
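The random 4,000 / 1,000 / 1,000 split described above could be reproduced along these lines. This is a sketch only: the helper `split_dataset`, the fixed seed, and the integer placeholder IDs are assumptions, since the source does not say how the authors performed the shuffle.

```python
import random


def split_dataset(items, n_train, n_dev, n_test, seed=0):
    """Shuffle items with a fixed seed and cut them into train/dev/test lists."""
    if n_train + n_dev + n_test != len(items):
        raise ValueError("split sizes must sum to the dataset size")
    shuffled = list(items)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test


# Placeholder IDs standing in for the 6,000 annotated tweets.
tweet_ids = list(range(6000))
train, dev, test = split_dataset(tweet_ids, 4000, 1000, 1000)
```

Fixing the seed keeps the three sets identical across runs, so development-set tuning and test-set evaluation always see the same tweets.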