AI Case Study
The LawGeex AI algorithm outperforms human lawyers by 9 percentage points in identifying issues in historical non-disclosure agreements
The LawGeex AI algorithm was tested against a group of human lawyers on spotting commonly identified issues in a set of corporate NDAs within a given amount of time
"In a landmark study, US lawyers with decades of experience in corporate law and contract review were pitted against the LawGeex AI algorithm to spot issues in five Non-Disclosure Agreements (NDAs), which are a contractual basis for most business deals. Twenty US-trained lawyers, with decades of legal experience ranging from law firms to corporations, were asked to issue-spot legal issues in five standard NDAs. They competed
against a LawGeex AI system that has been developed for three years and trained on tens of thousands of contracts. The research was conducted with input from academics, data scientists, and legal and machine-learning experts, and was overseen by an independent consultant and lawyer."
"LawGeex Artificial Intelligence achieved an average 94% accuracy rate, ahead of the lawyers who achieved an average rate of 85%."
"The LawGeex AI has been trained to detect issues on more than a dozen different legal contracts, ranging from software agreements to services agreements to purchase orders. This specific research focused solely on NDAs – the most common form of business
contract. NDAs are typically used to create a legal obligation to secrecy, and compel those who agree to them to keep information confidential and secure. The LawGeex AI was trained on tens of thousands of NDAs, using custom-built machine learning and deep learning technology. The machine was trained based on an exclusive corpus of documents that presented the LawGeex algorithm with a variety of examples, which allowed it to distinguish between different legal concepts. This level of technology for analyzing legal documents has only been possible with advances in computing over the last five years. Computers convert the text into a numeric representation. The image below is a visualization of how computers read text. Each dot represents one paragraph in the semantic space. The different colors shown represent different legal issues. Pink dots, for example, represent samples of non-compete issues, and purple ones represent governing law sections.
LawGeex created proprietary Legal Language Processing (LLP) and Legal Language Understanding (LLU) models for the task. Teams of lawyers and engineers taught LawGeex AI legalese by exposing the AI to a wide range of legal documents. Once the AI learned legalese, legal trainers pointed out the concepts it is required to recognize. The LLP technology allows the algorithm to identify these concepts even if they were worded in ways never seen before.
Monitoring concepts, not keywords: LawGeex AI operates in a far more sophisticated manner than a blunt “keyword search.” Keyword searches can be over- and under-inclusive, as words may be absent from relevant documents, or present in irrelevant documents. True AI recognizes a concept however it is phrased or wherever it appears in a document.
Unsupervised learning was used for teaching the AI engine the core legalese language. Thereafter, supervised learning, using deep learning multi-layer LSTM and convolution technology, was used to train the system for the fine-tuned issue-spotting. Supervision was performed based on human-annotated documents, using legal experts. A unique augmentation algorithm was applied to boost learning from these examples."
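The passage above makes two technical points: text is converted into a numeric representation, and keyword search can be both over- and under-inclusive. The Python sketch below illustrates both with a toy example. It is not the LawGeex method, whose models are proprietary: the two clauses are invented for illustration, the helper names (`tokenize`, `keyword_hit`, `to_vector`) are hypothetical, and a plain word-count vector is only a crude stand-in for the learned representations the report describes.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_hit(text: str, keyword: str) -> bool:
    """Blunt keyword search: true if the keyword appears anywhere in the text."""
    return keyword.lower() in text.lower()


def to_vector(text: str, vocabulary: list[str]) -> list[int]:
    """Represent the text as word counts over a fixed vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


# Invented clauses (assumptions for illustration, not taken from the study).
# This one restricts competition but never uses the keyword.
relevant_clause = (
    "The Recipient shall not engage in any business that rivals "
    "the Discloser for two years."
)
# This one contains the keyword but imposes no non-compete obligation.
irrelevant_clause = (
    "Headings such as Non-Compete are for convenience only "
    "and have no legal effect."
)

keyword = "non-compete"
missed = not keyword_hit(relevant_clause, keyword)   # under-inclusive
false_alarm = keyword_hit(irrelevant_clause, keyword)  # over-inclusive
print(f"relevant clause missed: {missed}")
print(f"irrelevant clause flagged: {false_alarm}")

# Numeric representation: each clause becomes a point in a space with
# one dimension per vocabulary word.
vocabulary = sorted(set(tokenize(relevant_clause)) | set(tokenize(irrelevant_clause)))
relevant_vector = to_vector(relevant_clause, vocabulary)
print(relevant_vector)
```

A count vector like this still depends on exact wording, so it would not by itself recognize a concept "however it is phrased". It only shows how a paragraph can become a point in a numeric space; a learned model would be needed to place differently worded clauses about the same issue near each other.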
R And D
Core Research And Development
"The study is a response to a major business problem experienced by every company of any size that requires contracts to engage with partners, suppliers, or vendors. The typical Fortune 1000 company maintains 20,000 to 40,000 active contracts at any given time, while The International Association for Contract & Commercial Management (IACCM) has found that 83% of businesses are dissatisfied with their organization’s contracting
process. In addition, NDAs take companies a week or longer to approve – a process that frustrates other departments and slows down deals. Businesses have reduced their reliance on outside law firms, as they want to pay less for legal services, but they are
seeing no reduction in legal work."
"Five publicly available NDA agreements from the Enron Data Set, which has become the industry standard corpus for common documents for technology providers, scientists, and researchers, were selected by consultant and referee, Christopher Ray. The NDAs were real, everyday agreements used by companies in the US, including Enron, InterGen, Pacific Gas and Electric Company, and Cargill. The five contracts were various forms of commercial NDAs – one 2-page NDA, one 3-page NDA, two 4-page NDAs, and one 5-page NDA."