AI Case Study
China's Ministry of Education grades students' essays in one in four Chinese schools using deep learning
The deep learning system, whose exact status remains unclear, has been used in around 60,000 schools and has graded more than 120 million students over the last 10 years. The system uses deep learning to compare essays against established teaching experience and has built its own body of knowledge. There is anecdotal evidence that 'brilliant' outliers may not be appropriately rewarded.
Public And Social Sector
Education And Academia
"The technology is designed to understand the general logic and meaning of the text and make a reasonable, human-like judgment about the essay’s overall quality. It then grades the work, adding recommended improvements in areas such as writing style, structure and theme.
The technology, which is being used in around 60,000 schools, is supposed to “think” more deeply and do more than a standard spellchecker. For instance, if a paragraph starts trailing off topic, the computer would mark it down."
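The quote above describes marking a paragraph down when it trails off topic. As a minimal sketch of how such a check could work, the snippet below flags a paragraph whose vocabulary overlaps too little with the rest of the essay. This is an illustrative stand-in using simple word overlap (Jaccard similarity); the actual system's method is not disclosed, and all function names, thresholds, and example text here are invented.

```python
# Hedged sketch: flag paragraphs that share too little vocabulary with
# the rest of the essay. A real deep learning system would use learned
# representations; plain word overlap is a toy stand-in.

def jaccard(a, b):
    """Word-overlap similarity between two texts (0.0 to 1.0)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def flag_off_topic(paragraphs, threshold=0.1):
    """Return indices of paragraphs with low similarity to the rest."""
    flagged = []
    for i, para in enumerate(paragraphs):
        rest = " ".join(p for j, p in enumerate(paragraphs) if j != i)
        if jaccard(para, rest) < threshold:
            flagged.append(i)
    return flagged

paragraphs = [
    "the school exam tests essay writing skills",
    "essay writing in the school exam rewards clear structure",
    "my favourite recipe uses garlic butter and fresh basil",
]
print(flag_off_topic(paragraphs))  # the unrelated recipe paragraph, index 2
```

The threshold would in practice need tuning per grade level and essay length; a fixed cutoff is only for illustration.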
According to a government document seen by the South China Morning Post, the tests involved around 60,000 schools and more than 120 million people. The AI and the human grader gave the same score 92% of the time, but the article reports that schools have seen 'brilliant' writing receive lower marks. The system is still only used for internal tests and not yet for official exams.
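The 92% figure above is a simple agreement rate: the fraction of essays where the machine and the human grader gave exactly the same score. A minimal sketch of that calculation, on invented example scores:

```python
# Hedged sketch: the human-machine agreement rate reported in the
# article (92% identical scores). The score lists below are
# illustrative, not real data.

def agreement_rate(machine_scores, human_scores):
    """Fraction of essays where both graders gave exactly the same score."""
    assert len(machine_scores) == len(human_scores)
    matches = sum(m == h for m, h in zip(machine_scores, human_scores))
    return matches / len(machine_scores)

machine = [85, 72, 90, 60, 78]
human = [85, 70, 90, 60, 78]
print(agreement_rate(machine, human))  # 4 of 5 scores match: 0.8
```

Note that exact-match agreement says nothing about how far apart the disagreeing scores were, which is exactly where the reported 'brilliant outlier' problem would show up.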
According to the South China Morning Post, developers say the machine is already 10 years old and they are increasingly confident about its potential. The essay grading machine, embedded in a cluster of fast computers in Beijing, is improving its ability to understand human language by using deep learning algorithms to plough through essays written by Chinese students and "compare notes" with human teachers' grading and comments.

It is also able to collect and build its own "knowledge base" with little or no human intervention. "It has evolved continuously and become so complex, we no longer know for sure what it was thinking and how it made a judgment," said the researcher, who requested not to be named due to the sensitivity of the project.
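The "compare notes" process described above is, at its core, supervised learning: the model adjusts its parameters so its scores move toward human teachers' grades. As a heavily simplified sketch of that loop, the toy linear model below stands in for the (undisclosed) deep learning system; the features, learning rate, and example data are all invented for illustration.

```python
# Hedged sketch of the training loop described in the article: predict
# a score, compare with the teacher's grade, nudge the parameters.
# A toy two-feature linear model stands in for the real deep network.

def features(essay):
    words = essay.split()
    # Two toy features: length and vocabulary richness.
    return [len(words), len(set(words)) / max(len(words), 1)]

def train(essays, teacher_scores, epochs=500, lr=0.01):
    """Fit weights and bias by stochastic gradient descent on squared error."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for essay, target in zip(essays, teacher_scores):
            x = features(essay)
            pred = w[0] * x[0] + w[1] * x[1] + b
            err = pred - target  # "compare notes" with the teacher's grade
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

essays = ["short essay", "a much longer essay with more varied words"]
scores = [50, 80]
w, b = train(essays, scores)
```

A real system would learn representations of style, structure and theme from millions of graded essays rather than two hand-picked features, which is also why, as the researcher notes, its judgments become hard to inspect.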