Autograding in Education Using Artificial Intelligence

Supervised by: Sanjan Das as part of an OxBright Computer Science Internship. Sanjan is a final year Engineering student at the University of Cambridge. Over his degree, he has specialised in Bioengineering and Information Engineering. He is set to begin an MS in Artificial Intelligence and Innovation at Carnegie Mellon University, U.S.A later this year.

Abstract

The development of artificial intelligence components in autograding technology has been a lengthy process, but results are finally coming to fruition. Here, we aim to offer a brief overview of the different components of artificial intelligence involved in autograding technology. Research has been gathered to produce a clear and concise idea of the methodology behind artificial intelligence, and its use in autograding technology. This encompasses an introduction to both autograding technology and artificial intelligence; a summary of specifics pertinent to artificial intelligence in autograding technology; and, finally, an explanation of the use of Convolutional Neural Networks in such technology.

1. Introduction to autograding technology

1.1 Demand for autograding technology

From primary school to college, the role of grading students’ assignments in assessing progress and ability is obvious. Nevertheless, grading is considered to be one of the most repetitive and time-consuming aspects of teaching. Teachers often find that manual grading takes up a significant amount of valuable time that could be used to maximise the student experience. Autograding could, therefore, allow instructors to focus more on what is really important – in-class activities, planning more engaging and personalised classes and direct interaction with students.

Furthermore, autograding systems could be used in competitive examinations held at all levels, such as job examinations, state-level examinations, and entrance examinations for university courses. These examinations attract large numbers of applicants, so quick and accurate grading is required.

1.2 Benefits of autograding

Besides saving time and reducing instructors’ workload, automatic grading has several other beneficial features [1].

The use of artificial intelligence (AI) allows the grading tool to update and improve its grading technique, and to adjust its grading settings, by scanning papers that were previously graded manually by humans.
Essays can be graded autonomously by autograding tools that optically scan papers, a task that relies on computer vision (CV).
Automatic grading can be conducted in multiple languages.

2. Introduction to AI

This research paper will mostly focus on autograding technology that is based on AI.

2.1 AI and its architecture

AI systems are created entirely by humans to solve problems without being directly instructed how to do so. They are often applied to problems that are impractical or impossible for humans to solve. Subfields of AI include machine learning (ML) and deep learning (DL) [2].

2.2 Machine learning

Machine learning is the ability of a system to learn from experience without being explicitly programmed. A machine learning system can analyze data and learn from it with minimal human supervision. ML algorithms improve automatically by comparing their results on a data set with examples of the desired output.
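
As a minimal sketch of what learning from examples can look like, the snippet below fits a straight line to a handful of input/output pairs by least squares and then uses it on an unseen input. The data (number of correct answers against the mark a human awarded), the function name `fit_line` and the choice of a linear model are all assumptions made for illustration; they are not taken from any system described in this paper.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to example pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training examples: correct answers -> mark given by a human grader.
correct_answers = [1, 2, 3, 4]
human_marks = [10, 20, 30, 40]

slope, intercept = fit_line(correct_answers, human_marks)
predicted_mark = slope * 5 + intercept  # prediction for an unseen paper

print(slope, intercept)   # 10.0 0.0
print(predicted_mark)     # 50.0
```

Nobody wrote the rule "ten marks per correct answer" into the program; it was recovered from the examples, which is the sense in which the system is not explicitly programmed.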

Figure 1: AI and its subsets

2.3 Deep learning

Deep learning is a machine learning technique inspired by the design of the human brain. A deep learning system processes inputs through successive layers in order to classify, infer and predict information. In this way, deep learning aims to make algorithms more efficient and simpler to use by imitating aspects of how the brain works. Moreover, since deep learning processes data in a way loosely similar to the human brain, it is mostly used for tasks usually performed by humans, such as driving vehicles [3].

2.4 Types of AI

There are three different types of AI: ANI (Artificial Narrow Intelligence), AGI (Artificial General Intelligence) and ASI (Artificial Super Intelligence).

Figure 2: Types of AI

Artificial Narrow Intelligence has no consciousness and requires human supervision. It specializes in one area and can only solve one particular problem that it is designed and trained to solve. Examples of ANI are Siri, Alexa, auto-piloting in an airplane, chatbots and self-driving cars.
Artificial General Intelligence can be defined as a strong, human-level AI. Machines with AGI have the ability to act as a human would in a given situation. AGIs do not exist yet but are considered likely to be developed in the future.
Artificial Super Intelligence refers to AIs that would be far superior to any human being in every field. It has been theorized that ASIs will be developed in the distant future, but for now they remain a theme of science fiction.

2.5 How do AI models work?

The simplest model for understanding the process through which an AI processes data is shown below in figure 3.

Figure 3: AI data processing

The more detailed steps that an AI follows when in use are shown below in Figure 4.

Figure 4: Steps of AI model working

2.6 Fields of AI

There are several main fields that involve AI [4]. For education and autograding, computer vision is mainly used.

Neural networks work on principles similar to those of human nerve cells, finding relationships in data and processing it in a way loosely modelled on the human brain.
Natural language processing (NLP) is the field concerned with machines studying, understanding and interpreting language. NLP systems process the user's language in order to respond appropriately to commands.
Computer vision (CV) algorithms let machines understand images by breaking them down into simpler fragments and studying different parts of objects. This allows a machine to classify and learn from images.
Cognitive Computing (CC) algorithms analyze speech/text/images/objects in a way similar to the human brain in order to get the desired output.

3. AI in Autograding

3.1 How is AI used in autograding?

As AI grows as an industry, more investment is poured into AI applications in education, especially autograding. Advancements in this field, as we previously suggested, would significantly reduce teachers’ workloads.

Current AIs involved in autograding can only assign grades to submitted assignments and, occasionally, issue preset suggestions and modifications. This removes the need for the teacher to mark each paper manually. As the technology develops, an AI could even give personalised feedback to each student, something that teachers are unable to do without a significant time commitment.

A potential issue with autograding, however, has been raised in the USA, where some students have found ways to outsmart autograding systems. By inputting a chain of keywords, the students wouldn’t even have to write coherent sentences to score top marks. As long as the answers included all relevant keywords, the AI could be manipulated into giving good results [5].
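
The weakness described above can be illustrated with a deliberately naive scorer. This is a hypothetical sketch, not the algorithm used by any real platform: the function `keyword_score`, the keyword list and both sample answers are made up for the example.

```python
def keyword_score(answer, keywords):
    """Return the fraction of expected keywords that appear in the answer."""
    words = set(answer.lower().split())
    found = sum(1 for keyword in keywords if keyword in words)
    return found / len(keywords)


# Hypothetical marking scheme for a question about photosynthesis.
KEYWORDS = ["chlorophyll", "sunlight", "glucose", "oxygen"]

coherent_answer = "Plants use chlorophyll to absorb sunlight and produce glucose and oxygen"
keyword_chain = "chlorophyll sunlight glucose oxygen"

print(keyword_score(coherent_answer, KEYWORDS))  # 1.0
print(keyword_score(keyword_chain, KEYWORDS))    # 1.0
```

Because the scorer only checks whether each keyword is present, the bare chain of keywords earns the same mark as the coherent sentence.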

3.2 Examples of AI in autograding

China has been using AI systems increasingly widely over the last few years, including AI teaching platforms and autograding platforms. Around 1 in 4 schools in the country are testing a machine learning autograding platform that can even give suggestions on the work submitted [6]. This platform has been successful so far and is reported to grade tests nearly as well as teachers.

Similar projects exist in the US, albeit on a smaller scale than in China. More than 21 states have implemented automatic scoring systems, from middle school to college, with varying degrees of success [7]. The problems mentioned above persist, yet the advantages appear to outweigh the disadvantages, as there is no intention to revert to hand grading, even after major backlash from parents.

3.3 The AI used in autograding

There are many types of AI that will help with autograding, and the most developed applications are known as Automated Essay Scoring (AES) or Automatic Essay Grading (AEG). The architecture of the AI is as follows. Through the use of CNNs, set features are extracted from the text of essays and are then used in comparison with algorithms set by the AI, as shown below in Figure 5 [8].

In the example shown in Figure 5, four main features are extracted, each contributing to the grade to a different degree.

Word and sentence count. While these features seem unimportant to grading at first glance, an AI can use them to judge whether an essay is of an appropriate length.
Parts of Speech (POS). POS, such as nouns or verbs, are important to keep track of, and the AI uses the count of each POS to grade and evaluate essays.
Spelling mistakes. Spelling mistakes are clearly important to the grading of any essay, and the number of mistakes is counted with the use of a spell checker.
Domain Information Content. This is the most important feature, as it aims to capture the context and information within an essay. The method uses keywords taken from the highest-scoring essays in the training set, and the count of each keyword, or of similar words, is fed into an algorithm.

Figure 5: System architecture [8]
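
A rough sketch of how three of these features might be computed is given below. It is an illustration under stated assumptions rather than the method of [8]: the function `extract_features`, the tiny vocabulary standing in for a spell checker and the keyword list are all hypothetical, and POS counting is left out because it would need a trained tagger.

```python
import re


def extract_features(essay, vocabulary, keywords):
    """Compute simple numeric features of an essay."""
    words = re.findall(r"[a-z']+", essay.lower())
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    # Stand-in for a spell checker: any word outside the vocabulary is a mistake.
    misspelled = [word for word in words if word not in vocabulary]
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "spelling_mistakes": len(misspelled),
        "keyword_counts": {keyword: words.count(keyword) for keyword in keywords},
    }


# Hypothetical vocabulary and domain keywords, for illustration only.
VOCABULARY = {"plants", "use", "sunlight", "to", "make", "glucose",
              "they", "release", "oxygen"}
KEYWORDS = ["sunlight", "glucose", "oxygen"]

essay = "Plants use sunlight to make glucose. They relese oxygen."
features = extract_features(essay, VOCABULARY, KEYWORDS)

print(features["word_count"])         # 9
print(features["sentence_count"])     # 2
print(features["spelling_mistakes"])  # 1
print(features["keyword_counts"])     # {'sunlight': 1, 'glucose': 1, 'oxygen': 1}
```

The resulting numbers form the kind of feature set that a grading model could then compare against essays previously marked by humans.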

4. How a CNN extracts text from images

4.1 What is a CNN?

A Convolutional Neural Network (CNN) is a type of neural network designed to process pixel data. CNNs are best suited to 2D data but can also handle 1D and 3D data. Their role is to reduce images to a form that is easier to process, while keeping the features needed to make a good prediction of what the input image shows. This reduction in dimensionality allows the software to be scaled to larger data sets. Ordinary neural networks are not well suited to image processing, so they must be given images with reduced resolution. The arrangement of neurons in a CNN is inspired by the visual cortex of the human brain (the part of the brain responsible for processing visual data), which makes CNNs better suited to images than ordinary neural networks.

4.2 Layers of a CNN

A CNN has multiple different layers to extract information from an image. Each layer has a unique job to bring information out of the input image.

The Convolution Layer

The convolution layer is the first layer of the system that reduces the dimensions of the input image. It does this by extracting features from the input image using ‘filters’ or ‘kernels’. A filter is an n × n grid, measured in pixels, that is smaller than the input image in both width and height. The filter moves across the input with a step size of k, where k < n, which allows some overlap between neighbouring n × n sections [9]. At each position, the filter captures the features of the section beneath it, translates that square of pixels into a smaller output, and then moves on to the next region k pixels away. In this way the filter captures the important information it needs while also reducing the dimensions of the image so that it can be processed more easily.

Figure 6: Here we can see the different outputs you can get from a different step value. On the left, k = 1 and on the right, k = 2. When k = 1, we end up with a larger output than when k = 2.
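
The effect of the step size k on the output size can be reproduced with a short sketch. The function `convolve2d`, the 5 × 5 input and the 3 × 3 filter below are made up for illustration and are not the values shown in Figure 6. No padding is used, and, as in most deep learning libraries, the filter is applied without flipping.

```python
def convolve2d(image, kernel, stride=1):
    """Slide an n x n kernel over a 2D image with the given step size."""
    n = len(kernel)
    height, width = len(image), len(image[0])
    output = []
    for top in range(0, height - n + 1, stride):
        row = []
        for left in range(0, width - n + 1, stride):
            # Multiply the section under the kernel element-wise and sum it.
            total = 0
            for i in range(n):
                for j in range(n):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Made-up 5 x 5 input and a 3 x 3 filter that responds to horizontal change.
image = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
    [16, 17, 18, 19, 20],
    [21, 22, 23, 24, 25],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

out_k1 = convolve2d(image, kernel, stride=1)
out_k2 = convolve2d(image, kernel, stride=2)

print(len(out_k1), len(out_k1[0]))  # 3 3
print(len(out_k2), len(out_k2[0]))  # 2 2
```

For an input of width W, the output width here is floor((W - n) / k) + 1, so a larger step gives a smaller output, in line with the caption of Figure 6.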

There can be more than one convolution layer. The first is usually used for capturing low-level features such as edges, colour and gradient orientation. After these features have been captured, later layers build a fuller understanding of the image by capturing high-level features such as clarity [9].

The Pooling Layer

The pooling layer reduces the dimensions even further by carrying out one of two processes:

Max pooling
Average pooling

Max pooling, the most popular form of pooling, extracts sections of the input, takes the maximum value from each section and discards all the other values [10]. There are two kinds of average pooling, local average pooling and global average pooling. Local average pooling extracts a section and then takes the average value of all values in that section.

Global average pooling takes the average of all values in the feature map and reduces the whole map to a single 1 × 1 value. This is a very extreme form of downsizing, but it has some advantages, such as allowing the system to accept inputs of different sizes [11].

Max pooling is usually preferred to average pooling because it reduces the dimensions of the feature map while also suppressing noise: it discards the less useful, lower-value information and keeps the higher-value information. Average pooling, by contrast, only reduces the dimensions of the feature map, and any noise suppression it offers comes from that averaging [12].

Figure 7: This shows the difference between maximum and average pooling. If global average pooling were shown as well, its value would be approximately 7, since 110/16 = 6.875.
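
The three kinds of pooling can be compared side by side in a short sketch. The helper names `pool2d` and `global_average_pool` and the 4 × 4 feature map are assumptions chosen for illustration; the values are not those of Figure 7.

```python
def pool2d(feature_map, size, mode="max"):
    """Apply non-overlapping size x size max or local average pooling."""
    pooled = []
    for top in range(0, len(feature_map) - size + 1, size):
        row = []
        for left in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[top + i][left + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


def global_average_pool(feature_map):
    """Reduce a feature map of any size to a single average value."""
    values = [value for row in feature_map for value in row]
    return sum(values) / len(values)


# Made-up 4 x 4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [4, 6, 3, 5],
]

print(pool2d(feature_map, 2, "max"))      # [[7, 8], [9, 5]]
print(pool2d(feature_map, 2, "average"))  # [[4.0, 5.0], [5.25, 2.25]]
print(global_average_pool(feature_map))   # 4.125
print(global_average_pool([[2, 4], [6, 8]]))  # 5.0
```

The last two lines show why global average pooling copes with inputs of different sizes: a 4 × 4 map and a 2 × 2 map both collapse to one number.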

The Nonlinearity Layer

Neural networks are made up of neurons. On its own, each neuron computes a linear function of its inputs. A linear function of a linear function is still a linear function, so a network built only from such neurons would only be able to model linear relationships between the input and output [9]. To give the CNN more flexibility, the nonlinearity layer applies a nonlinear function to these outputs. This allows the network to model functions that may fit the requirements better.
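
The argument about stacked linear functions can be checked numerically. The sketch below is illustrative only: the weights are arbitrary, and ReLU (which replaces negative values with zero) is assumed as the nonlinearity because it is a common choice, not because the source names it.

```python
def linear(weight, bias):
    """Return the function x -> weight * x + bias."""
    def apply(x):
        return weight * x + bias
    return apply


def relu(x):
    """A common nonlinearity: negative values become zero."""
    return max(0.0, x)


f = linear(2.0, 1.0)    # f(x) = 2x + 1
g = linear(-3.0, 4.0)   # g(x) = -3x + 4

# Stacking two linear functions: g(f(x)) = -3(2x + 1) + 4 = -6x + 1.
collapsed = linear(-6.0, 1.0)
inputs = [-2.0, 0.0, 3.5]
same_as_one_layer = all(g(f(x)) == collapsed(x) for x in inputs)
print(same_as_one_layer)  # True

# Placing ReLU between the two functions breaks the straight line.
with_relu = [g(relu(f(x))) for x in inputs]
print(with_relu)  # [4.0, 1.0, -20.0]
```

Without the nonlinearity, the two layers behave exactly like one linear function. With it, the three outputs no longer lie on a single straight line, so the stack can represent something a single linear function cannot.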

The Fully Connected Layer

In the fully connected layer, each neuron receives input from every neuron in the layer before it. This gives each neuron a much larger set of weights than in other layers such as the convolution layer, where each neuron only receives input from the small section covered by the filter. Fully connected layers are usually among the last layers in the CNN, and there are normally several of them.

Before the data from the final pooling or convolution layer is passed to the fully connected layer, it must be flattened to 1D, which is done by turning it into a column vector. It can then be sent to the fully connected layer.

Figure 8: Simplified neural network showing activated neurons in each layer. The process of flattening an image and then iterating through each layer.
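
Flattening and a single fully connected layer can be sketched as follows. The functions `flatten` and `fully_connected`, the 2 × 2 feature map and the weights are hypothetical values chosen so that the arithmetic is easy to follow; a trained network would learn its weights from data.

```python
def flatten(feature_map):
    """Turn a 2D feature map into a 1D list, row by row."""
    return [value for row in feature_map for value in row]


def fully_connected(inputs, weights, biases):
    """Each output neuron takes a weighted sum of every input, plus a bias."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


# Made-up output of a final pooling layer.
feature_map = [[7, 8], [9, 5]]
vector = flatten(feature_map)
print(vector)  # [7, 8, 9, 5]

# Two output neurons, each with one weight per input (arbitrary values).
weights = [
    [1, 0, 0, 1],
    [0.5, 0.5, 0.5, 0.5],
]
biases = [0, -10]

outputs = fully_connected(vector, weights, biases)
print(outputs)  # [12, 4.5]
```

Each output neuron has four weights, one for every flattened input, which is why fully connected layers hold many more weights than a small convolution filter.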

5. Conclusion

In summary, this research paper set out to explore the viability of autograding technology for use in schools. This has been achieved by offering an introduction to autograding and the AI that powers it; summarising the AI specifics that relate to autograding; and considering the use of CNNs in the pursuit of autograding.

While there are, inevitably, some issues with the technology, autograding has already taken up a significant position in many Chinese schools, and seems set to take off in the US too. It saves teachers’ time, and allows them to focus more on the teaching experience their pupils receive.

Bibliography

[1] Anmol Kumar. AI’s New Role In Education: Automated Grading. May 2021. url: https://elearningindustry.com/artificial-intelligence-new-role-in-education-automated-paper-grading.

[2] Rupali Roy. Understanding the difference between AI, ML and DL. May 2020. url: https://towardsdatascience.com/understanding-the-difference-between-ai-ml-and-dl-cceb63252a6c.

[3] Jeremy Cohen. Deep Reinforcement Learning for Self-Driving Cars – An intro. Feb. 2021. url: https://thinkautonomous.medium.com/deep-reinforcement-learning-for-self-driving-cars-an-intro-4c8c08e6d06b.

[4] Vaishali Advani. What is Artificial Intelligence? How Does AI Work, Applications and Future? July 2021. url: https://www.mygreatlearning.com/blog/what-is-artificial-intelligence/.

[5] Fabienne Lang. Students Crack AI Auto-Grading Algorithms for Better Grades. Sept. 2020. url: https://interestingengineering.com/students-crack-ai-auto-grading-algorithms-for-better-grades.

[6] Kyle Wiggers. Chinese schools are testing AI that grades papers almost as well as teachers. May 2018. url: https://venturebeat.com/2018/05/28/chinese-schools-are-testing-ai-that-grades-papers-almost-as-well-as-teachers/.

[7] Vishal Chawla. Can AI Replace Teachers To Grade Student Essays? A Lesson From US Schools. Nov. 2019. url: https://analyticsindiamag.com/artificial-intelligence-grade-essay-student/.

[8] V. V. Ramalingam et al. “Automated Essay Grading using Machine Learning Algorithm”. In: Journal of Physics: Conference Series 1000 (Apr. 2018), p. 012030. url: https://doi.org/10.1088/1742-6596/1000/1/012030.

[9] Grond Marco Marten. “Text detection in natural images using convolutional neural networks”. PhD thesis. Feb. 2017. url: https://scholar.sun.ac.za/handle/10019.1/100999?show=full.

[10] Prabhu. Understanding of Convolutional Neural Network (CNN) – Deep Learning. Nov. 2019. url: https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148.

[11] Rikiya Yamashita et al. Convolutional neural networks: an overview and application in radiology. July 2018. url: https://insightsimaging.springeropen.com/articles/10.1007/s13244-018-0639-9.

[12] Sumit Saha. A Comprehensive Guide to Convolutional Neural Networks-the ELI5 way. Dec. 2018. url: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53.