Opinion: AI can grade a student essay as well as a human. But it cannot replace a teacher
Perpetual Baffour | September 20, 2023
I’ve been working on these questions with a group of colleagues since long before the advent of ChatGPT. After working with hundreds of data scientists from around the globe, we have found that the answer is clear: Artificial intelligence is now as good as a human at evaluating a standard five-paragraph essay and giving feedback on its logic and persuasion.
But our work also revealed that AI alone is not enough, and that perhaps some of the best uses of this technology in writing are helping teachers, not giving students a shortcut in drafting essays.
Since 2019, our team of experts from The Learning Agency Lab, Georgia State University and Vanderbilt University has been working on this issue, overseeing competitions that challenged data scientists to build models that could label argumentative elements in pieces of writing and then evaluate the quality of these elements in thousands of student essays.
The best-performing algorithms achieved an accuracy rate of 75%, comparable to the human readers who annotated the data. In short, these AI models can identify and evaluate the lead, position statement, supporting claims and evidence as well as a human can. They were also able to evaluate how well a student organized an essay and developed arguments.
This technological advance is important because becoming a good writer requires a lot of practice and some expert coaching. It’s not unlike learning to play the piano or to shoot a jump shot. Young writers need to put in the work in order to improve.
However, research shows that too many students do not get the instruction or opportunities needed to master this most important skill. National Assessment of Educational Progress surveys of students have revealed that only 25% spend more than 30 minutes of their school day writing, the minimum recommended by the Institute of Education Sciences’ What Works Clearinghouse.
One big reason why teachers assign so little writing is that grading and coaching are labor-intensive. Even if a teacher spends just 10 minutes reviewing a two-page writing assignment, it would still take nearly 21 hours to grade them all, assuming the teacher sees 125 students over the course of a week. And that’s just one relatively minor assignment.
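The arithmetic behind that estimate is easy to check. A quick back-of-envelope sketch, using the figures from the paragraph above (10 minutes per assignment, 125 students):

```python
# Back-of-envelope check of the grading workload described above.
minutes_per_essay = 10   # time to review one two-page assignment
students = 125           # students a teacher sees over a week

total_minutes = minutes_per_essay * students  # 1,250 minutes
total_hours = total_minutes / 60              # about 20.8 hours

print(f"Grading one assignment for all students: ~{total_hours:.1f} hours")
```

At roughly 20.8 hours, the workload is indeed "nearly 21 hours" for a single, relatively minor assignment.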
This is where new writing tools and technologies like AI can make a huge difference. In some research, these tools have been shown to reduce the amount of time teachers spend on grading by half. Other studies suggest they can raise student outcomes well above state averages.
These findings demonstrate that there are real upsides to bringing artificial intelligence into the classroom.
AI’s biggest potential when it comes to the writing classroom is in helping educators better identify areas where students struggle. Practically speaking, that means AI could help a teacher identify common mistakes made by students across all classes, which indicate a weakness in instruction or the curriculum. Such information is also helpful when identifying students in need of a specific intervention or remediation.
The algorithm could also be used to push students who are close to mastering a concept by giving them feedback, for example, on their argumentation and prodding them to go deeper or to think more about their thinking — an effective learning strategy called metacognition.
Still, as exciting as this is, even the best AI-powered writing tools cannot meet all the complex needs of students. Technology, no matter how precise and fine-tuned, will never be as effective as an engaged teacher at motivating students or navigating a classroom’s social dynamics. That’s why I believe that all classroom-level AI tools need to be developed and introduced in close partnership with teachers, from start to finish.
For example, our project included focus groups with more than 70 teachers. We learned that they found the AI tools exhibited bias toward students with non-standard English dialects, especially those from marginalized backgrounds. The teachers wanted developers to mitigate that bias and ensure tools were culturally sensitive so students would see their experiences reflected in the AI results. When developers hear from educators early in the development phase, their products are better able to give teachers what they want and students what they need: targeted, relevant writing support.
Isn’t that what most parents want for their children as well? Personalized assistance from teachers who strive to make learning both personal and stimulating? When done right, AI can help educators deliver on that ideal.
Perpetual Baffour is research director at The Learning Agency Lab.