Automating Homework Verification Through LLM Assistants
Keywords:
large language models, automated homework assessment, intelligent tutoring systems, retrieval-augmented generation, hybrid learning, LLM assistant
Abstract
This article examines the automation of homework assessment with LLM assistants. It proposes an architecture comprising four components: an Instruction Chains Generator for task decomposition, a Previous Action Description module that summarizes completed steps, an Action Prediction & Executor that plans and carries out verification steps, and a Controllable Calibration component that refines the outcomes. To ensure pedagogical soundness and increase reliability, the system integrates with Intelligent Tutoring System (ITS) logs and employs Retrieval-Augmented Generation (RAG) to mitigate model hallucinations. A prototype built on Llama 3 Instruct and the Ollama framework was evaluated in an online algebra course and on the GSM8K benchmark (“problem + solution”). User studies with instructors confirmed the approach’s high explainability and the diagnostic value of its feedback. The results demonstrate the efficacy of a hybrid human + LLM workflow for automated homework grading. These findings will interest educational-technology researchers and AI developers aiming to embed next-generation language models in automated verification of student work, grounded in cognitive analysis and adaptive-learning methodologies, as well as practicing educators and educational administrators seeking to improve assessment quality and reduce grading workload.
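As a rough illustration of how the four components named in the abstract might fit together, the following Python sketch wires them into a single verification loop. It is not the authors' implementation: every function name, prompt string, and the pass/fail calibration rule is a hypothetical choice made for illustration, the language model is passed in as a plain callable rather than through any specific Ollama API, and the ITS-log integration and RAG retrieval are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


@dataclass
class VerificationState:
    problem: str
    solution: str
    history: List[str] = field(default_factory=list)  # summaries of finished steps


def generate_instruction_chain(problem: str, llm: LLM) -> List[str]:
    """Instruction Chains Generator: split the check into ordered sub-steps."""
    raw = llm(f"List the checks needed to verify a solution to:\n{problem}")
    return [line.strip() for line in raw.splitlines() if line.strip()]


def describe_previous_actions(state: VerificationState) -> str:
    """Previous Action Description: condense what has been checked so far."""
    return "; ".join(state.history) or "none"


def predict_and_execute(step: str, state: VerificationState, llm: LLM) -> str:
    """Action Prediction & Executor: run one verification step and record it."""
    prompt = (
        f"Problem: {state.problem}\n"
        f"Student solution: {state.solution}\n"
        f"Already checked: {describe_previous_actions(state)}\n"
        f"Now check: {step}\nAnswer 'pass' or 'fail' with a short reason."
    )
    verdict = llm(prompt)
    state.history.append(f"{step} -> {verdict}")
    return verdict


def calibrate(verdicts: List[str]) -> str:
    """Controllable Calibration (assumed rule): accept only if every step passed."""
    passed = [v.strip().lower().startswith("pass") for v in verdicts]
    return "correct" if passed and all(passed) else "needs review"


def verify_homework(problem: str, solution: str, llm: LLM) -> str:
    state = VerificationState(problem, solution)
    steps = generate_instruction_chain(problem, llm)
    verdicts = [predict_and_execute(step, state, llm) for step in steps]
    return calibrate(verdicts)
```

Passing the model as a callable keeps the sketch independent of any particular backend; in a hybrid human + LLM workflow, a "needs review" result would be the point at which an instructor is brought in.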
License
Copyright (c) 2025 Nikita Gladkikh

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
