Transforming Education Through Innovative AI: A Case Study of Automated Scoring in Mathematics Assessments

The integration of innovative AI applications into various aspects of our lives is reshaping the landscape of education, with profound implications for student assessment. In the summer of 2023, the National Center for Education Statistics (NCES) launched a groundbreaking initiative, inviting researchers to develop automated scoring applications for the mathematics segment of the National Assessment of Educational Progress (NAEP). This article delves into the collaborative effort led by Dr. Li Feng of Texas State University, in partnership with Gamma State, to tackle this challenge.

The Challenge

Building on the success of a similar challenge in 2021, which focused on automated scoring systems for reading comprehension and writing, the NAEP math challenge presented ten open-response questions. These questions, used in the 2017 and 2019 assessments for 4th and 8th graders, covered a spectrum of topics and difficulty levels. The aim was to create an unbiased automated scoring system that mirrored human grading, considering the nuances of student responses, including grammar and spelling errors.

Approach

Dr. Feng and the Gamma State team adopted a novel machine learning approach, emphasizing transparency. They meticulously processed the data, employing a Snowflake warehouse for collaborative work, and designed a model pipeline that included data cleaning, preprocessing, and multiple machine learning models tailored to each question. Feature engineering played a crucial role, with creative efforts to prepare the data and generate relevant features for each question.
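The per-question pipeline described above can be sketched in Python. This is a hypothetical illustration using scikit-learn; the article does not specify the team's actual cleaning steps, features, or model choices, and the question identifiers below are placeholders.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline


def clean(text):
    """Basic cleaning: lowercase and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def build_pipeline():
    # One independent pipeline per question, so the features and the
    # model can be tailored to that question's responses.
    return Pipeline([
        ("bow", CountVectorizer(preprocessor=clean)),  # word-count features
        ("clf", MultinomialNB()),                      # per-question scorer
    ])


# Placeholder question IDs standing in for the ten NAEP items.
question_ids = ["Q1", "Q2"]
pipelines = {qid: build_pipeline() for qid in question_ids}
```

Keeping each question's model separate also makes it easier to inspect which words drive a given score, which fits the team's emphasis on transparency.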

A noteworthy aspect of their approach was the use of Bag-of-Words (BoW), a fundamental natural language processing (NLP) technique. BoW was used to build a sparse matrix of word counts across student responses, which fed a multinomial naive Bayes classifier that estimates the likelihood of each score based on the words a student used.
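The BoW-plus-naive-Bayes idea can be shown end to end with scikit-learn. The toy responses and human scores below are invented for illustration, not NAEP data:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical open-response answers with human-assigned scores
# (1 = correct, 0 = incorrect).
responses = [
    "the perimeter is twelve because 3 plus 3 plus 3 plus 3",
    "perimeter equals twelve, add all four sides",
    "the answer is nine",
    "i multiplied 3 times 3 and got nine",
]
scores = [1, 1, 0, 0]

# Sparse matrix: one row per response, one column per vocabulary word.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(responses)

# Multinomial naive Bayes estimates the likelihood of each score
# from the word counts.
model = MultinomialNB()
model.fit(X, scores)

new_response = ["add the four sides to get the perimeter, twelve"]
pred = model.predict(vectorizer.transform(new_response))
print(pred[0])  # predicted score for the new response
```

Because the model's evidence is just word counts, it is easy to audit which words pushed a response toward a given score, which supports the transparency goal.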

Findings

The team's findings highlighted the potential of automated scoring systems to mitigate grader bias, reduce fatigue, increase speed, and cut the cost of administering exams. The BoW model, in particular, aimed to level the playing field for students of diverse backgrounds. However, while the final predictions achieved accuracy ranging from 80% to 95%, challenges emerged on questions with imbalanced classes, where one score category dominated the responses.
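The imbalanced-class caveat is worth making concrete. In the contrived example below (not from the study), a degenerate grader that always assigns the majority score reaches 95% accuracy while learning nothing, which is why a class-aware metric such as balanced accuracy matters:

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# 95% of responses were scored correct by humans (imbalanced classes).
y_true = [1] * 95 + [0] * 5
# A degenerate grader that predicts "correct" for every response.
y_pred = [1] * 100

print(accuracy_score(y_true, y_pred))           # high despite no skill
print(balanced_accuracy_score(y_true, y_pred))  # averages per-class recall
```

Balanced accuracy averages the recall of each class, so the degenerate grader scores only 0.5 even though its raw accuracy is 0.95.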

Conclusion

The intersection of AI and education, as showcased in the NAEP challenge, holds promise for revolutionizing assessment processes. Dr. Feng and Gamma State's intentionally interpretable system architecture sets a precedent for balancing accuracy and transparency in automated scoring systems. While there is room for improvement, the lessons learned pave the way for future advancements in the application of AI in education. Gamma State remains committed to pushing boundaries and contributing to the ongoing progress in this dynamic field.
