The NAEP research community was hard at work in 2024 exploring new ways to deepen our understanding of student academic achievement and assessment in the United States. We here at the NAEP R&D Hub have shared it all along the way, highlighting the work of innovative researchers (especially in emergent fields such as AI), disseminating news about new and improved tools for researchers, and bringing awareness to relevant conference opportunities and presentations. As we enjoy the holiday season and look forward to a new year and new NAEP internship opportunities, we’d like to look back on some of the NAEP R&D Hub highlights of 2024.
Innovative Research
- New Focus on NAEP: “Exploring Process Data in TEL”
This addition to the Focus on NAEP series examines data from the 2014 Technology and Engineering Literacy (TEL) assessment to investigate students’ digital familiarity and problem-solving efficiency as well as connections between test-taking processes and student experiences in and out of school.
- New Working Paper: Evaluating the Surge in Chatbot Development
This full-text working paper explores the landscape of chatbot development and evaluation, focusing on their use in educational contexts. The authors examine diverse evaluation techniques and criteria and offer their recommendations and thoughts on potential limitations.
- New Research Article Comparing Methods for Estimating Unreported Subgroup Achievement on NAEP
This research article discusses results from a comparison of methods for estimating NAEP achievement of minority subgroups when sample sizes fall below the required 62-student threshold. It highlights small area estimation as a reliable statistical framework for addressing sample size limitations and producing more accurate subgroup achievement estimates.
- New Research Paper on Advances and Challenges in Evaluating LLM-Based Applications
This research paper focuses on the evaluation of large language models (LLMs), highlighting the challenges that stem from narrowly focused methods and the pros and cons of human versus automated evaluation. The authors conduct a rigorous comparison of three evaluation strategies: automated metrics, traditional human evaluation, and LLM-based evaluation.
- Research First Look: Passage Text Difficulty in the Hive of Language Models
This white paper preview discusses the potential advantages and limitations of existing metrics for evaluating the reading difficulty of passage texts used in educational assessments. The authors stress the value of a potential unified model that integrates various metrics and methods as the educational assessment community begins exploring automated text generation.
- Research First Look: Can Large Language Models Transform Automated Scoring Further?
This working paper preview, produced through the new AI topic area of the 2024 NAEP Doctoral Student Internship Program, shares insights from a literature review on the use of large language models for automated scoring of constructed-response items.
- New From the NAEP Validity Studies Panel: Study of Changes in Public School Composition During the Pandemic and Their Relationship With NAEP Performance
This NAEP Validity Studies Panel study explores possible context for the major declines in NAEP grade 8 mathematics achievement between 2019 and 2022. The authors focus on two primary goals: “(a) to examine changes in the demographic composition of students among eighth graders in U.S. public schools between 2019 and 2022 and (b) to estimate the extent to which the compositional changes relate to NAEP eighth-grade mathematics achievement scores in the same time period.”
- New Working Paper: Linking Sentiment to Student Test-Taking Behaviors
This full-text working paper investigates how test-taking behaviors are affected by the perceived sentiment of assessment questions, as well as how AI-powered sentiment analysis can be used to analyze survey items and responses like those from the NAEP survey questionnaires.
- New From the NAEP Validity Studies Panel: An Exploration of Opportunity to Learn and Implications for NAEP
This NAEP Validity Studies Panel white paper discusses the concept of “opportunity to learn” and its targeting and measurement in NAEP survey questionnaires. The authors examine how environments for student learning inside and outside the classroom can be used to better understand student achievement, as well as how survey items, data-gathering strategies, and special studies could better capture this important context.
Research Tools
- EdSurvey 4.0.7 Released With New Assessment Support and Rounding Features
In this post we highlight the most recent update to EdSurvey, an R statistical package developed by the American Institutes for Research (AIR) on behalf of the National Center for Education Statistics (NCES) that provides free, easy-to-use tools for downloading, processing, manipulating, and analyzing NCES data, including NAEP. This update enables EdSurvey to support analysis of more recent data from NAEP and numerous other NCES studies and assessments, and it adds a new rounding feature that can follow NCES statistical standards or user-defined criteria.
Conferences and Presentations
Take a look at some of the posts from this year highlighting conference presentations from NAEP researchers and NAEP Doctoral Internship Program alumni! Presentations covered a range of topics, from algorithmic bias and automated passage text generation to new applications of an R package that implements Bayesian historical borrowing in large-scale assessments.
We’re also excited to share the recent success of the Natural Language Processing (NLP) and Artificial Intelligence (AI) training session at the 2024 APPAM conference just last month. This session, co-led by NAEP R&D researchers Ruhan Circi and Bhashithe Abeysinghe, brought together a dynamic group of researchers eager to explore the power of NLP and AI in their fields. The training provided participants with a hands-on introduction to key NLP concepts, including working with unstructured text data, feature engineering, and advanced modeling techniques like BERTopic and GPT.

Attendees participated in breakout activities to explore how these tools could enhance their research and gained practical insights into leveraging NLP for text analysis across diverse domains, such as user reviews, public policy documents, and other text-rich datasets. The session’s format encouraged rich discussions and idea sharing, highlighting the potential for NLP and AI to address real-world challenges in innovative ways. Several participants noted how these tools could transform their approach to analyzing complex datasets, making both internal data use and research more efficient and impactful.
Subscribe to our mailing list to keep up with the conference world as it relates to NAEP research and to hear about opportunities that could bring your research to the conference floor! Applications for the Summer 2025 NAEP Doctoral Student Internship Program are now open; stay tuned for more updates and information.