R&D Hub

New Research Paper on Advances and Challenges in Evaluating LLM-Based Applications

Published On Friday, June 7, 2024

This month, we are excited to share the latest research from the NAEP R&D community, focusing on the evaluation of large language models (LLMs). “The Challenges of Evaluating LLM Applications: An Analysis of Automated, Human, and LLM-Based Approaches” will be presented at the LLM4Eval workshop at the 2024 Association for Computing Machinery Special Interest Group on Information Retrieval (ACM SIGIR) conference on July 18, 2024. This paper from AIR researchers Bhashithe Abeysinghe and Ruhan Circi explores innovative approaches to evaluating custom AI applications, addressing a crucial barrier to faster progress in generative AI.

New Working Paper: Evaluating the Surge in Chatbot Development

Published On Friday, February 23, 2024

Explore the latest insights in research and development from NAEP researchers in the R&D program. Given the current buzz around AI, this month we’re excited to share a working paper on the evaluation of chatbots by Ruhan Circi and Bhashithe Abeysinghe, NAEP researchers from the American Institutes for Research (AIR). The full text of the working paper is included below. Don’t miss out on these valuable perspectives – subscribe now to stay ahead in technology and innovation!

New Focus on NAEP: “Exploring Process Data in TEL”

Published On Friday, February 9, 2024

Those interested in process data in digitally based assessments (DBAs) will want to check out the newest addition to the Focus on NAEP series: “Exploring Process Data in TEL.” This examination of process data from the 2014 Technology and Engineering Literacy (TEL) assessment explores how data collected from digitally based assessments can be used to investigate digital familiarity and efficiency.

EdSurvey Team Publishes Article on NAEP Analysis Using Dire

Published On Friday, September 15, 2023

Paul Bailey and Blue Webb, members of the EdSurvey team, recently published an article, “Expanding NAEP and TIMSS Analysis to Include Additional Variables or a New Scoring Model Using the R Package Dire.” Featured in the journal Psych in its special issue on computational psychometrics, this article showcases the new Dire software and its utility for ushering in a new era of data use at the National Center for Education Statistics (NCES).

The Summer 2024 NAEP Data Training Workshop - Applications Open

04-12-2024

Applications are now open for the summer 2024 NAEP Data Training Workshop! This workshop is for quantitative researchers with strong statistical skills who are interested in conducting data analyses using NAEP data. For the first time, participants in this year's training will get an introduction to COVID data collections. Learn more here!

EdSurvey e-book now available!

02-14-2022

Analyzing NCES Data Using EdSurvey: A User's Guide is now available online for input from the research community. Check it out and give the team your feedback.
