NAEP’s primary purpose is to measure student achievement in different subject areas and to collect contextual data for the reporting and interpretation of assessment results. Because students do not receive individual NAEP scores, however, the assessment is considered “low-stakes” for them, and there are long-standing questions about their level of engagement and effort. For assessment results to be valid, students must be sufficiently engaged with the assessment tasks to demonstrate their knowledge and skills.
Improving assessment so that it accurately reflects students’ knowledge and abilities is a critical and ever-evolving endeavor. This study contributes to the ongoing exploration of innovations designed to improve engagement in assessments. As our understanding of the factors influencing engagement expands, so does our ability to shape those factors in ways that improve the validity of assessments and the degree to which all students’ abilities are accurately represented.
In this study, we investigated the effects of engagement-enhancing features (EEFs) on the performance of 8th-graders on digitally based assessments. The goal of implementing EEFs is to reduce the likelihood that assessment scores underestimate students’ knowledge and skills.
During the first stage of the study, the research team identified low-engagement interactive computer tasks (ICTs) among tasks developed for the 2015 8th-grade NAEP science assessment but dropped after the pilot test because of item performance issues. Low-engagement items were identified from events observed in the process data, including missing responses, correctness of responses, and time spent on each item and on the entire task. One task was selected as the candidate for adding EEFs.
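As a rough illustration of this kind of process-data screening, the sketch below flags items with high omission rates, low accuracy, or very short response times. The file name, column names, and thresholds are hypothetical and are not drawn from the study or from NAEP’s actual process-data schema.

```python
import pandas as pd

# Hypothetical process-data log: one row per student-item interaction.
# Columns (student_id, item_id, response, correct, time_on_item_sec) are illustrative.
events = pd.read_csv("process_data.csv")

# Aggregate per item: omission rate, proportion correct, and median time on item.
item_stats = events.groupby("item_id").agg(
    omit_rate=("response", lambda r: r.isna().mean()),
    pct_correct=("correct", "mean"),
    median_time_sec=("time_on_item_sec", "median"),
)

# Illustrative thresholds for flagging possible low engagement:
# many omitted responses, low accuracy, or very rapid responding.
flags = (
    (item_stats["omit_rate"] > 0.20)
    | (item_stats["pct_correct"] < 0.30)
    | (item_stats["median_time_sec"] < 5)
)
low_engagement_items = item_stats[flags]
print(low_engagement_items.sort_values("omit_rate", ascending=False))
```

In practice, item-level flags like these would be combined with task-level timing and reviewed qualitatively before selecting a candidate task for augmentation.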
In the second stage of the study, the selected task was augmented with engagement-enhancing features. The EEFs centered on three primary design approaches: social contextualization of the task scenario, reduction of cognitive load, and inclusion of metacognition opportunities.
Items were augmented with more than one feature if needed. The team then conducted virtual cognitive labs with more than 80 students, each assigned to one of three groups. The first group experienced the original task followed by the version augmented with EEFs. The second group experienced the same tasks in reverse order, and the third group, the control group, experienced only the augmented task. The team analyzed the impact of the EEFs on student performance and preferences, focusing on high- and low-performing students across the groups.
The ACT study found that students preferred the engagement-enhancing features and associated them with improved engagement. The results also show that these EEFs improve performance, perhaps through improved engagement. Most importantly, and unexpectedly, we found that while the EEFs improve engagement and performance, they might do so disproportionately in support of lower-performing students. This indicates the features’ potential importance for improving the equity, as well as the validity, of assessments.
Our findings indicate that reducing cognitive load, by changing item types, removing dependencies, breaking grouped items into separate items, and modifying the mode of data generation, had a positive impact on student performance and on students’ preferences, making the task seem more engaging and achievable.
We did not see a direct impact of either the social contextualization or the metacognition-based enhancements on student performance. However, social contextualization had a positive impact on students’ preferences, making the task seem more engaging, collaborative, realistic, and achievable, and the metacognition-based enhancements had a positive impact on students’ preferences by improving understanding and making the task seem more engaging and achievable.
These findings are rooted in, and illustrative of, NAEP’s deep commitment to innovation and the improvement of assessment. More broadly, they provide insights that can be applied to assessment design in general, and they are particularly valuable for assessments that students might perceive as low-stakes. ACT recently shared a high-level overview of these findings at the Beyond Multiple Choice Conference, and the findings will also be shared in a session at the 2022 American Educational Research Association (AERA) conference focusing on the impact of item, test, and test-taker characteristics on engagement during NAEP assessments. ACT is excited to share these findings and to help improve engagement as a rising tide that lifts all boats.
Team: This study was a highly collaborative effort among ACT, NCES, AIR, and ETS featuring contributions from the following team: Kristin Stoeffler (project director, assessment design and cognitive lab lead, ACT); Dr. Yigal Rosen (formerly principal investigator, ACT); Dr. Yan Wang (AIR); Dr. Emmanuel Sikali (NCES); Dr. Sweet San Pedro (phase 2 analysis, ACT); Dr. Joyce Schnieders (phase 2 analysis, ACT); a team of very talented research scientists and cognitive lab facilitators at ACT; Dr. Vanessa Simmering (formerly co-principal investigator – phase 1, ACT); Dr. Benjamin Deonovic (formerly data analysis – phase 1, ACT); Dr. Michael Yudelson (formerly data analysis – phase 1, ACT); Laurel Ozersky (formerly assessment design – phase 1, ACT).