On June 1, 2022, I had the honor of giving a lightning talk about my research on the challenges of scaling programming-based behavioral metrics at the opening reception for the 2022 ACM Conference on Learning at Scale in New York City.
What Are Behavioral Metrics?
Behavioral metrics are scores generated from a student’s programming data that help instructors understand student process and performance. They are designed to quantify how students solve coding problems and how they handle compiler and runtime errors, providing a deeper look into student work.
The programming data that these metrics use can include keystroke data, i.e., the actual work done by the student step by step, and event data, i.e., a student’s compile/run attempts as well as submission attempts.
The three metrics we chose to replicate in this study are the Error Quotient (EQ), the Watwin score, and the Repeated Error Density (RED) score. EQ quantifies how much a student struggles with errors that appear in their code; the Watwin score measures how a student responds to errors compared with their peers, in terms of the time spent resolving mistakes; and RED quantifies repeated errors within a student’s programming attempts.
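To make these ideas concrete, here is a minimal Python sketch of simplified EQ-style and RED-style scores computed over a sequence of compile events. The pairwise weights and the run-based formula below are illustrative assumptions for demonstration, not the exact published definitions of EQ or RED.

```python
# Simplified, illustrative versions of two error-based behavioral metrics.
# Weights and formulas are assumptions, not the published definitions.

def error_quotient(events):
    """Simplified EQ sketch: score consecutive pairs of compile events.

    Each event is a dict like {"error": bool, "error_type": str | None}.
    A pair of consecutive failing compiles scores higher when the same
    error type repeats; the total is normalized to [0, 1].
    """
    if len(events) < 2:
        return 0.0
    total = 0
    for prev, curr in zip(events, events[1:]):
        score = 0
        if prev["error"] and curr["error"]:
            score += 2  # both compiles failed (illustrative weight)
            if prev["error_type"] == curr["error_type"]:
                score += 3  # same error repeated (illustrative weight)
        total += score
    max_per_pair = 5  # maximum score a single pair can earn above
    return total / (max_per_pair * (len(events) - 1))


def repeated_error_density(error_types):
    """Simplified RED sketch: reward long runs of the same error type.

    `error_types` is the sequence of error types from failing compiles.
    Each run of n consecutive repeats contributes n^2 / (n + 1)
    (formula assumed for illustration), so longer repeat runs
    dominate the score.
    """
    density, run = 0.0, 0
    for i, err in enumerate(error_types):
        if i > 0 and err == error_types[i - 1]:
            run += 1  # extend the current repeat run
        else:
            density += run**2 / (run + 1)  # close the previous run
            run = 0
    return density + run**2 / (run + 1)
```

For example, two identical consecutive syntax errors followed by a clean compile would yield an EQ of 0.5 under these weights, while a run of three identical error types contributes 4/3 to the RED-style score.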
Each of these metrics highlights a different aspect of a student’s programming process (how they arrived at the final solution they submitted) and can be a significant indicator of student performance, something that grades alone may not capture.
This research replicates a finding from previous work: these metrics do not retain their predictive power on grades. We tested this on anonymized data collected from thousands of learners across the globe who use the Codio platform, and we discuss the context-dependent nature of both the behavioral metrics and grades, noted in previous work, as a potential reason for this finding.
We also found that a student’s behavioral metrics vary considerably more than their grades across assignments. The variation in process-based behavioral metrics could be due to students struggling, an aspect not always captured in grades. This suggests a promising future direction for programming-based behavioral metrics at scale.
One such direction is instructor-facing dashboards that identify struggling students. Because these behavioral metrics can be surfaced in near real time, they can serve as an early indicator of students who are struggling.
At Codio, we believe in bridging the gap between cutting-edge academic research and its application in industry. To that end, we plan to act on our proposal from this research project and make these metrics available to instructors who use the Codio platform as part of our Learning Insights ecosystem.
These metrics would bring in a new wave of student process-focused Learning Insights, allowing instructors to take a temperature check of their students in real time and at scale, making early intervention not just possible but practical.
“Challenges of Scaling Programming-Based Behavioral Metrics” has been accepted as a Work-in-Progress paper at L@S ’22 and was presented during the poster session on June 2, 2022, at the Verizon Executive Education Center on Cornell Tech’s Roosevelt Island campus.