Clustering in monitored hands-on exercises to find common paths and outliers (2024)

Undergraduate: Zeqi Zhou


Faculty Advisor: Prasun Dewan
Department: Computer Science


Interest in computer science has been growing, leading to increased enrollment in computer science courses. Hands-on coding exercises are a key component of these courses, allowing instructors to evaluate student understanding and track progress. Code clustering is an effective method for grading, providing collective feedback, and identifying common errors or unique solutions. Despite its benefits, existing research on code clustering has limitations, including subjective evaluations and a lack of focus on outlier detection. Our study assesses the accuracy and performance of three clustering methods and ChatGPT-4 using a newly created labeled dataset. In addition, we propose a new outlier detection algorithm, Weight-based Agglomerative Clustering (WAC), designed to identify unique or erroneous code solutions, and we compare its accuracy and performance with those of existing outlier detection methods. Our results provide a clearer understanding of the effectiveness of the evaluated clustering methods. Moreover, the evaluation of our outlier detection algorithm indicates that it outperforms the existing methods.