TY - GEN
T1 - Automatic Grading of Student Code with Similarity Measurement
AU - Wang, Dongxia
AU - Zhang, En
AU - Lu, Xuesong
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - Nowadays, online judges are extensively used for automatically grading student code. However, they grade code by only counting the number of passed test cases, which is not fair for assessing the overall quality of a code snippet. On the other hand, existing studies have used machine learning techniques for code grading. However, they usually require large amounts of labeled code to enable supervised learning and heavily rely on feature engineering. In this work, we design SimGrader, a code grading system that grades student code based on the measurement of similarity to the “good” code, and thus saves the effort of code labeling. We extract three types of features to capture the overall quality of a code snippet, and design specific methods to enhance the feature discrimination, which facilitates the similarity measurement. We conduct extensive experiments to show the superiority of SimGrader over existing methods and justify the effect of its system components. We deploy SimGrader to grade the student code submitted in an introductory programming course.
AB - Nowadays, online judges are extensively used for automatically grading student code. However, they grade code by only counting the number of passed test cases, which is not fair for assessing the overall quality of a code snippet. On the other hand, existing studies have used machine learning techniques for code grading. However, they usually require large amounts of labeled code to enable supervised learning and heavily rely on feature engineering. In this work, we design SimGrader, a code grading system that grades student code based on the measurement of similarity to the “good” code, and thus saves the effort of code labeling. We extract three types of features to capture the overall quality of a code snippet, and design specific methods to enhance the feature discrimination, which facilitates the similarity measurement. We conduct extensive experiments to show the superiority of SimGrader over existing methods and justify the effect of its system components. We deploy SimGrader to grade the student code submitted in an introductory programming course.
KW - Code grading
KW - Contrastive learning
KW - Discriminative feature
KW - Tree edit distance
UR - https://www.scopus.com/pages/publications/85150938793
U2 - 10.1007/978-3-031-26422-1_18
DO - 10.1007/978-3-031-26422-1_18
M3 - Conference contribution
AN - SCOPUS:85150938793
SN - 9783031264214
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 286
EP - 301
BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2022, Proceedings
A2 - Amini, Massih-Reza
A2 - Canu, Stéphane
A2 - Fischer, Asja
A2 - Guns, Tias
A2 - Kralj Novak, Petra
A2 - Tsoumakas, Grigorios
PB - Springer Science and Business Media Deutschland GmbH
T2 - 22nd Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2022
Y2 - 19 September 2022 through 23 September 2022
ER -