
CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure

  • East China Normal University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Code pre-trained models (CodePTMs) have recently demonstrated significant success in code intelligence. Several probing methods have been applied to interpret these models, but they fail to consider the inherent characteristics of code. To address this problem, we propose a novel probing method, CAT-probing, to quantitatively interpret how CodePTMs attend to code structure. We first denoise the input code sequences based on the token types pre-defined by the compilers, filtering out tokens whose attention scores are too small. We then define a new metric, CAT-score, to measure the commonality between the token-level attention scores generated in CodePTMs and the pair-wise distances between the corresponding AST nodes. The higher the CAT-score, the stronger the ability of a CodePTM to capture code structure. We conduct extensive experiments that integrate CAT-probing with representative CodePTMs for different programming languages. Experimental results show the effectiveness of CAT-probing in CodePTM interpretation. Our code and data are publicly available at https://github.com/nchen909/CodeAttention.
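The abstract does not spell out how the CAT-score is computed, so the following is only a minimal sketch of one way to measure agreement between an attention matrix and an AST distance matrix. The function name `agreement_score`, the thresholds `theta_a` and `theta_d`, the Jaccard-style ratio, the choice to skip self-pairs, and the toy matrices are all assumptions made for illustration; they are not taken from the paper or its repository.

```python
def agreement_score(attention, distance, theta_a, theta_d):
    """Illustrative agreement between attention and AST distance.

    Hypothetical helper, not the authors' implementation. A token pair
    (i, j) "matches" when its attention is high (> theta_a) AND its AST
    distance is small (< theta_d). The score is the number of matching
    pairs divided by the number of pairs where at least one of the two
    conditions holds.
    """
    both = 0
    either = 0
    n = len(attention)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # assumption: ignore a token's relation to itself
            high_attention = attention[i][j] > theta_a
            close_in_ast = distance[i][j] < theta_d
            both += int(high_attention and close_in_ast)
            either += int(high_attention or close_in_ast)
    return both / either if either else 0.0


# Toy 3-token example (made-up numbers, for illustration only).
attention = [
    [0.0, 0.6, 0.1],
    [0.5, 0.0, 0.4],
    [0.1, 0.2, 0.0],
]
distance = [
    [0, 2, 4],
    [2, 0, 2],
    [4, 2, 0],
]
score = agreement_score(attention, distance, theta_a=0.3, theta_d=3)
print(score)
```

Under these assumptions, a score near 1 means that high-attention token pairs and structurally close AST node pairs largely coincide, which matches the abstract's reading that a higher score indicates a stronger ability to capture code structure.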

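Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: EMNLP 2022
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Publisher: Association for Computational Linguistics (ACL)
Pages: 4029-4037
Number of pages: 9
ISBN (electronic): 9781959429432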
DOI
Publication status: Published - 2022
Event: 2022 Findings of the Association for Computational Linguistics: EMNLP 2022 - Hybrid, Abu Dhabi, United Arab Emirates
Duration: 7 Dec 2022 to 11 Dec 2022

Publication series

Name: Findings of the Association for Computational Linguistics: EMNLP 2022

Conference

Conference: 2022 Findings of the Association for Computational Linguistics: EMNLP 2022
Country/Territory: United Arab Emirates
Location: Hybrid, Abu Dhabi
Period: 7/12/22 to 11/12/22
