A novel self-attention based automatic code completion neural network

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Code completion is one branch of source code modeling. Implementing it with deep learning has opened up the possibility of modeling source code with a statistical language model. The Recurrent Neural Network (RNN) is a universal feature extractor in Natural Language Processing (NLP) and is commonly used in the code completion field. However, RNN-based models lack long-range context dependency and train slowly. In addition, some previous models have not handled the out-of-vocabulary (OOV) problem well, which hinders further improvements in prediction accuracy. This paper presents a novel automatic code completion neural network based on a self-attention mechanism with an open vocabulary, addressing the issues of OOV, slow training, and limited long-range context dependency. Experiments in this paper show that our model predicts tokens more accurately than the traditional N-gram model and an RNN-based model, while reducing training time significantly. More broadly, the combination of self-attention and an open vocabulary has potential applications across the source code modeling field.
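As an illustration of the two ingredients the abstract names, the sketch below pairs a toy scaled dot-product self-attention step (every token position attends to every other position in one matrix operation, giving long-range context without recurrence) with a greedy longest-match subword split (an open-vocabulary scheme in which an out-of-vocabulary identifier is decomposed into known pieces instead of mapping to a single unknown token). The function names and the tiny vocabulary are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def self_attention(X):
    # Scaled dot-product self-attention (single head, no learned
    # projections, for brevity): each position mixes in information
    # from every other position in one step, which is how attention
    # captures long-range context that RNNs struggle to carry.
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                    # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ X                               # context-mixed vectors

def subword_tokenize(token, vocab):
    # Greedy longest-match subword split: an unseen identifier such as
    # "getFileName" is decomposed into in-vocabulary pieces, so the model
    # rarely has to emit a catch-all unknown symbol.
    pieces, i = [], 0
    while i < len(token):
        for j in range(len(token), i, -1):
            if token[i:j] in vocab:
                pieces.append(token[i:j])
                i = j
                break
        else:                      # no piece matched: fall back per character
            pieces.append("<UNK>")
            i += 1
    return pieces
```

The attention step is fully parallel across positions (one matrix product), which is the source of the training-speed advantage over sequential RNN updates that the abstract refers to.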

Original language: English
Title of host publication: SEKE 2020 - Proceedings of the 32nd International Conference on Software Engineering and Knowledge Engineering
Publisher: Knowledge Systems Institute Graduate School
Pages: 386-391
Number of pages: 6
ISBN (Electronic): 1891706500
State: Published - 2020
Event: 32nd International Conference on Software Engineering and Knowledge Engineering, SEKE 2020 - Pittsburgh, Virtual, United States
Duration: 9 Jul 2020 → 19 Jul 2020

Publication series

Name: Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE
Volume: PartF162440
ISSN (Print): 2325-9000
ISSN (Electronic): 2325-9086

Conference

Conference: 32nd International Conference on Software Engineering and Knowledge Engineering, SEKE 2020
Country/Territory: United States
City: Pittsburgh, Virtual
Period: 9/07/20 → 19/07/20

Keywords

  • Code Completion
  • Open Vocabulary
  • Self-Attention
  • Source Code Modeling

