Two papers were accepted to NAACL 2021

Papers by Nadezhda Chirkova and Sergey Troshin have been accepted to the program of NAACL, one of the leading conferences on natural language processing.

An illustration of the proposed method for modeling the semantics of variables in a program

Nadezhda Chirkova

Two papers were accepted to the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2021).

The final versions of the papers and the source code will be released soon. The research was carried out using the computational resources of the HSE Supercomputer Modeling Unit.

Both papers aim to improve the quality of deep learning models for source code by exploiting the specific properties of variables and identifiers. The first paper proposes a recurrent architecture that explicitly models the semantic meaning of each variable in a program. The second paper proposes a simple method for preprocessing rarely used identifiers so that a neural network (in particular, a Transformer) can better recognize patterns in the program. Both methods were shown to significantly improve quality on two tasks: code completion and variable misuse detection.
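To give a sense of what preprocessing rarely used identifiers could look like, here is a minimal sketch. It is an illustration under stated assumptions, not the method from the paper: it assumes that "rare" means an identifier seen fewer than `min_count` times in a training corpus, and that each rare identifier is replaced by a per-program placeholder such as `VAR1`, `VAR2`. The function names, the placeholder format, and the threshold are all hypothetical.

```python
import io
import keyword
import tokenize
from collections import Counter

# Token types kept in the output sequence (layout tokens are dropped).
KEPT_TYPES = {tokenize.NAME, tokenize.OP, tokenize.NUMBER, tokenize.STRING}


def code_tokens(source):
    """Yield (is_identifier, text) pairs for the kept tokens of a Python snippet."""
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in KEPT_TYPES:
            is_ident = tok.type == tokenize.NAME and not keyword.iskeyword(tok.string)
            yield is_ident, tok.string


def build_vocab(corpus, min_count=2):
    """Return the identifiers that occur at least min_count times in the corpus."""
    counts = Counter(
        text for src in corpus for is_ident, text in code_tokens(src) if is_ident
    )
    return {name for name, n in counts.items() if n >= min_count}


def anonymize_rare(source, vocab):
    """Replace identifiers outside vocab with placeholders VAR1, VAR2, ...

    The same rare identifier always maps to the same placeholder within one
    program, so repeated uses of a variable stay visible to the model.
    """
    mapping = {}
    out = []
    for is_ident, text in code_tokens(source):
        if is_ident and text not in vocab:
            if text not in mapping:
                mapping[text] = f"VAR{len(mapping) + 1}"
            text = mapping[text]
        out.append(text)
    return out


corpus = ["total = total + x", "total = count + 1", "count = count + 1"]
vocab = build_vocab(corpus, min_count=2)
tokens = anonymize_rare("total = total + foo * foo - bar", vocab)
```

In this toy corpus `total` and `count` are frequent enough to keep, while `foo` and `bar` are replaced. Both occurrences of `foo` receive the same placeholder, so the information that one variable is used twice is preserved even though its name is not.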

Date

11 March 2021

Topics

Research & Expertise

Keywords

publications, automatic source code analysis, deep learning

About

Centre of Deep Learning and Bayesian Methods

About persons

Sergey Troshin

Nadezhda Chirkova