
Two papers were accepted to NAACL 2021

Papers by Nadezhda Chirkova and Sergey Troshin have been accepted to the program of NAACL, one of the leading conferences on natural language processing.

An illustration of the proposed method for modeling the semantics of variables in a program

Nadezhda Chirkova

The final versions of the papers and the source code will be released soon. The research was carried out using the computational resources of the HSE Supercomputer Modeling Unit.

Both papers aim to improve the quality of deep learning models for source code by exploiting the specific properties of variables and identifiers. The first paper proposes a recurrent architecture that explicitly models the semantic meaning of each variable in a program. The second paper proposes a simple method for preprocessing rarely used identifiers so that a neural network (in particular, the Transformer architecture) can better recognize patterns in the program. The proposed methods were shown to significantly improve quality on two tasks: code completion and variable misuse detection.
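To give a feel for what preprocessing rarely used identifiers might look like, here is a minimal Python sketch. It is only one plausible reading of the idea and not the authors' released code: it assumes that identifiers seen fewer than a threshold number of times in the training corpus are replaced by per-program placeholders (`VAR1`, `VAR2`, ...) assigned in order of first occurrence. The token format, the function names `build_vocab` and `anonymize_rare_identifiers`, and the `min_count` parameter are all hypothetical and chosen for illustration.

```python
from collections import Counter


def build_vocab(programs, min_count=2):
    """Return the set of identifiers seen at least `min_count` times in the corpus.

    Each program is a list of (kind, text) pairs; kind == "id" marks an identifier.
    This token format is an assumption made for the sketch.
    """
    counts = Counter(
        text for program in programs for kind, text in program if kind == "id"
    )
    return {name for name, count in counts.items() if count >= min_count}


def anonymize_rare_identifiers(program, vocab):
    """Replace every out-of-vocabulary identifier with a per-program placeholder.

    The same rare identifier always maps to the same placeholder within one
    program, so the model can still see where a variable is reused.
    """
    placeholders = {}
    result = []
    for kind, text in program:
        if kind == "id" and text not in vocab:
            if text not in placeholders:
                placeholders[text] = f"VAR{len(placeholders) + 1}"
            text = placeholders[text]
        result.append(text)
    return result


# Toy corpus: "i" is frequent, "my_counter" and "tmp_buf" are rare.
corpus = [
    [("id", "i"), ("op", "="), ("id", "my_counter"), ("op", "+"), ("id", "i")],
    [("id", "i"), ("op", "="), ("id", "tmp_buf"), ("op", "+"), ("id", "tmp_buf")],
]
vocab = build_vocab(corpus, min_count=3)
anonymized = [anonymize_rare_identifiers(program, vocab) for program in corpus]
```

In this toy corpus only `i` reaches the threshold, so `my_counter` and `tmp_buf` each become `VAR1` in their own program, while repeated uses of the same rare name keep the same placeholder. The intent of such a scheme is that the network sees a small, reusable set of placeholder tokens instead of many names it has almost never encountered.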