
Our Lab held the RuREBus shared task



Named entity recognition (NER) is a well-studied task with plenty of annotated data, on which state-of-the-art models achieve high quality. Matching those results in business settings is often difficult, however: documents and entities are domain-specific, and the text is either written in formal bureaucratic language (e.g., business documents) or, conversely, in colloquial language (e.g., chatbot dialogs). In addition, it may be useful to extract not only entities but also the relations between them, and less annotated data is available for that task.


We present the RuREBus (Russian Relation Extraction for Business) corpus: strategic planning documents of the Ministry of Economic Development of the Russian Federation, annotated with entities and relations. More information about the corpus and the types of entities and relations can be found in the repository of the shared task, which we held at the Dialog 2020 conference on RuREBus data.


We also conducted research on the resulting corpus. The results are presented in our article “So, what is the plan? Mining Strategic Planning Documents” for the Digital Transformation and Global Society conference (DTGS 2020).



For more details see the paper: 

Vitaly Ivanin, Ekaterina Artemova, Tatiana Batura, Vladimir Ivanov, Veronika Sarkisyan, Elena Tutubalina, Ivan Smurov. “RuREBus-2020 Shared Task: Russian Relation Extraction for Business”. Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference “Dialog” [Komp’iuternaia Lingvistika i Intellektual’nye Tehnologii: Trudy Mezhdunarodnoj Konferentsii “Dialog”], 2020, Moscow, Russia.