The Laboratory's area of expertise is unstructured data analysis. We study recommender systems and services and develop methods for multimodal clustering and classification that allow user interests to be profiled across various modalities. We do not treat data mining and machine learning models as black boxes; instead, we focus on developing interpretable algorithms.
We work in natural language processing (NLP). Our research focuses on areas such as question answering and information extraction. The Laboratory examines learning methods, in particular transfer learning and domain adaptation techniques in multilingual settings, and applies these methods in practice. The Laboratory advances digital Russian studies by creating new annotated data sources that represent societal changes and complex phenomena such as the digitalization of education and the economy.
The annual Conference on Empirical Methods in Natural Language Processing (EMNLP) takes place on November 7-11, 2021.
More information about the corpus and the types of relations and entities can be found in the repository of the competition on RuREBus data, which we held at the Dialog 2020 conference.