Pavel Braslavski Gave a Talk at a Huawei Seminar

A senior research fellow at the Laboratory for Models and Methods of Computational Pragmatics presented his talk at the "NLP/ML/AI for Search-Engine Efficiency" seminar.


On December 16, 2022, Huawei held its seminar "NLP/ML/AI for Search-Engine Efficiency", where Pavel Braslavski, Senior Research Fellow at the Laboratory for Models and Methods of Computational Pragmatics, presented a talk titled "Cross-Lingual Adjustment of Contextual Word Representations for Zero-Shot Transfer".

Abstract: Large multilingual language models such as mBERT or XLM-R enable zero-shot cross-lingual transfer in various IR and NLP tasks. A cross-lingual adjustment of these models using a small parallel corpus or a bilingual dictionary can potentially further improve results. This is a more data-efficient method compared to training a machine-translation system or a multilingual model from scratch using only parallel data. In this study, we experiment with zero-shot transfer of English models to four typologically different languages (Spanish, Russian, Vietnamese, and Hindi) and four IR/NLP tasks (XSR, QA, NLI, and NER). We carry out a cross-lingual adjustment of an off-the-shelf mBERT model. Our study showed gains in NLI for four languages and improved NER, XSR, and cross-lingual QA results in three languages (though some cross-lingual QA gains were not statistically significant), while monolingual QA performance never improved and sometimes degraded. Analysis of distances between contextualized embeddings of related and unrelated words across languages showed that fine-tuning leads to “forgetting” some of the cross-lingual alignment information. Based on this observation, we further improved NLI performance using continual learning. The study contributes to a better understanding of the cross-lingual transfer capabilities of large multilingual language models and of the effectiveness of their cross-lingual adjustment across various tasks and languages.
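
To make the idea of "cross-lingual adjustment" more concrete, below is a minimal, hypothetical sketch (not the authors' released code) of how an off-the-shelf mBERT encoder could be nudged so that contextual embeddings of aligned word pairs from a small parallel corpus move closer together before the model is fine-tuned on English task data and evaluated zero-shot on the target language. The model name, the toy sentence pairs, and the cosine-distance loss are illustrative assumptions.

```python
# Hypothetical sketch of cross-lingual adjustment of mBERT on a toy parallel corpus.
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "bert-base-multilingual-cased"   # off-the-shelf mBERT
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Toy "parallel corpus": English/Spanish sentence pairs with one aligned word each.
# Real experiments would use thousands of pairs and proper word alignments.
pairs = [
    ("the cat sleeps on the sofa", "el gato duerme en el sofá", "cat", "gato"),
    ("the dog runs in the park",   "el perro corre en el parque", "dog", "perro"),
]

def word_embedding(sentence, word):
    """Mean-pool the contextual vectors of the sub-tokens belonging to `word`."""
    enc = tokenizer(sentence, return_tensors="pt")
    hidden = model(**enc).last_hidden_state[0]                # (seq_len, hidden)
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    # locate the sub-token span of the word inside the sentence
    for i in range(len(ids) - len(word_ids) + 1):
        if ids[i:i + len(word_ids)] == word_ids:
            return hidden[i:i + len(word_ids)].mean(dim=0)
    raise ValueError(f"{word!r} not found in {sentence!r}")

model.train()
for epoch in range(3):
    for en_sent, es_sent, en_word, es_word in pairs:
        src = word_embedding(en_sent, en_word)
        tgt = word_embedding(es_sent, es_word)
        # pull aligned words closer in the shared representation space
        loss = 1 - torch.nn.functional.cosine_similarity(src, tgt, dim=0)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# After this adjustment step, the encoder would be fine-tuned on English task
# data (e.g. NLI) and evaluated zero-shot on the target language.
```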