MMIT Seminar: Two talks by Professor Tristan Miller

First talk: "Adjusting sense representations for knowledge-based word sense disambiguation and automatic pun interpretation"
Second talk: "Introduction to CodaLab Competitions / LaTeX for NLP researchers"

Venue: Kochnovsky Proezd 3, room 317, 16:40

16:40  Title: Adjusting sense representations for knowledge-based word sense disambiguation and automatic pun interpretation

Speaker: Tristan Miller, Technische Universität Darmstadt (Germany)

Abstract: Word sense disambiguation (WSD) – the task of determining which meaning a word carries in a particular context – is a core research problem in computational linguistics.  Though it has long been recognized that supervised (i.e., machine learning–based) approaches to WSD can yield impressive results, they require an amount of manually annotated training data that is often too expensive or impractical to obtain.  This is a particular problem for under-resourced languages and text domains, and is also a hurdle in well-resourced languages when processing the sort of lexical-semantic anomalies employed for deliberate effect in humour and wordplay.  In contrast to supervised systems are knowledge-based techniques, which rely only on pre-existing lexical-semantic resources (LSRs) such as dictionaries and thesauri. These techniques are of more general applicability but tend to suffer from lower performance due to the informational gap between the target word's context and the sense descriptions provided by the LSR. In this seminar, we treat the task of extending the efficacy and applicability of knowledge-based WSD, both generally and for the particular case of English puns.  In the first part of the talk, we present two approaches for bridging the information gap and thereby improving WSD coverage and accuracy.  In the first approach, we supplement the word's context and the LSR's sense descriptions with entries from a distributional thesaurus.  The second approach enriches an LSR's sense information by aligning it to other, complementary LSRs. In the second part of the talk, we describe how these techniques, along with evaluation methodologies from traditional WSD, can be adapted for the "disambiguation" of puns, or rather for the automatic identification of their double meanings.
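
As a rough illustration of the information gap mentioned in the abstract, the sketch below implements a simplified Lesk-style overlap between a word's context and sense glosses, with optional expansion from a thesaurus. It is not the speaker's method: the sense inventory, the thesaurus entries, the stopword list, and all function names are hypothetical toy data invented for this example.

```python
# Toy sense inventory (hypothetical glosses, not taken from any real LSR).
SENSES = {
    "bank": {
        "bank/finance": "an institution that accepts deposits and lends money",
        "bank/river": "sloping land beside a body of water such as a river",
    }
}

# Toy distributional thesaurus (hypothetical related-word lists).
THESAURUS = {
    "loan": ["money", "lends", "credit"],
    "fishing": ["river", "water", "boat"],
}

STOPWORDS = {"a", "an", "the", "of", "and", "that", "as", "such",
             "for", "on", "we", "went", "asked"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def expand(words):
    """Add thesaurus neighbours of each word to the set."""
    expanded = set(words)
    for w in words:
        expanded.update(THESAURUS.get(w, []))
    return expanded


def disambiguate(word, context, use_expansion=False):
    """Return the sense whose gloss overlaps most with the context,
    or None when no sense shares any word with the context."""
    ctx = tokens(context) - {word}
    if use_expansion:
        ctx = expand(ctx)
    scores = {}
    for sense, gloss in SENSES[word].items():
        gloss_words = tokens(gloss)
        if use_expansion:
            gloss_words = expand(gloss_words)
        scores[sense] = len(ctx & gloss_words)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Without expansion the context and the glosses share no words.
print(disambiguate("bank", "she asked the bank for a loan"))
# With expansion, "loan" contributes "money" and "lends".
print(disambiguate("bank", "she asked the bank for a loan", use_expansion=True))
```

In this toy setup the plain overlap finds nothing in common between the context and either gloss, so no sense is returned; adding thesaurus neighbours lets the context reach a gloss. That is the kind of coverage gain the abstract refers to, shown here only on made-up data.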

18:10  Title: Introduction to CodaLab Competitions / LaTeX for NLP researchers

Abstract: This workshop will focus on tools that researchers and teachers in computer science and computational linguistics can use to evaluate and disseminate results.  The first half will introduce CodaLab Competitions, a platform for running comparative evaluations of data analytics software.  CodaLab Competitions can be used in the classroom to automate the evaluation of AI programming projects.  It can also be used by researchers to run collaborative or competitive tasks on shared data sets.  The second half of the workshop will cover LaTeX, the popular document preparation and typesetting system.  Topics covered will be of greatest interest to those conducting teaching and research in natural language processing, and will include overviews of packages for linguistic and multilingual typesetting, and for the preparation of slides, homework exercises, and exams.

Tristan’s bio:

Tristan Miller holds a doctorate in computer science from Technische Universität Darmstadt (Germany), where he is engaged as a Research Scientist in the Ubiquitous Knowledge Processing Lab.  He has previously held research and teaching appointments at the German Research Center for Artificial Intelligence (Germany), Griffith University (Australia), and the University of Toronto (Canada).  From 2008 to 2011 he worked as a language engineer and business analyst at InQuira, an enterprise knowledge management company subsequently acquired by Oracle. Dr. Miller's research interests lie mainly in natural language processing, and more specifically in computational lexical semantics. He has published on topics such as argumentation mining, word sense disambiguation, lexical substitution, and computational detection and interpretation of humour.  He is also an ardent science popularizer, serving as an advisory panel member or contributor to non-specialist linguistics publications such as Babel: The Language Magazine and Word Ways: The Journal of Recreational Linguistics.

Everyone is welcome. Admission is free for students, graduate students, faculty, and staff of the Higher School of Economics.

If you need a pass to enter the HSE University building, please e-mail lantropova@hse.ru.