Seminars 2017

25.05.2017
Title: Adjusting sense representations for knowledge-based word sense disambiguation and automatic pun interpretation
Speaker: Tristan Miller, Technische Universität Darmstadt (Germany)
Place: Moscow, 3 Kochnovsky Proezd, room 317
Time: 16:40-18:40
Abstract: Word sense disambiguation (WSD) – the task of determining which meaning a word carries in a particular context – is a core research problem in computational linguistics.  Though it has long been recognized that supervised (i.e., machine learning–based) approaches to WSD can yield impressive results, they require an amount of manually annotated training data that is often too expensive or impractical to obtain.  This is a particular problem for under-resourced languages and text domains, and is also a hurdle in well-resourced languages when processing the sort of lexical-semantic anomalies employed for deliberate effect in humour and wordplay.  In contrast to supervised systems are knowledge-based techniques, which rely only on pre-existing lexical-semantic resources (LSRs) such as dictionaries and thesauri. These techniques are of more general applicability but tend to suffer from lower performance due to the informational gap between the target word's context and the sense descriptions provided by the LSR. In this seminar, we treat the task of extending the efficacy and applicability of knowledge-based WSD, both generally and for the particular case of English puns.  In the first part of the talk, we present two approaches for bridging the information gap and thereby improving WSD coverage and accuracy.  In the first approach, we supplement the word's context and the LSR's sense descriptions with entries from a distributional thesaurus.  The second approach enriches an LSR's sense information by aligning it to other, complementary LSRs. In the second part of the talk, we describe how these techniques, along with evaluation methodologies from traditional WSD, can be adapted for the "disambiguation" of puns, or rather for the automatic identification of their double meanings.
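
As a concrete illustration of the knowledge-based setting, here is a minimal sketch of the classic simplified Lesk algorithm, which picks the WordNet sense whose gloss overlaps most with the target word's context. It is a standard baseline, not the approach developed in the talk, and assumes NLTK with its WordNet data installed:

    # A minimal sketch of knowledge-based WSD via the simplified Lesk
    # algorithm: score each WordNet sense by the overlap between its gloss
    # and the surrounding context. Illustrative baseline only; assumes NLTK
    # with the WordNet corpus downloaded.
    from nltk.corpus import wordnet as wn

    def simplified_lesk(word, context_words):
        context = {w.lower() for w in context_words}
        best_sense, best_overlap = None, -1
        for sense in wn.synsets(word):
            # The sense description provided by the LSR: definition plus examples
            gloss = set(sense.definition().lower().split())
            for example in sense.examples():
                gloss |= set(example.lower().split())
            overlap = len(gloss & context)
            if overlap > best_overlap:
                best_sense, best_overlap = sense, overlap
        return best_sense

    sentence = "I deposited my money at the bank on the corner".split()
    print(simplified_lesk("bank", sentence))  # expect a financial-institution sense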


25.05.2017
Title: Introduction to CodaLab Competitions / LaTeX for NLP researchers
Speaker: Tristan Miller, Technische Universität Darmstadt (Germany)
Place: Moscow, 3 Kochnovsky Proezd, room 317
Time: 16:40-18:40
Abstract: This workshop will focus on tools that researchers and teachers in computer science and computational linguistics can use to evaluate and disseminate results. The first half will introduce CodaLab Competitions, a platform for running comparative evaluations of data analytics software. CodaLab Competitions can be used in the classroom to automate the evaluation of AI programming projects. It can also be used by researchers to run collaborative or competitive tasks on shared data sets. The second half of the workshop will cover LaTeX, the popular document preparation and typesetting system. Topics covered will be of greatest interest to those conducting teaching and research in natural language processing, and will include overviews of packages for linguistic and multilingual typesetting, and for the preparation of slides, homework exercises, and exams.
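
As a taste of the typesetting topics, the sketch below shows a hypothetical interlinear gloss set with the gb4e package, one linguistic-example package of the kind the workshop covers (the actual workshop materials may use different packages):

    \documentclass{article}
    \usepackage{gb4e}  % linguistic examples with interlinear glosses
    \begin{document}
    \begin{exe}
      \ex \gll Der Hund schläft\\
               the dog sleeps\\
      \glt `The dog is sleeping.'
    \end{exe}
    \end{document}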

22.05.2017
Title: Flow-networks: a graph-theoretical approach to studying flow systems
Speaker: Liubov Tupikina, Ecole Polytechnique (Paris, France)
Place: Moscow, 3 Kochnovsky Proezd, room 317
Time: 16:40-18:10
Abstract: Complex network theory provides an elegant and powerful framework to statistically investigate different types of systems such as society, the brain, or the structure of local and long-range dynamical interrelationships in the climate system. Links in correlation networks, so-called climate networks, typically imply information, mass or energy exchange. However, the specific connection between oceanic or atmospheric flows and the climate network's structure is still unclear. We propose a theoretical flow-network approach for verifying relations between the correlation matrix and the flow structure, generalizing previous studies and overcoming the restriction to stationary flows [1]. We study the complex interrelation between the velocity field and correlation network measures. Our methods are developed for correlations of a scalar quantity (temperature, for example) which satisfies advection-diffusion dynamics in the presence of forcing and dissipation. Our approach reveals the insensitivity of correlation networks to steady sources and sinks and the profound impact of the signal decay rate on the network topology. We illustrate our results with calculations of degree and clustering for a meandering flow resembling a geophysical ocean jet. Moreover, we discuss follow-up approaches and applications of the flow-network method [2].

[1] "Correlation networks from flows. The case of forced and time-dependent advectiondiffusion dynamics" L.Tupikina, N.Molkenthin, C.Lopez, E.Hernandes-Garcia, N.Marwan, J.Kurths, Plos One. 2016

[2] "A geometric perspective on spatially embedded networks. Quantification of edge anisotropy and application to flow networks", H.Kutza, N.Molkenthin, L.Tupikina, J.Donges, N.Marwan, U.Feudel, J.Kurths, R.Donner, Chaos, 2016

22.05.2017
Title: Natural language processing with UIMA and DKPro
Speaker: Tristan Miller, Technische Universität Darmstadt (Germany)
Place: Moscow, 3 Kochnovsky Proezd, room 317
Time: 18:10-19:40
Abstract: This talk introduces UIMA (Unstructured Information Management Architecture), an industry-standard software architecture for content analytics.  UIMA provides extensible data, component, and process models for annotating, exchanging, and analyzing unstructured data such as natural-language text.  We also introduce DKPro, a family of ready-to-use natural language processing (NLP) components built on UIMA. Using UIMA and DKPro, students and researchers can rapidly develop and deploy experimental text processing pipelines.  In a classroom setting, these tools are valuable because they significantly reduce the barriers to entry for learning and applying advanced NLP techniques.  Using DKPro, students can start projects in text classification, discourse analysis, etc., without needing to spend time implementing lower-level NLP tasks such as morphological analysis, word sense disambiguation, or text similarity.  In a graduate-level research setting, UIMA and DKPro facilitate conducting experiments in a fully reproducible manner.  The talk will provide a tutorial-style overview of both frameworks, including code snippets and sample applications.
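
To give a flavour of working with UIMA annotations, here is a minimal sketch that reads a UIMA CAS in Python with the dkpro-cassis library; the file names and the type constant are placeholder assumptions, and the talk itself covers the Java-based UIMA/DKPro stack:

    # A minimal sketch using dkpro-cassis, a Python library for UIMA CAS data.
    # The file names and annotation type are hypothetical placeholders.
    from cassis import load_typesystem, load_cas_from_xmi

    with open("TypeSystem.xml", "rb") as f:
        typesystem = load_typesystem(f)   # type system exported by the pipeline
    with open("document.xmi", "rb") as f:
        cas = load_cas_from_xmi(f, typesystem=typesystem)

    # Iterate over token annotations (DKPro Core's Token type)
    TOKEN = "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token"
    for token in cas.select(TOKEN):
        print(token.get_covered_text())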

Tristan’s bio:
Tristan Miller holds a doctorate in computer science from Technische Universität Darmstadt (Germany), where he is engaged as a Research Scientist in the Ubiquitous Knowledge Processing Lab.  He has previously held research and teaching appointments at the German Research Center for Artificial Intelligence (Germany), Griffith University (Australia), and the University of Toronto (Canada).  From 2008 to 2011 he worked as a language engineer and business analyst at InQuira, an enterprise knowledge management company subsequently acquired by Oracle. Dr. Miller's research interests lie mainly in natural language processing, and more specifically in computational lexical semantics. He has published on topics such as argumentation mining, word sense disambiguation, lexical substitution, and computational detection and interpretation of humour.  He is also an ardent science popularizer, serving as an advisory panel member or contributor to non-specialist linguistics publications such as Babel: The Language Magazine and Word Ways: The Journal of Recreational Linguistics.

16.03.2017
International workshop: Formal Concept Analysis for Knowledge Discovery
Official site
Abstract: The International Workshop "Formal Concept Analysis for Knowledge Discovery" was held at the Faculty of Computer Science. The event brought together scientists and specialists in data analysis from St. Catharines (Canada), St. Petersburg, Novosibirsk, Tula, Kazan, Perm and other cities. Prof. Ivo Duentsch from Brock University (St. Catharines) gave the keynote talk "Knowledge structures and skill assignments: Structural tools for diagnostic assessment".

06.03.2017
Title: Reactive Systems: A Powerful Paradigm for Modeling and Analysis from Engineering to Biology
Speaker: Thomas A. Henzinger (IST Austria)
Place: Moscow, 3 Kochnovsky Proezd, room 317
Time: 15:10
Abstract: A reactive system is a dynamic system that evolves in time by reacting to external events. Hardware components and software processes are reactive systems that interact with each other and with their physical environment. Computer science has developed powerful models, theories, algorithms, and tools for analyzing and predicting the behavior of reactive systems. These techniques are based on mathematical logic, theory of computation, programming languages, and game theory. They were originally developed to let us build a more dependable computer infrastructure, but their utility transcends computer science. For example, both an aircraft and a living organism are complex reactive systems. Our understanding and the design of such systems can benefit greatly from reactive modeling and analysis techniques such as execution, composition, and abstraction. 


06.03.2017
Title: Vellvm - Verifying the LLVM
Speaker: Steve Zdancewic (University of Pennsylvania)
Place: Moscow, 3 Kochnovsky Proezd, room 317
Abstract: The Low-Level Virtual Machine (LLVM) compiler provides a modern, industrial-strength SSA-based intermediate representation (IR) along with infrastructure support for many source languages and target platforms. Much of the LLVM compiler is structured as IR to IR translation passes that apply various optimizations and analyses.

In this talk, I will describe the Vellvm project, which seeks to provide a formal framework for developing machine-checkable proofs about LLVM IR programs and translation passes. I'll discuss some of the subtleties of modeling the LLVM IR semantics. I'll also describe some of the proof techniques that we have used for reasoning about LLVM IR transformations and sketch some example applications including verified memory-safety instrumentation and program optimizations.
Vellvm is implemented in the Coq theorem prover and provides facilities for extracting LLVM IR transformation passes and plugging them into the LLVM compiler, thus enabling us to create verified optimization passes for LLVM and evaluate them against their unverified counterparts.
This is joint work with many collaborators at Penn, and Vellvm is part of the NSF Expeditions project: The Science of Deep Specifications.


21.02.2017
Title: Probably Approximately Correct Computation of the Canonical Basis
Speaker: Daniel Borchmann, Postdoctoral Research Associate, Technische Universität Dresden
Place: Moscow, 3 Kochnovsky Proezd, room 205
Abstract: To learn knowledge from relational data, extracting functional dependencies is a common approach. One way to achieve this extraction is to convert the given data into so-called formal contexts and then compute exact implicational bases of them. A particularly interesting such basis is the so-called canonical basis, which is not only a basis of minimal cardinality, but also one for which algorithms are known that perform well in practice. However, all these algorithms are of high runtime complexity, i.e., they are not output-polynomial, and are thus likely to fail in certain situations. On the other hand, most data sets stemming from real-world applications are faulty to a certain degree, and an exact representation of their implicational knowledge – as provided by the canonical basis – may not be helpful anyway. The usual approach of considering association rules instead of implications does not solve this problem satisfactorily, as it still requires computing exact implication bases.

This talk investigates an alternative approach of learning approximations of implicational knowledge from data. For this, we revisit the notion of probably approximately correct implication bases (PAC bases), survey known approaches and results on the feasibility of computing such bases, and discuss first experimental results showing their usefulness. In particular, we show how methods from query learning can be leveraged to obtain an algorithm that computes PAC bases in output-polynomial time. Finally, we give an outlook on how attribute exploration, an interactive learning approach based on querying domain experts, can be combined with PAC bases to obtain a probably approximately correct attribute exploration algorithm.
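
As background, the sketch below illustrates the formal-context machinery on a toy example: the attribute-closure operator A'' of a context, under which an implication A -> B is valid exactly when B is contained in A''. The data is hypothetical, and the talk's algorithms for canonical and PAC bases are considerably more involved:

    # A toy formal context and its attribute-closure operator A'': derive the
    # objects having all attributes of A, then the attributes common to those
    # objects. An implication A -> B is valid iff B is a subset of A''.
    context = {
        "g1": {"a", "b"},
        "g2": {"a", "c"},
        "g3": {"a", "b", "c"},
    }

    def closure(attrs):
        extent = [g for g, atts in context.items() if attrs <= atts]  # A'
        if not extent:                 # empty extent: closure is all attributes
            return set.union(*context.values())
        return set.intersection(*(context[g] for g in extent))        # A''

    def implication_holds(premise, conclusion):
        return conclusion <= closure(premise)

    print(closure({"b"}))                   # {'a', 'b'}: implication b -> a holds
    print(implication_holds({"b"}, {"a"}))  # True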

15.02.2017 - 22.02.2017
Title: From digital pixels to life
Speaker: Prof. Peter Horvath, Institute for Molecular Medicine Finland (FIMM)
Place: Moscow, 3 Kochnovsky Proezd
Abstract: In his course, Prof. Peter Horvath focused on high-content screening (HCS), which combines cell biology, automated high-resolution microscopy, informatics and robotics. High-content screening aims to discover small and large molecules (such as drugs and siRNAs) that change the phenotypes of cells in a desired manner. High-content analysis (HCA) refers to the analysis and evaluation of the large volumes of data produced in an HCS scenario. Despite recent revolutionary advances in informatics, HCA suffers from a lack of solutions to the computational problems that arise and from limited computational capacity. To overcome this, numerous image analysis and machine learning approaches have recently been proposed. The course gave an insight into the most popular methods, including automated microscopy, image processing, and multiparametric analysis of the data. During the course, 10,000-100,000 images were (virtually) created, and methods to analyze them using image segmentation and supervised machine learning were developed.
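
As a small illustration of the image-analysis step, here is a toy sketch of threshold-based segmentation and object counting, assuming scikit-image is installed; the image is synthetic, not the course's actual HCS data or pipelines:

    # A toy sketch of one HCA building block: intensity-based segmentation of
    # a microscopy-like image, then connected-component labelling to count
    # cell candidates. Synthetic data; assumes scikit-image.
    import numpy as np
    from skimage.filters import threshold_otsu
    from skimage.measure import label

    rng = np.random.default_rng(1)
    image = rng.random((64, 64))       # background noise
    image[20:30, 20:30] += 1.0         # one bright "cell"

    thresh = threshold_otsu(image)     # automatic global intensity threshold
    mask = image > thresh              # foreground/background segmentation
    labels = label(mask)               # connected components
    print("objects found:", labels.max())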




 
