About the Laboratory

The Laboratory for Semantic Analysis (LSA), established within the Centre for Language and Semantic Technologies, studies natural language as a unified whole within the natural science paradigm, using methods from computer science and applied mathematics. The underlying assumption is that language is not so much a collection of words as a collection of meanings, which, in the modern understanding, can be represented as vectors (embeddings) in a Euclidean semantic space. This representation enables large‑scale theoretical analysis of language using methods from complex systems theory, topological data analysis, chaos theory, the analysis of nonlinear partial differential equations, and complex network theory. The aim is to reveal the large‑scale structure of language (intrinsic dimensionality, ‘holes’ in language, properties of semantic trajectories, etc.). The analysis is carried out on as wide a range of languages as possible.

Applied in practice, these theoretical results give rise to new approaches to building large language models, both within manifold learning and within interpretable artificial intelligence. Practical projects with high commercialisation potential include:

‘Catch the Bot’ (implemented within Strategic Project 5 of the Priority‑2030 programme)
‘Jokes Aside’ (generation of humorous texts)
‘Tree of Knowledge’ (creating a structure of scientific and engineering knowledge using interpretable artificial intelligence methods)

The laboratory is inherently interdisciplinary and actively collaborates with other faculties and laboratories at HSE University: the Faculty of Humanities (Ekaterina Rakhilina, Eduard Klyshinsky); MIEM (I.A. Lubantsevsky); the Faculty of Computer Science (Anna Shestakova, Vasily Klyucharev, Alexey Ossadtchi); ATA Lab at the Faculty of Computer Science (Victor Buchstaber, Alexandra Bernadotte); the DeCAn Centre at the Faculty of Economic Sciences (Fuad Aleskerov).

The LSA aims to create a large‑scale model of natural language by studying the entire set of n‑grams of natural language as a single object. From a practical standpoint, this facilitates the development of new approaches to building large language models within manifold learning. We also examine the possibilities for creating a new class of semantic technologies.

