Foundation and generative models

The AI Centre studies how LLMs operate, builds tools for generating realistic content and designing virtual environments, and implements multimodal approaches that combine language and visualisation to solve scientific and applied problems. Its specialists work on accelerating models through effective knowledge transfer and on extending LLMs with the ability to integrate with external tools and databases, which increases the autonomy and analytical potential of AI.

Development of new training, fine-tuning, and acceleration algorithms for foundation and generative models

Tasks

Development of effective fine-tuning algorithms for generative models of various types (GANs, diffusion models, LLMs) using low-parameter matrix representations.
Creation of new tensor algorithms for approximating model parameters, and of regularisation methods that account for sparsity, low-rank structure, and architectural features of generative models, in order to speed up fine-tuning, prevent overfitting, and improve generation quality.
Development of advanced regularisation methods and exploration of ways to combine fine-tuned models in the context of continual learning and personalised generation.
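
As an illustration of the low-parameter matrix representations mentioned in the tasks above, the following minimal sketch assumes one common form: a low-rank additive update W + B·A applied to a frozen weight matrix. The source does not specify the parameterisation actually used, and all names (matmul, low_rank_update, trainable_params) and dimensions here are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def low_rank_update(w, b, a, scale=1.0):
    """Return W + scale * (B @ A) without modifying the frozen weights W."""
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, rank):
    """Parameter count of the factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical sizes chosen only for illustration.
rng = random.Random(0)
d_out, d_in, rank = 6, 8, 2

# Frozen pretrained weights (random stand-ins here).
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]

# Trainable factors: B starts at zero, so the initial update is a no-op.
b = [[0.0] * rank for _ in range(d_out)]
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]

w_adapted = low_rank_update(w, b, a)
full_count = d_out * d_in
low_rank_count = trainable_params(d_out, d_in, rank)
```

With these illustrative sizes only the 28 entries of the two factors would be trained, compared with 48 entries in the full matrix; the saving grows as the matrix dimensions increase relative to the rank.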

Creation of a generative artificial intelligence model using omics data

Tasks

Creation of a generative artificial intelligence model based on omics data
Development of explainable AI (XAI) methods for interpreting the significance of omics features for deep learning models and foundation models
Development of methods for generating functional genomic elements (FGE) and omics features

Effects

Reduction of risks associated with erroneous decisions in genetic diagnostics
Reduction of the labour intensity of automating molecular genetic research through the introduction of intelligent data analysis and machine learning components
More efficient handling of large molecular biology datasets by combining traditional modelling, optimisation methods, and artificial intelligence, so that results draw on both domain theory and data-driven analysis
Increased accessibility of artificial intelligence for use in personalised diagnostics, early detection of genetic diseases, and the advancement of applied scientific research in the field of bioinformatics

AI and Cardiogenetics

Tasks

Development of algorithms and mathematical models in the fields of molecular biology, genetics, and bioinformatics aimed at enhancing the accuracy and efficiency of genomic data analysis for the study of cardiovascular diseases.
Creation of personalised medicine tools based on foundation models for assessing the risk of cardiovascular disease, myocardial infarction, and sudden death.

Modelling Neurocognitive Health Based on Speech and Structural Characteristics of the Brain

Tasks

Development of a model for identifying post-stroke aphasia, and of models for detecting age-related cognitive decline, from spontaneous speech, using deep learning methods and large language models.

Effects

Reduction of the risk of erroneous decisions in the neurocognitive diagnosis of speech, reading, and cognitive impairments.
Decreased labour intensity of diagnostics through the implementation of intelligent components for data analysis and machine learning.

Creation of an adapted LLM for the field of science, technology, and innovation

Tasks

Conducting experiments on the fine-tuning of open-source large language models (LLMs) to adapt them for solving problems in the fields of science, technology, and innovation
Developing methods for deploying large language models on limited computational resources, in particular researching and selecting methods for accelerating (optimising) computation
Investigating approaches to training multimodal LLMs that take into account the relationships and structure of documents when generating responses; analysing additional modalities helps models better understand the context of user queries
Developing an optimal solution to enhance response generation in the fields of science, technology, and innovation
Selecting and implementing optimal reasoning methods for working with scientific and technical data and integrating the chosen solution into a multi-agent architecture

Practical results

An innovative software solution for analysing information on R&D using artificial intelligence technologies, applicable in the "Science and Innovation" domain for comprehensive research on trends in Russian science and for identifying centres of competence with the potential to implement national projects for technological leadership

Effects

Reduction in the computational resources required to use LLMs for search, synthesis, and data analysis tasks, potentially achieved through effective distillation and quantisation methods
Application of the adapted LLM with optimal parameters
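
The quantisation mentioned in the effects above can be illustrated with a minimal sketch. It assumes symmetric per-tensor 8-bit quantisation, which is only one of several possible schemes; the source does not say which method is used, and the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantisation of floats to integers in [-127, 127].

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


# Hypothetical weights chosen only for illustration.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The round-trip error of each value is bounded by about half a
# quantisation step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored as one small integer plus a single shared scale, which reduces memory use at the cost of a bounded rounding error. Distillation, also named above, is a separate technique and is not shown here.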

Fine-tuning an LLM for a career and educational trajectory recommendation system for applicants

Tasks

Development of an advanced model for generating personalised recommendations that takes into account individual user characteristics, educational trajectory, and current labour market dynamics, based on modern natural language processing and machine learning methods

Effects

The model will be able to analyse applicants' interests expressed in free form, their characteristics, and salary predictions obtained from machine learning models, and on that basis formulate recommendations for choosing an educational trajectory and future career

Innovative software solution for entrepreneurs based on a fine-tuned large language model

Tasks

Creation of an innovative software solution for entrepreneurs based on a fine-tuned large language model

Effects

Increased awareness among entrepreneurs about government support
Provision of quality navigation through government support measures
Stimulation of proactive behaviour among entrepreneurs
Reduction of processing time for requests in small and medium-sized enterprise (SME) support centres
Improvement in the quality of decision-making and reduction of the workload on SME support service specialists

Multimodal Visual-Textual AI Models

Tasks

Creation of a software library for building systems that comprehensively process heterogeneous data for tasks in industry, medicine, and business, simplifying the application of large foundation models in innovative artificial intelligence software solutions.
Development of generative models tailored to specific domains

Empathetic AI: A multimodal model for predicting human emotional states

Tasks

Development of mathematical and software tools that allow for the analysis and interpretation of emotional states and personal characteristics of individuals based on multimodal data, including:

Methods for automatic emotion recognition
Assessment of personality traits
Creation of digital emotionally expressive avatars
Development of interfaces that adapt to the user's emotional state

Practical Results

Creation of highly accurate neural network models for emotion recognition and interpretation.
Development of generative neural network models for creating digital emotionally expressive avatars and intelligent multimodal interfaces to enhance user interaction across various fields.

Effects

Increased accessibility of artificial intelligence for use in everyday life.