Multimodal multitask models as a tool for generative artificial intelligence
-
Multimodal architectures: a path toward artificial general intelligence
We will discuss approaches to designing multimodal solutions, methods of modality fusion, modern solutions, and experimental results. We will focus on text, images, audio, and video as modalities.
-
Generative artificial intelligence models for high-quality multimedia data fusion
We will discuss the diffusion process, how it can be applied, and metrics for quality assessment. We will look at different solutions for generating images and video from text descriptions, and discuss how to teach a generative model to understand physics and geometry.
Lecturers
Sber AI, AIRI, Samara National Research University
Sber AI
Sber AI