Seminars 2023

"The Power of Demonstrations in Machine Learning"

Date: 11.05.2023

Topic: "The Power of Demonstrations in Machine Learning".

Speaker: Pavel Sulimov, Candidate of Sciences in Computer Science, researcher at the Institute of Applied Information Technology (Zurich University of Applied Sciences).

Abstract: Classical supervised learning can suffer from a lack of labeled training data. In unsupervised learning, where no labels or targets are given, the learning process can be quite expensive because patterns must be discovered from scratch. As an attempt to find a "golden mean" between supervised and unsupervised learning, weak supervision (semi-supervision) has emerged, combining a small amount of labeled data with a large amount of unlabeled data during training.
Another approach that tries to sidestep data problems is reinforcement learning, in which data is collected during training by running through episodes of the task, e.g. playing a game. However, not every problem can be formulated in terms of reinforcement learning. Moreover, reinforcement learning itself has turned out to be more powerful when demonstrations (aka weak supervision) are introduced, that is, examples of how to "play the game correctly".
In the lecture we will touch on the theoretical background of weak supervision and demonstrations, study recent cases from different fields (mathematics, bioinformatics, information sciences, etc.), and discuss the "chicken-and-egg problem" consequences of semi-supervision.

The presentation is available via the link

"A Visual Analytics System for Improving Attention-based Traffic Forecasting Models"

Date: 15.05.2023

Topic: "A Visual Analytics System for Improving Attention-based Traffic Forecasting Models".

Speaker: Jin Seungmin, 3rd-year PhD student at the Department of Big Data and Information Retrieval, Faculty of Computer Science.

Abstract: With deep learning (DL) outperforming conventional methods on a variety of tasks, much effort has been devoted to applying DL in various domains. Researchers and developers in the traffic domain have also designed and improved DL models for forecasting tasks such as estimating traffic speed and time of arrival. However, analyzing DL models remains challenging because of their black-box nature and the complexity of traffic data (i.e., spatio-temporal dependencies). In collaboration with domain experts, we design a visual analytics system, AttnAnalyzer, that lets users explore how DL models make predictions by enabling effective spatio-temporal dependency analysis. The system incorporates dynamic time warping (DTW) and Granger causality tests for computational spatio-temporal dependency analysis, and provides map, table, line chart, and pixel views to help users analyze dependencies and model behavior. For the evaluation, we present three case studies showing how AttnAnalyzer can effectively explore model behaviors and improve model performance on two different road networks. We also provide domain expert feedback.
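
To make the DTW step mentioned in the abstract more concrete, here is a minimal sketch of a textbook dynamic time warping distance between two one-dimensional series (for instance, speed readings from two road segments). It is not taken from AttnAnalyzer: the function name `dtw_distance`, the absolute-difference local cost, and the sample values are all assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) dynamic time warping distance.

    Assumption: the local cost is the absolute difference between values.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical speed readings: the second series is a time-stretched copy
# of the first, so the warped distance between them is zero.
segment_a = [60, 55, 40]
segment_b = [60, 55, 55, 40]
print(dtw_distance(segment_a, segment_b))
```

A small distance suggests that two segments follow similar temporal patterns even when one lags or stretches relative to the other, which is why DTW is a natural fit for comparing traffic series.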

"Utilizing empirical p-values in False Discovery Rate control and examination of the reasoning capacity of the deep net based MDETR method"

Date: 26.05.2023

Topic: "Utilizing empirical p-values in False Discovery Rate control and examination of the reasoning capacity of the deep net based MDETR method".

Speaker: Andrey Olegovich Borevskiy, Research Intern at the Research and Educational Laboratory of Artificial Intelligence for Computational Biology.

Artificial intelligence has proven to be an incredibly useful tool for a wide range of problems. One of these areas, bioinformatics, has produced fundamentally new techniques at the intersection of machine learning and statistics. Despite their potential, these methods have not yet been applied to more general problems. Accordingly, in our work we develop a radically new approach called empirical p-values (EPV). Assuming that the negative training data of a classification task represent the null-hypothesis distribution, we compute the corresponding p-values for the test samples. We then extend the Benjamini-Hochberg (BH) procedure for false discovery rate (FDR) control, which makes it possible both to adjust for the relationship between the training and test data distributions and to predict new labels from those already examined. The main goal is to accurately predict the number of accepted discoveries at each level without access to the true labels.
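
The two building blocks named above can be sketched as follows. This is only an illustration under stated assumptions, not the speakers' method: it uses the common "plus one" empirical p-value estimator with classifier scores of negative training examples as the null sample, and the standard (unextended) Benjamini-Hochberg step-up rule. The function names and all numbers are hypothetical.

```python
import bisect


def empirical_p_values(null_scores, test_scores):
    """Empirical p-value of each test score against a null sample.

    Assumption: higher scores are more extreme, and the estimator is
    p = (1 + #{null >= s}) / (1 + len(null)).
    """
    null_sorted = sorted(null_scores)
    n = len(null_sorted)
    p_values = []
    for s in test_scores:
        # Count null scores that are greater than or equal to s.
        at_least = n - bisect.bisect_left(null_sorted, s)
        p_values.append((1 + at_least) / (1 + n))
    return p_values


def benjamini_hochberg(p_values, alpha):
    """Indices rejected by the standard BH step-up procedure at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Remember the largest rank whose p-value passes its threshold.
        if p_values[idx] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Hypothetical scores: negatives from training act as the null sample.
null_scores = [0.1, 0.2, 0.3, 0.4]
test_scores = [0.5, 0.25]
print(empirical_p_values(null_scores, test_scores))

# Hypothetical p-values for four test samples, FDR level 0.05.
print(benjamini_hochberg([0.01, 0.02, 0.03, 0.5], alpha=0.05))
```

Note that the smallest attainable empirical p-value is 1 / (1 + n), so the size of the null sample limits how many discoveries BH can accept at a strict level.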

Speaker: Insan-Aleksandr Latypov, 2nd-year student of the Data Science programme, Faculty of Computer Science

Visual question answering (VQA) is an important task for artificial intelligence that requires combining visual and textual data. Recent advances in deep learning, including transformer-based models, have achieved results comparable to human performance. Nevertheless, it remains difficult to objectively assess a model's ability to reason and to understand relationships between objects in the real world. Large VQA datasets have recently appeared that require models to use complex visual and textual concepts. The aim of this study is to create a new dataset for evaluating models' capacity for complex logical reasoning, one that uses a limited set of visual and textual concepts yet requires a comprehensive understanding of the relationships between different objects. The study presents a methodology for constructing such datasets and describes the Python code for Blender. Results of a pretrained MDETR model on the proposed dataset are also presented.

"Implementing a Label-Free Quantification Method in the crux-toolkit"

Date: 28.11.2023

Topic: "Implementing a Label-Free Quantification Method in the crux-toolkit".

Speaker: Frank Lawrence Nii Adoquaye Acquaye, PhD student at the Department of Data Analysis and Artificial Intelligence, Faculty of Computer Science

Abstract:

In bottom-up proteomics experiments with mass spectrometry, one seeks not only to identify peptides but also to quantify their abundance. One way of doing this is Label-Free Quantification. These methods tend to be cheaper than chemical tagging, albeit computationally expensive. However, FlashLFQ implemented a method that reduces computation time by orders of magnitude.

The goal of my project is to replicate this algorithm as part of the crux-toolkit, thereby providing a more complete set of tools for crux-toolkit users.

Event recording

