Seminars, October - December 2019

03.12 Geometric deep learning for functional protein design

Speaker: Michael Bronstein, Professor, Chair of Machine Learning and Pattern Recognition, Imperial College London / Head of Graph Learning Research, Twitter

Protein-based drugs are becoming some of the most important drugs of the 21st century. The typical mechanism of action of these drugs is a strong protein-protein interaction (PPI) between surfaces with complementary geometry and chemistry. Over the past three decades, large amounts of structural data on PPIs have been collected, creating opportunities for differentiable learning on the surface geometry and chemical properties of natural PPIs. Since the surface of these proteins has a non-Euclidean structure, it is a natural fit for geometric deep learning, a novel class of machine learning techniques that generalizes successful neural architectures to manifolds and graphs. In the talk, I will show how geometric deep learning methods can be used to address various problems in functional protein design, such as interface site prediction, pocket classification, and search for surface motifs. I will present results of our ongoing work with Bruno Correia, Pablo Gainza-Cirauqui, and others from the EPFL Lab of Protein Design and Immunoengineering.

Location: room 319, Bolshoy Tryokhsvyatitelsky Pereulok, 3 (Kitai-Gorod Station)

Seminar working language – English

Video recording of the talk

21.11 Autoregressive generative models for the generation task

Seminar on the results of an internship at Harvard University, USA

Speaker – Irina Ponamareva, 4th-year student, Applied Mathematics and Informatics (PMI)

Abstract. Existing approaches to generating protein sequences include models based on sequence alignments and alignment-free models. The former take sets of aligned sequences as input and learn to predict the symbol at each position. Such models (for example, HMMs) have a significant drawback: the set of aligned sequences required for training cannot always be obtained, because the sequences under study may vary widely in both amino acid composition and length. Alignment-based models are not applicable to such sequences, but alignment-free models may prove useful. Alignment-free models include models that use a latent representation of the sequence (autoencoders) and autoregressive models. Irina will describe how, during her internship, she tried to adapt XLNet, a popular autoregressive architecture, to this task, and will also cover other similar approaches.
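For readers unfamiliar with autoregressive generation, here is a minimal sketch of the idea: each symbol is predicted from the symbols before it, so training needs no alignment and sequences may differ in length. This is not XLNet and not the speaker's code. It is a first-order (bigram) model chosen only for brevity, and the toy sequences, the START/END markers, and the function names are assumptions made for illustration.

```python
import random
from collections import defaultdict

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
START, END = "^", "$"              # sequence boundary markers (assumed for this sketch)


def fit_bigram(sequences, alphabet, alpha=1.0):
    """Estimate p(next symbol | previous symbol) with add-alpha smoothing."""
    symbols = list(alphabet) + [END]
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        prev = START
        for ch in list(seq) + [END]:
            counts[prev][ch] += 1.0
            prev = ch
    model = {}
    for prev in [START] + list(alphabet):
        total = sum(counts[prev][s] for s in symbols) + alpha * len(symbols)
        model[prev] = {s: (counts[prev][s] + alpha) / total for s in symbols}
    return model


def sample(model, rng, max_len=50):
    """Generate one sequence left to right until END is drawn or max_len is reached."""
    prev, out = START, []
    while len(out) < max_len:
        symbols = list(model[prev])
        weights = [model[prev][s] for s in symbols]
        nxt = rng.choices(symbols, weights=weights, k=1)[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


# Toy, made-up unaligned sequences of different lengths.
toy_sequences = ["MKV", "MKVLA", "MAVLLK", "MKLAVV"]
model = fit_bigram(toy_sequences, ALPHABET)
generated = sample(model, random.Random(0))
print(generated)
```

Architectures such as XLNet replace the lookup table with a neural network that conditions on much longer contexts, but the left-to-right sampling loop has the same shape.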

Location: room G120, building G, Pokrovsky Boulevard, 11

Time: 18:10-19:30

12.11 Modeling the pharmacogenetics of rivaroxaban with machine learning methods

Speaker – Alexander Shein, research assistant at the laboratory

The talk will cover models for classifying patient outcomes from genetic and clinical-diagnostic parameters: logistic regression, support vector machines, and random forests. Feature importance analysis will be discussed. The talk will also cover machine learning models for predicting rivaroxaban concentration: a linear model with L1 regularization, support vector machines, and random forests.

Location: room S332, building S, Pokrovsky Boulevard, 11

Time: 18:30-19:50

08.11 Studying the impact of genetic variability on chromatin architecture in humans

Speaker - Olga Pushkareva, 2nd year student of the master's program "Data Analysis in Biology and Medicine"

One of the open problems in computational biology is assessing the genetic contribution to the development of complex phenotypic traits. Genome-wide association studies have shown that the majority of disease variants fall into gene regulatory sequences. However, this is not always the case: for example, some noncoding variants can result in regulatory variation. Moreover, small population studies have shown that only a small part of this variation is related to genetics.

The talk will be based on two studies that aim to characterize chromatin variability in human lymphoblastoid cell lines (Waszak SM, et al. Cell, 2015 and Kumasaka N, et al. Nature Genetics, 2019) and on my current work applying these two models to ATAC-seq data from human adipose stromal cells.

Seminar working language – English

Location: room R307, Pokrovsky Boulevard, 11

Time: 18:30-19:50

01.11 The Attention Mechanism (Transformer): A Potential for Bioinformatic Problems

Speaker - Jin Seungmin, PhD student (HSE).

The transformer, a popular state-of-the-art deep neural network, can outperform well-known RNN models on sequence data. The model was introduced because RNN-based architectures are hard to parallelize and have difficulty learning long-range dependencies within input and output sequences. The transformer takes all these dependencies into account using special networks that can directly access the input space. The core idea behind the transformer is self-attention: the ability to attend to different positions of the input sequence to compute a representation of that sequence. The transformer stacks self-attention layers built from scaled dot-product attention and multi-head attention. In the presentation, I will introduce the core idea of the model and present its pros and cons using the example of LA traffic jam analysis. Potential bioinformatics applications will also be discussed.
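As a rough illustration of scaled dot-product attention, here is a minimal pure-Python sketch that computes softmax(Q K^T / sqrt(d_k)) V for small lists of vectors. It is not the speaker's code. For brevity it omits the learned query, key, and value projections and the multi-head split of a real transformer, and the toy input `x` and the function names are assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) where weights = softmax(Q K^T / sqrt(d_k))."""
    d_k = len(keys[0])
    weights = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights.append(softmax(scores))
    dim_v = len(values[0])
    outputs = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(dim_v)]
        for row in weights
    ]
    return outputs, weights


# Self-attention: queries, keys, and values all come from the same toy input.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(x, x, x)
for row in weights:
    print([round(w, 3) for w in row])
```

Every position attends to every other position in a single step, which is why long-range dependencies are directly accessible and why the computation parallelizes more easily than a recurrent pass.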

Seminar working language – English

Location: room R506, Pokrovsky Boulevard, 11

Time: 18:30-19:50

25.10 Z-DNA recognition and mistakes generated by hybrid deep learning models

Speaker – Nazar Beknazarov, research assistant at the laboratory

Regions of Z-DNA, the left-handed form of DNA, have been found in the genomes of different species. There is experimental evidence that Z-DNA plays a role in transcription, chromatin remodeling, and recombination. The association of epigenetic factors with Z-DNA sites remains poorly understood. The aim of this work is to identify Z-DNA sites in the human genome that are associated with epigenetic markers, using machine learning (ML) models. The effectiveness of convolutional, fully connected, and recurrent neural networks (CNN, FC, RNN) is compared with that of baseline machine learning models. Convolutional networks were shown to improve prediction quality, and adding recurrent layers to the convolutional ones improves model performance considerably further. The results demonstrate the practical relevance of deep learning methods for bioinformatics tasks.

Location: room R506, building R, Pokrovsky Boulevard, 11

Time: 18:10-19:30

22.10 Search for promoter enrichment with quadruplexes, associated with histone marks

Speaker - Arina Nostaeva, research assistant at the laboratory

We analyzed a G4-ChIP dataset for the human genome together with epigenetic landscapes in two tissue types: human stem cells and brain tissue. We found that around 80% of quadruplexes linked with histone marks are shared between the two tissues. Enrichment analysis showed that promoters carrying the histone marks H3K4me1, H3K4me3, H3K9ac, and H3K27ac are enriched in quadruplexes, while promoters carrying H3K27me3 are depleted. Comparing tissues, we observe that for H3K9ac (active promoter) and H3K4me1 (active enhancer) the odds ratio doubles when moving from stem cells to brain tissue. For H3K27me3 we observed the opposite transition: 0.15 in brain versus 0.47 in stem cells (suppression is about 3 times more active in stem cells). We discuss possible mechanisms that underlie the observed phenomena.
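The abstract does not say how the odds ratios were computed. One common definition uses a 2x2 contingency table of promoters (with or without the histone mark, with or without a quadruplex), sketched below. The counts are hypothetical and the function and variable names are assumptions made for illustration. Only the values 0.15 and 0.47 come from the abstract.

```python
def odds_ratio(marked_with_g4, marked_without_g4,
               unmarked_with_g4, unmarked_without_g4):
    """Odds ratio of a 2x2 table: values above 1 suggest enrichment, below 1 depletion."""
    odds_marked = marked_with_g4 / marked_without_g4
    odds_unmarked = unmarked_with_g4 / unmarked_without_g4
    return odds_marked / odds_unmarked


# Hypothetical counts, not taken from the study.
example_or = odds_ratio(20, 80, 10, 90)
print(round(example_or, 2))

# Fold difference between the two H3K27me3 odds ratios quoted in the abstract.
fold = 0.47 / 0.15
print(round(fold, 2))
```

The ratio of the two quoted values is slightly above 3, which matches the "about 3 times" statement in the abstract.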

Location: room T908, building T, Pokrovsky Boulevard, 11

Time: 13:00-14:00

04.10 Pharmacogenetic predictors of drug safety: working with real-world data

The laboratory's first seminar of this academic year is devoted to pharmacogenetic predictors of drug safety.

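Speaker - Dmitry Ivashchenko, psychiatrist, Candidate of Medical Sciences, Russian Medical Academy of Continuous Professional Education (RMANPO) of the Ministry of Health of Russia; researcher at the Department of Personalized Medicine of the Research Institute of Molecular and Personalized Medicine (NII MPM); associate professor at the Department of Child Psychiatry and Psychotherapy.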

Dmitry gave a brief overview of ongoing pharmacogenetic studies of drug safety. A patient's genotype is a risk factor for complications with many drugs, even at standard doses. Identifying risk groups makes it possible to calculate the most effective and safe dose for each individual. Studies on the pharmacogenetics of anticoagulants and antiplatelet agents were reviewed.

Location: room D108, building D, Pokrovsky Boulevard, 11

Time: 19:00-20:00


International Laboratory of Bioinformatics
