Theoretical Computer Science Laboratory seminar: "Clustering Billions of Reads for DNA Data Storage". Speaker: K. Makarychev
This event has ended
At the next seminar of the Theoretical Computer Science Laboratory, on Tuesday, October 3,
Konstantin Makarychev
will give the talk "Clustering Billions of Reads for DNA Data Storage".
Time: 18:10 - 19:30
Venue: Kochnovsky Proezd 3, room 205
To request an entry pass: evavilova@hse.ru
Abstract
I will tell the audience how to quickly cluster billions of strings based on their similarity (edit distance). We will discuss what makes the problem hard and then explore known (theoretical/mathematical) techniques like Locality Sensitive Hashing (LSH), metric embeddings, and sketching that can be employed for clustering Big Data. Finally, I will show how we use these techniques along with some new ingredients to cluster billions of DNA strands.
I will also briefly mention how string clustering is used in the Microsoft DNA Storage project – the project that develops technology for storing data on synthesized DNA strands.
The talk is based on my joint work with a team of researchers from Microsoft Research and the University of Washington. The paper will appear at NIPS 2017.

