Address: 11 Pokrovsky Bulvar, Moscow, 109028
Phone: +7 (495) 531-00-00, ext. 27254
The faculty trains developers and researchers. The curriculum draws on the experience of leading American and European universities, such as Stanford University (USA) and EPFL (Switzerland), as well as the School of Data Analysis, one of the strongest master's programmes in computer science in Russia. A broad list of elective courses, and the substantial share of the programme devoted to them, lets each student build an individual educational path. Practice and project work are at the core of the training.
On Wednesday, 24 July, at 15:00 the Centre of Deep Learning and Bayesian Methods invites everyone to talks by our foreign colleagues, Alexander Shekhovtsov (Czech Technical University in Prague) and Belhal Karimi (Ecole Polytechnique & INRIA).

Venue: Faculty of Computer Science (Kochnovsky proezd, 3), room 622.

Speakers:

Alexander Shekhovtsov, PhD, Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague

Talk: Statistical Problems in Neural Networks

Abstract: Neural networks (NNs) have become a very well-working technology, in fact so well that it seemingly obviates the need for statistical models and statistical decision theory. In this talk I want to highlight different aspects of neural networks that require statistical treatment, discuss open problems and our modest contribution towards addressing them.

More specifically, typical neural networks consist of a deterministic mapping and a simple probabilistic model on the output end, such as a softmax function for classification or a Gaussian distribution for regression. However, as soon as we consider uncertain inputs, injected noise (such as dropout), uncertain parameters (arising from Bayesian learning), or, for example, stochastic binary activations (used to obtain binary networks), all hidden units become random variables. The network becomes a directed probabilistic graphical model. It is then necessary to compute or approximate the predictive probability of the network, expectations of the gradients, and other quantities. There is evidence that a better probabilistic treatment improves generalization, robustness to input perturbations, and the speed and reliability of training. So far these improvements have been relatively small and have come at the price of a significant increase in the complexity of training and inference methods. However, they become more important as better methods are developed and heuristic leaps such as new architectures saturate.
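To make the quantities in the abstract concrete, here is a minimal sketch (not material from the talk) of Monte Carlo estimation of the predictive probability of a network whose hidden units are made random by injected dropout noise kept active at prediction time. It assumes PyTorch; the architecture, sample count and all names are purely illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticMLP(nn.Module):
    """Tiny classifier whose hidden layer is perturbed by dropout noise."""
    def __init__(self, d_in=20, d_hidden=64, n_classes=3, p_drop=0.5):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, n_classes)
        self.p_drop = p_drop

    def forward(self, x):
        h = F.relu(self.fc1(x))
        # Injected multiplicative noise makes h a random variable, so the
        # network defines a directed probabilistic graphical model.
        h = F.dropout(h, p=self.p_drop, training=True)  # noise stays on
        return self.fc2(h)

def predictive_probability(model, x, n_samples=100):
    """Monte Carlo estimate of p(y | x) = E_noise[ softmax(f(x, noise)) ]."""
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(x), dim=-1)
                             for _ in range(n_samples)])
    return probs.mean(dim=0)

model = StochasticMLP()
x = torch.randn(5, 20)               # a batch of 5 dummy inputs
p = predictive_probability(model, x)
print(p.sum(dim=-1))                 # each row sums to 1
```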
Belhal Karimi, PhD student, Ecole Polytechnique & INRIA, CMAP, XPOP

Talk: Nonconvex Optimization for Latent Data Models: An Incremental and An Online Point of View

Abstract: Many problems in machine learning amount to minimizing a possibly nonconvex and non-smooth function defined on a Euclidean space. Examples include topic models, neural networks and sparse logistic regression. The optimization methods used to solve these problems have been widely studied in the literature for convex objective functions and are extensively used in practice. However, recent breakthroughs in statistical modelling, such as deep learning, coupled with an explosion in the number of data samples, require improvements in nonconvex optimization procedures for large datasets.

This talk is an attempt to address these two challenges by developing algorithms with cheaper updates, ideally independent of the number of samples, and by improving the theoretical understanding of nonconvex optimization, which remains rather limited. In particular, we are interested in minimizing such objective functions for latent data models, i.e., when the data is partially observed; this includes the conventional sense of missing data but is much broader. We consider the minimization of a (possibly) nonconvex and non-smooth objective function using incremental and online updates. To that end, we propose and analyze several algorithms that exploit the latent structure to efficiently optimize the objective function, and illustrate our findings with numerous applications.

In the first main contribution, we provide a unified framework of analysis for optimizing nonconvex finite-sum problems, which encompasses logistic regression and variational inference. This framework extends an incremental surrogate optimization method based on the Majorization-Minimization principle: at each iteration it incrementally minimizes an easier upper bound of the objective function. The proposed framework is proved to converge almost surely to a stationary point, and to reach an ε-stationary point in O(n/ε) iterations.

In the second main contribution, we analyze a stochastic approximation scheme in which the stochastic drift term is not necessarily a gradient and may have a biased mean field, under two cases: the vector of random variables is either i.i.d. or a state-dependent Markov chain. For both cases, we provide tight non-asymptotic upper bounds of order O(c0 + log(n)/√n), where c0 is the potential bias of the drift term, and illustrate our findings by analyzing popular statistical learning algorithms such as the online Expectation Maximization algorithm and the average-cost Policy Gradient method.
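As a rough illustration of the online, per-sample updates mentioned above, here is a minimal online EM sketch for a toy one-dimensional Gaussian mixture: each incoming sample updates running sufficient statistics with a decaying step size, so the cost per update does not depend on the dataset size. It assumes NumPy; the mixture, step-size schedule and all constants are illustrative, and this is not the algorithm analyzed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Stream from a two-component 1-D Gaussian mixture (weights 0.4 / 0.6)."""
    z = rng.random(n) < 0.4
    return np.where(z, rng.normal(-2.0, 1.0, n), rng.normal(3.0, 1.0, n))

K = 2
mu, var, pi = np.array([-1.0, 1.0]), np.ones(K), np.full(K, 1.0 / K)
s0, s1, s2 = pi.copy(), pi * mu, pi * (var + mu ** 2)   # running statistics

for n, x in enumerate(sample(5000), start=1):
    gamma = (n + 10) ** -0.6                             # decaying step size
    # E-step for the single new sample: responsibilities under the current model.
    logp = -0.5 * ((x - mu) ** 2 / var + np.log(2 * np.pi * var)) + np.log(pi)
    r = np.exp(logp - logp.max())
    r /= r.sum()
    # Stochastic-approximation update of the running sufficient statistics.
    s0 = (1 - gamma) * s0 + gamma * r
    s1 = (1 - gamma) * s1 + gamma * r * x
    s2 = (1 - gamma) * s2 + gamma * r * x ** 2
    # M-step: re-maximize the parameters given the running statistics.
    pi = s0 / s0.sum()
    mu = s1 / s0
    var = np.maximum(s2 / s0 - mu ** 2, 1e-6)

# Estimated weights, means, variances; should be roughly 0.4/0.6, -2/3, 1/1.
print(pi.round(2), mu.round(2), var.round(2))
```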
If you need a pass to enter the Faculty of Computer Science building, please contact Ekaterina Volzhenina, Manager of the Centre of Deep Learning and Bayesian Methods.