Improved methods of large-scale variational inference

Variational inference is a popular technique in modern machine learning. For example, it is used for large-scale text modelling (Latent Dirichlet Allocation) and semi-supervised classification. It is also an important component of many deep learning systems, such as generative models of images.

The key idea of variational inference is to model the data distribution conditional on latent variables, which describe the unobserved process underlying the observed data, and then to recover the posterior distribution over these variables.

However, the exact posterior given by Bayes' theorem is often intractable, so we have to resort to approximations. The better the approximation, the more information about the posterior we obtain, which leads to more precise modelling of the data. This often brings significant practical benefits, such as more realistic image generation and better accuracy of semi-supervised models.

Variational inference casts the problem of finding the most accurate approximation of the posterior distribution as an optimization problem. It introduces a variational lower bound on the marginal likelihood; maximizing this bound over a family of variational approximations yields the member of that family closest to the true posterior in terms of KL divergence.
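The relation between the lower bound and the KL divergence can be checked numerically on a model small enough to solve exactly. The sketch below is only an illustration: the binary latent variable and the numbers in `prior` and `likelihood` are made up for this example and are not part of the project. It relies on the standard identity log p(x) = L(q) + KL(q(z) || p(z|x)), where L(q) is the variational lower bound.

```python
import math

# Hypothetical toy model (illustration only): binary latent z, one fixed observation x.
prior = [0.6, 0.4]         # p(z = 0), p(z = 1)
likelihood = [0.2, 0.9]    # p(x | z = 0), p(x | z = 1) for the observed x


def elbo(q):
    """Variational lower bound E_q[log p(x, z) - log q(z)] for a discrete q."""
    return sum(qz * (math.log(pz * lx) - math.log(qz))
               for qz, pz, lx in zip(q, prior, likelihood) if qz > 0)


def kl(q, p):
    """KL(q || p) between two discrete distributions."""
    return sum(qz * math.log(qz / pz) for qz, pz in zip(q, p) if qz > 0)


# In this tiny model Bayes' theorem is tractable, so the exact answer is available.
evidence = sum(pz * lx for pz, lx in zip(prior, likelihood))        # p(x)
posterior = [pz * lx / evidence for pz, lx in zip(prior, likelihood)]
log_evidence = math.log(evidence)

q = [0.5, 0.5]                  # an arbitrary variational approximation
gap = log_evidence - elbo(q)    # equals KL(q || posterior) up to rounding

print("log p(x)           :", log_evidence)
print("lower bound at q   :", elbo(q))
print("gap                :", gap)
print("KL(q || posterior) :", kl(q, posterior))
```

Because the gap equals the KL divergence, raising the bound over q is the same as moving q towards the true posterior, and the bound is tight exactly when q is the posterior.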

Recently, much work has focused on making variational approximations more accurate in large-scale Bayesian models (Rezende15), (Kingma16), (Burda15).

In this project, we use broader families of distributions as variational approximations of the true posterior, together with tighter variational lower bounds.
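One known way to obtain a tighter bound is the importance-weighted bound of (Burda15), which averages K importance weights inside the logarithm; K = 1 gives the usual variational lower bound. The following is a minimal sketch of that idea, not the project's own method. The one-dimensional Gaussian model, the choice of q equal to the prior, and the function names are assumptions made for illustration; the model is chosen so that log p(x) is known in closed form.

```python
import math
import random


def log_normal_pdf(value, mean, var):
    """Log density of a univariate Gaussian N(mean, var) at value."""
    return -0.5 * (math.log(2 * math.pi * var) + (value - mean) ** 2 / var)


def iw_bound(x, k, q_mean, q_var, n_rep, rng):
    """Monte Carlo estimate of E[log (1/k) sum_i w_i], with w_i = p(x, z_i) / q(z_i).

    Assumed toy model: z ~ N(0, 1), x | z ~ N(z, 1); q(z) = N(q_mean, q_var).
    """
    total = 0.0
    for _ in range(n_rep):
        log_w = []
        for _ in range(k):
            z = rng.gauss(q_mean, math.sqrt(q_var))
            log_w.append(log_normal_pdf(z, 0.0, 1.0)
                         + log_normal_pdf(x, z, 1.0)
                         - log_normal_pdf(z, q_mean, q_var))
        # Numerically stable log-mean-exp of the k log-weights.
        m = max(log_w)
        total += m + math.log(sum(math.exp(lw - m) for lw in log_w) / k)
    return total / n_rep


rng = random.Random(0)
x = 2.0
log_evidence = log_normal_pdf(x, 0.0, 2.0)   # marginal of this model is N(0, 2)

# q is deliberately a poor fit (the prior), so the K = 1 bound is loose.
bounds = {k: iw_bound(x, k, 0.0, 1.0, 2000, rng) for k in (1, 5, 50)}
for k, value in bounds.items():
    print("K =", k, "bound estimate:", value)
print("log p(x):", log_evidence)
```

In this toy setting the estimates rise with K and approach log p(x) from below, even though q itself is unchanged. Using a richer family for q, such as the normalizing flows of (Rezende15), is a complementary way to close the same gap.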

We test our ideas on variational autoencoders (VAE), one of the most promising deep generative models of images. In a VAE, an image is modelled as the output of an MLP that takes a vector of latent variables as input. Conversely, the parameters of the variational approximation are produced by another MLP conditioned on the input image.
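The two-network structure described above can be sketched in a few lines. This is an untrained illustration only. The layer sizes, the tanh activations, the Bernoulli pixel likelihood, the diagonal Gaussian q(z | x) with a standard normal prior, and the reparameterised sample are all assumptions in the spirit of a standard VAE, not details taken from the project.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer (untrained)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, inputs):
    """Affine map: weights @ inputs + biases."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def encoder(params, image):
    """Inference MLP: image -> mean and log-variance of a Gaussian q(z | x)."""
    hidden = [math.tanh(h) for h in dense(params["enc_hidden"], image)]
    return dense(params["enc_mean"], hidden), dense(params["enc_logvar"], hidden)


def decoder(params, latent):
    """Generative MLP: latent vector -> Bernoulli pixel probabilities for p(x | z)."""
    hidden = [math.tanh(h) for h in dense(params["dec_hidden"], latent)]
    return [1.0 / (1.0 + math.exp(-a)) for a in dense(params["dec_out"], hidden)]


def elbo_estimate(params, image, rng):
    """Single-sample estimate of the variational lower bound for one image."""
    mean, logvar = encoder(params, image)
    # Reparameterised sample: z = mean + std * eps with eps ~ N(0, 1).
    latent = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
              for m, lv in zip(mean, logvar)]
    probs = decoder(params, latent)
    log_lik = sum(x * math.log(p) + (1.0 - x) * math.log(1.0 - p)
                  for x, p in zip(image, probs))
    # Closed-form KL(q(z | x) || N(0, I)) for a diagonal Gaussian q.
    kl = 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mean, logvar))
    return log_lik - kl


rng = random.Random(0)
n_pixels, n_hidden, n_latent = 16, 8, 2     # hypothetical sizes
params = {
    "enc_hidden": init_layer(n_pixels, n_hidden, rng),
    "enc_mean": init_layer(n_hidden, n_latent, rng),
    "enc_logvar": init_layer(n_hidden, n_latent, rng),
    "dec_hidden": init_layer(n_latent, n_hidden, rng),
    "dec_out": init_layer(n_hidden, n_pixels, rng),
}
image = [1.0 if i % 2 == 0 else 0.0 for i in range(n_pixels)]   # toy binary "image"
value = elbo_estimate(params, image, rng)
print("single-sample lower bound estimate:", value)
```

Training would maximise this estimate over the weights of both networks, typically with a framework that provides automatic differentiation; that step is omitted here.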

Rezende D. J., Mohamed S. Variational inference with normalizing flows //arXiv preprint arXiv:1505.05770. – 2015.

Salimans T. et al. Markov chain Monte Carlo and variational inference: Bridging the gap //International Conference on Machine Learning. – 2015. – P. 1218-1226.

Burda Y., Grosse R., Salakhutdinov R. Importance weighted autoencoders //arXiv preprint arXiv:1509.00519. – 2015.


Bayesian Methods Research Group
