HDI Lab seminar: Amortising intractable inference with diffusion models and off-policy RL
On 15 August at 14:40, Nikolay Malkin (University of Edinburgh) will give a talk.
I will present recent and ongoing work relating diffusion models, variational inference, and reinforcement learning. Efficient inference of high-dimensional variables under a diffusion model prior enables solutions to problems such as conditional generation, semantic segmentation, and combinatorial optimisation (https://arxiv.org/abs/2206.09012). A new family of amortised methods (https://arxiv.org/abs/2402.05098, https://arxiv.org/abs/2405.20971) places the problem of stochastic continuous control -- including sampling of posterior distributions under neural SDE priors -- in the theoretical framework of off-policy entropy-regularised reinforcement learning, via the formalism of continuous generative flow networks (https://arxiv.org/abs/2301.12594). This connection allows us to train diffusion models to sample a target density or energy function, i.e., to perform black-box variational inference. It further lets us fine-tune pretrained diffusion models in a data-free manner for asymptotically unbiased posterior sampling, with applications to vision (class-conditional generation, text-to-image generation), language (constrained generation in diffusion language models), and control/planning (KL-constrained policy extraction with a diffusion behaviour policy). These approaches are applicable in Bayesian machine learning and various scientific domains.
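To make the connection between sampling and off-policy RL concrete, here is a minimal sketch of the kind of objective used in this framework, in a discrete-time formulation; the notation (the kernels p_F and p_B, the energy E, the learned constant Z_θ) is illustrative and not taken verbatim from the announcement. A generation trajectory τ = (x_0, …, x_T) is produced by a learned forward (denoising) kernel p_F(x_{t+1} | x_t; θ) and compared against a fixed backward (noising) kernel p_B(x_t | x_{t+1}), with terminal reward given by the unnormalised target R(x_T) = exp(-E(x_T)). A trajectory-balance-style objective from the GFlowNet literature is then

\[
\mathcal{L}(\tau;\theta) \;=\; \left( \log \frac{Z_\theta \,\prod_{t=0}^{T-1} p_F(x_{t+1}\mid x_t;\theta)}{R(x_T)\,\prod_{t=0}^{T-1} p_B(x_t\mid x_{t+1})} \right)^{2},
\]

where \(Z_\theta\) is a learned scalar estimate of the partition function. The loss vanishes on all trajectories exactly when the terminal marginal of \(p_F\) equals \(R/Z\), and because it does not require τ to be drawn from \(p_F\) itself, it can be minimised on exploratory or replayed trajectories, which is what makes the training off-policy.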
Papers discussed in the talk:
Diffusion models as plug-and-play priors (https://arxiv.org/abs/2206.09012)
Improved off-policy training of diffusion samplers (https://arxiv.org/abs/2402.05098)
Amortizing intractable inference in diffusion models for vision, language, and control (https://arxiv.org/abs/2405.20971)
A theory of continuous generative flow networks (https://arxiv.org/abs/2301.12594)