HDI&TFAIM Lab Seminar: "On statistical complexity of the Inverse Optimal Control"
Inverse optimal control, also known as inverse reinforcement learning (IRL), aims to deduce the cost function that an expert optimizes from their observed behavior. In this talk, we rigorously examine the statistical complexity of this inverse problem for linearly-solvable Markov decision processes (LMDPs) arising from linear-quadratic stochastic control problems. We assume that a discretized trajectory of an optimally controlled process is observed over a long time horizon and analyze the error in reconstructing the terminal cost functional. Our findings demonstrate that the rate of convergence, in terms of the number of time steps, is logarithmic and cannot, in general, be improved.
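As a toy illustration of the setting (not the talk's actual construction, and much simpler than recovering the terminal cost functional), consider a one-dimensional linear-quadratic problem whose optimal policy is linear state feedback u_t = -k X_t, so the closed-loop dynamics are dX_t = -k X_t dt + dW_t. From a discretized trajectory one can recover the feedback gain k by least squares; all numerical values below are illustrative assumptions:

```python
import numpy as np

# Hypothetical toy example: closed-loop dynamics of an optimally
# controlled linear-quadratic problem, dX_t = -k X_t dt + dW_t.
# We observe the process on a time grid of step dt and estimate k
# from the data. Recovering the drift is the "easy" part of the
# inverse problem; reconstructing the cost functional itself is the
# harder question analyzed in the talk.

rng = np.random.default_rng(0)

k_true = 1.0              # assumed optimal feedback gain (illustration)
dt, n_steps = 0.01, 200_000

# Euler-Maruyama simulation of the optimally controlled process
x = np.empty(n_steps + 1)
x[0] = 1.0
noise = rng.normal(scale=np.sqrt(dt), size=n_steps)
for i in range(n_steps):
    x[i + 1] = x[i] - k_true * x[i] * dt + noise[i]

# Drift estimation from the discretized trajectory: regress the
# increments dX on the states X (the maximum-likelihood estimator)
dx = np.diff(x)
k_hat = -np.dot(x[:-1], dx) / (np.dot(x[:-1], x[:-1]) * dt)
print(f"true gain k = {k_true}, estimated k = {k_hat:.3f}")
```

Note the contrast the talk draws: while drift parameters such as k are estimated at the usual parametric rate as the horizon grows, the terminal cost functional can only be reconstructed at a logarithmic rate in the number of time steps.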