RL

[Diffusion Q-learning] Paper Review: Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning

Paper link: https://arxiv.org/abs/2208.06193

Abstract preview (arxiv.org): Offline reinforcement learning (RL), which aims to learn an optimal policy using a previously collected static dataset, is an important paradigm of RL. Standard RL methods often perform poorly in this regime due to the function approximation errors on out- ..