Q-function Approximation

diffusion_fine-tuning_via_reparameterized_policy_gradient_of_the_soft_q-function.md

description [ICLR 2026][Image Generation][Diffusion model fine-tuning] This paper proposes SQDF (Soft Q-based Diffusion Finetuning), which fine-tunes diffusion models under a KL-regularized RL ...

GitHub

q-function-approximation

This material describes the Q-function approximator for DQN/DDQN and the policy-value network used in PPO. Key components include experience replay and target network updates for DQN-based methods, and ...
