Hierarchical Diffusion Policy: Manipulation Trajectory Generation via Contact Guidance
Dexin Wang, Chunsheng Liu, F. Chang, Yichen Xu
TLDR
A set of key technical contributions, including one-shot gradient optimization, trajectory augmentation, and prompt guidance, is proposed; these improve the policy's optimization efficiency, spatial awareness, and interactivity, respectively.
Abstract
Decision-making for robots via denoising diffusion processes has become an increasingly active research topic, but end-to-end diffusion policies perform poorly on contact-rich tasks and offer limited interactivity. This article proposes Hierarchical Diffusion Policy (HDP), a new robot manipulation policy that uses contact points to guide the generation of robot trajectories. The policy is divided into two layers: the high-level policy predicts the contact point for the robot's next object manipulation from 3-D information, while the low-level policy predicts the action sequence toward that contact from latent representations of the observation and the contact. We represent both policies as conditional denoising diffusion processes and combine behavioral cloning with Q-learning to optimize the low-level policy so that actions are accurately guided toward the contact. We benchmark Hierarchical Diffusion Policy on six different tasks and find that it significantly outperforms the existing state-of-the-art imitation learning method Diffusion Policy, with an average improvement of 20.8%. We find that contact guidance yields significant benefits, including higher performance, greater interpretability, and stronger interactivity, especially on contact-rich tasks. To further unlock the potential of HDP, this article proposes a set of key technical contributions, including one-shot gradient optimization, trajectory augmentation, and prompt guidance, which improve the policy's optimization efficiency, spatial awareness, and interactivity, respectively. Finally, real-world experiments verify that HDP can handle both rigid and deformable objects.
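To make the two-level structure concrete, the sketch below illustrates one possible rollout of a hierarchical policy of this kind: a high-level denoising process samples a 3-D contact point conditioned on scene features, and a low-level denoising process samples an action sequence conditioned on observation latents plus that contact. All function names, shapes, and the toy denoising loop are illustrative assumptions, not the authors' implementation; in HDP the denoisers would be learned conditional diffusion networks.

```python
import numpy as np

def denoise(x, condition, n_steps=10):
    """Placeholder reverse-diffusion loop: iteratively refines x toward a
    sample consistent with the conditioning signal. A learned noise-
    prediction network would replace the toy update used here."""
    for _ in range(n_steps):
        noise_estimate = 0.1 * (x - condition.mean())  # stand-in for the network output
        x = x - noise_estimate
    return x

def hierarchical_rollout(scene_feat, obs_feat, horizon=16, action_dim=7):
    # High-level policy: denoise a 3-D contact point conditioned on scene features.
    contact = denoise(np.random.randn(3), scene_feat)

    # Low-level policy: denoise an action sequence conditioned on the
    # observation latent and the predicted contact point.
    cond = np.concatenate([obs_feat, contact])
    actions = denoise(np.random.randn(horizon, action_dim), cond)
    return contact, actions

# Hypothetical usage with random feature vectors standing in for encoders.
contact, actions = hierarchical_rollout(np.random.randn(64), np.random.randn(32))
print(contact.shape, actions.shape)  # (3,) (16, 7)
```

The key design point this sketch highlights is the conditioning chain: the low-level action diffusion never sees the raw goal, only the contact produced by the high-level policy, which is what makes the contact point both an interpretable intermediate output and a handle for interactive correction.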
