Legibility Diffuser: Offline Imitation for Intent Expressive Motion

TLDR

This paper proposes Legibility Diffuser, a diffusion-based policy that learns intent expressive motion directly from human demonstrations by variably combining the noise predictions from a goal-conditioned diffusion model. The authors find that decaying the guidance weight over the course of the trajectory is critical for maintaining a high success rate while maximizing legibility.

Abstract

In human-robot collaboration, legible motion that conveys a robot's intentions and goals is known to improve safety, task efficiency, and user experience. Legible robot motion is typically generated using hand-designed cost functions and classical motion planners. However, with the rise of deep learning and data-driven robot policies, we need methods for training end-to-end on offline demonstration data. In this paper, we propose Legibility Diffuser, a diffusion-based policy that learns intent expressive motion directly from human demonstrations. By variably combining the noise predictions from a goal-conditioned diffusion model, we guide the robot's motion toward the most legible trajectory in the training dataset. We find that decaying the guidance weight over the course of the trajectory is critical for maintaining a high success rate while maximizing legibility.
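
As a rough illustration of what "variably combining the noise predictions" with a decaying guidance weight could look like, the sketch below assumes a classifier-free-guidance-style rule: the noise prediction conditioned on the intended goal is pushed away from the prediction conditioned on an alternative goal, and the weight shrinks linearly over the trajectory. The combination rule, the linear schedule, and the names `guidance_weight`, `combine_noise`, and `w_max` are all assumptions made for illustration; the abstract does not specify them, and the noise arrays stand in for the outputs of a trained model.

```python
import numpy as np


def guidance_weight(step, horizon, w_max=2.0):
    """Hypothetical schedule: decay linearly from w_max at step 0 to 0 at the last step."""
    frac = step / max(horizon - 1, 1)
    return w_max * (1.0 - frac)


def combine_noise(eps_target, eps_other, weight):
    """Assumed combination rule: move toward the target-goal prediction
    and away from the alternative-goal prediction, scaled by `weight`."""
    return eps_target + weight * (eps_target - eps_other)


# Placeholder noise predictions; a real policy would get these from the
# goal-conditioned diffusion model, once per candidate goal.
eps_target = np.array([0.5, -0.2])
eps_other = np.array([0.1, 0.3])

horizon = 5
for step in range(horizon):
    w = guidance_weight(step, horizon)
    eps = combine_noise(eps_target, eps_other, w)
    print(step, w, eps)
```

Under these assumptions, early steps exaggerate the difference between goals, which is where legibility matters most, while the final steps fall back to the plain goal-conditioned prediction so the motion still reaches the goal.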