PDP

Abstract

Generating diverse and realistic human motion that can physically interact with an environment remains a challenging research area in character animation. Meanwhile, diffusion-based methods, as proposed by the robotics community, have demonstrated the ability to capture highly diverse and multimodal skills. However, naively training a diffusion policy often results in unstable motions for high-frequency, under-actuated control tasks like bipedal locomotion, because compounding errors accumulate rapidly and push the agent away from optimal training trajectories. Our method, Physics-Based Character Animation via Diffusion Policy (PDP), combines reinforcement learning (RL) and behavior cloning (BC) to create a robust diffusion policy for physics-based character animation. The key idea is to use RL policies not just to provide optimal trajectories but also to provide corrective actions in sub-optimal states, giving the policy a chance to correct for errors caused by environmental stimuli, model errors, or numerical errors in simulation. We demonstrate PDP on perturbation recovery, universal motion tracking, and physics-based text-to-motion synthesis.

Method

Our method consists of three steps. First, we divide a motion dataset into tasks and train an RL policy for each task. Next, we use the expert policies to collect "noisy-state clean-action" trajectories, where the noisy state is obtained by executing actions from the RL policy with added noise, and the clean action is simply the action from the RL policy without noise. The clean action can be thought of as a corrective action from a noisy state. Lastly, we use the resulting dataset to train a diffusion policy using supervised learning.
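The "noisy-state clean-action" collection step can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the 1-D point-mass dynamics stand in for the physics simulator, the PD controller `expert_policy` stands in for a trained RL expert, and the names `step`, `collect_noisy_state_clean_action`, and `noise_std` are hypothetical. It is not the authors' implementation. The point it shows is that the noisy action is executed in the environment, while the clean action is what gets stored as the label.

```python
import numpy as np

def expert_policy(state):
    """Hypothetical stand-in for a trained RL expert: a PD controller
    that drives a 1-D point mass (position, velocity) to the origin."""
    pos, vel = state
    return np.array([-4.0 * pos - 2.0 * vel])

def step(state, action, dt=0.05):
    """Toy point-mass dynamics (an assumption; not the paper's simulator)."""
    pos, vel = state
    vel = vel + dt * action[0]
    pos = pos + dt * vel
    return np.array([pos, vel])

def collect_noisy_state_clean_action(num_episodes, horizon, noise_std, rng):
    """Roll out the expert with noise added to the executed action,
    while recording the noise-free action as the supervised label."""
    states, actions = [], []
    for _ in range(num_episodes):
        state = rng.normal(0.0, 1.0, size=2)
        for _ in range(horizon):
            clean_action = expert_policy(state)  # label stored in the dataset
            noise = rng.normal(0.0, noise_std, size=clean_action.shape)
            states.append(state)
            actions.append(clean_action)
            # Executing the noisy action pushes the next state off the
            # expert's nominal trajectory, producing a "noisy state".
            state = step(state, clean_action + noise)
    return np.stack(states), np.stack(actions)

rng = np.random.default_rng(0)
states, actions = collect_noisy_state_clean_action(
    num_episodes=4, horizon=50, noise_std=0.5, rng=rng
)
print(states.shape, actions.shape)  # (200, 2) (200, 1)
```

The resulting `(states, actions)` pairs form an ordinary supervised dataset. Because the visited states include off-trajectory ones, a policy trained on them sees examples of how the expert corrects from perturbed states rather than only how it behaves along nominal trajectories.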

Results

Perturbation Recovery

Diffusion Policy captures the multimodality of perturbation recovery strategies. Given the same perturbation, the character can recover in multiple ways. Capturing this multimodality allows for increased robustness to out-of-distribution perturbations.

Motion Tracking

PDP is capable of tracking a wide range of dynamic motions. We successfully track 98.9% of all AMASS* motions.

*Dataset does not contain motions that are infeasible in our simulator, such as those involving object interactions.

Walk

Squat

Kick

Handstand

Breakdance

Backflip

Text-to-Motion

PDP is capable of generating diverse and realistic human motion from text descriptions. We can also chain together text commands to synthesize novel motion sequences.

"A person walks clockwise in a circle"

"Kneel down on the ground"

"A person jumps in the air"

BibTeX


@inproceedings{10.1145/3680528.3687683,
    author = {Truong, Takara Everest and Piseno, Michael and Xie, Zhaoming and Liu, Karen},
    title = {PDP: Physics-Based Character Animation via Diffusion Policy},
    year = {2024},
    isbn = {9798400711312},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3680528.3687683},
    doi = {10.1145/3680528.3687683},
    articleno = {86},
    numpages = {10},
    keywords = {character animation, reinforcement learning, diffusion models},
    series = {SA '24}
}

PDP: Physics-Based Character Animation via Diffusion Policy

SIGGRAPH Asia 2024