We propose a novel approach that enables physically simulated humanoids to learn a variety of basketball skills from human-object demonstrations, such as shooting (blue), retrieving (red), and turnaround layup (yellow). Once acquired, these skills can be reused and combined to accomplish complex tasks, such as continuous scoring (green), which involves dribbling toward the basket, timing the dribble and layup to score, retrieving the rebound, and repeating.
Mastering basketball skills such as diverse layups and dribbling involves complex interactions with the ball and requires real-time adjustments. Traditional reinforcement learning methods for interaction skills rely on labor-intensive, manually designed rewards that do not generalize well across different skills.
Inspired by how humans learn from demonstrations, we propose SkillMimic, a data-driven approach that mimics both human and ball motions to learn a wide variety of basketball skills. SkillMimic employs a unified configuration to learn diverse skills from human-ball motion datasets, with skill diversity and generalization improving as the dataset grows. This approach allows training a single policy to learn multiple skills, enabling smooth skill switching even if these switches are not present in the reference dataset. The skills acquired by SkillMimic can be easily reused by a high-level controller to accomplish complex basketball tasks.
To evaluate our approach, we introduce two basketball datasets: one estimated through monocular RGB videos and the other using advanced motion capture equipment, collectively containing about 35 minutes of diverse basketball skills. Experiments show that our method can effectively learn all the basketball skills contained in the dataset with a unified configuration, including various styles of dribbling, layups, and shooting. Furthermore, by training a high-level controller to reuse the acquired skills, we can achieve complex basketball tasks such as scoring, which involves dribbling toward the basket, timing the dribble and layup to score, retrieving the rebound, and repeating the process.
Our system consists of three parts. (a) First, we capture real-world basketball skills to create a large Human-Object Interaction (HOI) motion dataset. (b) Second, we train a skill policy to learn interaction skills by imitating the corresponding HOI data, using a unified HOI imitation reward designed to imitate diverse HOI state transitions. (c) Third, we train a High-Level Controller (HLC) to reuse the learned skills for complex tasks, using extremely simple task rewards.
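The unified HOI imitation reward can be thought of as a product of exponential kernels, one per component of the human-ball state (body pose, body velocity, object position, object velocity), each penalizing the deviation between the simulated and reference state. The sketch below is a minimal illustration under that assumption; the field names (`body_pos`, `obj_pos`, etc.) and the error scales are hypothetical and not taken from the paper.

```python
import numpy as np

def hoi_imitation_reward(sim, ref, scales=(2.0, 0.1, 5.0, 0.1)):
    """Sketch of a unified HOI imitation reward (assumed form).

    `sim` and `ref` are dicts mapping state-component names to arrays of
    the simulated and reference HOI states at the current timestep.
    Each component contributes an exponential kernel on its squared error;
    the kernels are combined multiplicatively, so every component must be
    imitated well for the reward to stay high.
    """
    components = ("body_pos", "body_vel", "obj_pos", "obj_vel")
    reward = 1.0
    for name, scale in zip(components, scales):
        err = float(np.sum((sim[name] - ref[name]) ** 2))
        reward *= np.exp(-scale * err)  # kernel in [0, 1]
    return float(reward)
```

With this multiplicative form, perfectly matching the reference yields a reward of 1, and neglecting any single component (e.g. dropping the ball while the body pose is correct) collapses the whole reward toward 0.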
Heading (dribble the ball to target positions)
Circling (dribble the ball around center points at target radii)
Scoring (lay the ball up to target positions)
Throwing (throw the ball to reach a target height)
Get Up
Pick Up
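Because the low-level skills handle the interaction details, each HLC task above needs only a very simple task reward. A plausible minimal example for the Heading task (dribble the ball to a target position) is a single exponential kernel on the horizontal ball-to-target distance; the function name and scale below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def heading_task_reward(ball_pos, target_pos, scale=0.5):
    """Hypothetical task reward for Heading: higher as the ball's
    horizontal (x, y) position approaches the target position."""
    dist = float(np.linalg.norm(ball_pos[:2] - target_pos[:2]))
    return float(np.exp(-scale * dist))
```

Analogous one-line rewards (distance to a target circle for Circling, ball height for Throwing, scoring indicator for Scoring) would suffice for the other tasks, since the HLC only needs to select and time the pre-trained skills.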
@misc{wang2024skillmimic,
title={SkillMimic: Learning Reusable Basketball Skills from Demonstrations},
author={Yinhuai Wang and Qihan Zhao and Runyi Yu and Ailing Zeng and Jing Lin and Zhengyi Luo and Hok Wai Tsui and Jiwen Yu and Xiu Li and Qifeng Chen and Jian Zhang and Lei Zhang and Ping Tan},
year={2024},
eprint={2408.15270v1},
archivePrefix={arXiv},
primaryClass={cs.CV}
}