OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning

arXiv: 2406.08858
Authors: Tairan He, Zhengyi Luo, Xialin He, Wenli Xiao, Chong Zhang, Weinan Zhang, Kris Kitani, Changliu Liu, Guanya Shi
Fetched: 2026-02-13
Type: Research Paper (CoRL 2024)


Abstract

OmniH2O presents a learning-based framework enabling humanoid robots to be controlled through kinematic pose as a universal control interface. The system supports multiple control modalities: real-time operation via VR headset, verbal commands, and camera input. Beyond teleoperation, the system achieves autonomous operation by learning from demonstrated movements or collaborating with large language models like GPT-4. The work showcases applications across diverse tasks including sports, object manipulation, and human interaction.
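The core idea of a "universal control interface" is that every input modality is first converted into the same kinematic pose target, and one downstream whole-body policy consumes that target regardless of where it came from. A minimal sketch of this pattern, with hypothetical keypoint choices and function names (the paper's actual representation and dimensions are not reproduced here):

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical unified target: 3D positions of a few tracked keypoints
# (head, left hand, right hand), as in sparse-input teleoperation setups.
@dataclass
class KinematicPose:
    keypoints: np.ndarray  # shape (3, 3)

def from_vr(headset_pos, left_ctrl, right_ctrl):
    # A VR headset and hand controllers give keypoint positions directly.
    return KinematicPose(np.stack([headset_pos, left_ctrl, right_ctrl]))

def from_language(command):
    # Stand-in: in practice a language model / motion generator would map
    # text to a pose; here a toy lookup table marks the interface only.
    lookup = {
        "raise both hands": np.array(
            [[0.0, 0.0, 1.6], [0.3, 0.4, 1.5], [0.3, -0.4, 1.5]]
        ),
    }
    return KinematicPose(lookup[command])

def control_step(pose: KinematicPose):
    # One whole-body policy consumes the same representation no matter
    # which modality produced it; a placeholder "action" is returned here.
    return pose.keypoints.mean(axis=0)
```

The design payoff is that adding a new input modality (e.g. an RGB pose estimator) only requires a new `from_*` converter, not a new controller.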

Key Contributions

  • Universal control interface: Uses kinematic pose as a unified representation that supports multiple input modalities (VR headset, verbal commands, RGB camera)
  • Sim-to-real pipeline: Developed reinforcement learning methods with large-scale retargeting and augmentation of human motion data, enabling real-world deployment under minimal sensor requirements
  • Multi-modal control versatility: Demonstrated dexterous whole-body control across multiple real-world tasks through both teleoperation and autonomous modes
  • OmniH2O-6 dataset: Introduced the first humanoid whole-body control dataset containing six everyday tasks for advancing skill learning research
  • Teacher-student learning: Implemented a privileged teacher policy approach to enable policy learning with sparse sensor inputs in physical deployments
  • LLM integration: Demonstrated collaboration with GPT-4 for autonomous task execution from verbal instructions
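The teacher-student bullet above describes distilling a privileged policy (which sees full simulation state) into a student that must act from sparse sensor inputs. A minimal numpy sketch of that supervised distillation step, with invented dimensions and linear "policies" standing in for the paper's RL-trained networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper).
FULL_STATE = 12   # privileged state: joint pos/vel, root velocity, etc.
SPARSE_OBS = 4    # deployment observations: e.g. head + hand poses only
ACT = 6           # action dimension

# "Teacher": trained with privileged simulation state (a fixed random
# linear map stands in for an RL-trained network here).
W_teacher = rng.normal(size=(ACT, FULL_STATE))

# The student sees only a sparse slice of the state, mimicking the
# limited sensing available on real hardware.
sparse_idx = np.arange(SPARSE_OBS)

# Collect (sparse observation, teacher action) pairs, then fit the
# student by least squares -- a stand-in for supervised distillation.
X = rng.normal(size=(1000, FULL_STATE))
Y = X @ W_teacher.T
W_student, *_ = np.linalg.lstsq(X[:, sparse_idx], Y, rcond=None)

# Distillation gap: the student imitates the teacher as well as its
# sparse inputs allow; the residual reflects the withheld state.
err = np.mean((X[:, sparse_idx] @ W_student - Y) ** 2)
```

The point of the pattern is that the hard exploration problem is solved once with privileged information, and deployment only needs the cheap supervised imitation under realistic observations.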

G1 Relevance

OmniH2O's hardware experiments were conducted on the Unitree H1 humanoid, extending the earlier H2O work, but the framework remains highly relevant to G1 whole-body control: kinematic pose as the control interface is platform-agnostic, and the sim-to-real pipeline (motion retargeting, augmentation, teacher-student distillation) applies to any humanoid with a compatible kinematic structure. The OmniH2O-6 dataset provides whole-body task demonstrations that can inform G1 skill-learning research after retargeting, and the multi-modal input support (VR, camera, voice) offers flexible integration options for G1 deployments.
