OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning

arXiv: 2406.08858
Authors: Tairan He, Zhengyi Luo, Xialin He, Wenli Xiao, Chong Zhang, Weinan Zhang, Kris Kitani, Changliu Liu, Guanya Shi
Fetched: 2026-02-13
Type: Research Paper (CoRL 2024)


Abstract

OmniH2O presents a learning-based framework enabling humanoid robots to be controlled through kinematic pose as a universal control interface. The system supports multiple control modalities: real-time operation via VR headset, verbal commands, and camera input. Beyond teleoperation, the system achieves autonomous operation by learning from demonstrated movements or collaborating with large language models like GPT-4. The work showcases applications across diverse tasks including sports, object manipulation, and human interaction.
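The core idea of a "universal control interface" is that every input modality is first converted into the same kinematic pose target, and one downstream whole-body policy consumes that target regardless of where it came from. A minimal sketch of this pattern, with hypothetical keypoint choices and function names (the paper's actual representation and dimensions are not reproduced here):

```python
from dataclasses import dataclass
import numpy as np

# Hypothetical unified target: 3D positions of a few tracked keypoints
# (head, left hand, right hand), as in sparse-input teleoperation setups.
@dataclass
class KinematicPose:
    keypoints: np.ndarray  # shape (3, 3)

def from_vr(headset_pos, left_ctrl, right_ctrl):
    # A VR headset and hand controllers give keypoint positions directly.
    return KinematicPose(np.stack([headset_pos, left_ctrl, right_ctrl]))

def from_language(command):
    # Stand-in: in practice a language model / motion generator would map
    # text to a pose; here a toy lookup table marks the interface only.
    lookup = {
        "raise both hands": np.array(
            [[0.0, 0.0, 1.6], [0.3, 0.4, 1.5], [0.3, -0.4, 1.5]]
        ),
    }
    return KinematicPose(lookup[command])

def control_step(pose: KinematicPose):
    # One whole-body policy consumes the same representation no matter
    # which modality produced it; a placeholder "action" is returned here.
    return pose.keypoints.mean(axis=0)
```

The design payoff is that adding a new input modality (e.g. an RGB pose estimator) only requires a new `from_*` converter, not a new controller.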

Key Contributions

  • Universal control interface: Uses kinematic pose as a unified representation that supports multiple input modalities (VR headset, verbal commands, RGB camera)
  • Sim-to-real pipeline: Developed reinforcement learning methods with large-scale retargeting and augmentation of human motion data, enabling real-world deployment under minimal sensor requirements
  • Multi-modal control versatility: Demonstrated dexterous whole-body control across multiple real-world tasks through both teleoperation and autonomous modes
  • OmniH2O-6 dataset: Introduced the first humanoid whole-body control dataset containing six everyday tasks for advancing skill learning research
  • Teacher-student learning: Implemented a privileged teacher policy approach to enable policy learning with sparse sensor inputs in physical deployments
  • LLM integration: Demonstrated collaboration with GPT-4 for autonomous task execution from verbal instructions
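The teacher-student bullet above describes distilling a privileged policy (which sees full simulation state) into a student that must act from sparse sensor inputs. A minimal numpy sketch of that supervised distillation step, with invented dimensions and linear "policies" standing in for the paper's RL-trained networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper).
FULL_STATE = 12   # privileged state: joint pos/vel, root velocity, etc.
SPARSE_OBS = 4    # deployment observations: e.g. head + hand poses only
ACT = 6           # action dimension

# "Teacher": trained with privileged simulation state (a fixed random
# linear map stands in for an RL-trained network here).
W_teacher = rng.normal(size=(ACT, FULL_STATE))

# The student sees only a sparse slice of the state, mimicking the
# limited sensing available on real hardware.
sparse_idx = np.arange(SPARSE_OBS)

# Collect (sparse observation, teacher action) pairs, then fit the
# student by least squares -- a stand-in for supervised distillation.
X = rng.normal(size=(1000, FULL_STATE))
Y = X @ W_teacher.T
W_student, *_ = np.linalg.lstsq(X[:, sparse_idx], Y, rcond=None)

# Distillation gap: the student imitates the teacher as well as its
# sparse inputs allow; the residual reflects the withheld state.
err = np.mean((X[:, sparse_idx] @ W_student - Y) ** 2)
```

The point of the pattern is that the hard exploration problem is solved once with privileged information, and deployment only needs the cheap supervised imitation under realistic observations.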

G1 Relevance

OmniH2O's hardware experiments were conducted on the Unitree H1 humanoid, extending the earlier H2O work, but the framework remains highly relevant to G1 whole-body control: kinematic pose as the control interface is platform-agnostic, and the sim-to-real pipeline (motion retargeting, augmentation, teacher-student distillation) applies to any humanoid with a compatible kinematic structure. The OmniH2O-6 dataset provides whole-body task demonstrations that can inform G1 skill-learning research after retargeting, and the multi-modal input support (VR, camera, voice) offers flexible integration options for G1 deployments.
