Projects
JARVIS-VLA
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse
MCU
MCU: An Evaluation Framework for Open-Ended Game Agents
ROCKET-1
ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting
OmniJARVIS
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents
JARVIS-1
JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models
GROOT
GROOT: Learning to Follow Instructions by Watching Gameplay Videos