
# Welcome to MineStudio!
MineStudio contains a series of tools and APIs that can help you quickly develop Minecraft AI agents.
- **Simulator**: an easily customizable Minecraft simulator based on MineRL.
- **Data**: a trajectory data structure for efficiently storing and retrieving arbitrary trajectory segments.
- **Models**: a template for Minecraft policy models and a gallery of baseline models.
- **Offline Training**: a straightforward pipeline for pre-training Minecraft agents with offline data.
- **Online Training**: an efficient RL implementation supporting memory-based policies and simulator crash recovery.
- **Inference**: a parallelized and distributed inference framework based on Ray.
- **Benchmark**: automated batch testing of diverse Minecraft tasks with MCU.
- **API**: a comprehensive API reference for MineStudio, covering all modules and classes.
This repository is under development. We welcome any contributions and suggestions.
## News
2025/05/28 - We have released a major update of MineStudio (v1.1.4) with the following changes:

- Refactored the data component to support more flexible data loading and processing; all trajectory modalities are now decoupled, so users can customize their own data processing methods.
- Added detailed code comments and docstrings to all modules, making the code easier to understand and use.
- Improved the documentation with more examples, tutorials, and a new API reference section.
## Quick Start
```bash
conda create -n minestudio --channel=conda-forge python=3.10 openjdk=8 -y
conda activate minestudio
pip install minestudio

# headless rendering via xvfb, for example
sudo apt install -y xvfb mesa-utils libegl1-mesa libgl1-mesa-dev libglu1-mesa-dev
python -m minestudio.simulator.entry
```
The simulator can be started with xvfb or VirtualGL; see Installation for more details. We also provide a Docker image to run MineStudio in a container.
```python
from minestudio.simulator import MinecraftSim
# Import callbacks from minestudio.simulator.callbacks if needed

sim = MinecraftSim(
    obs_size=(224, 224),
    render_size=(640, 360),
    callbacks=[...],  # fill in the callbacks you need
)
obs, info = sim.reset()
```
We provide a set of callbacks to customize the simulator behavior, such as monitoring FPS, sending Minecraft commands, and more. You can also create your own callbacks by inheriting from BaseCallback.
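As a minimal sketch, a custom callback could look like the following. Note that the import path and the `after_step` hook name are assumptions for illustration; check the `BaseCallback` definition for the actual interface.

```python
from minestudio.simulator.callbacks import BaseCallback  # import path assumed

class StepCounterCallback(BaseCallback):
    """Illustrative callback that counts simulator steps."""

    def __init__(self):
        super().__init__()
        self.num_steps = 0

    def after_step(self, sim, obs, reward, terminated, truncated, info):
        # Assumed hook: called once after every sim.step().
        self.num_steps += 1
        return obs, reward, terminated, truncated, info
```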
```python
# Create a raw dataset from the 6xx dataset
from minestudio.data import RawDataset
from minestudio.data.minecraft.callbacks import ImageKernelCallback, ActionKernelCallback

dataset = RawDataset(
    dataset_dirs=[
        '/nfs-shared-2/data/contractors/dataset_6xx',
    ],
    modal_kernel_callbacks=[
        ImageKernelCallback(frame_width=224, frame_height=224, enable_video_aug=False),
        ActionKernelCallback(enable_prev_action=True, win_bias=1, read_bias=-1),
    ],
    win_len=128,
    split_ratio=0.9,
    shuffle_episodes=True,
)
item = dataset[0]
print(item.keys())
```
We classify and save the data according to its modality, e.g. image, action, etc. You can incorporate your own data modalities by defining custom callbacks.
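Because each item is a dict of per-modality arrays, the dataset can be consumed like any map-style PyTorch dataset. The sketch below assumes the default collate function can batch all the modalities you enabled:

```python
from torch.utils.data import DataLoader

# Batch the 128-frame windows defined above; the batch size is arbitrary here.
loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=2)
batch = next(iter(loader))
print({key: value.shape for key, value in batch.items() if hasattr(value, "shape")})
```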
```python
from minestudio.models import VPTPolicy

policy = VPTPolicy.from_pretrained("CraftJarvis/MineStudio_VPT.rl_from_early_game_2x").to("cuda")
policy.eval()

memory = None
# Interact with the simulator (obs comes from sim.reset() above)
action, memory = policy.get_action(obs, memory, input_shape='*')
obs, reward, terminated, truncated, info = sim.step(action)
```
We provide a template for Minecraft policy models, which can be used to build your own models or to load pre-trained ones. We also provide a gallery of baseline models, including VPT, GROOT, and more. You can find the pre-trained models on Hugging Face.
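Putting the two snippets together, a full rollout loop might look like this sketch. It reuses `sim` and `policy` from above; the 600-step horizon is arbitrary, and the `close()` call assumes the usual Gymnasium-style interface:

```python
obs, info = sim.reset()
memory = None
for _ in range(600):  # arbitrary horizon
    action, memory = policy.get_action(obs, memory, input_shape='*')
    obs, reward, terminated, truncated, info = sim.step(action)
    if terminated or truncated:
        obs, info = sim.reset()
        memory = None  # reset the recurrent state with the episode
sim.close()
```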
```python
import lightning as L

from minestudio.data import MineDataModule
from minestudio.offline import MineLightning
from minestudio.offline.mine_callbacks import BehaviorCloneCallback

# policy, lr, warmup_steps, decay, and bc_weight are defined elsewhere
mine_lightning = MineLightning(
    mine_policy=policy,
    learning_rate=lr,
    warmup_steps=warmup_steps,
    weight_decay=decay,
    callbacks=[BehaviorCloneCallback(weight=bc_weight)],
)
mine_data = MineDataModule(
    data_params=data_params_dict,
    # additional parameters
)
L.Trainer(
    callbacks=[
        # callbacks for training, e.g. SmartCheckpointCallback, LearningRateMonitor, etc.
    ]
).fit(model=mine_lightning, datamodule=mine_data)
```
We make it easy to train Minecraft agents with offline data. Users can define their own data processing methods and training callbacks. The training pipeline is based on PyTorch Lightning, which allows users to easily customize the training process.
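For instance, the placeholder callback list above can be filled with stock PyTorch Lightning callbacks such as `ModelCheckpoint` and `LearningRateMonitor`; the monitored metric name below is a hypothetical example, so use whatever your policy actually logs:

```python
import lightning as L
from lightning.pytorch.callbacks import LearningRateMonitor, ModelCheckpoint

trainer = L.Trainer(
    max_epochs=10,
    callbacks=[
        # "train/loss" is a hypothetical metric name; match your own logging.
        ModelCheckpoint(monitor="train/loss", save_top_k=3, mode="min"),
        LearningRateMonitor(logging_interval="step"),
    ],
)
trainer.fit(model=mine_lightning, datamodule=mine_data)
```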
```python
from omegaconf import OmegaConf

from minestudio.online.rollout.start_manager import start_rolloutmanager
from minestudio.online.trainer.start_trainer import start_trainer

def policy_generator():
    return policy

def env_generator():
    # customize the env with reward and task callbacks
    return sim

online_cfg = OmegaConf.create(online_dict)  # online_dict holds the training configuration
start_rolloutmanager(policy_generator, env_generator, online_cfg)
start_trainer(policy_generator, env_generator, online_cfg)
```
We implemented a distributed online training framework based on Ray. It supports memory-based policies and simulator crash recovery. Users can customize the training process by defining their own policy and environment generators, as well as the training configuration.
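The configuration itself is a plain OmegaConf object. The keys in the sketch below are purely hypothetical placeholders meant to show the mechanics of building `online_dict`, not the actual schema; consult the online training documentation for the real fields:

```python
from omegaconf import OmegaConf

# Every key here is hypothetical; see the online training docs for the schema.
online_dict = {
    "num_rollout_workers": 4,  # hypothetical: parallel rollout workers
    "rollout_length": 128,     # hypothetical: steps per rollout fragment
    "learning_rate": 3e-5,     # hypothetical: optimizer setting
}
online_cfg = OmegaConf.create(online_dict)
print(OmegaConf.to_yaml(online_cfg))
```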
```python
import ray
from functools import partial

from minestudio.inference import EpisodePipeline, MineGenerator, InfoBaseFilter

ray.init()
env_generator = partial(MinecraftSim, ...)  # fill in the simulator arguments
agent_generator = lambda: ...  # load your policy model here
worker_kwargs = dict(
    env_generator=env_generator,
    agent_generator=agent_generator,
)  # additional parameters for the worker
pipeline = EpisodePipeline(
    episode_generator=MineGenerator(
        num_workers=8,    # the number of workers
        num_gpus=0.25,    # the number of GPUs per worker
        max_restarts=3,   # the maximum number of restarts for failed workers
        **worker_kwargs,
    ),
    episode_filter=InfoBaseFilter(
        key="mine_block",
        regex=".*log.*",
        num=1,
    ),  # label episodes that mine more than one log block
)
summary = pipeline.run()
print(summary)
```
The distributed inference framework is designed to efficiently evaluate Minecraft agents in parallel. It allows users to filter episodes based on specific criteria. Users can customize the episode generator and filter to suit their needs.
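If `InfoBaseFilter` does not fit your criteria, a custom filter can be swapped in. The sketch below assumes a filter is a callable that receives per-episode info and returns whether to keep the episode; that interface is an assumption, so mirror how `InfoBaseFilter` is actually invoked by `EpisodePipeline`:

```python
class MinEpisodeLengthFilter:
    """Hypothetical filter: keep episodes with at least `min_steps` steps.

    The __call__ signature is assumed; check how EpisodePipeline invokes
    its episode_filter before relying on this.
    """

    def __init__(self, min_steps: int = 100):
        self.min_steps = min_steps

    def __call__(self, episode_info: dict) -> bool:
        return episode_info.get("num_steps", 0) >= self.min_steps
```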
```yaml
custom_init_commands:
- /give @s minecraft:water_bucket 3
- /give @s minecraft:stone 64
- /give @s minecraft:dirt 64
- /give @s minecraft:shovel{Enchantments:[{id:"minecraft:efficiency",lvl:1}]} 1
text: Build a waterfall in your Minecraft world.
```
We provide a set of benchmark tasks for Minecraft agents, which can be used to evaluate the performance of the agents. The tasks are defined in YAML format, and users can easily add their own tasks. The benchmark tasks are designed to be compatible with MCU, a Minecraft task automation framework.
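One way to wire such a task into the simulator is to read the YAML and replay its init commands through a command-sending callback. In this sketch the file name is hypothetical, and `CommandsCallback` is assumed to be one of the provided callbacks; substitute the actual command-sending callback from `minestudio.simulator.callbacks`:

```python
import yaml

from minestudio.simulator import MinecraftSim
from minestudio.simulator.callbacks import CommandsCallback  # assumed callback

with open("build_waterfall.yaml") as f:  # hypothetical task file
    task = yaml.safe_load(f)

sim = MinecraftSim(
    obs_size=(224, 224),
    callbacks=[CommandsCallback(commands=task["custom_init_commands"])],
)
obs, info = sim.reset()
print(task["text"])  # the natural-language task prompt
```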
## Gallery
### Datasets
We converted the contractor data provided by the OpenAI VPT project to our trajectory structure and released it on Hugging Face. (The old dataset format is only available in v1.0.6 and earlier; from v1.1.0, we changed the dataset structure to support more flexible data loading and processing.)
| Dataset | Description | Copyright |
|---|---|---|
| 6xx | Free Gameplay | OpenAI |
| 7xx | Early Game | OpenAI |
| 8xx | House Building from Scratch | OpenAI |
| 9xx | House Building from Random Materials | OpenAI |
| 10xx | Obtain Diamond Pickaxe | OpenAI |
### Models
We have included a gallery of SOTA pre-trained Minecraft agents, such as VPT, GROOT, STEVE-1, and ROCKET. These models were trained by us or by other researchers and are available on Hugging Face. You can use them directly in your projects or as a starting point for further training and fair comparison.
| Model | Description | Copyright |
|---|---|---|
| VPT Foundation Model 1x | Behavior cloning on all contractor data, 71M parameters | OpenAI |
| VPT Foundation Model 2x | Behavior cloning on all contractor data, 248M parameters | OpenAI |
| VPT Foundation Model 3x | Behavior cloning on all contractor data, 0.5B parameters | OpenAI |
| VPT-BC Early Game 2x | Behavior cloning on early game data | OpenAI |
| VPT-RL from House 2x | RL from VPT-BC House | OpenAI |
| VPT-RL from Early Game 2x | RL from VPT-BC Early Game | OpenAI |
| VPT-BC House 3x | Behavior cloning on house building data | OpenAI |
| VPT-BC Early Game 3x | Behavior cloning on early game data | OpenAI |
| VPT-RL Shoot Animals 2x | RL for shooting animals | CraftJarvis |
| VPT-RL Build Portal 2x | RL for building a nether portal | CraftJarvis |
| GROOT | Self-supervised training to follow demonstration videos | CraftJarvis |
| STEVE-1 | Language/image-conditioned Minecraft agent | STEVE-1 |
| ROCKET-1 | Segment-conditioned agent powered by SAM-2 and VLMs | CraftJarvis |
| ROCKET-2 1x | Segment-conditioned agent capable of handling cross-view instructions | CraftJarvis |
| ROCKET-2 1.5x | Segment-conditioned agent capable of handling cross-view instructions | CraftJarvis |