
Getting Started

Before you start, make sure you have installed MineStudio and its dependencies.

Install MineStudio

Installation

Note

If you encounter any issues during installation, please open an issue on GitHub.

Welcome to MineStudio! Follow the steps below to install it.

Install JDK 8

To ensure that the Simulator runs smoothly, please make sure that JDK 8 is installed on your system. We recommend using conda to maintain an environment on Linux systems.

$ conda create -n minestudio python=3.10 -y
$ conda activate minestudio
$ conda install --channel=conda-forge openjdk=8 -y

Install MineStudio

a. Install MineStudio from GitHub.

$ pip install git+https://github.com/CraftJarvis/MineStudio.git

b. Install MineStudio from PyPI.

$ pip install minestudio

Install the rendering tool

For users with NVIDIA graphics cards, we recommend installing VirtualGL. For other users, we recommend Xvfb, which supports CPU rendering but is slower.

Note

Installing rendering tools may require root permissions.

There are two options, Xvfb and VirtualGL. To install Xvfb:

$ apt update 
$ apt install -y xvfb mesa-utils libegl1-mesa libgl1-mesa-dev libglu1-mesa-dev 

Warning

Not all graphics cards support VirtualGL. If rendering speed is not critical for you, we recommend Xvfb, which is easier to install.

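To use VirtualGL, first install the system packages below and download the VirtualGL package (virtualgl_3.1_amd64.deb) and the vgl_entrypoint.sh script used in the following steps: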

$ apt update 
$ apt install -y xvfb mesa-utils libegl1-mesa libgl1-mesa-dev libglu1-mesa-dev 

Install the downloaded package.

$ dpkg -i virtualgl_3.1_amd64.deb

Shut down the display manager.

$ service gdm stop 

Configure VirtualGL.

$ /opt/VirtualGL/bin/vglserver_config 

Note

First choose 1, then answer Yes, No, No, No, and finally enter X to exit.

Start the display manager.

$ service gdm start

Start the VirtualGL server.

$ bash vgl_entrypoint.sh

Warning

Each time the system is restarted, it may be necessary to run vgl_entrypoint.sh.

Configure the environment variables.

$ export PATH="${PATH}:/opt/VirtualGL/bin" 
$ export LD_LIBRARY_PATH="/usr/lib/libreoffice/program:${LD_LIBRARY_PATH}" 
$ export VGL_DISPLAY="egl" 
$ export VGL_REFRESHRATE="$REFRESH"
$ export DISPLAY=:1
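
Before launching the simulator, it can help to confirm that the shell actually sees the rendering tools and the variables exported above. The following is a minimal sketch that uses only the Python standard library. The function name `check_render_env` is hypothetical and not part of MineStudio, and treating the three variables as required is an assumption made for illustration.

```python
import os
import shutil

# Variables exported in the VirtualGL setup above; treating all of them
# as required is an assumption of this sketch.
VGL_VARS = ("VGL_DISPLAY", "VGL_REFRESHRATE", "DISPLAY")


def check_render_env(env=None, which=shutil.which):
    """Report which rendering tools and variables are visible to this process."""
    env = os.environ if env is None else env
    return {
        # vglrun is the VirtualGL launcher, Xvfb is the virtual framebuffer server.
        "vglrun_on_path": which("vglrun") is not None,
        "xvfb_on_path": which("Xvfb") is not None,
        # Unset and empty variables are both reported as missing.
        "missing_vars": [name for name in VGL_VARS if not env.get(name)],
    }


print(check_render_env())
```

Note that `VGL_REFRESHRATE="$REFRESH"` expands to an empty string when `REFRESH` is unset, which this check reports as a missing variable.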

Verify by running the simulator

Hint

The first time you run it, the script will ask whether to download the compiled model from Hugging Face; just choose Y.

If you are using Xvfb, run the following command:

$ python -m minestudio.simulator.entry

If you are using VirtualGL, run the following command:

$ MINESTUDIO_GPU_RENDER=1 python -m minestudio.simulator.entry

If you see the following output, the installation is successful.

Speed Test Status: 
Average Time: 0.03 
Average FPS: 38.46 
Total Steps: 50 

Speed Test Status: 
Average Time: 0.02 
Average FPS: 45.08 
Total Steps: 100 
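
The two figures in each report are related: the average FPS is roughly the reciprocal of the average step time, and the time is printed with only two decimals, which is why 0.03 s appears next to 38.46 FPS. The sketch below reproduces that arithmetic. It assumes the speed test simply averages per-step wall-clock times; `speed_report` is a hypothetical helper, not MineStudio code.

```python
def speed_report(step_times):
    """Summarise a list of per-step wall-clock times given in seconds."""
    total_steps = len(step_times)
    average_time = sum(step_times) / total_steps
    return {
        "average_time": average_time,
        "average_fps": 1.0 / average_time,
        "total_steps": total_steps,
    }


# 50 steps of 26 ms each: the time rounds to 0.03 while the FPS stays near 38.46.
report = speed_report([0.026] * 50)
print(f"Average Time: {report['average_time']:.2f}")
print(f"Average FPS: {report['average_fps']:.2f}")
print(f"Total Steps: {report['total_steps']}")
```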

MineStudio Libraries Quickstart

Pick the library you want to start with:

Simulator: Customizable Minecraft Environment

Here is a minimal example of how to use the simulator:

from minestudio.simulator import MinecraftSim

sim = MinecraftSim(action_type="env")
obs, info = sim.reset()
for _ in range(100):
    action = sim.action_space.sample()
    obs, reward, terminated, truncated, info = sim.step(action)
sim.close()

You can also customize the environment by chaining multiple callbacks. Here is an example:

import numpy as np
from minestudio.simulator import MinecraftSim
from minestudio.simulator.callbacks import (
    SpeedTestCallback, 
    RecordCallback, 
    SummonMobsCallback, 
    MaskActionsCallback, 
    RewardsCallback, 
    CommandsCallback, 
    FastResetCallback
)

sim = MinecraftSim(
    action_type="env",
    callbacks=[
        SpeedTestCallback(50), 
        SummonMobsCallback([{'name': 'cow', 'number': 10, 'range_x': [-5, 5], 'range_z': [-5, 5]}]),
        MaskActionsCallback(inventory=0, camera=np.array([0., 0.])), 
        RecordCallback(record_path="./output", fps=30),
        RewardsCallback([{
            'event': 'kill_entity', 
            'objects': ['cow', 'sheep'], 
            'reward': 1.0, 
            'identity': 'kill sheep or cow', 
            'max_reward_times': 5, 
        }]),
        CommandsCallback(commands=[
            '/give @p minecraft:iron_sword 1',
            '/give @p minecraft:diamond 64',
        ]), 
        FastResetCallback(
            biomes=['mountains'],
            random_tp_range=1000,
        )
    ]
)
obs, info = sim.reset()
print(sim.action_space)
for i in range(100):
    action = sim.action_space.sample()
    obs, reward, terminated, truncated, info = sim.step(action)
sim.close()
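
To build intuition for what chaining callbacks means, here is a toy sketch that does not depend on MineStudio. It assumes each callback may rewrite the action before the step and the result after it, applied in list order. The hook names `before_step` and `after_step` and all class names are illustrative assumptions, not a description of MineStudio's actual callback interface.

```python
class ToyCallback:
    """Base class with no-op hooks (hypothetical names for illustration)."""

    def before_step(self, action):
        return action

    def after_step(self, obs, reward, info):
        return obs, reward, info


class ZeroCameraCallback(ToyCallback):
    """Masks the camera action, loosely in the spirit of an action mask."""

    def before_step(self, action):
        return {**action, "camera": (0.0, 0.0)}


class BonusRewardCallback(ToyCallback):
    """Adds a fixed bonus to the reward after every step."""

    def __init__(self, bonus):
        self.bonus = bonus

    def after_step(self, obs, reward, info):
        return obs, reward + self.bonus, info


class ToySim:
    """Stand-in environment that echoes the action it finally received."""

    def __init__(self, callbacks):
        self.callbacks = list(callbacks)

    def step(self, action):
        for cb in self.callbacks:
            action = cb.before_step(action)
        obs, reward, info = {"last_action": action}, 0.0, {}
        for cb in self.callbacks:
            obs, reward, info = cb.after_step(obs, reward, info)
        return obs, reward, info


sim = ToySim([ZeroCameraCallback(), BonusRewardCallback(1.0)])
obs, reward, info = sim.step({"camera": (3.0, -2.0), "attack": 1})
print(obs, reward)
```

Keeping each concern in its own callback is what lets the example above combine recording, rewards, and commands without modifying the simulator itself.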

Learn more about MineStudio Simulator

Data: Flexible Data Structures and Fast Dataloaders

Here is a minimal example to show how we load a trajectory from the dataset.

from minestudio.data import load_dataset

dataset = load_dataset(
    mode='raw', 
    dataset_dirs=['6xx', '7xx', '8xx', '9xx', '10xx'], 
    enable_video=True,
    enable_action=True,
    frame_width=224, 
    frame_height=224,
    win_len=128, 
    split='train', 
    split_ratio=0.9, 
    verbose=True
)
item = dataset[0]
print(item.keys())

You should see output similar to this:

[08:14:15] [Kernel] Driver video load 15738 episodes.  
[08:14:15] [Kernel] Driver action load 15823 episodes. 
[08:14:15] [Kernel] episodes: 15655, frames: 160495936. 
dict_keys(['text', 'timestamp', 'episode', 'progress', 'env_action', 'agent_action', 'env_prev_action', 'agent_prev_action', 'image', 'mask'])

Hint

Please note that the dataset_dirs parameter here is a list that can contain multiple dataset directories. In this example, we have loaded five dataset directories.

If an element in the list is one of 6xx, 7xx, 8xx, 9xx, or 10xx, the program will automatically download it from Hugging Face, so please ensure your network connection is stable and you have enough storage space.

If an element in the list is a directory like /nfs-shared/data/contractors/dataset_6xx, the program will load data directly from that directory.

You can also mix the two types of elements in the list.
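
The rule described above can be pictured as a simple split of the list into named datasets and local directories. The sketch below only illustrates that rule. `KNOWN_REMOTE` and `split_dataset_dirs` are hypothetical names, and how MineStudio resolves the entries internally is not shown here.

```python
# The five dataset names listed above; anything else is treated as a local path.
KNOWN_REMOTE = {"6xx", "7xx", "8xx", "9xx", "10xx"}


def split_dataset_dirs(dataset_dirs):
    """Separate entries that name a hosted dataset from local directories."""
    remote = [d for d in dataset_dirs if d in KNOWN_REMOTE]
    local = [d for d in dataset_dirs if d not in KNOWN_REMOTE]
    return remote, local


remote, local = split_dataset_dirs(
    ["6xx", "/nfs-shared/data/contractors/dataset_6xx", "10xx"]
)
print("download from Hugging Face:", remote)
print("load from disk:", local)
```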

Learn more about Raw Dataset

You can also load only the trajectories that contain specific events, for example all trajectories that contain the kill_entity event.

from minestudio.data import load_dataset

dataset = load_dataset(
    mode='event', 
    dataset_dirs=['7xx'], 
    enable_video=True,
    enable_action=True,
    frame_width=224, 
    frame_height=224,
    win_len=128, 
    split='train', 
    split_ratio=0.9, 
    verbose=True,
    event_regex='minecraft.kill_entity:.*'
)
item = dataset[0]
print(item.keys())

You should see output similar to this:

[08:19:14] [Kernel] Driver video load 4617 episodes.
[08:19:14] [Kernel] Driver action load 4681 episodes. 
[08:19:14] [Kernel] episodes: 4568, frames: 65291168. 
[08:19:14] [Event Kernel] Number of loaded events: 58. 
[08:19:14] [Event Dataset] Regex: minecraft.kill_entity:.*, Number of events: 58, number of items: 19652
dict_keys(['text', 'env_action', 'agent_action', 'env_prev_action', 'agent_prev_action', 'image', 'mask'])
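
The `event_regex` argument is an ordinary regular expression. The sketch below shows how such a pattern selects event names with Python's `re` module. The event names are made up for illustration, and whether MineStudio applies the pattern with `re.match`, `re.search`, or another call is an assumption.

```python
import re


def select_events(event_names, event_regex):
    """Keep the event names that match the pattern from their first character."""
    pattern = re.compile(event_regex)
    return [name for name in event_names if pattern.match(name)]


# Hypothetical event names, used only to demonstrate the matching.
events = [
    "minecraft.kill_entity:minecraft.cow",
    "minecraft.kill_entity:minecraft.sheep",
    "minecraft.mine_block:minecraft.dirt",
]
print(select_events(events, "minecraft.kill_entity:.*"))
```

Because an unescaped `.` matches any character, a stricter pattern would escape it as `minecraft\.kill_entity:.*`.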

Learn more about Event Dataset

Models: Policy Template and Baselines

Here is an example that shows how to load OpenAI's VPT policy and run it in the Minecraft environment.

from minestudio.simulator import MinecraftSim
from minestudio.simulator.callbacks import RecordCallback
from minestudio.models import load_vpt_policy, VPTPolicy

# load the policy from the local model files
policy = load_vpt_policy(
    model_path="/path/to/foundation-model-2x.model", 
    weights_path="/path/to/foundation-model-2x.weights"
).to("cuda")

# or load the policy from the Hugging Face model hub
policy = VPTPolicy.from_pretrained("CraftJarvis/MineStudio_VPT.rl_from_early_game_2x").to("cuda")

policy.eval()

env = MinecraftSim(
    obs_size=(128, 128), 
    callbacks=[RecordCallback(record_path="./output", fps=30, frame_type="pov")]
)
memory = None
obs, info = env.reset()
for i in range(1200):
    action, memory = policy.get_action(obs, memory, input_shape='*')
    obs, reward, terminated, truncated, info = env.step(action)
env.close()

Hint

In this example, the recorded video will be saved in the ./output directory.
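
The rollout loop above passes `memory` into `get_action` and receives an updated value back on every step. The toy sketch below illustrates that pattern with a stand-in policy whose memory is just a step counter. `CountingPolicy` is hypothetical and much simpler than a real recurrent policy, and its `get_action` omits the `input_shape` argument used above.

```python
class CountingPolicy:
    """Toy stand-in for a recurrent policy; its memory is a step counter."""

    def get_action(self, obs, memory):
        steps_seen = 0 if memory is None else memory
        # Attack on even steps only, so the action depends on the memory.
        action = {"attack": int(steps_seen % 2 == 0)}
        return action, steps_seen + 1


policy = CountingPolicy()
memory = None  # None means "start of episode", as in the loop above
actions = []
for obs in range(4):
    action, memory = policy.get_action(obs, memory)
    actions.append(action["attack"])
print(actions, memory)
```

If the returned memory were discarded, the policy would restart from its initial state on every step, so resetting `memory` to `None` belongs only at episode boundaries.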

Offline: Pre-Training Policy with Offline Data

Tutorial: Fine-tuning VPT to a Hunter

Fine-tuning a VPT policy in MineStudio is simple.

The following code snippets show how to fine-tune a VPT policy to hunt animals in Minecraft using offline data.

  1. Import some dependencies:

    import lightning as L
    from lightning.pytorch.loggers import WandbLogger
    from lightning.pytorch.callbacks import LearningRateMonitor
    # below are MineStudio dependencies
    from minestudio.data import MineDataModule
    from minestudio.offline import MineLightning
    from minestudio.models import load_vpt_policy, VPTPolicy
    from minestudio.offline.mine_callbacks import BehaviorCloneCallback
    from minestudio.offline.lightning_callbacks import SmartCheckpointCallback, SpeedMonitorCallback
    
  2. Configure the policy model and the training process:

    policy = VPTPolicy.from_pretrained("CraftJarvis/MineStudio_VPT.foundation_model_2x")
    mine_lightning = MineLightning(
        mine_policy=policy,
        learning_rate=0.00004,
        warmup_steps=2000,
        weight_decay=0.000181,
        callbacks=[BehaviorCloneCallback(weight=0.01)]
    )
    
  3. Configure the data module that contains all the kill_entity trajectory segments:

    episode_continuous_batch = True
    mine_data = MineDataModule(
        data_params=dict(
            mode='event', 
            dataset_dirs=['10xx'],
            win_len=128,
            frame_width=128,
            frame_height=128,
            event_regex="minecraft.kill_entity:.*",
            bias=16,
            min_nearby=64,
        ),
        batch_size=8,
        num_workers=8,
        prefetch_factor=4,
        split_ratio=0.9, 
        shuffle_episodes=True,
        episode_continuous_batch=episode_continuous_batch,
    )
    

    Warning

    If episode_continuous_batch=True, then the dataloader will automatically use our distributed batch sampler. When configuring the trainer, we need to set use_distributed_sampler=False.

  4. Configure the trainer with your preferred PyTorch Lightning callbacks:

    L.Trainer(
        logger=WandbLogger(project="minestudio-vpt"), 
        devices=8, 
        precision="bf16", 
        strategy='ddp_find_unused_parameters_true', 
        use_distributed_sampler=not episode_continuous_batch, 
        gradient_clip_val=1.0, 
        callbacks=[
            LearningRateMonitor(logging_interval='step'), 
            SpeedMonitorCallback(),
            SmartCheckpointCallback(
                dirpath='./weights', filename='weight-{epoch}-{step}', save_top_k=-1, 
                every_n_train_steps=2000, save_weights_only=True,
            ), 
            SmartCheckpointCallback(
                dirpath='./checkpoints', filename='ckpt-{epoch}-{step}', save_top_k=1, 
                every_n_train_steps=2000+1, save_weights_only=False,
            )
        ]
    ).fit(model=mine_lightning, datamodule=mine_data)
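
For a sense of scale, the settings in steps 3 and 4 determine how much data one optimizer step consumes. The sketch below works through that arithmetic. It assumes `batch_size` is applied per device under DDP and that each item is one window of `win_len` frames. `frames_per_optimizer_step` is a hypothetical helper.

```python
def frames_per_optimizer_step(batch_size, devices, win_len, accumulate_grad_batches=1):
    """Return (windows, frames) processed per optimizer step across all devices."""
    windows = batch_size * devices * accumulate_grad_batches
    return windows, windows * win_len


# Values taken from the tutorial: batch_size=8, devices=8, win_len=128.
windows, frames = frames_per_optimizer_step(batch_size=8, devices=8, win_len=128)
print(f"{windows} windows, {frames} frames per optimizer step")
```

Under these assumptions each step covers 64 windows, or 8192 frames, which is worth keeping in mind when changing `devices` or `batch_size`.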
    
Online: Finetuning Policy via Online Interaction

We provide a simple example in online/run. You can fine-tune VPT to complete the task of killing cows by directly running:

cd online/run
bash run.sh

Specifically, this process includes several important configurations:

Policy Generator

A function that takes no parameters and returns a MinePolicy. For example:

def policy_generator():
    from minestudio.models.openai_vpt.body import load_openai_policy
    model_path = 'pretrained/foundation-model-2x.model'
    weights_path = 'pretrained/bc-from-early-game-2x.weights'
    policy = load_openai_policy(model_path, weights_path)
    return policy

Environment Generator

A function that takes no parameters and returns a MinecraftSim. For example:

def env_generator():
    from minestudio.simulator import MinecraftSim
    from minestudio.simulator.callbacks import (
        SummonMobsCallback, 
        MaskActionsCallback, 
        RewardsCallback, 
        CommandsCallback, 
        JudgeResetCallback,
        FastResetCallback
    )
    sim = MinecraftSim(
        obs_size=(128, 128), 
        preferred_spawn_biome="plains", 
        action_type = "agent",
        timestep_limit=1000,
        callbacks=[
            SummonMobsCallback([{'name': 'cow', 'number': 10, 'range_x': [-5, 5], 'range_z': [-5, 5]}]),
            MaskActionsCallback(inventory=0), 
            RewardsCallback([{
                'event': 'kill_entity', 
                'objects': ['cow'], 
                'reward': 1.0, 
                'identity': 'kill_cow', 
                'max_reward_times': 30, 
            }]),
            CommandsCallback(commands=[
                '/give @p minecraft:iron_sword 1',
                '/give @p minecraft:diamond 64',
                '/effect @p 5 9999 255 true',
            ]),
            FastResetCallback(
                biomes=['plains'],
                random_tp_range=1000,
            ),
            JudgeResetCallback(600),
        ]
    )
    return sim

Config

A dictionary that provides the hyperparameters for online training:

online_dict = {
    "trainer_name": "PPOTrainer",
    "detach_rollout_manager": True,
    "rollout_config": {
        "num_rollout_workers": 2,
        "num_gpus_per_worker": 1.0,
        "num_cpus_per_worker": 1,
        "fragment_length": 256,
        "to_send_queue_size": 6,
        "worker_config": {
            "num_envs": 12,
            "batch_size": 6,
            "restart_interval": 3600,  # 1h
            "video_fps": 20,
            "video_output_dir": "output/videos",
        },
        "replay_buffer_config": {
            "max_chunks": 4800,
            "max_reuse": 2,
            "max_staleness": 2,
            "fragments_per_report": 40,
            "fragments_per_chunk": 1,
            "database_config": {
                "path": "output/replay_buffer_cache",
                "num_shards": 8,
            },
        },
        "episode_statistics_config": {},
    },
    "train_config": {
        "num_workers": 2,
        "num_gpus_per_worker": 1.0,
        "num_iterations": 4000,
        "vf_warmup": 0,
        "learning_rate": 0.00002,
        "anneal_lr_linearly": 
        "weight_decay": 0.04,
        "adam_eps": 1e-8,
        "batch_size_per_gpu": 1,
        "batches_per_iteration": 200,
        "gradient_accumulation": 10, 
        "epochs_per_iteration": 1, 
        "context_length": 64,
        "discount": 0.999,
        "gae_lambda": 0.95,
        "ppo_clip": 0.2,
        "clip_vloss": False, 
        "max_grad_norm": 5, 
        "zero_initial_vf": True,
        "ppo_policy_coef": 1.0,
        "ppo_vf_coef": 0.5, 
        "kl_divergence_coef_rho": 0.2,
        "entropy_bonus_coef": 0.0,
        "coef_rho_decay": 0.9995,
        "log_ratio_range": 50,  
        "normalize_advantage_full_batch": True, 
        "use_normalized_vf": True,
        "num_readers": 4,
        "num_cpus_per_reader": 0.1,
        "prefetch_batches": 2,
        "save_interval": 10,
        "keep_interval": 40,
        "record_video_interval": 2,
        "fix_decoder": False,
        "resume": None, 
        "resume_optimizer": True,
        "save_path": "output"
    },

    "logger_config": {
        "project": "minestudio_online",
        "name": "bow_cow"
    },
}

from omegaconf import OmegaConf

online_cfg = OmegaConf.create(online_dict)
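
Some useful quantities follow directly from the numbers in this config. The sketch below derives a few of them from values copied out of `online_dict`. The meaning assigned to each key is one plausible reading and an assumption, not a statement of how the trainer uses them, and `derived_quantities` is a hypothetical helper.

```python
# Values copied from online_dict above (num_envs lives under worker_config).
rollout = {"num_rollout_workers": 2, "num_envs": 12}
train = {
    "num_workers": 2,
    "batch_size_per_gpu": 1,
    "batches_per_iteration": 200,
    "gradient_accumulation": 10,
    "epochs_per_iteration": 1,
}


def derived_quantities(rollout, train):
    """Compute rough sizes implied by the config, under the stated assumptions."""
    # Assumes every rollout worker runs num_envs environments in parallel.
    total_envs = rollout["num_rollout_workers"] * rollout["num_envs"]
    # Assumes one optimizer step per gradient_accumulation batches.
    optimizer_steps = (
        train["batches_per_iteration"] // train["gradient_accumulation"]
    ) * train["epochs_per_iteration"]
    # Assumes each training worker uses one GPU and contributes equally.
    samples_per_step = (
        train["batch_size_per_gpu"]
        * train["gradient_accumulation"]
        * train["num_workers"]
    )
    return {
        "total_envs": total_envs,
        "optimizer_steps_per_iteration": optimizer_steps,
        "samples_per_optimizer_step": samples_per_step,
    }


print(derived_quantities(rollout, train))
```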

After preparing all the above content, run:

from minestudio.online.rollout.start_manager import start_rolloutmanager
from minestudio.online.trainer.start_trainer import start_trainer
start_rolloutmanager(policy_generator, env_generator, online_cfg)
start_trainer(policy_generator, env_generator, online_cfg)

to start online training.

We use wandb to log and monitor the progress of the run. The corresponding parameters passed to wandb are in config.logger_config. When save_path is None, the checkpoint will be saved to Ray’s working directory at ~/ray_results.
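
The `discount` and `gae_lambda` entries in the training config are the two parameters of generalized advantage estimation (GAE), which PPO implementations commonly use. The sketch below is the textbook GAE recursion in plain Python. It ignores episode termination for brevity, and whether MineStudio's trainer computes advantages in exactly this way is an assumption.

```python
def compute_gae(rewards, values, last_value, discount=0.999, gae_lambda=0.95):
    """Textbook GAE over one fragment; last_value bootstraps the final step."""
    advantages = [0.0] * len(rewards)
    next_value, next_advantage = last_value, 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error, then the exponentially weighted sum of later errors.
        delta = rewards[t] + discount * next_value - values[t]
        advantages[t] = delta + discount * gae_lambda * next_advantage
        next_value, next_advantage = values[t], advantages[t]
    return advantages


# A reward of 1 at the last of three steps, with a value estimate of 0 everywhere.
print(compute_gae([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], last_value=0.0))
```

With a discount of 0.999, a reward is still credited almost fully to actions taken many steps earlier, which suits sparse rewards such as a single kill event.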

Inference: Parallel Inference and Record Demonstrations

Here is a minimal example of how to use the inference framework:

import ray
from minestudio.inference import EpisodePipeline, MineGenerator, InfoBaseFilter

from functools import partial
from minestudio.models import load_vpt_policy, VPTPolicy
from minestudio.simulator import MinecraftSim

if __name__ == '__main__':
    ray.init()
    env_generator = partial(
        MinecraftSim, 
        obs_size = (128, 128), 
        preferred_spawn_biome = "forest", 
    ) # generate the environment
    agent_generator = lambda: VPTPolicy.from_pretrained("CraftJarvis/MineStudio_VPT.rl_from_early_game_2x") # generate the agent
    worker_kwargs = dict(
        env_generator = env_generator, 
        agent_generator = agent_generator,
        num_max_steps = 12000, # provide the maximum number of steps
        num_episodes = 2, # provide the number of episodes for each worker
        tmpdir = "./output",
        image_media = "h264",
    ) # provide the worker kwargs
    pipeline = EpisodePipeline(
        episode_generator = MineGenerator(
            num_workers = 8, # the number of workers
            num_gpus = 0.25, # the number of gpus
            max_restarts = 3, # the maximum number of restarts for failed workers
            **worker_kwargs, 
        ),
        episode_filter = InfoBaseFilter(
            key = "mine_block",
            val = "diamond_ore",
            num = 1,
        ), # InfoBaseFilter will label episodes that mine more than 1 diamond_ore
    )
    summary = pipeline.run()
    print(summary)
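
The filter configured above is parameterised by a key, a value, and a count. The sketch below shows one way such a check could work on an episode's final `info` dictionary. It assumes `info[key]` maps item names to counts and that the threshold is "at least `num`" (the comment in the example says "more than"). `mined_enough` is a hypothetical function, not the implementation of `InfoBaseFilter`.

```python
def mined_enough(info, key="mine_block", val="diamond_ore", num=1):
    """Return True if info records at least `num` occurrences of `val` under `key`."""
    return info.get(key, {}).get(val, 0) >= num


# Hypothetical final info dicts from two episodes.
episode_a = {"mine_block": {"diamond_ore": 2, "stone": 40}}
episode_b = {"mine_block": {"stone": 12}}
print(mined_enough(episode_a), mined_enough(episode_b))
```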

Benchmark: Benchmarking and Evaluation

Papers

Our libraries directly support models from the following papers: