Data API Documentation#

Core Dataset#

Date: 2025-01-09 05:45:49 LastEditors: caishaofei-mus1 1744260356@qq.com LastEditTime: 2025-03-17 21:54:30 FilePath: /MineStudio/minestudio/data/minecraft/core.py

class minestudio.data.minecraft.core.KernelManager(dataset_dirs: List[str], modal_kernel_callbacks: List[ModalKernelCallback], verbose: bool = True)[source]#

Manages multiple ModalKernel instances, providing a unified interface for accessing data from different modalities (e.g., video, actions, metadata) in a dataset.

It loads and organizes data from specified dataset directories, ensuring consistency across modalities and episodes.

Parameters:

dataset_dirs (List[str]) – A list of paths to dataset directories. Each directory is expected to contain subdirectories for different modalities.
modal_kernel_callbacks (List[ModalKernelCallback]) – A list of ModalKernelCallback objects, one for each modality to be managed.
verbose (bool, optional) – If True, prints logging information during initialization. Defaults to True.

get_episodes_with_length()[source]#

Returns an OrderedDict mapping common episode names to their lengths (number of frames).

Returns:: An OrderedDict where keys are episode names and values are their lengths.
Return type:: OrderedDict

get_num_frames()[source]#

Returns the total number of frames across all common episodes and modalities.

Returns:: The total number of frames.
Return type:: int

load_modal_kernels()[source]#

Loads a ModalKernel for each modality specified in modal_kernel_callbacks.

It iterates through the callbacks, creates a ModalKernel for each, and stores them in the kernels dictionary. It also determines the common episodes across all modalities and calculates the total number of frames.
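A minimal sketch of how the common episodes and the frame total could be derived, assuming each kernel can report a mapping from episode name to frame count. The helper name `common_episodes` and the rule of taking the shortest length when modalities disagree are assumptions made for illustration, not confirmed MineStudio behavior.

```python
from collections import OrderedDict
from typing import Dict, List


def common_episodes(per_modality: List[Dict[str, int]]) -> "OrderedDict[str, int]":
    """Keep only episodes present in every modality (hypothetical helper)."""
    if not per_modality:
        return OrderedDict()
    shared = set(per_modality[0])
    for lengths in per_modality[1:]:
        shared &= set(lengths)
    # Assumption: when modalities disagree on length, use the shortest one.
    return OrderedDict(
        (eps, min(lengths[eps] for lengths in per_modality))
        for eps in sorted(shared)
    )


video = {"ep_a": 100, "ep_b": 64, "ep_c": 32}
action = {"ep_a": 100, "ep_b": 60}
episodes = common_episodes([video, action])
total_frames = sum(episodes.values())
```

Here `ep_c` is dropped because it has no action data, which is the kind of cross-modality consistency the manager is described as enforcing.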

read(eps: str, start: int, win_len: int, skip_frame: int, **kwargs) → Dict[source]#

Reads and returns data for all managed modalities for a given episode and window.

It iterates through each loaded kernel, calls its read_frames method, and aggregates the results into a single dictionary.

Parameters:

eps (str) – The name of the episode.
start (int) – The starting frame index.
win_len (int) – The desired window length (number of frames).
skip_frame (int) – The sampling interval between selected frames; a value of 1 selects every frame.
**kwargs – Additional arguments passed to the read_frames method of each kernel.

Returns:

A dictionary containing data from all modalities for the specified window.

Return type:

Dict
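The aggregation described above can be sketched as follows. `StubKernel` and `read_all` are hypothetical stand-ins that only mimic the documented `read_frames` signature and the `{modality_name}` / `{modality_name}_mask` key format; they do not touch LMDB data.

```python
from typing import Any, Dict, List


class StubKernel:
    """Hypothetical stand-in for a ModalKernel; returns fake frames."""

    def __init__(self, name: str) -> None:
        self.name = name

    def read_frames(self, eps: str, start: int, win_len: int,
                    skip_frame: int, **kwargs: Any) -> Dict[str, List]:
        # Label each fake frame with the frame index it would come from.
        frames = [f"{self.name}:{eps}:{start + i * skip_frame}" for i in range(win_len)]
        return {self.name: frames, f"{self.name}_mask": [1] * win_len}


def read_all(kernels: List[StubKernel], eps: str, start: int,
             win_len: int, skip_frame: int, **kwargs: Any) -> Dict[str, List]:
    """Merge the per-modality dictionaries into one result."""
    result: Dict[str, List] = {}
    for kernel in kernels:
        result.update(kernel.read_frames(eps, start, win_len, skip_frame, **kwargs))
    return result


window = read_all([StubKernel("image"), StubKernel("action")], "ep_a", 8, 3, 2)
```

Because every kernel writes under its own modality name, the merged dictionary has no key collisions as long as modality names are unique.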

class minestudio.data.minecraft.core.ModalKernel(source_dirs: List[str], modal_kernel_callback: ModalKernelCallback, short_name: bool = False)[source]#

Manages and provides access to data for a single modality (e.g., video, actions) from a collection of LMDB datasets. It merges metadata and provides methods to read chunks and frames of data for specific episodes.

Parameters:

source_dirs (List[str]) – A list of directory paths, each containing LMDB files for the modality.
modal_kernel_callback (ModalKernelCallback) – A callback object to handle modality-specific operations like data merging, slicing, and padding.
short_name (bool, optional) – If True, episode names are hashed to a shorter length. Defaults to False.

get_episode_list() → List[str][source]#

Returns a list of all episode names managed by this kernel.

Returns:: A list of episode names.
Return type:: List[str]

get_num_frames(episodes: List[str] | None = None)[source]#

Calculates and returns the total number of frames for the specified episodes.

If no episodes are provided, it calculates the total frames for all episodes managed by this kernel.

Parameters:: episodes (Optional[List[str]], optional) – An optional list of episode names. If None, all episodes are considered.
Returns:: The total number of frames.
Return type:: int

property name#

Returns the name of the modality, as defined by the modal_kernel_callback.

Returns:: The name of the modality.
Return type:: str

read_chunks(eps: str, start: int, end: int) → List[bytes][source]#

Reads and returns a list of data chunks for a given episode and frame range.

The start and end parameters specify the frame-level indices, which must be multiples of the chunk_size.

Parameters:

eps (str) – The name of the episode to read from.
start (int) – The starting frame index (inclusive, multiple of chunk_size).
end (int) – The ending frame index (inclusive, multiple of chunk_size).

Returns:

A list of byte strings, where each string is a data chunk.

Return type:

List[bytes]

Raises:

AssertionError – If start or end are not multiples of chunk_size.
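Since both indices must be multiples of chunk_size, a caller holding an arbitrary frame range has to round it to chunk boundaries first. The sketch below shows one way to do that; `aligned_chunk_range` and `chunk_keys` are hypothetical helpers, and `end` is treated as inclusive as stated in the parameter description.

```python
from typing import List, Tuple


def aligned_chunk_range(first_frame: int, last_frame: int, chunk_size: int) -> Tuple[int, int]:
    """Round a frame range outward to chunk boundaries (hypothetical helper).

    Returns the start index of the first and of the last chunk to read;
    both are multiples of chunk_size.
    """
    if first_frame < 0 or last_frame < first_frame:
        raise ValueError("expected 0 <= first_frame <= last_frame")
    chunk_start = (first_frame // chunk_size) * chunk_size
    chunk_end = (last_frame // chunk_size) * chunk_size
    return chunk_start, chunk_end


def chunk_keys(chunk_start: int, chunk_end: int, chunk_size: int) -> List[int]:
    """List the chunk start indices covered, treating chunk_end as inclusive."""
    assert chunk_start % chunk_size == 0 and chunk_end % chunk_size == 0
    return list(range(chunk_start, chunk_end + chunk_size, chunk_size))


start, end = aligned_chunk_range(first_frame=40, last_frame=100, chunk_size=32)
keys = chunk_keys(start, end, chunk_size=32)
```

With a chunk size of 32, frames 40 to 100 fall into the chunks starting at 32, 64, and 96.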

read_frames(eps: str, start: int, win_len: int, skip_frame: int, **kwargs) → Dict[source]#

Reads, processes, and returns a dictionary of frames for a given episode and window.

This method handles reading data chunks, merging them into continuous frames, slicing based on skip_frame, and padding if necessary. It utilizes the modal_kernel_callback for modality-specific operations.

Parameters:

eps (str) – The name of the episode.
start (int) – The starting frame index.
win_len (int) – The desired window length (number of frames).
skip_frame (int) – The sampling interval between selected frames; a value of 1 selects every frame.
**kwargs – Additional arguments passed to the modal_kernel_callback.

Returns:

A dictionary containing the processed frames and a corresponding mask. The keys are formatted as “{modality_name}” and “{modality_name}_mask”.

Return type:

Dict

Event Dataset#

Date: 2024-11-10 10:26:52 LastEditors: muzhancun muzhancun@stu.pku.edu.cn LastEditTime: 2025-05-27 15:24:46 FilePath: /MineStudio/minestudio/data/minecraft/dataset_event.py

class minestudio.data.minecraft.dataset_event.EventDataModule(*args: Any, **kwargs: Any)[source]#

A PyTorch Lightning DataModule for handling event-based datasets.

This class encapsulates the train and validation EventDataset instances and provides corresponding DataLoaders. It simplifies the data handling pipeline for training and evaluating models with PyTorch Lightning.

Parameters:

data_params (Dict) – Dictionary of parameters to be passed to the EventDataset constructor. This includes dataset_dirs, modal_kernel_callbacks, win_len, etc.
batch_size (int, optional) – The batch size for the DataLoaders, defaults to 1.
num_workers (int, optional) – The number of worker processes for data loading, defaults to 0.
prefetch_factor (Optional[int]) – Number of batches loaded in advance by each worker. Defaults to None. See PyTorch DataLoader documentation for more details.

setup(stage: str | None = None)[source]#

Sets up the training and validation datasets.

This method is called by PyTorch Lightning at the appropriate time (e.g., before training or validation). It instantiates EventDataset for both ‘train’ and ‘val’ splits using the provided data_params.

Parameters:: stage (Optional[str]) – A string indicating the current stage (e.g., ‘fit’, ‘validate’, ‘test’). Not directly used in this implementation but part of the Lightning interface. Defaults to None.

train_dataloader()[source]#

Sets up the DataLoader for the training dataset.

val_dataloader()[source]#

Sets up the DataLoader for the validation dataset.

class minestudio.data.minecraft.dataset_event.EventDataset(*args: Any, **kwargs: Any)[source]#

A PyTorch Dataset for loading sequences of data centered around specific game events.

This dataset uses an EventKernelManager to identify occurrences of specified events (filtered by event_regex, min_nearby, max_within) and a KernelManager to retrieve the actual multi-modal data (like images, actions, etc.) for a window of time (win_len) around each event. It supports splitting into training and validation sets.

Parameters:

dataset_dirs (List[str]) – List of directories where the raw dataset (LMDBs for different modalities) is stored.
modal_kernel_callbacks (List[Union[str, ModalKernelCallback]]) – List of ModalKernelCallback instances or their registered names. These define how data for each modality is fetched and processed.
modal_kernel_config (Optional[Dict]) – Optional configuration dictionary for creating ModalKernelCallback instances if their names are provided. Defaults to None.
win_len (int) – The length of the window (number of frames/steps) to retrieve around each event. Defaults to 1.
skip_frame (int) – The sampling interval between consecutive frames in the window. Defaults to 1 (every frame is used).
split (Literal['train', 'val']) – Specifies whether this dataset instance is for ‘train’ or ‘val’. Defaults to ‘train’.
split_ratio (float) – The ratio of data to be used for the training set. The rest is for validation. Defaults to 0.8.
verbose (bool) – If True, logs information during setup. Defaults to True.
event_paths (Optional[List[str]]) – Optional list of paths to event LMDB databases. If None, it assumes event databases are in an “event” subdirectory within each dataset_dirs. Defaults to None.
bias (int) – An offset applied to the start time when fetching the window of data around an event. start = max(event_time - win_len + bias, 0). Defaults to 0.
event_regex (str) – Regular expression to filter event names from the EventKernelManager. Defaults to ‘’.
min_nearby (Optional[int]) – Passed to EventKernelManager to filter events. Minimum time between consecutive events of the same type. Defaults to None.
max_within (Optional[int]) – Passed to EventKernelManager to filter events. Maximum number of instances to keep for each event type. Defaults to None.
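The window start formula given for bias can be checked with a few values. `window_start` is a hypothetical helper name; the expression itself is the one quoted in the bias description.

```python
def window_start(event_time: int, win_len: int, bias: int = 0) -> int:
    """Start frame of the window read for an event, clamped at frame 0."""
    return max(event_time - win_len + bias, 0)


starts = [
    window_start(500, 128),           # window ends right at the event
    window_start(500, 128, bias=16),  # window shifted 16 frames later
    window_start(50, 128),            # event too early: clamped to 0
]
```

Assuming skip_frame is 1, a positive bias moves the window later so that it also covers frames after the event.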

build_items() → None[source]#

Builds the list of items for the dataset based on selected events and split.

This method populates self.items, which is a list of tuples. Each tuple stores a cumulative count of items, the event name, and a bias for indexing into the original event list (used for train/val splitting). The total number of items in the dataset (self.num_items) is also calculated.

locate_item(idx: int) → Tuple[str, int][source]#

Locates the specific event and relative index within that event for a given global item index.

It uses a binary search on self.items (which stores cumulative counts) to efficiently find which event the idx falls into and what its relative index within that event’s items is (after considering the split bias).

Parameters:

idx (int) – The global index of the item in the dataset.

Returns:

A tuple containing:

event (str) – The name of the event.
relative_idx_with_bias (int) – The index within the specific event’s item list, adjusted for the train/val split bias.

Return type:

Tuple[str, int]
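A minimal sketch of the cumulative-count layout and the binary search over it. The split rule used here (train takes the first `split_ratio` share of each event, val the remainder, with the bias pointing past the train share) is an assumption for illustration, and both functions are free-standing stand-ins for the methods of the same name.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

Item = Tuple[int, str, int]  # (cumulative count, event name, split bias)


def build_items(event_sizes: Dict[str, int], split: str, split_ratio: float) -> List[Item]:
    """Build cumulative-count entries for one split (assumed split rule)."""
    items: List[Item] = []
    total = 0
    for event, size in event_sizes.items():
        n_train = int(size * split_ratio)
        # Assumption: train takes the first n_train items, val the remainder.
        count, bias = (n_train, 0) if split == "train" else (size - n_train, n_train)
        total += count
        items.append((total, event, bias))
    return items


def locate_item(items: List[Item], idx: int) -> Tuple[str, int]:
    """Map a global index to (event, index within that event incl. bias)."""
    if not items or not 0 <= idx < items[-1][0]:
        raise IndexError(f"index {idx} out of range")
    cumulative = [count for count, _, _ in items]
    pos = bisect_right(cumulative, idx)  # first entry whose count exceeds idx
    previous = cumulative[pos - 1] if pos > 0 else 0
    _, event, bias = items[pos]
    return event, idx - previous + bias


sizes = {"kill_entity:zombie": 10, "mine_block:oak_log": 5}
train_items = build_items(sizes, "train", 0.8)
val_items = build_items(sizes, "val", 0.8)
```

In this sketch the bias keeps the two splits disjoint: validation index 0 resolves to item 8 of the first event, just past the eight items used for training.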

to_tensor(item: ndarray | List | Dict) → torch.Tensor | List | Dict[source]#

Recursively converts NumPy arrays within a nested structure (list or dict) to PyTorch tensors.

If the input item is a NumPy array, it’s converted to a PyTorch tensor. If it’s a list, the conversion is applied to each element. If it’s a dictionary, the conversion is applied to each value. Other types are returned as is.

Parameters:: item (Union[np.ndarray, List, Dict]) – The item to convert. Can be a NumPy array, or a list/dict containing them.
Returns:: The item with NumPy arrays replaced by PyTorch tensors.
Return type:: Union[torch.Tensor, List, Dict]

class minestudio.data.minecraft.dataset_event.EventKernel(event_path: str | Path, event_regex: str, min_nearby: int | None = None, max_within: int | None = None)[source]#

Manages and provides access to event data stored in an LMDB database.

This class reads event information and specific event items from an LMDB database. Events can be filtered by a regular expression, then further by a minimum time separation between occurrences (min_nearby) and by a cap on the number of instances kept per event type (max_within). It also handles a codebook for mapping episode names if provided.

Parameters:

event_path (Union[str, Path]) – Path to the directory containing the LMDB event database.
event_regex (str) – Regular expression to filter event names.
min_nearby (Optional[int]) – Optional minimum time (in game ticks or similar units) between two occurrences of the same event in the same episode for an event instance to be included. Defaults to None.
max_within (Optional[int]) – Optional maximum number of event instances to consider for each event type after other filtering. Defaults to None.

filter_out(min_nearby: int | None = None, max_within: int | None = None)[source]#

Filters the events based on proximity and count constraints.

This method refines the list of events to be used. It ensures that for any given episode and event type, consecutive occurrences are separated by at least min_nearby time units. It also limits the total number of occurrences for each event type to max_within if specified.

Parameters:

min_nearby (Optional[int]) – Optional minimum time between consecutive events of the same type within an episode. If an event occurs too close to the previous one, it’s filtered out. Defaults to None.
max_within (Optional[int]) – Optional maximum number of instances to keep for each event type after applying the min_nearby filter. Defaults to None.
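The two constraints can be sketched for a single event type as follows. `filter_occurrences` is a hypothetical helper; it assumes occurrences within an episode arrive in time order and that the earliest survivors are the ones kept when max_within applies, neither of which is confirmed by this page.

```python
from typing import Dict, List, Optional, Tuple

Occurrence = Tuple[str, int]  # (episode name, event time)


def filter_occurrences(occurrences: List[Occurrence],
                       min_nearby: Optional[int] = None,
                       max_within: Optional[int] = None) -> List[int]:
    """Return indices of the occurrences kept for one event type."""
    last_kept: Dict[str, int] = {}  # episode -> time of last kept occurrence
    kept: List[int] = []
    for idx, (episode, event_time) in enumerate(occurrences):
        if max_within is not None and len(kept) >= max_within:
            break  # assumption: keep the earliest max_within survivors
        previous = last_kept.get(episode)
        if (min_nearby is not None and previous is not None
                and event_time - previous < min_nearby):
            continue  # too close to the previously kept occurrence
        last_kept[episode] = event_time
        kept.append(idx)
    return kept


events = [("ep_a", 100), ("ep_a", 110), ("ep_a", 400), ("ep_b", 105), ("ep_b", 900)]
kept_spaced = filter_occurrences(events, min_nearby=64)
kept_capped = filter_occurrences(events, min_nearby=64, max_within=3)
```

The returned indices point back into the unfiltered list, so a later lookup by filtered index can be remapped to the original item.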

get_event_item(event: str, item_idx: int) → Tuple[str, int, int][source]#

Retrieves a specific event item by its type and index.

The method first remaps the item_idx if filtered remaining_events exist. It then fetches the raw event item (episode, event_time, value) from the LMDB database. If a codebook is present, it maps the episode name.

Parameters:

event (str) – The name of the event type (e.g., “minecraft.kill_entity:zombie”).
item_idx (int) – The index of the desired event item within its type, after filtering.

Raises:

AssertionError – If item_idx is out of the valid range for the given event.

Returns:

A tuple containing:

episode (str) – The name or ID of the episode where the event occurred.
event_time (int) – The timestamp (e.g., game tick) of the event.
value (int) – A value associated with the event (semantics depend on event type).

Return type:

Tuple[str, int, int]

get_event_list() → List[str][source]#

Returns the sorted list of unique event names that match the event_regex and are present in the loaded event data.

Returns:: A list of event names.
Return type:: List[str]

get_event_size(event: str) → int[source]#

Returns the number of occurrences for a specific event type after filtering.

Parameters:: event (str) – The name of the event.
Returns:: The count of occurrences for the given event. Returns 0 if the event is not found or has no occurrences after filtering.
Return type:: int

class minestudio.data.minecraft.dataset_event.EventKernelManager(event_path: List[str | Path], event_regex: str, verbose: bool = True, **kwargs)[source]#

Manages multiple EventKernel instances to provide a unified view of event data.

This class aggregates event data from several EventKernel objects, which might correspond to different event LMDB databases or different partitions of data. It allows querying the total size of an event type across all kernels and retrieving specific event items by a global index.

Parameters:

event_path (List[Union[str, Path]]) – A list of paths to directories, each containing an LMDB event database to be managed by an EventKernel.
event_regex (str) – Regular expression used by each underlying EventKernel to filter event names.
verbose (bool, optional) – If True, logs information about the loaded events, defaults to True.
**kwargs – Additional keyword arguments passed to the constructor of each EventKernel (e.g., min_nearby, max_within).

get_event_item(event: str, item_idx: int) → Tuple[str, int, int][source]#

Retrieves a specific event item by its type and a global index across all managed kernels.

It iterates through the managed EventKernel instances, subtracting the size of each kernel’s event pool from item_idx until the index falls within the range of the current kernel. Then, it calls the get_event_item method of that specific kernel.

Parameters:

event (str) – The name of the event type.
item_idx (int) – The global index of the desired event item across all kernels.

Raises:

ValueError – If item_idx is out of the valid range for the given event across all managed kernels.

Returns:

A tuple containing:

episode (str) – The name or ID of the episode.
event_time (int) – The timestamp of the event.
value (int) – A value associated with the event.

Return type:

Tuple[str, int, int]
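The index-subtraction walk described above can be sketched with in-memory kernels. `StubEventKernel` is a hypothetical stand-in that exposes only `get_event_size` and `get_event_item`; the free function mirrors the manager method of the same name.

```python
from typing import Dict, List, Tuple

EventItem = Tuple[str, int, int]  # (episode, event_time, value)


class StubEventKernel:
    """Hypothetical in-memory stand-in for an EventKernel."""

    def __init__(self, events: Dict[str, List[EventItem]]) -> None:
        self._events = events

    def get_event_size(self, event: str) -> int:
        return len(self._events.get(event, []))

    def get_event_item(self, event: str, item_idx: int) -> EventItem:
        return self._events[event][item_idx]


def get_event_item(kernels: List[StubEventKernel], event: str, item_idx: int) -> EventItem:
    """Resolve a global index by walking the kernels in order."""
    if item_idx < 0:
        raise ValueError(f"item_idx must be non-negative, got {item_idx}")
    remaining = item_idx
    for kernel in kernels:
        size = kernel.get_event_size(event)
        if remaining < size:
            return kernel.get_event_item(event, remaining)
        remaining -= size  # skip this kernel's share of the index space
    raise ValueError(f"item_idx {item_idx} out of range for event {event!r}")


kernels = [
    StubEventKernel({"kill_entity:zombie": [("ep_a", 120, 1), ("ep_a", 480, 2)]}),
    StubEventKernel({"kill_entity:zombie": [("ep_b", 300, 1)]}),
]
item = get_event_item(kernels, "kill_entity:zombie", 2)
```

Global index 2 skips the two items of the first kernel and lands on local index 0 of the second.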

get_event_list() → List[str][source]#

Returns the sorted list of unique event names aggregated from all managed EventKernel instances.

Returns:: A list of unique, sorted event names.
Return type:: List[str]

get_event_size(event: str) → int[source]#

Calculates the total number of occurrences for a specific event type across all managed kernels.

Parameters:: event (str) – The name of the event.
Returns:: The total count of the event. Returns 0 if the event is not found in any kernel or has no occurrences after filtering in the underlying kernels.
Return type:: int

Raw Dataset#

Date: 2024-11-10 10:26:32 LastEditors: caishaofei-mus1 1744260356@qq.com LastEditTime: 2025-01-21 23:03:15 FilePath: /MineStudio/minestudio/data/minecraft/dataset_raw.py

class minestudio.data.minecraft.dataset_raw.RawDataModule(*args: Any, **kwargs: Any)[source]#

LightningDataModule for the RawDataset.

Handles the creation of training and validation dataloaders. Supports episode continuous batching.

setup(stage: str | None = None)[source]#

Sets up the training and validation datasets.

This method is called by PyTorch Lightning to prepare the data. It instantiates RawDataset for both training and validation splits.

Parameters:: stage – The stage of training (e.g., ‘fit’, ‘validate’, ‘test’, ‘predict’). Not used in this implementation.

train_dataloader()[source]#

Creates the training dataloader.

If episode_continuous_batch is True, it uses MineDistributedBatchSampler for continuous loading of video frames. Otherwise, it uses a standard DataLoader with shuffling.

Returns:: The DataLoader for the training set.

val_dataloader()[source]#

Creates the validation dataloader.

If episode_continuous_batch is True, it uses MineDistributedBatchSampler. Otherwise, it uses a standard DataLoader without shuffling.

Returns:: The DataLoader for the validation set.

class minestudio.data.minecraft.dataset_raw.RawDataset(*args: Any, **kwargs: Any)[source]#

Raw dataset for training and testing.

Handles loading and processing of raw Minecraft gameplay data. It supports splitting data into training and validation sets, shuffling episodes, and uses a kernel manager to read data.

build_items() → None[source]#

Builds the list of items for the dataset.

This method processes episodes, splits them into train/val sets, and creates a list of items, where each item corresponds to a window of frames from an episode.

locate_item(idx: int) → Tuple[str, int][source]#

Locates the episode and relative index for a given item index.

Parameters:: idx – The index of the item in the dataset.
Returns:: A tuple containing the episode identifier and the relative index within that episode.

to_tensor(item: ndarray | List | Dict) → ndarray | List | Dict[source]#

Converts numpy arrays in an item to torch tensors.

Recursively traverses the item (which can be a numpy array, list, or dict) and converts any numpy arrays to torch tensors.

Parameters:: item – The item to convert, can be a numpy array, list, or dictionary.
Returns:: The item with numpy arrays converted to torch tensors.

Callbacks#

Date: 2025-01-09 05:08:19 LastEditors: caishaofei-mus1 1744260356@qq.com LastEditTime: 2025-01-21 22:28:09 FilePath: /MineStudio/minestudio/data/minecraft/callbacks/callback.py

class minestudio.data.minecraft.callbacks.callback.DrawFrameCallback[source]#

Base class for callbacks that draw overlay information onto video frames for visualization purposes.

draw_frames(frames: ndarray | List, infos: Dict, sample_idx: int, **kwargs) → ndarray[source]#

Draws specified information onto a set of video frames.

This method needs to be implemented by users who want to visualize datasets with custom overlays (e.g., drawing actions, metadata) on the video frames.

Parameters:

frames (Union[np.ndarray, List]) – A list of frames (e.g., Pillow images) or a NumPy array of frames (T, H, W, C).
infos (Dict) – A dictionary containing the information to be drawn on the frames. The structure of this dictionary depends on what the user wants to display.
sample_idx (int) – The index of the current sample being processed/visualized.
**kwargs – Additional keyword arguments that might be needed for drawing.

Returns:

A NumPy array of frames (T, H, W, C) with the information drawn on them.

Return type:

np.ndarray

Raises:

NotImplementedError – If this method is not implemented by a subclass.

class minestudio.data.minecraft.callbacks.callback.ModalConvertCallback(input_dirs: List[str], chunk_size: int)[source]#

Base class for callbacks that convert raw trajectory data into MineStudio’s built-in format.

Users should implement the methods of this class to define how their specific data format is converted.

do_convert(eps_id: str, skip_frames: List[List[bool]], modal_file_path: List[str | Path]) → Tuple[List, List][source]#

Converts the raw modal data for a given episode into the desired format.

skip_frames and modal_file_path are aligned lists, indicating which frames to skip in each corresponding file part.

Parameters:

eps_id (str) – The identifier of the episode being converted.
skip_frames (List[List[bool]]) – A list of lists of boolean flags. Each inner list corresponds to a file part in modal_file_path and indicates which frames within that part should be skipped.
modal_file_path (List[Union[str, Path]]) – A list of file paths (or Path objects) for the modal data files or parts of files that constitute the episode.

Returns:

A tuple containing two lists: chunk keys and chunk values, representing the converted data.

Return type:

Tuple[List, List]

Raises:

NotImplementedError – If this method is not implemented by a subclass.

gen_frame_skip_flags(file_name: str) → List[bool][source]#

Generates a list of boolean flags indicating which frames to skip in a given file.

This method should be implemented if users want to filter out specific frames based on the content or metadata of this modality.

Parameters:: file_name (str) – The name of the file for which to generate skip flags.
Returns:: A list of boolean flags. True indicates the frame should be skipped.
Return type:: List[bool]
Raises:: NotImplementedError – If this method is not implemented by a subclass.

load_episodes() → Dict[str, List[Tuple[str, str]]][source]#

Identifies and loads raw data from the specified input directories.

This method should be implemented by the user to parse their data structure and return a dictionary mapping episode names to a list of file parts.

Returns:: A dictionary where keys are episode names and values are lists of tuples. Each tuple contains a part identifier and the corresponding file path for that part of the episode.
Return type:: Dict[str, List[Tuple[str, str]]]
Raises:: NotImplementedError – If this method is not implemented by a subclass.
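As an illustration of the return shape only, the sketch below groups file paths into episodes. The naming scheme `<episode>-<part>.<ext>` and the helper `group_episode_parts` are assumptions; a real implementation has to parse whatever layout the raw data uses.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List, Tuple


def group_episode_parts(file_paths: List[str]) -> Dict[str, List[Tuple[str, str]]]:
    """Group files into episodes, assuming names like '<episode>-<part>.<ext>'."""
    episodes: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for path in file_paths:
        stem = PurePosixPath(path).stem
        episode, _, part = stem.rpartition("-")
        if not episode:  # no '-' in the name: skip files that do not match
            continue
        episodes[episode].append((part, path))
    # Sort parts so that segments of one episode are in a stable order.
    return {eps: sorted(parts) for eps, parts in sorted(episodes.items())}


episodes = group_episode_parts([
    "raw/actions/cave-run-002.jsonl",
    "raw/actions/cave-run-001.jsonl",
    "raw/actions/village-001.jsonl",
    "raw/actions/README.md",
])
```

Sorting the part identifiers as strings only gives the right order when they are zero-padded, as in this made-up example.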

class minestudio.data.minecraft.callbacks.callback.ModalKernelCallback(read_bias: int = 0, win_bias: int = 0)[source]#

Base class for callbacks that define how a specific modality of data is handled within the ModalKernel.

Users must implement this callback for their custom modal data to manage operations like decoding, merging, slicing, and padding.

static create_from_config(config: Dict) → ModalKernelCallback[source]#

Factory method to create a ModalKernelCallback instance from a configuration dictionary.

Parameters:: config (Dict) – A dictionary containing the configuration parameters for the callback.
Returns:: An instance of a ModalKernelCallback subclass.
Return type:: ModalKernelCallback
Raises:: NotImplementedError – If this method is not implemented by a subclass.

do_decode(chunk: bytes, **kwargs) → Any[source]#

Decodes a raw byte chunk of modal data into its usable format.

Data is stored in LMDB files as byte chunks. Since decoding methods vary for different modalities, users must implement this method to specify how their data should be decoded.

Parameters:

chunk (bytes) – The raw byte string (chunk) to be decoded.
**kwargs – Additional keyword arguments that might be needed for decoding.

Returns:

The decoded data in its appropriate format (e.g., a NumPy array for images).

Return type:

Any

Raises:

NotImplementedError – If this method is not implemented by a subclass.

do_merge(chunk_list: List[bytes], **kwargs) → List | Dict[source]#

Merges a list of decoded data chunks into a continuous sequence or structure.

When reading a long trajectory segment, the system automatically reads and decodes multiple chunks. This method defines how these decoded chunks for a specific modality are combined into a single, coherent data sequence (e.g., a list of frames or a dictionary of features).

Parameters:

chunk_list (List[bytes]) – A list of decoded data chunks.
**kwargs – Additional keyword arguments that might be needed for merging.

Returns:

The merged data, typically a list or dictionary representing the continuous sequence.

Return type:

Union[List, Dict]

Raises:

NotImplementedError – If this method is not implemented by a subclass.

do_pad(data: List | Dict, pad_len: int, pad_pos: Literal['left', 'right'], **kwargs) → Tuple[List | Dict, ndarray][source]#

Pads the modal data to a specified length if it’s shorter than required.

Users need to implement this method to define how padding is applied (e.g., repeating the last frame, adding zero vectors) and to return a mask indicating the padded elements.

Parameters:

data (Union[List, Dict]) – The modal data to be padded.
pad_len (int) – The number of elements to add as padding.
pad_pos (Literal["left", "right"]) – The position where padding should be added, either “left” or “right”.
**kwargs – Additional keyword arguments that might be needed for padding.

Returns:

A tuple containing the padded data and a NumPy array representing the padding mask (1 for original data, 0 for padded data).

Return type:

Tuple[Union[List, Dict], np.ndarray]

Raises:

NotImplementedError – If this method is not implemented by a subclass.
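A minimal padding sketch for list-shaped data, following the mask convention stated above (1 for original data, 0 for padding). `pad_sequence` and its `pad_value` argument are hypothetical, and the mask is a plain list here to keep the example dependency-free, whereas the documented return type is a NumPy array.

```python
from typing import Any, List, Literal, Tuple


def pad_sequence(data: List[Any], pad_len: int,
                 pad_pos: Literal["left", "right"],
                 pad_value: Any = None) -> Tuple[List[Any], List[int]]:
    """Pad a list and return (padded data, mask); mask is 1 for real items."""
    if pad_len < 0:
        raise ValueError("pad_len must be non-negative")
    if pad_pos not in ("left", "right"):
        raise ValueError(f"unknown pad_pos: {pad_pos!r}")
    padding = [pad_value] * pad_len
    pad_mask = [0] * pad_len
    real_mask = [1] * len(data)
    if pad_pos == "left":
        return padding + list(data), pad_mask + real_mask
    return list(data) + padding, real_mask + pad_mask


padded, mask = pad_sequence(["f0", "f1", "f2"], pad_len=2, pad_pos="right", pad_value="pad")
left_padded, left_mask = pad_sequence(["f0"], pad_len=1, pad_pos="left", pad_value="pad")
```

Returning the mask alongside the data lets downstream code ignore padded positions without inspecting the padding values themselves.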

do_postprocess(data: Dict, **kwargs) → Dict[source]#

Performs optional post-processing operations on the sampled modal data.

This method can be used for tasks like data augmentation or further transformations after the data has been read, sliced, and padded.

Parameters:

data (Dict) – A dictionary containing the modal data (and potentially its mask) to be post-processed.
**kwargs – Additional keyword arguments that might be needed for post-processing.

Returns:

The post-processed data dictionary.

Return type:

Dict

do_slice(data: List | Dict, start: int, end: int, skip_frame: int, **kwargs) → List | Dict[source]#

Extracts a slice from the modal data based on start, end, and skip_frame parameters.

Since data formats can vary significantly between modalities, users need to implement this method to define how slicing operations are performed on their specific data type.

Parameters:

data (Union[List, Dict]) – The modal data (e.g., a list of frames, a dictionary of features) to be sliced.
start (int) – The starting index for the slice (inclusive).
end (int) – The ending index for the slice (inclusive).
skip_frame (int) – The interval at which to select frames/data points within the slice.
**kwargs – Additional keyword arguments that might be needed for slicing.

Returns:

The sliced portion of the data.

Return type:

Union[List, Dict]

Raises:

NotImplementedError – If this method is not implemented by a subclass.

filter_dataset_paths(dataset_paths: List[str | Path]) → List[Path][source]#

Filters a list of potential dataset paths to select those relevant to this modality.

dataset_paths contains all possible paths pointing to different LMDB folders. This method should be implemented to identify and return only the paths that contain data for the modality handled by this callback.

Parameters:: dataset_paths (List[Union[str, Path]]) – A list of potential dataset directory paths (strings or Path objects).
Returns:: A filtered list of Path objects pointing to the relevant LMDB datasets.
Return type:: List[Path]
Raises:: NotImplementedError – If this method is not implemented by a subclass.
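One possible filter is sketched below, assuming each modality lives in a directory named after it (for example `<dataset>/action`). That layout, the example paths, and the helper `filter_paths_by_modality` are assumptions for illustration.

```python
from pathlib import Path
from typing import List, Union


def filter_paths_by_modality(dataset_paths: List[Union[str, Path]], modality: str) -> List[Path]:
    """Keep paths whose final directory name equals the modality name.

    Assumption: each modality lives in a directory named after it,
    e.g. '<dataset>/action'. The real layout may differ.
    """
    return [Path(p) for p in dataset_paths if Path(p).name == modality]


candidates = [
    "data/set_a/image",
    "data/set_a/action",
    Path("data/set_b/action"),
    "data/set_b/meta_info",
]
action_paths = filter_paths_by_modality(candidates, "action")
```

The function only inspects path names, so it accepts both strings and Path objects and always returns Path objects, matching the documented return type.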

property name: str#

Returns the name of the modality this callback handles (e.g., “image”, “action”).

Returns:: The name of the modality.
Return type:: str
Raises:: NotImplementedError – If this property is not implemented by a subclass.

Modality: Action#

Date: 2025-01-09 05:27:25 LastEditors: muzhancun muzhancun@stu.pku.edu.cn LastEditTime: 2025-05-27 15:04:29 FilePath: /MineStudio/minestudio/data/minecraft/callbacks/action.py

class minestudio.data.minecraft.callbacks.action.ActionConvertCallback(input_dirs: List[str | Path], chunk_size: int = 32, action_transformer_kwargs: Dict = {}, **kwargs)[source]#

Callback for converting raw action data files.

This callback loads action data from .jsonl files, processes it, and converts it into a structured format.

do_convert(eps_id: str, skip_frames: List[List[bool]], modal_file_path: List[str | Path]) → Tuple[List, List][source]#

Converts action data for a given episode.

Processes actions from .jsonl files, applies transformations, and handles frame skipping and remapping.

Parameters:

eps_id (str) – The ID of the episode.
skip_frames (List[List[bool]]) – A list of lists of boolean flags indicating whether to skip frames.
modal_file_path (List[Union[str, Path]]) – A list of file paths for the modal data (action files).

Returns:

A tuple containing a list of keys (chunk start indices) and a list of pickled data values.

Return type:

Tuple[List, List]
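The return shape (chunk start indices as keys, pickled data as values) can be sketched as follows. The per-key list layout of the actions, the handling of the final partial chunk, and the helper name `chunk_and_pickle` are assumptions; frame skipping and action transformation are left out.

```python
import pickle
from typing import Any, Dict, List, Tuple


def chunk_and_pickle(actions: Dict[str, List[Any]],
                     chunk_size: int = 32) -> Tuple[List[int], List[bytes]]:
    """Split per-key action sequences into chunks keyed by their start index."""
    num_frames = len(next(iter(actions.values()))) if actions else 0
    keys: List[int] = []
    values: List[bytes] = []
    for start in range(0, num_frames, chunk_size):
        chunk = {key: seq[start:start + chunk_size] for key, seq in actions.items()}
        keys.append(start)
        values.append(pickle.dumps(chunk))
    # Assumption: a final partial chunk is stored as-is rather than dropped.
    return keys, values


# 70 frames of hypothetical action data.
actions = {"forward": [i % 2 for i in range(70)], "jump": [0] * 70}
keys, values = chunk_and_pickle(actions, chunk_size=32)
first_chunk = pickle.loads(values[0])
last_chunk = pickle.loads(values[-1])
```

Keying each chunk by its first frame index means a reader can find the chunk for any frame with integer division by the chunk size.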

gen_frame_skip_flags(file_name: str) → List[bool][source]#

Generates frame skip flags based on action data to identify no-operation frames.

A frame is considered a no-op if the camera is static and all other actions are zero.

Parameters:: file_name (str) – The name of the action file (without extension).
Returns:: A list of boolean flags, where True indicates the frame should be kept (not a no-op).
Return type:: List[bool]
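The no-op rule can be sketched as below. The action dictionary layout (a `camera` pair plus integer button entries) is a hypothetical stand-in for the raw .jsonl records, and the flag polarity follows the return description above (True keeps the frame).

```python
from typing import Any, Dict, List


def is_noop(action: Dict[str, Any]) -> bool:
    """True when the camera does not move and every other entry is zero."""
    for key, value in action.items():
        if key == "camera":
            if any(component != 0 for component in value):
                return False
        elif value != 0:
            return False
    return True


def keep_flags(actions: List[Dict[str, Any]]) -> List[bool]:
    """One flag per frame; True marks a frame that carries a real action."""
    return [not is_noop(action) for action in actions]


actions = [
    {"camera": [0.0, 0.0], "forward": 0, "attack": 0},  # no-op
    {"camera": [0.0, 1.5], "forward": 0, "attack": 0},  # camera moves
    {"camera": [0.0, 0.0], "forward": 1, "attack": 0},  # button pressed
]
flags = keep_flags(actions)
```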

load_episodes() → OrderedDict[source]#

Loads episodes from input directories containing .jsonl action files.

It identifies episode segments from file names, sorts them, and organizes them into an OrderedDict.

Returns:: An OrderedDict where keys are episode IDs and values are lists of segment file paths.
Return type:: OrderedDict

class minestudio.data.minecraft.callbacks.action.ActionDrawFrameCallback(start_point: Tuple[int, int] = (10, 10))[source]#

Callback for drawing action information onto video frames.

draw_frames(frames: ndarray | List, infos: Dict, sample_idx: int) → ndarray[source]#

Draws action information onto each frame.

Parameters:

frames (Union[np.ndarray, List]) – A list of frames or a numpy array of frames.
infos (Dict) – Dictionary containing action information.
sample_idx (int) – Index of the sample to process.

Returns:

A list of frames with action information drawn on them.

Return type:

List[np.ndarray]

class minestudio.data.minecraft.callbacks.action.ActionKernelCallback(n_camera_bins: int = 11, camera_binsize: int = 2, camera_maxval: int = 10, camera_mu: int = 10, camera_quantization_scheme='mu_law', enable_prev_action: bool = False, **kwargs)[source]#

Callback for handling Minecraft actions.

This callback processes action data, including decoding, merging, slicing, padding, and postprocessing. It can handle both regular actions and previous actions if enabled.

create_from_config() → ActionKernelCallback[source]#

Creates an ActionKernelCallback instance from a configuration dictionary.

Parameters:: config (Dict) – Configuration dictionary.
Returns:: An instance of ActionKernelCallback.
Return type:: ActionKernelCallback

do_decode(chunk: bytes, **kwargs) → Dict[source]#

Decodes a chunk of bytes into an action dictionary.

Parameters:

chunk (bytes) – Bytes to decode.
kwargs – Additional keyword arguments.

Returns:

Decoded action dictionary.

Return type:

Dict

do_merge(chunk_list: List[bytes], **kwargs) → Dict[source]#

Merges a list of decoded action chunks into a single dictionary.

Parameters:

chunk_list (List[bytes]) – List of byte chunks representing actions.
kwargs – Additional keyword arguments.

Returns:

A dictionary containing merged action data.

Return type:

Dict

do_pad(data: Dict, pad_len: int, pad_pos: Literal['left', 'right'], **kwargs) → Tuple[Dict, ndarray][source]#

Pads the action data.

Parameters:

data (Dict) – Action data dictionary.
pad_len (int) – Length of padding to add.
pad_pos (Literal["left", "right"]) – Position to add padding (“left” or “right”).
kwargs – Additional keyword arguments.

Returns:

A tuple containing the padded action data and the padding mask.

Return type:

Tuple[Dict, np.ndarray]

do_postprocess(data: Dict) → Dict[source]#

Postprocesses the action data.

This method handles the transformation of environment actions to agent actions and optionally includes previous actions.

Parameters:: data (Dict) – Data dictionary containing actions.
Returns:: Postprocessed data dictionary.
Return type:: Dict

do_slice(data: Dict, start: int, end: int, skip_frame: int, **kwargs) → Dict[source]#

Slices the action data.

Parameters:

data (Dict) – Action data dictionary.
start (int) – Start index for slicing.
end (int) – End index for slicing.
skip_frame (int) – Frame skipping interval.
kwargs – Additional keyword arguments.

Returns:

Sliced action data.

Return type:

Dict

filter_dataset_paths(dataset_paths: List[str | Path]) → List[Path][source]#

Filters dataset paths to select only action-related paths.

Parameters:: dataset_paths (List[Union[str, Path]]) – A list of dataset paths.
Returns:: A list of paths pointing to action data.
Return type:: List[Path]

property name: str#

Returns the name of the callback.

Returns:: The name ‘action’.
Return type:: str

class minestudio.data.minecraft.callbacks.action.VectorActionKernelCallback(action_chunk_size: int = 32, return_type: str = 'vector')[source]#

Callback for handling actions represented as vectors.

This callback converts actions between dictionary and vector representations.

action_to_dict(action: Dict) → Dict[source]#

Converts an action dictionary to a specific dictionary format with “camera” and “button” keys.

Parameters:: action (Dict) – Action dictionary.
Returns:: Dictionary with “camera” and “button” actions.
Return type:: Dict

action_to_vector(action: Dict) → ndarray[source]#

Converts an action dictionary to an action vector.

Parameters:: action (Dict) – Action dictionary.
Returns:: Action vector.
Return type:: np.ndarray

do_postprocess(data: Dict) → Dict[source]#

Postprocesses the action data, converting it to the specified return type (vector or dict).

Parameters:: data (Dict) – Data dictionary containing actions.
Returns:: Postprocessed data dictionary with actions in the specified format.
Return type:: Dict

property vector_dim: int#

Calculates the dimension of the action vector.

Returns:: The dimension of the action vector.
Return type:: int

vector_to_action(vector: ndarray) → List[Dict] | Dict[source]#

Converts an action vector or a list of action vectors to action dictionaries.

Parameters:: vector (np.ndarray) – Action vector(s).
Returns:: Action dictionary or list of action dictionaries.
Return type:: Union[List[Dict], Dict]

Modality:Image#

Date: 2025-01-09 05:07:59 LastEditors: caishaofei-mus1 1744260356@qq.com LastEditTime: 2025-01-21 22:28:22 FilePath: /MineStudio/minestudio/data/minecraft/callbacks/image.py

class minestudio.data.minecraft.callbacks.image.ImageConvertCallback(*args, thread_pool: int = 8, **kwargs)[source]#

A ModalConvertCallback for converting raw video data (e.g., from .mp4 files) into the MineStudio chunked format.

This class handles loading video episodes, splitting them, and encoding frames into byte chunks suitable for LMDB storage.

do_convert(eps_id: str, skip_frames: List[List[bool]], modal_file_path: List[str | Path]) → Tuple[List[int], List[bytes]][source]#

Converts video data for a given episode into a list of chunk keys and byte values.

It iterates through each video segment file for the episode, reads frames using av, applies frame skipping based on skip_frames, resizes frames to a fixed resolution (224x224 in this implementation, even though cv_width and cv_height are reassigned), and then groups frames into chunks. Each chunk is encoded into mp4 format by _write_video_chunk, running in a thread pool.

Note: The expected original video dimensions (640x360) and the 224x224 resize target are both hardcoded. The source_path and ord variables appear in print statements without a clear definition in this method.

Parameters:

eps_id (str) – The identifier for the episode being converted.
skip_frames (List[List[bool]]) – A list of lists of boolean flags. Each inner list corresponds to a segment in modal_file_path and indicates whether to skip each frame.
modal_file_path (List[Union[str, Path]]) – A list of file paths (str or Path objects) for the video data segments.

Returns:

A tuple containing two lists:

keys (List[int]) – A list of chunk start indices.
vals (List[bytes]) – A list of serialized video chunk byte values.

Return type:

Tuple[List[int], List[bytes]]
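
The relationship between frame skipping, chunking, and the returned keys can be sketched as follows. This is a simplified illustration under stated assumptions: True in the flag list is taken to mean "keep the frame", matching the action filter's description. Plain integers stand in for decoded frames, and each chunk is left as a list rather than encoded to mp4 bytes. The name chunk_frames is hypothetical, and the real method may treat a short final chunk differently.

```python
from typing import List, Sequence, Tuple


def chunk_frames(frames: Sequence, keep: Sequence[bool],
                 chunk_size: int = 32) -> Tuple[List[int], List[list]]:
    """Drop skipped frames, then group the rest into fixed-size chunks."""
    # Assumption: a True flag means the frame is kept.
    kept = [frame for frame, flag in zip(frames, keep) if flag]
    keys, vals = [], []
    for start in range(0, len(kept), chunk_size):
        keys.append(start)                            # chunk start index
        vals.append(kept[start:start + chunk_size])   # frames in this chunk
    return keys, vals


frames = list(range(10))
keep = [True] * 10
keep[3] = False  # frame 3 is filtered out
keys, vals = chunk_frames(frames, keep, chunk_size=4)
print(keys)  # [0, 4, 8]
print(vals)  # [[0, 1, 2, 4], [5, 6, 7, 8], [9]]
```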

load_episodes() → OrderedDict[source]#

Loads and organizes video episode data from the specified input directories.

It scans for .mp4 files, groups them by episode ID (parsed from filenames), sorts segments within each episode, and then re-splits episodes if segments are too far apart in time (controlled by MAX_TIME).

Returns:: An OrderedDict where keys are episode IDs (e.g., “episodeName-startTime”) and values are lists of (part_id, file_path) tuples for each segment belonging to that episode.
Return type:: OrderedDict
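
The re-splitting step can be sketched as below. The sketch assumes that part_id is an integer timestamp-like value and that a new episode starts whenever the gap to the previous segment exceeds the threshold. The threshold value, its unit, and the function name resplit are hypothetical.

```python
from collections import OrderedDict
from typing import Dict, List, Tuple

MAX_TIME = 1000  # hypothetical threshold; the real value and unit are not documented here


def resplit(episodes: Dict[str, List[Tuple[int, str]]],
            max_time: int = MAX_TIME) -> "OrderedDict[str, List[Tuple[int, str]]]":
    """Sort segments per episode and start a new episode at every large time gap."""
    out: "OrderedDict[str, List[Tuple[int, str]]]" = OrderedDict()
    for name, segments in episodes.items():
        current_key = None
        prev_id = None
        for part_id, path in sorted(segments):
            if prev_id is None or part_id - prev_id > max_time:
                # Key format follows the documented "episodeName-startTime" pattern.
                current_key = f"{name}-{part_id}"
                out[current_key] = []
            out[current_key].append((part_id, path))
            prev_id = part_id
    return out


raw = {"ep": [(300, "ep-300.mp4"), (0, "ep-0.mp4"), (5000, "ep-5000.mp4")]}
for key, parts in resplit(raw).items():
    print(key, parts)
```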

class minestudio.data.minecraft.callbacks.image.ImageKernelCallback(frame_width: int = 128, frame_height: int = 128, num_workers: int = 4, enable_video_aug: bool = False)[source]#

A ModalKernelCallback specifically for processing image (video frame) data.

This class handles the decoding of video chunks, merging multiple chunks, slicing frames, padding, and optionally applying video augmentations.

static create_from_config(config: Dict) → ImageKernelCallback[source]#

Factory method to create an ImageKernelCallback instance from a configuration dictionary.

It expects the configuration for the image callback to be under the ‘image’ key in the provided config dictionary.

Parameters:: config (Dict) – A dictionary containing the configuration parameters.
Returns:: An instance of ImageKernelCallback initialized with parameters from the config.
Return type:: ImageKernelCallback

do_decode(chunk: bytes, **kwargs) → ndarray[source]#

Decodes a video chunk (bytes) into a NumPy array of frames.

It uses the av library to open and decode the video stream from the byte chunk. Frames are converted to RGB24 format and resized to the specified frame_width and frame_height using OpenCV. Decoding and resizing are parallelized using a ThreadPoolExecutor.

Parameters:

chunk (bytes) – A byte string representing the video data chunk.
**kwargs – Additional keyword arguments (not used in this implementation).

Returns:

A NumPy array of decoded and resized video frames, with shape (T, H, W, C).

Return type:

np.ndarray

do_merge(chunk_list: List[bytes], **kwargs) → ndarray[source]#

Merges a list of decoded video chunks (each being a byte string) into a single NumPy array of frames.

It uses a ThreadPoolExecutor to parallelize the decoding of each chunk using the do_decode method, and then concatenates the resulting frame arrays.

Parameters:

chunk_list (List[bytes]) – A list of byte strings, where each string is a video data chunk.
**kwargs – Additional keyword arguments (not used in this implementation).

Returns:

A NumPy array representing the merged video frames from all chunks.

Return type:

np.ndarray
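
The decode-in-parallel, then concatenate pattern described above can be sketched with the standard library alone. Here fake_decode is a hypothetical stand-in for do_decode (each byte plays the role of one decoded frame), and plain lists replace NumPy arrays.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_decode(chunk: bytes) -> List[int]:
    """Stand-in for do_decode: treat each byte as one decoded frame."""
    return list(chunk)


def merge_chunks(chunk_list: List[bytes], num_workers: int = 4) -> List[int]:
    """Decode all chunks in a thread pool, then concatenate in the original order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map yields results in input order, so frame order is preserved.
        decoded = list(pool.map(fake_decode, chunk_list))
    merged: List[int] = []
    for frames in decoded:
        merged.extend(frames)
    return merged


print(merge_chunks([b"\x01\x02", b"\x03"]))  # [1, 2, 3]
```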

do_pad(data: ndarray, pad_len: int, pad_pos: Literal['left', 'right'], **kwargs) → Tuple[ndarray, ndarray][source]#

Pads a NumPy array of video frames with zeros to a specified length.

Padding can be added to the “left” (beginning) or “right” (end) of the frame sequence. A mask is also generated to indicate the original (1) versus padded (0) frames.

Parameters:

data (np.ndarray) – A NumPy array of video frames (T, H, W, C) to be padded.
pad_len (int) – The number of frames to add as padding.
pad_pos (Literal["left", "right"]) – The position to add padding, either “left” or “right”.
**kwargs – Additional keyword arguments (not used in this implementation).

Returns:

A tuple containing:

pad_data (np.ndarray) – The padded video data.
pad_mask (np.ndarray) – A 1D array mask indicating original (1) and padded (0) frames.

Return type:

Tuple[np.ndarray, np.ndarray]

Raises:

ValueError – If pad_pos is not “left” or “right”.
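
A minimal sketch of the padding and mask behaviour described above, using nested lists in place of NumPy arrays so that it runs without third-party packages. The function name pad_frames is hypothetical, and the sketch assumes at least one input frame.

```python
from typing import List, Tuple


def pad_frames(frames: List[List[int]], pad_len: int,
               pad_pos: str) -> Tuple[List[List[int]], List[int]]:
    """Pad a frame sequence with all-zero frames and return a validity mask."""
    # Each padding frame has the same size as a real frame, filled with zeros.
    padding = [[0] * len(frames[0]) for _ in range(pad_len)]
    real_mask = [1] * len(frames)
    pad_mask = [0] * pad_len
    if pad_pos == "left":
        return padding + frames, pad_mask + real_mask
    if pad_pos == "right":
        return frames + padding, real_mask + pad_mask
    raise ValueError(f"pad_pos must be 'left' or 'right', got {pad_pos!r}")


data, mask = pad_frames([[7, 7], [8, 8]], pad_len=1, pad_pos="left")
print(data)  # [[0, 0], [7, 7], [8, 8]]
print(mask)  # [0, 1, 1]
```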

do_postprocess(data: Dict) → Dict[source]#

Post-processes the image data, primarily by applying video augmentation if enabled.

If enable_video_aug was set to True during initialization, this method applies the configured video_augmentor to the frames stored under the “image” key in the input dictionary.

Parameters:: data (Dict) – A dictionary containing the image data, expected to have an “image” key with a NumPy array of frames.
Returns:: The (potentially augmented) data dictionary.
Return type:: Dict

do_slice(data: ndarray, start: int, end: int, skip_frame: int, **kwargs) → ndarray[source]#

Slices a NumPy array of video frames based on start, end, and skip_frame parameters.

Parameters:

data (np.ndarray) – A NumPy array of video frames (T, H, W, C).
start (int) – The starting frame index for the slice (inclusive).
end (int) – The ending frame index for the slice (exclusive, as in standard Python slicing, so frames up to end-1 are selected).
skip_frame (int) – The interval at which to select frames.
**kwargs – Additional keyword arguments (not used in this implementation).

Returns:

A NumPy array containing the sliced video frames.

Return type:

np.ndarray
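
Assuming the method is equivalent to a strided slice along the time axis (an assumption, since the implementation is not shown here), the three parameters interact as follows. The helper slice_frames is hypothetical and uses a list of frame indices in place of a frame array.

```python
from typing import List


def slice_frames(data: List[int], start: int, end: int, skip_frame: int) -> List[int]:
    """Select every skip_frame-th frame in the half-open range [start, end)."""
    return data[start:end:skip_frame]


frames = list(range(10))              # frame indices 0..9
print(slice_frames(frames, 2, 9, 3))  # [2, 5, 8]
print(slice_frames(frames, 0, 4, 1))  # [0, 1, 2, 3]
```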

filter_dataset_paths(dataset_paths: List[str | Path]) → List[Path][source]#

Filters a list of dataset paths to select only those relevant to image/video data.

It checks if the stem of each path is either ‘video’ or ‘image’.

Parameters:: dataset_paths (List[Union[str, Path]]) – A list of dataset paths (strings or Path objects).
Returns:: A list of Path objects pointing to the filtered image/video dataset directories.
Return type:: List[Path]
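
The stem check described above amounts to the following sketch. The function name filter_image_paths and the example directory layout are illustrative assumptions.

```python
from pathlib import Path
from typing import List, Union


def filter_image_paths(dataset_paths: List[Union[str, Path]]) -> List[Path]:
    """Keep only the paths whose stem is 'video' or 'image'."""
    return [Path(p) for p in dataset_paths if Path(p).stem in ("video", "image")]


paths = ["/data/10xx/video", "/data/10xx/action", Path("/data/10xx/image")]
print(filter_image_paths(paths))
```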

property name: str#

Returns the name of this callback, which is “image”.

Returns:: The string “image”.
Return type:: str

class minestudio.data.minecraft.callbacks.image.VideoAugmentation(frame_width: int = 224, frame_height: int = 224)[source]#

Applies a sequence of image augmentations to video frames using the albumentations library.

This class defines a transformation pipeline that includes color jittering and affine transformations (rotation, scaling, shearing).

Modality:Info#

Date: 2025-01-09 05:36:19 LastEditors: muzhancun muzhancun@stu.pku.edu.cn LastEditTime: 2025-05-27 14:45:03 FilePath: /MineStudio/minestudio/data/minecraft/callbacks/meta_info.py

class minestudio.data.minecraft.callbacks.meta_info.MetaInfoConvertCallback(input_dirs: List[str], chunk_size: int)[source]#

Callback for converting raw metadata into the MineStudio format.

do_convert(eps_id: str, skip_frames: List[List[bool]], modal_file_path: List[str | Path]) → Tuple[List, List][source]#

Converts metadata for a given episode.

It loads metadata from specified files, applies frame skipping, and chunks the data.

Parameters:

eps_id (str) – Episode ID.
skip_frames (List[List[bool]]) – A list of lists of boolean flags indicating whether to skip each frame for each segment file.
modal_file_path (List[Union[str, Path]]) – A list of file paths for the metadata segments.

Returns:

A tuple containing a list of chunk start indices and a list of serialized chunk values.

Return type:

Tuple[List, List]

load_episodes()[source]#

Loads and organizes metadata episode data from input directories.

It identifies metadata files (ending with .pkl), groups them by episode, sorts segments within episodes, and re-splits episodes based on a maximum time interval.

Returns:: An OrderedDict of episodes, where keys are episode IDs and values are lists of (part_id, file_path) tuples.
Return type:: OrderedDict

class minestudio.data.minecraft.callbacks.meta_info.MetaInfoDrawFrameCallback(start_point: Tuple[int, int] = (150, 10))[source]#

Callback for drawing metadata information onto video frames.

draw_frames(frames: List, infos: Dict, sample_idx: int) → ndarray[source]#

Draws metadata (pitch, yaw, cursor position, GUI status) onto each frame.

Parameters:

frames (List) – A list of frames.
infos (Dict) – Dictionary containing metadata information under the key ‘meta_info’.
sample_idx (int) – Index of the sample to process.

Returns:

A list of frames with metadata drawn on them.

Return type:

List[np.ndarray]

class minestudio.data.minecraft.callbacks.meta_info.MetaInfoKernelCallback[source]#

Callback for processing metadata information.

Handles decoding, merging, slicing, and padding of metadata.

create_from_config() → MetaInfoKernelCallback[source]#

Creates a MetaInfoKernelCallback instance from a configuration dictionary.

Parameters:: config (Dict) – Configuration dictionary.
Returns:: An instance of MetaInfoKernelCallback.
Return type:: MetaInfoKernelCallback

do_decode(chunk: bytes, **kwargs) → Dict[source]#

Decodes a chunk of bytes into a metadata dictionary.

Parameters:

chunk (bytes) – Bytes to decode.
kwargs – Additional keyword arguments.

Returns:

Decoded metadata dictionary.

Return type:

Dict

do_merge(chunk_list: List[bytes], **kwargs) → Dict[source]#

Merges a list of decoded metadata chunks into a single dictionary.

Each chunk is expected to be a list of frame_info dictionaries.

Parameters:

chunk_list (List[bytes]) – List of byte chunks representing metadata.
kwargs – Additional keyword arguments.

Returns:

A dictionary containing merged metadata.

Return type:

Dict
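
One plausible shape for this merge is turning per-frame dictionaries into a dictionary of per-key lists. The sketch below assumes that chunks are pickle-serialized (documented on this page only for the segmentation callback) and reuses the pitch and yaw fields mentioned for MetaInfoDrawFrameCallback. The function name merge_meta_chunks is hypothetical.

```python
import pickle
from typing import Any, Dict, List


def merge_meta_chunks(chunk_list: List[bytes]) -> Dict[str, List[Any]]:
    """Merge chunks of per-frame dictionaries into one dictionary of lists."""
    merged: Dict[str, List[Any]] = {}
    for chunk in chunk_list:
        frame_infos = pickle.loads(chunk)  # assumption: each chunk is a pickled list
        for frame_info in frame_infos:
            for key, value in frame_info.items():
                merged.setdefault(key, []).append(value)
    return merged


chunks = [
    pickle.dumps([{"pitch": 0.0, "yaw": 1.0}]),
    pickle.dumps([{"pitch": 2.0, "yaw": 3.0}]),
]
print(merge_meta_chunks(chunks))  # {'pitch': [0.0, 2.0], 'yaw': [1.0, 3.0]}
```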

do_pad(data: Dict, pad_len: int, pad_pos: Literal['left', 'right'], **kwargs) → Tuple[Dict, ndarray][source]#

Pads the metadata.

Handles both numpy arrays and lists within the data dictionary. For numpy arrays, it pads with zeros. For lists, it pads with None.

Parameters:

data (Dict) – Metadata dictionary.
pad_len (int) – Length of padding to add.
pad_pos (Literal["left", "right"]) – Position to add padding (“left” or “right”).
kwargs – Additional keyword arguments.

Returns:

A tuple containing the padded metadata and the padding mask.

Return type:

Tuple[Dict, np.ndarray]

do_slice(data: Dict, start: int, end: int, skip_frame: int, **kwargs) → Dict[source]#

Slices the metadata.

Parameters:

data (Dict) – Metadata dictionary.
start (int) – Start index for slicing.
end (int) – End index for slicing.
skip_frame (int) – Frame skipping interval.
kwargs – Additional keyword arguments.

Returns:

Sliced metadata dictionary.

Return type:

Dict

filter_dataset_paths(dataset_paths: List[str]) → List[str][source]#

Filters dataset paths to select only metadata-related paths.

Parameters:: dataset_paths (List[str]) – A list of dataset paths.
Returns:: A list of paths pointing to metadata files.
Return type:: List[str]

property name: str#

Returns the name of the callback.

Returns:: The name ‘meta_info’.
Return type:: str

Modality:Segmentation#

Date: 2025-01-09 05:42:00 LastEditors: caishaofei-mus1 1744260356@qq.com LastEditTime: 2025-01-21 22:31:03 FilePath: /MineStudio/minestudio/data/minecraft/callbacks/segmentation.py

class minestudio.data.minecraft.callbacks.segmentation.SegmentationConvertCallback(input_dirs: List[str], chunk_size: int)[source]#

Callback for converting segmentation data from raw RLE files.

This callback loads episodes from directories containing RLE files, processes them, and converts them into a structured format.

do_convert(eps_id: str, skip_frames: List[List[bool]], modal_file_path: List[str | Path]) → Tuple[List, List][source]#

Convert segmentation data for a given episode.

The input video segments are concatenated end to end to form a complete trajectory, identified by eps_id. Each segment is processed independently, however, so its frame IDs are local to that segment. When the segments are integrated into the complete trajectory, the frame IDs must be remapped. This is the role of frame_id_re_mapping, where ord indicates the segment's position within the whole trajectory.

Parameters:

eps_id (str) – The ID of the episode.
skip_frames (List[List[bool]]) – A list of lists of boolean flags indicating whether to skip frames.
modal_file_path (List[Union[str, Path]]) – A list of file paths for the modal data.

Returns:

A tuple containing a list of keys (chunk start indices) and a list of pickled data values.

Return type:

Tuple[List, List]

load_episodes() → OrderedDict[source]#

Load episodes from input directories containing RLE files.

Identifies episode segments, sorts them, and re-splits them based on time.

Returns:: An OrderedDict where keys are episode IDs and values are lists of segment file paths.
Return type:: OrderedDict

class minestudio.data.minecraft.callbacks.segmentation.SegmentationDrawFrameCallback(start_point: Tuple[int, int] = (300, 10), draw_point: bool = True, draw_mask: bool = True, draw_event: bool = True, draw_frame_id: bool = True, draw_frame_range: bool = True)[source]#

Callback for drawing segmentation information on frames.

This callback can draw points, masks, event text, frame IDs, and frame ranges onto video frames.

Draw segmentation information on a single frame.

Parameters:

frame (np.ndarray) – The input frame (numpy array).
point (Optional[Tuple[int, int]]) – The (y, x) coordinates of the interaction point (normalized).
obj_mask (Optional[np.ndarray]) – The object mask (numpy array).
obj_id (Optional[int]) – The object ID.
event (Optional[str]) – The event string.
frame_id (Optional[int]) – The frame ID.
frame_range (Optional[Tuple[int, int]]) – The frame range tuple.

Returns:

The frame with drawn segmentation information.

Return type:

np.ndarray

draw_frames(frames: ndarray | List, infos: Dict, sample_idx: int) → ndarray[source]#

Draw segmentation information on a list of frames.

Parameters:

frames (Union[np.ndarray, List]) – A list of frames or a numpy array of frames.
infos (Dict) – A dictionary containing segmentation information.
sample_idx (int) – The index of the sample to process.

Returns:

A list of frames with drawn segmentation information.

Return type:

List[np.ndarray]

class minestudio.data.minecraft.callbacks.segmentation.SegmentationKernelCallback(frame_width: int = 224, frame_height: int = 224)[source]#

Callback for handling segmentation data kernels.

This callback processes segmentation masks, including RLE encoding/decoding, filtering dataset paths, merging data chunks, slicing, and padding.

create_from_config() → SegmentationKernelCallback[source]#

Create a SegmentationKernelCallback instance from a configuration dictionary.

Parameters:: config (Dict) – Configuration dictionary.
Returns:: An instance of SegmentationKernelCallback.
Return type:: SegmentationKernelCallback

do_decode(chunk: bytes, **kwargs) → Dict[source]#

Decode a data chunk using pickle.

Parameters:

chunk (bytes) – The data chunk to decode.
kwargs – Additional keyword arguments.

Returns:

The decoded data as a dictionary.

Return type:

Dict

do_merge(chunk_list: List[bytes], **kwargs) → Dict[source]#

Merge a list of data chunks into a single dictionary.

Handles object ID remapping, mask resizing, and event processing.

Parameters:

chunk_list (List[bytes]) – A list of data chunks (bytes).
kwargs – Additional keyword arguments, e.g., ‘event_constrain’.

Returns:

A dictionary containing the merged segmentation data.

Return type:

Dict

do_pad(data: Dict, pad_len: int, pad_pos: Literal['left', 'right'], **kwargs) → Tuple[Dict, ndarray][source]#

Pad the data to a specified length.

Parameters:

data (Dict) – The input data dictionary.
pad_len (int) – The length of the padding to add.
pad_pos (Literal["left", "right"]) – The position to add padding (‘left’ or ‘right’).
kwargs – Additional keyword arguments.

Returns:

A tuple containing the padded data dictionary and the padding mask.

Return type:

Tuple[Dict, np.ndarray]

do_slice(data: Dict, start: int, end: int, skip_frame: int, **kwargs) → Dict[source]#

Slice the data based on start, end, and skip_frame parameters.

Parameters:

data (Dict) – The input data dictionary.
start (int) – The starting index for slicing.
end (int) – The ending index for slicing.
skip_frame (int) – The step for slicing (frame skipping).
kwargs – Additional keyword arguments.

Returns:

A dictionary containing the sliced data.

Return type:

Dict

filter_dataset_paths(dataset_paths: List[str | Path]) → List[Path][source]#

Filter dataset paths to include only segmentation-related files.

Parameters:: dataset_paths (List[Union[str, Path]]) – A list of dataset paths (strings or Path objects).
Returns:: A list of Path objects pointing to segmentation files.
Return type:: List[Path]

property name: str#

Get the name of the callback.

Returns:: The name ‘segmentation’.
Return type:: str

rle_decode(mask_rle: str, shape: Tuple[int, int]) → ndarray[source]#

Decode a run-length encoded (RLE) mask.

Parameters:

mask_rle (str) – The run-length encoding as a string of (start, length) pairs.
shape (Tuple[int, int]) – (height, width) of the array to return.

Returns:

Numpy array, 1 - mask, 0 - background.

Return type:

np.ndarray

rle_encode(binary_mask: ndarray) → str[source]#

Encode a binary mask using run-length encoding (RLE).

Parameters:: binary_mask (np.ndarray) – Numpy array, 1 - mask, 0 - background.
Returns:: Run length as a string.
Return type:: str
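
A pure-Python round trip illustrates the (start, length) format. This sketch assumes 1-indexed starts and row-major flattening, a common convention for this style of RLE. The library's own methods operate on NumPy arrays and may differ in those details.

```python
from typing import List, Tuple


def rle_encode(binary_mask: List[List[int]]) -> str:
    """Encode a binary mask as space-separated 'start length' pairs."""
    # Pad with zeros so that runs touching either border are still detected.
    flat = [0] + [v for row in binary_mask for v in row] + [0]
    # Padded index i corresponds to the 1-indexed position i in the original mask.
    changes = [i for i in range(1, len(flat)) if flat[i] != flat[i - 1]]
    runs: List[int] = []
    for start, stop in zip(changes[0::2], changes[1::2]):
        runs += [start, stop - start]
    return " ".join(str(n) for n in runs)


def rle_decode(mask_rle: str, shape: Tuple[int, int]) -> List[List[int]]:
    """Decode 'start length' pairs into a (height, width) mask of 0s and 1s."""
    height, width = shape
    flat = [0] * (height * width)
    tokens = [int(t) for t in mask_rle.split()]
    for start, length in zip(tokens[0::2], tokens[1::2]):
        for i in range(start - 1, start - 1 + length):
            flat[i] = 1
    return [flat[r * width:(r + 1) * width] for r in range(height)]


mask = [[0, 1, 1], [1, 0, 0]]
encoded = rle_encode(mask)
print(encoded)                      # 2 3
print(rle_decode(encoded, (2, 3)))  # [[0, 1, 1], [1, 0, 0]]
```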

LMDB Conversion#

Date: 2024-11-10 12:27:01 LastEditors: caishaofei-mus1 1744260356@qq.com LastEditTime: 2025-01-15 15:04:05 FilePath: /MineStudio/minestudio/data/minecraft/tools/convertion.py

class minestudio.data.minecraft.tools.convertion.ConvertManager(output_dir: str, convert_kernel: ModalConvertCallback, filter_kernel: ModalConvertCallback | None = None, chunk_size: int = 32, num_workers: int = 16)[source]#

Manages the overall data conversion process using multiple ConvertWorker actors.

This class is responsible for preparing tasks (episodes and their parts), dispatching these tasks to ConvertWorker instances, and collecting the results. It supports filtering of episodes and parts based on a filter_kernel.

dispatch()[source]#

Dispatch the prepared tasks to ConvertWorker actors for processing.

Divides the loaded episodes among the specified number of workers. Each worker processes its assigned episodes and writes the output to a separate LMDB file. Collects and prints summary statistics after all workers have completed.
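
How episodes are divided is not specified here. The sketch below assumes contiguous, roughly equal slices, one per worker, and leaves out the actor and LMDB machinery entirely. The function name split_episodes is hypothetical.

```python
from collections import OrderedDict
from typing import List


def split_episodes(episodes: "OrderedDict[str, list]",
                   num_workers: int) -> List["OrderedDict[str, list]"]:
    """Split episodes into at most num_workers contiguous groups."""
    items = list(episodes.items())
    # Ceiling division, with a floor of 1 so that an empty input is handled.
    per_worker = max(1, (len(items) + num_workers - 1) // num_workers)
    return [OrderedDict(items[i:i + per_worker])
            for i in range(0, len(items), per_worker)]


episodes = OrderedDict((f"ep{i}", [f"ep{i}-part0"]) for i in range(5))
groups = split_episodes(episodes, num_workers=2)
print([list(group) for group in groups])  # [['ep0', 'ep1', 'ep2'], ['ep3', 'ep4']]
```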

prepare_tasks()[source]#

Prepare the tasks (episodes and their parts) for conversion.

Loads episodes using the convert_kernel and, if provided, the filter_kernel. Filters out episodes or parts of episodes that do not meet the criteria defined by the filter_kernel. The prepared tasks are stored in self.loaded_episodes.

Date: 2024-11-10 12:26:39 LastEditors: caishaofei-mus1 1744260356@qq.com LastEditTime: 2025-03-17 21:40:14 FilePath: /MineStudio/var/minestudio/data/minecraft/tools/event_convertion.py

minestudio.data.minecraft.tools.event_convertion.main(args)[source]#

Main function to process Minecraft game events and store them in an LMDB database.

This function reads event data from a specified input directory, processes it using a KernelManager, and then organizes and writes the events into an LMDB database with a specific structure. The structure includes a codebook for episode IDs, the total number of events, information about each event type (number of items and episodes), and individual event occurrences.

Parameters:: args (argparse.Namespace) – Command-line arguments, expected to have an input_dir attribute specifying the directory containing the raw game data.

Utils#

Date: 2024-11-10 10:06:28 LastEditors: caishaofei caishaofei@stu.pku.edu.cn LastEditTime: 2025-01-09 16:33:22 FilePath: /MineStudio/minestudio/data/minecraft/utils.py

class minestudio.data.minecraft.utils.MineDistributedBatchSampler(*args: Any, **kwargs: Any)[source]#

A distributed batch sampler for Minecraft datasets.

Ensures that each replica (process) gets a unique and continuous set of samples from the dataset, particularly useful for sequential data like video frames from game episodes. It divides the dataset among replicas and then creates batches within each replica’s assigned portion.
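
The partition-then-batch idea can be sketched without PyTorch. This illustration assumes equal-sized contiguous shards per replica and keeps a short final batch. The real sampler may handle remainders differently, and the function name replica_batches is hypothetical.

```python
from typing import List


def replica_batches(dataset_len: int, num_replicas: int,
                    rank: int, batch_size: int) -> List[List[int]]:
    """Return batches of contiguous sample indices owned by one replica."""
    per_replica = dataset_len // num_replicas  # samples beyond this are dropped
    start = rank * per_replica
    indices = list(range(start, start + per_replica))
    return [indices[i:i + batch_size] for i in range(0, per_replica, batch_size)]


print(replica_batches(10, num_replicas=2, rank=0, batch_size=2))  # [[0, 1], [2, 3], [4]]
print(replica_batches(10, num_replicas=2, rank=1, batch_size=2))  # [[5, 6], [7, 8], [9]]
```

Because each replica walks its own contiguous index range in order, consecutive batches on one replica cover consecutive samples, which is the property that matters for sequential data.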

minestudio.data.minecraft.utils.batchify(batch_in: Sequence[Dict[str, Any]]) → Any[source]#

Collates a sequence of items into a batch.

Recursively processes dictionaries, stacking tensors and converting lists of numbers to tensors. Other types are returned as is.

Parameters:: batch_in – A sequence of items to batch. Each item is typically a dictionary.
Returns:: The collated batch.

minestudio.data.minecraft.utils.download_dataset_from_huggingface(name: Literal['6xx', '7xx', '8xx', '9xx', '10xx'], base_dir: str | None = None)[source]#

Downloads a dataset from Hugging Face.

Checks for sufficient disk space before downloading.

Parameters:

name – The name of the dataset to download (e.g., “6xx”).
base_dir – The base directory to download the dataset to. If None, uses the MineStudio directory.

Returns:

The local path to the downloaded dataset.

Raises:

ValueError – If there is insufficient disk space.
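
A disk-space guard of this kind can be written with shutil.disk_usage from the standard library. The sketch below shows the general pattern only. The function name ensure_disk_space and the exact comparison are assumptions, not the library's implementation.

```python
import shutil


def ensure_disk_space(path: str, required_bytes: int) -> int:
    """Raise ValueError if the filesystem holding path has too little free space."""
    free = shutil.disk_usage(path).free
    if free < required_bytes:
        raise ValueError(
            f"Insufficient disk space: need {required_bytes} bytes, only {free} free."
        )
    return free


free_bytes = ensure_disk_space(".", required_bytes=0)
print(f"{free_bytes / 1024 ** 3:.1f} GiB free")
```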

minestudio.data.minecraft.utils.get_repo_total_size(repo_id, repo_type='dataset', branch='main')[source]#

Calculate the total size of a Hugging Face repository.

Parameters:

repo_id – The identifier of the repository (e.g., ‘CraftJarvis/minestudio-data-6xx’).
repo_type – The type of the repository, defaults to “dataset”.
branch – The branch of the repository, defaults to “main”.

Returns:

A tuple containing the total size in bytes and the total size in gigabytes.

Raises:

requests.exceptions.HTTPError – If the API request fails.

minestudio.data.minecraft.utils.pull_datasets_from_remote(dataset_dirs: List[str]) → List[str][source]#

Pulls datasets from remote Hugging Face repositories if specified.

Iterates through a list of dataset directory paths. If a path is a recognized dataset name (e.g., ‘6xx’), it downloads the dataset from Hugging Face. Otherwise, it keeps the original path.

Parameters:: dataset_dirs – A list of dataset directory paths or dataset names.
Returns:: A list of updated dataset directory paths, with remote datasets downloaded.
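
The name-or-path resolution can be sketched as follows. The recognized names are taken from the signature of download_dataset_from_huggingface. The download callable is injected so that the example runs offline. Both resolve_dataset_dirs and the stub fake_download are hypothetical.

```python
from typing import Callable, List

KNOWN_DATASETS = ("6xx", "7xx", "8xx", "9xx", "10xx")


def resolve_dataset_dirs(dataset_dirs: List[str],
                         download: Callable[[str], str]) -> List[str]:
    """Replace recognized dataset names with local paths; keep other entries as-is."""
    return [download(d) if d in KNOWN_DATASETS else d for d in dataset_dirs]


def fake_download(name: str) -> str:
    """Offline stand-in for the real download step."""
    return f"/cache/{name}"


print(resolve_dataset_dirs(["6xx", "/data/local"], fake_download))
# ['/cache/6xx', '/data/local']
```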

minestudio.data.minecraft.utils.visualize_dataloader(dataloader, draw_frame_callbacks: List[DrawFrameCallback], num_samples: int = 1, save_fps: int = 20, output_dir: str = './') → None[source]#

Visualizes data from a dataloader by creating a video.

Iterates through the dataloader, processes frames using draw_frame_callbacks, and saves the resulting frames as an MP4 video.

Parameters:

dataloader – The dataloader to visualize.
draw_frame_callbacks – A list of callbacks to draw on frames.
num_samples – The number of batches to process from the dataloader.
save_fps – The frames per second for the output video.
output_dir – The directory to save the output video.

minestudio.data.minecraft.utils.write_video(file_name: str, frames: Sequence[ndarray], width: int = 640, height: int = 360, fps: int = 20) → None[source]#

Write a sequence of video frames to a video file.

Parameters:

file_name – The name of the output video file.
frames – A sequence of numpy arrays, where each array represents a frame. Frames are expected to be in RGB format.
width – The width of the video, defaults to 640.
height – The height of the video, defaults to 360.
fps – The frames per second of the video, defaults to 20.

Raises:

AssertionError – If a frame’s dimensions do not match the specified width and height.
