Customize#

Our framework supports customization of online algorithm details.

Trainer#

To customize the trainer, you need to imply an class inherit from minestudio.online.trainer.base_trainer.BaseTrainer with and implement the abstract methods setup_model_and_optimizer and train.

setup_model_and_optimizer return a pair (model, optimizer).

In the train function, you need to define the training loop. You can use the fetch_fragments_and_estimate_advantages method to obtain data from the replay buffer.

from minestudio.online.trainer.base_trainer import BaseTrainer
class PPOTrainer(BaseTrainer):
    def setup_model_and_optimizer(self):
        # Define model and optimizer
        pass

    def train(self):
        # Custom training logic
        pass

Refer to minestudio.online.trainer.ppotrainer.PPOTrainer for an example.