Customize#
Our framework supports customization of online algorithm details.
Trainer#
To customize the trainer, implement a class that inherits from minestudio.online.trainer.base_trainer.BaseTrainer and implements the abstract methods setup_model_and_optimizer and train.
setup_model_and_optimizer returns a pair (model, optimizer).
In the train method, you need to define the training loop. You can use the fetch_fragments_and_estimate_advantages method to obtain data from the replay buffer.
from minestudio.online.trainer.base_trainer import BaseTrainer


class PPOTrainer(BaseTrainer):
    def setup_model_and_optimizer(self):
        # Define model and optimizer
        pass

    def train(self):
        # Custom training logic
        pass
Refer to minestudio.online.trainer.ppotrainer.PPOTrainer for an example.
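The subclassing contract described above can be sketched without MineStudio installed. Everything in the block below is hypothetical: StubBaseTrainer is a stand-in written for this illustration, not the real BaseTrainer, and the assumption that fetch_fragments_and_estimate_advantages returns a list of (observation, advantage) pairs is made up for the example. The dictionary "model" and "optimizer" are placeholders for whatever objects a real trainer would build.

```python
from abc import ABC, abstractmethod


class StubBaseTrainer(ABC):
    """Hypothetical stand-in for BaseTrainer; NOT the MineStudio class."""

    def __init__(self, fragments):
        self._fragments = fragments  # pretend replay-buffer contents
        # The base class asks the subclass for its (model, optimizer) pair.
        self.model, self.optimizer = self.setup_model_and_optimizer()

    @abstractmethod
    def setup_model_and_optimizer(self):
        """Return a (model, optimizer) pair."""

    @abstractmethod
    def train(self):
        """Run the training loop."""

    def fetch_fragments_and_estimate_advantages(self):
        # Assumed return shape: a list of (observation, advantage) pairs.
        return list(self._fragments)


class ToyTrainer(StubBaseTrainer):
    def setup_model_and_optimizer(self):
        model = {"weight": 0.0}    # placeholder "model"
        optimizer = {"lr": 0.5}    # placeholder "optimizer"
        return model, optimizer

    def train(self):
        # Pull data from the (stub) replay buffer and apply a toy update.
        for obs, adv in self.fetch_fragments_and_estimate_advantages():
            self.model["weight"] += self.optimizer["lr"] * adv * obs
        return self.model["weight"]


trainer = ToyTrainer(fragments=[(1.0, 2.0), (2.0, -0.5)])
final_weight = trainer.train()
```

Because both methods are declared abstract in the stand-in base class, Python raises TypeError if a subclass that omits either one is instantiated. The real BaseTrainer may enforce its contract differently, so check its source before relying on this behavior.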