Overview

The code in the Online Training Module is divided into three main parts: run, rollout, and train. Each part lives in its corresponding subfolder under the online folder.

Rollout

The rollout code is located under online/rollout and consists of three main components:

The EnvWorker obtains control information such as actions from the RolloutWorker, interacts directly with the environment, and passes the resulting information back to the RolloutWorker.

The RolloutWorker is responsible for packing the observations, states, and other information from multiple EnvWorkers into batches, passing them to the agent so that actions are computed efficiently, and then distributing the actions to the corresponding EnvWorkers. At the same time, it sends all of this information to the RolloutManager.

The RolloutManager receives the information from the RolloutWorker and internally tracks the work progress of every EnvWorker. When a worker has been working for a certain number of consecutive frames, the RolloutManager clips that work into a fragment and sends it, along with the associated information, to the ReplayBuffer.
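
The data flow described above can be sketched in a few lines of Python. This is only an illustration of the two ideas (batching observations from several EnvWorkers, and clipping each worker's steps into fixed-length fragments), not the module's actual code. The names `batch_and_dispatch`, `toy_policy`, `FragmentTracker`, and `fragment_length` are hypothetical, and the sketch ignores episode boundaries.

```python
from collections import defaultdict


def batch_and_dispatch(observations, policy):
    """Batch per-worker observations, run the policy once, and map actions back."""
    worker_ids = sorted(observations)               # fixed order so rows map back to workers
    batch = [observations[w] for w in worker_ids]   # one row per EnvWorker
    actions = policy(batch)                         # a single batched policy call
    return dict(zip(worker_ids, actions))


def toy_policy(batch):
    """Hypothetical stand-in for the agent: pick the index of the largest value."""
    return [max(range(len(row)), key=row.__getitem__) for row in batch]


class FragmentTracker:
    """Hypothetical stand-in for the progress tracking done by the RolloutManager."""

    def __init__(self, fragment_length):
        self.fragment_length = fragment_length
        self._pending = defaultdict(list)           # worker id -> steps not yet emitted

    def add_step(self, worker_id, step):
        """Record one step; return a full fragment when enough steps exist, else None."""
        steps = self._pending[worker_id]
        steps.append(step)
        if len(steps) < self.fragment_length:
            return None
        fragment = list(steps)
        steps.clear()
        return fragment


# Batching: two workers, one policy call.
obs = {"env_0": [0.1, 0.9, 0.0], "env_1": [0.7, 0.2, 0.1]}
actions = batch_and_dispatch(obs, toy_policy)

# Fragment clipping: seven steps with fragment_length=3 yield two full fragments.
tracker = FragmentTracker(fragment_length=3)
emitted = []
for t in range(7):
    fragment = tracker.add_step("env_0", {"t": t})
    if fragment is not None:
        emitted.append(fragment)            # in the real module this would go to the ReplayBuffer
```

Keeping the pending steps per worker means fragments from different EnvWorkers never mix, even though their observations are batched together for the policy call.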

Trainer

This folder contains the Online Trainer classes: the parent class BaseTrainer and subclasses such as PPOTrainer. A trainer takes two parameters: the config, and the results that GaeWorker computes on the ReplayBuffer.
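
As an illustration of the kind of result a trainer might receive, the sketch below computes generalized advantage estimation (GAE) over a single fragment. It assumes, based only on the name, that GaeWorker performs a GAE-style computation. The function `compute_gae` and its parameters (`gamma`, `lam`, `last_value`) are hypothetical and do not reflect the module's actual interface.

```python
def compute_gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Return (advantages, returns) for one fragment.

    Assumes no episode terminates inside the fragment; `last_value` is the
    value estimate for the state that follows the final step.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error, then the exponentially weighted sum of later errors.
        delta = rewards[t] + gamma * next_value - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


# With gamma = lam = 1 and zero value estimates, each advantage is the sum of remaining rewards.
adv, ret = compute_gae([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], last_value=0.0, gamma=1.0, lam=1.0)
```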

By inheriting from BaseTrainer, you can customize the trainer according to your needs. Refer to Customization for details.
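
A minimal sketch of the inheritance pattern follows. The `BaseTrainer` defined here is a hypothetical stand-in written only so the example runs on its own; the real class very likely has a different constructor and different methods to override, so treat `compute_loss`, `train_step`, and the `loss_scale` config key as assumptions.

```python
class BaseTrainer:
    """Hypothetical stand-in; the real BaseTrainer interface may differ."""

    def __init__(self, config):
        self.config = config

    def compute_loss(self, batch):
        raise NotImplementedError("subclasses define their own loss")

    def train_step(self, batch):
        # Shared logic lives in the parent; only the loss is customized.
        return self.compute_loss(batch)


class MeanAdvantageTrainer(BaseTrainer):
    """Toy subclass: the loss is the negated, scaled mean advantage."""

    def compute_loss(self, batch):
        scale = self.config.get("loss_scale", 1.0)
        advantages = batch["advantages"]
        return -scale * sum(advantages) / len(advantages)


trainer = MeanAdvantageTrainer({"loss_scale": 0.5})
loss = trainer.train_step({"advantages": [1.0, 2.0, 3.0]})
```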