core.trainers.grpo.args
Axolotl Specific Training Args
Classes
| Name | Description |
|---|---|
| AxolotlAsyncGRPOConfig | Axolotl Async GRPO Config: adds async prefetch, streaming scoring, and importance sampling (IS) correction. |
| AxolotlGRPOConfig | Axolotl GRPO Config for GRPO training |
AxolotlAsyncGRPOConfig
core.trainers.grpo.args.AxolotlAsyncGRPOConfig(
use_data_producer=False,
async_prefetch=False,
prefetch_depth=1,
vllm_sync_interval=1,
batch_flattening=False,
streaming_partial_batch=False,
streaming_min_groups=1,
vllm_importance_sampling_correction=True,
vllm_importance_sampling_mode='token_truncate',
vllm_importance_sampling_cap=3.0,
off_policy_mask_threshold=None,
use_bias_correction_kl=False,
reward_num_workers=1,
replay_buffer_size=0,
replay_recompute_logps=True,
reroll_start_fraction=0.5,
reroll_max_groups=1,
skip_zero_advantage_batches=True,
vllm_lora_sync=False,
context_parallel_size=None,
)
Axolotl Async GRPO Config: adds async prefetch, streaming scoring, and importance sampling (IS) correction.
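The defaults `vllm_importance_sampling_mode='token_truncate'` and `vllm_importance_sampling_cap=3.0` suggest a per-token importance weight that is capped from above. The following is a minimal pure-Python sketch of that idea, not Axolotl's implementation. It assumes the weight is the ratio of the training policy's token probability to the sampling engine's token probability, and that truncation means clipping this ratio at the cap. The function name `truncated_is_weights` and its arguments are hypothetical and do not exist in the library.

```python
import math


def truncated_is_weights(train_logps, sampler_logps, cap=3.0):
    """Return per-token importance weights, truncated from above at `cap`.

    Hypothetical helper for illustration only. Assumes:
      - train_logps: log-probs of the sampled tokens under the training policy
      - sampler_logps: log-probs of the same tokens under the sampler (e.g. vLLM)
      - cap: upper bound on each weight (mirrors the 3.0 default shown above)
    """
    if len(train_logps) != len(sampler_logps):
        raise ValueError("log-prob sequences must have the same length")

    weights = []
    for lp_train, lp_sampler in zip(train_logps, sampler_logps):
        # Importance ratio pi_train(token) / pi_sampler(token), computed in log space.
        ratio = math.exp(lp_train - lp_sampler)
        # Truncation: ratios above the cap are clipped, smaller ones pass through.
        weights.append(min(ratio, cap))
    return weights


# Example: the first token matches exactly, the second is far more likely
# under the training policy and is clipped, the third is down-weighted.
example = truncated_is_weights([0.0, 0.0, -1.0], [0.0, -5.0, 0.0])
```

Capping the ratio bounds the influence of any single token whose probability differs sharply between the sampler and the trainer, at the cost of some bias in the corrected estimate.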
AxolotlGRPOConfig
core.trainers.grpo.args.AxolotlGRPOConfig(context_parallel_size=None)
Axolotl GRPO Config for GRPO training