core.trainers.grpo

GRPO-specific training strategy.

Classes

| Name | Description |
| --- | --- |
| GRPOStrategy | Strategy for GRPO training |

GRPOStrategy

core.trainers.grpo.GRPOStrategy()

Strategy for GRPO training

Methods

| Name | Description |
| --- | --- |
| get_reward_func | Returns the reward function from the given fully qualified name, or the path to the reward function model. |
| get_rollout_func | Returns the rollout function from the given fully qualified name. |

get_reward_func
core.trainers.grpo.GRPOStrategy.get_reward_func(reward_func_fqn)

Returns the reward function from the given fully qualified name, or the path to the reward function model.

Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| reward_func_fqn | str | Fully qualified name of the reward function (e.g. r1_grpo.gsm8k_transform), or a HF hub path to the reward model. | required |

Returns
| Name | Type | Description |
| --- | --- | --- |
| RewardFunc | RewardFunc | A callable that accepts prompts and completions and returns rewards, or a path to a reward model. |

Raises
| Name | Type | Description |
| --- | --- | --- |
| | ValueError | If the reward function does not accept at least two arguments. |

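To illustrate the contract described above, here is a minimal sketch of a reward function that accepts prompts and completions, together with an arity check modeled on the documented `ValueError` condition. The names `length_reward` and `check_reward_signature` are hypothetical and exist only for this example; the check is an assumption about how such validation could be written, not the library's implementation.

```python
import inspect


def length_reward(prompts, completions, **kwargs):
    """Hypothetical reward: shorter completions score higher."""
    return [1.0 / (1 + len(completion)) for completion in completions]


def check_reward_signature(func):
    """Assumed check: reject callables that take fewer than two parameters."""
    n_params = len(inspect.signature(func).parameters)
    if n_params < 2:
        raise ValueError(
            f"Reward function must accept at least two arguments, got {n_params}"
        )
    return func


# One reward is returned per completion.
rewards = check_reward_signature(length_reward)(
    prompts=["2 + 2 = ?", "Capital of France?"],
    completions=["4", "Paris"],
)
```

A function with this shape could be referenced by its fully qualified name, in the same way as the `r1_grpo.gsm8k_transform` example given for `reward_func_fqn`.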
get_rollout_func
core.trainers.grpo.GRPOStrategy.get_rollout_func(rollout_func_fqn)

Returns the rollout function from the given fully qualified name.

Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| rollout_func_fqn | str | Fully qualified name of the rollout function (e.g. my_module.my_rollout_func) | required |

Returns
| Name | Type | Description |
| --- | --- | --- |
| | Callable | rollout function |
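Resolving a fully qualified name to a callable is typically done by splitting off the final component and importing the rest as a module. The sketch below shows that general technique with the standard library; `resolve_fqn` is a hypothetical helper, and whether `get_rollout_func` works exactly this way is an assumption.

```python
import importlib


def resolve_fqn(fqn):
    """Import 'package.module.attr' and return the attribute it names."""
    module_path, _, attr_name = fqn.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected a dotted name like 'module.func', got {fqn!r}")
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)


# Demonstrated with a standard-library function so the snippet runs anywhere;
# a real call would pass a name such as "my_module.my_rollout_func".
sqrt_func = resolve_fqn("math.sqrt")
result = sqrt_func(9.0)
```

In this sketch the named module must be importable from the current Python path, otherwise `importlib.import_module` raises `ModuleNotFoundError`.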