core.trainers.grpo
GRPO-specific strategy for training.
Classes
GRPOStrategy
core.trainers.grpo.GRPOStrategy()
Strategy for GRPO training
Methods
| Name | Description |
| --- | --- |
| get_reward_func | Returns the reward function from the given fully qualified name, or the path to the reward function model. |
| get_rollout_func | Returns the rollout function from the given fully qualified name. |
get_reward_func
core.trainers.grpo.GRPOStrategy.get_reward_func(reward_func_fqn)
Returns the reward function from the given fully qualified name, or the path to the reward function model.
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| reward_func_fqn | str | Fully qualified name of the reward function (e.g. r1_grpo.gsm8k_transform), or a HF hub path to the reward model. | required |
Returns
| Name | Type | Description |
| --- | --- | --- |
| RewardFunc | RewardFunc | A callable that accepts prompts and completions and returns rewards, or a path to a reward model. |
Raises
| Type | Description |
| --- | --- |
| ValueError | If the reward function does not accept at least two arguments. |
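
The behaviour described above (import by dotted name, fall back to a model path, reject functions with fewer than two arguments) can be illustrated with a minimal sketch. This is not the library's implementation: `resolve_reward_func`, the in-memory `demo_rewards` module, and both demo reward functions are hypothetical names introduced only for this example, and treating any non-importable string as a model path is an assumption.

```python
import importlib
import inspect
import sys
import types
from typing import Callable, List, Union


def resolve_reward_func(reward_func_fqn: str) -> Union[Callable, str]:
    """Hypothetical helper: import a reward function by dotted name.

    Falls back to returning the input string, on the assumption that a
    non-importable name is a path to a reward model.
    """
    module_name, _, func_name = reward_func_fqn.rpartition(".")
    try:
        module = importlib.import_module(module_name)
        func = getattr(module, func_name)
    except (ImportError, AttributeError, ValueError):
        # Assumption: anything that cannot be imported is a model path.
        return reward_func_fqn
    # A reward function must take at least prompts and completions.
    if len(inspect.signature(func).parameters) < 2:
        raise ValueError(
            f"Reward function {reward_func_fqn} must accept at least two arguments"
        )
    return func


# Register a throwaway in-memory module so the sketch runs without extra files.
demo = types.ModuleType("demo_rewards")


def length_reward(prompts: List[str], completions: List[str]) -> List[float]:
    """Toy reward: one score per completion, equal to its length."""
    return [float(len(c)) for c in completions]


def one_arg_reward(completions: List[str]) -> List[float]:
    """Deliberately invalid: accepts only one argument."""
    return [0.0 for _ in completions]


demo.length_reward = length_reward
demo.one_arg_reward = one_arg_reward
sys.modules["demo_rewards"] = demo

reward_fn = resolve_reward_func("demo_rewards.length_reward")
print(reward_fn(["2+2?"], ["4", "four"]))
print(resolve_reward_func("some-org/reward-model"))
```

In this sketch, resolving `demo_rewards.one_arg_reward` raises `ValueError`, mirroring the documented check on the argument count.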
get_rollout_func
core.trainers.grpo.GRPOStrategy.get_rollout_func(rollout_func_fqn)
Returns the rollout function from the given fully qualified name.
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| rollout_func_fqn | str | Fully qualified name of the rollout function (e.g. my_module.my_rollout_func). | required |
Returns
| Name | Type | Description |
| --- | --- | --- |
| | | Callable rollout function. |
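
To make the "fully qualified name" convention concrete, here is a small sketch of how a dotted name such as my_module.my_rollout_func splits into a module path and an attribute. `load_callable` is a hypothetical helper, not part of the library, and the example resolves a standard-library function because this page does not state what signature a rollout function must have.

```python
import importlib
from typing import Callable


def load_callable(fqn: str) -> Callable:
    """Hypothetical helper: resolve 'package.module.attr' to a callable."""
    # Everything before the last dot is the module; the rest is the attribute.
    module_name, sep, attr_name = fqn.rpartition(".")
    if not sep:
        raise ValueError(f"Expected 'module.attribute', got {fqn!r}")
    obj = getattr(importlib.import_module(module_name), attr_name)
    if not callable(obj):
        raise TypeError(f"{fqn} resolved to a non-callable object")
    return obj


# Stand-in target from the standard library; a real rollout function
# would live in a user module importable from the training environment.
dumps = load_callable("json.dumps")
print(dumps({"ok": True}))
```

A name without a dot is rejected early in this sketch, since it cannot identify both a module and a function.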