integrations.nemo_gym.plugin
NeMo Gym Plugin for Axolotl.
Integrates NVIDIA NeMo Gym environments as reward sources for GRPO training. Handles server lifecycle, dataset loading, and reward function wiring.
Supports two modes:
- Single-turn (default): reward_fn calls /verify after each generation
- Multi-turn (nemo_gym_multi_turn: true): rollout_func orchestrates multi-step interactions with tool execution via resource servers
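As a minimal sketch of how the two modes map onto trainer hooks, assuming the config is available as a plain dictionary: the helper name `wiring_mode` is hypothetical and not part of the plugin, and only the `nemo_gym_multi_turn` key comes from the description above.

```python
def wiring_mode(cfg: dict) -> str:
    """Return which hook this config would select (illustrative only)."""
    # Single-turn is the default, so a missing key falls through to reward_fn.
    if cfg.get("nemo_gym_multi_turn", False):
        return "rollout_func"
    return "reward_fn"
```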
Classes
| Name | Description |
|---|---|
| NemoGymPlugin | Plugin for NVIDIA NeMo Gym integration with Axolotl. |
| VLLMWeightSyncCapabilities | What weight-sync routes a vLLM server actually exposes. |
NemoGymPlugin
integrations.nemo_gym.plugin.NemoGymPlugin()
Plugin for NVIDIA NeMo Gym integration with Axolotl.
When enabled, this plugin:
1. Clones and sets up the NeMo Gym repo (if needed)
2. Starts NeMo Gym resource servers
3. Loads datasets from NeMo Gym JSONL files
4. For single-turn: creates a reward function calling /verify
5. For multi-turn: creates a rollout_func with tool execution and env_mask
Methods
| Name | Description |
|---|---|
| get_training_args | Pass through vLLM settings and force async trainer for multi-turn. |
| post_train_unload | Cleanup NeMo Gym servers if we started them. |
| post_trainer_create | Wire NeMo Gym into the trainer (reward_fn or rollout_func). |
| pre_model_load | Probe vLLM weight-sync routes and conditionally bypass NCCL init. |
get_training_args
integrations.nemo_gym.plugin.NemoGymPlugin.get_training_args(cfg)
Pass through vLLM settings and force async trainer for multi-turn.
post_train_unload
integrations.nemo_gym.plugin.NemoGymPlugin.post_train_unload(cfg)
Clean up NeMo Gym servers if we started them.
post_trainer_create
integrations.nemo_gym.plugin.NemoGymPlugin.post_trainer_create(cfg, trainer)
Wire NeMo Gym into the trainer (reward_fn or rollout_func).
pre_model_load
integrations.nemo_gym.plugin.NemoGymPlugin.pre_model_load(cfg)
Probe vLLM weight-sync routes and conditionally bypass NCCL init.
Replaces the previous unconditional init_communicator monkey-patch
with a probe of the configured vLLM server’s /openapi.json. We only
bypass NCCL init when the server we’re talking to actually lacks the
/init_communicator/ route (i.e. stock vllm serve); against
TRL/axolotl serve modules that DO expose NCCL routes, we leave the
standard TRL flow alone so full-finetune training can sync weights.
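The bypass decision described above can be sketched as a small predicate over the routes discovered by the probe. The function name `should_bypass_nccl_init` and the `probed` flag handling are assumptions for illustration; in particular, leaving the standard flow alone when the probe failed is a guess, not documented behavior.

```python
from typing import Iterable


def should_bypass_nccl_init(routes: Iterable[str], probed: bool) -> bool:
    """Bypass only when a successful probe shows no /init_communicator/ route."""
    if not probed:
        # Assumption: without a successful probe, do not touch the TRL flow.
        return False
    # Compare without trailing slashes so both spellings of the route match.
    normalized = {route.rstrip("/") for route in routes}
    return "/init_communicator" not in normalized
```

With this shape, a stock `vllm serve` instance (no NCCL routes) yields `True`, while a serve module that exposes `/init_communicator/` yields `False`.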
VLLMWeightSyncCapabilities
integrations.nemo_gym.plugin.VLLMWeightSyncCapabilities(
    nccl=False,
    lora_filesystem=False,
    lora_axolotl=False,
    http_full=False,
    probed=False,
    probe_error=None,
    routes=list(),
)
What weight-sync routes a vLLM server actually exposes.
Discovered once at pre_model_load time by fetching the server’s
/openapi.json. Drives the transport selection table documented under
select_weight_sync_transport.
Attributes
| Name | Description |
|---|---|
| any_full_param_sync | True if at least one transport can push full-model weights. |
| any_lora_sync | True if at least one transport can push LoRA adapters. |
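A rough sketch of how such a capabilities record could look as a dataclass. The field names and defaults mirror the signature above; the class name `WeightSyncCapsSketch` is hypothetical, and the bodies of the two properties are assumptions about which flags count toward each attribute, not the plugin's actual logic.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WeightSyncCapsSketch:
    """Illustrative stand-in for VLLMWeightSyncCapabilities (not the real class)."""

    nccl: bool = False
    lora_filesystem: bool = False
    lora_axolotl: bool = False
    http_full: bool = False
    probed: bool = False
    probe_error: Optional[str] = None
    routes: List[str] = field(default_factory=list)

    @property
    def any_full_param_sync(self) -> bool:
        # Assumption: NCCL and full HTTP are the transports that carry full weights.
        return self.nccl or self.http_full

    @property
    def any_lora_sync(self) -> bool:
        # Assumption: either LoRA route makes the server LoRA-capable.
        return self.lora_filesystem or self.lora_axolotl
```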
Functions
| Name | Description |
|---|---|
| probe_vllm_weight_sync | Detect which weight-sync routes the configured vLLM server exposes. |
| select_weight_sync_transport | Pick the right transport for a (server caps, model type) combo. |
probe_vllm_weight_sync
integrations.nemo_gym.plugin.probe_vllm_weight_sync(base_url, timeout=5.0)
Detect which weight-sync routes the configured vLLM server exposes.
Uses the server’s FastAPI /openapi.json — every weight-sync transport
we care about is mounted as a POST route there. Falls back to all-False
on any error so the caller can still decide what to do (typically: raise
a clear error rather than silently no-op).
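The probe-and-fall-back pattern can be sketched with the standard library alone. The names `post_routes`, `fetch_openapi`, and `probe_sketch` are hypothetical, and returning a `(routes, error)` pair is an illustrative simplification of the capabilities object; the only facts taken from the description are the `/openapi.json` path, the POST-route check, and the all-empty result on any error.

```python
import json
import urllib.request
from typing import Callable, Optional, Set, Tuple


def post_routes(openapi: dict) -> Set[str]:
    """Collect the paths that declare a POST operation in an OpenAPI document."""
    paths = openapi.get("paths", {})
    return {path for path, ops in paths.items() if "post" in ops}


def fetch_openapi(base_url: str, timeout: float = 5.0) -> dict:
    """Download and decode <base_url>/openapi.json."""
    url = base_url.rstrip("/") + "/openapi.json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def probe_sketch(
    base_url: str,
    timeout: float = 5.0,
    fetch: Callable[[str, float], dict] = fetch_openapi,
) -> Tuple[Set[str], Optional[str]]:
    """Return (routes, error); on any failure routes is empty and error is set."""
    try:
        return post_routes(fetch(base_url, timeout)), None
    except Exception as exc:  # broad on purpose: any failure means "nothing found"
        return set(), f"{type(exc).__name__}: {exc}"
```

Passing `fetch` as a parameter keeps the sketch testable without a running server; a caller would typically treat a non-`None` error as a reason to raise a clear message instead of continuing silently.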
select_weight_sync_transport
integrations.nemo_gym.plugin.select_weight_sync_transport(
    caps,
    *,
    has_lora,
    vllm_lora_sync_pref,
)
Pick the right transport for a (server caps, model type) combo.
Returns one of: "lora_filesystem", "nccl", "http_full", or
"none". The caller decides what to do with "none" (typically:
raise an error explaining the misconfiguration).
Selection table
| Condition | Transport |
|---|---|
| LoRA model + lora endpoint + lora-sync pref | lora_filesystem |
| LoRA model + lora endpoint | lora_filesystem |
| LoRA model + nccl endpoint | nccl (broadcast merged adapter) |
| Full model + nccl endpoint | nccl |
| Full model + http endpoint | http_full |
| anything else | none |
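The selection table can be read as a first-match decision chain. The sketch below is one possible encoding, not the plugin's implementation: `Caps` is a hypothetical stand-in for the capabilities object, and mapping "lora endpoint" to the `lora_filesystem` flag is an assumption. As the table is written, the lora-sync preference does not change the outcome, so the parameter is accepted only to mirror the documented signature.

```python
from dataclasses import dataclass


@dataclass
class Caps:
    """Hypothetical minimal capabilities record used only for this sketch."""

    nccl: bool = False
    lora_filesystem: bool = False
    http_full: bool = False


def select_transport_sketch(caps: Caps, *, has_lora: bool, vllm_lora_sync_pref: bool) -> str:
    """Walk the selection table top to bottom and return the first match."""
    if has_lora:
        if caps.lora_filesystem:
            # Rows 1 and 2: same result with or without the lora-sync preference.
            return "lora_filesystem"
        if caps.nccl:
            # Row 3: broadcast the merged adapter over NCCL.
            return "nccl"
        return "none"
    if caps.nccl:
        return "nccl"
    if caps.http_full:
        return "http_full"
    return "none"
```

A caller receiving `"none"` would typically raise an error that names the missing routes, as the description above suggests.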