integrations.nemo_gym.plugin

NeMo Gym Plugin for Axolotl.

Integrates NVIDIA NeMo Gym environments as reward sources for GRPO training. Handles server lifecycle, dataset loading, and reward function wiring.

Supports two modes:

Single-turn (default): reward_fn calls /verify after each generation
Multi-turn (nemo_gym_multi_turn: true): rollout_func orchestrates multi-step interactions with tool execution via resource servers

Classes

Name	Description
NemoGymPlugin	Plugin for NVIDIA NeMo Gym integration with Axolotl.
VLLMWeightSyncCapabilities	What weight-sync routes a vLLM server actually exposes.

NemoGymPlugin

integrations.nemo_gym.plugin.NemoGymPlugin()

Plugin for NVIDIA NeMo Gym integration with Axolotl.

When enabled, this plugin:

1. Clones and sets up the NeMo Gym repo (if needed)
2. Starts NeMo Gym resource servers
3. Loads datasets from NeMo Gym JSONL files
4. For single-turn: creates a reward function calling /verify
5. For multi-turn: creates a rollout_func with tool execution and env_mask

Methods

Name	Description
get_training_args	Pass through vLLM settings and force async trainer for multi-turn.
post_train_unload	Cleanup NeMo Gym servers if we started them.
post_trainer_create	Wire NeMo Gym into the trainer (reward_fn or rollout_func).
pre_model_load	Probe vLLM weight-sync routes and conditionally bypass NCCL init.

get_training_args

integrations.nemo_gym.plugin.NemoGymPlugin.get_training_args(cfg)

Pass through vLLM settings and force async trainer for multi-turn.

post_train_unload

integrations.nemo_gym.plugin.NemoGymPlugin.post_train_unload(cfg)

Cleanup NeMo Gym servers if we started them.

post_trainer_create

integrations.nemo_gym.plugin.NemoGymPlugin.post_trainer_create(cfg, trainer)

Wire NeMo Gym into the trainer (reward_fn or rollout_func).

pre_model_load

integrations.nemo_gym.plugin.NemoGymPlugin.pre_model_load(cfg)

Probe vLLM weight-sync routes and conditionally bypass NCCL init.

Replaces the previous unconditional init_communicator monkey-patch with a probe of the configured vLLM server’s /openapi.json. NCCL init is bypassed only when that server lacks the /init_communicator/ route (i.e. stock vllm serve). Against TRL/axolotl serve modules that do expose NCCL routes, the standard TRL flow is left alone so full-finetune training can sync weights.
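The bypass decision described above can be sketched as a check on the parsed OpenAPI document. This is a minimal illustration, not the plugin's code: should_bypass_nccl_init is a hypothetical helper, and it assumes the standard OpenAPI layout in which route paths are the keys of the top-level paths object.

```python
def should_bypass_nccl_init(openapi_doc: dict) -> bool:
    """Return True when the server does not expose /init_communicator/.

    Hypothetical helper for illustration; the real plugin may decide differently.
    """
    paths = openapi_doc.get("paths", {})
    return "/init_communicator/" not in paths


# Stock `vllm serve`: no NCCL route, so NCCL init would be bypassed.
stock_server = {"paths": {"/v1/completions": {"post": {}}}}
# A serve module that exposes the NCCL route: leave the standard TRL flow alone.
nccl_server = {"paths": {"/init_communicator/": {"post": {}}}}

bypass_stock = should_bypass_nccl_init(stock_server)
bypass_nccl = should_bypass_nccl_init(nccl_server)
```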

VLLMWeightSyncCapabilities

integrations.nemo_gym.plugin.VLLMWeightSyncCapabilities(
    nccl=False,
    lora_filesystem=False,
    lora_axolotl=False,
    http_full=False,
    probed=False,
    probe_error=None,
    routes=list(),
)

What weight-sync routes a vLLM server actually exposes.

Discovered once at pre_model_load time by fetching the server’s /openapi.json. Drives the transport-selection table documented under select_weight_sync_transport.

Attributes

Name	Description
any_full_param_sync	True if at least one transport can push full-model weights.
any_lora_sync	True if at least one transport can push LoRA adapters.
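As a rough illustration of how these two attributes could be derived from the constructor fields, here is a small dataclass sketch. CapsSketch is a hypothetical stand-in, and the mapping of fields to each property is an assumption inferred from the field names, not confirmed by the source.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CapsSketch:
    """Hypothetical stand-in mirroring the documented constructor fields."""

    nccl: bool = False
    lora_filesystem: bool = False
    lora_axolotl: bool = False
    http_full: bool = False
    probed: bool = False
    probe_error: Optional[str] = None
    routes: list = field(default_factory=list)

    @property
    def any_full_param_sync(self) -> bool:
        # Assumption: NCCL and full HTTP are the full-weight transports.
        return self.nccl or self.http_full

    @property
    def any_lora_sync(self) -> bool:
        # Assumption: either LoRA route counts as a LoRA transport.
        return self.lora_filesystem or self.lora_axolotl


# A server that only exposes a full-weight HTTP route.
caps = CapsSketch(http_full=True, probed=True)
```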

Functions

Name	Description
probe_vllm_weight_sync	Detect which weight-sync routes the configured vLLM server exposes.
select_weight_sync_transport	Pick the right transport for a (server caps, model type) combo.

probe_vllm_weight_sync

integrations.nemo_gym.plugin.probe_vllm_weight_sync(base_url, timeout=5.0)

Detect which weight-sync routes the configured vLLM server exposes.

Uses the server’s FastAPI /openapi.json, where every weight-sync transport we care about is mounted as a POST route. On any error, all capability flags fall back to False so the caller can still decide what to do (typically: raise a clear error rather than silently no-op).
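The fetch-and-fall-back behaviour can be sketched as follows. This is not the plugin's implementation: fetch_openapi and list_post_routes are hypothetical names, the injectable fetch parameter exists only so the sketch can be exercised without a running server, and returning an error string alongside an empty route list is an assumed way of recording the failure.

```python
import json
import urllib.request


def fetch_openapi(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch and parse <base_url>/openapi.json."""
    url = base_url.rstrip("/") + "/openapi.json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def list_post_routes(base_url, timeout=5.0, fetch=fetch_openapi):
    """Return (post_routes, error); error is None on success."""
    try:
        doc = fetch(base_url, timeout)
        paths = doc.get("paths", {})
        routes = sorted(p for p, ops in paths.items() if "post" in ops)
        return routes, None
    except Exception as exc:  # any failure: report no routes as available
        return [], f"{type(exc).__name__}: {exc}"


# Stand-in fetchers so the example runs without a server.
def fake_fetch(base_url, timeout):
    return {"paths": {"/init_communicator/": {"post": {}}, "/health": {"get": {}}}}


def failing_fetch(base_url, timeout):
    raise ConnectionError("server not reachable")


ok_routes, ok_error = list_post_routes("http://localhost:8000", fetch=fake_fetch)
bad_routes, bad_error = list_post_routes("http://localhost:8000", fetch=failing_fetch)
```

With the failing fetcher the caller gets an empty route list plus an error message, which leaves it free to raise a clear error instead of silently skipping weight sync.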

select_weight_sync_transport

integrations.nemo_gym.plugin.select_weight_sync_transport(
    caps,
    *,
    has_lora,
    vllm_lora_sync_pref,
)

Pick the right transport for a (server caps, model type) combo.

Returns one of: "lora_filesystem", "nccl", "http_full", or "none". The caller decides what to do with "none" (typically: raise an error explaining the misconfiguration).

Selection table

LoRA model + lora endpoint + lora-sync pref → lora_filesystem
LoRA model + lora endpoint → lora_filesystem
LoRA model + nccl endpoint → nccl (broadcast merged adapter)
Full model + nccl endpoint → nccl
Full model + http endpoint → http_full
anything else → none
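The selection table above can be read as a short decision function. The sketch below follows the table row by row; select_transport_sketch is a hypothetical name, and treating the lora_filesystem capability flag as the "lora endpoint" is an assumption (the real function may also consider lora_axolotl).

```python
from types import SimpleNamespace


def select_transport_sketch(caps, *, has_lora, vllm_lora_sync_pref):
    """Row-by-row reading of the selection table (illustrative only)."""
    if has_lora:
        if caps.lora_filesystem:
            # Both LoRA rows with a LoRA endpoint give the same answer,
            # so vllm_lora_sync_pref does not change the result here.
            return "lora_filesystem"
        if caps.nccl:
            return "nccl"  # broadcast the merged adapter
        return "none"
    if caps.nccl:
        return "nccl"
    if caps.http_full:
        return "http_full"
    return "none"


# Stand-in capability objects; only the flags the sketch reads are set.
stock = SimpleNamespace(lora_filesystem=False, nccl=False, http_full=False)
nccl_only = SimpleNamespace(lora_filesystem=False, nccl=True, http_full=False)
lora_and_nccl = SimpleNamespace(lora_filesystem=True, nccl=True, http_full=False)

lora_choice = select_transport_sketch(
    lora_and_nccl, has_lora=True, vllm_lora_sync_pref=True
)
full_choice = select_transport_sketch(
    nccl_only, has_lora=False, vllm_lora_sync_pref=False
)
no_choice = select_transport_sketch(
    stock, has_lora=False, vllm_lora_sync_pref=False
)
```

A caller would typically treat the "none" result as a misconfiguration and raise an error that names the missing routes.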