monkeypatch.accelerate.parallelism_config
ParallelismConfig monkeypatch.
Two extensions:
- Allow pure CP standalone via ACCELERATE_ALLOW_CP_STANDALONE.
- Add Expert Parallel (ep) as a first-class mesh axis inside the
data-parallel group. Mesh order is (ep, dp_replicate, dp_shard, cp, sp, tp)
so the dp axes stay contiguous (required for _flatten("dp")).
See expert_parallel/README.md for the full integration story.
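The contiguity requirement on the dp axes can be illustrated with a small pure-Python sketch. It is not the patched ParallelismConfig code. The helpers `active_mesh` and `is_contiguous` are hypothetical, and the sketch assumes that only axes with size greater than 1 end up in the mesh.

```python
# Hypothetical illustration only; not the patched ParallelismConfig code.
MESH_ORDER = ("ep", "dp_replicate", "dp_shard", "cp", "sp", "tp")


def active_mesh(sizes):
    """Return (names, shape) for axes with size > 1, kept in mesh order.

    Assumption: axes of size 1 are dropped from the mesh.
    """
    names = tuple(n for n in MESH_ORDER if sizes.get(n, 1) > 1)
    shape = tuple(sizes[n] for n in names)
    return names, shape


def is_contiguous(names, group):
    """True if the members of `group` present in `names` are adjacent."""
    idx = [i for i, n in enumerate(names) if n in group]
    return not idx or idx == list(range(idx[0], idx[-1] + 1))


names, shape = active_mesh({"ep": 2, "dp_replicate": 2, "dp_shard": 4, "tp": 2})
dp_ok = is_contiguous(names, {"dp_replicate", "dp_shard"})

# Placing ep between the two dp axes would break contiguity.
bad_names = ("dp_replicate", "ep", "dp_shard", "tp")
bad_ok = is_contiguous(bad_names, {"dp_replicate", "dp_shard"})
```

With ep placed first, dp_replicate and dp_shard remain neighbours, so they can be flattened into a single "dp" axis. Interleaving ep between them would not allow that.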
Functions
| Name | Description |
|---|---|
| patch_clip_grad_norm_for_ep | Replace Accelerator.clip_grad_norm_ with the EP-aware version when the active parallelism includes both ep and dp_shard. |
| patch_prepare_data_loader_for_ep | Apply the EP-aware data-loader patch. |
| patched_is_fsdp2 | Patched version of is_fsdp2 that guards against a None fsdp_plugin. |
patch_clip_grad_norm_for_ep
monkeypatch.accelerate.parallelism_config.patch_clip_grad_norm_for_ep()
Replace Accelerator.clip_grad_norm_ with the EP-aware version when
the active parallelism includes both ep and dp_shard (i.e., the
FSDP+EP composition produces multi-mesh DTensor grads).
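A minimal sketch of the conditional patching pattern is shown here, using a stand-in class rather than the real accelerate.Accelerator. The helper `patch_clip_if_needed`, its `active_axes` parameter, and the placeholder return values are hypothetical. The actual function takes no arguments and presumably reads the active parallelism from elsewhere.

```python
class Accelerator:  # hypothetical stand-in, not accelerate.Accelerator
    def clip_grad_norm_(self, parameters, max_norm):
        return "stock"


def ep_aware_clip_grad_norm_(self, parameters, max_norm):
    # Placeholder body: a real version would have to handle grads that
    # live on more than one device mesh.
    return "ep-aware"


def patch_clip_if_needed(active_axes):
    """Swap in the EP-aware method only when ep and dp_shard are both active."""
    if {"ep", "dp_shard"} <= set(active_axes):
        Accelerator.clip_grad_norm_ = ep_aware_clip_grad_norm_
        return True
    return False


# Only dp_shard is active: the stock method stays in place.
patched_fsdp_only = patch_clip_if_needed(("dp_shard",))
before = Accelerator().clip_grad_norm_([], 1.0)

# Both ep and dp_shard are active: the method is replaced.
patched_fsdp_ep = patch_clip_if_needed(("ep", "dp_shard"))
after = Accelerator().clip_grad_norm_([], 1.0)
```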
patch_prepare_data_loader_for_ep
monkeypatch.accelerate.parallelism_config.patch_prepare_data_loader_for_ep()
Apply the EP-aware data-loader patch.
Idempotent: replacing the bound function more than once is harmless because
the wrapper closes over the current prepare_data_loader.
patched_is_fsdp2
monkeypatch.accelerate.parallelism_config.patched_is_fsdp2(self)
Patched version of is_fsdp2 that guards against a None fsdp_plugin.
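A minimal sketch of such a guard is shown here on a stand-in state class. The `State` class, the string value "FSDP", and the choice to return False when the plugin is missing are assumptions for illustration. The actual patch may differ.

```python
import types


class State:  # hypothetical stand-in for the accelerate state object
    def __init__(self, distributed_type, fsdp_plugin=None):
        self.distributed_type = distributed_type
        self.fsdp_plugin = fsdp_plugin


def guarded_is_fsdp2(self):
    """Return False instead of raising AttributeError when fsdp_plugin is None."""
    plugin = getattr(self, "fsdp_plugin", None)
    if plugin is None:
        return False
    return (
        self.distributed_type == "FSDP"
        and getattr(plugin, "fsdp_version", 1) == 2
    )


no_plugin = guarded_is_fsdp2(State("FSDP"))
v1 = guarded_is_fsdp2(State("FSDP", types.SimpleNamespace(fsdp_version=1)))
v2 = guarded_is_fsdp2(State("FSDP", types.SimpleNamespace(fsdp_version=2)))
```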