utils.callbacks.dynamic_checkpoint
Classes
| Name | Description |
|---|---|
| DynamicCheckpointCallback | Callback to save checkpoints on demand during training via a file-based trigger. |
DynamicCheckpointCallback
utils.callbacks.dynamic_checkpoint.DynamicCheckpointCallback(cfg)
Callback to save checkpoints on demand during training via a file-based trigger. The trigger works in any environment; only rank 0 checks for the file.
Thread-safe for multi-GPU distributed training.
Usage
File-based:
touch /path/to/output_dir/axolotl_checkpoint.save
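The same trigger can be created from a script instead of the shell. The sketch below is illustrative and not part of the library: `request_checkpoint` is a hypothetical helper, and only the file name `axolotl_checkpoint.save` is taken from the usage line above.

```python
from pathlib import Path

# File name taken from the usage line above.
TRIGGER_FILENAME = "axolotl_checkpoint.save"


def request_checkpoint(output_dir):
    """Create the trigger file in output_dir (hypothetical helper).

    Equivalent to running `touch <output_dir>/axolotl_checkpoint.save`.
    """
    trigger = Path(output_dir) / TRIGGER_FILENAME
    trigger.touch()  # creates an empty file, or updates its mtime if present
    return trigger
```

Calling `request_checkpoint("/path/to/output_dir")` while training is running should have the same effect as the `touch` command.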
Methods
| Name | Description |
|---|---|
| on_step_end | Check for checkpoint triggers at the end of each step. |
on_step_end
utils.callbacks.dynamic_checkpoint.DynamicCheckpointCallback.on_step_end(
args,
state,
control,
**_kwargs,
)
Check for checkpoint triggers at the end of each step. Only rank 0 checks the file; all ranks then synchronize.
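The "rank 0 checks, then all ranks synchronize" pattern can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `should_save_checkpoint` and its `broadcast` parameter are hypothetical names, deleting the trigger file after it is seen is an assumption, and the default `broadcast` is an identity function standing in for a real collective operation such as `torch.distributed.broadcast`.

```python
from pathlib import Path

# File name taken from the Usage section.
TRIGGER_FILENAME = "axolotl_checkpoint.save"


def should_save_checkpoint(output_dir, rank, broadcast=lambda flag: flag):
    """Decide whether a checkpoint was requested (hypothetical sketch).

    Only rank 0 touches the filesystem. `broadcast` is a stand-in for a
    collective that hands rank 0's decision to every rank; the identity
    default is only meaningful in a single process.
    """
    triggered = False
    if rank == 0:
        trigger = Path(output_dir) / TRIGGER_FILENAME
        if trigger.exists():
            # Assumption: consume the trigger so one `touch` yields one save.
            trigger.unlink()
            triggered = True
    # Every rank calls broadcast so that all ranks agree on the result.
    return broadcast(triggered)
```

In a Hugging Face `TrainerCallback`, a result like this would typically be used to set `control.should_save = True` on every rank, so that all processes enter the save path together.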