integrations.swanlab.callbacks
SwanLab callbacks for Axolotl trainers.
This module provides HuggingFace Trainer callbacks for logging RLHF completions to SwanLab.
Classes
| Name | Description |
|---|---|
| SwanLabRLHFCompletionCallback | Callback for logging RLHF completions to SwanLab. |
SwanLabRLHFCompletionCallback
integrations.swanlab.callbacks.SwanLabRLHFCompletionCallback(
log_interval=100,
max_completions=128,
table_name='rlhf_completions',
)
Callback for logging RLHF completions to SwanLab.
This callback periodically logs model completions (prompts, chosen/rejected responses, rewards) to SwanLab during RLHF training for qualitative analysis.
Supports DPO, KTO, ORPO, and GRPO trainers.
Example usage
callback = SwanLabRLHFCompletionCallback(
log_interval=100,  # Log every 100 steps
max_completions=128,  # Keep last 128 completions
)
trainer.add_callback(callback)
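To make the two parameters concrete, here is a minimal sketch of the buffering behaviour they imply. It assumes a bounded buffer that keeps only the most recent entries and is emptied whenever it is logged. `CompletionBuffer` and its methods are hypothetical names used for illustration, not the actual `CompletionLogger` API, and whether the real buffer is cleared after each log is an assumption.

```python
from collections import deque


class CompletionBuffer:
    """Hypothetical stand-in for the callback's internal buffering."""

    def __init__(self, log_interval=100, max_completions=128):
        self.log_interval = log_interval
        # A deque with maxlen silently drops the oldest entry when full.
        self.buffer = deque(maxlen=max_completions)

    def add(self, completion):
        self.buffer.append(completion)

    def should_log(self, step):
        # Log only on positive multiples of log_interval.
        return step > 0 and step % self.log_interval == 0

    def flush(self):
        """Return the buffered completions and clear the buffer (assumed)."""
        rows = list(self.buffer)
        self.buffer.clear()
        return rows


buf = CompletionBuffer(log_interval=100, max_completions=3)
for step in range(1, 6):
    buf.add({"step": step, "completion": f"text {step}"})

print(buf.should_log(100))  # True
print([row["step"] for row in buf.flush()])  # [3, 4, 5]
```

With `max_completions=3`, only the three most recent of the five buffered entries survive, which is the "keep last N" behaviour described in the example comments.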
Attributes
| Name | Type | Description |
|---|---|---|
| logger | CompletionLogger | CompletionLogger instance |
| log_interval | int | Number of steps between SwanLab logging |
| trainer_type | str \| None | Auto-detected trainer type (dpo/kto/orpo/grpo) |
Methods
| Name | Description |
|---|---|
| on_init_end | Detect trainer type on initialization. |
| on_log | Capture completions from logs and buffer them. |
| on_train_end | Log remaining completions at end of training. |
on_init_end
integrations.swanlab.callbacks.SwanLabRLHFCompletionCallback.on_init_end(
args,
state,
control,
**kwargs,
)
Detect trainer type on initialization.
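This page does not say how detection works or how the callback obtains the trainer. One plausible approach, shown purely as an assumption, is to match the trainer's class name against the supported types. `detect_trainer_type` is a hypothetical helper, not a function of this module, and takes the class name as a plain string.

```python
def detect_trainer_type(class_name):
    """Map a trainer class name to a supported type, or None (hypothetical)."""
    name = class_name.lower()
    # The four types listed as supported: DPO, KTO, ORPO, GRPO.
    for trainer_type in ("dpo", "kto", "orpo", "grpo"):
        if trainer_type in name:
            return trainer_type
    # Unknown trainers map to None, matching the `str | None` attribute type.
    return None


print(detect_trainer_type("DPOTrainer"))  # dpo
print(detect_trainer_type("Trainer"))     # None
```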
on_log
integrations.swanlab.callbacks.SwanLabRLHFCompletionCallback.on_log(
args,
state,
control,
logs=None,
**kwargs,
)
Capture completions from logs and buffer them.

Different trainers log completions in different formats:

- DPO: logs['dpo/chosen'], logs['dpo/rejected'], logs['dpo/reward_diff']
- KTO: logs['kto/completion'], logs['kto/label'], logs['kto/reward']
- ORPO: logs['orpo/chosen'], logs['orpo/rejected']
- GRPO: logs['grpo/completion'], logs['grpo/reward']

Note: This is a placeholder implementation. Actual log keys depend on the TRL trainer implementation. You may need to patch the trainers to expose completion data in logs.
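Under those placeholder key names, the capture step could be sketched as follows. The keys are copied from the list above and are not guaranteed to match what TRL actually emits. `COMPLETION_KEYS` and `extract_completion` are hypothetical helpers introduced for illustration, not part of this module.

```python
# Placeholder log keys per trainer type, copied from the list above.
COMPLETION_KEYS = {
    "dpo": ("dpo/chosen", "dpo/rejected", "dpo/reward_diff"),
    "kto": ("kto/completion", "kto/label", "kto/reward"),
    "orpo": ("orpo/chosen", "orpo/rejected"),
    "grpo": ("grpo/completion", "grpo/reward"),
}


def extract_completion(logs, trainer_type):
    """Return the completion fields found in logs, or None if there are none."""
    if not logs or trainer_type not in COMPLETION_KEYS:
        return None
    # Strip the "<trainer>/" prefix so every row has short column names.
    row = {
        key.split("/", 1)[1]: logs[key]
        for key in COMPLETION_KEYS[trainer_type]
        if key in logs
    }
    return row or None


logs = {"loss": 0.42, "dpo/chosen": "Yes.", "dpo/rejected": "No."}
print(extract_completion(logs, "dpo"))
# {'chosen': 'Yes.', 'rejected': 'No.'}
print(extract_completion({"loss": 0.42}, "dpo"))
# None
```

Returning `None` when no completion keys are present lets the caller skip ordinary metric-only log events without buffering empty rows.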
on_train_end
integrations.swanlab.callbacks.SwanLabRLHFCompletionCallback.on_train_end(
args,
state,
control,
**kwargs,
)
Log remaining completions at end of training.