utils.callbacks.opentelemetry
OpenTelemetry metrics callback for Axolotl training
Classes
| Name | Description |
|---|---|
| OpenTelemetryMetricsCallback | TrainerCallback that exports training metrics to OpenTelemetry/Prometheus. |
OpenTelemetryMetricsCallback
utils.callbacks.opentelemetry.OpenTelemetryMetricsCallback(cfg)

TrainerCallback that exports training metrics to OpenTelemetry/Prometheus.
This callback automatically tracks key training metrics, including:

- Training loss
- Evaluation loss
- Learning rate
- Epoch progress
- Global step count
- Gradient norm

Metrics are exposed via an HTTP endpoint for Prometheus scraping.
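To illustrate what "exposed via an HTTP endpoint for Prometheus scraping" can look like on the wire, here is a minimal standard-library sketch. It is not the callback's implementation: the `METRICS` store, the `render_prometheus` and `start_server` helpers, the `/metrics` path, and the metric names are all assumptions made for illustration, and the real callback presumably relies on the OpenTelemetry SDK instead.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory store of the latest metric values (illustrative names).
METRICS = {"demo_train_loss": 0.0}


def render_prometheus(metrics):
    """Render a name -> number mapping in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {float(value)}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the current contents of METRICS at /metrics (assumed path)."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_prometheus(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging so training output stays readable.
        pass


def start_server(port=0):
    """Start the endpoint on a background thread; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In this sketch, a training loop would update `METRICS` in place and a Prometheus scrape job would poll the chosen port. Call `server.shutdown()` when training ends.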
Methods
| Name | Description |
|---|---|
| on_evaluate | Called after evaluation |
| on_log | Called when logging occurs |
| on_step_end | Called at the end of each training step |
| on_train_begin | Called at the beginning of training |
| on_train_end | Called at the end of training |
on_evaluate
utils.callbacks.opentelemetry.OpenTelemetryMetricsCallback.on_evaluate(
    args,
    state,
    control,
    metrics=None,
    **kwargs,
)

Called after evaluation
on_log
utils.callbacks.opentelemetry.OpenTelemetryMetricsCallback.on_log(
    args,
    state,
    control,
    logs=None,
    **kwargs,
)

Called when logging occurs
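As a rough sketch of what a handler like `on_log` might do with its `logs` argument, the helper below picks out the numeric entries that correspond to the metrics this callback is documented to track. The key names (`loss`, `eval_loss`, `learning_rate`, `epoch`, `grad_norm`) are the ones the Hugging Face Trainer commonly logs. The helper name `extract_tracked_metrics` and the decision to skip non-finite values are assumptions, not confirmed behavior of the callback.

```python
import math

# Keys commonly present in Hugging Face Trainer log dictionaries.
TRACKED_KEYS = ("loss", "eval_loss", "learning_rate", "epoch", "grad_norm")


def extract_tracked_metrics(logs):
    """Return the finite numeric values among TRACKED_KEYS (hypothetical helper)."""
    if not logs:
        # `logs` defaults to None in the hook signature, so tolerate that.
        return {}
    tracked = {}
    for key in TRACKED_KEYS:
        value = logs.get(key)
        # bool is a subclass of int; exclude it along with non-numeric values.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        # Assumption: NaN or inf readings are dropped rather than exported.
        if math.isfinite(value):
            tracked[key] = float(value)
    return tracked
```

For example, `extract_tracked_metrics({"loss": 0.5, "note": "warmup"})` keeps only the `loss` entry, which could then be written to a gauge.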
on_step_end
utils.callbacks.opentelemetry.OpenTelemetryMetricsCallback.on_step_end(
    args,
    state,
    control,
    **kwargs,
)

Called at the end of each training step
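The step hook receives no `logs` dictionary, so progress values such as the global step count and epoch would most plausibly come from the `state` argument. The sketch below assumes this and reads the `global_step` and `epoch` attributes, which exist on the Hugging Face `TrainerState`. `FakeState` is a stand-in defined here so the example runs without `transformers`, and `progress_from_state` is a hypothetical helper, not part of the callback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeState:
    """Stand-in for TrainerState, carrying only the two fields used below."""

    global_step: int = 0
    epoch: Optional[float] = None


def progress_from_state(state):
    """Collect step and epoch progress from a trainer-state-like object."""
    progress = {"global_step": int(state.global_step)}
    # `epoch` can be None before training starts, so only report it when set.
    if state.epoch is not None:
        progress["epoch"] = float(state.epoch)
    return progress
```

A step hook built this way would call `progress_from_state(state)` and forward the resulting values to its metric instruments.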
on_train_begin
utils.callbacks.opentelemetry.OpenTelemetryMetricsCallback.on_train_begin(
    args,
    state,
    control,
    **kwargs,
)

Called at the beginning of training
on_train_end
utils.callbacks.opentelemetry.OpenTelemetryMetricsCallback.on_train_end(
    args,
    state,
    control,
    **kwargs,
)

Called at the end of training