telemetry.runtime_metrics

Telemetry utilities for runtime and memory metrics.

Classes

Name                   Description
RuntimeMetrics         Container for runtime metrics to be tracked throughout training.
RuntimeMetricsTracker  Tracker for runtime metrics during training.

RuntimeMetrics

telemetry.runtime_metrics.RuntimeMetrics(
    start_time,
    peak_cpu_memory=0,
    total_steps=0,
    current_epoch=0,
    current_step=0,
)

Container for runtime metrics to be tracked throughout training.

Attributes

Name          Description
elapsed_time  Calculate total elapsed time in seconds.

Methods

Name                Description
average_epoch_time  Calculate average time per epoch in seconds.
epoch_time          Calculate time taken for a specific epoch in seconds.
steps_per_second    Calculate average steps per second across all training.
to_dict             Convert metrics to a dictionary for telemetry reporting.

average_epoch_time
telemetry.runtime_metrics.RuntimeMetrics.average_epoch_time()

Calculate average time per epoch in seconds.

epoch_time
telemetry.runtime_metrics.RuntimeMetrics.epoch_time(epoch)

Calculate time taken for a specific epoch in seconds.

steps_per_second
telemetry.runtime_metrics.RuntimeMetrics.steps_per_second()

Calculate average steps per second across all training.

to_dict
telemetry.runtime_metrics.RuntimeMetrics.to_dict()

Convert metrics to a dictionary for telemetry reporting.
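
The reference above lists only signatures, so the following is a minimal sketch of how such a container could behave, not the library's implementation. `RuntimeMetricsSketch` is a hypothetical stand-in that reuses the documented constructor arguments. The `epoch_start_times` and `epoch_end_times` dictionaries, the `None` return for missing data, and the keys produced by `to_dict` are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RuntimeMetricsSketch:
    """Hypothetical stand-in that mirrors the documented constructor arguments."""

    start_time: float
    peak_cpu_memory: int = 0
    total_steps: int = 0
    current_epoch: int = 0
    current_step: int = 0
    # Assumed bookkeeping (not shown in the reference): per-epoch timestamps.
    epoch_start_times: Dict[int, float] = field(default_factory=dict)
    epoch_end_times: Dict[int, float] = field(default_factory=dict)

    @property
    def elapsed_time(self) -> float:
        """Seconds elapsed since start_time."""
        return time.time() - self.start_time

    def epoch_time(self, epoch: int) -> Optional[float]:
        """Duration of one epoch, or None if it has not both started and ended."""
        if epoch in self.epoch_start_times and epoch in self.epoch_end_times:
            return self.epoch_end_times[epoch] - self.epoch_start_times[epoch]
        return None

    def average_epoch_time(self) -> Optional[float]:
        """Mean duration over completed epochs, or None if there are none."""
        durations = [self.epoch_time(e) for e in self.epoch_start_times]
        durations = [d for d in durations if d is not None]
        return sum(durations) / len(durations) if durations else None

    def steps_per_second(self) -> Optional[float]:
        """Total steps divided by elapsed time, or None if undefined."""
        elapsed = self.elapsed_time
        if self.total_steps == 0 or elapsed <= 0:
            return None
        return self.total_steps / elapsed

    def to_dict(self) -> dict:
        """Flatten the metrics into a plain dict (key names are assumptions)."""
        return {
            "total_time_seconds": self.elapsed_time,
            "total_steps": self.total_steps,
            "steps_per_second": self.steps_per_second(),
            "average_epoch_time_seconds": self.average_epoch_time(),
            "peak_cpu_memory_bytes": self.peak_cpu_memory,
        }


# Pretend training started 10 seconds ago and ran 50 steps over two epochs.
metrics = RuntimeMetricsSketch(start_time=time.time() - 10, total_steps=50)
metrics.epoch_start_times = {0: 100.0, 1: 104.0}
metrics.epoch_end_times = {0: 104.0, 1: 110.0}

print(metrics.epoch_time(0))          # 4.0
print(metrics.average_epoch_time())   # 5.0
print(sorted(metrics.to_dict()))
```

Returning `None` rather than raising for an unknown or unfinished epoch keeps telemetry reporting from interrupting training; the real class may make a different choice.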

RuntimeMetricsTracker

telemetry.runtime_metrics.RuntimeMetricsTracker()

Tracker for runtime metrics during training.

Methods

Name                   Description
end_epoch              Record the end of an epoch.
get_memory_metrics     Get the current memory metrics as a dictionary.
start_epoch            Record the start of a new epoch.
update_memory_metrics  Update peak memory usage metrics.
update_step            Update the current step count.

end_epoch
telemetry.runtime_metrics.RuntimeMetricsTracker.end_epoch(epoch)

Record the end of an epoch.

get_memory_metrics
telemetry.runtime_metrics.RuntimeMetricsTracker.get_memory_metrics()

Get the current memory metrics as a dictionary.

start_epoch
telemetry.runtime_metrics.RuntimeMetricsTracker.start_epoch(epoch)

Record the start of a new epoch.

update_memory_metrics
telemetry.runtime_metrics.RuntimeMetricsTracker.update_memory_metrics()

Update peak memory usage metrics.

update_step
telemetry.runtime_metrics.RuntimeMetricsTracker.update_step(step)

Update the current step count.
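
To show where these calls would typically sit in a training loop, here is a rough sketch using a hypothetical `TrackerSketch` class that exposes the same five method names. It is not the library's tracker: the internal attributes, the choice to count one step per `update_step` call, the `peak_memory_bytes` key, and the use of the standard-library `tracemalloc` module as a stand-in for real process memory measurement are all assumptions.

```python
import time
import tracemalloc


class TrackerSketch:
    """Hypothetical tracker with the documented method names."""

    def __init__(self):
        self.start_time = time.time()
        self.epoch_start_times = {}
        self.epoch_end_times = {}
        self.current_epoch = 0
        self.current_step = 0
        self.total_steps = 0
        self.peak_memory_bytes = 0

    def start_epoch(self, epoch):
        """Remember which epoch is running and when it began."""
        self.current_epoch = epoch
        self.epoch_start_times[epoch] = time.time()

    def end_epoch(self, epoch):
        """Remember when the given epoch finished."""
        self.epoch_end_times[epoch] = time.time()

    def update_step(self, step):
        """Store the latest step index and count one more completed step."""
        self.current_step = step
        self.total_steps += 1  # assumption: one call per training step

    def update_memory_metrics(self):
        """Keep the highest peak seen so far (tracemalloc is a stand-in)."""
        _, peak = tracemalloc.get_traced_memory()
        self.peak_memory_bytes = max(self.peak_memory_bytes, peak)

    def get_memory_metrics(self):
        """Return the memory metrics as a plain dict (key name is assumed)."""
        return {"peak_memory_bytes": self.peak_memory_bytes}


tracemalloc.start()
tracker = TrackerSketch()
steps_per_epoch = 3

for epoch in range(2):
    tracker.start_epoch(epoch)
    for i in range(steps_per_epoch):
        _batch = [0] * 10_000  # placeholder for the real training step
        tracker.update_step(epoch * steps_per_epoch + i)
    tracker.update_memory_metrics()  # sample memory once per epoch
    tracker.end_epoch(epoch)

memory = tracker.get_memory_metrics()
tracemalloc.stop()

print(tracker.total_steps)  # 6
print(memory)
```

Sampling memory once per epoch keeps overhead low in this sketch; sampling more often would catch short-lived peaks at some cost per step.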