integrations.kernels.autotune_callback
integrations.kernels.autotune_callback
Trainer callback for reporting Triton autotune results from scattermoe-lora kernels.
Classes
| Name | Description |
|---|---|
| AutotuneReportCallback | Reports Triton kernel autotune selections via telemetry. |
AutotuneReportCallback
integrations.kernels.autotune_callback.AutotuneReportCallback()Reports Triton kernel autotune selections via telemetry.
Fires once after the first training step completes (step 1), at
which point the forward and backward passes have both run and the
autotuned kernels have populated their caches. If for some reason
the caches are still empty (e.g. the kernel was never invoked), the
callback retries on subsequent steps up to _MAX_POLL_STEP and
then stops polling.
After reporting (or giving up) every subsequent on_step_end
call short-circuits on the _reported flag — zero hot-path cost.