prompt_strategies.ebft.ebft_strided_chat

prompt_strategies.ebft.ebft_strided_chat

Dataset transform for multi-turn chat data with strided EBFT.

Tokenizes conversations using the model’s chat template, producing input_ids with labels=-100 for system/user turns and real labels for assistant turns. The strided trainer places anchors only within assistant completion spans.

Works with datasets in OpenAI chat format

[{“role”: “user”, “content”: “…”}, {“role”: “assistant”, “content”: “…”}]