prompt_strategies.ebft.ebft_strided_structured
Dataset transform for structured (prompt, completion) data with strided EBFT.
Tokenizes the prompt and completion separately, concatenates them into a single input_ids sequence, and sets labels to -100 for the prompt tokens so that the strided trainer places anchors only within the completion span.
Works with datasets that have chat-style fields (e.g., nvidia/OpenCodeInstruct).
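
The masking scheme described above can be illustrated with a minimal sketch. It is not the module's actual implementation: `toy_tokenize` and `build_example` are hypothetical names, and the toy tokenizer (one id per whitespace-separated word) stands in for whatever real tokenizer is configured. Only the input_ids/labels layout and the -100 value come from the description above.

```python
from typing import Callable, Dict, List

IGNORE_INDEX = -100  # label value used to mark prompt tokens


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical stand-in for a real tokenizer: one id in 0..999 per word."""
    return [sum(ord(ch) for ch in word) % 1000 for word in text.split()]


def build_example(
    prompt: str,
    completion: str,
    tokenize: Callable[[str], List[int]] = toy_tokenize,
) -> Dict[str, List[int]]:
    """Tokenize prompt and completion separately, then concatenate them."""
    prompt_ids = tokenize(prompt)
    completion_ids = tokenize(completion)

    # One sequence holding the prompt followed by the completion.
    input_ids = prompt_ids + completion_ids

    # Prompt positions get -100; completion positions keep their token ids.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)

    return {"input_ids": input_ids, "labels": labels}


example = build_example(
    "Write a function that adds two numbers",
    "def add(a, b): return a + b",
)

# The completion span starts at the first label that is not -100.
completion_start = next(
    i for i, label in enumerate(example["labels"]) if label != IGNORE_INDEX
)
print(completion_start, len(example["input_ids"]))
```

A consumer of the transformed rows can recover the completion span from labels alone, as the last lines do, without needing the prompt length to be stored separately.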