prompt_strategies.ebft.ebft_strided_structured

Dataset transform for structured (prompt, completion) data with strided EBFT.

Tokenizes the prompt and completion separately, concatenates them into a single input_ids sequence, and sets labels to -100 for the prompt tokens. The strided trainer uses this mask to place anchors only within the completion span.
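
The masking scheme described above can be illustrated with a minimal sketch. This is not the module's actual implementation: `toy_tokenize` and `build_structured_example` are hypothetical names introduced here, and the toy tokenizer stands in for a real one so the example runs without extra dependencies.

```python
from typing import Callable, Dict, List

IGNORE_INDEX = -100  # label value that marks a position as "prompt, not completion"


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical stand-in tokenizer: one positive id per whitespace-separated word."""
    return [sum(ord(ch) for ch in word) % 1000 + 1 for word in text.split()]


def build_structured_example(
    prompt: str,
    completion: str,
    tokenize: Callable[[str], List[int]] = toy_tokenize,
) -> Dict[str, List[int]]:
    """Tokenize prompt and completion separately, then concatenate and mask."""
    prompt_ids = tokenize(prompt)
    completion_ids = tokenize(completion)

    # Single sequence: prompt tokens first, completion tokens after.
    input_ids = prompt_ids + completion_ids

    # Prompt positions get IGNORE_INDEX; completion positions keep their token ids.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)

    return {
        "input_ids": input_ids,
        "attention_mask": [1] * len(input_ids),
        "labels": labels,
    }


example = build_structured_example(
    prompt="Write a function that adds two numbers",
    completion="def add(a, b): return a + b",
)

# The completion span is the set of positions whose label is not IGNORE_INDEX.
completion_start = next(
    i for i, label in enumerate(example["labels"]) if label != IGNORE_INDEX
)
```

Tokenizing the two fields separately makes the prompt/completion boundary exact: `completion_start` always equals the number of prompt tokens, which would not be guaranteed if the joined string were tokenized in one pass and the boundary inferred afterwards.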

Works with datasets that have chat-style fields (e.g., nvidia/OpenCodeInstruct).