prompt_strategies.ebft.ebft_chat_multiturn

prompt_strategies.ebft.ebft_chat_multiturn

Dataset transform for multi-turn chat data with structured EBFT (vLLM mode).

Three variants:

transform — Uses the FIRST assistant turn as the generation target. Passes remaining turns as remaining_turns for sequential rollout. The trainer generates turn 1 via GRPO/vLLM, then sequentially generates subsequent assistant turns, comparing the full conversation to GT.
transform_last_turn — Uses the LAST assistant turn as the target. Simplest approach: the full conversation history is the prompt.
transform_all_turns — Explodes each conversation into N examples (one per assistant turn). Each turn is an independent training example. Use with batched=True.

Supports OpenAI chat format

{“messages”: [{“role”: …, “content”: …}, …]}

Name	Description
transform	Multi-turn with sequential rollout.
transform_all_turns	Explode: one example per assistant turn.
transform_last_turn	Single-turn: use the last assistant turn as the generation target.

prompt_strategies.ebft.ebft_chat_multiturn.transform(cfg, **kwargs)

Multi-turn with sequential rollout.

Returns the first assistant turn as ground_truth, plus remaining_turns for the trainer to do sequential rollout generation.

prompt_strategies.ebft.ebft_chat_multiturn.transform_all_turns(cfg, **kwargs)

Explode: one example per assistant turn.

Use with datasets.map(batched=True) to produce N examples from each N-turn conversation.

type: ebft_chat_multiturn.transform_all_turns

prompt_strategies.ebft.ebft_chat_multiturn.transform_last_turn(cfg, **kwargs)

Single-turn: use the last assistant turn as the generation target.