prompt_strategies.ebft.ebft_reasoning

Dataset transform for reasoning/thinking datasets with EBFT.

Handles datasets where assistant responses contain reasoning traces (e.g., TeichAI/Claude-Opus-4.6-Reasoning, Qwen3.5 thinking mode outputs).

Three variants:

  1. transform — For structured EBFT (vLLM mode): Returns prompt + ground_truth with thinking tags preserved. Feature matching compares full responses (thinking + answer).

  2. transform_answer_only — For structured EBFT (vLLM mode): Strips <think>…</think> blocks from ground_truth, so feature matching only scores the final answer portion. Use when reasoning chains can vary but the answer should match.

  3. transform_strided — For strided EBFT: Tokenizes the full conversation with thinking traces. Optionally masks thinking tokens from CE loss (labels=-100 for think spans) while still placing anchors in thinking regions for feature matching.

All variants work with the OpenAI chat format:

{"messages": [{"role": "…", "content": "Answer"}]}

Functions

Name Description
transform Full response including thinking traces for feature matching.
transform_answer_only Strip thinking from ground_truth — match features on answer only.
transform_split_thinking Split <think> tags into reasoning_content field for native chat template handling.
transform_strided For strided EBFT: tokenize with thinking, optionally mask think tokens from CE loss.

transform

prompt_strategies.ebft.ebft_reasoning.transform(cfg, **kwargs)

Full response including thinking traces for feature matching.

For datasets where the assistant message has <think> tags in the content field. The ground_truth includes the full content (thinking + answer).

transform_answer_only

prompt_strategies.ebft.ebft_reasoning.transform_answer_only(cfg, **kwargs)

Strip thinking from ground_truth — match features on answer only.
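
The stripping step can be sketched roughly as follows. This is an illustration rather than the module's actual code: it assumes thinking is delimited by <think>…</think>, that the last message is the assistant response, and the helper names strip_thinking and to_answer_only are hypothetical.

```python
import re

# Assumption: thinking is wrapped in <think>...</think>; other delimiters are not handled here.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_thinking(text: str) -> str:
    """Remove every <think>...</think> block plus the whitespace that follows it."""
    return THINK_BLOCK.sub("", text).strip()


def to_answer_only(example: dict) -> dict:
    """Hypothetical helper: earlier messages form the prompt, the final
    assistant message (minus thinking) becomes ground_truth."""
    messages = example["messages"]
    return {
        "prompt": messages[:-1],
        "ground_truth": strip_thinking(messages[-1]["content"]),
    }
```

With this shape, two responses that reason differently but end in the same answer produce the same ground_truth string.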

transform_split_thinking

prompt_strategies.ebft.ebft_reasoning.transform_split_thinking(cfg, **kwargs)

Split <think> tags into reasoning_content field for native chat template handling.

For datasets where thinking is embedded in the content field as <think>…</think>. Splits it into separate reasoning_content and content fields so the model's chat template can format it natively (e.g., Qwen3.5's reasoning_content support).

The prompt messages are passed through with reasoning_content properly split, so vLLM generation with enable_thinking=true produces comparable outputs. The ground_truth is the full assistant response (thinking + answer) for feature matching.

Also works for other tag formats, such as <|begin_of_thought|>…<|end_of_thought|> tags.
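
A minimal sketch of the split, assuming <think>…</think> delimiters and a single thinking block per message. The function name split_thinking is hypothetical and the real transform may differ in details such as whitespace handling.

```python
import re

# Assumption: one <think>...</think> block per assistant message.
THINK_SPLIT = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(message: dict) -> dict:
    """Move embedded thinking out of content into a reasoning_content field."""
    content = message["content"]
    match = THINK_SPLIT.search(content)
    if match is None:
        # No thinking block: return an unchanged copy.
        return dict(message)
    reasoning = match.group(1).strip()
    answer = (content[: match.start()] + content[match.end():]).strip()
    return {**message, "reasoning_content": reasoning, "content": answer}
```

A chat template that understands reasoning_content can then render the thinking in whatever native format the model expects, instead of seeing raw tags inside content.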

transform_strided

prompt_strategies.ebft.ebft_reasoning.transform_strided(cfg, **kwargs)

For strided EBFT: tokenize with thinking, optionally mask think tokens from CE loss.

Config options (via cfg):

- ebft.mask_thinking_ce: bool (default False). If True, set labels=-100 for tokens inside <think>…</think> blocks. Feature matching still uses these positions (anchors are placed everywhere in the completion span). Only the CE auxiliary loss is affected.
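
The masking behaviour can be illustrated with a small sketch. It assumes the opening and closing think tags each tokenize to a single id and that the tags themselves are masked along with their contents; neither is confirmed by this page, and mask_think_labels is a hypothetical name.

```python
IGNORE_INDEX = -100  # label value ignored by cross-entropy loss


def mask_think_labels(input_ids, think_open_id, think_close_id):
    """Return labels equal to input_ids, with think spans set to IGNORE_INDEX.

    Assumption: each tag is exactly one token id, and the tags are masked too.
    """
    labels = list(input_ids)
    inside = False
    for i, tok in enumerate(input_ids):
        if tok == think_open_id:
            inside = True
        if inside:
            labels[i] = IGNORE_INDEX
        if tok == think_close_id:
            inside = False
    return labels
```

Only labels change here. input_ids are left untouched, so the thinking positions remain available for anchor placement and feature matching.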