Plano Orchestrator

Plano-Orchestrator is a family of 4B and 30B-A3B routing and orchestration models designed for multi-agent systems. It analyzes user intent and conversation context to make precise routing decisions, excelling at multi-turn context understanding, multi-intent detection, and context-dependent routing.

This guide shows how to fine-tune it with Axolotl on multi-turn conversations with proper turn masking.

Getting started

  1. Install Axolotl following the installation guide.

  2. Install Cut Cross Entropy to reduce training VRAM usage.

  3. Run the finetuning example:

    axolotl train examples/plano/plano-4b-qlora.yaml

This config uses about 5.1 GiB VRAM. Let us know how it goes. Happy finetuning! 🚀
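For orientation, the example config roughly corresponds to settings like the following. This is an illustrative sketch, not the actual file — the model name, LoRA hyperparameters, and batch settings here are assumptions; examples/plano/plano-4b-qlora.yaml is authoritative:

```yaml
# Illustrative QLoRA settings (assumed values, not the real example file).
base_model: katanemo/Plano-Orchestrator-4B   # assumed model id; check the config
load_in_4bit: true    # quantize base weights to 4-bit (QLoRA)
adapter: qlora        # train low-rank adapters instead of full weights
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
sequence_len: 4096
micro_batch_size: 1
gradient_accumulation_steps: 4
plugins:
  # Cut Cross Entropy integration from step 2, reduces VRAM during training
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
```

The low VRAM figure quoted above comes from combining 4-bit base weights, LoRA adapters, and the Cut Cross Entropy plugin rather than from any single setting.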

Orchestration Prompt

Plano-Orchestrator uses a specific orchestration prompt format for routing/agent decisions. Please check the official model card for proper prompt formatting and the ORCHESTRATION_PROMPT template.

Tips

  • To use the larger Plano-Orchestrator-30B-A3B MoE model, change base_model to katanemo/Plano-Orchestrator-30B-A3B in the config and enable multi-GPU training if needed.
  • To run a full finetune instead of QLoRA, remove adapter: qlora and load_in_4bit: true from the config.
  • Read more about loading your own dataset in the docs.
  • The dataset format follows the OpenAI Messages format as seen here.
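Concretely, each training example in the OpenAI Messages format is a conversation of role/content turns. A minimal sketch follows — the system prompt and routing labels are placeholders, not the real ORCHESTRATION_PROMPT or the model's actual output schema:

```json
{
  "messages": [
    {"role": "system", "content": "<ORCHESTRATION_PROMPT goes here>"},
    {"role": "user", "content": "Book me a flight to Tokyo and find a hotel near the airport."},
    {"role": "assistant", "content": "<routing decision, e.g. flight_agent + hotel_agent>"}
  ]
}
```

With proper turn masking, loss is computed only on the assistant turns, so the model learns to produce routing decisions rather than to reproduce user text.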

Optimization Guides

Please check the Optimizations doc.