Plano Orchestrator
Plano-Orchestrator is a family of 4B and 30B-A3B routing and orchestration models designed for multi-agent systems. It analyzes user intent and conversation context to make precise routing decisions, excelling at multi-turn context understanding, multi-intent detection, and context-dependent routing.
This guide shows how to fine-tune it with Axolotl on multi-turn conversations with proper masking.
Getting started
Install Axolotl following the installation guide.
Install Cut Cross Entropy to reduce training VRAM usage.
Run the finetuning example:
```bash
axolotl train examples/plano/plano-4b-qlora.yaml
```
This config uses about 5.1 GiB VRAM. Let us know how it goes. Happy finetuning! 🚀
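For orientation, the shape of a QLoRA config looks roughly like the sketch below. This is illustrative only, using key names common to Axolotl QLoRA configs; refer to `examples/plano/plano-4b-qlora.yaml` for the actual values.

```yaml
# Illustrative sketch -- see examples/plano/plano-4b-qlora.yaml for the real config.
base_model: katanemo/Plano-Orchestrator-4B
load_in_4bit: true        # 4-bit quantized base weights
adapter: qlora            # train LoRA adapters on top
lora_r: 32
lora_alpha: 16
datasets:
  - path: your/dataset.jsonl   # placeholder path
    type: chat_template
train_on_inputs: false    # mask non-assistant turns (proper masking)
```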
Orchestration Prompt
Plano-Orchestrator uses a specific orchestration prompt format for routing/agent decisions. Please check the official model card for proper prompt formatting and the ORCHESTRATION_PROMPT template.
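As a rough sketch of how the prompt is assembled, the snippet below prepends an orchestration system prompt to a running conversation. The prompt text and the helper function here are placeholders, not the official template; substitute the ORCHESTRATION_PROMPT from the model card.

```python
# Placeholder text -- replace with the official ORCHESTRATION_PROMPT template
# from the Plano-Orchestrator model card.
ORCHESTRATION_PROMPT = "You are an orchestrator. Route the user request to the best agent."

def build_routing_messages(history, user_turn):
    """Prepend the orchestration system prompt to the conversation so far."""
    return (
        [{"role": "system", "content": ORCHESTRATION_PROMPT}]
        + list(history)
        + [{"role": "user", "content": user_turn}]
    )

messages = build_routing_messages(
    history=[
        {"role": "user", "content": "Book me a flight"},
        {"role": "assistant", "content": "Which dates?"},
    ],
    user_turn="Next Friday, and also find a hotel",
)
```

Note that multi-turn history is passed through intact, since the model's routing decision depends on context from earlier turns.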
Tips
- To use the larger Plano-Orchestrator-30B-A3B MoE model, simply change `base_model: katanemo/Plano-Orchestrator-30B-A3B` in the config and enable multi-GPU training if needed.
- You can run a full finetuning by removing `adapter: qlora` and `load_in_4bit: true` from the config.
- Read more on how to load your own dataset in the docs.
- The dataset format follows the OpenAI Messages format as seen here.
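To make the dataset format concrete, here is one training record written in the OpenAI Messages format. The `messages`/`role`/`content` fields are the standard format; the system prompt and the routing label in the assistant turn are made-up examples.

```python
import json

# One example record in the OpenAI Messages format. The content strings
# are illustrative placeholders, not the official prompt or label schema.
sample = {
    "messages": [
        {"role": "system", "content": "Route the request to an agent."},
        {"role": "user", "content": "Cancel my order and email me a receipt"},
        {"role": "assistant", "content": '{"agents": ["orders", "email"]}'},
    ]
}

# Axolotl chat datasets are commonly stored as JSONL: one record per line.
with open("train.jsonl", "w") as f:
    f.write(json.dumps(sample) + "\n")
```

Each conversation is one JSON object per line; multi-turn examples simply carry more entries in the `messages` list.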
Optimization Guides
Please check the Optimizations doc.