Plano Orchestrator

Plano-Orchestrator is a family of 4B and 30B-A3B routing and orchestration models designed for multi-agent systems. It analyzes user intent and conversation context to make precise routing decisions, excelling at multi-turn context understanding, multi-intent detection, and context-dependent routing.

This guide shows how to fine-tune it with Axolotl on multi-turn conversations with proper turn masking.

Getting started

  1. Install Axolotl following the installation guide.

  2. Install Cut Cross Entropy to reduce training VRAM usage.

  3. Run the finetuning example:

    axolotl train examples/plano/plano-4b-qlora.yaml

This config uses about 5.1 GiB VRAM. Let us know how it goes. Happy finetuning! 🚀
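For orientation, the example config roughly corresponds to settings like the following. This is an illustrative sketch, not the actual file — the model name, LoRA hyperparameters, and batch settings here are assumptions; examples/plano/plano-4b-qlora.yaml is authoritative:

```yaml
# Illustrative QLoRA settings (assumed values, not the real example file).
base_model: katanemo/Plano-Orchestrator-4B   # assumed model id; check the config
load_in_4bit: true    # quantize base weights to 4-bit (QLoRA)
adapter: qlora        # train low-rank adapters instead of full weights
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
sequence_len: 4096
micro_batch_size: 1
gradient_accumulation_steps: 4
plugins:
  # Cut Cross Entropy integration from step 2, reduces VRAM during training
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
```

The low VRAM figure quoted above comes from combining 4-bit base weights, LoRA adapters, and the Cut Cross Entropy plugin rather than from any single setting.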

Orchestration Prompt

Plano-Orchestrator uses a specific orchestration prompt format for routing/agent decisions. Please check the official model card for proper prompt formatting and the ORCHESTRATION_PROMPT template.

Tips

  • To use the larger Plano-Orchestrator-30B-A3B MoE model, change base_model to katanemo/Plano-Orchestrator-30B-A3B in the config and enable multi-GPU training if needed.
  • To run a full finetune instead of QLoRA, remove adapter: qlora and load_in_4bit: true from the config.
  • Read more about loading your own dataset in the docs.
  • The dataset format follows the OpenAI Messages format as seen here.
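Concretely, each training example in the OpenAI Messages format is a conversation of role/content turns. A minimal sketch follows — the system prompt and routing labels are placeholders, not the real ORCHESTRATION_PROMPT or the model's actual output schema:

```json
{
  "messages": [
    {"role": "system", "content": "<ORCHESTRATION_PROMPT goes here>"},
    {"role": "user", "content": "Book me a flight to Tokyo and find a hotel near the airport."},
    {"role": "assistant", "content": "<routing decision, e.g. flight_agent + hotel_agent>"}
  ]
}
```

With proper turn masking, loss is computed only on the assistant turns, so the model learns to produce routing decisions rather than to reproduce user text.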

Optimization Guides

Please check the Optimizations doc.