Multi-Agent Routing Patterns
- How triage agents route conversations to specialist agents
- Designing handoff patterns that preserve conversation context
- Loop guards and escalation to prevent infinite agent cycles
- Routing feedback mechanisms that improve accuracy over time
Single-agent systems work well for focused tasks. But when your platform handles pre-sales questions, support tickets, and billing inquiries through the same channel, one agent can't do it all. Multi-agent routing solves this by letting a triage agent classify incoming work and hand it to the right specialist.
Why Multiple Agents
A single agent with a massive prompt that covers every scenario gets worse as you add capabilities. Response quality drops, latency increases, and the agent starts confusing contexts. Splitting into specialised agents gives you:
- Focused prompts — each agent has a clear, narrow job
- Independent improvement — tune one agent without affecting others
- Different tool sets — the sales agent accesses CRM data; the support agent accesses ticket history
- Clearer audit trails — you know which agent made which decision
The Triage-Route-Respond Pattern
The standard multi-agent architecture has three layers:
1. Triage Agent
The triage agent is the entry point. It reads the incoming message and classifies it into a category. It doesn't try to answer — it just decides who should.
triage:
  prompt: |
    You are a routing agent. Read the incoming message and classify it.
    Categories:
    - pre_sales: Product questions, pricing, feature comparisons
    - support: Technical issues, bugs, account problems
    - billing: Payment, invoices, plan changes
    - escalate: Unclear, sensitive, or multi-category
    Message: {{trigger.message}}
    Visitor: {{trigger.visitor.company}} ({{trigger.visitor.plan}})
    Return JSON: { "category": "...", "confidence": 0.0-1.0, "reason": "..." }
The confidence score matters because it tells the workflow when not to trust the classification. If confidence falls below a threshold you set, route to a human or a more general agent rather than risk a wrong guess.
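As a rough illustration of that confidence gate, the sketch below parses the triage agent's JSON reply and falls back to escalation whenever the reply is malformed, names an unknown category, or scores below a threshold. The function name `choose_route` and the 0.7 threshold are assumptions made for this example, not part of any platform API.

```python
import json

# Categories taken from the triage prompt above.
CATEGORIES = {"pre_sales", "support", "billing", "escalate"}
CONFIDENCE_THRESHOLD = 0.7  # assumed value; tune it against your own routing logs


def choose_route(triage_output: str, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return the category to route to, or "escalate" when the result is unusable."""
    try:
        result = json.loads(triage_output)
    except json.JSONDecodeError:
        return "escalate"  # malformed model output counts as "not sure"
    if not isinstance(result, dict):
        return "escalate"

    category = result.get("category")
    confidence = result.get("confidence")
    if not isinstance(category, str) or category not in CATEGORIES:
        return "escalate"
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return "escalate"
    if confidence < threshold:
        return "escalate"
    return category
```

Treating every failure mode as an escalation keeps the unhappy path in one place: a human or general agent sees anything the triage agent could not classify cleanly.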
2. Specialist Agents
Each specialist agent handles one category. It receives the original message plus any context the triage agent extracted.
The key design principle: specialist agents should be stateless about routing. They don't know they were chosen by a triage agent. They just respond to the message with their specific expertise and tool set.
pre_sales_agent:
  prompt: |
    You are a pre-sales assistant for {{workspace.company}}.
    Answer product questions using the context below.
    Visitor: {{trigger.visitor.name}} at {{trigger.visitor.company}}
    Plan: {{trigger.visitor.plan}}
    Message: {{trigger.message}}
    Conversation history: {{trigger.history}}
  tools:
    - fetch_workspace_context
    - crm_lookup_contact
3. Response and Handoff
After the specialist responds, the workflow can:
- Send the response directly to the visitor
- Queue for HITL review if confidence was below a threshold
- Hand off to another agent if the conversation shifts topic
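One way to picture the three outcomes above is as a small decision function that runs after the specialist replies. This is only a sketch: the `Action` enum, the `after_response` name, the 0.8 review threshold, and the choice to check for a topic shift before checking confidence are all assumptions made for illustration.

```python
from enum import Enum


class Action(Enum):
    SEND = "send"        # deliver the response directly to the visitor
    REVIEW = "review"    # queue for human-in-the-loop (HITL) review
    HANDOFF = "handoff"  # pass the conversation to another agent


def after_response(
    triage_confidence: float,
    topic_shifted: bool,
    review_threshold: float = 0.8,  # assumed value
) -> Action:
    """Pick the next workflow step once a specialist has produced a response."""
    if topic_shifted:
        return Action.HANDOFF
    if triage_confidence < review_threshold:
        return Action.REVIEW
    return Action.SEND
```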
Handoff Patterns
Conversations don't always stay in one category. A pre-sales question can turn into a support issue mid-conversation. There are two approaches:
Re-triage on Every Message
Run the triage agent on each new message in the conversation. This is simple, but the visitor may be switched between agents mid-flow, which feels jarring.
Sticky Routing with Override
Keep the conversation with the current agent unless the triage agent detects a clear category change with high confidence. This gives a more natural experience.
Key takeaway: Sticky routing with override works best for chat. Re-triage on every message works better for email, where each message is more self-contained.
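Sticky routing with override comes down to one comparison per message. The sketch below assumes the triage agent still runs on every message but only displaces the current agent when it reports a different category with high confidence. The `next_agent` name and the 0.85 override threshold are illustrative assumptions.

```python
from typing import Optional


def next_agent(
    current_agent: Optional[str],
    triage_category: str,
    triage_confidence: float,
    override_threshold: float = 0.85,  # assumed value
) -> str:
    """Keep the conversation with the current agent unless triage is confident it changed."""
    if current_agent is None:
        # First message in the conversation: nothing to stick to yet.
        return triage_category
    if triage_category != current_agent and triage_confidence >= override_threshold:
        return triage_category
    return current_agent
```

Setting `override_threshold` to 0 makes this behave like re-triage on every message, so the same function can serve both chat and email with different settings.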
Loop Guards
Without guardrails, agents can bounce conversations between each other indefinitely. A support agent might decide something is a billing issue, and the billing agent might decide it's a support issue.
Prevent this with:
- Hop counter — track how many times a conversation has been routed. After 2-3 hops, escalate to a human
- No-return rule — an agent can't route back to the agent that just routed to it
- Escalation timeout — if no agent resolves within N messages, escalate automatically
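The three guards above can share one small piece of per-conversation state. The following is a minimal sketch, assuming a limit of 3 hops and 10 unresolved messages. `RoutingState`, `route`, `record_message`, and the `"human"` target are names invented for this example.

```python
from dataclasses import dataclass
from typing import Optional

HUMAN = "human"  # hypothetical name for the human escalation queue


@dataclass
class RoutingState:
    current_agent: str
    previous_agent: Optional[str] = None
    hops: int = 0
    unresolved_messages: int = 0


def _escalate(state: RoutingState) -> str:
    state.previous_agent, state.current_agent = state.current_agent, HUMAN
    return HUMAN


def route(state: RoutingState, target: str, max_hops: int = 3) -> str:
    """Apply the hop counter and no-return rule to a requested handoff."""
    if target == state.current_agent:
        return target  # not a handoff, nothing to guard
    if target == state.previous_agent:
        return _escalate(state)  # no-return rule: would bounce straight back
    if state.hops >= max_hops:
        return _escalate(state)  # hop counter exhausted
    state.previous_agent, state.current_agent = state.current_agent, target
    state.hops += 1
    return target


def record_message(state: RoutingState, resolved: bool, max_messages: int = 10) -> str:
    """Escalation timeout: escalate after max_messages without a resolution."""
    state.unresolved_messages = 0 if resolved else state.unresolved_messages + 1
    if state.unresolved_messages >= max_messages:
        return _escalate(state)
    return state.current_agent
```

For example, a conversation that moves from support to billing and is then asked to go back to support lands with a human instead of bouncing.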
Routing Feedback
The triage agent's accuracy improves when you feed routing outcomes back into it:
- Correction memory — when a human re-routes a misclassified message, store the original message + correct category as a correction
- Few-shot retrieval — before the triage agent classifies a new message, retrieve similar past corrections and include them in the prompt as examples
- Confidence calibration — track whether high-confidence classifications are actually correct, and adjust escalation thresholds accordingly
This creates a flywheel: more conversations lead to more corrections, which lead to better routing, which leads to fewer corrections needed.
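To make the feedback loop concrete, here is a small in-memory sketch of correction memory, few-shot retrieval, and a calibration check. It makes several simplifying assumptions: corrections live in a plain list rather than a database, similarity is a crude word-overlap (Jaccard) score standing in for whatever retrieval you actually use, and all names (`CorrectionMemory`, `few_shot_examples`, `accuracy_at_or_above`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


def _tokens(text: str) -> set:
    return set(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercase whitespace tokens (a crude stand-in for embeddings)."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


@dataclass
class CorrectionMemory:
    corrections: List[Tuple[str, str]] = field(default_factory=list)

    def add(self, message: str, correct_category: str) -> None:
        """Store a human re-route: the original message plus the right category."""
        self.corrections.append((message, correct_category))

    def similar(self, message: str, k: int = 3, min_score: float = 0.2) -> List[Tuple[str, str]]:
        """Return up to k past corrections that resemble the new message."""
        scored = sorted(
            ((similarity(message, m), m, c) for m, c in self.corrections),
            key=lambda item: item[0],
            reverse=True,
        )
        return [(m, c) for score, m, c in scored[:k] if score >= min_score]


def few_shot_examples(examples: List[Tuple[str, str]]) -> str:
    """Format retrieved corrections as text to place in the triage prompt."""
    return "\n".join(f"Message: {m}\nCorrect category: {c}" for m, c in examples)


def accuracy_at_or_above(records: List[Tuple[float, bool]], threshold: float) -> Optional[float]:
    """Share of classifications with confidence >= threshold that were correct."""
    outcomes = [correct for confidence, correct in records if confidence >= threshold]
    return sum(outcomes) / len(outcomes) if outcomes else None
```

If `accuracy_at_or_above` comes back low for the threshold you currently trust, that is a signal to raise the escalation threshold until high-confidence classifications are actually reliable.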
Practical Considerations
Latency: Each routing hop adds latency. The triage agent should be fast (use a smaller model like Claude Haiku) while specialist agents can use more capable models.
Context window: Pass conversation history to specialist agents so they don't ask questions the visitor already answered, but trim older messages to stay within token limits.
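A simple way to trim is to walk the history from newest to oldest and stop once a token budget is spent. The sketch below assumes history is a list of message strings and uses a whitespace word count as a rough proxy for tokens. A real system would plug in the tokenizer for its model via the hypothetical `count_tokens` parameter.

```python
from typing import Callable, List


def trim_history(
    history: List[str],
    max_tokens: int,
    count_tokens: Callable[[str], int] = lambda message: len(message.split()),
) -> List[str]:
    """Keep the most recent messages whose combined size fits within max_tokens."""
    kept: List[str] = []
    used = 0
    for message in reversed(history):  # newest first
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Because the loop stops at the first message that does not fit, the result is always a contiguous run of the latest messages, which avoids handing the specialist a history with holes in it.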
Visibility: Log every routing decision with the triage agent's category, confidence, and reason. This makes it straightforward to debug when routing goes wrong.
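As one possible shape for that log, the sketch below emits each routing decision as a single JSON line through Python's standard `logging` module. The field names, the `"routing"` logger name, and the `conversation_id` parameter are assumptions for this example.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("routing")  # assumed logger name


def routing_log_record(conversation_id: str, category: str, confidence: float, reason: str) -> str:
    """Serialise one routing decision as a JSON string."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "conversation_id": conversation_id,
        "category": category,
        "confidence": confidence,
        "reason": reason,
    }
    return json.dumps(record)


def log_routing_decision(conversation_id: str, category: str, confidence: float, reason: str) -> None:
    """Write the decision to the routing logger at INFO level."""
    logger.info(routing_log_record(conversation_id, category, confidence, reason))
```

One JSON object per line keeps the log easy to filter by conversation or category when you are tracing a misrouted message.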
Try it: In Outrun, multi-agent routing is built into the chat agent templates. Deploy a Pre-Sales and Customer Success agent, and the auto-router handles triage automatically. Check the workflow run history to see how conversations were routed.
Summary
Multi-agent routing lets you scale AI chat without sacrificing quality. The triage agent keeps conversations moving to the right place, specialist agents stay focused on what they do best, and correction memory closes the feedback loop. Start with two or three specialists, review the routing decisions, and expand from there.