Every shared inbox has the same quiet problem: support@, sales@, or hello@ fills up with a jumble of refund requests, partnership pitches, bug reports, and invoices — and a human has to read each one and forward it to the right person. It is slow, it gets skipped on busy days, and the sender waits hours for something a machine could sort in seconds.
An AI agent fixes this. Unlike old keyword rules (“if subject contains ‘invoice’ → Finance”), an agent actually reads the email, understands intent, and decides where it belongs — even when the wording is messy or the customer never uses the “right” word. Here is how to build one without writing code, what it costs, and where this approach genuinely is not the right tool.
What “routing agent” actually means here
We are building a classifier with actions. The agent does four things on every incoming email:
- Reads the sender, subject, and body.
- Classifies it into one of your defined categories (e.g. Billing, Technical Support, Sales, Partnerships, Spam).
- Routes it — forwards to a team alias, applies a Gmail label, assigns in a helpdesk, or posts to a Slack channel.
- Logs the decision so you can audit and improve it.
The “no code” part is real: you wire this in a visual builder (Make, Zapier, or n8n) and let a language model do the thinking. No servers and no deployment, unless you choose to self-host n8n.
Pick your stack
You need two things: an automation platform to catch the email and move it, and an LLM to classify it. Here is how the common no-code platforms compare for this specific job.
| Platform | Best for | Typical cost | Watch out for |
|---|---|---|---|
| Zapier | Fastest to set up; huge app library; built-in AI step | ~$20–70/mo; counts every email as a task | Gets expensive at high volume; less control over the AI prompt |
| Make (formerly Integromat) | Visual flows with branching logic; cheaper per operation | ~$9–30/mo for thousands of ops | Steeper first hour; you design the routing tree yourself |
| n8n | Power users; self-host = no per-task fee; full control | Free self-hosted, or ~$20/mo cloud | You manage hosting if self-hosted; more setup |
Our honest pick for most teams: start in Make. It is cheap enough that you will not flinch at testing, and its branching router maps almost one-to-one onto “send Billing here, Support there.” If you already live in Zapier and volume is under a few hundred emails a month, just use Zapier — the simplicity is worth the slightly higher price. Choose n8n only if you are comfortable self-hosting or expect thousands of emails and want to kill per-task costs.
For the model, GPT-4o-mini, Claude Haiku, or Gemini Flash are all excellent and cheap for classification — you do not need a flagship model to sort email. We typically use a small, fast model and spend the savings on running it reliably.
Build it step by step (Make + Gmail example)
1. Trigger on new mail
Add a Gmail → Watch Emails module (or “Watch Emails in a folder”). Point it at your shared inbox and filter to unread messages in the inbox. Connect your Google account through Make’s OAuth, so there are no API keys to manage. If you are on Microsoft, use the Email (IMAP) or Outlook module instead; the rest of the flow is identical.
2. Send the email to the model
Add an OpenAI (or Anthropic) module → “Create a Chat Completion.” This is where the real work happens — in the prompt. The single biggest mistake people make is a vague prompt that lets the model invent categories. Pin it down hard:
- List your exact categories and a one-line definition of each.
- Force a structured answer. Ask for JSON only.
- Give it an escape hatch: an "unsure" category for anything ambiguous.
A prompt that works in production looks like this:
You are an email router for [Company]. Classify the email into exactly one category and return only JSON.
Categories:
– billing: payments, invoices, refunds, subscription changes
– support: bugs, errors, “it’s not working”, how-to questions
– sales: pricing questions from prospects, demo requests, new purchases
– partnerships: collaboration, affiliate, press, integration proposals
– spam: unsolicited promotions, irrelevant cold pitches
– unsure: anything that doesn’t clearly fit one category
Return: {"category": "...", "confidence": 0-100, "reason": "short phrase"}
Email from: {{sender}}
Subject: {{subject}}
Body: {{body}}
The confidence and reason fields are not decoration — they are what make this trustworthy. You will use confidence to decide what gets auto-routed versus held for a human, and reason turns your log into something you can actually debug.
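If you ever want to see the same prompt outside a visual builder, here is a minimal Python sketch of how the template above could be assembled. The `CATEGORIES` dictionary and the `build_prompt` function are illustrative names, not part of Make or any model SDK; the point is only that keeping the category list in one place makes it harder for the prompt and the router to drift apart.

```python
# Category names and definitions copied from the prompt above.
CATEGORIES = {
    "billing": "payments, invoices, refunds, subscription changes",
    "support": "bugs, errors, \"it's not working\", how-to questions",
    "sales": "pricing questions from prospects, demo requests, new purchases",
    "partnerships": "collaboration, affiliate, press, integration proposals",
    "spam": "unsolicited promotions, irrelevant cold pitches",
    "unsure": "anything that doesn't clearly fit one category",
}


def build_prompt(company: str, sender: str, subject: str, body: str) -> str:
    """Fill the routing prompt with one email's fields (hypothetical helper)."""
    category_lines = "\n".join(
        f"- {name}: {description}" for name, description in CATEGORIES.items()
    )
    return (
        f"You are an email router for {company}. "
        "Classify the email into exactly one category and return only JSON.\n"
        f"Categories:\n{category_lines}\n"
        'Return: {"category": "...", "confidence": 0-100, "reason": "short phrase"}\n'
        f"Email from: {sender}\n"
        f"Subject: {subject}\n"
        f"Body: {body}"
    )


if __name__ == "__main__":
    print(build_prompt("Acme", "jo@example.com", "Refund", "I was charged twice."))
```

In Make, the `{{sender}}`, `{{subject}}`, and `{{body}}` placeholders do this substitution for you; the sketch only makes the mechanics visible.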
3. Parse the answer
Add a JSON → Parse JSON module pointed at the model’s output. Now category, confidence, and reason are clean fields you can branch on. (Asking the model for JSON and parsing it is far more reliable than trying to read free-form text downstream.)
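For readers curious about what "parse and validate" involves, the following Python sketch shows one defensive way to do it. It assumes the model may occasionally wrap its JSON in extra text or invent a category, and in those cases it falls back to `unsure` with zero confidence so the email lands with a human. `parse_decision` and `FALLBACK` are hypothetical names for illustration only.

```python
import json

ALLOWED = {"billing", "support", "sales", "partnerships", "spam", "unsure"}
FALLBACK = {"category": "unsure", "confidence": 0, "reason": "unparseable model output"}


def parse_decision(raw: str) -> dict:
    """Turn the model's reply into a clean decision, or fall back to 'unsure'."""
    text = raw.strip()
    # Keep only the outermost {...} in case the model added a fence or a preamble.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return dict(FALLBACK)
    try:
        data = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return dict(FALLBACK)
    if not isinstance(data, dict):
        return dict(FALLBACK)

    category = str(data.get("category", "")).strip().lower()
    if category not in ALLOWED:
        # The model invented a category: treat it as ambiguous.
        return {**FALLBACK, "reason": f"unknown category: {category!r}"}

    try:
        confidence = int(float(data.get("confidence", 0)))
    except (TypeError, ValueError, OverflowError):
        confidence = 0
    confidence = max(0, min(100, confidence))  # clamp to the 0-100 scale

    return {
        "category": category,
        "confidence": confidence,
        "reason": str(data.get("reason", "")),
    }
```

The design choice worth copying into any builder is the fallback: a reply that cannot be parsed should never stop the flow or get routed by guesswork.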
4. Route with a confidence gate
Add a Router module. Create one path per category, each with a filter. Crucially, add a confidence threshold to the rule, for example: category = billing AND confidence ≥ 80 → forward to billing@yourco.com. Then add one catch-all path: confidence < 80 OR category = unsure → forward to a human triage inbox (or apply a “needs-review” label).
That gate is the difference between a toy and something you can leave running. The agent handles the obvious bulk of the mail (say, 85%); the genuinely ambiguous remainder still reaches a person, so nothing falls through the cracks.
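The router filters above boil down to a few lines of logic. This Python sketch expresses the same rule, assuming a threshold of 80 as in the example. Only `billing@yourco.com` comes from this article; every other address, and the `choose_destination` name, is a made-up placeholder.

```python
# Destination per category. Only the billing address appears in the article;
# the others are hypothetical placeholders.
DESTINATIONS = {
    "billing": "billing@yourco.com",
    "support": "support-team@yourco.com",
    "sales": "sales-team@yourco.com",
    "partnerships": "partnerships@yourco.com",
    "spam": "spam-review@yourco.com",
}
TRIAGE_INBOX = "triage@yourco.com"  # hypothetical human-review inbox
CONFIDENCE_THRESHOLD = 80


def choose_destination(category: str, confidence: int,
                       threshold: int = CONFIDENCE_THRESHOLD) -> str:
    """Auto-route confident, known categories; send everything else to a human."""
    if category == "unsure" or confidence < threshold:
        return TRIAGE_INBOX
    # An unknown category also lands in triage rather than being dropped.
    return DESTINATIONS.get(category, TRIAGE_INBOX)
```

Note that the catch-all is the default, not an afterthought: an email only leaves the triage path when both conditions (known category, confidence at or above the threshold) hold.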
For the action itself, pick whatever your teams already use:
- Forward to a team alias (simplest, works everywhere).
- Apply a Gmail label and skip the inbox (good if teams work from filtered views).
- Create a ticket in Zendesk, HubSpot, or Freshdesk with the category as the assignment group.
- Post to Slack, e.g. drop sales leads into #sales-inbound with the sender and reason.
5. Log every decision
Add a Google Sheets → Add a Row step recording timestamp, sender, subject, chosen category, confidence, and reason. This sheet is your quality control. After a week you will spot patterns — “every German invoice gets sent to support” — and tighten the prompt accordingly. Skip this and you are flying blind.
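The log row itself is simple. As a rough illustration, this Python sketch appends the same six fields to a local CSV file, which stands in for the Google Sheet here. The `log_decision` function and the file layout are assumptions for the example, not something Make produces.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Same columns as the sheet described above.
LOG_COLUMNS = ["timestamp", "sender", "subject", "category", "confidence", "reason"]


def log_decision(path: str, sender: str, subject: str, decision: dict) -> None:
    """Append one routing decision to a CSV log, writing the header on first use."""
    log_path = Path(path)
    needs_header = not log_path.exists() or log_path.stat().st_size == 0
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=LOG_COLUMNS)
        if needs_header:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "sender": sender,
            "subject": subject,
            "category": decision["category"],
            "confidence": decision["confidence"],
            "reason": decision["reason"],
        })
```

Whatever tool holds the log, keep one row per email and never overwrite rows; the patterns you are looking for only show up across many decisions.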
Test before you trust it
Do not point this at your live inbox on day one. Run it in suggest-only mode first: have it write the predicted category to the log sheet (and maybe a Slack message) without forwarding anything. Let it shadow real mail for a few days, eyeball the sheet, and only flip on the forwarding actions once you trust the calls. We have never deployed one of these straight to production — the shadow run always surfaces two or three edge cases worth fixing first.
When you do go live, keep a human on the catch-all path for at least the first two weeks.
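Suggest-only mode is essentially a single switch in front of the action step. The Python sketch below shows the idea under simple assumptions: `forward` and `log` are whatever functions perform the real forwarding and logging, and `process` is a hypothetical name. With `live=False`, the decision is always logged and never acted on.

```python
from typing import Callable


def process(decision: dict, destination: str,
            forward: Callable[[str], None],
            log: Callable[[dict], None],
            live: bool = False) -> dict:
    """Log every decision; only perform the forwarding action when live is True."""
    record = {**decision, "destination": destination, "forwarded": live}
    log(record)  # the log is written in both shadow and live mode
    if live:
        forward(destination)
    return record


if __name__ == "__main__":
    sent, logged = [], []
    decision = {"category": "billing", "confidence": 95, "reason": "refund request"}
    process(decision, "billing@yourco.com", sent.append, logged.append)  # shadow run
    print(sent, logged)
```

In Make, the equivalent is to build the full flow but leave the forwarding modules disconnected (or filtered off) until the shadow run looks clean.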
When this is the WRONG tool
Be honest with yourself before building:
- You have under ~30 emails a day and clear keywords. Plain Gmail filters or Outlook rules are free, instant, and need no maintenance. An AI agent is overkill — do not add a model and a monthly bill to solve a problem a filter handles.
- Misrouting is dangerous, not just annoying. Legal, medical, or financial-compliance mail where a wrong route has real consequences should not be auto-sent on an 80% confidence score. Use the agent to suggest and label, but keep a human approving every move.
- Your categories overlap heavily. If even your own staff argue about whether something is “support” or “billing,” the model will struggle too. Fix the taxonomy first; no prompt rescues fuzzy categories.
- Strict data-privacy rules. If email contents cannot legally leave your infrastructure, a public LLM API is off the table — you would need a self-hosted model (n8n + a local model), which is no longer truly “no code.”
Frequently asked questions
How accurate is an AI email router, really?
For well-separated categories and a tight prompt, expect roughly 90–95% correct on clear emails — and the confidence gate sends the rest to a human, so the practical “things that go wrong silently” rate is far lower. Accuracy depends almost entirely on category clarity and prompt quality, not on which model you pick. Run it in shadow mode for a week and you will see your real number before committing.
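To turn a shadow-mode log into "your real number," you need a human to fill in the correct category next to each prediction. The Python sketch below assumes each log row has gained an `actual` column for that purpose (a hypothetical addition to the sheet) and computes three figures: overall accuracy, the share of mail that would have been auto-routed, and the share that would have been auto-routed to the wrong place.

```python
def shadow_report(rows: list, threshold: int = 80) -> dict:
    """Summarize a reviewed shadow-mode log.

    Each row needs 'category' (predicted), 'confidence' (0-100),
    and 'actual' (the category a human says is correct).
    """
    total = len(rows)
    if total == 0:
        return {"accuracy": 0.0, "auto_routed_share": 0.0, "silent_error_rate": 0.0}

    correct = sum(1 for r in rows if r["category"] == r["actual"])
    # Rows that would have passed the confidence gate and been routed automatically.
    auto = [r for r in rows
            if r["category"] != "unsure" and r["confidence"] >= threshold]
    # Confident AND wrong: the errors nobody would have seen.
    silent_errors = sum(1 for r in auto if r["category"] != r["actual"])

    return {
        "accuracy": correct / total,
        "auto_routed_share": len(auto) / total,
        "silent_error_rate": silent_errors / total,
    }
```

The silent error rate is the figure to watch. If it is too high, raising the threshold trades a smaller auto-routed share for fewer confident mistakes.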
What does it cost to run per month?
For most small teams, surprisingly little. A small model like GPT-4o-mini or Gemini Flash costs a fraction of a cent per email — a few thousand emails a month is often under a dollar in model usage. The bigger line item is the automation platform: roughly $9–30/mo for Make, or $20–70/mo for Zapier depending on volume. Self-hosted n8n can drop the platform cost to zero if you handle hosting.
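The model-usage estimate is plain arithmetic, sketched here in Python. All the numbers in the example call are assumptions chosen for illustration: 3,000 emails a month, about 800 input tokens per email (prompt plus message), about 40 output tokens for the JSON reply, and placeholder prices of $0.15 and $0.60 per million input and output tokens. Check your provider's current price list before relying on any figure.

```python
def monthly_model_cost(emails_per_month: int,
                       input_tokens_per_email: int,
                       output_tokens_per_email: int,
                       usd_per_million_input: float,
                       usd_per_million_output: float) -> float:
    """Estimate monthly LLM spend in USD from volume, token counts, and prices."""
    input_cost = (emails_per_month * input_tokens_per_email
                  * usd_per_million_input / 1_000_000)
    output_cost = (emails_per_month * output_tokens_per_email
                   * usd_per_million_output / 1_000_000)
    return input_cost + output_cost


if __name__ == "__main__":
    # Assumed volume, token counts, and prices; replace with your own.
    cost = monthly_model_cost(3000, 800, 40, 0.15, 0.60)
    print(f"Estimated model cost: ${cost:.2f} per month")
```

Under those assumed inputs the estimate comes to roughly 43 cents a month, which is why the platform subscription, not the model, tends to dominate the bill.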
Can it reply to emails too, not just sort them?
Yes, and it is the natural next step — add an LLM step that drafts a reply for the category, then route the draft to a human for one-click approval rather than auto-sending. Start with routing only, get it reliable, then layer drafting on top. Auto-sending AI replies without review is where teams get burned, so keep a person in the loop until you have weeks of clean data.
Your next step
Pick one painful category to start, usually billing or sales, because misrouting those costs you money or leads. Build the five-step Make flow above for that single category in suggest-only mode, point it at your inbox, and watch the log sheet for three days. Once you trust it on one category, adding the rest is just more paths in the router. You can have a working, no-code email-routing agent running before lunch, and a shared inbox that sorts itself by the end of the week.