Building AI agents without using templates

The first time we tried to build a support routing agent on top of an existing chatbot framework, we spent three weeks getting it to work reliably in staging and two days watching it fall apart in production. The failure wasn't dramatic. The agent quietly started misclassifying edge cases, missing routing rules we hadn't thought to encode, and producing responses that were technically correct but contextually off in ways that a human had to catch each time.

We scrapped it and started over from the process.

What "starting from the process" means in practice

It sounds obvious when you say it: before building an AI agent, map the process it's supposed to automate. But in practice, most teams skip this step or abbreviate it. They know roughly what they want the agent to do, so they find a framework that looks close, configure it, and start testing. The framework gives them a structure to fill in. That structure is the problem.

Frameworks are built around general cases. A chatbot framework is designed to handle a wide variety of conversational patterns, so it makes assumptions about how inputs arrive, how decisions branch, and what outputs look like. Those assumptions are reasonable for a lot of use cases. They're often wrong for yours.

When we start from the process, we produce a document before any code exists. It covers: what triggers the process (an email, a form submission, a calendar event, a record state change), what data the agent needs access to, what decisions it makes at each step, what it outputs and where, and what conditions should cause it to escalate or stop. We review that document with the client. We look for gaps and disagreements. We revise it.
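The sections of that document map naturally onto a small data structure, which makes gaps easier to spot during review. What follows is a minimal sketch under stated assumptions, not a prescribed format: `ProcessSpec`, `DecisionStep`, the field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionStep:
    name: str            # e.g. "classify ticket"
    inputs: list[str]    # names of the data fields this step reads
    outcomes: list[str]  # possible results of the decision


@dataclass
class ProcessSpec:
    trigger: str               # what starts the process
    data_sources: list[str]    # data the agent needs access to
    steps: list[DecisionStep]  # decisions made at each step
    outputs: dict[str, str]    # output name -> where it goes
    escalate_when: list[str]   # conditions that hand off to a human
    stop_when: list[str] = field(default_factory=list)

    def gaps(self) -> list[str]:
        """List empty required sections and step inputs with no declared source."""
        problems = []
        for section in ("data_sources", "steps", "outputs", "escalate_when"):
            if not getattr(self, section):
                problems.append(f"empty section: {section}")
        for step in self.steps:
            for name in step.inputs:
                if name not in self.data_sources:
                    problems.append(
                        f"step '{step.name}' reads undeclared input: {name}"
                    )
        return problems


# Hypothetical example: a draft spec with two gaps left in it.
spec = ProcessSpec(
    trigger="inbound support email",
    data_sources=["ticket_body", "customer_plan"],
    steps=[
        DecisionStep("classify ticket", ["ticket_body"], ["billing", "technical"]),
        DecisionStep("pick queue", ["customer_plan", "account_region"], ["tier1", "tier2"]),
    ],
    outputs={"routing_decision": "ticketing tool"},
    escalate_when=[],
)

for problem in spec.gaps():
    print(problem)
```

A check like `gaps()` only catches missing pieces. It does not replace the review conversation, where disagreements about the process itself come out.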

Only then do we write the agent.


Why templates create invisible dependencies

One thing that kept biting us with framework-based approaches was invisible dependencies: assumptions the framework made that we couldn't see until something broke. A routing framework might assume all tickets arrive as plain text, when your system occasionally passes structured JSON. It might expect a response to be a single string, when your ticketing tool needs a response object with specific fields.

These aren't hard problems. They're just opaque until you hit them. When you design the agent from scratch against your actual data and API contracts, these things surface in the design phase, not in production.
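The two mismatches described above (plain text versus structured JSON on the way in, a single string versus a response object on the way out) can be written down as explicit types at the agent's boundary. This is a rough sketch only. The `Ticket` and `RoutingResponse` shapes and the `subject`/`body` keys are assumptions for illustration, not the contract of any particular ticketing tool.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Ticket:
    subject: str
    body: str


@dataclass
class RoutingResponse:
    queue: str         # where the ticket should go
    message: str       # text shown to the customer or agent
    needs_human: bool  # whether a person must review the result


def parse_ticket(raw: str) -> Ticket:
    """Accept either plain text or a JSON object with a 'body' field."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Not JSON at all: treat the whole payload as the ticket body.
        return Ticket(subject="", body=raw)
    if isinstance(data, dict) and "body" in data:
        return Ticket(subject=str(data.get("subject", "")), body=str(data["body"]))
    # Valid JSON that is not a ticket object (a bare number, a list):
    # fall back to treating the raw payload as text.
    return Ticket(subject="", body=raw)


plain = parse_ticket("My invoice is wrong")
structured = parse_ticket('{"subject": "Invoice", "body": "Charged twice"}')

# The outgoing side is an object with named fields, not a bare string.
response = RoutingResponse(queue="billing", message="Routed to billing", needs_human=False)
payload = asdict(response)
```

Writing the boundary this way forces the question "what exactly arrives, and what exactly must leave?" to be answered during design rather than after a failure.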

There's also the question of confidence handling. Most chatbot frameworks have a built-in confidence threshold, and when the agent's confidence falls below it, they either produce a fallback response or pass the conversation to a human. But what counts as "below threshold" depends entirely on your use case. A support agent routing a billing complaint needs different confidence handling than one answering a general product question. The framework gives you a knob to turn, but it doesn't help you decide what setting makes sense for your process.
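One way to express this is a threshold per category instead of a single global knob. The sketch below assumes the classifier returns a category and a confidence between 0 and 1. The category names and threshold values are made-up placeholders that would come out of the process document, not recommendations.

```python
from dataclasses import dataclass

# Hypothetical values: billing mistakes are costly, so the bar is higher.
THRESHOLDS = {"billing": 0.90, "general_product": 0.60}
DEFAULT_THRESHOLD = 0.75  # assumed fallback for categories not listed


@dataclass
class Decision:
    action: str  # "route" or "escalate"
    reason: str  # human-readable explanation, useful in logs


def decide(category: str, confidence: float) -> Decision:
    """Route automatically only when confidence clears the category's bar."""
    threshold = THRESHOLDS.get(category, DEFAULT_THRESHOLD)
    if confidence >= threshold:
        return Decision("route", f"{confidence:.2f} >= {threshold:.2f} for {category}")
    return Decision("escalate", f"{confidence:.2f} < {threshold:.2f} for {category}")


# The same confidence leads to different actions depending on the category.
billing = decide("billing", 0.80)
product = decide("general_product", 0.80)
```

The point of the table is that each number is a decision someone made about the process, so it can be reviewed and challenged like any other rule.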

The design review step most teams skip

After we've written the process document, we do a design review with the client before building. This is the step that feels unnecessary and turns out to be the most valuable thing we do.

In every review we've done, the client has identified at least one case that wasn't in the original process description. It is usually something they handle regularly but didn't think to mention because it felt obvious. A category of ticket that always goes to a specific person regardless of content. A customer segment that gets a different response format. A seasonal spike that changes the routing rules.

None of these are hard to handle once you know about them. They're expensive to discover after the agent is live.
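Cases like "this category always goes to one person regardless of content" fit an explicit override list that is checked before any model-based classification runs. A minimal sketch, assuming hypothetical rule contents, field names, and assignees:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Ticket:
    category: str
    customer_segment: str


@dataclass
class OverrideRule:
    description: str                   # the rule as the client stated it
    matches: Callable[[Ticket], bool]  # predicate over the ticket
    assignee: str                      # who gets it when the rule matches


# Hypothetical rules of the kind a design review tends to surface.
OVERRIDES = [
    OverrideRule(
        "legal tickets always go to one owner",
        lambda t: t.category == "legal",
        "legal_owner",
    ),
    OverrideRule(
        "enterprise customers go to their account team",
        lambda t: t.customer_segment == "enterprise",
        "account_team",
    ),
]


def apply_overrides(ticket: Ticket) -> Optional[str]:
    """Return the assignee of the first matching rule, or None if none match."""
    for rule in OVERRIDES:
        if rule.matches(ticket):
            return rule.assignee
    return None  # no override: fall through to normal classification


forced = apply_overrides(Ticket(category="legal", customer_segment="smb"))
normal = apply_overrides(Ticket(category="billing", customer_segment="smb"))
```

Rules are evaluated in order, so when two could match, the list position decides. That ordering is itself something to confirm with the client.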

What this means for scope and timeline

Process-first design is slower to start and faster to finish. The design phase takes longer than configuring a framework out of the box (a week or two instead of a day or two). But the build phase is cleaner because the logic is already agreed on. Testing in staging surfaces fewer surprises. Post-launch issues tend to be edge cases that were documented as out-of-scope rather than bugs in the core logic.

We're not religious about this. If someone needs a quick proof of concept to show a stakeholder what an agent could do, a framework build is fine. But if the goal is a production agent that runs reliably without babysitting, we start from the process.