What ops teams get wrong about AI

Most AI automation projects don't fail because the technology doesn't work. They fail because of assumptions made at the start that don't survive contact with reality. We see the same three assumptions come up repeatedly — not because ops teams are careless, but because these assumptions feel reasonable until they're not.

Assumption one: the process is well-documented

When we ask a client to walk us through a process before we scope an agent, we almost always discover the process is different from what was described. Not dishonestly — people describe the intended process, or the idealized version, or the way it works when everything goes normally. What they don't describe, until you ask specifically, is the exceptions.

There's the customer who always calls instead of emailing and whose calls get logged manually. The team member who handles a specific type of case differently because of a history with that account. The end-of-quarter crunch that changes how prioritization works. The system that sometimes sends data in an older format because of a legacy integration nobody has time to fix.

None of these are unusual. They're how real processes work. An agent built to the documented process will fail on all of them.

The fix is straightforward but time-consuming: interview the people who actually do the work, not just the ones who designed the process. Ask about last week specifically. Ask what happens when something goes wrong. Document what you find before designing anything.

Assumption two: the agent will handle exceptions by learning

A related assumption is that edge cases will sort themselves out as the agent collects more data. This is sometimes true and often false, and the distinction matters.

For classification tasks, such as identifying the type of incoming ticket, more data genuinely does improve accuracy over time, provided the training loop is set up so that misclassified examples get corrected and fed back in. The agent sees more examples of each category and gets better at distinguishing them.

But for decision tasks, where the agent decides what to do with a classified input, more data doesn't help if the logic is wrong. An agent that routes billing complaints to the wrong team won't stop doing that just because it sees more billing complaints. The routing logic has to be corrected. And to correct it, someone has to be monitoring what the agent is actually doing, not just whether tickets are being processed.

This is why we build monitoring into every deployment. Not monitoring of throughput — monitoring of decision quality. What did the agent decide, and was the decision right? That loop has to involve a human, at least in the early weeks.
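
As an illustration only, here is a minimal Python sketch of what measuring decision quality rather than throughput could look like. Everything in it is hypothetical: the DecisionRecord fields, the team names, and the decision_quality function are made up for the example and do not describe any particular deployment. It assumes a human reviewer records where each sampled ticket should have gone.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecisionRecord:
    """One logged agent decision (hypothetical schema)."""
    ticket_id: str
    category: str                         # what the agent classified the ticket as
    agent_route: str                      # where the agent sent it
    reviewer_route: Optional[str] = None  # where a human says it should have gone


def decision_quality(records):
    """Return, per category, the share of reviewed decisions the reviewer agreed with.

    Records without a reviewer_route are skipped: they have been processed,
    but nobody has checked whether the decision was right.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for r in records:
        if r.reviewer_route is None:
            continue
        totals[r.category] += 1
        if r.agent_route == r.reviewer_route:
            correct[r.category] += 1
    return {cat: correct[cat] / totals[cat] for cat in totals}


# Example log: every ticket was "processed", so throughput looks fine.
log = [
    DecisionRecord("T-1", "billing", "support", reviewer_route="finance"),
    DecisionRecord("T-2", "billing", "support", reviewer_route="finance"),
    DecisionRecord("T-3", "billing", "finance", reviewer_route="finance"),
    DecisionRecord("T-4", "outage", "oncall", reviewer_route="oncall"),
    DecisionRecord("T-5", "outage", "oncall"),  # not yet reviewed
]

for category, agreement in decision_quality(log).items():
    print(f"{category}: {agreement:.0%} of reviewed decisions were right")
```

In this made-up log all five tickets were handled, yet only one of the three reviewed billing decisions matches what the reviewer wanted. A throughput dashboard would not show that; a per-category agreement rate does.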

Assumption three: integration is the easy part

This one comes up most often when clients have already tried to build something themselves. They got the logic working in a demo environment, and then spent two months on integration and never shipped.

Integration is rarely easy. Not because APIs are complicated — most modern business software has usable APIs — but because the data in those systems is messier than expected. Fields that should always be populated sometimes aren't. Values that should follow a consistent format have three slightly different formats across records imported from different systems at different times. The same concept (a "customer") has different meanings across your CRM and your support tool, and those meanings need to be reconciled.

We scope integration conservatively. We build in time to audit the actual data before touching it. We write data validation into the agent itself, not as an afterthought. This adds time to the project. It also means the agent works when it goes live.
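
To make "validation in the agent itself" concrete, here is a small Python sketch under stated assumptions. The field names (customer_id, email, created), the list of date formats, and the validate_record function are all hypothetical; in a real project they would come out of the data audit, not from a guess. The idea is that each record is checked and normalized before the agent acts on it, and anything that fails is set aside for a human instead of being processed silently.

```python
from datetime import datetime

# Assumed for illustration: the date formats an audit might turn up.
# Ambiguous formats (is 05/03 March or May?) need a decision from the data owner.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")
REQUIRED_FIELDS = ("customer_id", "email", "created")


def normalize_date(value):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def validate_record(record):
    """Return (cleaned_record, problems).

    An empty problems list means the record is safe for the agent to act on.
    """
    problems = []
    cleaned = dict(record)

    # Fields that should always be populated sometimes aren't.
    for field in REQUIRED_FIELDS:
        if not str(record.get(field) or "").strip():
            problems.append(f"missing {field}")

    # Values that should share one format often arrive in several.
    created = record.get("created")
    if created:
        iso = normalize_date(str(created))
        if iso is None:
            problems.append(f"unrecognized date format: {created!r}")
        else:
            cleaned["created"] = iso

    return cleaned, problems


records = [
    {"customer_id": "C1", "email": "a@example.com", "created": "05/03/2024"},
    {"customer_id": "", "email": "b@example.com", "created": "March 5"},
]
for rec in records:
    cleaned, problems = validate_record(rec)
    if problems:
        print("needs human review:", problems)
    else:
        print("ok:", cleaned)
```

The design choice worth noting is that the function reports problems instead of raising or guessing. A record with issues goes to a review queue, so messy data shows up as visible work rather than as wrong decisions.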

What this means for where to start

Taken together, these three assumptions point to the same conclusion: the safest way to start is with a process you can observe completely before you automate it. Not the most important process, and not the one that would save the most time, but the one where you can sit down with the person who runs it, watch them do it, and map every step, including the ones they don't mention until you ask.

That's usually a narrower process than the ones that come up first when people brainstorm what AI could do for them. It's also usually the one where an agent actually gets built and deployed.