What actually makes an agentic workflow work

"Agentic" is the term of the year, and most of what is written about it is either breathless or impenetrable. Strip the noise away and an agentic workflow is a small set of parts working together. Here is the high-level shape of one that holds up in production, and, more importantly, where the real work hides.

The agent is the easy part

An agent is a model running in a loop. It reads a goal, decides on an action, takes it, looks at the result, and repeats until the job is done or it hits a limit. That loop is a few lines of code.
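To make "a few lines of code" concrete, here is a minimal sketch of that loop. Everything in it is an assumption made for illustration: `run_agent`, `Step`, and the scripted `decide` function are hypothetical names, and `decide` stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    action: str    # name of a tool to call, or "finish"
    argument: str  # input for the tool, or the final answer


def run_agent(
    goal: str,
    decide: Callable[[str, list], Step],
    tools: dict,
    max_steps: int = 5,
) -> Optional[str]:
    """Decide, act, observe, repeat, until done or the step limit is hit."""
    history = []  # (action, argument, result) tuples the model can look back on
    for _ in range(max_steps):
        step = decide(goal, history)
        if step.action == "finish":
            return step.argument
        result = tools[step.action](step.argument)
        history.append((step.action, step.argument, result))
    return None  # limit reached without finishing


# Hypothetical stand-in for the model: look something up once, then finish.
def scripted_decide(goal: str, history: list) -> Step:
    if not history:
        return Step("lookup", goal)
    return Step("finish", history[-1][2])


tools = {"lookup": lambda query: f"status of {query}: shipped"}
answer = run_agent("order 42", scripted_decide, tools)
```

The `max_steps` limit is the only safety feature here, which is rather the point: the loop itself contributes almost nothing to how trustworthy the agent is.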

What makes it genuinely useful, or genuinely risky, is everything around the loop: what you let it touch, what you tell it, and what you do to keep it honest. Teams that struggle with agents almost never have a model problem. They have a context, tooling, or measurement problem.

Context lives in plain markdown

The cheapest, most underrated part of a good agent setup is a plain text file. Durable instructions (how your business works, what "done" looks like, the rules that must never be broken) live in version-controlled markdown that travels with the project. The convention is settling on files like AGENTS.md; Next.js now scaffolds one into new apps, which is a small signal of how mainstream this pattern is becoming.

Plain files are reviewable, diffable, and editable by a human who is not an engineer. That matters more than any prompt-tuning trick.
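As a rough sketch of how such a file might reach the model, the snippet below reads AGENTS.md from a project directory and appends it to a base prompt. The function name `build_system_prompt`, the section heading it adds, and the example rule are all assumptions; real agent tools each load context files in their own way.

```python
import tempfile
from pathlib import Path


def build_system_prompt(project_dir: Path, base_prompt: str) -> str:
    """Append the project's AGENTS.md, if present, to a base prompt."""
    context_file = project_dir / "AGENTS.md"
    if not context_file.is_file():
        return base_prompt
    rules = context_file.read_text(encoding="utf-8").strip()
    return f"{base_prompt}\n\nProject instructions (from AGENTS.md):\n{rules}"


# Demonstration in a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    without_file = build_system_prompt(project, "You are a support agent.")

    # A hypothetical business rule, written the way a non-engineer might.
    (project / "AGENTS.md").write_text(
        "Never issue a refund above 100 EUR without human approval.\n",
        encoding="utf-8",
    )
    with_file = build_system_prompt(project, "You are a support agent.")
```

Because the rule lives in a file rather than in code, changing it is a one-line diff that anyone on the team can review.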

MCP connects the agent to your world

A model on its own is a confident stranger with no access to your systems. The Model Context Protocol (MCP) is the increasingly standard way to hand it real tools and data: your CRM, your database, your documents, an internal API, all through a consistent interface.

This is the line between a clever chatbot and something that does work. It is also where most of the security thinking belongs: an agent should be given the narrowest set of tools that lets it finish the task, and nothing more.
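The "narrowest set of tools" idea can be sketched without any particular SDK. The example below is plain Python and deliberately not MCP code: the registry, the tool names, and `tools_for_task` are hypothetical, and in a real setup the same decision would be made when choosing which MCP servers and tools to expose.

```python
# Hypothetical registry of everything the organisation could expose.
ALL_TOOLS = {
    "read_customer": lambda customer_id: f"customer {customer_id}",
    "send_email": lambda address: f"email sent to {address}",
    "delete_customer": lambda customer_id: f"customer {customer_id} deleted",
}


def tools_for_task(allowed: set) -> dict:
    """Return only the explicitly allowed tools; fail loudly on unknown names."""
    unknown = allowed - set(ALL_TOOLS)
    if unknown:
        raise KeyError(f"unknown tools requested: {sorted(unknown)}")
    return {name: ALL_TOOLS[name] for name in allowed}


# A support-triage agent only needs to read records, so that is all it gets.
support_tools = tools_for_task({"read_customer"})
```

An agent built on `support_tools` cannot delete a customer no matter what it is prompted to do, because the capability was never handed over.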

Hooks keep it deterministic

The model is probabilistic. Your business rules are not. Hooks are ordinary, deterministic code that runs at fixed points in the loop: before an action, after it, and on completion. They are where the guardrails go: validation, formatting, logging, "never write to production," "stop and ask a human here."

The model proposes; hooks dispose. A good agent feels less like magic and more like a well-instrumented machine.
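A minimal sketch of that division of labour, assuming a hypothetical `execute` function that wraps every tool call: pre-hooks can veto an action by raising, and post-hooks observe the result. The names `Action`, `Blocked`, and the "prod" prefix convention are illustrative, not taken from any specific framework.

```python
from dataclasses import dataclass


@dataclass
class Action:
    tool: str
    target: str


class Blocked(Exception):
    """Raised by a pre-hook to veto an action before it runs."""


def no_production_writes(action: Action) -> None:
    # Deterministic rule: the model cannot talk its way past this check.
    if action.tool == "write" and action.target.startswith("prod"):
        raise Blocked(f"write to {action.target} is not allowed")


audit_log = []


def log_action(action: Action, result: str) -> None:
    audit_log.append(f"{action.tool} {action.target} -> {result}")


def execute(action: Action, tools: dict, pre_hooks: list, post_hooks: list) -> str:
    for hook in pre_hooks:
        hook(action)  # may raise Blocked
    result = tools[action.tool](action.target)
    for hook in post_hooks:
        hook(action, result)
    return result


tools = {"write": lambda target: f"wrote to {target}"}

ok = execute(Action("write", "staging/report.csv"), tools,
             [no_production_writes], [log_action])

blocked_reason = None
try:
    execute(Action("write", "prod/customers.db"), tools,
            [no_production_writes], [log_action])
except Blocked as exc:
    blocked_reason = str(exc)
```

The blocked call never reaches the tool and never appears in the audit log, which is the behaviour you want from a guardrail: the rule holds regardless of what the model proposed.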

Evals tell you whether it is any good

You cannot improve what you cannot measure, and "it seemed fine when I tried it" is not measurement. Evals are a test suite for behaviour: a set of representative cases with clear pass/fail criteria that you run every time you change a prompt, a model, or a tool. Without them, every release is a guess.

This is the step most teams skip, and it is the one that separates a demo from something you would trust with a customer.
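An eval suite can start very small. The sketch below assumes the agent can be called as a plain function from prompt to reply; `stub_agent`, the two cases, and their pass criteria are hypothetical placeholders for your own representative cases.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    name: str
    prompt: str
    passes: Callable[[str], bool]  # clear pass/fail check on the agent's reply


def run_evals(agent: Callable[[str], str], cases: list) -> dict:
    """Run every case against the agent and record pass/fail by case name."""
    return {case.name: case.passes(agent(case.prompt)) for case in cases}


def pass_rate(results: dict) -> float:
    return sum(results.values()) / len(results)


# Hypothetical stand-in for the real agent.
def stub_agent(prompt: str) -> str:
    if "refund" in prompt.lower():
        return "I have escalated this refund request to a human."
    return "Sure, done."


cases = [
    Case("refund_is_escalated", "Please refund my order",
         lambda reply: "escalated" in reply),
    Case("asks_for_order_number", "Where is my parcel?",
         lambda reply: "order number" in reply.lower()),
]

results = run_evals(stub_agent, cases)
rate = pass_rate(results)
```

Run against this stub, the second case fails, and that is useful: it is a named, repeatable failure you can watch flip to a pass after the next prompt or model change.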

Where the real work is

The model is a small fraction of an agentic workflow. The larger part is context, tool boundaries, guardrails and measurement: the unglamorous engineering that decides whether an agent quietly saves your team hours or quietly creates a new mess. Treat that split as a rough illustration of the proportions, not a measured ratio.

The short version

The agent loop itself is simple. Context, tool access, hooks and evals are where the real design work lives.
Plain markdown context files are reviewable, diffable, and editable by non-engineers. That is their advantage.
MCP defines what the agent can touch. Keeping that access narrow is a security decision, not a limitation.
Without evals, every change to a prompt or model is a guess. Evals are what make an agent safe to maintain.

That is deliberately high level. The right design depends entirely on the job you are trying to automate, the systems it has to touch, and how wrong it is allowed to be. If you are weighing where an agent could help your business, and just as importantly where it should not, that is exactly the kind of thing we like to talk through.
