
What is an AI Agent?

An AI agent is a program that uses an LLM to decide what to do next at runtime. Instead of following a fixed pipeline, the agent reasons about its goal, picks a tool or action, observes the result, and loops until the goal is met. This makes agents extremely powerful, but hard to run reliably.
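That reason → act → observe loop can be sketched in a few lines of plain Python. This is illustrative only, not Hatchet's API; `llm_decide`, the tool names, and the action shape are hypothetical stand-ins:

```python
def agent_loop(llm_decide, tools, goal, max_steps=10):
    """Run reason -> act -> observe until the LLM signals completion."""
    history = []
    for _ in range(max_steps):
        action = llm_decide(goal, history)   # e.g. {"tool": "add", "input": (2, 3)}
        if action["tool"] == "finish":
            return action["input"]           # final answer
        observation = tools[action["tool"]](action["input"])
        history.append((action, observation))
    raise RuntimeError("step budget exhausted")

# Stub decider: call the (hypothetical) "add" tool once, then finish
# with the observed result.
def stub_llm(goal, history):
    if not history:
        return {"tool": "add", "input": (2, 3)}
    return {"tool": "finish", "input": history[-1][1]}

result = agent_loop(stub_llm, {"add": lambda args: args[0] + args[1]}, "sum two numbers")
```

In production, each iteration of this loop becomes a durable, checkpointed step rather than an in-memory `for` loop, which is what the rest of this page is about.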

Agents generally fall into two categories:

  • Semi-autonomous agents rely on pre-written code that codifies business logic or procedures. The LLM decides which tool or path to take, but every action it can invoke is defined ahead of time.
  • Fully autonomous agents can write and execute arbitrary code. The LLM generates code at runtime, runs it, and acts on the output. These agents are highly flexible but require sandboxing and careful guardrails.
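The semi-autonomous constraint boils down to validating the LLM's chosen action against a fixed registry before dispatching it, so the model can only trigger pre-written code paths. A minimal sketch, with hypothetical tool names:

```python
# Fixed, pre-written tools: the LLM may pick among these but cannot
# add to them at runtime.
ALLOWED_TOOLS = {
    "lookup_order": lambda order_id: {"id": order_id, "status": "shipped"},
    "send_email": lambda message: f"queued: {message}",
}

def dispatch(tool_name, arg):
    """Reject anything outside the allow-list before executing."""
    if tool_name not in ALLOWED_TOOLS:
        raise ValueError(f"tool {tool_name!r} is not in the allow-list")
    return ALLOWED_TOOLS[tool_name](arg)
```

A fully autonomous agent would instead execute model-generated code directly, which is why it needs sandboxing that this pattern avoids.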

You can build both with Hatchet, but most teams choose semi-autonomous agents for the majority of production workloads: they are easier to reason about and test, and they run more reliably.

Agents fail in production when the process hosting them dies mid-loop, when they hold resources for hours or days while waiting on external input, or when a long-running reasoning chain exhausts a timeout. Hatchet solves these problems by making every agent a durable task.

How agents map to Hatchet

Each agent concept maps to a Hatchet primitive:

  • Agent → durable task
  • Reasoning loop → child spawning: the task re-spawns itself until done
  • Tool calls → child tasks, run sequentially or in parallel
  • Human approval gate → WaitForEvent: the slot is freed while waiting
  • Routing by LLM → if/else in code, spawning a child to a different workflow

Why Hatchet for agents

Simple primitives, flexible composition. Hatchet gives you a small set of primitives for managing state and distributing workloads. You compose them however your agent needs, and they scale reliably without custom infrastructure.

Survives crashes. Every step in an agent’s orchestration path is checkpointed. If a worker dies, the agent resumes from the last checkpoint rather than restarting from scratch.
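The idea behind checkpointed resumption can be shown with an in-memory dict standing in for Hatchet's persistence. This is an illustrative sketch, not Hatchet's API: completed step results are stored, so after a crash the loop skips finished steps instead of re-executing them.

```python
def run_with_checkpoints(steps, store):
    """steps: list of (name, fn); store: a dict persisted across restarts."""
    for name, fn in steps:
        if name in store:          # completed before the crash: skip replay
            continue
        store[name] = fn()         # checkpoint the step's result
    return store

# Simulate a crash between steps: step_b fails on its first attempt.
calls = {"a": 0, "b": 0}

def step_a():
    calls["a"] += 1
    return "a-done"

def step_b():
    calls["b"] += 1
    if calls["b"] == 1:
        raise RuntimeError("worker died")
    return "b-done"

store = {}
try:
    run_with_checkpoints([("a", step_a), ("b", step_b)], store)
except RuntimeError:
    pass                           # the "worker" crashed mid-loop

run_with_checkpoints([("a", step_a), ("b", step_b)], store)  # resume
```

On the resumed run, `step_a` is not re-executed: its result comes from the store, and only the failed step is retried.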

Frees slots during waits. When an agent waits on a human approval, an external event, or a scheduled-retry sleep, it is evicted from the worker and its slot is freed. No resources are held while the agent is idle, even if the wait lasts hours or days.
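Conceptually, a non-blocking wait stores a continuation keyed by the event and returns immediately, so no thread or slot stays occupied while the human decides. A sketch with hypothetical names (Hatchet's WaitForEvent handles the real persistence and delivery):

```python
pending = {}   # event key -> continuation to run when the event arrives

def wait_for_approval(event_key, continuation):
    # Register and return immediately; nothing blocks while waiting.
    pending[event_key] = continuation

def deliver(event_key, payload):
    # The event arrived: pop the continuation and resume the agent.
    continuation = pending.pop(event_key)
    return continuation(payload)

wait_for_approval("approval:run-42", lambda decision: f"resumed: {decision}")
```

Between `wait_for_approval` and `deliver`, the only thing held is the registration itself, no matter how long the gap lasts.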

Handles streaming. Pipe LLM tokens from inside the task to connected clients as they’re generated. Hatchet manages the plumbing so you don’t build your own pub/sub layer. See Streaming.
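Under the hood this is a publish/subscribe fan-out: tokens produced inside the task are pushed to every connected client as they are generated. A toy sketch with callbacks standing in for clients (Hatchet manages the actual transport):

```python
subscribers = []   # connected clients, modeled as callbacks

def subscribe(callback):
    subscribers.append(callback)

def publish(token):
    # Fan each token out to every subscriber as it is generated.
    for callback in subscribers:
        callback(token)

received = []
subscribe(received.append)
for token in ["Hello", ",", " world"]:
    publish(token)
```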

Controls concurrency and rate limits. Use CANCEL_IN_PROGRESS on a session key so new user messages cancel stale agent runs. Use GROUP_ROUND_ROBIN to distribute work fairly across users at scale. Add rate limits to stay within external API quotas.
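CANCEL_IN_PROGRESS semantics can be pictured as "newest run per key wins": starting a run on a session key cancels whatever run that key was already executing. An illustrative sketch, with a list standing in for real cancellation:

```python
active = {}      # session key -> the run currently executing for it
cancelled = []   # stand-in for real cancellation side effects

def start_run(session_key, run_id):
    stale = active.get(session_key)
    if stale is not None:
        cancelled.append(stale)   # cancel the superseded run
    active[session_key] = run_id

start_run("user-7", "run-1")
start_run("user-7", "run-2")      # a new user message cancels run-1
```

Keying on the session (rather than the user or a global key) is what makes a user's newest message supersede only their own stale agent loop.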

Full observability. Every child run appears in the Hatchet dashboard. You can trace the full reasoning chain: which tools were called, what the LLM returned, where the loop terminated.

Agent patterns