What is an AI Agent?
An AI agent is a program that uses an LLM to decide what to do next at runtime. Instead of following a fixed pipeline, the agent reasons about its goal, picks a tool or action, observes the result, and loops until the goal is met. This makes agents extremely powerful, but hard to run reliably.
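The loop itself is simple to sketch. Below is a framework-agnostic, minimal sketch of that reason → act → observe cycle; `call_llm` and the `TOOLS` table are hypothetical stubs standing in for a real model and real tools, not Hatchet APIs.

```python
# Minimal reasoning loop: the "LLM" (stubbed here) picks the next
# action at runtime; the loop executes it and feeds the observation
# back until the model signals it is done.

def call_llm(goal, history):
    # Hypothetical stub for a real LLM call: finish once a lookup
    # has been made, otherwise ask for the lookup tool.
    if any(step[0] == "lookup" for step in history):
        return ("finish", history[-1][1])
    return ("lookup", goal)

TOOLS = {"lookup": lambda query: f"result for {query!r}"}

def run_agent(goal):
    history = []
    while True:
        action, arg = call_llm(goal, history)   # reason
        if action == "finish":
            return arg                          # goal met: stop looping
        observation = TOOLS[action](arg)        # act, then observe
        history.append((action, observation))
```

The key property is that control flow is decided by the model at each iteration, not by a fixed pipeline, which is exactly what makes the loop powerful and fragile at the same time.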
Agents generally fall into two categories:
- Semi-autonomous agents rely on pre-written code that codifies business logic or procedures. The LLM decides which tool or path to take, but every action it can invoke is defined ahead of time.
- Fully autonomous agents can write and execute arbitrary code. The LLM generates code at runtime, runs it, and acts on the output. These agents are highly flexible but require sandboxing and careful guardrails.
You can build both with Hatchet, but most teams choose semi-autonomous agents for the majority of production workloads: they are easier to reason about and test, and they run more reliably.
Agents fail in production when the process hosting them dies mid-loop, when they hold resources for hours or days while waiting on external input, or when a long-running reasoning chain exhausts a timeout. Hatchet solves these problems by making every agent a durable task.
How agents map to Hatchet
| Agent concept | Hatchet primitive |
|---|---|
| Agent | Durable task |
| Reasoning loop | Child spawning: task re-spawns itself until done |
| Tool calls | Child tasks: sequential or parallel |
| Human approval gate | WaitForEvent: slot freed while waiting |
| Routing by LLM | if/else in code + spawn child to different workflows |
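The last row of the table, routing by LLM, is ordinary code. As a hedged illustration (the `classify` function and the specialist functions below are hypothetical stubs, not Hatchet primitives), routing reduces to an if/else over a classification result, with each branch spawning a different specialist:

```python
# Routing sketch: a classifier (stubbed) labels the request, and
# plain if/else picks which specialist handles it. In Hatchet, each
# branch would spawn a child task to a different workflow.

def classify(message):
    # Hypothetical stand-in for an LLM or rule-based classifier.
    return "billing" if "invoice" in message else "support"

def billing_agent(message):
    return f"billing handled: {message}"

def support_agent(message):
    return f"support handled: {message}"

def route(message):
    label = classify(message)
    if label == "billing":
        return billing_agent(message)   # branch to the billing specialist
    return support_agent(message)       # default specialist
```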
Why Hatchet for agents
Simple primitives, flexible composition. Hatchet gives you a small set of primitives for managing state and distributing workloads. You compose them however your agent needs, and they scale reliably without custom infrastructure.
Survives crashes. Every step in an agent’s orchestration path is checkpointed. If a worker dies, the agent resumes from the last checkpoint rather than restarting from scratch.
Frees slots during waits. When an agent waits for a human approval or other external event, or sleeps before a scheduled retry, the worker slot is released. No resources are held while the agent is idle, even if the wait lasts hours or days.
Handles streaming. Pipe LLM tokens from inside the task to connected clients as they’re generated. Hatchet manages the plumbing so you don’t build your own pub/sub layer. See Streaming.
Controls concurrency and rate limits. Use CANCEL_IN_PROGRESS on a session key so new user messages cancel stale agent runs. Use GROUP_ROUND_ROBIN to distribute work fairly across users at scale. Add rate limits to stay within external API quotas.
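The cancel-on-new-message behavior can be modeled in plain Python. This sketch mimics CANCEL_IN_PROGRESS semantics keyed on a session id inside a single process; the function names are illustrative, and in production Hatchet enforces this across workers for you.

```python
import asyncio

# Sketch of CANCEL_IN_PROGRESS keyed on a session id: starting a
# new run for the same key cancels the stale in-progress run.

running: dict[str, asyncio.Task] = {}

async def agent_run(session, message):
    await asyncio.sleep(0.05)   # stand-in for a long agent loop
    return f"{session}: {message}"

def start(session, message):
    stale = running.get(session)
    if stale is not None and not stale.done():
        stale.cancel()          # cancel the in-progress run for this key
    task = asyncio.create_task(agent_run(session, message))
    running[session] = task
    return task
```

Keying cancellation on the session means a user who sends a follow-up message never waits behind a stale run answering their previous one.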
Full observability. Every child run appears in the Hatchet dashboard. You can trace the full reasoning chain: which tools were called, what the LLM returned, where the loop terminated.
Agent patterns
- Reasoning Loop: The core agent pattern. Reason → act → observe → repeat until done. Includes the evaluator-optimizer variant.
- Routing: Classify incoming requests with an LLM or rule, then route to a specialist.
- Multi-Agent: An orchestrator delegates to specialist workflows. Each specialist has its own prompt and tools.
- Parallelization: Fan out independent tool calls or sub-tasks in parallel. Aggregate results before the agent continues.
- Human-in-the-Loop: Pause the agent for human approval. The slot is freed; the agent resumes when the event arrives.
- Streaming: Pipe LLM tokens to frontends in real time.
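As one worked example of the patterns above, the fan-out-and-aggregate shape of parallelization can be sketched with `asyncio.gather`. The tool functions here are hypothetical stubs; in Hatchet the fan-out would spawn child tasks instead of coroutines.

```python
import asyncio

# Parallelization sketch: fan out independent tool calls, then
# aggregate their results before the agent continues.

async def fetch_docs(query):
    return [f"doc for {query}"]      # stub for a documentation search

async def fetch_tickets(query):
    return [f"ticket for {query}"]   # stub for a ticket lookup

async def gather_context(query):
    docs, tickets = await asyncio.gather(
        fetch_docs(query), fetch_tickets(query)
    )
    return docs + tickets            # aggregated input for the next step
```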