
LLM Pipelines

LLM pipelines chain multiple model calls together with validation, retries, and structured outputs. Hatchet turns each step into a durable task so failures retry individually, rate limits protect provider APIs, and the full pipeline is observable in the dashboard.

Because each LLM call maps to a task and validation steps gate what runs next, these pipelines are a natural fit for DAG Workflows.

Step-by-step walkthrough

You’ll build a three-stage DAG pipeline (prompt, generate, validate) using a mock LLM so you can run it without API keys.
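The shape of the three stages, stripped of framework wiring, looks like the sketch below. All names here are illustrative; in Hatchet each function would become a durable task with the validate stage depending on generate, and generate on prompt. The mock LLM means it runs without API keys.

```python
import json

def mock_llm(prompt: str) -> str:
    # Stand-in for a real provider call: echoes part of the prompt as JSON.
    return json.dumps({"summary": prompt[:40]})

def prompt_stage(user_input: str) -> str:
    # Stage 1: build the prompt from pipeline input.
    return f"Summarize the following request:\n{user_input}"

def generate_stage(prompt: str) -> str:
    # Stage 2: call the (mock) LLM.
    return mock_llm(prompt)

def validate_stage(raw: str) -> dict:
    # Stage 3: gate what runs next; a failure here retries only this step.
    parsed = json.loads(raw)
    assert "summary" in parsed, "missing 'summary' field"
    return parsed

def run_pipeline(user_input: str) -> dict:
    return validate_stage(generate_stage(prompt_stage(user_input)))
```

In Hatchet, the chaining in `run_pipeline` would instead be expressed as task dependencies in a DAG workflow, so each stage gets its own retries, timeouts, and dashboard visibility.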

Define the pipeline

Create a workflow with prompt construction, LLM generation, and validation stages.

Prompt task

The prompt task depends on the pipeline input (Step 1). Build the prompt from user input and context. This step may include retrieval from a vector database (see RAG & Indexing).
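A minimal sketch of prompt construction, assuming a question plus a list of retrieved context snippets (`build_prompt` and its parameters are illustrative, not part of Hatchet):

```python
def build_prompt(question: str, context: list[str]) -> str:
    # Render retrieved snippets as a bulleted context block, then append the question.
    context_block = "\n".join(f"- {c}" for c in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )
```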

The prompt is then passed to your LLM service for generation. The examples above use a mock; to use a real provider, swap the implementation behind get_llm_service() for a real client, such as OpenAI's:

OpenAI’s Chat Completions API provides access to GPT models for text generation, function calling, and structured outputs. It’s the most widely adopted LLM API and supports streaming, tool use, and JSON mode.
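One way to structure the swap is a small interface with a mock and a real implementation behind a factory. The class names and `use_mock` flag are assumptions for illustration; only the Chat Completions call itself (`client.chat.completions.create`) is the real OpenAI API.

```python
class LLMService:
    """Minimal interface the pipeline expects from any provider."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError

class OpenAIChatService(LLMService):
    def __init__(self, model: str = "gpt-4o-mini"):
        # Imported lazily so the mock path needs neither the package nor an API key.
        from openai import OpenAI
        self.client = OpenAI()
        self.model = model

    def complete(self, prompt: str) -> str:
        resp = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

def get_llm_service(use_mock: bool = True) -> LLMService:
    if use_mock:
        class _Mock(LLMService):
            def complete(self, prompt: str) -> str:
                return '{"answer": "mock"}'
        return _Mock()
    return OpenAIChatService()
```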

Generate and validate

This task takes the prompt from Step 2, calls the LLM, and validates the response. If validation fails, Retry Policies retry just this step with a corrective prompt.
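The corrective-retry idea can be sketched as a plain function; in Hatchet you would let the task's retry policy re-run the step rather than looping inline, but the mechanics are the same. JSON parsing stands in for validation, and `max_attempts` is an illustrative parameter:

```python
import json

def generate_and_validate(llm_complete, prompt: str, max_attempts: int = 3) -> dict:
    last_error = None
    for _ in range(max_attempts):
        # On retry, append the validation error so the model can self-correct.
        effective_prompt = prompt if last_error is None else (
            f"{prompt}\n\nYour previous response was invalid: {last_error}\n"
            "Respond with valid JSON only."
        )
        raw = llm_complete(effective_prompt)
        try:
            return json.loads(raw)  # validation gate: must parse as JSON
        except json.JSONDecodeError as exc:
            last_error = str(exc)
    raise ValueError(f"validation failed after {max_attempts} attempts: {last_error}")
```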

Run the worker

Start the worker. Configure Rate Limits to stay within LLM provider quotas.
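Hatchet's rate limits are declared on tasks and enforced by the engine; as a framework-agnostic illustration of the underlying idea, a minimal token bucket that caps request throughput to a provider looks like this (all names are stand-ins):

```python
import time

class TokenBucket:
    def __init__(self, rate_per_s: float, capacity: int):
        self.rate = rate_per_s          # tokens refilled per second
        self.capacity = capacity        # burst ceiling
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def acquire(self) -> None:
        """Block until a token is available, then consume it."""
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            time.sleep((1 - self.tokens) / self.rate)
```

Calling `bucket.acquire()` before each LLM request smooths bursts to at most `rate_per_s` sustained calls, which is the same contract a provider quota enforces on its side.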

⚠️

Always set timeouts on LLM call steps. Model providers can hang or respond slowly under load. See Timeouts for configuration.
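Hatchet tasks take their own timeout configuration; the sketch below shows the underlying failure mode being guarded against, using a worker thread to impose a hard deadline on a hanging call (`call_with_timeout` is an illustrative helper, not a Hatchet API):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def call_with_timeout(fn, *args, timeout_s: float = 30.0):
    # Run the call in a worker thread and give up after timeout_s seconds.
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn, *args)
        try:
            return future.result(timeout=timeout_s)
        except FutureTimeout:
            raise TimeoutError(f"LLM call exceeded {timeout_s}s")
```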

Common Patterns

| Pattern | Description |
| --- | --- |
| Generate → Validate | Call LLM, validate structured output, retry with error context on failure |
| Chain of thought | Multi-step reasoning where each LLM call refines the previous output |
| Parallel evaluation | Fan out the same prompt to multiple models, then pick the best response |
| Translation pipeline | Generate content in one language, translate to others in parallel |
| Summarize → Classify | Summarize long text, then classify the summary for routing or tagging |
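As one concrete example, the parallel-evaluation pattern fans the same prompt out to several models and keeps the best-scoring response. The model callables and scoring function here are stand-ins; in Hatchet each model call would be its own task fanning out from a shared parent:

```python
from concurrent.futures import ThreadPoolExecutor

def best_of(prompt: str, models: list, score) -> str:
    # Fan out: query every model concurrently with the same prompt.
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        responses = list(pool.map(lambda m: m(prompt), models))
    # Fan in: keep the response the scoring function likes best.
    return max(responses, key=score)
```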

Next Steps