

RAG & Data Indexing

RAG and indexing pipelines share a common shape: ingest documents, split them into chunks, generate embeddings, and write to a vector database. Because the stages are known upfront, these pipelines map naturally to a DAG workflow, where each stage is a task and dependencies between stages are declared before execution begins.

You declare the full graph (ingest → chunk → embed → index) and Hatchet executes tasks in order, running independent tasks in parallel automatically. You can add fanout within the chunking stage to process documents in parallel.
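The overall shape can be sketched in plain Python before any Hatchet wiring. This is a minimal illustration with mocked stages; each function below would become a task in the declared DAG, and all names are hypothetical.

```python
# Plain-Python sketch of the four pipeline stages (no Hatchet wiring).
# Each function corresponds to one task in the DAG; names are illustrative.

def ingest(doc_ids: list[str]) -> list[str]:
    # Resolve document references to raw text (mocked here).
    return [f"contents of {doc_id}" for doc_id in doc_ids]

def chunk(text: str, size: int = 20) -> list[str]:
    # Split one document into fixed-size chunks; runs once per document.
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunk_text: str) -> list[float]:
    # Mock embedding: a deterministic vector derived from the text.
    return [float(ord(c)) for c in chunk_text[:4]]

def index(vectors: list[list[float]]) -> int:
    # Write vectors to a (mock) vector store; return the count indexed.
    return len(vectors)

docs = ingest(["doc-1", "doc-2"])
chunks = [c for d in docs for c in chunk(d)]
indexed = index([embed(c) for c in chunks])
```

In the real workflow, Hatchet tracks the edges between these stages and schedules each task as its parents complete.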

Step-by-step walkthrough

You’ll define a workflow, then add tasks for ingesting, chunking, embedding, and querying, all using a mock embedding client so you can run it without API keys.

Define the workflow

Define your input type and create an empty DAG workflow. You’ll add tasks to this workflow in the following steps.
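As a sketch of the input type, here is a plain dataclass. Hatchet's Python SDK typically takes a Pydantic model as the workflow's input validator; a dataclass stands in so this runs with no extra dependencies, and `RagInput`/`document_ids` are hypothetical names.

```python
from dataclasses import dataclass, field

# Hypothetical input type for the pipeline: the trigger payload carries
# a list of document references for the ingest task to resolve.
@dataclass
class RagInput:
    document_ids: list[str] = field(default_factory=list)

inp = RagInput(document_ids=["doc-1", "doc-2"])
```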

Define the ingest task

Add a task that ingests documents. A trigger (event, cron, or API call) starts the pipeline with document references.
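A minimal sketch of the ingest step's body, with the document loader mocked as an in-memory dict. In Hatchet this logic would live inside a task on the workflow; the `DOCS` store and function names are illustrative.

```python
# Mock document store standing in for a real loader (S3, database, etc.).
DOCS = {
    "doc-1": "Hatchet runs DAG workflows.",
    "doc-2": "Chunks are embedded and indexed.",
}

def ingest(document_ids: list[str]) -> dict[str, str]:
    # Resolve each document reference from the trigger payload to raw text,
    # returning a mapping of document id -> text for downstream chunking.
    return {doc_id: DOCS[doc_id] for doc_id in document_ids}

result = ingest(["doc-1"])
```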

Chunk the documents

The ingest task (Step 2) fans out to one child per document. Each child splits its document into chunks. Use child spawning for per-document parallelism.
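The fanout pattern can be sketched with one coroutine per document; in Hatchet, each coroutine would instead be a spawned child task. The splitter uses fixed-size chunks with overlap, and all sizes and names here are assumptions.

```python
import asyncio

def split_into_chunks(text: str, size: int = 16, overlap: int = 4) -> list[str]:
    # Fixed-size chunks with a small overlap between neighbors.
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step) if text[i:i + size]]

async def chunk_document(doc_id: str, text: str) -> tuple[str, list[str]]:
    # One "child" per document; Hatchet would run this as a spawned child task.
    return doc_id, split_into_chunks(text)

async def fan_out(docs: dict[str, str]) -> dict[str, list[str]]:
    results = await asyncio.gather(
        *(chunk_document(d, t) for d, t in docs.items())
    )
    return dict(results)

chunks = asyncio.run(fan_out({"doc-1": "abcdefghijklmnopqrst"}))
```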

Embed and index

Define a standalone embed-chunk task, then spawn one child task per chunk from the DAG’s chunk-and-embed task. Each child runs on any available worker and is individually retryable, so a single embedding failure does not restart the entire batch. Rate Limits throttle embedding API calls across all workers.
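The mock embedding client can be as simple as a deterministic hash-derived vector, so the pipeline runs without API keys. Each chunk is embedded independently, mirroring the one-child-task-per-chunk fanout; `mock_embed` and the dimension count are illustrative.

```python
import hashlib

def mock_embed(text: str, dims: int = 8) -> list[float]:
    # Deterministic pseudo-embedding: a stable vector derived from a hash
    # of the chunk text, normalized into [0, 1].
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest[:dims]]

def embed_chunks(chunks: list[str]) -> list[list[float]]:
    # One embedding per chunk; each call is the body of one child task.
    return [mock_embed(c) for c in chunks]

vectors = embed_chunks(["alpha", "beta"])
```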

The examples above use a mock embedding client. To use a real provider, swap get_embedding_service() for one of the clients below.

OpenAI’s Embeddings API converts text into high-dimensional vectors. It supports configurable dimensions and is a popular default for semantic search and RAG pipelines.
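For reference, the request body for OpenAI's Embeddings API (`POST https://api.openai.com/v1/embeddings`, authorized with a Bearer token) looks like the sketch below. No network call is made here; the model name and dimension count are example values ("dimensions" is supported by the text-embedding-3 models).

```python
import json

def build_embeddings_request(texts: list[str],
                             model: str = "text-embedding-3-small",
                             dimensions: int = 256) -> str:
    # JSON payload for POST https://api.openai.com/v1/embeddings.
    # Send with an "Authorization: Bearer <API key>" header.
    return json.dumps({"model": model, "input": texts, "dimensions": dimensions})

payload = build_embeddings_request(["hello world"])
```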

Query

Add a query task that reuses the same embed-chunk child task to embed the query, then performs a vector similarity search. In production, replace the empty results with a real vector DB lookup.
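The query side can be sketched as: embed the query with the same mock embedder used for the chunks, then rank stored vectors by cosine similarity. The in-memory `INDEX` dict stands in for a real vector database, and all names are illustrative.

```python
import math

def mock_embed(text: str, dims: int = 8) -> list[float]:
    # Same deterministic mock embedder used at indexing time.
    return [ord(text[i % len(text)]) / 255.0 for i in range(dims)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# In-memory stand-in for a vector database: chunk text -> embedding.
INDEX = {c: mock_embed(c) for c in ["hatchet runs workflows",
                                    "embeddings power search"]}

def query(q: str, top_k: int = 1) -> list[str]:
    # Embed the query, then return the most similar stored chunks.
    qv = mock_embed(q)
    ranked = sorted(INDEX, key=lambda c: cosine(qv, INDEX[c]), reverse=True)
    return ranked[:top_k]

best = query("hatchet runs workflows")
```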

Run the worker

Start the worker and register the DAG workflow, the embed-chunk child task, and the rag-query task.

⚠️

When fanning out to many chunks, ensure your workers have enough slots or use Concurrency Control to limit how many run simultaneously.
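The slot-limiting idea can be illustrated with a semaphore capping how many embeds run at once, analogous to limiting worker slots or applying a concurrency control to the fanout. The limit value and helper names are assumptions.

```python
import asyncio

async def embed_with_limit(chunks: list[str],
                           max_concurrent: int = 2) -> tuple[list[int], int]:
    # Cap in-flight embeds with a semaphore; track the observed peak.
    sem = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def one(chunk: str) -> int:
        nonlocal in_flight, peak
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0)  # stand-in for the embedding call
            in_flight -= 1
            return len(chunk)

    results = await asyncio.gather(*(one(c) for c in chunks))
    return results, peak

lengths, peak = asyncio.run(embed_with_limit(["aa", "bbb", "c", "dddd", "ee"]))
```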

Multi-Tenant Indexing

For SaaS applications where multiple tenants share the same pipeline:

  • GROUP_ROUND_ROBIN concurrency distributes scheduling fairly so no single tenant monopolizes workers
  • Additional metadata tags each run with a tenant ID for filtering in the dashboard
  • Priority queues allow higher-priority indexing jobs to run ahead of lower-priority ones
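The fairness property of round-robin grouping can be sketched by interleaving queued jobs across tenants, so a tenant with a large backlog never starves the others. This is an illustration of the scheduling behavior, not Hatchet's implementation.

```python
from collections import deque

def round_robin(queues: dict[str, list[str]]) -> list[str]:
    # Take one job per tenant per pass until every queue drains.
    pending = {tenant: deque(jobs) for tenant, jobs in queues.items()}
    order: list[str] = []
    while any(pending.values()):
        for q in pending.values():
            if q:
                order.append(q.popleft())
    return order

schedule = round_robin({
    "tenant-a": ["a1", "a2", "a3"],
    "tenant-b": ["b1"],
    "tenant-c": ["c1", "c2"],
})
```

Note how tenant-a's three jobs are spread out rather than running back-to-back, which is the effect GROUP_ROUND_ROBIN concurrency has on worker scheduling.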

Next Steps