RAG & Data Indexing
RAG and indexing pipelines share a common shape: ingest documents, split them into chunks, generate embeddings, and write to a vector database. Because the stages are known upfront, these pipelines map naturally to a DAG workflow, where each stage is a task and dependencies between stages are declared before execution begins.
You declare the full graph (ingest → chunk → embed → index) and Hatchet executes tasks in order, running independent tasks in parallel automatically. You can add fanout within the chunking stage to process documents in parallel.
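Since the four stages form a linear dependency chain, the structure can be sketched as a plain dependency map. This is illustrative only — Hatchet declares dependencies through task parent relationships, not a dict — but it shows what "known upfront" means: the execution order is computable before anything runs.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on.
pipeline = {
    "ingest": set(),
    "chunk": {"ingest"},
    "embed": {"chunk"},
    "index": {"embed"},
}

# A topological sort yields a valid execution order for the DAG.
order = list(TopologicalSorter(pipeline).static_order())
print(order)  # ingest first, index last
```

In a real DAG, stages with no dependency between them would be free to run in parallel; here the chain is linear, so the order is fully determined.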
Step-by-step walkthrough
You’ll define a workflow, then add tasks for ingesting, chunking, embedding, and querying, all using a mock embedding client so you can run it without API keys.
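A mock embedding client only needs to be deterministic and return fixed-size vectors. The sketch below is one possible implementation (the class body is an assumption, not the docs' exact code; only the `get_embedding_service()` helper name comes from this page): it hashes the text into repeatable vector components and L2-normalizes them so dot products behave like cosine similarity.

```python
import hashlib
import math


class MockEmbeddingService:
    """Deterministic stand-in for a real embedding API; needs no API key."""

    def __init__(self, dimensions: int = 8) -> None:
        self.dimensions = dimensions

    def embed(self, text: str) -> list[float]:
        # Hash the text so the same input always yields the same vector.
        digest = hashlib.sha256(text.encode()).digest()
        raw = [digest[i % len(digest)] / 255.0 for i in range(self.dimensions)]
        # L2-normalize so dot products act as cosine similarity.
        norm = math.sqrt(sum(x * x for x in raw)) or 1.0
        return [x / norm for x in raw]


def get_embedding_service() -> MockEmbeddingService:
    return MockEmbeddingService()
```

Swapping in a real provider later means only replacing what `get_embedding_service()` returns; the task code stays the same.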
Define the workflow
Define your input type and create an empty DAG workflow. You’ll add tasks to this workflow in the following steps.
Define the ingest task
Add a task that ingests documents. A trigger (event, cron, or API call) starts the pipeline with document references.
Chunk the documents
The ingest task (Step 2) fans out to one child per document. Each child splits its document into chunks. Use child spawning for per-document parallelism.
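The body of each per-document child is a plain chunking function. A minimal sketch (the function name and parameters are illustrative, not Hatchet API) that splits text into fixed-size chunks with overlap, so context that straddles a boundary appears in both neighboring chunks:

```python
def chunk_document(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks; overlapping windows preserve
    context that would otherwise be cut at a chunk boundary."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Each spawned child runs this over its own document, so chunking work scales out with the number of documents rather than running serially in one task.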
Embed and index
Define a standalone embed-chunk task, then spawn one child task per chunk from the DAG’s chunk-and-embed task. Each child runs on any available worker and is individually retryable, so a single embedding failure does not restart the entire batch. Rate Limits throttle embedding API calls across all workers.
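Hatchet handles per-child retries for you; the isolation property can be sketched locally as a per-chunk retry wrapper (a hypothetical helper, not Hatchet API). The point is that a transient failure on one chunk triggers a retry of that chunk alone, never of the whole batch:

```python
def embed_with_retry(embed_fn, chunk: str, max_attempts: int = 3) -> list[float]:
    """Retry one chunk's embedding call; failures never affect other chunks."""
    for attempt in range(1, max_attempts + 1):
        try:
            return embed_fn(chunk)
        except Exception:
            # Re-raise only after exhausting attempts for this chunk.
            if attempt == max_attempts:
                raise
```

With one child task per chunk, this retry boundary is what Hatchet gives you for free: the failed unit of work is a single embedding call, not the batch.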
The examples above use a mock embedding client. To use a real provider, replace `get_embedding_service()` with one of the clients below. Pick a provider, then your language:
OpenAI’s Embeddings API converts text into high-dimensional vectors. It supports configurable dimensions and is a popular default for semantic search and RAG pipelines.
Query
Add a query task that reuses the same embed-chunk child task to embed the query, then performs a vector similarity search. In production, replace the empty results with a real vector DB lookup.
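Until a real vector DB is wired in, the similarity search can be a brute-force scan. A sketch of what the query task's lookup replaces (function name and index shape are assumptions): with L2-normalized vectors, the dot product equals cosine similarity, so sorting by it ranks the closest chunks first.

```python
def top_k(
    query_vec: list[float],
    index: dict[str, list[float]],
    k: int = 3,
) -> list[tuple[str, float]]:
    """Brute-force similarity search; stands in for a vector DB query.
    Assumes all vectors are L2-normalized, so dot product == cosine."""
    def dot(a: list[float], b: list[float]) -> float:
        return sum(x * y for x, y in zip(a, b))

    scored = [(chunk_id, dot(query_vec, vec)) for chunk_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Swapping this for a real vector DB changes only the query task's lookup call; embedding the query via the shared embed-chunk task stays the same.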
Run the worker
Start the worker and register the DAG workflow, the embed-chunk child task, and the rag-query task.
When fanning out to many chunks, ensure your workers have enough slots or use Concurrency Control to limit how many run simultaneously.
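The effect of a concurrency cap can be sketched locally with a semaphore (an analogue of worker slots and Concurrency Control, not Hatchet's mechanism): no matter how many chunks are in flight, at most `limit` embedding calls run at once.

```python
import asyncio


async def embed_all(chunks: list[str], limit: int = 4) -> tuple[list[list[float]], int]:
    """Embed all chunks, but allow at most `limit` concurrent calls.
    Returns the results and the peak observed concurrency."""
    sem = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def embed_one(chunk: str) -> list[float]:
        nonlocal in_flight, peak
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the embedding API call
            in_flight -= 1
            return [float(len(chunk))]

    results = await asyncio.gather(*(embed_one(c) for c in chunks))
    return results, peak
```

In Hatchet the cap is enforced by worker slots or a concurrency key rather than an in-process semaphore, but the backpressure behavior is the same: excess chunks wait instead of overwhelming the embedding API.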
Multi-Tenant Indexing
For SaaS applications where multiple tenants share the same pipeline:
- GROUP_ROUND_ROBIN concurrency distributes scheduling fairly so no single tenant monopolizes workers
- Additional metadata tags each run with a tenant ID for filtering in the dashboard
- Priority queues allow higher-priority indexing jobs to run ahead of lower-priority ones
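The fairness property of round-robin scheduling can be sketched with a local analogue (Hatchet enforces GROUP_ROUND_ROBIN server-side; this helper is purely illustrative): per-tenant queues are interleaved, so a tenant with a large backlog cannot starve the others.

```python
from collections import deque


def round_robin_order(queues: dict[str, list[str]]) -> list[str]:
    """Interleave per-tenant job queues one job at a time,
    so no single tenant monopolizes the schedule."""
    pending = {tenant: deque(jobs) for tenant, jobs in queues.items()}
    order = []
    while pending:
        # One pass = one job from each tenant that still has work.
        for tenant in list(pending):
            order.append(pending[tenant].popleft())
            if not pending[tenant]:
                del pending[tenant]
    return order
```

A tenant that enqueues 1,000 documents still yields the scheduler to a tenant that enqueues one: that tenant's job runs in the first round, not after the backlog drains.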
Related Patterns
- DAG Workflows: declare tasks and dependencies upfront so Hatchet can execute them in order
- Fanout: parallelize document and chunk processing across your worker fleet
- Cycles: implement incremental indexing that re-crawls until all changes are processed
- Batch Processing: general-purpose batch processing patterns that apply to indexing workloads
- Document Processing: extract and transform documents (invoices, contracts, forms), distinct from RAG’s chunk-and-embed for retrieval

Next Steps
- DAG Workflows: define multi-stage pipelines
- Rate Limits: configure rate limiting for embedding APIs
- Child Spawning: fan out to per-document tasks
- Concurrency Control: fair scheduling for multi-tenant indexing