
DAGs as Durable Workflows

A directed acyclic graph (DAG) is a workflow where every task, along with the dependencies between them, is declared upfront, before the workflow runs. At runtime, Hatchet schedules the tasks in the right order, runs tasks that don’t depend on each other in parallel, and passes the outputs of parent tasks to their children automatically.

[Workflow diagram: Task A runs at the start; Task B and Task C both depend on A and fan out in parallel; Task D runs sequentially after them.]

DAGs in Hatchet are a form of durable execution by definition. Every time a task in a DAG completes, Hatchet persists its status and result, so the DAG can be retried without re-executing the tasks that already succeeded. This gives DAGs the same exactly-once-style guarantees you’d get from a durable task, without having to write any explicit durable execution code yourself.

On top of those durable properties, DAGs come bundled with a set of workflow-building features you’d otherwise have to construct by hand: automatic parallelism between independent tasks, typed inputs and outputs that flow from parents to children, branching via parent conditions, or groups for expressing complex wait conditions, and a clear visual representation of the workflow in the Hatchet dashboard.

Common good fits for DAGs include ETL pipelines, such as document or image processing, and CI/CD-style workflows.

Defining a workflow

To define a DAG, start by declaring a workflow.
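
As a minimal sketch using the Python SDK (the workflow name and input model are illustrative, and exact signatures may differ between SDK versions):

```python
from hatchet_sdk import Hatchet
from pydantic import BaseModel

# The client reads connection settings (e.g. HATCHET_CLIENT_TOKEN)
# from the environment.
hatchet = Hatchet()

# Workflow input is declared as a Pydantic model, so every task in
# the DAG receives typed input.
class DagInput(BaseModel):
    message: str

# Declare the workflow itself; tasks are attached to it afterwards.
dag_workflow = hatchet.workflow(name="dag-workflow", input_validator=DagInput)
```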

Defining a task

Once you have a workflow, you can add tasks to it. Each task is a function that receives the workflow’s input and optionally returns an output. Just like a standalone task, every task in a DAG can declare its own retries, timeouts, concurrency settings, and so on. For more on how tasks work, see the tasks documentation.
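
Sketched in Python, with the workflow declaration repeated for completeness (the per-task options shown, like `retries`, are illustrative; check your SDK version for exact parameter names):

```python
from datetime import timedelta

from hatchet_sdk import Context, Hatchet
from pydantic import BaseModel

hatchet = Hatchet()

class DagInput(BaseModel):
    message: str

dag_workflow = hatchet.workflow(name="dag-workflow", input_validator=DagInput)

# Each task is a function that receives the workflow's input and the
# run context, and can declare its own retries and timeouts.
@dag_workflow.task(retries=3, execution_timeout=timedelta(seconds=30))
def greet(input: DagInput, ctx: Context) -> dict:
    return {"greeting": f"Hello, {input.message}!"}
```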

Adding task dependencies

Once you have more than one task on a workflow, you can start to declare dependencies between them. A task can declare one or more parent tasks, meaning that those parent tasks must complete successfully before the child task is allowed to run. Tasks that don’t depend on each other will be run in parallel. When a task runs, the outputs of its parents are available on its context object, so data flows naturally from one part of the DAG to the next.
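
For example (a sketch assuming the Python SDK, where `EmptyModel` is the SDK's placeholder input type and `ctx.task_output` reads a parent's result):

```python
from hatchet_sdk import Context, EmptyModel, Hatchet

hatchet = Hatchet()
wf = hatchet.workflow(name="dag-workflow")

@wf.task()
def task_a(input: EmptyModel, ctx: Context) -> dict:
    return {"value": 1}

@wf.task()
def task_b(input: EmptyModel, ctx: Context) -> dict:
    return {"value": 2}

# task_c lists both A and B as parents: it runs only after both have
# succeeded, while A and B (which are independent) run in parallel.
@wf.task(parents=[task_a, task_b])
def task_c(input: EmptyModel, ctx: Context) -> dict:
    a = ctx.task_output(task_a)
    b = ctx.task_output(task_b)
    return {"sum": a["value"] + b["value"]}
```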

Running a workflow

Once you’ve defined a workflow and registered it on a worker, you can trigger it in all of the same ways you can trigger a standalone task. You can run it and wait for the result, enqueue it and let it run in the background, schedule it for the future, and so on. See the Running Tasks documentation for the full set of options.
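
A sketch of the trigger options in Python (the method names `run`, `run_no_wait`, and `schedule` follow the V1 Python SDK and should be treated as assumptions to verify against your SDK version):

```python
from datetime import datetime, timedelta, timezone

from hatchet_sdk import Context, EmptyModel, Hatchet

hatchet = Hatchet()
wf = hatchet.workflow(name="dag-workflow")

@wf.task()
def step(input: EmptyModel, ctx: Context) -> dict:
    return {"done": True}

# Elsewhere, register the workflow on a worker:
# hatchet.worker("my-worker", workflows=[wf]).start()

# Run the whole DAG and block until the result is available:
result = wf.run(EmptyModel())

# Enqueue it and let it run in the background:
ref = wf.run_no_wait(EmptyModel())

# Schedule it to run in the future:
wf.schedule(datetime.now(timezone.utc) + timedelta(hours=1), input=EmptyModel())
```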

Branching with parent conditions

Even though the structure of a DAG is fixed when you define it, individual tasks can still decide at runtime whether or not they should run, based on data produced earlier in the workflow. Parent conditions let a task inspect the output of one of its parents and either skip itself or cancel itself based on what that output contains. There are two operators available:

  • skip_if: Skip this task if the parent’s output matches the condition. Downstream tasks can check whether a parent was skipped and behave accordingly.
  • cancel_if: Cancel this task if the parent’s output matches the condition.
⚠️

A task cancelled by cancel_if behaves like any other cancellation in Hatchet, meaning its downstream dependents will be cancelled as well.

A common way to use parent conditions is to express sibling branches in a DAG. You declare a base task that returns some data, and then add two sibling tasks with complementary skip_if conditions, so that exactly one of them runs on any given workflow execution.

First, declare the base task that returns the value the branches will key off of:
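
A sketch in Python (the workflow and task names are illustrative):

```python
import random

from hatchet_sdk import Context, EmptyModel, Hatchet

hatchet = Hatchet()
branch_wf = hatchet.workflow(name="branching-workflow")

@branch_wf.task()
def base(input: EmptyModel, ctx: Context) -> dict:
    # The sibling branches below key off this value.
    return {"random_number": random.randint(0, 100)}
```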

Then add two branches that each use a skip_if parent condition to decide whether they should run, based on the output of the base task:
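
Continuing the sketch, with the base task repeated so the example is self-contained (`ParentCondition` and its import path are assumptions to check against your SDK version; the `expression` string is a CEL-style expression over the parent's output):

```python
import random

from hatchet_sdk import Context, EmptyModel, Hatchet
# Condition types; the import path may differ between SDK versions.
from hatchet_sdk.conditions import ParentCondition

hatchet = Hatchet()
branch_wf = hatchet.workflow(name="branching-workflow")

@branch_wf.task()
def base(input: EmptyModel, ctx: Context) -> dict:
    return {"random_number": random.randint(0, 100)}

# Skipped when the number is 50 or below...
@branch_wf.task(
    parents=[base],
    skip_if=[ParentCondition(parent=base, expression="output.random_number <= 50")],
)
def high_branch(input: EmptyModel, ctx: Context) -> dict:
    return {"branch": "high"}

# ...with the complementary condition here, so that exactly one of
# the two branches runs on any given workflow execution.
@branch_wf.task(
    parents=[base],
    skip_if=[ParentCondition(parent=base, expression="output.random_number > 50")],
)
def low_branch(input: EmptyModel, ctx: Context) -> dict:
    return {"branch": "low"}
```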

Checking if a parent was skipped

When two sibling branches both feed into a common downstream task, the downstream task often needs to know which of its parents actually ran, as opposed to which one was skipped. You can check this on the context with ctx.was_skipped:
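
A self-contained sketch in Python (condition imports and `ctx.was_skipped` are assumptions to verify against your SDK version):

```python
import random

from hatchet_sdk import Context, EmptyModel, Hatchet
from hatchet_sdk.conditions import ParentCondition

hatchet = Hatchet()
branch_wf = hatchet.workflow(name="branching-workflow")

@branch_wf.task()
def base(input: EmptyModel, ctx: Context) -> dict:
    return {"random_number": random.randint(0, 100)}

@branch_wf.task(
    parents=[base],
    skip_if=[ParentCondition(parent=base, expression="output.random_number <= 50")],
)
def high_branch(input: EmptyModel, ctx: Context) -> dict:
    return {"branch": "high"}

@branch_wf.task(
    parents=[base],
    skip_if=[ParentCondition(parent=base, expression="output.random_number > 50")],
)
def low_branch(input: EmptyModel, ctx: Context) -> dict:
    return {"branch": "low"}

# The merge task depends on both siblings and checks which one
# actually ran before reading its output.
@branch_wf.task(parents=[high_branch, low_branch])
def merge(input: EmptyModel, ctx: Context) -> dict:
    if ctx.was_skipped(high_branch):
        return {"took": "low", "output": ctx.task_output(low_branch)}
    return {"took": "high", "output": ctx.task_output(high_branch)}
```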

Waiting on conditions with or groups

In addition to parent conditions, a task in a DAG can wait for external signals before it starts running. It might wait for a sleep timer to expire, for a user event to arrive, or for some combination of these alongside parent conditions. You compose conditions like these using or groups.

An or group is a set of conditions combined with an Or operator, which evaluates to True as soon as at least one of the conditions inside it is satisfied. If you declare more than one or group on the same task, the groups are combined with AND, meaning that every group must have at least one of its conditions satisfied before the task is allowed to run. Between these two operators, you can express arbitrarily complex wait conditions in conjunctive normal form (CNF).
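
The AND-of-ORs evaluation rule can be illustrated with plain Python (a toy model of the logic, not SDK code):

```python
def task_ready(or_groups: list[list[bool]]) -> bool:
    """CNF: every or group must contain at least one satisfied condition."""
    return all(any(group) for group in or_groups)

# One group satisfied, the other not: the task keeps waiting.
assert task_ready([[True, False], [False, False]]) is False

# At least one condition true in every group: the task can start.
assert task_ready([[True, False], [False, True]]) is True
```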

Sleep + event example

A common pattern is to combine a sleep condition and an event condition in the same or group, so that the task proceeds as soon as either an external signal arrives or a timeout expires, whichever happens first. This is a natural fit for human-in-the-loop workflows, where you want to put a deadline on how long you’ll wait for a response before moving on.
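
A sketch of this pattern in Python (the `or_` helper, `SleepCondition`, `UserEventCondition`, and their import paths are assumptions to verify against your SDK version; the event key is illustrative):

```python
from datetime import timedelta

from hatchet_sdk import Context, EmptyModel, Hatchet, or_
from hatchet_sdk.conditions import SleepCondition, UserEventCondition

hatchet = Hatchet()
review_wf = hatchet.workflow(name="human-review")

# Proceed as soon as either the review event arrives or 24 hours
# pass, whichever happens first.
@review_wf.task(
    wait_for=[
        or_(
            UserEventCondition(event_key="review:submitted"),
            SleepCondition(timedelta(hours=24)),
        )
    ]
)
def handle_review(input: EmptyModel, ctx: Context) -> dict:
    return {"proceeded": True}
```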

Combining multiple or groups

For more complicated wait logic, you can declare more than one or group on the same task. As an example, consider these three conditions:

  • Condition A: A parent task’s output is greater than 50.
  • Condition B: A 30 second sleep timer expires.
  • Condition C: A payment:processed event arrives.

If you want the task to proceed when (A or B) and (A or C) are both satisfied, you’d declare two separate or groups on the task: one containing A or B, and the other containing A or C. The task will only start once both groups have been satisfied. If A is true, both groups pass immediately. If A is false, the task needs both B (the sleep expires) and C (the event arrives) before it can run.
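
This example could be sketched as follows in Python (condition classes, the `or_` helper, and their import paths are assumptions to check against your SDK version):

```python
from datetime import timedelta

from hatchet_sdk import Context, EmptyModel, Hatchet, or_
from hatchet_sdk.conditions import (
    ParentCondition,
    SleepCondition,
    UserEventCondition,
)

hatchet = Hatchet()
wf = hatchet.workflow(name="cnf-example")

@wf.task()
def parent(input: EmptyModel, ctx: Context) -> dict:
    return {"value": 75}

# Two or groups, combined with AND: (A or B) and (A or C).
@wf.task(
    parents=[parent],
    wait_for=[
        or_(  # group 1: A (parent output > 50) or B (30s sleep)
            ParentCondition(parent=parent, expression="output.value > 50"),
            SleepCondition(timedelta(seconds=30)),
        ),
        or_(  # group 2: A (parent output > 50) or C (payment event)
            ParentCondition(parent=parent, expression="output.value > 50"),
            UserEventCondition(event_key="payment:processed"),
        ),
    ],
)
def gated(input: EmptyModel, ctx: Context) -> dict:
    return {"ran": True}
```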