Autoscaling Workers
Hatchet provides a Task Stats API that enables you to implement autoscaling for your worker pools. By querying real-time queue depths and task distribution, you can dynamically scale workers based on actual workload demand.
Task Stats API
The Task Stats endpoint returns current statistics for queued and running tasks across your tenant, broken down by task name, queue, and concurrency group.
Endpoint
GET /api/v1/tenants/{tenantId}/task-stats
Authentication
The endpoint requires Bearer token authentication using a valid API token:
Authorization: Bearer <API_TOKEN>
Response Format
The response is a JSON object keyed by task name, with each task containing statistics for queued and running states:
{
  "my-task": {
    "queued": {
      "total": 150,
      "queues": {
        "my-task:default": 100,
        "my-task:priority": 50
      },
      "concurrency": [
        {
          "expression": "input.user_id",
          "type": "GROUP_ROUND_ROBIN",
          "keys": {
            "user-123": 10,
            "user-456": 15
          }
        }
      ],
      "oldest": "2024-01-15T10:30:00Z"
    },
    "running": {
      "total": 25,
      "oldest": "2024-01-15T10:25:00Z",
      "concurrency": []
    }
  }
}
Each task stat includes:
- total: The total count of tasks in this state
- concurrency: Distribution across concurrency groups (if concurrency limits are configured)
- oldest: Timestamp of the oldest task in the specified state
The following field is available only for queued tasks:
- queues: A breakdown of task counts by queue name
Example Usage
curl -H "Authorization: Bearer your-api-token-here" \
  https://cloud.onhatchet.run/api/v1/tenants/707d0855-80ab-4e1f-a156-f1c4546cbf52/task-stats
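The response can also be consumed programmatically to drive a custom scaling loop. The sketch below polls the endpoint with Python's standard library and derives a desired worker count from the queued totals. It is a minimal illustration, not part of any Hatchet SDK: the environment variables, the tasks-per-worker threshold, and the scale_workers hook are all assumptions you would replace with your own values and infrastructure.

import os
import json
import math
import urllib.request

HATCHET_API = "https://cloud.onhatchet.run/api/v1"
TENANT_ID = os.environ["HATCHET_TENANT_ID"]   # assumed env var
API_TOKEN = os.environ["HATCHET_API_TOKEN"]   # assumed env var

TASKS_PER_WORKER = 100          # assumed throughput target per worker
MIN_WORKERS, MAX_WORKERS = 1, 10


def fetch_task_stats() -> dict:
    """Fetch current task stats for the tenant from the Task Stats API."""
    req = urllib.request.Request(
        f"{HATCHET_API}/tenants/{TENANT_ID}/task-stats",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def desired_workers(stats: dict) -> int:
    """Sum queued tasks across all task names and size the pool accordingly."""
    queued = sum(task["queued"]["total"] for task in stats.values())
    return max(MIN_WORKERS, min(MAX_WORKERS, math.ceil(queued / TASKS_PER_WORKER)))


if __name__ == "__main__":
    stats = fetch_task_stats()
    print(f"desired worker count: {desired_workers(stats)}")
    # scale_workers(desired_workers(stats))  # hypothetical hook into your own infrastructure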
Autoscaling with KEDA
KEDA (Kubernetes Event-driven Autoscaling) can use the Task Stats API to automatically scale your worker deployments based on queue depth.
Setting Up a KEDA ScaledObject
Create a ScaledObject that queries the Hatchet Task Stats API and scales your worker deployment based on the number of queued tasks:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: hatchet-worker-scaler
spec:
  scaleTargetRef:
    name: hatchet-worker
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: metrics-api
      metadata:
        targetValue: "100"
        url: "https://cloud.onhatchet.run/api/v1/tenants/YOUR_TENANT_ID/task-stats"
        valueLocation: "my-task.queued.total"
        authMode: "bearer"
      authenticationRef:
        name: hatchet-api-token
---
apiVersion: v1
kind: Secret
metadata:
  name: hatchet-api-token
type: Opaque
stringData:
  token: "your-api-token-here"
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: hatchet-api-token
spec:
  secretTargetRef:
    - parameter: token
      name: hatchet-api-token
      key: token
The valueLocation field uses GJSON-style dot notation to extract a specific value from the response; for the sample response shown earlier, my-task.queued.total resolves to 150. Adjust my-task to match your actual task name.
Scaling Based on Multiple Tasks
If you have multiple task types handled by the same worker, you can create multiple triggers, or use a custom metrics endpoint that aggregates the totals (a sketch of the aggregation approach follows the trigger example below):
triggers:
  - type: metrics-api
    metadata:
      targetValue: "50"
      url: "https://cloud.onhatchet.run/api/v1/tenants/YOUR_TENANT_ID/task-stats"
      valueLocation: "task-a.queued.total"
      authMode: "bearer"
    authenticationRef:
      name: hatchet-api-token
  - type: metrics-api
    metadata:
      targetValue: "50"
      url: "https://cloud.onhatchet.run/api/v1/tenants/YOUR_TENANT_ID/task-stats"
      valueLocation: "task-b.queued.total"
      authMode: "bearer"
    authenticationRef:
      name: hatchet-api-token
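For the aggregation approach, a small service can sum the queued totals for the task types a deployment handles and expose a single number for the metrics-api scaler to read. The sketch below uses only Python's standard library; the task names, port, environment variables, and the queued_total field name are assumptions chosen for illustration, not an official Hatchet component.

import os
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

HATCHET_API = "https://cloud.onhatchet.run/api/v1"
TENANT_ID = os.environ["HATCHET_TENANT_ID"]   # assumed env var
API_TOKEN = os.environ["HATCHET_API_TOKEN"]   # assumed env var

# Task names handled by this worker deployment (assumed for illustration).
WORKER_TASKS = {"task-a", "task-b"}


def aggregated_queued_total() -> int:
    """Fetch task stats and sum the queued totals for the relevant tasks."""
    req = urllib.request.Request(
        f"{HATCHET_API}/tenants/{TENANT_ID}/task-stats",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        stats = json.load(resp)
    return sum(
        task["queued"]["total"]
        for name, task in stats.items()
        if name in WORKER_TASKS
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Expose one aggregated value as JSON for the KEDA metrics-api scaler.
        body = json.dumps({"queued_total": aggregated_queued_total()}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), MetricsHandler).serve_forever()

The KEDA trigger would then point at this service instead of the Task Stats API directly (for example, url set to the aggregator's address and valueLocation set to queued_total), which keeps the scaling decision in one place when many task types share a worker pool.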