
GPU Instance Configuration

⚠️ This feature is currently in beta and may be subject to change.

Overview

Hatchet supports GPU-accelerated workloads using NVIDIA GPUs. This guide covers GPU configuration options, Docker setup, and available regions. For basic compute configuration concepts, please refer to the CPU Instance Configuration documentation.

GPU Types and Availability

Hatchet currently supports the following NVIDIA GPU types:

| GPU Model | Memory | Available Regions |
| --- | --- | --- |
| NVIDIA A10 | 24GB | ord (Chicago) |
| NVIDIA L40S | 48GB | ord (Chicago) |
| NVIDIA A100-PCIe | 40GB | ord (Chicago) |
| NVIDIA A100-SXM4 | 80GB | ams (Amsterdam), iad (Ashburn), mia (Miami), sjc (San Jose), syd (Sydney) |

Basic Configuration

from hatchet_sdk.compute.configs import Compute
 
compute = Compute(
    gpu_kind="a100-80gb",    # GPU type
    gpus=1,                  # Number of GPUs
    memory_mb=163840,        # Memory in MB
    num_replicas=1,          # Number of instances
    regions=["ams"]          # Must be a region that supports your chosen GPU
)

Docker Configuration

Example Dockerfile

# Base image
FROM ubuntu:22.04
 
# Configure NVIDIA's CUDA apt repository so the cuda-* packages below can be found
# (check NVIDIA's documentation for the current cuda-keyring version)
RUN apt-get update && apt-get install -y --no-install-recommends wget ca-certificates \
    && wget -q https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb \
    && dpkg -i cuda-keyring_1.1-1_all.deb \
    && rm cuda-keyring_1.1-1_all.deb
 
# Install CUDA and required libraries
RUN apt-get update && apt-get install -y --no-install-recommends \
    cuda-nvcc-12-2 \
    libcublas-12-2 \
    libcudnn8 \
    && rm -rf /var/lib/apt/lists/*
 
# Your application setup
WORKDIR /app
COPY . .
 
# Application entrypoint
CMD ["python", "worker.py"]

Docker Best Practices for GPU Workloads

  1. Selective Library Installation
# DO NOT use meta-packages
# ❌ RUN apt-get install cuda-runtime-*
 
# ✅ Install only required libraries
RUN apt-get install -y \
    cuda-nvcc-12-2 \
    libcublas-12-2 \
    libcudnn8
  2. Multi-stage Builds
# Build stage
FROM ubuntu:22.04 AS builder
RUN apt-get update && apt-get install -y cuda-nvcc-12-2
 
# Runtime stage
FROM ubuntu:22.04
COPY --from=builder /usr/local/cuda-12.2 /usr/local/cuda-12.2

Usage in Workflows

from hatchet_sdk import Hatchet, Context
from hatchet_sdk.compute.configs import Compute
 
hatchet = Hatchet()
 
@hatchet.workflow()
class GPUWorkflow:
    @hatchet.step(
        compute=Compute(
            gpu_kind="a100-80gb",
            gpus=1,
            memory_mb=163840,
            num_replicas=1,
            regions=["ams"]
        )
    )
    def train_model(self, context: Context):
        # GPU-accelerated code here
        pass
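
To actually run the workflow, a worker still needs to register it. A minimal sketch, assuming the Python SDK's standard worker registration pattern; the worker name and max_runs value are placeholders, not values from this guide.

worker = hatchet.worker("gpu-worker", max_runs=1)  # placeholder name and slot count
worker.register_workflow(GPUWorkflow())
worker.start()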

Memory and Resource Allocation

Available Memory per GPU Type

  • A10: 24GB GPU Memory
  • L40S: 48GB GPU Memory
  • A100-PCIe: 40GB GPU Memory
  • A100-SXM4: 80GB GPU Memory

When configuring memory_mb, remember that it sets the instance's system (host) memory rather than GPU memory: make sure it covers data loading, preprocessing, and framework overhead alongside your GPU workload.
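
As a rough sizing illustration, the sketch below derives memory_mb for an A100-SXM4 worker from assumed host-side needs. The staging and overhead figures are assumptions for illustration, not Hatchet recommendations.

from hatchet_sdk.compute.configs import Compute
 
# Assumed host-side memory needs, for illustration only.
dataset_staging_gb = 64     # data buffered in system RAM before it reaches the GPU
framework_overhead_gb = 16  # CUDA context, framework, and OS overhead (assumption)
 
# Double the estimate to leave headroom, then convert to MB.
memory_mb = (dataset_staging_gb + framework_overhead_gb) * 2 * 1024  # 163840 MB
 
compute = Compute(
    gpu_kind="a100-80gb",
    gpus=1,
    memory_mb=memory_mb,
    num_replicas=1,
    regions=["ams"]
)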

Region-Specific Configurations

A100-80GB Example

# Multi-region A100-80GB configuration
compute = Compute(
    gpu_kind="a100-80gb",
    gpus=1,
    memory_mb=163840,
    num_replicas=3,
    regions=["ams", "sjc", "syd"]  # Replicas will be randomly distributed
)

A10 Example

# Chicago-based A10 configuration
compute = Compute(
    gpu_kind="a10",
    gpus=1,
    memory_mb=49152,
    num_replicas=2,
    regions=["ord"]
)

Best Practices

  1. GPU Selection

    • Choose GPU type based on workload requirements
    • Consider memory requirements for your models
    • Factor in regional availability
  2. Docker Optimization

    • Use specific library versions instead of meta-packages
    • Implement multi-stage builds to reduce image size
    • Only install required CUDA libraries
  3. Region Strategy

    • Select regions based on data locality
    • Consider backup regions for redundancy
    • Remember that replicas are randomly distributed across specified regions
  4. Resource Management

    • Monitor GPU utilization (see the monitoring sketch after this list)
    • Optimize batch sizes for GPU memory
    • Consider CPU and system memory requirements alongside GPU resources
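
For the resource-management points above, a minimal utilization check is sketched below. It assumes the nvidia-ml-py (pynvml) package is installed in your image; wiring these numbers into logging or metrics is left to your own setup.

# Minimal GPU utilization/memory check via NVML (assumes `pip install nvidia-ml-py`)
import pynvml
 
pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    util = pynvml.nvmlDeviceGetUtilizationRates(handle)
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(f"GPU utilization: {util.gpu}%")
    print(f"GPU memory: {mem.used / 1024**2:.0f} / {mem.total / 1024**2:.0f} MB")
finally:
    pynvml.nvmlShutdown()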

Common Issues and Solutions

  1. CUDA Version Compatibility

    # Ensure CUDA toolkit version matches driver
    RUN apt-get install -y cuda-nvcc-12-2
  2. Library Dependencies

    # Install commonly needed libraries
    RUN apt-get install -y \
        libcublas-12-2 \
        libcudnn8 \
        nvidia-driver-525
  3. Region Availability

Region support is limited while GPUs are in beta. Always verify GPU availability in chosen regions.

Remember to monitor your GPU workload performance and adjust configurations as needed. Consider implementing proper error handling for GPU-related operations and ensure your code gracefully handles scenarios where GPUs might be temporarily unavailable.
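
As one way to handle that last point, the sketch below shows a simple device fallback, assuming PyTorch as the framework; adapt it to whatever library your steps actually use.

import torch
import torch.nn as nn
 
# Placeholder model for illustration only.
model = nn.Linear(1024, 10)
 
def get_device() -> torch.device:
    # Prefer the GPU, but fall back to CPU if none is visible to the worker.
    return torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
 
def run_batch(batch: torch.Tensor) -> torch.Tensor:
    device = get_device()
    try:
        return model.to(device)(batch.to(device))
    except torch.cuda.OutOfMemoryError:
        # Degrade gracefully instead of failing the step: free cached memory
        # and retry on CPU (or shrink the batch).
        torch.cuda.empty_cache()
        return model.to("cpu")(batch.to("cpu"))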