On-prem overview

Architecture patterns for self-hosted NativeLink — what to deploy, where, and how it fits together.

The on-prem getting-started guide covers why you'd self-host. This page covers the reference architectures — the deployment shapes that have survived contact with real workloads.

Three common shapes

Shape A — Single-node cache

One machine, one binary, one config. The cache lives on local disk (or compressed local disk). No remote execution.

              ┌──────────────────────┐
   Bazel ───▶ │   nativelink         │
   Buck2 ───▶ │   cas + ac + server  │ ──▶ /var/lib/nativelink
              └──────────────────────┘

Good for: ≤ 10 developers sharing a cache. CI workers that host their own per-runner cache.

Not for: Anything where the cache restarting wipes your team's afternoon.

Config shape: Basic / filesystem-backed.

Shape B — Cache cluster + remote workers

A few control-plane VMs run CAS/AC/scheduler behind a load balancer. A separate, autoscaling worker fleet runs actions. Storage is S3 + Redis.

                ┌──────────────┐
                │ load balancer│
                └──────┬───────┘
              ┌────────┼────────┐
              ▼        ▼        ▼
         [server]  [server]  [server]    ◀── stateless
              │        │        │
              ▼        ▼        ▼
            ┌──────────────────────┐
            │   Redis (hot tier)   │
            ├──────────────────────┤
            │      S3 (durable)    │
            └──────────────────────┘


              ┌────────┴────────┐
              ▼                 ▼
          [worker]          [worker]      ◀── autoscaling
          [worker]          [worker]

Good for: Most production deployments. 50-500 engineers.

Not for: Truly massive scale, multi-region deployments, or regulated workloads with strict data residency.

Config shape: Production configurations.
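To make the hot/durable split concrete, here is a minimal Python sketch of a two-tier blob store. It is not NativeLink's implementation: plain dicts stand in for Redis and S3, and the `TieredStore` class, its `hot_capacity` parameter, and the read-through promotion policy are all assumptions chosen for illustration.

```python
import hashlib
from typing import Optional


class TieredStore:
    """Two-tier blob store: a small hot tier in front of a durable tier.

    Plain dicts stand in for Redis (hot) and S3 (durable). Hypothetical
    illustration only, not NativeLink's store code.
    """

    def __init__(self, hot_capacity: int) -> None:
        if hot_capacity < 1:
            raise ValueError("hot_capacity must be at least 1")
        self.hot_capacity = hot_capacity
        self.hot: dict = {}
        self.durable: dict = {}

    def _promote(self, digest: str, blob: bytes) -> None:
        # Re-insert so the entry becomes the newest; dicts keep insertion order.
        self.hot.pop(digest, None)
        if len(self.hot) >= self.hot_capacity:
            # Evict the oldest entry in the hot tier.
            self.hot.pop(next(iter(self.hot)))
        self.hot[digest] = blob

    def put(self, blob: bytes) -> str:
        # Content-addressed: the key is the SHA-256 digest of the blob.
        digest = hashlib.sha256(blob).hexdigest()
        self.durable[digest] = blob  # the durable tier always gets a copy
        self._promote(digest, blob)
        return digest

    def get(self, digest: str) -> Optional[bytes]:
        blob = self.hot.get(digest)
        if blob is not None:
            return blob  # hot-tier hit
        blob = self.durable.get(digest)
        if blob is not None:
            self._promote(digest, blob)  # read-through: warm the hot tier
        return blob
```

Because every write lands in the durable tier, losing the hot tier in this sketch costs latency rather than data, which is the property that lets the servers in front of it stay stateless.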

Shape C — Multi-region, regulated

Per-region clusters with their own data plane; one global control plane for cross-region scheduling decisions. mTLS everywhere. Storage pinned to a region.

   ┌─────────────────┐         ┌─────────────────┐
   │   us-east-1     │   ╳     │     eu-west-1   │
   │                 │   ╳     │                 │
   │  full cluster   │   ╳     │  full cluster   │
   │                 │   ╳     │                 │
   └─────────────────┘   ╳     └─────────────────┘
        ▲                            ▲
        │      ╳ ── data-residency boundary ── ╳
        │                            │
        └──────── clients ───────────┘
                  (region-aware)

Good for: Regulated industries; orgs with strict residency contracts; truly global teams.

Cost: Engineering. Plan for a small platform team.
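A region-aware client can be as simple as a lookup that fails closed. The sketch below is a minimal illustration under stated assumptions: the endpoint hostnames are hypothetical placeholders, and `endpoint_for`, `bazel_flags`, and `ResidencyError` are names invented for this example. The `--remote_cache` and `--remote_executor` flags are standard Bazel flags.

```python
# Hypothetical endpoints; substitute the addresses of your per-region clusters.
REGION_ENDPOINTS = {
    "us-east-1": "grpcs://nativelink.us-east-1.example.internal:443",
    "eu-west-1": "grpcs://nativelink.eu-west-1.example.internal:443",
}


class ResidencyError(Exception):
    """Raised when a client has no in-region cluster to talk to."""


def endpoint_for(client_region: str) -> str:
    """Return the cluster endpoint pinned to the client's own region."""
    try:
        return REGION_ENDPOINTS[client_region]
    except KeyError:
        # Fail closed: never fall back to a cluster in another region,
        # because that would move data across the residency boundary.
        raise ResidencyError(
            f"no cluster pinned to region {client_region!r}"
        ) from None


def bazel_flags(client_region: str) -> list:
    """Build Bazel flags that point cache and execution at the local region."""
    endpoint = endpoint_for(client_region)
    return [
        f"--remote_cache={endpoint}",
        f"--remote_executor={endpoint}",
    ]


if __name__ == "__main__":
    print(" ".join(bazel_flags("eu-west-1")))
```

The design choice worth copying is the missing fallback: a client in an unknown region gets an error, not the nearest cluster.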

What to deploy on

| Substrate | Notes |
|---|---|
| Kubernetes | The richest deployment story. See Kubernetes. |
| Bare VMs + systemd | Lowest operational complexity if you already do this. |
| Docker Compose | Reasonable for ≤ 5 devs. |
| Bare metal | Best price/perf at scale. |

Capacity rules of thumb

Numbers we've seen in production. Yours will vary; treat these as order-of-magnitude starting points:

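| Resource | Rule of thumb |
|---|---|
| CAS storage (C++) | 15-20 GB per active developer per week |
| CAS storage (Go/Rust) | 5-10 GB per active developer per week |
| Cache reads | 5-10 GB per active developer per week |
| Worker CPU | 1 vCPU per concurrent action |
| Scheduler memory | 100 MB per 1,000 in-flight actions |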

When the cluster is healthy, the scheduler is the cheapest piece and CAS storage is the dominant cost. Plan accordingly.
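As a worked example, the rules of thumb above can be turned into a rough sizing calculation. This is a minimal sketch: the `estimate` function and `Estimate` type are hypothetical names, and it assumes the number of in-flight items on the scheduler equals the number of concurrent actions.

```python
from dataclasses import dataclass

# Weekly ranges in GB per active developer, copied from the table above.
CAS_GB_PER_DEV_WEEK = {
    "cpp": (15, 20),
    "go_rust": (5, 10),
}
CACHE_READ_GB_PER_DEV_WEEK = (5, 10)
SCHEDULER_MB_PER_1000_IN_FLIGHT = 100


@dataclass
class Estimate:
    cas_gb_per_week: tuple       # (low, high)
    cache_read_gb_per_week: tuple  # (low, high)
    worker_vcpus: int
    scheduler_mb: float


def estimate(developers: int, language: str, concurrent_actions: int) -> Estimate:
    """Order-of-magnitude sizing from the rules of thumb; not a guarantee."""
    cas_lo, cas_hi = CAS_GB_PER_DEV_WEEK[language]
    read_lo, read_hi = CACHE_READ_GB_PER_DEV_WEEK
    return Estimate(
        cas_gb_per_week=(developers * cas_lo, developers * cas_hi),
        cache_read_gb_per_week=(developers * read_lo, developers * read_hi),
        # 1 vCPU per concurrent action.
        worker_vcpus=concurrent_actions,
        # Assumption: in-flight count equals concurrent actions.
        scheduler_mb=concurrent_actions / 1000 * SCHEDULER_MB_PER_1000_IN_FLIGHT,
    )


if __name__ == "__main__":
    print(estimate(developers=100, language="cpp", concurrent_actions=2000))
```

For 100 C++ developers running 2,000 concurrent actions, this gives 1,500-2,000 GB of new CAS data per week against roughly 200 MB of scheduler memory, which illustrates why storage dominates the bill.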

What's next