# Self-Hosted Deployment

*Running Mothership at scale on your own infrastructure. Reference topology, Docker Compose, and Kubernetes manifests.*

## Overview

Aura is designed to be operated by your platform team, not by Naridon. Every component of a production deployment — the Mothership server, its PostgreSQL catalog, its object store for AST fragments, its reverse proxy, its observability stack — runs on infrastructure you control. Naridon, Inc. publishes reference manifests, tested under load, that platform teams can adopt directly or adapt to local conventions.

This page covers three deployment shapes:

1. A **single-node Docker Compose** deployment for teams of up to roughly fifty engineers.
2. A **highly-available Kubernetes** deployment for teams from fifty to five hundred engineers.
3. A **multi-region federation** shape for global engineering organizations.

For air-gapped variants of any of these, see [Air-Gapped Install](/air-gapped-install). For performance tuning, see [Performance Tuning](/performance-tuning).

## Reference topology

A production Mothership has four moving parts:

- **Mothership server** (`aura-mothership`): the Rust binary that serves the peer protocol, the intent log, the sync pipeline, and the messaging subsystem.
- **Catalog database** (PostgreSQL 15+): stores function identities, zone ownership, intent log metadata, and team membership.
- **Fragment store**: stores compressed AST fragments. Either local disk (for single-node) or S3-compatible object storage (for HA).
- **Reverse proxy** (nginx, Caddy, Envoy, or your cloud's L7 load balancer): terminates TLS, enforces client certificate authentication, and applies rate limits.
```text
                    +--------------------+
peers (aura CLI) →  | Reverse proxy      |
                    | (TLS, mTLS, rate)  |
                    +---------+----------+
                              |
                    +---------v----------+      +----------------+
                    | Mothership server  |----->| PostgreSQL     |
                    | (stateless)        |      +----------------+
                    +---------+----------+
                              |
                    +---------v----------+
                    | Fragment store     |
                    | (disk or S3)       |
                    +--------------------+
```

The Mothership process itself is **stateless**. You can run as many replicas as you need behind a load balancer. All durable state lives in PostgreSQL and the fragment store.

## Single-node Docker Compose

Suitable for up to roughly fifty engineers, or for staging environments. Uses local disk for fragments and a single Postgres container.

```yaml
version: "3.9"

services:
  postgres:
    image: postgres:15
    restart: unless-stopped
    environment:
      POSTGRES_DB: aura
      POSTGRES_USER: aura
      POSTGRES_PASSWORD_FILE: /run/secrets/pg_password
    volumes:
      - pgdata:/var/lib/postgresql/data
    secrets:
      - pg_password

  mothership:
    image: ghcr.io/naridon-inc/aura-mothership:0.14.1
    restart: unless-stopped
    depends_on:
      - postgres
    environment:
      AURA_DB_URL: postgres://aura@postgres/aura
      AURA_FRAGMENT_PATH: /var/lib/aura/fragments
      AURA_LISTEN: 0.0.0.0:3001
      AURA_TLS_CERT: /etc/aura/tls/server.crt
      AURA_TLS_KEY: /etc/aura/tls/server.key
      AURA_CLIENT_CA: /etc/aura/tls/clients-ca.crt
    volumes:
      - fragments:/var/lib/aura/fragments
      - ./tls:/etc/aura/tls:ro
    ports:
      - "3001:3001"

  caddy:
    image: caddy:2
    restart: unless-stopped
    depends_on:
      - mothership
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
    ports:
      - "443:443"

volumes:
  pgdata:
  fragments:

secrets:
  pg_password:
    file: ./secrets/pg_password.txt
```

Back up `pgdata` and `fragments` on a daily schedule. See [Backup & Recovery](/backup-and-recovery) for the tested restore procedure.
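The manifests above leave backup scheduling and retention to you. As one illustration of a retention policy (the policy and function name here are ours, not something Aura ships), a pruner that keeps the last seven dailies plus the newest backup from each of the last four ISO weeks might look like:

```python
from datetime import date, timedelta

def backups_to_prune(backup_dates, keep_daily=7, keep_weekly=4, today=None):
    """Given the dates of existing backups, return those that a
    keep-7-dailies, keep-4-weeklies policy would delete.
    Illustrative only; pick a policy matching your compliance needs."""
    today = today or date.today()
    keep = set()
    # Keep every backup from the most recent `keep_daily` days.
    for i in range(keep_daily):
        keep.add(today - timedelta(days=i))
    # Keep the latest backup of each of the last `keep_weekly` ISO weeks.
    by_week = {}
    for d in sorted(backup_dates):
        by_week[d.isocalendar()[:2]] = d  # later date overwrites earlier
    for wk in sorted(by_week)[-keep_weekly:]:
        keep.add(by_week[wk])
    return sorted(d for d in backup_dates if d not in keep)
```

Run it against the dated dump and fragment archives produced by your backup job before deleting anything; the restore procedure in [Backup & Recovery](/backup-and-recovery) assumes at least one verified recent pair.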
## Kubernetes deployment

For teams larger than fifty engineers, we recommend Kubernetes with the Mothership as a horizontally scaled Deployment, Postgres as a managed service (RDS, Cloud SQL, AlloyDB, or a self-hosted operator like CloudNativePG), and S3-compatible object storage for fragments.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aura-mothership
  namespace: aura
spec:
  replicas: 3
  selector:
    matchLabels:
      app: aura-mothership
  template:
    metadata:
      labels:
        app: aura-mothership
    spec:
      containers:
        - name: mothership
          image: ghcr.io/naridon-inc/aura-mothership:0.14.1
          args: ["serve", "--config", "/etc/aura/config.toml"]
          ports:
            - containerPort: 3001
          env:
            - name: AURA_DB_URL
              valueFrom:
                secretKeyRef:
                  name: aura-db
                  key: url
            - name: AURA_FRAGMENT_BACKEND
              value: "s3"
            - name: AURA_S3_BUCKET
              value: "aura-fragments-prod"
            - name: AURA_S3_REGION
              value: "eu-central-2"
          readinessProbe:
            httpGet:
              path: /healthz
              port: 3001
            initialDelaySeconds: 5
          livenessProbe:
            httpGet:
              path: /healthz
              port: 3001
            initialDelaySeconds: 30
          resources:
            requests:
              cpu: "2"
              memory: "4Gi"
            limits:
              cpu: "4"
              memory: "8Gi"
          volumeMounts:
            - name: config
              mountPath: /etc/aura
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: aura-mothership-config
---
apiVersion: v1
kind: Service
metadata:
  name: aura-mothership
  namespace: aura
spec:
  selector:
    app: aura-mothership
  ports:
    - port: 3001
      targetPort: 3001
  type: ClusterIP
```

Expose the service through an ingress that terminates TLS with client certificate authentication.
An example ingress fragment using nginx-ingress:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: aura-mothership
  namespace: aura
  annotations:
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    nginx.ingress.kubernetes.io/auth-tls-secret: "aura/clients-ca"
    nginx.ingress.kubernetes.io/auth-tls-verify-depth: "1"
spec:
  tls:
    - hosts: ["aura.internal.example.com"]
      secretName: aura-server-tls
  rules:
    - host: aura.internal.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: aura-mothership
                port:
                  number: 3001
```

The Mothership process keeps no local state between requests, so the HorizontalPodAutoscaler can scale replicas on CPU without any special coordination. Postgres connection pool sizing should scale with replica count; see [Performance Tuning](/performance-tuning).

## Configuration file

Configuration uses TOML. Every flag also accepts an `AURA_`-prefixed environment variable (for example, `AURA_DB_URL` for `database.url`).

```toml
[server]
listen = "0.0.0.0:3001"
workers = 16
max_body_bytes = 67108864

[database]
url = "postgres://aura@postgres.internal/aura"
pool_size = 32
statement_timeout_ms = 10000

[fragment_store]
backend = "s3"
bucket = "aura-fragments-prod"
region = "eu-central-2"
prefix = "mothership/v1/"

[auth]
mode = "mtls"
client_ca = "/etc/aura/tls/clients-ca.crt"

[sync]
batch_size = 256
flush_interval_ms = 500
max_in_flight = 2048

[audit]
intent_log_backend = "postgres"
hash_chain = true
export_enabled = true

[observability]
metrics_listen = "0.0.0.0:9090"
tracing_otlp_endpoint = "http://otel-collector:4317"
log_format = "json"
```

## Multi-region federation

Organizations that operate across regulatory jurisdictions — for example, an EU-based product team, a US-based security team, and an APAC support organization — can run one Mothership per region and federate them. The federation model is **eventually consistent**. Each regional Mothership is the system of record for its zone.
Function identities and intent logs replicate between Motherships over the same peer protocol the CLI uses, with explicit allow-lists on which repositories cross which borders. Configuration sketch:

```toml
[federation]
region = "eu-central"
peers = [
  { region = "us-east", endpoint = "https://aura-us.example.com", allow = ["monorepo/public/**"] },
  { region = "ap-south", endpoint = "https://aura-ap.example.com", allow = ["monorepo/public/**"] },
]
replication_mode = "opt-in"
```

This is the recommended shape for teams that need [EU data sovereignty](/data-sovereignty-eu) while still collaborating across borders.

## Observability

The Mothership exports Prometheus metrics on a separate port and OpenTelemetry traces to any OTLP-compatible collector. Core metrics worth alerting on:

- `aura_intent_log_append_errors_total` — should be zero at steady state.
- `aura_sync_queue_depth` — healthy values are below `max_in_flight`; sustained saturation indicates undersized workers.
- `aura_peer_sessions_active` — capacity planning signal.
- `aura_fragment_store_latency_seconds` — P99 above 500ms typically points at object-store throttling.
- `aura_postgres_pool_wait_seconds` — sustained non-zero values mean the pool is undersized.

## Rolling upgrades

Mothership releases follow semver. Minor upgrades (for example 0.14.x → 0.15.x) are wire-compatible with one release back, so rolling replicas one at a time is safe. Major upgrades will always ship with a migration runbook and a dual-run window. Never upgrade the Postgres schema and the server binary in the same rollout; run `aura-mothership migrate` from a one-shot Job, then roll the fleet.

## Failure modes and mitigations

| Failure | Symptom | Mitigation |
| --- | --- | --- |
| Postgres down | Peers see 503 on intent append | Peers queue locally; no data loss. Restore Postgres and the queues drain. |
| Fragment store slow | P99 sync latency rises | Mothership caches hot fragments; tune `fragment_cache_mb`. |
| Single AZ loss | Some replicas unreachable | HPA + multi-AZ node pools; no state on the replica. |
| Corrupted intent log page | Chain verification fails | Restore from WAL; see [Backup & Recovery](/backup-and-recovery). |
| Runaway agent fleet | Sync queue saturation | Rate-limit per agent identity; see [RBAC](/rbac-and-permissions). |

## TLS and mutual authentication

Peer-to-Mothership traffic is mutually authenticated in every supported deployment shape. The Mothership presents a server certificate signed by the customer's chosen CA (a public CA for externally reachable deployments, an internal CA for private ones). Each peer — human developer or autonomous agent — presents a client certificate whose subject maps to an identity in [RBAC](/rbac-and-permissions).

We strongly recommend a dedicated issuing CA for Aura peer certificates, separate from the customer's general-purpose internal CA. This keeps the consequences of certificate revocation scoped: revoking a compromised Aura peer should not force a recycle of, for example, an internal service mesh.

Certificate lifecycle tooling that customers have integrated successfully includes cert-manager (Kubernetes), HashiCorp Vault's PKI secrets engine, and step-ca. Short-lived certificates (24 to 72 hours) are preferred; they make revocation-list maintenance trivial at the cost of requiring automated renewal in the peer CLIs, which the `aura` CLI supports natively.

## Rate limiting and abuse prevention

The Mothership includes a built-in per-identity rate limiter intended to protect against runaway agents and misbehaving automation.
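The limiter's behavior can be modeled as a token bucket that refills at the configured per-minute rate; when the bucket is empty, the time until the next token accrues is what a server reports in `Retry-After`. A minimal Python model of those semantics (illustrative only; this is not the Rust implementation, and the class name is ours):

```python
import time

class TokenBucket:
    """Models a per-identity limit such as pushes_per_minute = 60."""

    def __init__(self, per_minute, now=time.monotonic):
        self.rate = per_minute / 60.0      # tokens accrued per second
        self.capacity = float(per_minute)  # burst allowance
        self.tokens = float(per_minute)    # start full
        self.now = now
        self.last = now()

    def try_acquire(self):
        """Return (allowed, retry_after_seconds)."""
        t = self.now()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0.0
        # Empty bucket: report how long until one whole token accrues,
        # i.e. the value a server would put in a Retry-After header.
        return False, (1.0 - self.tokens) / self.rate
```

Under this model, a limit of 60 per minute permits a burst of 60 immediate requests, after which requests are admitted one per second, which matches the steady-state rate a well-behaved CLI settles into when it honors `Retry-After`.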
Limits are configurable per role and per named identity:

```toml
[rate_limit.defaults]
role_contributor = { pushes_per_minute = 60, intents_per_minute = 60 }
role_agent = { pushes_per_minute = 120, intents_per_minute = 120 }

[[rate_limit.override]]
identity = "agent:bulk-refactor-bot"
pushes_per_minute = 600
intents_per_minute = 600
```

When a peer exceeds its limit, the Mothership returns a 429 with a `Retry-After` header; the CLI backs off automatically. Sustained rate-limit hits generate an alert because they usually indicate a misconfigured agent.

## Capacity planning

Capacity needs are driven by two dimensions: **concurrent peers** (how many developers and agents push in parallel) and **intent-log append rate** (how many function-level changes land per minute, peak). The former sizes connections and threads; the latter sizes Postgres and the fragment store.

A reasonable starting point for a 100-engineer deployment:

- Single Mothership, 8 vCPU, 16 GB RAM.
- Postgres, 4 vCPU, 16 GB RAM, 100 GB SSD.
- S3-compatible object storage for fragments, no specific sizing.
- 1 Gbit/s intra-network bandwidth between components.

Beyond 300 concurrent peers, the shape shifts; see [Performance Tuning](/performance-tuning) for the scale-out model we recommend.

## See Also

- [Air-Gapped Install](/air-gapped-install)
- [Performance Tuning](/performance-tuning)
- [Backup & Recovery](/backup-and-recovery)
- [RBAC & Permissions](/rbac-and-permissions)
- [Mothership Overview](/mothership-overview)