Bandwidth and Performance

Function bodies are small, compression is cheap, the network profile is a polite trickle.

Live Sync is designed to work over bad hotel Wi-Fi. The payload on the wire is a compressed function body, not a whole file and not a keystroke stream. The 5-second cadence means bursts of activity batch naturally. Idle peers send nothing at all. This page covers the actual numbers — what a tick costs in bytes, CPU, and wall time — plus the levers you can pull on slow or metered networks. The takeaway: on a typical engineering workload, Live Sync uses less bandwidth than a single Slack reaction per minute, and most of that is metadata.

Overview

People's first instinct about real-time sync is "it must be expensive." That instinct comes from streaming-keystroke systems where the wire is busy whenever the user is typing. Live Sync is not that. It sends a packet only when a function's AST hash changes, only once per sync tick, and only in compressed form. A developer who is reading, running tests, or thinking — the majority of the day — produces zero traffic.

Here is a day-in-the-life profile from internal dogfooding (mid-sized Rust repo, four developers, typical feature work):

  average active developer: 48 fn pushes / hour
                            76 fn pulls  / hour
                            total bytes:  ~180 KB / hour uploaded
                                          ~260 KB / hour downloaded
  peak burst:               35 KB in a 5-second tick (large refactor)
  idle:                     ~200 bytes / minute heartbeat

For comparison, a single compressed HD video frame can run a couple hundred KB. An hour of Live Sync uploads is smaller than one such frame.

How It Works

A push moves through three stages, each of which contributes to the final byte count.

Stage 1: scan

The scan only touches files whose mtime has advanced since last tick. On a cold start the scan is proportional to the include set (see selective sync); on subsequent ticks it is proportional to the dirty set, which is usually zero to a handful of files.
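The dirty-set check is why warm scans stay cheap: it is one stat call per candidate file, not a read or a parse. A minimal sketch of an mtime-gated scan (illustrative Python, not Aura's actual scanner, which also feeds the survivors into the parser):

```python
import os

def dirty_files(paths, last_tick_ts):
    """Return the paths whose mtime advanced since the previous tick.

    Illustrative sketch of an mtime-gated dirty scan; files untouched
    since last_tick_ts are skipped without ever being read.
    """
    dirty = []
    for p in paths:
        try:
            if os.stat(p).st_mtime > last_tick_ts:
                dirty.append(p)
        except FileNotFoundError:
            continue  # file deleted between listing and stat
    return dirty
```

On a quiet tick every stat comes back older than the cutoff and the function returns an empty list, which is the ~180 µs case below.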

Timing on a 50,000-function Rust repo, M2 MacBook:

  cold scan (full repo):         ~4.2 s, one-time
  warm scan (0 dirty files):     ~180 µs
  warm scan (3 dirty files):     ~6–12 ms (parse + hash)

Stage 2: diff

For each dirty file, Aura parses to AST, normalizes each function, and hashes. Hashing is blake3; parsing is tree-sitter incremental. Total CPU for a 3-function change is single-digit milliseconds.
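The property that matters in this stage is that the hash is taken over a normalized form, so formatting-only edits produce the same digest and never reach the wire. A toy version, with whitespace collapse standing in for AST normalization and sha256 standing in for blake3:

```python
import hashlib
import re

def fn_hash(body: str) -> str:
    # Toy normalization: collapse all whitespace runs to one space.
    # Aura normalizes the AST itself (which also ignores comments and
    # other trivia); sha256 here is a stand-in for blake3.
    normalized = re.sub(r"\s+", " ", body).strip()
    return hashlib.sha256(normalized.encode()).hexdigest()
```

A reformatted body hashes identically, so it is skipped; a logic change produces a new digest and a push.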

Stage 3: wire

Each changed function body is zstd-compressed and framed.

  raw body (median Rust fn):     ~420 bytes
  zstd compressed (level 3):     ~180 bytes
  framing + signature overhead:  ~96 bytes
  on-wire total (median fn):     ~280 bytes

A tick that pushes three functions is about 840 bytes on the wire. Round up for TLS overhead and you are at ~1 KB per tick — well under a kilobyte per second on average even at high activity.
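The per-tick arithmetic from the table above, spelled out (the byte figures are the quoted medians, not guarantees):

```python
COMPRESSED_BODY = 180   # median zstd-3 body, from the table above
FRAME_OVERHEAD = 96     # framing + signature overhead per function

def tick_wire_bytes(n_fns):
    # On-wire cost of one tick pushing n median-sized functions,
    # before TLS record overhead is added on top.
    return n_fns * (COMPRESSED_BODY + FRAME_OVERHEAD)
```

`tick_wire_bytes(3)` comes to 828 bytes, which is the ~840 figure in the text once the per-function cost is rounded up to ~280.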

Compression choices

Aura uses zstd at level 3 by default. Higher levels buy a few percent more compression for noticeably more CPU; level 3 is the knee. You can tune it:

[live.compression]
algo = "zstd"        # zstd | lz4 | none
level = 3            # 1..=19 for zstd
min_body_bytes = 64  # below this, send uncompressed (framing overhead wins)

On a metered LTE hotspot, level 9 shaves another 20–30% off at the cost of ~3x CPU. On a desktop with a gigabit link, level 1 or lz4 is faster end-to-end.
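The min_body_bytes cutoff exists because a compressed stream carries its own header, which can exceed the savings on tiny bodies. A sketch of the send-path decision, with zlib standing in for zstd (the trade-off is the same shape):

```python
import zlib

MIN_BODY_BYTES = 64  # mirrors the min_body_bytes setting above

def encode_body(raw: bytes, level: int = 3) -> bytes:
    # Tiny bodies go out uncompressed: below the cutoff, the
    # compressed stream's own header outweighs what compression saves.
    if len(raw) < MIN_BODY_BYTES:
        return raw
    return zlib.compress(raw, level)
```

Raising `level` trades encode CPU for bytes, exactly as described above for zstd.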

Benchmarks

Measured on the aura-sovereign monorepo (92k functions, 4 peers, Mothership on a t3.small). Each row is an hour of representative developer work.

| Scenario | Pushes/hr | Pulls/hr | Upload | Download | CPU % |
|---|---:|---:|---:|---:|---:|
| Idle (reading, no edits) | 0 | ~40 (teammates) | 8 KB | 60 KB | <0.1% |
| Light coding | 24 | 70 | 90 KB | 150 KB | <0.3% |
| Heavy refactor | 180 | 110 | 1.1 MB | 220 KB | <1% |
| Paired human + AI agent | 320 | 200 | 1.8 MB | 380 KB | ~1.5% |
| Full-repo format (noisy) | 0* | ~12,000 | 0 | 0 | — |

* Reformat-only saves produce no aura_id change, so they are free on the wire. This is the same property that keeps semantic diff quiet.

Tuning for slow or metered networks

Three levers, in order of effectiveness:

1. Lengthen the cadence

[live]
cadence_ms = 15000   # 15 seconds

Tripling the window roughly halves the byte count, since more edits coalesce into each push. Pull latency becomes noticeable; it is still far faster than a PR.
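Coalescing is why a longer window costs less: only the last body per function in a window goes on the wire, however many saves happened. A sketch with hypothetical names, not Aura's internal types:

```python
class TickQueue:
    """Coalesce edits within one cadence window: the wire sees only
    the latest body per function, so rapid saves to the same function
    cost a single push. Hypothetical sketch."""

    def __init__(self):
        self._pending = {}

    def record_edit(self, aura_id, body):
        # A later edit in the same window overwrites the earlier one.
        self._pending[aura_id] = body

    def flush(self):
        # Called once per tick: return the batch and clear the window.
        batch, self._pending = self._pending, {}
        return batch
```

Ten saves to the same function inside one window flush as one entry; a longer window simply lets more of them collapse.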

2. Increase compression level

[live.compression]
level = 9

15–30% smaller payloads. 2–4x CPU during encode. On battery this is a trade-off; on AC power it is free.

3. Narrow the scope

Use selective sync to include only the module you are in:

[live.selective]
include = ["src/billing/**"]

This is the biggest win on large repos: the scan itself becomes faster, and unrelated peer pushes are filtered out before they reach you.
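Inbound filtering can be pictured as a glob match against the include set before a peer push touches any local state. A sketch using Python's fnmatch, whose `*` matches across `/` and so approximates `**` closely enough for illustration:

```python
from fnmatch import fnmatch

def in_scope(path: str, include_globs) -> bool:
    # Drop a peer push unless its file path matches the include set.
    # fnmatch's `*` crosses `/`, which approximates `**` for a sketch;
    # Aura's real glob engine is not this code.
    return any(fnmatch(path, glob) for glob in include_globs)
```

With the config above, pushes under src/billing/ pass and everything else is filtered before it reaches you.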

Memory footprint

Live Sync holds a small in-memory index of (aura_id -> last_pushed_hash) plus the WAL write buffers.

  fn index:        ~56 bytes per function  (5.1 MB for 92k functions)
  WAL buffers:     ~2 MB  (2x segment_size_mb / 16)
  parser cache:    ~20 MB  peak, tree-sitter incremental
  total steady:    ~30 MB

For comparison, the VS Code process on the same repo is typically 400–800 MB. Live Sync is a rounding error.

Heartbeats and presence

Even when idle, Aura sends a small heartbeat every 30 seconds to keep the Mothership WebSocket alive and update presence. Heartbeat is a single framed JSON message, typically under 200 bytes.

  { "type":"hb", "peer":"alice", "branch":"feat/x", "active_fn":"billing::compute_tax", "ts":... }
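The sub-200-byte claim is easy to sanity-check by serializing a representative heartbeat (the timestamp below is a placeholder for the elided ts field):

```python
import json

hb = {
    "type": "hb",
    "peer": "alice",
    "branch": "feat/x",
    "active_fn": "billing::compute_tax",
    "ts": 1700000000,  # placeholder value for the elided ts field
}
# Compact separators, as a wire encoder would use.
payload = json.dumps(hb, separators=(",", ":")).encode()
```

This payload lands just under 100 bytes; even with framing on top it stays well under the 200-byte figure.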

You can disable the active-function field if you consider it sensitive:

[live.presence]
broadcast_active_fn = false

Worst-case scenarios

Mass rename. You run a rename across 500 functions. All 500 have a new aura_id. Aura pushes them in batches of 64 per tick, with compression. Total on-wire: ~140 KB spread over 8 ticks. At the default 5-second cadence the queue drains in well under a minute.
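The drain behavior follows from the batching caps. A sketch of the split (a hypothetical helper, not Aura's code; the parameter names follow max_fns_per_push and max_push_bytes from the config section):

```python
def split_into_pushes(fns, max_fns=64, max_bytes=512_000):
    # fns: list of (aura_id, wire_bytes) pairs waiting in the queue.
    # Close the current push when either cap would be exceeded.
    pushes, current, current_bytes = [], [], 0
    for aura_id, size in fns:
        if current and (len(current) == max_fns
                        or current_bytes + size > max_bytes):
            pushes.append(current)
            current, current_bytes = [], 0
        current.append(aura_id)
        current_bytes += size
    if current:
        pushes.append(current)
    return pushes
```

500 renamed functions at ~280 wire bytes each split into ceil(500/64) = 8 pushes totalling ~140 KB.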

Full repo format. 10,000 files reformatted, zero logic changes. Live Sync notices no aura_id changed and sends nothing. This is the single biggest win over file-level sync.

Generated code that thrashes. A protobuf regeneration emits new bodies for 2,000 functions. Live Sync pushes them; bandwidth spike for a minute. Mitigation: add **/*.pb.rs to exclude.

Gotcha: If you have an auto-formatter that runs on save and also changes the logical tree (some formatters rename locals), every save becomes a push. Either fix the formatter config or raise cadence_ms so the pushes coalesce.

Config

[live]
cadence_ms = 5000

[live.compression]
algo = "zstd"
level = 3
min_body_bytes = 64

[live.batching]
max_fns_per_push = 64
max_push_bytes = 512_000

[live.metered]
# If the OS reports the network as metered, switch to these.
cadence_ms = 20000
compression_level = 9

Troubleshooting

CPU spikes during ticks. You are on a huge repo and the warm scan is not warm. Run aura doctor live; the likely fix is to tighten include.

Bandwidth higher than expected. Usually generated code or formatters. Inspect with aura live sync status --trace --n 50 to see what recently went out.

High latency between save and teammate seeing change. First suspect is your cadence_ms setting, second is the Mothership's own push loop, third is network RTT. See sync troubleshooting.

Offline behavior

A laptop coming off the network does not pay any continuous cost. The outbound queue fills, retries back off, and heartbeats are suppressed after two consecutive failures. When the network returns, Aura sends a catch-up burst that drains the queue — typically finishing in seconds for normal work, minutes after a long offline stretch with heavy edits.
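The retry schedule can be sketched as capped exponential backoff plus heartbeat suppression after two consecutive failures (the constants here are illustrative, not Aura's actual schedule):

```python
def next_retry_delay(consecutive_failures: int,
                     base_s: float = 1.0, cap_s: float = 60.0) -> float:
    # Double the wait on each consecutive failure, up to a ceiling,
    # so a dead link costs a few probes per minute instead of a storm.
    return min(cap_s, base_s * (2 ** consecutive_failures))

def heartbeats_suppressed(consecutive_failures: int) -> bool:
    # After two straight failures the link is presumed down and
    # presence heartbeats stop until a send succeeds again.
    return consecutive_failures >= 2
```

A successful send resets the failure counter, which both restores heartbeats and drops the retry delay back to the base.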

One subtle perf note: after a long offline period the inbound side may receive a big batch of accumulated teammate pushes. Aura applies these in author-time order, batching the AST mutations so the file system sees a few coarse writes instead of many small ones. On a 30-minute offline return with ~400 teammate changes, the apply takes roughly 2–3 seconds.

Mothership-side cost

The Mothership does the fan-out. Each inbound push is routed to peers by a small in-memory subscription map, persisted to its own log for durability. Costs on the Mothership scale linearly with total team push volume, not with number of peers — each peer reads the same event stream.

For a 10-engineer team with paired AI agents, a t3.small Mothership handles the load at ~2% CPU and ~40 MB RAM. A t4g.nano is enough for a 3-person team. See Mothership operations for sizing.

Tuning worksheet

A quick checklist for matching config to environment:

  • Home office on fiber. Defaults. Nothing to tune.
  • Coffee shop Wi-Fi. Raise cadence_ms to 10000, raise compression level to 9.
  • Airplane / train with spotty LTE. Set cadence_ms to 30000 and rely on catch-up bursts when you land.
  • Bandwidth-capped corporate VPN. Narrow include to the module you are in; the scan cost drops and cross-module noise disappears from your pull stream.
  • Battery-sensitive laptop. Drop level to 1 or switch algo to lz4. CPU drops proportionally.

Each setting is hot-reloadable — Aura re-reads .aura/live.toml without a restart.

Measuring your own workload

Not every repo looks like the benchmarks. To measure your own:

aura live sync metrics --since 1h

Output:

  window:           60 min
  pushes:           142
  pulls:            97
  bytes out:        184 KB
  bytes in:         236 KB
  avg tick cost:    4.1 ms  scan, 0.6 ms diff, 1.8 ms push
  p95 tick cost:    12.3 ms
  heartbeats out:   119

If any number surprises you, drill in with aura live sync trace --n 50 to see the last fifty operations with their timings and sizes.

See Also