Function-Level Sync

The function is the AST unit the world already agrees on. Sync at that level and everything else gets easier.

Live Sync operates at the function level on a roughly 5-second cadence. Not per keystroke, not per line, not per file. This page explains why the function is the right unit, why 5 seconds is the right cadence, and what that choice buys you in conflict rates, bandwidth, and cognitive load. The short version: character-level sync for code is a syntactic nightmare and line-level sync is a typographic accident. Functions are the smallest unit that is both semantically complete and syntactically self-contained, and a 5-second window is long enough to batch useful work without feeling laggy.

Overview

Pick up any piece of source code and ask yourself: what is the smallest unit of this code that I could paste into a different file and expect to still be meaningful? Not a character. Not a line. Not a block. A function. Functions have names, argument lists, return types, and bodies. They are closed under the type system. They are addressable. They are the unit your code review tool shows you, the unit your stack trace points at, the unit your LLM asks you to paste.

If that is the unit humans and tools already reason about, it is also the unit real-time sync should reason about.

Why not characters?

Character-level sync — the CRDT-and-OT family that powers Google Docs and Figma — is a beautiful piece of engineering. It works because prose is forgiving: a half-typed word is still readable, a misplaced comma is still understandable. Code is not forgiving. A half-typed identifier is a compilation error. An unclosed brace breaks the parser for the next fifty lines. A missing semicolon cascades through the type checker.

If Aura streamed keystrokes, your teammate would spend most of the day looking at a syntactically invalid file. Their language server would thrash. Their formatters would refuse to run. Their linters would alarm on imaginary errors. Their AST-aware tools — including Aura itself, whose entire value proposition rests on parsing — would fail to index anything. The user experience would be a constant red underline.

Worse, the conflict model at character level is meaningless for code. If Alice types user_id and Bob types userId at the same position simultaneously, a character CRDT will merge them into user_iduserId, userIduser_id, or some interleaving of the two, because CRDTs do not know what an identifier is. Code has invariants that characters do not.
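
The merge behavior described above can be reproduced with a toy model. This is a minimal sketch, not a real CRDT: the `Op` struct, the `site` tie-breaker, and both ordering rules are assumptions made for illustration. It only shows that any ordering rule defined on characters produces an identifier nobody typed.

```rust
/// One typed character, tagged with its author and its position in that author's burst.
#[derive(Clone, Copy)]
struct Op {
    site: u8, // hypothetical author id, used only as a tie-breaker
    seq: u32, // index of the character within the author's insertion
    ch: char,
}

/// Turn a typed string into a run of insert operations for one author.
fn typed(site: u8, text: &str) -> Vec<Op> {
    text.chars()
        .enumerate()
        .map(|(i, ch)| Op { site, seq: i as u32, ch })
        .collect()
}

/// Merge two concurrent insertions made at the same position.
/// `interleave = false` keeps each author's run contiguous (ordered by site);
/// `interleave = true` orders by sequence number first, so characters alternate.
fn merge(a: &[Op], b: &[Op], interleave: bool) -> String {
    let mut all: Vec<Op> = a.iter().chain(b.iter()).copied().collect();
    if interleave {
        all.sort_by_key(|op| (op.seq, op.site));
    } else {
        all.sort_by_key(|op| (op.site, op.seq));
    }
    all.iter().map(|op| op.ch).collect()
}
```

With Bob as site 1 typing userId and Alice as site 2 typing user_id, the contiguous ordering yields userIduser_id and the interleaving ordering yields uusseerrI_did. Both results converge, and neither is an identifier either author meant.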

Why not lines?

Line-level sync is what Git uses, and its limits are well understood by now. Lines are a typographic unit: they are about where you chose to put newlines for readability. They carry no semantic weight. Move a function up ten lines and the line diff is the whole function, twice (once as removal, once as addition). Reformat an argument list from one line to three and the line diff hides the fact that nothing semantically changed.

For real-time sync, line-level is both too fine and too coarse. Too fine because you get conflicts on reflow. Too coarse because a one-line fix in a three-hundred-line file still forces a re-diff of the whole file on every scan.

Why functions?

Functions are the Goldilocks unit:

  • Syntactically self-contained. A function opens and closes cleanly. Parse one function and you get a valid subtree. Replace a function body in-place and the surrounding file is still parseable.
  • Semantically complete. A function typically represents one unit of intent — one thing the programmer or agent was trying to do. Saves tend to happen at function boundaries ("let me try that function again"), not mid-identifier.
  • Content-addressable. Every function in Aura already has an aura_id derived from its normalized AST. We can hash it, we can diff it, we can route it.
  • Right-sized for the network. The median function body is a few hundred bytes. Compressed, that is a packet. A whole file would be kilobytes; a keystroke would be wasteful overhead.

The AST unit is the unit the world already agreed on. Function-level sync just pays attention to the agreement.

The 5-second cadence

Aura scans the dirty set every 5 seconds by default. Why 5 and not 1, not 30?

  • Too fast (< 2s) and you interrupt the user. Saves in most editors happen on a debounce tied to keystrokes, focus loss, or explicit save commands. If you push faster than the user saves, you push half-work.
  • Too slow (> 10s) and you are no longer "real-time." Ten seconds is long enough for two people to each write a conflicting function and neither notice. That defeats the point of the feature.
  • 5 seconds lines up with typical save rhythm in VS Code, Vim auto-save plugins, and JetBrains auto-save. It is long enough to batch a burst of edits into one push, short enough that the gap between "I saved" and "Alice sees it" is not awkward.

The cadence is tunable. For very large repos, extend it:

[live]
cadence_ms = 10000   # 10 seconds

For paired AI workflows where the agent saves after every tool call, shorten it:

[live]
cadence_ms = 2000    # 2 seconds

Gotcha: Do not go below 1 second. The scan itself takes real CPU, and under 1 second you will be scanning before the previous scan finished. The result is queue growth, not faster sync.
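
A config loader or wrapper script could enforce that floor before a bad value reaches the scanner. The sketch below is an assumption about how such a guard might look. `MIN_CADENCE_MS` and `tick_interval` are hypothetical names, and nothing here implies Aura itself rejects low values.

```rust
use std::time::Duration;

/// Hypothetical floor, taken from the "do not go below 1 second" advice.
const MIN_CADENCE_MS: u64 = 1_000;

/// Convert a configured `cadence_ms` into a tick interval,
/// refusing values that would let scans overlap.
fn tick_interval(cadence_ms: u64) -> Result<Duration, String> {
    if cadence_ms < MIN_CADENCE_MS {
        return Err(format!(
            "cadence_ms = {} is below the {} ms floor",
            cadence_ms, MIN_CADENCE_MS
        ));
    }
    Ok(Duration::from_millis(cadence_ms))
}
```

Failing loudly at load time is a design choice: a silently clamped value would hide the misconfiguration that the queue-growth symptom would otherwise reveal.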

How It Works

Each sync tick runs this loop:

  ┌─── t = 0s ─────────────────────────────────┐
  │ scan dirty set (files mtime > last scan)  │
  │ parse each dirty file to AST              │
  │ for each function node:                   │
  │   new_hash = aura_id(normalize(node))     │
  │   if new_hash != last_pushed_hash:        │
  │     enqueue(fn_body, new_hash)            │
  │ commit queue to WAL                       │
  │ push WAL tail to Mothership               │
  └────────────────────────────────────────────┘
  ┌─── t = 5s ─────────────────────────────────┐
  │ ... repeat ...                            │
  └────────────────────────────────────────────┘

The scan is mtime-gated — only files whose modification time has advanced since the last scan are re-parsed. On a 50k-function repo with 3 files changed, the scan is microseconds. On a cold start, the scan is proportional to repo size and takes a few seconds once.
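
The mtime gate can be sketched with the standard library alone. This is an illustration under assumptions: `is_dirty` and `dirty_set` are hypothetical names, and skipping files that cannot be stat'ed is a choice made here for brevity, not documented Aura behavior.

```rust
use std::fs;
use std::path::PathBuf;
use std::time::SystemTime;

/// A file is dirty when its modification time is strictly later than the last scan.
fn is_dirty(mtime: SystemTime, last_scan: SystemTime) -> bool {
    mtime > last_scan
}

/// Return the candidates that need re-parsing on this tick.
/// Files whose metadata cannot be read (deleted, permission denied) are skipped.
fn dirty_set(candidates: &[PathBuf], last_scan: SystemTime) -> Vec<PathBuf> {
    candidates
        .iter()
        .filter(|path| {
            fs::metadata(path)
                .and_then(|meta| meta.modified())
                .map(|mtime| is_dirty(mtime, last_scan))
                .unwrap_or(false)
        })
        .cloned()
        .collect()
}
```

Only the paths returned by `dirty_set` would go on to the parser, which is why the per-tick cost tracks the number of changed files and not the size of the repo.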

The hash compares against the last pushed hash, not the last committed hash. That means sync works regardless of whether you have committed. Edits flow continuously, independent of Git.
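
The comparison step of the loop can be sketched as follows. The `FnKey` and `Hash` aliases and the `plan_pushes` function are hypothetical stand-ins, with a `u64` in place of the 32-byte hash. The sketch also records a hash as pushed at enqueue time, which simplifies away whatever acknowledgement handling the real system performs.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for a function's key and its content hash.
type FnKey = String;
type Hash = u64;

/// Compare this tick's hashes against the last *pushed* hashes and
/// return the keys that need to be enqueued. Git state is never consulted.
fn plan_pushes(
    current: &[(FnKey, Hash)],
    last_pushed: &mut HashMap<FnKey, Hash>,
) -> Vec<FnKey> {
    let mut queue = Vec::new();
    for (key, new_hash) in current {
        // New function, or a body whose hash differs from what was last sent.
        if last_pushed.get(key) != Some(new_hash) {
            queue.push(key.clone());
            last_pushed.insert(key.clone(), *new_hash);
        }
    }
    queue
}
```

Running the same tick twice enqueues nothing the second time, and editing one function enqueues only that function on the next tick.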

What lands on the wire

The payload for a single synced function looks like this (simplified):

struct LiveFnPush {
    aura_id: [u8; 32],        // stable function identity
    content_hash: [u8; 32],   // hash of this body revision
    body_zstd: Vec<u8>,       // zstd-compressed function body
    parent_hash: [u8; 32],    // prior pushed hash (for conflict check)
    author: PeerId,
    ts_ms: u64,
}

A typical compressed body is under 500 bytes. A 5-second sync tick that touched three functions sends about 1.5 KB. Compare that to pushing the whole dirty file (kilobytes) or streaming keystrokes (tens of KB/s). See bandwidth and perf for the full numbers.
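
The parent_hash field is described as being there for a conflict check. One plausible reading of that check is sketched below. The `Verdict` enum, the trimmed `Push` struct, and the three-way rule are assumptions for illustration, not the receiver's documented logic.

```rust
/// Trimmed-down push carrying only the fields the check needs.
struct Push {
    content_hash: [u8; 32],
    parent_hash: [u8; 32],
}

#[derive(Debug, PartialEq)]
enum Verdict {
    /// The push was built on the revision the receiver currently holds.
    FastForward,
    /// The receiver already holds exactly this revision.
    AlreadyApplied,
    /// The push was built on some other revision; someone else got there first.
    Conflict,
}

/// Decide what to do with an incoming push, given the receiver's current hash
/// (`head`) for the same function.
fn check_push(head: [u8; 32], push: &Push) -> Verdict {
    if push.content_hash == head {
        Verdict::AlreadyApplied
    } else if push.parent_hash == head {
        Verdict::FastForward
    } else {
        Verdict::Conflict
    }
}
```

Because the check is scoped to one function, two people editing different functions in the same file would both land in the fast-forward case.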

What happens when you save a file that didn't change logically

If you save a file after only reformatting or adding comments, no function's aura_id changed, and therefore no push happens. Live Sync is semantically quiet by default. Whitespace, comments, formatter runs — all invisible to the wire. This is the same property that makes Aura's semantic diff useful, reused for real-time.
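
The effect can be imitated with a deliberately crude, text-level stand-in for AST normalization: strip `//` line comments, drop all whitespace, then hash. The `normalize` and `content_hash` functions are hypothetical, and a real normalizer would work on the parse tree, which also copes with string literals, block comments, and trailing commas.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Crude normalization: remove `//` line comments and every whitespace character.
/// A `//` inside a string literal would be mishandled; this is a toy.
fn normalize(src: &str) -> String {
    src.lines()
        .map(|line| line.split("//").next().unwrap_or(""))
        .flat_map(|line| line.chars())
        .filter(|c| !c.is_whitespace())
        .collect()
}

/// Hash the normalized text. Two sources that differ only in layout or
/// comments produce the same value.
fn content_hash(src: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    normalize(src).hash(&mut hasher);
    hasher.finish()
}
```

A one-line `fn add(a: i32, b: i32) -> i32 { a + b }` and a reformatted, commented copy of it hash identically, so a save that only ran the formatter would enqueue nothing. Changing `+` to `-` changes the normalized text and would trigger a push.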

Config

[live]
enabled = true
cadence_ms = 5000

[live.scan]
# Maximum files to parse per scan tick. Prevents scan storms on huge repos.
max_files_per_tick = 256
# Functions larger than this push as a single blob without body diffing.
large_fn_threshold_bytes = 8192
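
As a rough illustration of what a per-tick cap like max_files_per_tick implies, the sketch below takes a bounded batch from the dirty queue and leaves the remainder for a later tick. The `next_batch` function is hypothetical, and deferring the overflow in arrival order is an assumption, not documented behavior.

```rust
use std::collections::VecDeque;

/// Take at most `max_files_per_tick` paths off the front of the dirty queue.
/// Whatever stays in the queue is picked up by a later tick.
fn next_batch(dirty: &mut VecDeque<String>, max_files_per_tick: usize) -> Vec<String> {
    let n = dirty.len().min(max_files_per_tick);
    dirty.drain(..n).collect()
}
```

Under this model a branch switch that dirties 1,000 files would drain over four ticks at the default cap of 256, so no single tick has to parse all 1,000 at once.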

Troubleshooting

If aura live sync status shows the outbound queue growing without bound, the scan is producing pushes faster than the Mothership can ack them. Check that cadence_ms is not pathologically low, check network latency to the Mothership, and see sync troubleshooting.

If you see zero pushes despite saving, your file probably is not in the include glob. See selective sync.

Edge cases

A few edge cases are worth knowing about.

Anonymous functions and closures

A closure inside a function is not a top-level syncable node on its own — its identity is tied to the enclosing function. When the closure changes, the enclosing function's aura_id changes, and the whole enclosing body is pushed. This is intentional: closures are not independently addressable in most languages, and treating them as first-class sync units would create phantom identities that drift between saves.

Methods on types

Methods are handled the same way as free functions. Their aura_id includes the enclosing type and module path, so impl User { fn login() } and impl Admin { fn login() } have distinct identities even though their inner names match. Rename the type and the method's aura_id changes; the rename-proof property only applies to the method's own name within a stable enclosing type.
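
The scoping rule can be pictured as an identity key with three parts. The `FnIdentity` struct below is a hypothetical model of that rule only. It does not reproduce how aura_id is actually computed, and the `auth` module path is made up for the example.

```rust
/// Hypothetical identity key: module path, optional enclosing type, function name.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct FnIdentity {
    module_path: String,
    enclosing_type: Option<String>,
    name: String,
}

impl FnIdentity {
    /// Identity of a method defined in an `impl` block for `ty`.
    fn method(module_path: &str, ty: &str, name: &str) -> Self {
        FnIdentity {
            module_path: module_path.to_string(),
            enclosing_type: Some(ty.to_string()),
            name: name.to_string(),
        }
    }

    /// Identity of a free function with no enclosing type.
    fn free(module_path: &str, name: &str) -> Self {
        FnIdentity {
            module_path: module_path.to_string(),
            enclosing_type: None,
            name: name.to_string(),
        }
    }
}
```

Under this model `FnIdentity::method("auth", "User", "login")` and `FnIdentity::method("auth", "Admin", "login")` compare unequal, and renaming User to Account yields a third, distinct key, which matches the rename caveat above.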

Macros and generated code

Macro-generated functions do not exist in your source tree until expansion. Live Sync operates on pre-expansion source, so the macro call itself is the node it tracks. If your macro call is inside a function, that function's body, including the macro invocation, is the unit. If a macro expands to produce new top-level functions, those functions only exist after the build step and are not in your source, so they are not synced.

Very large functions

A function whose compressed body exceeds large_fn_threshold_bytes (default 8 KB) is still pushed as one unit, but the push bypasses the small-payload optimizations and streams as a single blob. This is rare — most functions are well under a KB compressed — but it happens for generated parsers, giant match statements, and the occasional handwritten behemoth.

Non-function top-level items

Not everything at the top level of a file is a function. Constants, type definitions, macros, imports — Aura treats these as tracked nodes too, but with different sync rules. A constant definition change is pushed like a function body. An import change is pushed along with any function that references it. The granularity is "the smallest logical unit that makes sense," which for most practical purposes is the function, with non-function items riding along when they change.

Comparison to alternatives

| Sync unit | Noise floor | Syntax safety | Conflict model |
|---|---|---|---|
| Character (CRDT) | High: every keystroke | Invalid trees in flight | Meaningless merges |
| Line (file-diff) | Medium: reflows trigger | Valid on save | Overlapping hunks |
| File (whole-buffer) | Medium: formatters trigger | Valid on save | File-level overwrite |
| Function (Aura) | Low: only logic changes | Always valid | Per-function, rare |

The function-level row comes out ahead on every axis except possibly raw latency, and a 5-second window is not a latency disadvantage for any realistic workflow.

See Also