Content-Addressed Logic
Your code is a set of named things. Name them by what they are, not where they live.
A line number is a lie. It tells you where a piece of code is today, not what it is. Add an import at the top of a file and every line number below shifts by one. Reformat the file and half the line numbers become fictional. Rename the file and the line numbers point to a file that no longer exists.
Yet most tools address code by file and line. auth.rs:42. src/lib.rs:118:5. Compiler errors, stack traces, review comments, issue trackers — all of them pinned to a coordinate that changes every time you breathe near the file.
Aura addresses code differently. Every logic node in the system — every function, method, class, top-level constant — is identified by a content-derived hash. The hash is stable across file moves, line shifts, and reformatting. The hash is the address.
A line number tells you where to find the code. A content hash tells you what the code is.
What is a logic node
A logic node is a named, independently meaningful unit of code. In practice:
- Top-level functions.
- Methods (identified by their owning type and method signature).
- Classes, structs, enums, traits.
- Top-level constants and static bindings.
- Type aliases.
Each of these is a node in the logic graph. Each has a stable identity and a current content hash.
Nested things — expressions, statements, local bindings inside a function — are not independent logic nodes. They live inside their enclosing function's hash. If you change a local expression, the enclosing function's hash changes; the expression itself has no separate address.
This granularity is a deliberate choice. Functions are the smallest unit of code that most programmers think about independently. Smaller than that, and the graph becomes noisy; larger, and the tool loses precision.
Computing the hash
A node's content hash is derived from its normalized AST. The normalization steps:
- Parse the node with tree-sitter.
- Strip trivia: whitespace, comments, formatting tokens.
- Canonicalize local identifiers (so `let x = 1` and `let y = 1` hash the same).
- Exclude the node's own name from the hash (so renames don't change identity). The name is recorded separately as metadata.
- Include structural signature: parameter shapes, return shape, generic bounds.
- Hash the result.
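The output is a content digest (conceptually a hash like fn:7c9f21a4b2...) that is stable under all the cosmetic changes discussed in semantic diff and that changes whenever the node's normalized structure changes. Most behavioral changes alter that structure; the converse does not hold, because two structurally different nodes can still behave identically.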
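The normalization steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Aura's implementation: it uses Python's standard `ast` module in place of tree-sitter, SHA-256 as the digest, and a hypothetical helper named `content_hash`. It only canonicalizes parameters and assignment targets, and it does not handle a function that calls itself by name.

```python
import ast
import hashlib


def content_hash(source: str) -> str:
    """Hash one function definition, ignoring layout, comments, and names."""
    # Parsing already drops whitespace and comments (the "strip trivia" step).
    func = ast.parse(source).body[0]

    # Map parameters and assignment targets to positional placeholders.
    placeholders = {}
    for node in ast.walk(func):
        if isinstance(node, ast.arg):
            placeholders.setdefault(node.arg, f"_v{len(placeholders)}")
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            placeholders.setdefault(node.id, f"_v{len(placeholders)}")

    # Rewrite every use of a local name; globals and callees keep their names.
    for node in ast.walk(func):
        if isinstance(node, ast.arg):
            node.arg = placeholders[node.arg]
        elif isinstance(node, ast.Name) and node.id in placeholders:
            node.id = placeholders[node.id]

    # The node's own name is metadata, not content.
    func.name = "_"

    # ast.dump omits line and column attributes by default, so position in the
    # file does not matter. Parameters and return annotation stay in the dump.
    digest = hashlib.sha256(ast.dump(func).encode()).hexdigest()
    return f"fn:{digest[:12]}"


a = content_hash("def add_one(x):\n    y = x + 1\n    return y\n")
b = content_hash("def increment(n):  # renamed\n\n    total = n + 1\n    return total\n")
c = content_hash("def add_two(x):\n    y = x + 2\n    return y\n")
print(a == b)  # True: rename, reformat, and comment leave the hash alone
print(a == c)  # False: the body changed
```

The two-pass design (collect bound names, then rewrite) keeps placeholder numbering tied to tree shape rather than to the spelling of any identifier, which is what lets `x`/`y` and `n`/`total` collapse to the same digest.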
The logic graph
Once every node has an address, code becomes a graph. Nodes are connected by references: function A calls function B, so there is an edge from A to B. Type T is used in function F, so there is an edge from F to T. Module M re-exports function A, so M has a reference to A.
Aura maintains this graph as part of the shadow branch. The graph is updated on every edit: new edges appear when you add a call, existing edges are preserved when you rename things, edges are rewritten when you move a node across files.
The graph is not decorative. It is what makes the following possible:
- Impact analysis. When a function's hash changes, Aura can walk the graph backward to find every caller and flag them for review.
- Cross-branch alerts. When a teammate modifies a node you reference on another branch, the graph lets Aura find the intersection.
- Proof queries. `aura_prove("user can authenticate via OAuth")` finds the relevant logic nodes and checks their connections. The graph is what makes a proof query feasible.
- Rename propagation. Renaming a function means updating its references. The graph is the list of references to update.
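Impact analysis in particular is a reverse walk over reference edges. The sketch below is illustrative only: the graph is a plain dictionary, the node names are made up for the example, and `reverse_edges` and `impacted_by` are hypothetical helpers rather than Aura APIs.

```python
from collections import deque

# Hypothetical reference edges: each node maps to the nodes it references.
references = {
    "fn:authenticate": {"fn:hash_password", "type:User"},
    "fn:refresh_token": {"fn:authenticate"},
    "fn:hash_password": set(),
    "type:User": set(),
}


def reverse_edges(graph):
    """Invert the graph so each node maps to the nodes that reference it."""
    reverse = {node: set() for node in graph}
    for source, targets in graph.items():
        for target in targets:
            reverse.setdefault(target, set()).add(source)
    return reverse


def impacted_by(changed, graph):
    """Return every node that directly or transitively references `changed`."""
    reverse = reverse_edges(graph)
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for referrer in reverse.get(node, ()):
            if referrer not in seen:  # the seen set also makes cycles safe
                seen.add(referrer)
                queue.append(referrer)
    return seen - {changed}


print(sorted(impacted_by("fn:hash_password", references)))
# ['fn:authenticate', 'fn:refresh_token']
```

Direct callers are just `reverse_edges(references)[changed]`; the breadth-first walk extends that to callers of callers, which is a choice made for this sketch.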
Merkle-tree-like structure
The content-addressed graph has a property that echoes a Merkle tree: a node's identity depends on its content, and a module's identity depends on the identities of the nodes it contains. Change a leaf (edit a function body), and the hashes of the containing structures can be recomputed incrementally.
The structure is not a strict Merkle tree — logic nodes can reference each other in cycles, which a tree forbids — but it shares the property that content-addressing cascades cleanly. This is what makes Aura's diffs fast: two versions of a module whose functions have unchanged hashes compare as equal without re-examining their bodies.
At a conceptual level:
module:auth
├── fn:authenticate [hash: 7c9f21...]
│ ├── references fn:hash_password
│ └── references type:User
├── fn:refresh_token [hash: a4b2d9...]
│ └── references fn:authenticate
└── type:User [hash: c1f8e0...]
Change hash_password, and authenticate's hash is unaffected (it references hash_password by identity, not by inlined content) — but Aura flags authenticate as potentially behaviorally affected because a function it calls has changed. The structural hash and the transitive impact are distinct questions, both of which the graph can answer.
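A small sketch of the cascade, assuming SHA-256 and a hypothetical `module_digest` helper. The short digests are placeholders, and nodes are keyed by name only for readability; a real system would key them by a stable identity.

```python
import hashlib


def module_digest(node_hashes):
    """Combine child hashes into one digest; sorting makes it order-independent."""
    h = hashlib.sha256()
    for name, digest in sorted(node_hashes.items()):
        h.update(f"{name}={digest}\n".encode())
    return "module:" + h.hexdigest()[:12]


def changed_nodes(old, new):
    """Names whose hash differs, found without reading any function body."""
    return {name for name in old.keys() | new.keys() if old.get(name) != new.get(name)}


# Placeholder digests, not real hashes.
auth_v1 = {"authenticate": "7c9f21", "refresh_token": "a4b2d9", "User": "c1f8e0"}
auth_v2 = dict(auth_v1)                            # same nodes, e.g. after a file move
auth_v3 = {**auth_v1, "refresh_token": "ffee00"}   # one function edited

print(module_digest(auth_v1) == module_digest(auth_v2))  # True
print(module_digest(auth_v1) == module_digest(auth_v3))  # False
print(changed_nodes(auth_v1, auth_v3))                   # {'refresh_token'}
```

Equal module digests short-circuit the comparison; only when they differ does the diff descend to per-node hashes.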
Why this representation matters
Several everyday operations become trivial once code is content-addressed:
Deduplication. Two functions with the same hash are the same function. If the same function appears in two branches via independent authorship, Aura recognizes them as identical. Storage is shared.
Stable references. A code review comment pinned to fn:7c9f21... remains valid even after the function moves, is renamed, or shifts line numbers. The comment is anchored to the node's identity, so its position in the text follows the node automatically.
Cheap equality checks. Deciding "is this the same function I saw yesterday?" is a hash comparison, not a body comparison. Traversing a large codebase is correspondingly fast.
History queries that follow identity. "Show me every version of this function" is a graph traversal, not a git log --follow heuristic. The traversal is exact.
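Deduplication and cheap equality both fall out of keying storage by content. A minimal sketch, assuming SHA-256 and a hypothetical in-memory `NodeStore`; Aura's actual storage layer is not described here.

```python
import hashlib


class NodeStore:
    """Toy content-addressed store: identical content is kept exactly once."""

    def __init__(self):
        self._objects = {}

    def put(self, normalized: str) -> str:
        address = "fn:" + hashlib.sha256(normalized.encode()).hexdigest()[:12]
        self._objects.setdefault(address, normalized)  # no-op if already stored
        return address

    def get(self, address: str) -> str:
        return self._objects[address]

    def __len__(self):
        return len(self._objects)


store = NodeStore()

# Two branches arrive at the same normalized function independently.
main_addr = store.put("fn(_v0) -> _v0 + 1")
feature_addr = store.put("fn(_v0) -> _v0 + 1")
other_addr = store.put("fn(_v0) -> _v0 + 2")

print(main_addr == feature_addr)  # True: equality is an address comparison
print(len(store))                 # 2: the shared function is stored once
```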
Addresses vs locations
A useful distinction. A location tells you where code lives at this moment: file path, line number, character offset. Locations are ephemeral and change on every edit. An address tells you what code is, independent of where it lives. Addresses are stable.
Aura uses locations when they are the right tool — showing you a file to edit, pointing to a character in an error message. But for reasoning about code over time, locations are the wrong primitive. Addresses are.
| Question | Location-based answer | Content-addressed answer |
|---|---|---|
| Find this function later | Keep updating the line number | Use the address; it never changes |
| Has this function changed? | Compare line-by-line text | Compare two hashes |
| Is this the same function in another branch? | Heuristic name matching | Hash match |
| Tell me callers of this function | Grep (unreliable) | Graph traversal (exact) |
| Link a review comment to a line | Breaks on next edit | Pins to address; survives edits |
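One way to picture the split: locations live in a derived index that is rebuilt after edits, while everything durable stores only the address. The sketch below is hypothetical throughout, including the `Location` type, the `resolve` helper, the file paths, and the placeholder address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    path: str
    line: int


def resolve(comment, index):
    """Look up where the comment's target lives right now."""
    return index[comment["target"]]


# A review comment stores the address, never a file or line.
comment = {"target": "fn:7c9f21", "text": "Check token expiry here."}

# Index as of the first commit (placeholder address, made-up path).
index_before = {"fn:7c9f21": Location("src/auth.rs", 42)}

# The file is moved and an import is added above the function.
# Only the derived index changes; the comment is untouched.
index_after = {"fn:7c9f21": Location("src/session/auth.rs", 43)}

print(resolve(comment, index_before))  # Location(path='src/auth.rs', line=42)
print(resolve(comment, index_after))   # Location(path='src/session/auth.rs', line=43)
```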
What the hash guarantees and what it doesn't
Guarantees:
- Two nodes with the same hash have the same normalized AST (barring a hash collision), so their own code is structurally identical.
- A node's hash is independent of its file, its line, and its formatting.
- A node's hash is independent of the names of its local variables.
- A node's hash is independent of its own name.
Does not guarantee:
- Two nodes with different hashes behave differently. They might be equivalent at runtime (`x + 1` vs `1 + x`), and Aura does not try to prove that.
- Two nodes with the same hash call the same external functions. External calls are referenced by identity in the graph, and those referenced nodes can evolve independently. This is why Aura surfaces transitive impact separately.
- The hash is a substitute for tests. It tells you what changed structurally; it does not tell you whether the change is correct.
Addressing and evolving the format
Content hashes are derived by Aura's normalization algorithm, and that algorithm evolves. A new version of Aura might treat some trivia differently, or support a new language feature, or normalize generics more carefully. When the normalization changes, hashes change.
Aura handles this by versioning its hashing algorithm and recording, per shadow commit, which version produced the hashes. Older commits keep their old hashes; new commits use the new ones. Cross-version comparisons use a stable identity anchor (see function-level identity) rather than raw hash equality.
The practical upshot: you can upgrade Aura without losing history. Your old commits are still addressable, your old rewinds still work, your old review comments still resolve.
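A sketch of how a versioned record might keep cross-version comparisons honest. The `NodeRecord` fields, the identity format, and both helper functions are assumptions made for illustration; the source only states that the algorithm version is recorded per shadow commit and that an identity anchor is used across versions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeRecord:
    identity: str      # stable identity anchor (hypothetical format)
    hash_version: int  # normalization algorithm version that produced `digest`
    digest: str


def same_content(a: NodeRecord, b: NodeRecord) -> Optional[bool]:
    """Compare digests only when the same algorithm version produced them."""
    if a.hash_version != b.hash_version:
        return None  # raw digests from different versions are not comparable
    return a.digest == b.digest


def same_node(a: NodeRecord, b: NodeRecord) -> bool:
    """Cross-version matching falls back to the identity anchor."""
    return a.identity == b.identity


old = NodeRecord(identity="node-001", hash_version=1, digest="7c9f21")
new = NodeRecord(identity="node-001", hash_version=2, digest="e03b55")

print(same_content(old, new))  # None: versions differ, so hashes say nothing
print(same_node(old, new))     # True: the anchor still links the two records
```

Returning `None` rather than `False` keeps "cannot tell" distinct from "different", which matters when an upgrade rehashes nodes whose code did not change.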
The address is the API
For Aura's internal protocols — between CLI, MCP tools, hooks, and the mothership — the content address is the canonical identifier. A message that says "function X was modified" carries the address of X. A review comment references an address. A rewind operation specifies an address. A sync push ships addresses.
This is the right level of abstraction. Addresses are portable; locations are not. Two agents working on two branches can talk about the same function by its address even if they have never seen each other's code. Their addresses match; their tools can proceed.
A short mental exercise
Imagine a codebase where every function is a content address and the files are merely views over that address space. Rearranging files is like rearranging bookmarks; it doesn't move the books. Renaming a function is like relabeling a bookmark; the book behind it is the same. Two branches that both contain the same function share the same underlying object; only one copy is stored.
This is the mental picture Aura is building toward. Git treats files as source of truth and infers everything else. Aura treats the logic graph as source of truth and uses files as a presentation layer. The graph is durable; the files are arrangements.
Related
Content addresses are what make function-level identity stable, what semantic diff operates on, and what the shadow branches store. The graph is traversed by AST merge to match nodes across branches.