AST Merge
Three-way merge, but the nodes are the units — not the lines.
Merging is where version control earns or loses trust. A tool that merges cleanly 99% of the time and silently corrupts the other 1% is worse than one that flags every hard case. Git's merge algorithm — three-way diff over lines — is a triumph of practicality, but it has a structural weakness: it does not know what a function is. It treats your program as a stream of characters separated by newlines. When two branches edit nearby lines, it throws up its hands.
Aura merges the same way Git does, but on a different substrate. The three-way merge runs over the AST, not the text. The unit of conflict is a node. The result is a merge algorithm that resolves cases Git cannot, and flags cases Git misses.
Merge conflicts are mostly an artifact of the representation, not the code. Change the representation, and most conflicts disappear.
The three-way merge, briefly
Every modern VCS uses a three-way merge. Given:
- Base (B): the common ancestor of two branches
- Ours (O): the current branch
- Theirs (T): the branch being merged in
The tool compares B→O and B→T. Where both sides changed the same region differently, it produces a conflict. Where only one side changed, it applies that side. Where both sides made the same change, it applies it once.
Git does this over lines. Aura does it over AST nodes. The arithmetic is identical; the granularity is not.
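The rule above can be written down in a few lines. This is a minimal sketch of the generic three-way decision for one unit, not Aura's implementation; `MergeResult` and `mergeUnit` are names invented for illustration, and the unit is compared with plain equality, which a real tool would replace with line or node comparison.

```typescript
// Result of merging a single unit (a line for Git, a node for an AST tool).
type MergeResult<T> =
  | { kind: "merged"; value: T }
  | { kind: "conflict"; ours: T; theirs: T };

// Generic three-way rule over base (B), ours (O), and theirs (T).
function mergeUnit<T>(base: T, ours: T, theirs: T): MergeResult<T> {
  if (ours === theirs) return { kind: "merged", value: ours };  // same change (or no change) on both sides
  if (ours === base) return { kind: "merged", value: theirs };  // only theirs changed
  if (theirs === base) return { kind: "merged", value: ours };  // only ours changed
  return { kind: "conflict", ours, theirs };                    // both changed, differently
}

const onlyTheirs = mergeUnit<string>("return token;", "return token;", "return wrap(token);");
// onlyTheirs is { kind: "merged", value: "return wrap(token);" }

const bothDiffer = mergeUnit<string>("return token;", "return null;", "return wrap(token);");
// bothDiffer.kind is "conflict"
```

The function is the same whether `T` is a line or a node; what changes between Git and an AST-based merge is how the inputs are cut into units.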
A case Git handles badly
Two developers edit the same function on different branches.
Base:
function authenticate(user: string, pass: string) {
  const token = generateToken(user);
  return token;
}
Branch A (renames parameter for clarity):
function authenticate(username: string, pass: string) {
  const token = generateToken(username);
  return token;
}
Branch B (adds a null check):
function authenticate(user: string, pass: string) {
  if (!user) throw new Error("user required");
  const token = generateToken(user);
  return token;
}
Git sees overlapping line edits and produces a conflict. A human has to open the file, read both hunks, and manually combine them.
Aura sees two changes on two different AST nodes: a parameter rename on the signature and identifier nodes, and a new if statement inserted into the block body. The operations compose. Aura produces:
function authenticate(username: string, pass: string) {
  if (!username) throw new Error("user required");
  const token = generateToken(username);
  return token;
}
Note the rename propagated into the new if statement. That is not accidental — it falls out of doing the rename as an operation over the AST (rename a binding, rewrite all references) rather than as a find-and-replace over lines.
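One way to picture why the rename reaches the inserted statement: if references store the identity of a binding rather than its spelling, a rename is a single update and every reference follows. The sketch below is a hypothetical miniature model, not Aura's data structures; `AstPart`, `FnModel`, `renameBinding`, and `renderFn` are all invented for illustration.

```typescript
// A statement is a list of parts. References point at a binding id, never a name.
type AstPart =
  | { kind: "ref"; binding: number }   // a use of a declared name
  | { kind: "text"; value: string };   // any other source text, kept opaque here

interface FnModel {
  bindings: Record<number, string>;    // binding id -> current name
  statements: AstPart[][];
}

// Renaming touches exactly one table entry.
function renameBinding(fn: FnModel, id: number, newName: string): void {
  fn.bindings[id] = newName;
}

// Print each statement, resolving references through the binding table.
function renderFn(fn: FnModel): string[] {
  return fn.statements.map(parts =>
    parts.map(p => (p.kind === "ref" ? fn.bindings[p.binding] : p.value)).join("")
  );
}

const USER_PARAM = 1;
const authenticateBody: FnModel = {
  bindings: { [USER_PARAM]: "user" },
  statements: [
    // Branch B's inserted null check, expressed against the binding.
    [
      { kind: "text", value: "if (!" },
      { kind: "ref", binding: USER_PARAM },
      { kind: "text", value: ') throw new Error("user required");' },
    ],
    [
      { kind: "text", value: "const token = generateToken(" },
      { kind: "ref", binding: USER_PARAM },
      { kind: "text", value: ");" },
    ],
  ],
};

renameBinding(authenticateBody, USER_PARAM, "username"); // Branch A's rename
const renderedBody = renderFn(authenticateBody);
// renderedBody[0] is: if (!username) throw new Error("user required");
// renderedBody[1] is: const token = generateToken(username);
```

A text-based find-and-replace run on branch A alone would never see branch B's new line; here the order of the two changes does not matter.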
A case Git handles wrongly
Two developers each add a new case to the same switch statement, at positions whose line edits do not overlap. Git sees two independent hunks and merges both, sometimes producing code that compiles but is semantically inconsistent (two cases matching the same pattern, for example).
Aura sees two new child nodes under the same parent switch node. If they overlap semantically — same pattern, different bodies — Aura flags a conflict even though the text does not collide. The merge is suspicious, and the human is asked to confirm.
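The check described here can be sketched as a comparison of the cases each side added under the same parent. This is an illustrative model only: `SwitchCase`, `CaseMerge`, and `mergeAddedCases` are hypothetical names, patterns are compared as plain strings, and the additions are assumed not to exist in the base already.

```typescript
interface SwitchCase {
  pattern: string; // the matched pattern, as source text
  body: string;    // the case body, as source text
}

type CaseMerge =
  | { kind: "merged"; cases: SwitchCase[] }
  | { kind: "conflict"; pattern: string; ours: SwitchCase; theirs: SwitchCase };

function mergeAddedCases(
  base: SwitchCase[],
  oursAdded: SwitchCase[],
  theirsAdded: SwitchCase[]
): CaseMerge {
  // Same pattern with different bodies is flagged even if the text never collides.
  for (const o of oursAdded) {
    for (const t of theirsAdded) {
      if (o.pattern === t.pattern && o.body !== t.body) {
        return { kind: "conflict", pattern: o.pattern, ours: o, theirs: t };
      }
    }
  }
  // Identical additions are applied once; distinct patterns are both kept.
  const merged = base.concat(oursAdded);
  for (const t of theirsAdded) {
    if (!merged.some(c => c.pattern === t.pattern)) merged.push(t);
  }
  return { kind: "merged", cases: merged };
}

const baseCases: SwitchCase[] = [{ pattern: '"GET"', body: "return read();" }];

const clashResult = mergeAddedCases(
  baseCases,
  [{ pattern: '"PUT"', body: "return replace();" }],
  [{ pattern: '"PUT"', body: "return upsert();" }]
);
// clashResult.kind is "conflict"

const cleanResult = mergeAddedCases(
  baseCases,
  [{ pattern: '"PUT"', body: "return replace();" }],
  [{ pattern: '"DELETE"', body: "return remove();" }]
);
// cleanResult.kind is "merged", with three cases
```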
| Situation | Git merge result | Aura merge result |
|---|---|---|
| Rename in branch A, null-check added in branch B, same function | Conflict | Clean merge, rename propagates |
| Both branches add distinct cases to the same switch | Silent merge (may compile wrong) | Structural review; flag if patterns overlap |
| Branch A reformats, Branch B edits | Huge conflict | Clean merge; formatting is not a change |
| Both branches rename the same function differently | Conflict | Conflict, surfaced as identity collision |
| Branch A extracts helper, Branch B edits original body | Conflict | Structural merge; changes applied where semantically appropriate |
| Branch A moves function to a new file, Branch B edits it | Conflict or lost history | Clean merge; function keeps its identity |
The algorithm, abstractly
The shape of Aura's merge algorithm, kept at the level of behavior rather than implementation:
1. Resolve identities. For every node in B, O, and T, compute a stable identity. For functions this is the content hash combined with signature and location heuristics. For expressions and statements, identity derives from structural position and content.
2. Compute node-level diffs. B→O and B→T as sequences of typed operations (add, delete, modify, rename, move, reorder). This is the same semantic diff used everywhere in Aura.
3. Match operations. For each node that appears in both diffs, compare the operations. If they are the same, apply once. If they commute (changes to disjoint sub-nodes), apply both. If they conflict, mark for human resolution.
4. Apply propagating effects. A rename on one side rewrites references on the other side's additions. A move updates import statements on the other side's edits.
5. Serialize. The merged AST is pretty-printed back to source. Formatting follows the target branch's style; Aura does not invent formatting.
Step 4 (applying propagating effects) is the part Git fundamentally cannot do. Rewriting references requires knowing what the identifiers mean, and a line-based tool has no notion of a binding.
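The "match operations" step can be sketched as a pairwise classification. The operation shapes below are a simplified assumption (only modify, rename, and delete, with sub-nodes named by a string), and `Op`, `Verdict`, `sameOp`, and `classify` are hypothetical names rather than Aura's types.

```typescript
type Op =
  | { kind: "modify"; nodeId: string; child: string; newText: string }
  | { kind: "rename"; nodeId: string; newName: string }
  | { kind: "delete"; nodeId: string };

type Verdict = "apply-once" | "apply-both" | "conflict";

function sameOp(a: Op, b: Op): boolean {
  if (a.kind === "modify" && b.kind === "modify") {
    return a.nodeId === b.nodeId && a.child === b.child && a.newText === b.newText;
  }
  if (a.kind === "rename" && b.kind === "rename") {
    return a.nodeId === b.nodeId && a.newName === b.newName;
  }
  if (a.kind === "delete" && b.kind === "delete") return a.nodeId === b.nodeId;
  return false;
}

function classify(a: Op, b: Op): Verdict {
  if (sameOp(a, b)) return "apply-once";                       // both sides made the same change
  if (a.nodeId !== b.nodeId) return "apply-both";              // different nodes never collide here
  if (a.kind === "delete" || b.kind === "delete") return "conflict"; // delete vs. any other edit
  if (a.kind === "modify" && b.kind === "modify") {
    return a.child === b.child ? "conflict" : "apply-both";    // disjoint sub-nodes commute
  }
  if (a.kind === "rename" && b.kind === "rename") return "conflict"; // two different new names
  return "apply-both";                                         // rename on one side, modify on the other
}

const signatureEdit: Op = { kind: "modify", nodeId: "authenticate", child: "signature", newText: "(username: string, pass: string)" };
const bodyEdit: Op = { kind: "modify", nodeId: "authenticate", child: "body", newText: "/* null check added */" };
const renameToLogin: Op = { kind: "rename", nodeId: "authenticate", newName: "login" };
const renameToSignIn: Op = { kind: "rename", nodeId: "authenticate", newName: "signIn" };

const disjointVerdict = classify(signatureEdit, bodyEdit);     // "apply-both"
const renameVerdict = classify(renameToLogin, renameToSignIn); // "conflict"
const repeatVerdict = classify(bodyEdit, bodyEdit);            // "apply-once"
```

Propagation (step 4) is deliberately left out of this sketch; it only decides whether two operations can coexist.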
Conflict classes
Aura surfaces conflicts at a finer granularity than Git. A conflict is attached to a specific AST node with a specific kind:
- Same-node modify/modify: both sides edited the body of the same function in incompatible ways. Surfaced as a function-level conflict with both versions shown.
- Delete/modify: one side removed a function, the other edited it. Aura does not silently drop the edit — it asks.
- Rename/rename: both sides renamed the same function to different names. Hard conflict.
- Identity collision: two sides both added a new function with the same name but different bodies. Surfaced so you can choose or combine.
- Signature drift: both sides changed a function's signature differently. Surfaced with callers listed, because signature changes are load-bearing.
Each class has a structured representation, not just a text marker. Tools (and humans) can inspect them programmatically.
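As an illustration of what a structured representation might look like, the classes above map naturally onto a tagged union. The field names here are assumptions made for the example, not Aura's actual schema; `Conflict` and `summarize` are hypothetical.

```typescript
type Conflict =
  | { kind: "modify/modify"; nodeId: string; ours: string; theirs: string }
  | { kind: "delete/modify"; nodeId: string; deletedBy: "ours" | "theirs"; edited: string }
  | { kind: "rename/rename"; nodeId: string; oursName: string; theirsName: string }
  | { kind: "identity-collision"; functionName: string; ours: string; theirs: string }
  | { kind: "signature-drift"; nodeId: string; oursSignature: string; theirsSignature: string; callers: string[] };

// A tool can branch on the kind instead of parsing conflict markers.
function summarize(c: Conflict): string {
  switch (c.kind) {
    case "modify/modify":
      return c.nodeId + ": both sides edited the body";
    case "delete/modify":
      return c.nodeId + ": deleted by " + c.deletedBy + ", edited by the other side";
    case "rename/rename":
      return c.nodeId + ": renamed to " + c.oursName + " and " + c.theirsName;
    case "identity-collision":
      return c.functionName + ": added on both sides with different bodies";
    case "signature-drift":
      return c.nodeId + ": signature changed on both sides; " + c.callers.length + " caller(s) affected";
  }
}

const driftExample: Conflict = {
  kind: "signature-drift",
  nodeId: "authenticate",
  oursSignature: "(username: string, pass: string)",
  theirsSignature: "(user: string, pass: string, otp: string)",
  callers: ["login", "refreshSession"],
};
const driftSummary = summarize(driftExample);
// driftSummary is "authenticate: signature changed on both sides; 2 caller(s) affected"
```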
Merging in the presence of refactors
The hardest merges in text-land are the ones that follow a refactor. One branch extracts a helper; the other edits the original body. Git typically gives up.
Aura sees:
- On branch A: function process was split into process (delegating) and process_inner (extracted body).
- On branch B: function process had its body edited.
The merge question becomes: "apply B's edit to which node?" Aura answers by following the extracted body. The edit lands in process_inner — where the affected code now lives — not on a stale line number in process. The result compiles. The result does what both developers intended.
This is not magic. It is what the AST literally shows: B's edit targeted specific statements, and those statements are now inside process_inner. Text merge cannot follow them; AST merge can.
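A rough sketch of "following the extracted body", under the assumption that each statement carries a stable identity that survives being moved between functions. `Stmt`, `Fn`, and `applyEdit` are hypothetical names, and statement text stands in for real sub-trees.

```typescript
interface Stmt {
  id: string;   // stable identity, assumed to survive a move between functions
  text: string;
}

interface Fn {
  name: string;
  body: Stmt[];
}

// Apply an edit to wherever the statement lives now.
// Returns the name of the function that received the edit, or null if the
// statement no longer exists (which would be a delete/modify conflict).
function applyEdit(fns: Fn[], stmtId: string, newText: string): string | null {
  for (const fn of fns) {
    for (const stmt of fn.body) {
      if (stmt.id === stmtId) {
        stmt.text = newText;
        return fn.name;
      }
    }
  }
  return null;
}

// Tree after branch A's extraction: the original statements moved to process_inner.
const afterExtraction: Fn[] = [
  { name: "process", body: [{ id: "s0", text: "return process_inner(input);" }] },
  {
    name: "process_inner",
    body: [
      { id: "s1", text: "const parsed = parse(input);" },
      { id: "s2", text: "return save(parsed);" },
    ],
  },
];

// Branch B edited statement s2 while it still lived in process.
const landedIn = applyEdit(afterExtraction, "s2", "return save(validate(parsed));");
// landedIn is "process_inner"
```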
When AST merge defers
Aura is not trying to solve every conflict automatically. When two changes genuinely contradict each other — same function, same node, incompatible rewrites — Aura presents the conflict clearly and refuses to guess. The goal is not "zero conflicts." The goal is "conflicts only when humans genuinely disagree."
A useful heuristic: a conflict in Aura almost always corresponds to a real design question. A conflict in Git often corresponds to a formatting coincidence.
What this means for reviewers
A reviewer looking at an Aura merge sees a list of node-level operations, not a tangle of lines. They see: "function authenticate had parameter renamed on one side and a null check added on the other; merged." They do not see: "47 lines of conflict markers to resolve."
The review is faster. The review is also more honest — the operations describe what happened, not what the text coincidentally looks like afterward.
Merge and the shadow branch
Aura's merges run on the shadow branch in parallel with the git merge. Your git history still records a merge commit in the ordinary way; Aura's shadow history records the AST-level operations. The two are kept in sync so that Aura can reason about history semantically without requiring you to abandon git tooling.
This matters in practice: git merge, git rebase, git cherry-pick still work on the text side. Aura's shadow merges track the same events at the semantic level. You get both views of the same history.
A short example, end to end
Imagine a small utility module. Alice is on branch feat/logging, Bob is on branch feat/retry. They both branch from main.
Alice renames fetch to fetch_with_log and wraps every call in a log statement. Bob adds a retry loop around the same fetch call.
Git produces a nasty conflict on the function body. Neither Alice nor Bob can resolve it without reading both branches carefully.
Aura sees:
- Alice: rename fetch → fetch_with_log; wrap call site in log().
- Bob: insert retry wrapper around call site.
The rename applies cleanly. The wrapping and retry are structural transformations on the same call expression — Aura composes them:
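// Bob's retry loop wraps the call; Alice's log line and rename apply inside it.
for attempt in 0..3 {
    log("attempting fetch");
    match fetch_with_log() {
        Ok(r) => return Ok(r),
        // Retry on the first two failures; the error value is not needed here.
        Err(_) if attempt < 2 => continue,
        Err(e) => return Err(e),
    }
}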
Both intents preserved. No conflict markers. The reviewer sees the structural description and approves.
The limits
AST merge is only as good as the parser. Syntax errors on either side defeat the AST parse; Aura falls back to a text-based merge with a clear warning. Languages without tree-sitter grammars degrade gracefully to text-based merging. The guarantee is: where Aura merges semantically, the result is structurally consistent; where it cannot, it tells you.
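The fallback rule can be sketched as a strategy choice made before any merging happens. Everything here is illustrative: `ParseFn`, `Strategy`, and `chooseStrategy` are hypothetical, and `bracesBalance` is a toy stand-in for a real parser that only checks brace nesting.

```typescript
type ParseFn = (source: string) => boolean; // true when the source parses

type Strategy =
  | { mode: "ast" }
  | { mode: "text"; warning: string };

// Use the AST merge only when all three inputs parse; otherwise say why not.
function chooseStrategy(
  parses: ParseFn,
  sides: { base: string; ours: string; theirs: string }
): Strategy {
  const failed: string[] = [];
  if (!parses(sides.base)) failed.push("base");
  if (!parses(sides.ours)) failed.push("ours");
  if (!parses(sides.theirs)) failed.push("theirs");
  if (failed.length === 0) return { mode: "ast" };
  return {
    mode: "text",
    warning: "falling back to text merge: could not parse " + failed.join(", "),
  };
}

// Toy stand-in for a parser: accepts any source whose braces balance.
function bracesBalance(source: string): boolean {
  let depth = 0;
  for (let i = 0; i < source.length; i++) {
    const ch = source.charAt(i);
    if (ch === "{") depth++;
    if (ch === "}") depth--;
    if (depth < 0) return false;
  }
  return depth === 0;
}

const fallbackChoice = chooseStrategy(bracesBalance, {
  base: "function f() { return 1; }",
  ours: "function f() { return 2;", // missing closing brace
  theirs: "function f() { return 3; }",
});
// fallbackChoice.mode is "text", and the warning names "ours"
```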
Related
AST merge depends on stable function-level identity to know which nodes in two branches are "the same" node. The operations it composes are the same ones produced by semantic diff. Merges are recorded alongside git history on Aura's shadow branches.