# Limitations and Edge Cases

*When Aura falls back, where it is honest, and how to tune it for pathological inputs.*

## Overview

No merge engine is perfect. A tool that claims perfection is a tool that lies to you. Aura is engineered so that its failure modes are **visible, localized, and recoverable**: no silent corruption, no whole-repo panics, no mysteriously wrong output.

This page documents the places where Aura falls short of AST-level merging, the heuristics it uses to decide when to degrade, and the knobs available to tune behavior for your codebase.

Read this before running Aura on a repository for the first time. Knowing the limits is knowing the tool.

> A tool is trustworthy when its failures are predictable. Predictability beats cleverness at scale.

## How It Works

### The fallback hierarchy

Aura's mergers are arranged in a hierarchy. When a higher-precision merger cannot proceed, it degrades to the next one down, always emitting a diagnostic:

```text
1. AST merge          (tree-sitter + adapter)
2. Structural merge   (JSON, YAML, .env, Markdown block-level)
3. Line merge         (standard three-way text merge)
4. Defer              (write merge markers, let the user resolve externally)
```

At every level, Aura preserves the three-way model (ancestor, ours, theirs), so resolutions downstream of a fallback still have all three versions available.

### When AST merge falls back to text

Aura picks text merge over AST merge in the following cases:

1. **No grammar for the file's language.** Aura supports a large language set (see [cross-language AST](/cross-language-ast)), but proprietary DSLs and obscure languages fall through.
2. **Parse failure on a modified subtree.** If at least one side edited a region tree-sitter cannot parse, text merge is safer than guessing.
3. **File too large.** Above the configured cap (default 32MB for source, 128MB for structured data), full parsing is disabled.
4. **Binary file.** Detected by the null-byte heuristic and extension checks.
Aura treats binaries as opaque; see below.
5. **Minified / generated file.** Heuristics on line length, density, and file-name patterns (`*.min.js`, `dist/**`, `build/**`).
6. **Explicit opt-out.** Paths listed in `.aura/merge.json`'s `text_only` array.

Each fallback is recorded in the merge summary:

```bash
$ aura merge feature/payments --summary
merged: 47 files
  AST: 41
  structural: 4 (json, yaml)
  text: 2 (src/bundle.min.js, CHANGELOG.md)
  deferred: 0
diagnostics:
  src/bundle.min.js: minified heuristic triggered (avg line length 2,144)
  CHANGELOG.md: grammar healthy, but `text_only` rule in .aura/merge.json
```

### Partial parses

Tree-sitter's error tolerance means Aura often gets a usable tree even when the input has syntax errors. Aura exploits this: it identifies **clean subtrees** (regions with no `ERROR` or `MISSING` nodes) and performs AST merge on those, falling back to text merge only on the affected regions.

Concretely, if a file has 20 top-level declarations and one of them has a syntax error, Aura does AST merge on the other 19 and text merge on the broken one. You get the benefit of semantic merge for most of the file.

### Binary files

Binaries (images, compiled artifacts, sqlite databases, anything detected by the null-byte heuristic) cannot be meaningfully three-way merged. Aura's policy:

1. If ours and theirs are byte-identical, keep either.
2. If one side is unchanged from ancestor, take the other.
3. Otherwise, raise a `type/type` conflict and record both alternatives in the merge session.

Resolution verbs for binaries are `keep-ours`, `take-theirs`, or `keep-both` (writes both with disambiguating suffixes).

LFS-tracked binaries follow the same rules; Aura respects the LFS pointer format and only materializes blobs on explicit reveal.

### Huge files

Large files are a mixed bag:

- **Source code over 32MB.** Almost always generated. Text merge.
- **JSON over 128MB.** Almost always a data dump. Text merge with a warning suggesting Git LFS.
- **Lockfiles.** Can exceed 10MB; Aura's JSON merger has a fast path for these (`prefer-theirs` or `prefer-ours` per `.aura/merge.json`). Three-way merging lockfiles key by key is technically possible but rarely what you want; regenerating the lockfile is usually the right move.
- **CSV / TSV / data files.** Aura does not ship a CSV merger by default. Text merge.

Caps are configurable:

```json
{
  "limits": {
    "ast_merge_max_bytes": 33554432,
    "structural_merge_max_bytes": 134217728,
    "parse_timeout_ms": 5000,
    "node_count_cap": 250000
  }
}
```

When a cap trips, Aura emits a diagnostic naming the specific limit so you can tune intentionally.

### Tree imbalance

Extremely deep or wide trees stress the diff algorithm. Aura caps recursion at 10k levels and sibling-width at 100k nodes per parent; beyond that, the affected subtree degrades to text merge. In practice, the only files that hit these caps are adversarial inputs or the output of tools that emit pathological formatting.

### Rename detection

Aura's cross-branch rename detection is fuzzy: it uses AST similarity to match a deleted declaration on one side to an inserted declaration on the other. The threshold (default 0.75 Jaccard similarity on subtrees) is configurable.

A pair that scores below the threshold is treated as an unrelated delete + insert, so a threshold set too high can surface extra conflicts for real renames. A pair that scores above it is treated as a rename, so a threshold set too low might fuse two genuinely different functions. The default is tuned for realistic codebases; teams with unusual styles can adjust:

```json
{
  "rename_detection": { "threshold": 0.7 }
}
```

### Formatter disagreements

Aura's pretty-printer observes the file's existing style: indent, quote preference, trailing commas, line width. When one side has run a new formatter (`prettier` major version bump, `rustfmt` with a new config) and the other has not, Aura can interpret the formatting-only hunks as real changes.

The mitigation: run the formatter on ancestor before merging.
`aura merge --normalize-ancestor` does this automatically using the project's detected formatter.

### Timestamps and non-deterministic builds

Some generated files embed timestamps or non-deterministic values (build IDs, generated comments with dates). Every commit changes these, which defeats merge. Aura recognizes common patterns (ISO-8601 timestamps in header comments, UUIDs in `// Generated by ...` lines) and strips them from the compare while preserving them in the output.

Unusual generators need a `.aura/merge.json` exclusion:

```json
{
  "ignore_patterns": {
    "src/generated/*.ts": ["^// Generated at .+$"]
  }
}
```

## Examples

### A: A partial parse with recoverable subtree

Ancestor parses clean. Ours edits `login`. Theirs has a typo in `signup`:

```typescript
export function signup(
  email: string,
  password: string
): Promise
```

[…] 100k files, initial session start-up (scanning for tracked files, computing per-file health) can take several seconds. Subsequent merges are cached. A background service mode (`aura daemon`) keeps caches warm for CI workers.

**Concurrent merges on one working tree.** Unsupported. Aura locks `.aura/merge/` for the duration of a session and refuses to start a second concurrent session on the same tree. Running two merges at once is almost always user error.

**Non-UTF8 source.** Tree-sitter requires UTF-8. Files in Shift-JIS, GB18030, or other encodings are transcoded on read; output is always written UTF-8. Repos that need non-UTF8 round-trip must exclude those files from AST merge.

## See Also

- [How AST merge works](/how-ast-merge-works)
- [Tree-sitter integration](/tree-sitter-integration)
- [Cross-language AST](/cross-language-ast)
- [JSON deep merge](/json-deep-merge)
- [YAML merge](/yaml-merge)
- [.env merge](/env-merge)
- [Merge strategies](/merge-strategies)
- [Conflict resolution](/conflict-resolution)
- [Interactive conflict picker](/interactive-conflict-picker)
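
For readers who want the three-rule binary policy from the "Binary files" section in executable form, here is a minimal sketch. It is an illustration under stated assumptions, not Aura's implementation: the names `BinaryOutcome`, `bytesEqual`, and `mergeBinary` are hypothetical, and the conflict is modeled as a plain object rather than a real merge-session record.

```typescript
// Hypothetical types and function names; Aura's internals are not documented on this page.
type BinaryOutcome =
  | { kind: "resolved"; content: Uint8Array }
  | { kind: "conflict"; ours: Uint8Array; theirs: Uint8Array };

// Byte-for-byte comparison of two buffers.
function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Three-way policy for opaque binaries, following the three rules in "Binary files".
function mergeBinary(
  ancestor: Uint8Array,
  ours: Uint8Array,
  theirs: Uint8Array
): BinaryOutcome {
  // Rule 1: ours and theirs are byte-identical, so either can be kept.
  if (bytesEqual(ours, theirs)) return { kind: "resolved", content: ours };
  // Rule 2: one side is unchanged from ancestor, so take the other side.
  if (bytesEqual(ours, ancestor)) return { kind: "resolved", content: theirs };
  if (bytesEqual(theirs, ancestor)) return { kind: "resolved", content: ours };
  // Rule 3: both sides changed differently; surface both alternatives.
  return { kind: "conflict", ours, theirs };
}
```

The sketch never inspects file contents beyond equality, which matches the page's description of binaries as opaque. Choosing between `keep-ours`, `take-theirs`, and `keep-both` would happen after a `conflict` outcome and is left out here.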