# Cross-Assistant Alignment (Diagnostic)
`wonton.py compare-cross-assistant` aligns theorem outputs across two runs even when theorem names do not overlap.
This report is diagnostic-only. The paired benchmark gate is documented in `cross-assistant-paired-benchmark.md`.
The current implementation combines a kind-aware `ProofGraphIR` with lexical signals and theorem-level proof aggregation.
Graph source is configurable per side:

- `wild_type_graph` (default)
- `search_trace_graph` (derived from `*_search_trace_graph.json`)
- `proof_term_graph` (derived from `wild_type_proof_term.json(.gz)`)

Theorem-level aggregation:

- `single` (wild-type entries only; provider pools may still contribute multiple proofs per theorem)
- `best_of` (minimum proof-pair distance)
- `consensus` (median nearest-neighbor distance across pooled proofs)
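The three aggregation modes can be sketched over a pool of pairwise proof distances. This is a minimal illustration only: the function name and the matrix layout are assumptions, not the tool's API.

```python
from statistics import median

def aggregate_theorem_distance(pairwise, mode="single"):
    """Collapse a proof-pool distance matrix into one theorem-level distance.

    `pairwise[i][j]` is the distance between proof i of the run-A theorem
    and proof j of the run-B theorem. Hypothetical sketch, not the tool's
    actual implementation.
    """
    if mode == "single":
        # wild-type entry only: compare the canonical proof pair
        return pairwise[0][0]
    if mode == "best_of":
        # minimum proof-pair distance across the pooled proofs
        return min(d for row in pairwise for d in row)
    if mode == "consensus":
        # median nearest-neighbor distance: closest B-proof per A-proof,
        # then the median over A-proofs
        return median(min(row) for row in pairwise)
    raise ValueError(f"unknown aggregation mode: {mode}")
```

`best_of` rewards a single lucky close pair, while `consensus` asks that the pools agree overall, which is why the two can rank theorem pairs differently.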
## What It Uses
Per theorem, the aligner builds a compact signature from graph artifacts plus text metadata:
- graph kind (`search_trace`, `term_dag`, `unknown`)
- run-relative rank features (node/edge/depth/leaf/branching)
- edge-role profile (tactic-family roles for search traces, structural roles for term DAGs)
- motif profile (assistant-agnostic tactic/constructor action motifs)
- WL shape hash (`hash_shape`), used only for same-kind comparisons
- theorem-name tokens (including Lean item-id decoding, e.g. `x5f` -> `_`)
- statement/connective profile when statement text is available (Lean corpus items and Coq extraction via a `coqtop` check fallback)
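The `x5f -> _` decoding can be illustrated with a small helper. This is a naive sketch of hex-escape decoding, not the aligner's exact routine; the real tokenizer may scope the escapes differently.

```python
import re

def decode_item_id(name: str) -> str:
    """Decode hex escapes of the form x<two hex digits> back to
    characters, e.g. 'x5f' -> '_'. Illustrative only: this greedy
    pattern would also fire on ordinary names containing x<hex><hex>.
    """
    return re.sub(
        r"x([0-9a-fA-F]{2})",
        lambda m: chr(int(m.group(1), 16)),
        name,
    )
```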
Distance is kind-aware:
- same-kind pairs use rank features + edge-role profile + shape hash
- cross-kind pairs suppress incompatible role/shape channels and rely on rank + lexical/connective channels
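The channel gating described above can be sketched as follows. The signature keys, the sub-distances, and the unit channel weights are all illustrative assumptions, not the implementation's actual schema.

```python
def kind_aware_distance(sig_a, sig_b):
    """Sketch of kind-aware channel gating. `sig_a`/`sig_b` are dicts
    with hypothetical keys: 'kind', 'rank' (feature vector), 'roles'
    (profile vector), 'shape_hash', 'lexical' (token set), and
    'connective' (vector)."""
    same_kind = sig_a["kind"] == sig_b["kind"]

    def l1(u, v):
        # mean absolute difference over aligned feature vectors
        return sum(abs(x - y) for x, y in zip(u, v)) / max(len(u), 1)

    def jaccard_dist(s, t):
        union = s | t
        return 1.0 - len(s & t) / len(union) if union else 0.0

    # channels that are always comparable across kinds
    d = l1(sig_a["rank"], sig_b["rank"])
    d += jaccard_dist(sig_a["lexical"], sig_b["lexical"])
    d += l1(sig_a["connective"], sig_b["connective"])
    if same_kind:
        # role profile and WL shape hash only make sense within a kind
        d += l1(sig_a["roles"], sig_b["roles"])
        d += 0.0 if sig_a["shape_hash"] == sig_b["shape_hash"] else 1.0
    return d
```

Note that suppressing the role/shape channels on cross-kind pairs means those pairs are scored on fewer channels, which is one reason lexical signals tend to dominate when `cross_kind_rate` is high.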
## Command
```bash
uv run python wonton.py compare-cross-assistant \
  --run-a <run_dir_a> \
  --run-b <run_dir_b> \
  --graph-source-a wild_type_graph \
  --graph-source-b wild_type_graph \
  --proof-aggregation single \
  --top-k 100 \
  --one-to-one \
  --output <report.json>
```

Useful options:

- `--solved-only`: only include solved wild-type theorems from each run
- `--top-k N`: candidate pool per theorem before assignment (use large values for one-to-one)
- `--one-to-one` / `--allow-many-to-one`
- `--graph-source-a`, `--graph-source-b`: choose `wild_type_graph`, `search_trace_graph`, or `proof_term_graph`
- `--proof-aggregation`: `single` | `best_of` | `consensus`
- `--max-proofs-per-theorem`: optional cap for pooled-proof modes
- `--name-obfuscation names`: obfuscate theorem/display/constant names in lexical channels
- `--lexical-ablation drop_tokens|graph_only`: remove lexical tokens only, or remove both lexical and connective channels to inspect graph-only behavior
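For sweeping configurations (for example, a baseline run next to a `graph_only` ablation), it can help to build the argv lists from a driver script. The helper below uses only the flags documented above; the helper itself and the example paths are hypothetical.

```python
def compare_cmd(run_a, run_b, *, graph_source="wild_type_graph",
                aggregation="single", ablation=None,
                output="report.json"):
    """Build the argv for one compare-cross-assistant invocation.
    Hypothetical convenience wrapper, not part of the tool."""
    cmd = [
        "uv", "run", "python", "wonton.py", "compare-cross-assistant",
        "--run-a", run_a, "--run-b", run_b,
        "--graph-source-a", graph_source,
        "--graph-source-b", graph_source,
        "--proof-aggregation", aggregation,
        "--top-k", "100", "--one-to-one",
        "--output", output,
    ]
    if ablation is not None:
        cmd += ["--lexical-ablation", ablation]
    return cmd

# e.g. pair a baseline run with a graph-only stress run
# (paths are placeholders):
baseline = compare_cmd("runs/lean", "runs/coq", output="baseline.json")
ablated = compare_cmd("runs/lean", "runs/coq",
                      ablation="graph_only", output="graph_only.json")
```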
## Output Contract
The JSON report contains:
- run metadata (`run_a`, `run_b`, theorem counts, proof counts)
- obfuscation metadata (`name_obfuscation`)
- lexical-ablation metadata (`lexical_ablation`)
- aggregation metadata (`proof_aggregation`, `max_proofs_per_theorem`)
- `matches` (selected theorem pairings with total distance, component distances, a reciprocal-top1 flag, a representative proof pair, and proof support)
- `candidates_by_theorem_a` (top-k candidate list for each theorem in run A)
- summary stats including:
  - `mean_distance`, `mean_graph_distance`, `mean_lexical_distance`, `mean_connective_distance`
  - `mean_lexical_overlap`
  - `reciprocal_top1_rate`, `shape_hash_equal_rate`
  - `cross_kind_rate`
  - `kind_pair_distribution`
  - `top1_unique_rate` (collapse diagnostic)
- run-level graph kind distributions (`run_a_graph_kind_distribution`, `run_b_graph_kind_distribution`)
- lexical/statement coverage for each run
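A downstream script can pull the collapse-relevant fields out of the report. This sketch assumes the summary stats live under a top-level `summary` key, which is a guess at the nesting rather than a documented guarantee.

```python
import json

# fields most useful for spotting nearest-neighbor collapse
COLLAPSE_KEYS = ("mean_distance", "reciprocal_top1_rate",
                 "cross_kind_rate", "top1_unique_rate")

def collapse_diagnostics(report: dict) -> dict:
    """Extract collapse-relevant summary fields. Falls back to the
    top level if there is no `summary` key (nesting is assumed)."""
    stats = report.get("summary", report)
    return {k: stats[k] for k in COLLAPSE_KEYS}

def load_report(path: str) -> dict:
    with open(path) as f:
        return json.load(f)
```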
## Interpretation Notes
- Low distance means structural similarity under this signature family, not semantic equivalence.
- If many theorems in one run map to the same target theorem in the other run, the signature is likely too coarse for semantic translation claims.
- `top1_unique_rate` near zero indicates severe nearest-neighbor collapse.
- `shape_hash_equal_rate` is a strong signal when non-zero; zero across runs is expected when backends induce very different proof-object constructions.
- `cross_kind_rate = 1.0` means every selected pair is cross-kind (for example a Lean search trace vs a Coq proof-term DAG). In this regime, lexical/statement channels dominate and shape equality is expected to be zero.
- For replayed Coq extraction runs, `search_trace_graph` is the closest proxy to Lean proof process/search behavior.
- `proof_term_graph` remains the better Coq-side term-structure diagnostic when you want constructor-level information rather than replay/process structure.
- `--lexical-ablation graph_only` is the quickest local stress test for whether the graph/action IR is carrying real alignment signal or just riding on theorem text.
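The collapse diagnostic itself is easy to recompute from the selected matches. This sketch assumes each match records its run-B theorem under a `theorem_b` key, which is a guess at the report schema.

```python
def top1_unique_rate(matches):
    """Fraction of distinct run-B targets among the selected pairings.
    A value near zero means many run-A theorems collapsed onto the
    same run-B theorem."""
    if not matches:
        return 0.0
    targets = [m["theorem_b"] for m in matches]
    return len(set(targets)) / len(targets)
```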