Analysis Metrics Reference

Canonical semantics for metric families used in wonton-soup.

Structural Metrics

ged_search_graph: graph edit distance (GED) with hard matching on goal signature and tactic family (Lean search graphs)
ged_search_graph_soft: GED with a soft node-substitution cost given by normalized tree edit distance (TED) over goal ASTs
ged_proof_graph: proof-object graph distance (external backends)
ged_trace_graph: trace-derived proxy graph distance

Family-specific validity metadata:

Normalized ordered tree edit distance over goal ASTs.
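
As an illustration of the metric described above, here is a minimal sketch of an ordered tree edit distance with unit costs, using the standard rightmost-root forest recursion. The tree encoding (nested `(label, children)` tuples), the function names, and the normalization by the combined node count are assumptions made for this example; the normalization actually used in wonton-soup is not specified here.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), where children is a tuple of
# trees. Nested tuples are hashable, so intermediate results can be memoized.

def forest_size(forest):
    """Count the nodes in an ordered forest (a tuple of trees)."""
    return sum(1 + forest_size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Unit-cost edit distance between two ordered forests."""
    if not f1:
        return forest_size(f2)  # insert everything in f2
    if not f2:
        return forest_size(f1)  # delete everything in f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    # Delete the rightmost root of f1; its children take its place.
    delete = forest_distance(f1[:-1] + kids1, f2) + 1
    # Insert the rightmost root of f2; its children take its place.
    insert = forest_distance(f1, f2[:-1] + kids2) + 1
    # Map the two rightmost roots onto each other (relabel if they differ).
    match = (
        forest_distance(kids1, kids2)
        + forest_distance(f1[:-1], f2[:-1])
        + (label1 != label2)
    )
    return min(delete, insert, match)

def normalized_ted(t1, t2):
    """Distance divided by the combined node count (an assumed normalization),
    which keeps the result in [0, 1]."""
    total = forest_size((t1,)) + forest_size((t2,))
    return forest_distance((t1,), (t2,)) / total

def leaf(label):
    return (label, ())

# Two hypothetical goal ASTs that differ in a single leaf.
goal_a = ("eq", (("add", (leaf("x"), leaf("y"))), leaf("z")))
goal_b = ("eq", (("add", (leaf("x"), leaf("w"))), leaf("z")))
print(normalized_ted(goal_a, goal_b))
```

The memoized recursion is easy to read but not the fastest option; a production implementation would more likely use a Zhang-Shasha style dynamic program.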

Used by:

Fast labeled-graph structural hash used in basin-style convergence summaries.
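
The hashing scheme is not specified above. One common way to build a fast structural hash for labeled graphs is Weisfeiler-Lehman style label refinement, sketched below. The function name `structural_hash`, the input layout, and the number of refinement rounds are assumptions for illustration, not the project's actual implementation.

```python
import hashlib

def structural_hash(nodes, edges, rounds=3):
    """Hash a directed labeled graph independently of node IDs.

    nodes: dict mapping node ID -> node label
    edges: iterable of (source ID, target ID, edge label)
    """
    labels = {node: str(label) for node, label in nodes.items()}
    outgoing = {node: [] for node in nodes}
    for src, dst, edge_label in edges:
        outgoing[src].append((str(edge_label), dst))

    for _ in range(rounds):
        refined = {}
        for node in nodes:
            # Combine the node's label with the sorted labels of its
            # out-neighbors, tagged by the connecting edge label.
            neighborhood = sorted(
                f"{edge_label}>{labels[dst]}" for edge_label, dst in outgoing[node]
            )
            payload = labels[node] + "|" + ",".join(neighborhood)
            refined[node] = hashlib.sha256(payload.encode()).hexdigest()[:16]
        labels = refined

    # The graph hash is a digest over the sorted multiset of node labels.
    summary = ",".join(sorted(labels.values()))
    return hashlib.sha256(summary.encode()).hexdigest()

# Same structure under different node IDs, plus a variant with one tactic changed.
g1 = ({1: "goal", 2: "goal", 3: "done"}, [(1, 2, "simp"), (2, 3, "rfl")])
g2 = ({"a": "goal", "b": "goal", "c": "done"}, [("a", "b", "simp"), ("b", "c", "rfl")])
g3 = ({1: "goal", 2: "goal", 3: "done"}, [(1, 2, "simp"), (2, 3, "omega")])
print(structural_hash(*g1) == structural_hash(*g2))
print(structural_hash(*g1) == structural_hash(*g3))
```

Equal hashes suggest, but do not prove, that two graphs are isomorphic, which is usually acceptable for grouping runs into basins.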

Depth/backtrack/uniqueness summaries from exploration history.

Attempt/success/failure summaries for search effort off the final solved path.

Divergence/reconvergence timing against wild-type solution-path signatures.
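
One possible reading of divergence and reconvergence timing is sketched below. It assumes divergence is the first step at which the intervention path's signature differs from the wild-type path, and reconvergence is the first later intervention step whose signature appears on the remaining wild-type path. Both definitions and the output field names are assumptions for illustration.

```python
def divergence_reconvergence(wild, intervention):
    """Return the assumed divergence and reconvergence step indices.

    wild, intervention: lists of goal signatures along each solved path.
    """
    divergence = None
    for step, (w, v) in enumerate(zip(wild, intervention)):
        if w != v:
            divergence = step
            break
    if divergence is None:
        if len(wild) == len(intervention):
            # Paths are identical: nothing to report.
            return {"divergence_step": None, "reconvergence_step": None}
        # One path is a strict prefix of the other.
        divergence = min(len(wild), len(intervention))

    remaining_wild = set(wild[divergence:])
    reconvergence = None
    # Only steps strictly after the divergence point count as reconvergence.
    for step in range(divergence + 1, len(intervention)):
        if intervention[step] in remaining_wild:
            reconvergence = step
            break
    return {"divergence_step": divergence, "reconvergence_step": reconvergence}

wild_path = ["g0", "g1", "g2", "g3"]
intervention_path = ["g0", "x1", "x2", "g3"]
print(divergence_reconvergence(wild_path, intervention_path))
```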

Soft sequence distance between wild/intervention solved-path goal signatures.
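
A soft sequence distance of this kind can be sketched as an edit-distance dynamic program in which substituting one signature for another costs a value in [0, 1] rather than a flat 1. The function names, the unit gap cost, and the token-level Jaccard substitution cost below are assumptions for illustration; the project may use a different substitution cost, such as normalized TED.

```python
def soft_sequence_distance(a, b, sub_cost, gap_cost=1.0):
    """Edit distance between sequences a and b with a soft substitution cost."""
    previous = [j * gap_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        current = [i * gap_cost]
        for j, y in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + gap_cost,            # drop x from a
                    current[j - 1] + gap_cost,         # insert y from b
                    previous[j - 1] + sub_cost(x, y),  # align x with y
                )
            )
        previous = current
    return previous[-1]

def token_jaccard_distance(x, y):
    """Hypothetical substitution cost: 1 minus Jaccard similarity of tokens."""
    tokens_x, tokens_y = set(x.split()), set(y.split())
    if not tokens_x and not tokens_y:
        return 0.0
    return 1.0 - len(tokens_x & tokens_y) / len(tokens_x | tokens_y)

wild_signatures = ["a + b = c", "b = c", "True"]
intervention_signatures = ["a + b = c", "b = d", "True"]
print(soft_sequence_distance(wild_signatures, intervention_signatures, token_jaccard_distance))
```

With a hard 0/1 substitution cost this reduces to ordinary Levenshtein distance over signatures, so the soft variant mainly changes how near-miss goals are scored.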

Repeated seeded runs measuring stability of solved structure distribution.

Core outputs:

K = log10(tau_blind / tau_agent), where tau_blind is a blind-baseline estimate derived from traces.
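
As a worked example of the formula, the sketch below computes K for a few input pairs. The function name `k_efficiency` and the choice to return `None` for missing or non-positive inputs are assumptions for illustration; the values are made up and carry no units beyond whatever cost tau is measured in.

```python
import math

def k_efficiency(tau_blind, tau_agent):
    """K = log10(tau_blind / tau_agent), or None when K is undefined."""
    if tau_blind is None or tau_agent is None:
        return None
    if tau_blind <= 0 or tau_agent <= 0:
        return None  # the logarithm is undefined for these inputs
    return math.log10(tau_blind / tau_agent)

# A blind estimate 100x larger than the agent's cost gives K = 2.
print(k_efficiency(1000, 10))
# Equal costs give K = 0; a costlier agent gives a negative K.
print(k_efficiency(10, 10))
print(k_efficiency(5, 50))
```

Each unit of K corresponds to one order of magnitude in the ratio between the blind estimate and the agent's cost.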

Interpretation guardrail:

For paired blind baseline experiments, use basin mode with explicit blind runs (paper_k).

Aggregate outcome counts by goal surface and normalized tactic.
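
The aggregation can be sketched with a counter keyed on goal surface, normalized tactic, and outcome. The record fields (`goal_surface`, `tactic`, `outcome`) and the normalization rule (lowercased head token of the tactic string) are hypothetical; the actual record schema and normalization are not specified here.

```python
from collections import Counter

def normalize_tactic(tactic):
    """Assumed normalization: keep only the lowercased head token."""
    parts = tactic.strip().split()
    return parts[0].lower() if parts else ""

def aggregate_outcomes(attempts):
    """Count attempts per (goal surface, normalized tactic, outcome)."""
    counts = Counter()
    for attempt in attempts:
        key = (
            attempt["goal_surface"],
            normalize_tactic(attempt["tactic"]),
            attempt["outcome"],
        )
        counts[key] += 1
    return counts

# Hypothetical attempt records.
attempts = [
    {"goal_surface": "a + b = b + a", "tactic": "simp [add_comm]", "outcome": "success"},
    {"goal_surface": "a + b = b + a", "tactic": "simp", "outcome": "success"},
    {"goal_surface": "a + b = b + a", "tactic": "rfl", "outcome": "failure"},
]
for key, count in sorted(aggregate_outcomes(attempts).items()):
    print(key, count)
```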

Attempted/successful/solution-path tactic profile.

Novel and dropped goal-signature sets under intervention, with min-distance summaries.
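
The novel and dropped sets are plain set differences, and the min-distance summary can be read as the distance from each novel (or dropped) signature to its nearest counterpart on the other side. The sketch below follows that reading. The function names, output keys, and the character-level distance built on `difflib.SequenceMatcher` are assumptions for illustration; any signature distance could be passed in instead.

```python
import difflib

def char_distance(a, b):
    """Hypothetical signature distance: 1 minus difflib's similarity ratio."""
    return 1.0 - difflib.SequenceMatcher(None, a, b).ratio()

def goal_novelty(wild, intervention, distance=char_distance):
    """Summarize signatures gained and lost under an intervention."""
    wild, intervention = set(wild), set(intervention)
    novel = intervention - wild    # seen only under intervention
    dropped = wild - intervention  # seen only in the wild-type run

    def nearest(sources, targets):
        # Distance from each source to its closest target (None if no targets).
        return {
            s: min((distance(s, t) for t in targets), default=None)
            for s in sorted(sources)
        }

    return {
        "novel": sorted(novel),
        "dropped": sorted(dropped),
        "novel_min_distance": nearest(novel, wild),
        "dropped_min_distance": nearest(dropped, intervention),
    }

summary = goal_novelty({"a = b", "b = c"}, {"a = b", "b = d"})
print(summary["novel"], summary["dropped"])
```

A small minimum distance suggests a novel signature is a near variant of something the wild-type run already visited, while a large one points to a new region of goal space.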

Sheaf-inspired consistency/residual summaries over signature/tactic behavior.

Do not compare GED values across families without explicit normalization rationale.
ged_trace_graph is proxy-level evidence and must be labeled as such in reporting.
Compare metrics only within consistent goal-signature and goal-ID schemes.
Explicitly report missing or partial metrics via validity fields.

In-run metrics are generated by core orchestration; heavier metrics are filled in afterward by postprocess.

In-run: trajectory/detour, many base per-variant metrics
Postprocess: soft GED, goal novelty, solution-path soft distance, K-style efficiency