SPECTER Labs
Technical Docs concepts

Analysis Metrics Reference

Canonical semantics for metric families used in wonton-soup.

Structural Metrics

GED Families

  • ged_search_graph: hard matching on goal signature + tactic family (Lean search graphs)
  • ged_search_graph_soft: soft node substitution cost via normalized goal-AST TED
  • ged_proof_graph: proof-object graph distance (external backends)
  • ged_trace_graph: trace-derived proxy graph distance

Family-specific validity metadata:

  • valid
  • validity_notes
  • trace_source
  • trace_completeness

Goal-AST Ordered TED

Normalized ordered tree edit distance over goal ASTs.

Used by:

  • ged_search_graph_soft
  • goal_novelty
  • solution_path_soft_distance
  • root_goal_similarity

Structure Hash (WL)

Fast labeled-graph structural hash used in basin-style convergence summaries.

Search-Dynamics Metrics

Trajectory Metrics

Depth/backtrack/uniqueness summaries from exploration history.

Detour Metrics

Attempt/success/failure summaries for search effort off the final solved path.

Trajectory Comparison (Recovery)

Divergence/reconvergence timing against wild-type solution-path signatures.

Solution-Path Soft Distance

Soft sequence distance between wild/intervention solved-path goal signatures.

Basin and Efficiency Metrics

Basin Analysis (Multi-Seed)

Repeated seeded runs measuring stability of solved structure distribution.

Core outputs:

  • solve_rate
  • unique_structures
  • dominant_structure_frequency
  • structure_distribution

K-Style Search Efficiency

K = log10(tau_blind / tau_agent) with trace-derived blind baseline estimates.

Interpretation guardrail:

  • configuration-dependent efficiency score
  • not an intrinsic theorem property

For paired blind baseline experiments, use basin mode with explicit blind runs (paper_k).

Tactic/Goal Behavior Metrics

Goal-Type / Tactic Matrix

Aggregate outcome counts by goal surface and normalized tactic.

Tactic Fingerprint

Attempted/successful/solution-path tactic profile.

Goal Novelty

Novel and dropped goal-signature sets under intervention, with min-distance summaries.

Sheaf Metrics

Sheaf-inspired consistency/residual summaries over signature/tactic behavior.

Cross-Metric Comparability Rules

  • Do not compare GED values across families without explicit normalization rationale.
  • ged_trace_graph is proxy-level evidence and must be labeled as such in reporting.
  • Compare metrics only within consistent goal-signature and goal-ID schemes.
  • Explicitly report missing or partial metrics via validity fields.

Computation Ownership

In-run metrics are generated by core orchestration. Heavy metrics are filled by postprocess.

  • In-run: trajectory/detour, many base per-variant metrics
  • Postprocess: soft GED, goal novelty, solution-path soft distance, K-style efficiency

See Analysis Lexicon (Wonton-Soup) and Log File Schemas for ownership and schema details.