Follow-Up Run Program (March 2026)

This run program is the operational answer to the follow-up thesis constraints in site/blog/wonton-soup-follow-up/index.md:

Entry Point

Use:

cd dossiers/wonton-soup
scripts/followup_run_program.sh

The script defaults to dry-run and prints every command. Run with --execute to actually launch.

For deepseek provider on an external artifacts volume, set:

DEEPSEEK_ARTIFACT_ROOT=/path/to/specter-archive

This root must contain: wonton-soup/models/ntp-mathlib-deepseek-1.3b-mlx-bf16.

Program namespace: PROGRAM_ID (default: followup-2026-03)
All run ids are nested under PROGRAM_ID/....
Lake/postprocess/K commands are scoped to one logs root: resolve_logs_root()/<PROGRAM_ID>
Run-id date prefix is frozen at program start, so cohorts do not split at UTC date rollover.
Use one pinned Lean corpus ref (LEAN_CORPUS_REF) across phases. Prefer a pinned artifact ref such as lean:<corpus_id>@<build_id>#feasible.

scripts/followup_run_program.sh --phase p1 --execute

p1: shared Lean panel (centralized MCTS), provider parity (reprover/heuristic/deepseek)
p2: paired centralized vs distributed MCTS on matched theorem slice
p3: basin wide (--seeds 10 --blind) on the broad panel
p4: basin deep (--seeds 50 --blind) on a focused subset
p5: external backend pilots (e, vampire, z3, optional coq stdlib)
p6: postprocess + lake reconcile scoped to this cohort logs root
p7: locked-reference K jobs via lake job presets

Run all phases in order:

scripts/followup_run_program.sh --execute

Phase p7 executes:

Each preset:

p5 is opt-in (ENABLE_EXTERNAL_PILOTS=1) and uses:

Unset inputs are skipped explicitly (fail-loud behavior by phase output).