Distributed MCTS Semantics

Canonical semantics for distributed (cell-view) MCTS in wonton-soup.

Scope and Enablement

Distributed mode is active when:

Distributed mode has multiple logical controllers (agents) sharing one MCTSTree.

Shared primitives:

Control loop:

An agent reserves a candidate node/path under a tree lock.
The reservation increments the inflight counters on the reserved node/path.
Expansion uses one adapter (adapter_mode="single"), so Lean interaction is serialized.
Backprop updates tree statistics and decrements the inflight counters.

This is logical parallelism over one shared search state, not concurrent Lean execution.
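
The reserve and backprop steps above can be sketched as follows. This is a minimal illustration, not the wonton-soup implementation: `Node`, `SharedTree`, and their fields are hypothetical stand-ins for the shared MCTSTree, and the sketch assumes a reservation marks every node on the path from the reserved node up to the root.

```python
import threading
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical node record; field names are assumptions for this sketch.
    visit_count: int = 0
    success_count: int = 0
    inflight: int = 0
    parent: "Node | None" = None


class SharedTree:
    """Hypothetical stand-in for the single tree shared by all agents."""

    def __init__(self):
        self.lock = threading.Lock()

    def reserve(self, node):
        # Under the tree lock, mark the node and its ancestors as in flight.
        with self.lock:
            cur = node
            while cur is not None:
                cur.inflight += 1
                cur = cur.parent

    def backprop(self, node, success):
        # Under the tree lock, update statistics and release the reservation.
        with self.lock:
            cur = node
            while cur is not None:
                cur.visit_count += 1
                cur.success_count += int(success)
                cur.inflight -= 1
                cur = cur.parent
```

Between `reserve` and `backprop`, an agent would run expansion through the single adapter, which is why only the bookkeeping, and not Lean execution, overlaps across agents.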

Reservation uses a UCT-like score with optional distributed modifiers:

Effective parent visits:

parent_visit_eff = parent.visit_count + inflight(parent) * virtual_loss

Effective child stats:

visits_eff  = child.visit_count + inflight(child) * virtual_loss
success_eff = child.success_count - inflight(child) * virtual_loss

Optional additives:

Unvisited and non-inflight nodes keep first-touch priority (+inf).
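
As a rough illustration of how the effective statistics feed a score, the sketch below plugs them into the textbook UCT formula. The exact scoring formula, the exploration constant `c`, the function name, and the handling of the degenerate `visits_eff == 0` case are assumptions; only the effective-stat definitions and the `+inf` first-touch rule come from the text above. The optional additives are omitted.

```python
import math


def reservation_score(parent_visits, parent_inflight,
                      child_visits, child_successes, child_inflight,
                      virtual_loss=1.0, c=math.sqrt(2)):
    # First-touch priority: unvisited nodes with no reservation win outright.
    if child_visits == 0 and child_inflight == 0:
        return math.inf

    # Effective statistics, as defined in the text.
    parent_visit_eff = parent_visits + parent_inflight * virtual_loss
    visits_eff = child_visits + child_inflight * virtual_loss
    success_eff = child_successes - child_inflight * virtual_loss

    # Assumption: with virtual_loss == 0 an unvisited inflight child has
    # visits_eff == 0; treat it like an unvisited node to avoid dividing by 0.
    if visits_eff <= 0:
        return math.inf

    # Textbook UCT shape (assumed): exploitation plus exploration bonus.
    exploit = success_eff / visits_eff
    explore = c * math.sqrt(math.log(max(parent_visit_eff, 1.0)) / visits_eff)
    return exploit + explore
```

With a positive `virtual_loss`, a child that already has a reservation scores lower than the same child without one, which steers other agents toward different paths.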

Blocking (BlockSchedule):

Delay (DelaySchedule):

Reroute:

When blocked or delayed, selection can retry alternate nodes, up to max_attempts times.
The reroute trail is recorded in trace payloads.
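
A minimal sketch of the reroute behavior, under stated assumptions: `select_with_reroute` and `is_available` are hypothetical names, candidates are assumed to arrive ordered best-first, and `max_attempts` is read here as the number of retries allowed after the first pick (the text does not say whether the first pick counts).

```python
def select_with_reroute(candidates, is_available, max_attempts):
    """Return (chosen_node_or_None, trail_of_skipped_nodes).

    candidates: nodes ordered best-first (assumption).
    is_available: callable returning False for a blocked or delayed node.
    max_attempts: retries allowed after the first pick (assumption).
    """
    trail = []
    # One initial pick plus up to max_attempts alternates.
    for node in candidates[: max_attempts + 1]:
        if is_available(node):
            return node, trail
        # Record each skipped node so it can be attached to a trace payload.
        trail.append(node)
    return None, trail
```

For example, with candidates `["a", "b", "c", "d"]` where `"a"` and `"b"` are blocked, `max_attempts=2` selects `"c"` and returns the trail `["a", "b"]`, while `max_attempts=1` gives up after trying `"b"`.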

| Flag(s) | Requirement | Validation |
| --- | --- | --- |
| `--mcts-agents`, `--mcts-inflight` | required in distributed mode | both must be provided |
| distributed options in centralized mode | forbidden | parser error |
| `--mcts-block-fraction` | enables block schedule | must be in `(0,1)` |
| `--mcts-block-duration` with block fraction | required | must be non-zero |
| `--mcts-block-seed` with block fraction | required | must be provided |
| `--mcts-block-immovable-fraction` | optional with block schedule | must be in `[0,1]`; requires positive duration |
| `--mcts-unfreeze-after` | optional with block schedule | must be `>= 1` |
| `--mcts-unfreeze-prob` | optional with block schedule | must be in `(0,1]` |
| `--mcts-reroute-blocked` + `--mcts-reroute-max` | paired | max required when reroute enabled; max `>= 1` |
| delay triplet (`--mcts-delay-prob`, `--mcts-delay-duration`, `--mcts-delay-seed`) | all-or-none | prob in `(0,1)`, duration `>= 1`, seed `>= 0` |
| `--mcts-virtual-loss` | optional | `>= 0` |
| `--mcts-depth-bias`, `--mcts-path-bias` | optional | each `>= 0` |
| `--mcts-history-cache` | optional | boolean switch |
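
A few of the rules above can be expressed as a standalone check. This is a sketch only: the function name, the dict-based input (keyed by flag name without the leading dashes), and the error strings are hypothetical, and it covers only a subset of the table rather than the actual parser logic.

```python
def validate_distributed_options(opts):
    """Return a list of error strings; an empty list means the subset passes.

    opts: hypothetical dict keyed by flag name without leading dashes.
    A missing key or a None value means the flag was not given.
    """
    errors = []
    get = opts.get

    # --mcts-agents and --mcts-inflight are both required in distributed mode.
    if get("mcts-agents") is None or get("mcts-inflight") is None:
        errors.append("--mcts-agents and --mcts-inflight are both required")

    # Block schedule: fraction in (0,1), non-zero duration, seed provided.
    frac = get("mcts-block-fraction")
    if frac is not None:
        if not 0 < frac < 1:
            errors.append("--mcts-block-fraction must be in (0,1)")
        if not get("mcts-block-duration"):
            errors.append("--mcts-block-duration must be non-zero")
        if get("mcts-block-seed") is None:
            errors.append("--mcts-block-seed must be provided")

    # Delay triplet is all-or-none, then range-checked.
    delay = [get(k) for k in
             ("mcts-delay-prob", "mcts-delay-duration", "mcts-delay-seed")]
    given = [v is not None for v in delay]
    if any(given) and not all(given):
        errors.append("delay flags must be given together or not at all")
    elif all(given):
        prob, duration, seed = delay
        if not 0 < prob < 1:
            errors.append("--mcts-delay-prob must be in (0,1)")
        if duration < 1:
            errors.append("--mcts-delay-duration must be >= 1")
        if seed < 0:
            errors.append("--mcts-delay-seed must be >= 0")

    # Reroute requires a maximum of at least 1.
    if get("mcts-reroute-blocked"):
        rmax = get("mcts-reroute-max")
        if rmax is None or rmax < 1:
            errors.append("--mcts-reroute-max must be >= 1 when reroute is on")

    # Virtual loss is optional but must be non-negative.
    vl = get("mcts-virtual-loss")
    if vl is not None and vl < 0:
        errors.append("--mcts-virtual-loss must be >= 0")

    return errors
```

Collecting all errors instead of failing on the first one is a design choice of this sketch; a real parser may instead stop at the first violation.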

Cross-mode constraints:

Distributed mode uses the same provider stack as centralized mode:

Run metadata captures distributed settings (distributed_mcts) and policy choices.

Trace payloads capture distributed context:

These fields allow post-hoc reconstruction without relying on CLI history.