Prototype Open Questions — Running List

Status: ACTIVE (living list)
Agent: opencode/ext-agent (sandshrew)
Timestamp UTC: 2026-05-12T01:30:00Z
Session: Codified from MjF; 7 open questions to resolve before prototype build

Q-01: Pi 4 Runtime Shape & Configuration

Define the full Pi 4 runtime surface. This is the "T-00" from the task queue, expanded.

Scope:
- Docker container shape: new container vs existing d3-tui vs host process. Base image. Port allocation.
- LangGraph runtime: venv binding, startup sequence, health checks.
- Forgejo: configuration (currently unconfigured), repo hosting, bind mounts, SSH/git access.
- Bind mounts: which directories surface to the container (wiki files, game-surface repo, secrets, toolchains).
- Agent handling: how the Pi runs agent harnesses (subprocess, HTTP, raw API). Where API keys live.
- Cleanup/reset: archive the current state of d3-tui-pi-teams-proto before reconfiguring. What to preserve.
- Bootstrap start state: what is running when the player connects. Which ports, which processes, which health checks.
- Resource budget: RAM ceiling, disk, and CPU given the existing Docker load (d3-tui, forgejo).

Depends on: Q-02 (which harnesses), Q-03 (LangGraph config)

Q-02: Agent Harnesses & Models

Which harnesses? Which models? Pre-staged or emergent?

Scope:
- Candidate harnesses: Pi agents (Bun), Hermes, OpenCode. Possibly multiple harnesses for different roles.
- Runtime: Bun vs installed-on-Bun. Subprocess vs HTTP bridge between Python and Bun.
- Extension configs: pre-provided tool stacks or staged with extension configurations?
- Model routing: which model for which node type? Single model for the prototype, or multiple?
- API key management: where keys live, how they are accessed, per-unit vs per-node routing.

Depends on: Q-01 (where agents run), Q-03 (how nodes call agents)

Q-03: LangGraph Configuration

The graph internals. What does the 36-node, 3-unit LangGraph graph actually look like?

Scope:
- State schema: full TypedDict with all keys (nodes, units, access lists, output summaries, player prompts, staged configs, routed context, decisions, history, turn counters). The omni config.
- Node functions: 36 stubs, wired to agent harnesses per Q-02. What does each node do when activated?
- Edges: hex adjacency topology. Node state gating (locked/open based on prerequisites). Conditional edges for revisit/deepen.
- Checkpointer: SqliteSaver on Pi. Serialization, thread safety, named checkpoints for save/load.
- Interrupt patterns: every node interrupts for player review. Poller/stager subgraph for corrections. Curation subgraph for routing.
- Compilation: compile time on Pi 4 for 36 nodes + ~180 edges. Recursion limit.
- Streaming: stream_writer for live terminal output on RG.
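As a starting point for the state schema and edge questions, here is a minimal sketch of a reduced TypedDict, hex adjacency, and prerequisite gating. Everything beyond the key names listed above is an assumption for illustration: the 6×6 odd-r offset layout, the row-major `hex_NN` id scheme, the `NodeStatus` values, and the helper names `node_id`, `hex_neighbors`, and `is_unlocked` are all hypothetical, and the schema covers only a few of the listed keys.

```python
from typing import Dict, List, Literal, Tuple, TypedDict

COLS, ROWS = 6, 6  # assumption: the 36 nodes form a 6x6 hex grid

NodeStatus = Literal["locked", "open", "done"]  # hypothetical status values


class NodeState(TypedDict):
    status: NodeStatus
    prerequisites: List[str]  # node ids that must be "done" before this opens
    output_summary: str


class GraphState(TypedDict):
    nodes: Dict[str, NodeState]
    units: Dict[str, str]  # unit name -> id of the node it occupies
    history: List[str]
    turn: int


def node_id(col: int, row: int) -> str:
    """Hypothetical id scheme: row-major index, e.g. (2, 2) -> hex_14."""
    return f"hex_{row * COLS + col:02d}"


# (dcol, drow) offsets for an odd-r offset layout (odd rows shifted right).
_EVEN_ROW = [(1, 0), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1)]
_ODD_ROW = [(1, 0), (1, -1), (0, -1), (-1, 0), (0, 1), (1, 1)]


def hex_neighbors(col: int, row: int) -> List[Tuple[int, int]]:
    """Return the in-bounds neighbours of a cell on the assumed grid."""
    offsets = _ODD_ROW if row % 2 else _EVEN_ROW
    cells = [(col + dc, row + dr) for dc, dr in offsets]
    return [(c, r) for c, r in cells if 0 <= c < COLS and 0 <= r < ROWS]


def is_unlocked(node: NodeState, nodes: Dict[str, NodeState]) -> bool:
    """A node may open once every prerequisite node is done."""
    return all(nodes[p]["status"] == "done" for p in node["prerequisites"])
```

Under this assumed layout the grid has 85 undirected adjacencies (170 if each direction is counted as its own edge), which is in the same range as the ~180 edges estimated above. A different layout or extra conditional edges would change that count.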

Depends on: Q-01 (where LangGraph runs), Q-02 (what agents it calls), Q-04 (how RG reaches it)

Q-04: Sandwich Logic (Membrane)

The wiring membrane between Pi and RG. How do they talk?

Scope:
- HTTP API: endpoints (GET /state, POST /invoke, POST /checkpoint, GET /health, GET /wiki/{path}). FastAPI + uvicorn confirmed.
- State sync: polling at 100 ms confirmed (16 ms latency). Poll vs stream decision.
- Gamepad input → invoke flow: how D-pad + button presses translate to HTTP calls to Pi.
- State payload: 5.9 KB for 36 nodes confirmed. No bottleneck.
- Wiki serving: how the RG fetches wiki pages. GET /wiki/nodes/hex_14.md → rendered on RG.
- Error handling: Pi unreachable? RG caches last state. "Connecting..." overlay.
- Tailscale: confirmed both devices on 1.96.4 with SSH. MagicDNS short names not supported on RG (use IPs).
- Gamepad input adoption: which sub-agent patterns (radial menus, chips, branching interviews, tags/presets) get adopted?
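The polling and error-handling items above could be sketched on the RG side roughly as follows. This is a sketch under assumptions, not the agreed design: the `StatePoller` class, the `fetch_state` helper, the one-second timeout, and the base URL in the usage note are hypothetical. Only the GET /state endpoint, the 100 ms interval, the cached last state, and the "Connecting..." overlay come from the list.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional

POLL_INTERVAL_S = 0.1  # 100 ms polling, per the state-sync item


def fetch_state(base_url: str) -> dict:
    """GET /state from the Pi; raises an OSError subclass if unreachable."""
    with urllib.request.urlopen(f"{base_url}/state", timeout=1.0) as resp:
        return json.load(resp)


@dataclass
class StatePoller:
    fetch: Callable[[], dict]  # injected so the UI loop can be tested offline
    last_state: Optional[dict] = None
    connected: bool = False

    def poll_once(self) -> Optional[dict]:
        """Refresh the cache on success; keep the stale copy on failure."""
        try:
            self.last_state = self.fetch()
            self.connected = True
        except OSError:  # covers URLError, timeouts, refused connections
            self.connected = False
        return self.last_state

    def overlay(self) -> Optional[str]:
        """Text to draw over the UI while the Pi is unreachable."""
        return None if self.connected else "Connecting..."
```

In the render loop, the RG would call `poll_once()` every `POLL_INTERVAL_S` seconds with something like `StatePoller(fetch=lambda: fetch_state("http://<pi-ip>:<port>"))`, draw `last_state`, and show `overlay()` when it is not `None`. Injecting `fetch` keeps the caching behaviour testable without a Pi on the network.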

Depends on: Q-03 (what state looks like), Q-05 (what the RG UI looks like)

Q-05: UI & RG-Specific Surface

What the player sees on the 640×480 screen. The interactive terminal, the nested menus, the wiki viewer.

Scope:
- Terminal interface: how the interactive terminal renders on RG. Agent streaming output, chain of thought, interrupt prompts.
- Nested menu conventions: full menu tree from L0 (grid) to L4 (expanded section). Tile → unit → node → output → raw.
- Wiki viewer: how context-specific wiki pages render on RG. Scrolling, section expansion, linked page navigation.
- Correction interview: UI for the poller/stager. Multiple choice drill-down. Echo-back confirmation. Close/Correct/Not-at-all loop.
- Gamepad patterns: radial menus for quick actions. Chips composition for structured input. Tags/presets for common corrections. Branching interviews for complex config.
- Routing submenu: grid routing mode. Content selection → method selection → target selection → confirmation.
- Edge case UI: locked node indicators, prerequisite display, collision prevention, save/load confirmation.
- Pygame approach: Wargame Engine node-tree viewport + custom hex rendering. Colored shapes for prototype, sprites later.
- Past examples: Oracle Chamber (same device, same resolution), XCOM Deploy (agent management UI), Balatro (LOVE-based polish reference).
- Notification conventions: how the player learns of events such as "node unlocked," "context routed," "correction staged," and "harvester flagged intel."

Depends on: Q-04 (how state reaches RG), Q-03 (what state contains)

Q-06: Failure Hardening & Constraints

What can go wrong? What's the constraint envelope?

Scope:
- Anticipated failures: Pi overloaded (Docker + LangGraph + agent calls + wiki serving). RG battery death mid-turn. API rate limits. Agent timeout. State serialization failure.
- Unanticipated failures: the "unknown unknowns." How does the system degrade gracefully when something unplanned breaks?
- Constraint envelope: Pi 4: 4 GB RAM (3.1 GB free), 29 GB SD (18 GB free). RG: 640×480, gamepad only. Network: 3-5 ms Tailscale, but what if Tailscale goes down? What if RG is on a different network?
- Recovery patterns: what happens when the player reconnects after a disconnect? State resumption. Interrupt recovery. Partial output salvage.
- Guard rails: max recursion per poller interview. Max concurrent agent calls. Max state size before pruning.
- Guided by examples: what broke during the Akashic Abyss OXCE prototype? What broke during Oracle Chamber? What patterns from those failures inform the hardening?
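Two of the guard rails above, the cap on concurrent agent calls and the agent timeout, can be illustrated with a semaphore and `asyncio.wait_for`. This is one possible shape rather than a decision: the limit of 2, the 30-second timeout, the result dictionaries, and the names `guarded_call` and `run_batch` are placeholders, and `agent_fn` stands in for whichever harness call Q-02 settles on.

```python
import asyncio
from typing import Awaitable, Callable, List

MAX_CONCURRENT_AGENT_CALLS = 2  # placeholder; the real ceiling is an open question
AGENT_TIMEOUT_S = 30.0          # placeholder timeout per agent call

AgentFn = Callable[[str], Awaitable[dict]]


async def guarded_call(
    sem: asyncio.Semaphore, agent_fn: AgentFn, prompt: str, timeout: float
) -> dict:
    """Run one agent call under the concurrency cap and a hard timeout."""
    async with sem:
        try:
            return await asyncio.wait_for(agent_fn(prompt), timeout)
        except asyncio.TimeoutError:
            # Degrade to a marker result instead of failing the whole turn.
            return {"status": "timeout", "prompt": prompt}


async def run_batch(
    agent_fn: AgentFn,
    prompts: List[str],
    limit: int = MAX_CONCURRENT_AGENT_CALLS,
    timeout: float = AGENT_TIMEOUT_S,
) -> List[dict]:
    """Run all prompts, never more than `limit` at once; results keep order."""
    sem = asyncio.Semaphore(limit)  # created inside the running event loop
    tasks = [guarded_call(sem, agent_fn, p, timeout) for p in prompts]
    return await asyncio.gather(*tasks)
```

Returning a timeout marker rather than raising keeps a single slow agent from discarding the other results in the batch, which lines up with the partial output salvage item. Whether that is the right degradation behaviour is itself part of this question.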

Depends on: All previous questions. This is the stress-test layer.

Q-07: Mission Selection / Prototype Profile

The concrete story that proves the graph works. Defined LAST, informed by all previous answers.

Scope:
- Mission objective: what real work does the prototype accomplish? Must exercise the full pipeline.
- Node naming: 36 node names with phase assignments. What does each hex represent?
- Phase distribution: how many nodes per phase? Which phases are entry points, which are gated?
- Unit starting positions: where do Rif, Echo, and Sherpa start? What access do they have?
- Acceptance criteria: what does "done" look like for each node? MVP checklists.
- How it ports to meta mechanics: does the mission prove that the game surface can orchestrate real work? Does it validate the architecture or just the UI?

Depends on: Q-01 through Q-06. This is the capstone.

Dependency Graph

Q-01 (Pi runtime shape) ─────┐
                             ├──► Q-03 (LangGraph config) ──► Q-04 (membrane) ──► Q-05 (UI/RG) ──► Q-06 (hardening) ──► Q-07 (mission)
Q-02 (harnesses/models) ─────┘

Q-01 and Q-02 inform Q-03. Q-03 + Q-04 inform Q-05. Q-01 through Q-05 inform Q-06. All inform Q-07.