Prototype Task Queue — Sequential Build-Out

Status: ACTIVE (issue queue)
Agent: opencode/ext-agent (sandshrew)
Timestamp UTC: 2026-05-11T20:45:00Z
Claim: synthesis | 2026-05-11T20:40:00Z
Session: Task breakdown — interactive spec/PRD build-out — tagged with research refs

Prior Context

Constraints


Task Queue

T-00: Pi Start State (Prelim)

Define the cleanest setup for the Pi to run this as a persistent surface.

Research questions:
- What's the current Pi resource state? (RAM usage, disk, running processes)
- Does the game backend need its own Docker container, or can it run in the existing d3-tui-pi-teams-proto container alongside pi-teams?
- Can Wargame Engine + LangGraph coexist with existing Docker load without thrashing the 4GB ceiling?
- If a new container: what base image? Does it need Python 3.12 + pip + git?
- If running on host: systemd service or tmux session? What user?
- What ports are already bound? (Forgejo:3001, anything else?) What port for the HTTP bridge?
- Should LangGraph use the same Python as the d3-tui container or its own venv?

Resource refs:
- Pi live state: ssh root@100.120.38.37, docker ps, free -h, df -h
- Existing container: d3-tui-pi-teams-proto (Bun/Node/Python/pi-teams), from-forgejo (Forgejo)
- Pearl Brain: context-opencode-20260510-232853-ef8c26d1 (Pi access details)
- Pearl Brain: Q-06 Runtime Container Shape (already answered — single container approach validated)
- Wiki ref: [[langgraph-game-surface]] (client-server architecture diagram)
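
The "what ports are already bound" question can be answered from a Python one-liner on the Pi itself; a minimal stdlib sketch (3001 is Forgejo per the refs above, the other candidate ports are illustrative guesses):

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0

# Known binding from this doc: Forgejo on 3001. The rest are guesses for the bridge.
for port in (3001, 8000, 8080):
    print(port, "in use" if port_in_use(port) else "free")
```

Running this over SSH avoids guessing from `docker ps` output alone.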


T-01: Wargame Toolchain on Pi

Install Wargame Engine on the Pi.

Research questions:
- Does pip3 install git+https://github.com/maximinus/wargame.git work on aarch64 without compilation?
- If it compiles C extensions, do build tools exist on Pi? (gcc, python3-dev)
- What does Wargame Engine actually need at import time? pygame? numpy? Is there a list of dependencies?
- Can it run headless? (No display on Pi — need SDL_VIDEODRIVER=dummy or similar)
- Does Wargame Engine's scene/node system work without a display surface?
- If headless is impossible, can we test rendering on the Pi via framebuffer or Xvfb?

Resource refs:
- Wargame repo: https://github.com/maximinus/wargame
- Wargame README: /wargame/README.md in the repo (dependency list)
- RG verification: confirmed Python 3.12 + pygame 2.5.2 on RG — not yet verified on Pi
- Pearl Brain: Q-09 Bun / Pi Install / Model Routing (Pi install patterns)
- Existing Pi Python: ssh root@100.120.38.37 "python3 --version; pip3 list | grep pygame"
- Wiki ref: [[wargame-engine-rg-deployment]] (RG verification, applicable to Pi)
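
The headless question hinges on one ordering detail: SDL reads its driver env vars at pygame import time, so they must be set before the import. A stdlib-only sketch of that setup (pygame itself may not yet be installed on the Pi, so the actual import is left as a comment):

```python
import os

# SDL's "dummy" video/audio drivers are the standard way to run pygame on a
# display-less host. Setting them AFTER importing pygame has no effect.
os.environ["SDL_VIDEODRIVER"] = "dummy"
os.environ["SDL_AUDIODRIVER"] = "dummy"

def headless_env_ready() -> bool:
    """Check the env any child pygame process will inherit."""
    return (os.environ.get("SDL_VIDEODRIVER") == "dummy"
            and os.environ.get("SDL_AUDIODRIVER") == "dummy")

# Once pygame is installed on the Pi, the follow-up smoke test would be:
#   import pygame; pygame.display.init()  # should succeed under the dummy driver
print(headless_env_ready())
```

Whether Wargame Engine's scene/node system survives without a real display surface still has to be tested empirically.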


T-02: Agent Config Definition

Define the unit's agent configuration.

Research questions:
- Which agent harness is available on the Pi right now? (Pi agents run via Bun in the d3-tui container. Hermes runs on RG via Python. OpenCode runs on Mac.)
- What model should the unit use? (Kimi? MiniMax? DeepSeek? Claude via Mac relay?)
- Are API keys already mounted in the d3-tui container? (KIMI_API_KEY, MINIMAX_API_KEY visible in container env)
- Can the LangGraph node function call the agent harness via subprocess (Python → Bun), HTTP (Python → localhost Bun server), or raw API call (Python → model API directly)?
- For the prototype, should the node function make actual LLM calls or return mock/stub responses?
- What tools does the agent need? (Read, Write, Bash, WebFetch? Or just reasoning?)

Resource refs:
- Pi container env: docker exec d3-tui-pi-teams-proto env | grep API_KEY
- Hermes on RG: /userdata/roms/ports/hermes/
- Pi agent docs: /workcell/llm-wiki/wiki/agents/pi-team-roles.md
- Pearl Brain: Q-01 Pi Teams Fit (harness evaluation)
- Pearl Brain: Q-09 Bun / Pi Install / Model Routing (model key routing)
- Wiki ref: [[langgraph-tool-execution-tradeoffs]] (Pattern C: Hybrid — LangGraph routes, agent decides)
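
The stub-vs-real question can be deferred entirely by hiding the harness behind one injected callable, so the graph is exercisable before any transport (subprocess, HTTP, raw API) is chosen. Everything below (`AgentFn`, `stub_agent`, the state keys) is a hypothetical sketch, not the real harness interface:

```python
from typing import Callable

# AgentFn: prompt in, agent text out. The real backend (Bun subprocess, HTTP to
# a localhost Bun server, or a direct model API call) slots in behind this.
AgentFn = Callable[[str], str]

def stub_agent(prompt: str) -> str:
    """Deterministic stand-in so the graph runs without API keys."""
    return f"[stub] decided on: {prompt[:40]}"

def make_node_fn(agent: AgentFn):
    """Each node function closes over whichever agent backend is injected."""
    def node_fn(state: dict) -> dict:
        reply = agent(state.get("task", ""))
        return {"history": state.get("history", []) + [reply]}
    return node_fn

node = make_node_fn(stub_agent)
print(node({"task": "pick next hex", "history": []}))
```

Swapping `stub_agent` for a real caller later requires no change to the node functions themselves.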


T-03: LangGraph Config Research

What LangGraph setup is optimal for 30 nodes + 1 unit?

Research questions:
- What's the compile time for a 30-node StateGraph on Pi 4? (Test with a stub graph)
- Which checkpointer? MemorySaver (fast, ephemeral) vs SqliteSaver (durable, survives restart)?
- If SqliteSaver: does it handle concurrent reads from RG polling + writes from invoke without locking?
- State schema: what keys? How large is the state object at 30 nodes? Serialization overhead?
- Recursion limit: what's appropriate for a single turn cycle? (Planning → N decisions → Resolution = ~3-5 super-steps per turn)
- Interrupt pattern: one interrupt per node the unit visits? Or only at end-of-turn?
- Can interrupt() return structured data that the RG client can render? (multiple choice, form data?)
- Does LangGraph's streaming API (astream) work over HTTP to the RG, or is polling get_state simpler?

Resource refs:
- LangGraph docs: https://docs.langchain.com/oss/python/langgraph/graph-api
- LangGraph persistence: https://docs.langchain.com/oss/python/langgraph/persistence
- LangGraph interrupts: https://docs.langchain.com/oss/python/langgraph/interrupts
- LangGraph streaming: https://docs.langchain.com/oss/python/langgraph/streaming
- LangGraph reference API: https://reference.langchain.com/python/langgraph/
- Pearl Brain: research-sandshrew-20260511-180837-57495d32 (LangGraph core mechanisms)
- Pearl Brain: inference-sandshrew-20260511-180846-13cceab1 (pre-staged vs emergent)
- Wiki ref: [[langgraph-node-anatomy]] (node capabilities and constraints)
- Wiki ref: [[langgraph-hex-node-mapping]] (30-node scale characteristics)
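
The SqliteSaver locking question can be partially probed with plain sqlite3 before installing LangGraph: SQLite's WAL journal mode gives readers snapshot isolation, so an RG-style polling reader is neither blocked by nor exposed to an in-flight write. This tests the underlying engine behavior, not SqliteSaver itself:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "ckpt.db")

writer = sqlite3.connect(path)              # simulates the invoke-side writes
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE state (turn INTEGER, payload TEXT)")
writer.execute("INSERT INTO state VALUES (1, 'turn-1')")
writer.commit()

reader = sqlite3.connect(path)              # simulates the RG polling connection
writer.execute("INSERT INTO state VALUES (2, 'turn-2')")  # open, uncommitted write

# The reader is not blocked, and sees only committed data.
rows = reader.execute("SELECT turn FROM state").fetchall()
print(rows)
writer.commit()
```

Whether SqliteSaver's own connection handling preserves this under the prototype's actual polling interval still needs a live test on the Pi.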


T-04: Mission Selection

What real work does the unit do across these 30 nodes?

Research questions:
- What's a concrete, completable task that exercises the full pipeline? Something with enough substance to warrant 30 nodes across 7 phases.
- Does the mission draw from existing Pearl OS work? (Example: "Design and spec the LangGraph game surface prototype itself" — meta-recursive, the mission IS the prototype's own design)
- Or does it draw from a real project? (Example: "Research and design the Shachi SDK PSX rendering layer")
- What's the right scope? 30 nodes is ~5-6 per phase on average. Too many for a trivial task, exactly right for a real feature design cycle.
- What does "done" look like at each node? What's the deliverable? A wiki page? A Forgejo issue? A code commit?
- Should the mission be something already completed (retrospective — the unit replays past work to validate the graph) or something new (forward — the unit does real work)?

Resource refs:
- Pearl OS project index: Pearl Brain query_pearl_brain("active projects")
- Existing Linear tickets: POLYGON team, Infrastructure team
- Wiki depot: /workcell/llm-wiki/wiki/research/research-queue.md (existing research questions)
- Forgejo on Pi: http://100.120.38.37:3001/ (current issues could seed the mission)
- Pearl Brain: Q-04 Work Shape / Lifecycle (T0-T7 task lifecycle)
- Wiki ref: [[langgraph-unit-node-interaction-model]] (node output locales, unit traversal)
- Pi file tree: ssh root@100.120.38.37 "ls /mnt/kitchen/from-house/workspace/" (active projects to draw from)


T-05: UI Definition & Design

What does the interface look like on 640×480?

Research questions:
- How are 30 nodes displayed on 640×480? Full grid at once (small tiles) or scrollable lanes?
- 7 phase labels as column headers or row headers? Vertical or horizontal lanes?
- What shape are nodes? Hex? Rectangle? Circle? (Even a full 6×5 grid on 640×480 leaves only ≈106×96 pixels per tile before labels and margins — likely needs grouping or scrolling.)
- Should nodes be grouped by phase label? (All "Planning" nodes clustered together)
- What does the selected node look like? What does the unit's current node look like? Different states?
- Bottom bar layout: what info? Unit name, current phase, available actions, turn number?
- Gamepad mapping: D-pad moves selection, A confirms, B cancels, Start pauses, Select toggles view?
- Does Wargame Engine's node-tree viewport handle this layout natively or need custom pygame drawing?
- Should there be a minimap or overview mode for the full 30-node graph?

Resource refs:
- Wargame Engine node-tree viewport: https://github.com/maximinus/wargame (README example shows scene/node layout)
- RG display: confirmed 640×480 via pygame.display.list_modes()
- RG gamepad: confirmed /dev/input/js0, pygame handles natively
- Existing UI reference: Oracle Chamber (/userdata/oracle-chamber/) — same device, same resolution, proof of concept
- Existing UI reference: XCOM Deploy (/userdata/roms/ports/xcom-deploy/) — same device, agent management UI
- Pearl Brain: POLYGON pack if loaded — PSX-era UI patterns (constrained resolution, gamepad input)
- Wiki ref: [[langgraph-engine-evaluation]] (Wargame Engine capabilities)
- Wiki ref: [[asset-pipeline-wargame]] (prototype rendering approach — shapes + text)
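
The tile-size arithmetic for vertical phase lanes can be checked before any drawing code exists; the bottom-bar height below is an assumed reserve, not a decided value:

```python
SCREEN_W, SCREEN_H = 640, 480   # confirmed RG resolution
PHASES = 7
NODES = 30
BOTTOM_BAR_H = 48               # assumed reserve for unit name / actions / turn number

col_w = SCREEN_W // PHASES                          # width of each phase lane
rows_per_col = -(-NODES // PHASES)                  # ceil(30 / 7): worst-case tiles per lane
tile_h = (SCREEN_H - BOTTOM_BAR_H) // rows_per_col  # height of each tile
print(col_w, rows_per_col, tile_h)
```

Lanes give roughly 91×86 px per node under these assumptions, which is enough for a short text label, so phase-grouped lanes look viable without scrolling.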


T-06: Wiring Determination

How does RG talk to Pi? LangGraph to Wargame?

Research questions:
- HTTP bridge: FastAPI, Flask, or raw aiohttp? What's the lightest Python HTTP server for Pi?
- Endpoints needed: GET /state (fetch current state), POST /invoke (send action, return updated state), GET /health (connectivity check)?
- Polling vs streaming: does LangGraph astream work over Tailscale? Or is simple polling every 500ms sufficient for turn-based?
- State serialization: JSON? How large is the state payload at 30 nodes? (Unit + nodes + edges + history)
- Where does the HTTP server run? Same process as LangGraph, or separate?
- Authentication: none for prototype (Tailscale mesh = private network already)?
- Error handling: what if Pi is unreachable? RG caches last known state? Shows "connecting..." overlay?
- Does the RG client import LangGraph directly (state lives locally) or is state only on Pi via HTTP?

Resource refs:
- LangGraph invoke/stream: https://docs.langchain.com/oss/python/langgraph/graph-api
- LangGraph get_state: https://reference.langchain.com/python/langgraph/graph/CompiledStateGraph#get_state
- Tailscale mesh: RG (100.119.202.114) ↔ Pi (100.120.38.37) — both on tailnet, low latency
- FastAPI: https://fastapi.tiangolo.com/ (lightweight, async, auto-docs)
- Flask: https://flask.palletsprojects.com/ (simpler, synchronous, well-known)
- Pi Docker networking: check if containers can reach each other and the RG via Tailscale
- Wiki ref: [[wargame-engine-rg-deployment]] (RG-to-Pi architecture diagram)
- Wiki ref: [[langgraph-game-surface]] (client-server split)
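
The payload-size question can be bounded up front with a synthetic 30-node state; the key names below mirror the schema candidates in T-03/T-09 but are assumptions:

```python
import json

# Synthetic state in the assumed schema shape: unit + nodes + edges + history.
state = {
    "turn": 1,
    "unit": {"name": "unit-1", "current_node": "node_14"},
    "nodes": {f"node_{i:02d}": {"phase": i % 7, "status": "open", "output": ""}
              for i in range(30)},
    "edges": [(f"node_{i:02d}", f"node_{i+1:02d}") for i in range(29)],
    "history": [],
}
payload = json.dumps(state)
print(f"{len(payload)} bytes")
```

The serialized state comes out to a few kilobytes (node outputs excluded), so polling GET /state every 500ms over the Tailscale mesh is a non-issue; streaming would only earn its complexity if node outputs are inlined into the payload.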


T-07: Draw the UI

Implement the rendering layer on RG.

Research questions:
- Does the RG client run on the RG itself (full pygame app) or render via HTML served from Pi?
- If pygame on RG: is Wargame Engine installed on RG too, or just raw pygame?
- How does the RG client receive state from Pi? HTTP GET? WebSocket?
- Node rendering: colored rectangles with text labels? Phase-label-colored borders?
- Unit position: pulsing highlight? Different color border? Icon overlay?
- Font size at 640×480 for 30 nodes — what's legible? Kenney fonts from Oracle Chamber as reference?
- Gamepad navigation: D-pad moves between nodes, or free cursor? How to handle 30 nodes with D-pad?
- How to show node status (open, in_progress, done, blocked)? Color coding? Icons?

Resource refs:
- RG runtime: Python 3.12 + pygame 2.5.2 confirmed
- Oracle Chamber fonts: /userdata/oracle-chamber/assets/fonts/ (Kenney Future, Kenney High, Kenney Mini)
- XCOM Deploy colors: /userdata/roms/ports/xcom-deploy/config.py (SG9 palette, ready to reuse)
- Wargame Engine example: https://github.com/maximinus/wargame/blob/master/wargame-examples/
- Wiki ref: [[asset-pipeline-wargame]] (colored shapes + text for prototype)
- Wiki ref: [[wargame-engine-rg-deployment]] (PortMaster vs direct SSH deploy)
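
One concrete answer to the D-pad question: treat the 30 nodes as a grid and move a selection index with clamping at the edges, no free cursor. The 6×5 arrangement is an assumption; swap in whatever layout T-05 settles on:

```python
COLS, ROWS = 6, 5   # assumed 6x5 arrangement of the 30 nodes

def move(selected: int, dx: int, dy: int) -> int:
    """Move the selection one tile per D-pad press; clamp at grid edges."""
    col = max(0, min(COLS - 1, selected % COLS + dx))
    row = max(0, min(ROWS - 1, selected // COLS + dy))
    return row * COLS + col

sel = 0
sel = move(sel, 1, 0)   # D-pad right: node 0 -> node 1
sel = move(sel, 0, 1)   # D-pad down:  node 1 -> node 7
print(sel)
```

Clamping (rather than wrapping) keeps spatial intuition on a gamepad; phase-lane layouts would need the same function with per-lane row counts.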


T-08: Bake the Game

Assemble the full prototype.

Research questions:
- What's the startup sequence? Pi starts LangGraph + HTTP server → RG connects → state syncs?
- How does the RG discover the Pi? Hardcoded Tailscale IP? mDNS?
- What happens on first connection? (No state yet → graph initializes → RG renders empty graph)
- What's the turn cycle in practice? Player selects node → presses A → invoke fires → state updates → RG re-renders → interrupt fires → player sees prompt?
- Latency budget: how long does one invoke take? (Network round-trip + LangGraph super-step + agent call)
- Does the full loop feel responsive on real hardware? What's the frame rate during invoke?
- Debug surface: can we see LangGraph state from the Pi via SSH while the RG is playing?

Resource refs:
- All T-00 through T-07 outputs
- Pi + RG both on Tailscale, low-latency mesh
- LangSmith tracing (optional): LANGSMITH_TRACING=true for debugging graph execution
- Wiki ref: [[langgraph-gameplay-modes]] (turn cycle structure, deferred auto-run)
- Wiki ref: [[langgraph-unit-node-interaction-model]] (unit traversal pattern)
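
The unreachable-Pi behavior from T-06 resurfaces in the assembled loop; isolating the fallback logic from the transport (by injecting the fetcher) makes it testable without a network. All names here are hypothetical:

```python
from typing import Callable

def fetch_state(fetcher: Callable[[], dict], cache: dict) -> dict:
    """Try the Pi; on any failure, fall back to the last known state plus a
    flag the RG can render as a 'connecting...' overlay. The fetcher (e.g. a
    urllib GET to the hardcoded Tailscale IP) is injected, so it stays swappable."""
    try:
        state = fetcher()
        cache["last"] = state
        return {**state, "connected": True}
    except Exception:
        return {**cache.get("last", {}), "connected": False}

def pi_down() -> dict:
    raise OSError("Pi unreachable")

cache: dict = {}
print(fetch_state(lambda: {"turn": 3}, cache))  # live fetch, caches the state
print(fetch_state(pi_down, cache))              # falls back to cached turn 3
```

On first connection with no cache, the fallback is an empty dict with connected=False, which matches the "RG renders empty graph" case above.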


T-09: Build the LangGraph

Implement the 30-node StateGraph.

Research questions:
- What does each node function do? Stub initially, then wired to agent (T-02) per mission (T-04)?
- How are 30 nodes added to the graph? Programmatic loop or 30 individual add_node calls?
- Edge generation: manual edges or computed from phase labels? (Nodes in same phase connected sequentially? Cross-phase edges explicit?)
- State schema: what keys exist? nodes, edges, unit, phase_labels, history, turn?
- How does the unit's position update? Node function sets state["unit"]["current_node"] = "node_14"?
- Access gating: how does the node function know what other nodes' outputs to include in context?
- Output locale: where does node output get written? Wiki page? Forgejo issue comment? State key?
- What's the LangGraph compile command? graph = builder.compile(checkpointer=SqliteSaver(...))?
- Does the graph compile + run on Pi 4 within reasonable time? (<5 seconds compile, <1 second per super-step)

Resource refs:
- LangGraph StateGraph API: https://docs.langchain.com/oss/python/langgraph/graph-api
- LangGraph checkpointer: https://docs.langchain.com/oss/python/langgraph/persistence
- LangGraph interrupt(): https://docs.langchain.com/oss/python/langgraph/interrupts
- LangGraph Command: https://reference.langchain.com/python/langgraph/types/Command
- LangGraph Send: https://reference.langchain.com/python/langgraph/types/Send
- Pearl Brain: research-sandshrew-20260511-180837-57495d32 (LangGraph core mechanisms)
- Wiki ref: [[langgraph-node-anatomy]] (full node anatomy — state, config, runtime, return types)
- Wiki ref: [[langgraph-unit-node-interaction-model]] (unit state schema, access gating, output locales)
- Wiki ref: [[langgraph-hex-node-mapping]] (scale characteristics for N-node graphs)
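
The programmatic-loop and phase-derived-edge questions can be prototyped without LangGraph installed: generate the node names and edge pairs first, then feed them to a loop of builder.add_node / builder.add_edge calls. The phase names and per-phase counts below are assumptions standing in for the real 7-phase split:

```python
# Hypothetical phase labels and an assumed split of the 30 nodes across them.
PHASES = ["planning", "research", "design", "spec", "review", "synthesis",
          "resolution"]
NODES_PER_PHASE = [5, 5, 4, 4, 4, 4, 4]

def build_topology():
    """Nodes within a phase chain sequentially; each phase's first node hangs
    off the previous phase's last node (one explicit cross-phase edge)."""
    nodes, edges = [], []
    for phase, count in zip(PHASES, NODES_PER_PHASE):
        phase_nodes = [f"{phase}_{i}" for i in range(count)]
        edges += list(zip(phase_nodes, phase_nodes[1:]))   # intra-phase chain
        if nodes:
            edges.append((nodes[-1], phase_nodes[0]))      # cross-phase link
        nodes += phase_nodes
    return nodes, edges

nodes, edges = build_topology()
print(len(nodes), len(edges))
```

With the topology as plain data, the LangGraph build becomes a loop over it, and the same lists can drive the RG's rendering layout, so graph shape is defined exactly once.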


Dependency Graph

T-00 (Pi start state) ── foundation, informs everything

T-01 (Wargame toolchain) ────┐
T-02 (Agent config) ─────────┤
T-03 (LangGraph config) ─────┤
T-04 (Mission selection) ────┤──► T-06 (Wiring) ──► T-09 (Build LangGraph) ──┐
T-05 (UI design) ────────────┘                                                
                                                                              ├──► T-08 (Bake)
T-07 (Draw UI) ──────────────────────────────────────────────────────────────┘

Spec/PRD Output

After all 10 tasks are informed (one by one, interactively), the compiled output will serve as:
- A build spec for LangGraph backend implementation sessions
- A render spec for Wargame/pygame UI sessions
- A wiring spec for HTTP bridge sessions
- A deployment spec for Pi setup sessions

Each task's output becomes a section of the final PRD.