Prototype Task Queue — Sequential Build-Out
- Status: ACTIVE (issue queue)
- Agent: opencode/ext-agent (sandshrew)
- Timestamp UTC: 2026-05-11T20:45:00Z
- Claim: synthesis | 2026-05-11T20:40:00Z
- Session: Task breakdown — interactive spec/PRD build-out — tagged with research refs
Prior Context
- [[prototype-blueprint-30-node-7-phase]] — Structural blueprint
- [[langgraph-game-surface]] — Architecture concept
- Goal: Knock out each item interactively → compiled spec/PRD at the end → chunkable for other sessions
Constraints
- 30 nodes, 7 phase labels tagged onto nodes (not lanes with fixed distribution)
- 1 unit to start
- UI focus first — Forgejo issue mapping comes later
- No auto-run, no harvesters, no multiple units, no combat
Task Queue
T-00: Pi Start State (Prelim)
Define the cleanest setup for the Pi to run this as a persistent surface.
Research questions:
- What's the current Pi resource state? (RAM usage, disk, running processes)
- Does the game backend need its own Docker container, or can it run in the existing d3-tui-pi-teams-proto container alongside pi-teams?
- Can Wargame Engine + LangGraph coexist with existing Docker load without thrashing the 4GB ceiling?
- If a new container: what base image? Does it need Python 3.12 + pip + git?
- If running on host: systemd service or tmux session? What user?
- What ports are already bound? (Forgejo:3001, anything else?) What port for the HTTP bridge?
- Should LangGraph use the same Python as the d3-tui container or its own venv?
Resource refs:
- Pi live state: ssh root@100.120.38.37, docker ps, free -h, df -h
- Existing container: d3-tui-pi-teams-proto (Bun/Node/Python/pi-teams), from-forgejo (Forgejo)
- Pearl Brain: context-opencode-20260510-232853-ef8c26d1 (Pi access details)
- Pearl Brain: Q-06 Runtime Container Shape (already answered — single container approach validated)
- Wiki ref: [[langgraph-game-surface]] (client-server architecture diagram)
T-01: Wargame Toolchain on Pi
Install Wargame Engine on the Pi.
Research questions:
- Does pip3 install git+https://github.com/maximinus/wargame.git work on aarch64 without compilation?
- If it compiles C extensions, do build tools exist on Pi? (gcc, python3-dev)
- What does Wargame Engine actually need at import time? pygame? numpy? Is there a list of dependencies?
- Can it run headless? (No display on Pi — need SDL_VIDEODRIVER=dummy or similar)
- Does Wargame Engine's scene/node system work without a display surface?
- If headless is impossible, can we test rendering on the Pi via framebuffer or Xvfb?
Resource refs:
- Wargame repo: https://github.com/maximinus/wargame
- Wargame README: /wargame/README.md in the repo (dependency list)
- RG verification: confirmed Python 3.12 + pygame 2.5.2 on RG — not yet verified on Pi
- Pearl Brain: Q-09 Bun / Pi Install / Model Routing (Pi install patterns)
- Existing Pi Python: ssh root@100.120.38.37 "python3 --version; pip3 list | grep pygame"
- Wiki ref: [[wargame-engine-rg-deployment]] (RG verification, applicable to Pi)
T-02: Agent Config Definition
Define the unit's agent configuration.
Research questions:
- Which agent harness is available on the Pi right now? (Pi agents run via Bun in the d3-tui container. Hermes runs on RG via Python. OpenCode runs on Mac.)
- What model should the unit use? (Kimi? MiniMax? DeepSeek? Claude via Mac relay?)
- Are API keys already mounted in the d3-tui container? (KIMI_API_KEY, MINIMAX_API_KEY visible in container env)
- Can the LangGraph node function call the agent harness via subprocess (Python → Bun), or HTTP (Python → localhost Bun server), or raw API call (Python → model API directly)?
- For the prototype, should the node function make actual LLM calls or return mock/stub responses?
- What tools does the agent need? (Read, Write, Bash, WebFetch? Or just reasoning?)
Resource refs:
- Pi container env: docker exec d3-tui-pi-teams-proto env | grep API_KEY
- Hermes on RG: /userdata/roms/ports/hermes/
- Pi agent docs: /workcell/llm-wiki/wiki/agents/pi-team-roles.md
- Pearl Brain: Q-01 Pi Teams Fit (harness evaluation)
- Pearl Brain: Q-09 Bun / Pi Install / Model Routing (model key routing)
- Wiki ref: [[langgraph-tool-execution-tradeoffs]] (Pattern C: Hybrid — LangGraph routes, agent decides)
T-03: LangGraph Config Research
What LangGraph setup is optimal for 30 nodes + 1 unit?
Research questions:
- What's the compile time for a 30-node StateGraph on Pi 4? (Test with a stub graph)
- Which checkpointer? MemorySaver (fast, ephemeral) vs SqliteSaver (durable, survives restart)?
- If SqliteSaver: does it handle concurrent reads from RG polling + writes from invoke without locking?
- State schema: what keys? How large is the state object at 30 nodes? Serialization overhead?
- Recursion limit: what's appropriate for a single turn cycle? (Planning → N decisions → Resolution = ~3-5 super-steps per turn)
- Interrupt pattern: one interrupt per node the unit visits? Or only at end-of-turn?
- Can interrupt() return structured data that the RG client can render? (multiple choice, form data?)
- Does LangGraph's streaming API (astream) work over HTTP to the RG, or is polling get_state simpler?
Resource refs:
- LangGraph docs: https://docs.langchain.com/oss/python/langgraph/graph-api
- LangGraph persistence: https://docs.langchain.com/oss/python/langgraph/persistence
- LangGraph interrupts: https://docs.langchain.com/oss/python/langgraph/interrupts
- LangGraph streaming: https://docs.langchain.com/oss/python/langgraph/streaming
- LangGraph reference API: https://reference.langchain.com/python/langgraph/
- Pearl Brain: research-sandshrew-20260511-180837-57495d32 (LangGraph core mechanisms)
- Pearl Brain: inference-sandshrew-20260511-180846-13cceab1 (pre-staged vs emergent)
- Wiki ref: [[langgraph-node-anatomy]] (node capabilities and constraints)
- Wiki ref: [[langgraph-hex-node-mapping]] (30-node scale characteristics)
T-04: Mission Selection
What real work does the unit do across these 30 nodes?
Research questions:
- What's a concrete, completable task that exercises the full pipeline? Something with enough substance to warrant 30 nodes across 7 phases.
- Does the mission draw from existing Pearl OS work? (Example: "Design and spec the LangGraph game surface prototype itself" — meta-recursive, the mission IS the prototype's own design)
- Or does it draw from a real project? (Example: "Research and design the Shachi SDK PSX rendering layer")
- What's the right scope? 30 nodes is ~4-5 per phase on average. Too many for a trivial task, about right for a real feature design cycle.
- What does "done" look like at each node? What's the deliverable? A wiki page? A Forgejo issue? A code commit?
- Should the mission be something already completed (retrospective — the unit replays past work to validate the graph) or something new (forward — the unit does real work)?
Resource refs:
- Pearl OS project index: Pearl Brain query_pearl_brain("active projects")
- Existing Linear tickets: POLYGON team, Infrastructure team
- Wiki depot: /workcell/llm-wiki/wiki/research/research-queue.md (existing research questions)
- Forgejo on Pi: http://100.120.38.37:3001/ (current issues could seed the mission)
- Pearl Brain: Q-04 Work Shape / Lifecycle (T0-T7 task lifecycle)
- Wiki ref: [[langgraph-unit-node-interaction-model]] (node output locales, unit traversal)
- Pi file tree: ssh root@100.120.38.37 "ls /mnt/kitchen/from-house/workspace/" (active projects to draw from)
T-05: UI Definition & Design
What does the interface look like on 640×480?
Research questions:
- How are 30 nodes displayed on 640×480? Full grid at once (small tiles) or scrollable lanes?
- 7 phase labels as column headers or row headers? Vertical or horizontal lanes?
- What shape are nodes? Hex? Rectangle? Circle? (A 6×5 grid on 640×480 gives ≈106×96 px per tile before padding — feasible if all are visible, but text labels get tight without grouping or scrolling)
- Should nodes be grouped by phase label? (All "Planning" nodes clustered together)
- What does the selected node look like? What does the unit's current node look like? Different states?
- Bottom bar layout: what info? Unit name, current phase, available actions, turn number?
- Gamepad mapping: D-pad moves selection, A confirms, B cancels, Start pauses, Select toggles view?
- Does Wargame Engine's node-tree viewport handle this layout natively or need custom pygame drawing?
- Should there be a minimap or overview mode for the full 30-node graph?
Resource refs:
- Wargame Engine node-tree viewport: https://github.com/maximinus/wargame (README example shows scene/node layout)
- RG display: confirmed 640×480 via pygame.display.list_modes()
- RG gamepad: confirmed /dev/input/js0, pygame handles natively
- Existing UI reference: Oracle Chamber (/userdata/oracle-chamber/) — same device, same resolution, proof of concept
- Existing UI reference: XCOM Deploy (/userdata/roms/ports/xcom-deploy/) — same device, agent management UI
- Pearl Brain: POLYGON pack if loaded — PSX-era UI patterns (constrained resolution, gamepad input)
- Wiki ref: [[langgraph-engine-evaluation]] (Wargame Engine capabilities)
- Wiki ref: [[asset-pipeline-wargame]] (prototype rendering approach — shapes + text)
T-06: Wiring Determination
How does RG talk to Pi? LangGraph to Wargame?
Research questions:
- HTTP bridge: FastAPI, Flask, or raw aiohttp? What's the lightest Python HTTP server for Pi?
- Endpoints needed: GET /state (fetch current state), POST /invoke (send action, return updated state), GET /health (connectivity check)?
- Polling vs streaming: does LangGraph astream work over Tailscale? Or is simple polling every 500ms sufficient for turn-based?
- State serialization: JSON? How large is the state payload at 30 nodes? (Unit + nodes + edges + history)
- Where does the HTTP server run? Same process as LangGraph, or separate?
- Authentication: none for prototype (Tailscale mesh = private network already)?
- Error handling: what if Pi is unreachable? RG caches last known state? Shows "connecting..." overlay?
- Does the RG client import LangGraph directly (state lives locally) or is state only on Pi via HTTP?
Resource refs:
- LangGraph invoke/stream: https://docs.langchain.com/oss/python/langgraph/graph-api
- LangGraph get_state: https://reference.langchain.com/python/langgraph/graph/CompiledStateGraph#get_state
- Tailscale mesh: RG (100.119.202.114) ↔ Pi (100.120.38.37) — both on tailnet, low latency
- FastAPI: https://fastapi.tiangolo.com/ (lightweight, async, auto-docs)
- Flask: https://flask.palletsprojects.com/ (simpler, synchronous, well-known)
- Pi Docker networking: check if containers can reach each other and the RG via Tailscale
- Wiki ref: [[wargame-engine-rg-deployment]] (RG-to-Pi architecture diagram)
- Wiki ref: [[langgraph-game-surface]] (client-server split)
T-07: Draw the UI
Implement the rendering layer on RG.
Research questions:
- Does the RG client run on the RG itself (full pygame app) or render via HTML served from Pi?
- If pygame on RG: is Wargame Engine installed on RG too, or just raw pygame?
- How does the RG client receive state from Pi? HTTP GET? WebSocket?
- Node rendering: colored rectangles with text labels? Phase-label-colored borders?
- Unit position: pulsing highlight? Different color border? Icon overlay?
- Font size at 640×480 for 30 nodes — what's legible? Kenney fonts from Oracle Chamber as reference?
- Gamepad navigation: D-pad moves between nodes, or free cursor? How to handle 30 nodes with D-pad?
- How to show node status (open, in_progress, done, blocked)? Color coding? Icons?
Resource refs:
- RG runtime: Python 3.12 + pygame 2.5.2 confirmed
- Oracle Chamber fonts: /userdata/oracle-chamber/assets/fonts/ (Kenney Future, Kenney High, Kenney Mini)
- XCOM Deploy colors: /userdata/roms/ports/xcom-deploy/config.py (SG9 palette, ready to reuse)
- Wargame Engine example: https://github.com/maximinus/wargame/blob/master/wargame-examples/
- Wiki ref: [[asset-pipeline-wargame]] (colored shapes + text for prototype)
- Wiki ref: [[wargame-engine-rg-deployment]] (PortMaster vs direct SSH deploy)
T-08: Bake the Game
Assemble the full prototype.
Research questions:
- What's the startup sequence? Pi starts LangGraph + HTTP server → RG connects → state syncs?
- How does the RG discover the Pi? Hardcoded Tailscale IP? mDNS?
- What happens on first connection? (No state yet → graph initializes → RG renders empty graph)
- What's the turn cycle in practice? Player selects node → presses A → invoke fires → state updates → RG re-renders → interrupt fires → player sees prompt?
- Latency budget: how long does one invoke take? (Network round-trip + LangGraph super-step + agent call)
- Does the full loop feel responsive on real hardware? What's the frame rate during invoke?
- Debug surface: can we see LangGraph state from the Pi via SSH while the RG is playing?
Resource refs:
- All T-00 through T-07 outputs
- Pi + RG both on Tailscale, low-latency mesh
- LangSmith tracing (optional): LANGSMITH_TRACING=true for debugging graph execution
- Wiki ref: [[langgraph-gameplay-modes]] (turn cycle structure, deferred auto-run)
- Wiki ref: [[langgraph-unit-node-interaction-model]] (unit traversal pattern)
T-09: Build the LangGraph
Implement the 30-node graph in LangGraph.
Research questions:
- What does each node function do? Stub initially, then wired to agent (T-02) per mission (T-04)?
- How are 30 nodes added to the graph? Programmatic loop or 30 individual add_node calls?
- Edge generation: manual edges or computed from phase labels? (Nodes in same phase connected sequentially? Cross-phase edges explicit?)
- State schema: what keys exist? nodes, edges, unit, phase_labels, history, turn?
- How does the unit's position update? Node function sets state["unit"]["current_node"] = "node_14"?
- Access gating: how does the node function know what other nodes' outputs to include in context?
- Output locale: where does node output get written? Wiki page? Forgejo issue comment? State key?
- What's the LangGraph compile command? graph = builder.compile(checkpointer=SqliteSaver(...))?
- Does the graph compile + run on Pi 4 within reasonable time? (<5 seconds compile, <1 second per super-step)
Resource refs:
- LangGraph StateGraph API: https://docs.langchain.com/oss/python/langgraph/graph-api
- LangGraph checkpointer: https://docs.langchain.com/oss/python/langgraph/persistence
- LangGraph interrupt(): https://docs.langchain.com/oss/python/langgraph/interrupts
- LangGraph Command: https://reference.langchain.com/python/langgraph/types/Command
- LangGraph Send: https://reference.langchain.com/python/langgraph/types/Send
- Pearl Brain: research-sandshrew-20260511-180837-57495d32 (LangGraph core mechanisms)
- Wiki ref: [[langgraph-node-anatomy]] (full node anatomy — state, config, runtime, return types)
- Wiki ref: [[langgraph-unit-node-interaction-model]] (unit state schema, access gating, output locales)
- Wiki ref: [[langgraph-hex-node-mapping]] (scale characteristics for N-node graphs)
Dependency Graph
T-00 (Pi start state) ── foundation, informs everything
T-01 (Wargame toolchain) ────┐
T-02 (Agent config) ─────────┤
T-03 (LangGraph config) ─────┤
T-04 (Mission selection) ────┤──► T-06 (Wiring) ──► T-09 (Build LangGraph) ──┐
T-05 (UI design) ────────────┘ │
├──► T-08 (Bake)
T-07 (Draw UI) ──────────────────────────────────────────────────────────────┘
Spec/PRD Output
After all 10 tasks are informed (one by one, interactively), the compiled output will serve as:
- A build spec for LangGraph backend implementation sessions
- A render spec for Wargame/pygame UI sessions
- A wiring spec for HTTP bridge sessions
- A deployment spec for Pi setup sessions
Each task's output becomes a section of the final PRD.