GameSurface Prototype — Implementation Handoff Spec

Mission: Design and prototype a gamepad-first virtual keyboard system that upgrades text input on the RG40XXV game surface.

Status: STAGED. Locked frames are defined. LangGraph configs are recommendations only; the team may freely explore and implement whatever works best, provided it is documented.

Date: 2026-05-12

TAILSCALE / SSH ACCESS (All Devices)

The Pi, RG, and Mac are all on the same Tailscale mesh. This is the canonical access method.

Device	Tailscale IP	Hostname	SSH Command
Pi 4 (relik-pi4)	100.120.38.37	relik-pi4	`ssh mehdifarah@relik-pi4`
RG40XXV (knulli-1)	100.119.202.114	knulli-1	`ssh root@100.119.202.114`
Mac (mehdis-macbook-air-2)	100.93.138.108	—	—

The Pi and the RG both run Tailscale 1.96.4 with Tailscale SSH enabled. No passwords, sshpass, or ConnectTimeout workarounds are needed. MagicDNS short hostnames are not supported on the RG (a KNULLI limitation), so always use IPs when connecting from the RG.

Hermes on Pi: If a command requires an interactive terminal (e.g., `hermes model`), use the `hermesi` wrapper at `/home/mehdifarah/.local/bin/hermesi` on the Pi. Pipe menu choices through stdin: `printf '1\n1\n' | hermesi model`.

Pi LangGraph server: Start it with `cd /mnt/kitchen/pearl/game-surface-python && /home/mehdifarah/game-surface-venv/bin/python src/server.py`. It serves on port 8000, and the RG reaches it at http://100.120.38.37:8000.

FORGEJO REPO

All work must be committed to the Forgejo repo at:

ssh://mehdifarah@relik-pi4:/home/mehdifarah/git/langgraph-game-surface.git

The repo already exists with prior commits (engine evaluations, prototype specs, Akashic Abyss archive). The implementation team should:

Commit all code, assets, and configs to this repo
Log design decisions, architecture notes, and iteration history
Keep the repo well organized (separate dirs for src/, docs/, assets/, config/)

The Forgejo web UI is at http://100.120.38.37:3001/ (it may need configuration; SSH git works regardless).

LAYER 1: THE GAME — Player-Facing Surface

LOCKED (Non-Negotiable)

Assets

Core game assets already exist (hex tiles, unit sprites, UI elements). Use these as the base.
Team is free to create new animations, sprites, or polish assets to fill gaps or improve quality.
All assets committed to the Forgejo repo under assets/.

Mission

Design and prototype a gamepad-first virtual keyboard system for RG40XXV. Output: well-documented concepts, builds, prototypes for different keyboard systems, text input methods, and structured Hermes skills.

Design Philosophy

Double diamond: Discover → Define → Design → Deliver → Review

Hex Grid

6×6 flat-top hex grid, 36 hexes (00–35)
Each hex is a task node with a phase tag
Hexes have adjacency-based movement (up to 6 neighbors)
Renders on RG at 640×480 via Wargame Engine + pygame

Units

3 units: Rif (designer), Echo (developer), Sherpa (researcher) — names are placeholders
Per-unit interactive turns — player moves one unit at a time, unit completes work, player reviews
No auto-run mode
Only one unit per hex
Each unit carries: config (model/harness), trail context, access lists, history

Validation

Push builds to the RG40XXV when ready for testing
Iterate by taking screenshots on the RG and checking against the LangGraph state
Verify via curl http://100.120.38.37:8000/state to confirm graph state matches what's visible on RG

Full Node Grid (36 Hexes with Task Descriptions)

Hex	Phase	Task	Prerequisites	Unlocks
00	discover	archaeology of past local conventions — SSH into RG, extract patterns from Pearl Diver, Oracle Chamber, Mozart, IN-REACH	(none)	06
01	discover	Worldsim research — how Hermes nests structured multi-choice for gamepad response	(none)	06
02	discover	nesting in games — case survey of games with minimal text input for complex state	(none)	07
03	discover	nesting logic — algorithmic survey (MCTS, Socratic, decision trees)	(none)	07
04	discover	virtual keyboards — case survey of gamepad text input systems	(none)	06
05	discover	conversation trees — dynamic context, modular, multiple choice	(none)	08
06	discover	synth: keyboard + input signals — synthesize 00, 01, 04 into structured report	00, 01, 04 (2 of 3)	09, 10
07	discover	synth: nesting + response signals — synthesize 02, 03	02, 03 (1 of 2)	11
08	discover	synth: convo structure + dynamic context — synthesize 05	05	10, 11
09	define	define: keyboard surface requirements — from synth 06	06	12
10	define	define: MC interview flow requirements — from synth 06, 08	06, 08	13, 14
11	define	define: Hermes skill bracing requirements — from synth 07, 08	07, 08	14
12	define	spec: keyboard UI params — concrete parameters from 09	09	15, 16
13	define	spec: MC interview params — concrete interview parameters from 10	10	17
14	define	spec: Hermes skill + integration params — from 10, 11	10, 11	17, 18, 23
15	design	design: keyboard UI concept 1 (grid-based, D-pad) — from spec 12	12	19
16	design	design: keyboard UI concept 2 (radial/chips) — from spec 12	12	19, 25
17	design	design: MC interview flow — wireframes from specs 13, 14	13, 14	20, 21
18	design	design: integration + wildcard — from spec 14	14	21, 22, 23
19	design	design: refined keyboard spec — synthesize 15 + 16, select strongest concept	15, 16	24
20	design	design: MC logic / nesting flow — refine interview logic from 17	17	26
21	design	design: conversation / reply / options flow — end-to-end interaction design	17, 18	24, 25, 26
22	deliver	design: integration architecture — blueprint from 18	18	28
23	deliver	design: Hermes skill architecture — from 14, 18	14, 18	27
24	deliver	build keyboard prototype 1 — implement from design 19, 21	19, 21	29
25	deliver	build keyboard prototype 2 — implement from design 16, 21	16, 21	29
26	deliver	build MC / nesting logic prototype — from design 20, 21	20, 21	29
27	deliver	build Hermes skill(s) — self-skill from within harness, architecture from 23	23	29
28	deliver	integrate prototype into game surface — per architecture 22. ⚠ Verify Wargame scene system can handle overlay vs full-scene switching	22	29, 30
29	deliver	test on RG hardware — deploy, test, screenshot, document	24, 25, 26, 27, 28	31, 32, 34
30	deliver	document and package artifacts — compile all deliverables, README	28	31, 33
31	deliver	synthesize and polish — review all deliverables, gap analysis	29, 30	35
32	review	architecture review — Rif (has full Design context)	29	35
33	review	code + integration review — BOTH Rif AND separate Hermes instance review	30	35
34	review	UX / playability review — usability on RG, prototype comparison	29	35
35	review	post-mortem + metabolism plan — absorb concepts into OmniWiki	31, 32, 33, 34	(complete)

Interaction Model & Menu System ⚠️ CRITICAL — NOT OPTIONAL

The game is NOT just a hex grid with units. Every tile, unit, and node has nested menus. The full interaction tree is 4 levels deep. This is the core UX — the team MUST implement this.

Menu Tree (Collapsed)

L0  HEX GRID       — D-pad navigates, A selects hex
L1  TILE MENU      — unit summary + node summary. X→Unit Menu, Y→Node Menu
L2a UNIT MENU      — Status, Config, Trail, Access, History, Move, Interact
L2b NODE MENU      — Properties, Output, Edges, Config, History
L3                 — full detail screens for each L2 item
L4a                — expanded output section
L4b                — full output page (cross-device to Mac if too dense)
L4c                — routing action (Curate Context / Point Wholesale)

Key Interaction Patterns

Correction Interview (when output is wrong): Player selects "Correct" at interrupt. MC branching interview narrows problem: what's wrong? → which section? → suggested fix direction. Agent echoes back interpretation. Player: Correct / Close / Not-at-all. All gamepad — no text input required.

Context Routing: From a completed node's Output view → select Route → choose content scope (entire/section/curated) → choose method (Curate/Pointer) → return to grid in routing mode → select target hex → confirm.

Output Display: Every node produces two things: full output (wiki page on Pi) + summary (structured sections with header/body/status). RG displays summary. A expands section inline. Y opens full wiki page (on Mac if too dense).

Interactive Terminal: When a unit is active at a node, the terminal shows live agent output, chain of thought, and the interrupt prompt. Player can type corrections (fallback) or use MC interview (primary).

Live Prompt Injection: Player corrections accumulate per-node. On next turn, the node reads all prior corrections and restructures. Corrections never overwrite — they append via reducer.

Per-Unit Independence: Each unit has its own turn counter, config, trail, and access lists. Only one unit per hex. Cross-unit context sharing via access lists.

Gamepad Controls

Button	Context	Action
D-pad	Grid	Move cursor between hexes
D-pad	Menu	Navigate items
A	Grid	Open Tile Menu
A	Menu	Select / Confirm / Drill down
B	Any	Back one level
X	Grid	Cycle unit focus
X	Tile Menu	Open Unit Menu
Y	Tile Menu	Open Node Menu
Y	Output	Open full page (cross-device)
SELECT	Grid	Toggle phase color overlay
START	Any	Pause menu (save, quit, settings)
L1/R1	Grid	Previous/next unit
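The control table above can be read as a lookup from (button, context) to action. The following is a minimal sketch of that idea, not the Wargame Engine's input API: the action strings, the context names, and the rule that menu-like screens fall back to the generic "Menu" bindings are all assumptions made for illustration.

```python
from typing import Optional

# (button, context) -> action, transcribed from the Gamepad Controls table.
# Action and context identifiers are hypothetical names, not from the codebase.
BINDINGS = {
    ("DPAD", "grid"): "move_cursor",
    ("DPAD", "menu"): "navigate_items",
    ("A", "grid"): "open_tile_menu",
    ("A", "menu"): "select",
    ("B", "any"): "back",
    ("X", "grid"): "cycle_unit_focus",
    ("X", "tile_menu"): "open_unit_menu",
    ("Y", "tile_menu"): "open_node_menu",
    ("Y", "output"): "open_full_page",
    ("SELECT", "grid"): "toggle_phase_overlay",
    ("START", "any"): "pause_menu",
    ("L1", "grid"): "previous_unit",
    ("R1", "grid"): "next_unit",
}

# Assumption: these screens inherit the generic "Menu" bindings.
MENU_CONTEXTS = {"tile_menu", "unit_menu", "node_menu", "output"}


def resolve(button: str, context: str) -> Optional[str]:
    """Return the action bound to a button press, or None if unbound."""
    # Most specific binding first, then generic menu, then "any".
    candidates = [context]
    if context in MENU_CONTEXTS:
        candidates.append("menu")
    candidates.append("any")
    for ctx in candidates:
        action = BINDINGS.get((button, ctx))
        if action is not None:
            return action
    return None
```

Keeping the bindings in one table makes it easy to check the RG build against this spec: any button that resolves to None in a given context is simply ignored.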

Wiki Reference Pages (Read These)

Topic	Wiki Page
Full menu system spec (L0→L4)	`wiki/concepts/menu-system-full-spec.md`
Correction interview protocol	`wiki/research/correction-interview-protocol.md`
Poller/stager pattern	`wiki/concepts/poller-stager-pattern.md`
Context curation pattern	`wiki/concepts/langgraph-context-curation-pattern.md`
Live prompt injection	`wiki/concepts/langgraph-live-prompt-injection.md`
Output summary protocol	`wiki/concepts/output-summary-protocol.md`
Routing submenu design	`wiki/research/routing-submenu-design.md`
Edge case protocols	`wiki/research/edge-case-protocols.md`
Gamepad input patterns (research)	`wiki/research/gamepad-input-patterns.md`
Nested menu protocol	`wiki/concepts/nested-menu-protocol.md`
Node config catalog (12 dimensions)	`wiki/concepts/langgraph-node-config-catalog.md`
Soft mechanics (context/prompts/comms)	`wiki/concepts/langgraph-soft-mechanics.md`

Phases & Node Distribution

Phase	Hexes	Count
Discover	00–08	9
Define	09–14	6
Design	15–21	7
Deliver	22–31	10
Review	32–35	4

Full Node Definitions

Hex	Phase	Task
00	discover	archaeology of past local conventions — SSH into RG, extract patterns from Pearl Diver, Oracle Chamber, Mozart, IN-REACH
01	discover	Worldsim research — how Hermes nests structured multi-choice for gamepad response
02	discover	nesting in games — case survey of games with minimal text input for complex state
03	discover	nesting logic — algorithmic survey (MCTS, Socratic, decision trees)
04	discover	virtual keyboards — case survey of gamepad text input systems
05	discover	conversation trees — dynamic context, modular, multiple choice
06	discover	synth: keyboard + input signals (from 00, 01, 04)
07	discover	synth: nesting + response signals (from 02, 03)
08	discover	synth: convo structure + dynamic context (from 05)
09	define	define: keyboard surface requirements
10	define	define: MC interview flow requirements
11	define	define: Hermes skill bracing requirements
12	define	spec: keyboard UI params
13	define	spec: MC interview params
14	define	spec: Hermes skill + integration params
15	design	design: keyboard UI concept 1
16	design	design: keyboard UI concept 2
17	design	design: MC interview flow
18	design	design: integration + wildcard
19	design	design: refined keyboard spec (synthesis of 15+16)
20	design	design: MC logic / nesting flow
21	design	design: conversation / reply / options flow
22	deliver	design: integration architecture
23	deliver	design: Hermes skill architecture
24	deliver	build keyboard prototype 1
25	deliver	build keyboard prototype 2
26	deliver	build MC / nesting logic prototype
27	deliver	build Hermes skill(s) for keyboard input
28	deliver	integrate prototype into game surface
29	deliver	test on RG hardware
30	deliver	document and package artifacts
31	deliver	synthesize and polish
32	review	architecture review
33	review	code + integration review (both Rif and separate Hermes instance review)
34	review	UX / playability review
35	review	post-mortem + metabolism plan

Gating (Prerequisites)

A hex is locked until its prerequisites reach "completed" or "deepened" status.

06 needs 00, 01, 04 (at least 2 of 3)
07 needs 02, 03 (at least 1 of 2)
08 needs 05
09 needs 06
10 needs 06, 08
11 needs 07, 08
12 needs 09
13 needs 10
14 needs 10, 11
15 needs 12
16 needs 12
17 needs 13, 14
18 needs 14
19 needs 15, 16
20 needs 17
21 needs 17, 18
22 needs 18
23 needs 14, 18
24 needs 19, 21
25 needs 16, 21
26 needs 20, 21
27 needs 23
28 needs 22
29 needs 24, 25, 26, 27, 28
30 needs 28
31 needs 29, 30
32 needs 29
33 needs 30
34 needs 29
35 needs 31, 32, 33, 34
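The gating rules above, including the "at least N of M" cases for hexes 06 and 07, can be expressed as a small table plus a check. This is a sketch only: the prerequisite lists are copied from this section, but the function names and the shape of the `status` mapping (hex number to status string) are assumptions, not the server's actual data model.

```python
# Prerequisites per hex, copied from the gating list. Hexes 00-05 have none.
PREREQS = {
    6: (0, 1, 4), 7: (2, 3), 8: (5,), 9: (6,), 10: (6, 8), 11: (7, 8),
    12: (9,), 13: (10,), 14: (10, 11), 15: (12,), 16: (12,),
    17: (13, 14), 18: (14,), 19: (15, 16), 20: (17,), 21: (17, 18),
    22: (18,), 23: (14, 18), 24: (19, 21), 25: (16, 21), 26: (20, 21),
    27: (23,), 28: (22,), 29: (24, 25, 26, 27, 28), 30: (28,),
    31: (29, 30), 32: (29,), 33: (30,), 34: (29,), 35: (31, 32, 33, 34),
}

# Hexes that need only some of their prerequisites ("at least N").
MIN_DONE = {6: 2, 7: 1}

# Statuses that count as satisfying a prerequisite.
DONE = {"completed", "deepened"}


def missing_prereqs(hex_id: int, status: dict) -> list:
    """List the prerequisite hexes that are not yet completed or deepened."""
    return [h for h in PREREQS.get(hex_id, ()) if status.get(h) not in DONE]


def is_unlocked(hex_id: int, status: dict) -> bool:
    """True if enough prerequisites are done for the hex to be entered."""
    required = PREREQS.get(hex_id, ())
    needed = MIN_DONE.get(hex_id, len(required))
    done = len(required) - len(missing_prereqs(hex_id, status))
    return done >= needed
```

For example, with hexes 00 and 04 done and 01 untouched, `is_unlocked(6, {0: "completed", 4: "deepened"})` is true because two of three is enough, while hex 10 stays locked until both 06 and 08 are done.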

Acceptance Criteria

At least 2 distinct keyboard prototype concepts documented
At least 1 working prototype running on RG (colored shapes acceptable)
Hermes skill(s) for keyboard input created and tested
All artifacts in Forgejo repo + wiki runtime section
Post-run metabolism plan: concepts mergeable into OmniWiki under "Gamepad Input"

RECOMMENDED (Strong Suggestions)

Interaction Model

Pre-filtered context access. Each unit's context_access list gates what Hermes receives. Curation happens upstream (player via routing menu, or curation agent). No LLM self-selection of context for prototype.
Correction interviews. Inline interrupt() loop — 3-4 MC questions in the node function. Player narrows problem → agent echoes → player confirms/refines.
Curation as standalone service. Not a hex on the map. Separate endpoint or RG-side preprocessing.
Nested menu tree. See full menu spec in wiki for L0→L4 hierarchy.
Output summaries. Structured sections with header/body/status. RG displays summary; full output lives on Pi wiki.
Context-specific wiki views. Node Y opens that node's page. Unit Y opens unit's trail page.

Gamepad Controls

D-pad: navigate. A: select/confirm. B: back. X: unit menu. Y: node menu. SELECT: phase overlay. START: pause/save. L1/R1: cycle units.

Rendering

Wargame Engine (Python + pygame) — selected after 7-engine survey
Colored shapes + text for prototype (zero assets needed)
Hex grid fits at ~100×87px per hex on 640×480

ADVISORY (Explored, Team Can Adapt)

Alternative Engines Evaluated

OpenRA (C#, no Python bridge), RecoilEngine (C++ RTS, overkill), Spring RTS (same), openage (dead), LOVE (adds Lua boundary), Fabula (dead, client-server pattern useful conceptually), Godot (overkill), Panda3D (3D, wrong domain).

Gamepad Input Patterns (Sub-Agent Research Available)

Radial menus, chips composition, branching interviews, T9 predictive, tags/presets, thumbs up/down drill-down
Correction interview protocol with 6 branching scenarios
Full routing submenu design (content → method → target → confirm)
Edge case protocols (incomplete output, stuck unit, direction disagreement, hex collision, revisit, routing failure, save/load)

Pending Verification

Wargame Engine scene system: can it handle overlay vs. full-scene switching for the keyboard?
Which keyboard concept gets built first? (Research determines this — hexes 00-05, synthesized at 19)

LAYER 2: THE MEMBRANE — Wiring Between RG and Pi

LOCKED (Non-Negotiable)

Network

Pi 4 (Tailscale IP: 100.120.38.37) ↔ RG40XXV (Tailscale IP: 100.119.202.114)
Both on Tailscale 1.96.4 with Tailscale SSH enabled
SSH: ssh mehdifarah@relik-pi4 (Pi), ssh root@100.119.202.114 (RG)
No passwords, no sshpass, no ConnectTimeout gymnastics
MagicDNS short hostnames not supported on RG (KNULLI limitation — use IPs)

Probing Results (Verified)

Ping Pi↔RG: 3–5ms steady state
HTTP roundtrip RG→Pi: 16ms avg
State fetch (36 nodes + 3 units): 18ms, 5.9KB payload
Invoke (0.8s mock agent): 839ms total (agent dominates, not wire)

Save State

Canonical save = LangGraph checkpointer on Pi
RG "save" = 640×480 screenshot + hex/unit/turn metadata (visual bookmark only)

RECOMMENDED (Strong Suggestions)

HTTP API

GET /state — returns full game state as JSON
POST /invoke — unit moves, graph activates node, returns state + interrupt
POST /checkpoint — creates named save point
GET /health — connectivity check
GET /nodes — node definitions + adjacency + gating rules
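From the RG side, these endpoints can be called with the Python standard library alone. The sketch below builds the requests separately from sending them so the request shape can be inspected offline. The endpoint paths and base URL come from this spec; the JSON body fields for POST /invoke (`unit`, `target_hex`) are assumptions, since the request schema is not defined here.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://100.120.38.37:8000"  # Pi address as reached from the RG


def state_request() -> Request:
    """Build the GET /state request."""
    return Request(f"{BASE_URL}/state", method="GET")


def invoke_request(unit: str, target_hex: int) -> Request:
    """Build a POST /invoke request; the body fields are hypothetical."""
    body = json.dumps({"unit": unit, "target_hex": target_hex}).encode("utf-8")
    return Request(
        f"{BASE_URL}/invoke",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: Request, timeout: float = 2.0) -> dict:
    """Send a request to the Pi and decode the JSON response."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `send(state_request())` would then return the same JSON that `curl http://100.120.38.37:8000/state` prints.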

State Sync

Polling at 100ms intervals. With 3–5ms ping and a state fetch of about 18ms, the polling delay is imperceptible
No WebSocket/streaming needed for prototype
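A polling client only needs to redraw when the state actually changes. The sketch below shows one way to do that, under the assumption that the caller supplies a `fetch_state` function (for example, a wrapper around GET /state); the function name and the injected `sleep` parameter are illustrative, not part of the existing client.

```python
import time
from typing import Callable, Iterator

POLL_INTERVAL_S = 0.1  # 100 ms, per the recommendation above


def poll_changes(
    fetch_state: Callable[[], dict],
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Poll up to max_polls times, yielding the state only when it changes."""
    last = None
    for _ in range(max_polls):
        state = fetch_state()
        if state != last:
            last = state
            yield state
        sleep(POLL_INTERVAL_S)
```

Inside a pygame loop, the blocking sleep would more likely be replaced by a timer check each frame; the change-detection logic stays the same.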

Server

FastAPI + uvicorn on Pi (lightweight, async, auto-docs)
Port 8000

Wiki Serving

Wiki files on HDD at /mnt/kitchen/pearl/wiki/
Served over HTTP: GET /wiki/architecture/langgraph-game-surface.md
RG renders context-specific pages; full wiki pages open on Mac for dense content

ADVISORY

State Payload

5.9KB baseline at 36 nodes. Grows with history but <50KB for full sessions. Not a bottleneck.

Polling vs Streaming

Polling chosen for simplicity. 3-5ms latency confirmed. Streaming (LangGraph astream) available as future upgrade.

LAYER 3: THE GRAPH / Pi RUNTIME — LangGraph Backend

IMPORTANT: The LangGraph configurations below are RECOMMENDATIONS based on our prototyping. The implementation team is free to explore and implement any LangGraph config that works best — use what's here as a starting point, not a constraint. Whatever you choose, document it properly in the Forgejo repo so it's traceable.

LOCKED (Non-Negotiable)

Agent Stack

LangGraph on Python (venv at /home/mehdifarah/game-surface-venv/)
Hermes as agent harness (Qwen 3.6+ primary via Nous Portal OAuth, Kimi K2.6 fallback)
One runtime — Python for LangGraph + Hermes (no language boundary)
Hermes call: hermes -z "prompt" --model qwen/qwen3.6-plus

Storage

High-write data on 3.6TB HDD at /mnt/kitchen/pearl/
SD card for OS/Docker/Forgejo only (low-write)
Game code at /mnt/kitchen/pearl/game-surface-python/
Wiki at /mnt/kitchen/pearl/wiki/
Checkpoints at /mnt/kitchen/pearl/checkpoints/

Process

Single Python process on host, or Docker container (team's choice)
Forgejo stays in Docker (existing, working)
The probe server and the old d3-tui container can be retired (wiki already archived)

Concurrent Load

Up to 5 concurrent Hermes calls (3 units + poller + curation agent)
Pi 4 has 1.8GB headroom at full load
The Nous Portal OAuth rate limit is the real cap, not Pi hardware

RECOMMENDED (Strong Suggestions)

LangGraph Graph (Compiles and Runs — Verified on Pi)

Graph status: ✅ Compiles, 36 nodes, hex adjacency, MemorySaver checkpointer. Server running on port 8000.

Two invocations tested:

Sherpa moved to hex 02 (turn 2, position updated, state persisted)
Rif moved to hex 16 (locked: prerequisite hex 12 not completed, gating enforced)

State size: 16.9KB after 2 invokes with Hermes output.

State Schema (Omni Config)

GameState:
  nodes: dict          # 36 hex entries (merge reducer)
  units: dict          # 3 unit entries (merge reducer)
  node_player_prompts: dict  # corrections per hex (append reducer)
  staged_configs: dict       # poller → unit handoff
  routed_context: dict       # curation → target handoff
  mission: dict              # objective, criteria
  decisions: list            # upstream choices (append reducer)
  turn_log: list             # traceability log (append reducer)
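In LangGraph, a reducer is attached to a state key by annotating its type with a function that combines the old and new values. The sketch below shows how the schema above might be written that way. The reducer functions are assumptions about what "merge" and "append" mean here (a shallow dict merge, and per-hex list appending for `node_player_prompts`); the real reducers may differ.

```python
from typing import Annotated, TypedDict


def merge_dicts(old: dict, new: dict) -> dict:
    """Merge reducer: keys in `new` replace those in `old`; others are kept."""
    return {**old, **new}


def append_list(old: list, new: list) -> list:
    """Append reducer: new entries are added after existing ones."""
    return old + new


def append_per_key(old: dict, new: dict) -> dict:
    """Append reducer for a dict of lists: corrections accumulate per hex."""
    merged = {key: list(values) for key, values in old.items()}
    for key, values in new.items():
        merged.setdefault(key, []).extend(values)
    return merged


class GameState(TypedDict):
    nodes: Annotated[dict, merge_dicts]
    units: Annotated[dict, merge_dicts]
    node_player_prompts: Annotated[dict, append_per_key]
    staged_configs: dict
    routed_context: dict
    mission: dict
    decisions: Annotated[list, append_list]
    turn_log: Annotated[list, append_list]
```

Note that `merge_dicts` is shallow: an update for hex "02" replaces that hex's whole entry, so a node function should return the complete entry for the hex it touched. Keys without a reducer are overwritten by the latest update.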

Single Thread, All Units

One thread_id for the game. All 3 units share state. Merge reducers prevent cross-unit overwrites. SqliteSaver, the production checkpointer, serializes writes (fine for turn-based play); the prototype currently runs on MemorySaver.

Node Functions

36 stubs: each calls hermes -z (one-shot), produces output_summary, and calls interrupt() for player review
Factory pattern: make_node_function(hex_id) generates per-hex functions from NODE_DEFS
Interrupt options: accept, correct, skip, deepen
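The factory pattern described above might look roughly like the following. To keep the sketch runnable without Hermes or LangGraph installed, the agent call and the player prompt are passed in as plain callables: `run_agent` stands in for the `hermes -z` one-shot call and `ask_player` stands in for LangGraph's interrupt(). Those two parameter names, the `NODE_DEFS` shape, and the returned fields are assumptions, not the verified implementation on the Pi.

```python
from typing import Callable

# Hypothetical NODE_DEFS excerpt; the task text is copied from the node table.
NODE_DEFS = {
    "02": {
        "phase": "discover",
        "task": "nesting in games: case survey of games with minimal text input for complex state",
    },
}
INTERRUPT_OPTIONS = ("accept", "correct", "skip", "deepen")


def make_node_function(
    hex_id: str,
    run_agent: Callable[[str], str],
    ask_player: Callable[[dict], str],
) -> Callable[[dict], dict]:
    """Build the node function for one hex from its NODE_DEFS entry."""
    node_def = NODE_DEFS[hex_id]

    def node(state: dict) -> dict:
        # Prior player corrections for this hex are appended to the prompt.
        corrections = state.get("node_player_prompts", {}).get(hex_id, [])
        prompt = node_def["task"]
        if corrections:
            prompt += "\nPlayer corrections:\n" + "\n".join(f"- {c}" for c in corrections)
        output = run_agent(prompt)
        summary = {"header": node_def["task"], "body": output, "status": "awaiting_review"}
        # Pause for player review; in the real graph this would be interrupt().
        choice = ask_player({"hex": hex_id, "summary": summary, "options": list(INTERRUPT_OPTIONS)})
        if choice not in INTERRUPT_OPTIONS:
            raise ValueError(f"unexpected review choice: {choice!r}")
        return {"nodes": {hex_id: {"output_summary": summary, "review": choice}}}

    return node
```

Injecting the two callables also makes each node testable on a laptop with fakes, before it is wired to Hermes and the checkpointer on the Pi.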

Hex Adjacency

Computed from 6×6 flat-top hex grid geometry
Each hex connects to up to 6 neighbors
Hex 14 neighbors: 07, 08, 13, 15, 19, 20
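The adjacency computation can be sketched with offset coordinates. The numbering scheme is not stated in this spec, so the code below makes two assumptions: hexes are numbered column-major (hex id = column × 6 + row), and odd columns sit half a hex lower than even ones (the "odd-q" layout for flat-top grids). These were chosen because together they reproduce the neighbor list given above for hex 14; a row-major numbering with the same offsets does not.

```python
GRID_SIZE = 6

# Neighbor offsets as (d_col, d_row) for a flat-top grid in odd-q layout.
EVEN_COL_OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (0, 1)]
ODD_COL_OFFSETS = [(1, 1), (1, 0), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def neighbors(hex_id: int) -> list[int]:
    """Return the sorted ids of all in-bounds neighbors of a hex."""
    col, row = divmod(hex_id, GRID_SIZE)  # assumed column-major numbering
    offsets = ODD_COL_OFFSETS if col % 2 else EVEN_COL_OFFSETS
    result = []
    for d_col, d_row in offsets:
        n_col, n_row = col + d_col, row + d_row
        # Edge and corner hexes have fewer than 6 neighbors.
        if 0 <= n_col < GRID_SIZE and 0 <= n_row < GRID_SIZE:
            result.append(n_col * GRID_SIZE + n_row)
    return sorted(result)
```

Under these assumptions `neighbors(14)` returns `[7, 8, 13, 15, 19, 20]`, matching the list above, and the corner hex 00 has only two neighbors.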

Gating at Entry

Node function checks prerequisites on entry
If locked, returns status="locked" with missing prerequisites listed
RG client also prevents move selection to locked hexes (UI gating)

Fallback Chain

Qwen 3.6+ (Nous Portal OAuth — free, unlimited)
  ↓ if rate-limited/error
Kimi K2.6 (API key — already configured)
  ↓ if unavailable
MiniMax (API key — deferred)
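The chain above amounts to trying each provider in order and moving on when one fails. A minimal sketch follows, assuming each provider is wrapped in a callable that raises an exception on rate limiting or any other error; the function and exception names are hypothetical, and real code would likely catch narrower exception types.

```python
from typing import Callable, Sequence, Tuple


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return (name, output) on first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # rate limit, timeout, provider unavailable
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the output lets the turn log record which model actually answered, which helps when comparing outputs across fallbacks.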

ADVISORY

Poller/Stager

Recommended: inline interrupt() loop inside node function. 3-4 MC questions. No subgraph needed for prototype. Can be upgraded to subgraph later.

Curation Agent

Recommended: standalone endpoint/service, not a hex on the map. Keeps hex graph pure.

MemorySaver vs SqliteSaver

MemorySaver used for prototype (simpler, state lives as long as Python process). SqliteSaver for production (state survives process restarts).

Hermes Self-Skills

Hermes creates its own skills from within the harness. Only intervene externally if it fails twice. Do NOT pre-build skills.

Pi Surface Migration

Full 7-phase migration plan documented in wiki (q-01-migration-plan.md). Phases: backup → cleanup → install Bun → create container → directory conventions → OAuth/fallback → verify.

APPENDIX: Full Documentation Index

All design docs live in the wiki at /workcell/llm-wiki/wiki/ and the repo at ssh://mehdifarah@relik-pi4:/home/mehdifarah/git/langgraph-game-surface.git.

Document	Location
Infrastructure canonical config	`wiki/architecture/infrastructure-canonical-config.md`
Tailscale config audit	`wiki/architecture/tailscale-canonical-config.md`
Wiring probe results	`wiki/architecture/wiring-probe-results.md`
Q-01 migration plan (7 phases)	`wiki/decisions/q-01-migration-plan.md`
Q-01 storage layout (HDD/SD)	`wiki/decisions/q-01-storage-layout.md`
Q-01 concurrency assessment	`wiki/decisions/q-01-concurrency-assessment.md`
Q-01 + Q-02 resolved (Python path)	`wiki/decisions/q-01-q-02-resolved.md`
Python-over-Bun decision	`wiki/decisions/python-over-bun-decision.md`
Q-03 LangGraph spec + ambiguities	`wiki/research/q-03-langgraph-spec.md`
Q-03 memory scopes	`wiki/research/q-03-memory-scopes.md`
Q-07 mission spec	`wiki/research/q-07-mission-spec.md`
Menu system full spec	`wiki/concepts/menu-system-full-spec.md`
Node config catalog (12 dimensions)	`wiki/concepts/langgraph-node-config-catalog.md`
Soft mechanics (context/prompts/comms)	`wiki/concepts/langgraph-soft-mechanics.md`
Context curation pattern	`wiki/concepts/langgraph-context-curation-pattern.md`
Live prompt injection pattern	`wiki/concepts/langgraph-live-prompt-injection.md`
Output summary protocol	`wiki/concepts/output-summary-protocol.md`
Nested menu protocol	`wiki/concepts/nested-menu-protocol.md`
Poller/stager pattern	`wiki/concepts/poller-stager-pattern.md`
Per-unit turns (Wargame)	`wiki/concepts/per-unit-turns-wargame.md`
Omni config + constraints	`wiki/concepts/omni-config-and-constraints.md`
Engine evaluation (7 engines)	`wiki/architecture/langgraph-engine-evaluation.md`
Wargame/RG deployment feasibility	`wiki/architecture/wargame-engine-rg-deployment.md`
Asset pipeline (prototype)	`wiki/architecture/asset-pipeline-wargame.md`
Gamepad input patterns (research)	`wiki/research/gamepad-input-patterns.md`
Correction interview protocol	`wiki/research/correction-interview-protocol.md`
Wiki integration design	`wiki/research/wiki-integration-design.md`
Routing submenu design	`wiki/research/routing-submenu-design.md`
Edge case protocols	`wiki/research/edge-case-protocols.md`
Hermes interactive over SSH	`wiki/concepts/hermes-interactive-over-ssh.md`
Forgejo repo (git)	`ssh://mehdifarah@relik-pi4:/home/mehdifarah/git/langgraph-game-surface.git`

Spec Location on Pi

Wiki:  /workcell/llm-wiki/wiki/architecture/implementation-spec.md
       (inside d3-tui-pi-teams-proto Docker container)

Repo:  docs/IMPLEMENTATION_SPEC.md
       (in langgraph-game-surface.git, commit 8ee849a)

Desktop: ~/Desktop/IMPLEMENTATION_SPEC.md (this file)