Q-04: Work Shape / Lifecycle

Status: ANSWERED (revised with wild examples)
Agent: opencode/ext-agent
Timestamp UTC: 2026-05-11T02:40:00Z

Short Answer

Use a lightweight claim→research→plan→dispatch→implement→validate→close cycle, not a rigid 7-stage clock. Only two gates are hard: plan approval before implementation, and validation before close. Everything else is a lead-owned checklist. None of the projects surveyed below uses formal stage gates; the d3-tui-triad is already more structured than any of them.

Evidence from the Wild

Project	Stars	Agents	Stages	Gates
Claude Code Teams (Anthropic)	122k	Multi	0-1 (plan approval optional)	Optional
pi-teams (port of above)	91	Multi	0-1	Optional
SWE-agent / Mini-SWE (Princeton, NeurIPS 2024)	19k	Single	0 (agent loop)	None
OpenHands	73k	Single	0 (conversation)	None

Key findings from the wild:
- No surveyed project uses mandatory stage artifacts or gated pipelines.
- Mini-SWE-agent achieves 65% on SWE-bench in 100 lines of Python: minimalism wins.
- Claude Code teams track only three working states: pending, in_progress, completed (plus deleted for removed tasks).
- Plan approval is optional even in Claude Code teams (the ancestor of pi-teams).
- The d3-tui-triad T0-T7 lifecycle is more rigorous than any of the surveyed projects.

Quotes that matter:
- SWE-agent authors: "Most of our development effort is on mini-swe-agent, which has superseded SWE-agent. It matches the performance while being much simpler."
- Claude Code teams MCP: tasks have only pending/in_progress/completed/deleted states.
- OpenHands: no stage gates. The agent writes code, a human reviews, and they iterate.

Revised Recommendation: Two Hard Gates + Checklist

Hard Gate 1: Plan Before Implement (mechanical)

Mechanism: pi-teams plan approval mode
When: Builder-reviewer submits plan to lead inbox before editing files
Lead action: Approve or reject with feedback
Wild precedent: Claude Code team plan approval (identical pattern)

Hard Gate 2: Validate Before Close (mechanical)

Mechanism: Builder-reviewer runs make, reports result
When: After implementation, before closeout
Lead action: Review validation, decide close vs rework
Wild precedent: SWE-agent runs tests before submitting PR
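The two hard gates can be sketched as guarded state transitions. This is a minimal illustration, not the pi-teams implementation: the `Task` class, its state names, `GateError`, and `run_validation` are all hypothetical names invented here. The only details taken from the text are that the lead approves the plan before implementation and that a passing `make` run is required before close.

```python
import subprocess
from dataclasses import dataclass
from typing import Sequence


class GateError(RuntimeError):
    """Raised when a transition would bypass a hard gate."""


@dataclass
class Task:
    title: str
    state: str = "planned"           # planned -> implementing -> validated -> closed
    plan_approved: bool = False      # set by the lead (Hard Gate 1)
    validation_passed: bool = False  # set from the validation run (Hard Gate 2)

    def approve_plan(self) -> None:
        self.plan_approved = True

    def start_implementation(self) -> None:
        if not self.plan_approved:
            raise GateError("Hard Gate 1: plan not approved by lead")
        self.state = "implementing"

    def record_validation(self, passed: bool) -> None:
        if self.state != "implementing":
            raise GateError("validation is only meaningful after implementation starts")
        self.validation_passed = passed
        if passed:
            self.state = "validated"  # a failed run leaves the task in rework

    def close(self) -> None:
        if not self.validation_passed:
            raise GateError("Hard Gate 2: validation has not passed")
        self.state = "closed"


def run_validation(command: Sequence[str] = ("make",)) -> bool:
    """Run the validation command; a zero exit code counts as a pass."""
    return subprocess.run(list(command), check=False).returncode == 0
```

In use, the builder-reviewer would call `task.record_validation(run_validation())` and the lead would call `task.close()`. Both gates fail loudly instead of silently letting work through.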

Everything Else: Lead-Owned Checklist

☐ T0: Lead identifies issue, writes scope
☐ T1: Researcher reviews code/docs, writes findings
☐ T2: Lead drafts approach
☐ T3: Lead dispatches chunks
☐ → HARD GATE: Plan approval
☐ T4: Builder-reviewer implements
☐ T5: Builder-reviewer validates
☐ → HARD GATE: Validation review
☐ T6: Researcher reviews diff (optional)
☐ T7: Lead writes closeout

Checklist stages can run concurrently or be skipped; the two hard gates cannot. Trivial issues skip T1 (research). Simple fixes skip T6 (diff review).
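One way to represent the checklist is as plain data with a skip rule. This is only a sketch: `Item`, `new_checklist`, `mark`, and `open_items` are hypothetical names, and treating T1 and T6 as the only skippable items is an assumption drawn from the two examples in the text.

```python
from dataclasses import dataclass


@dataclass
class Item:
    code: str
    owner: str
    text: str
    optional: bool = False  # assumption: only T1 and T6 may be skipped
    status: str = "open"    # "open", "done", or "skipped"


def new_checklist() -> list[Item]:
    return [
        Item("T0", "lead", "identify issue, write scope"),
        Item("T1", "researcher", "review code/docs, write findings", optional=True),
        Item("T2", "lead", "draft approach"),
        Item("T3", "lead", "dispatch chunks"),
        Item("T4", "builder-reviewer", "implement"),
        Item("T5", "builder-reviewer", "validate"),
        Item("T6", "researcher", "review diff", optional=True),
        Item("T7", "lead", "write closeout"),
    ]


def mark(items: list[Item], code: str, status: str) -> None:
    """Mark an item done or skipped; mandatory items cannot be skipped."""
    if status not in ("done", "skipped"):
        raise ValueError(f"unknown status: {status}")
    item = next(i for i in items if i.code == code)
    if status == "skipped" and not item.optional:
        raise ValueError(f"{code} is not skippable")
    item.status = status


def open_items(items: list[Item]) -> list[str]:
    """Return the codes that are still open, in checklist order."""
    return [i.code for i in items if i.status == "open"]
```

Because items are marked independently, nothing here forces a strict order, which matches the intent that stages may overlap.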

Duration Caps

Phase	Target	Hard Cap
T0-T3 (pre-implementation)	20 min	45 min
T4 (implement)	30 min	60 min
T5-T7 (post-implementation)	15 min	30 min
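The table can be checked mechanically. The sketch below encodes the targets and hard caps as given; the `classify` function and its three labels are hypothetical, and the choice that a duration exactly at a limit still counts as within it is an assumption.

```python
from datetime import timedelta

# (target, hard cap) per phase, copied from the table above.
CAPS = {
    "T0-T3": (timedelta(minutes=20), timedelta(minutes=45)),
    "T4": (timedelta(minutes=30), timedelta(minutes=60)),
    "T5-T7": (timedelta(minutes=15), timedelta(minutes=30)),
}


def classify(phase: str, elapsed: timedelta) -> str:
    """Return 'on_target', 'over_target', or 'over_cap' for a phase duration."""
    target, hard_cap = CAPS[phase]
    if elapsed <= target:
        return "on_target"
    if elapsed <= hard_cap:
        return "over_target"
    return "over_cap"
```

For example, `classify("T4", timedelta(minutes=45))` returns `"over_target"`: past the 30-minute target but inside the 60-minute cap.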

Coordination: Inbox Messages, Not Formal Acknowledgements

Following the Claude Code teams pattern:
1. Lead creates task on pi-teams task board
2. Researcher checks board, researches, reports via inbox
3. Lead reviews, sends plan to builder-reviewer
4. Builder-reviewer submits plan (plan approval mode)
5. Lead approves → builder-reviewer implements
6. Builder-reviewer reports completion + validation
7. Lead reviews, closes task, writes closeout note

No formal "acknowledge" actions are needed; inbox messages are sufficient for a 3-agent team.
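To make the message flow concrete, here is a toy in-memory model of steps 2 to 6 of the pattern above. It does not use or imitate the pi-teams API: `Message`, `Inboxes`, `run_flow`, and the message kinds ("findings", "dispatch", "plan", "approval", "completion") are all hypothetical labels chosen for illustration.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    sender: str
    recipient: str
    kind: str  # hypothetical label, e.g. "findings" or "plan"
    body: str


class Inboxes:
    """One FIFO queue per role; reading a message is the only 'acknowledgement'."""

    def __init__(self) -> None:
        self._queues = defaultdict(deque)

    def send(self, sender: str, recipient: str, kind: str, body: str) -> None:
        self._queues[recipient].append(Message(sender, recipient, kind, body))

    def receive(self, recipient: str) -> Optional[Message]:
        queue = self._queues[recipient]
        return queue.popleft() if queue else None


def run_flow(box: Inboxes) -> list[str]:
    """Send and read each message in turn, returning a trace of the exchange."""
    script = [
        ("researcher", "lead", "findings", "notes on the affected module"),
        ("lead", "builder-reviewer", "dispatch", "approach and chunk to work on"),
        ("builder-reviewer", "lead", "plan", "files to edit and test strategy"),
        ("lead", "builder-reviewer", "approval", "approved, go ahead"),
        ("builder-reviewer", "lead", "completion", "make passed"),
    ]
    trace = []
    for sender, recipient, kind, body in script:
        box.send(sender, recipient, kind, body)
        message = box.receive(recipient)
        trace.append(f"{message.sender}->{message.recipient}:{message.kind}")
    return trace
```

Task creation (step 1) and closeout (step 7) happen on the task board and in the lead's notes, so they are left out of the message trace.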

Role Mapping

Role	Checklist Items	Gates
lead	T0, T2, T3, T7	Both hard gates
researcher	T1, T6 (advisory)	None
builder-reviewer	T4, T5	Must pass validation gate
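The role table can also be kept as data, so that ownership questions become lookups. This is a sketch: `ROLE_ITEMS`, `GATE_DECIDERS`, `owner_of`, and `may_decide_gate` are hypothetical names, and reading "Both hard gates" as "only the lead decides a gate" is an interpretation of the table.

```python
# Checklist items per role, copied from the table above.
ROLE_ITEMS = {
    "lead": {"T0", "T2", "T3", "T7"},
    "researcher": {"T1", "T6"},       # advisory only
    "builder-reviewer": {"T4", "T5"},
}

# Assumption: only the lead approves plans and reviews validation.
GATE_DECIDERS = {"lead"}


def owner_of(item: str) -> str:
    """Return the single role that owns a checklist item."""
    owners = [role for role, items in ROLE_ITEMS.items() if item in items]
    if len(owners) != 1:
        raise ValueError(f"{item} has {len(owners)} owners, expected exactly 1")
    return owners[0]


def may_decide_gate(role: str) -> bool:
    return role in GATE_DECIDERS
```

Calling `owner_of` for each of T0 through T7 doubles as a consistency check that no item is unowned or owned twice.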

Risks / Failure Modes

Over-engineering trap: The original 7-gate design with mandatory artifacts at each stage is more complex than any of the surveyed projects. The risk is that agents spend more time on lifecycle paperwork than on coding.
Plan approval bottleneck: If the lead is slow to approve, the builder-reviewer sits idle. Mitigation: 10-minute approval timeout, then the builder-reviewer escalates to a human.
Checklist fatigue: Even with only 2 hard gates, the T0-T7 checklist may still feel heavy. Monitor after 3 tasks.
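The approval-timeout mitigation reduces to a small decision function. The 10-minute value comes from the text; the function name `next_action` and its three return labels are hypothetical, and escalating exactly at the 10-minute mark (not strictly after it) is an assumption.

```python
from datetime import datetime, timedelta, timezone

APPROVAL_TIMEOUT = timedelta(minutes=10)


def next_action(submitted_at: datetime, now: datetime, approved: bool) -> str:
    """Decide what the builder-reviewer does while a plan awaits approval."""
    if approved:
        return "implement"
    if now - submitted_at >= APPROVAL_TIMEOUT:
        return "escalate_to_human"
    return "wait"


# Example: a plan submitted five minutes ago and not yet approved.
submitted = datetime(2026, 5, 11, 2, 40, tzinfo=timezone.utc)
print(next_action(submitted, submitted + timedelta(minutes=5), approved=False))
```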

Schedule Adherence Test

After the first 3 real tasks, check: are agents hitting the 20/30/15 minute targets? If consistently over, reassess caps or simplify further.
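The adherence check could be run as below. This sketch assumes "consistently over" means over target in every sampled task, which the text does not define. The function name is hypothetical, and the sample durations are made up purely to show the call shape.

```python
# Target minutes per phase (the 20/30/15 figures above).
TARGET_MINUTES = {"T0-T3": 20, "T4": 30, "T5-T7": 15}


def phases_consistently_over(tasks: list[dict[str, float]]) -> list[str]:
    """Return phases whose duration exceeded the target in every task."""
    return [
        phase
        for phase, target in TARGET_MINUTES.items()
        if tasks and all(task[phase] > target for task in tasks)
    ]


# Made-up durations (minutes) for three tasks, for illustration only.
sample = [
    {"T0-T3": 18, "T4": 35, "T5-T7": 10},
    {"T0-T3": 25, "T4": 40, "T5-T7": 12},
    {"T0-T3": 19, "T4": 31, "T5-T7": 14},
]
print(phases_consistently_over(sample))
```

A phase that appears in the result is a candidate for a larger cap or a simpler process; a phase that is over in only one task is treated as noise.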

Decisions Needed From Mehdi

Adopt "two hard gates + checklist" model? (Recommended: yes. Matches wild precedent, still provides structure.)
Enable plan approval mode? (Recommended: yes — consistent with Q-01 and Q-02 recommendations.)