
title: 'Gather — "What do we NOT know?"'
source: "tasks/TFW-42__research_cycle_restructure/research/gather.md"


Gather — "What do we NOT know?"

Parent: HL-TFW-42

Goal: Determine how TFW should present multi-agent research orchestration to users via iterations.yaml.

Dimensions

| Dimension | Alt A | Alt B | Alt C | Alt D |
|---|---|---|---|---|
| D1: Agent specification | Free-text name only (agent: antigravity) | Structured profile (agent: {name, strengths, model}) | Enum from registry (agent: antigravity validated against .tfw/agents.yaml) | No agent field (implicit from context) |
| D2: Iteration dependency | Linear chain (depends_on: [iter-1]) | DAG (arbitrary depends_on graph) | None (pure sequential by number) | Implicit (each reads predecessor RES) |
| D3: Framework guidance | Document-only (conventions describe multi-agent as possible) | Prompt-in-template (iterations.yaml template includes agent: with comment guidance) | Active suggestion (plan.md Step 6b prompts coordinator to consider agent assignment) | Auto-detect (framework infers best agent from iteration focus) |
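
To make the D2 alternatives concrete, here is a minimal sketch of how an arbitrary depends_on graph (Alt B) could be resolved into an execution order. Only the depends_on idea comes from the table; the iteration ids, the dict shape, and the `execution_order` helper are hypothetical. It uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical iterations keyed by id; each value lists the ids it depends on.
# A linear chain (Alt A) is the special case where every entry has at most
# one predecessor.
iterations = {
    "iter-1": [],
    "iter-2": ["iter-1"],
    "iter-3": ["iter-1"],
    "iter-4": ["iter-2", "iter-3"],
}


def execution_order(deps: dict[str, list[str]]) -> list[str]:
    """Return one valid order in which every iteration follows its dependencies."""
    # TopologicalSorter raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(deps).static_order())


order = execution_order(iterations)
print(order)
```

The relative order of iter-2 and iter-3 is not fixed by the graph, which is exactly the extra freedom (and extra complexity) that a DAG adds over a linear chain.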

Findings

G1: Multi-agent framework landscape (external research)

Surveyed 4 major multi-agent frameworks (CrewAI, AutoGen, LangGraph, MetaGPT) + industry methodology patterns.

Key patterns discovered:

| Framework | Agent specification | Task-agent binding | Config format | Orchestration |
|---|---|---|---|---|
| CrewAI | YAML: role, goal, backstory, allow_delegation | agent: field on task YAML | Separate agents.yaml + tasks.yaml | Sequential or Hierarchical (code) |
| AutoGen | Python: name, description, llm_config | speaker_selection_method (auto/manual/custom) | Code + JSON for LLM config | GroupChat manager selects dynamically |
| LangGraph | Python: node functions | Conditional edges (routing functions) | Code (state machine) | Graph-based with state schema |
| MetaGPT | YAML: role-specific LLM config | _watch / _act cycle (message-passing) | config2.yaml for LLM, Python for structure | SOP (Standard Operating Procedure) |

Critical distinction: All frameworks separate agent DEFINITION (who the agent is) from task ASSIGNMENT (which task the agent works on). CrewAI is the closest analogue to TFW's model — YAML config, explicit agent assignment per task.

TFW's position: TFW doesn't need agent definition (agents are external tools — Antigravity, Claude Code, Codex CLI). TFW only needs task assignment — which iteration gets which agent. This is simpler than all surveyed frameworks.

G2: AFD-2 empirical patterns (internal analysis)

From HL §2 and iterations.yaml focus field:

| Iteration | Agent | Focus | Why this agent |
|---|---|---|---|
| 1 | Antigravity | Initial investigation (web research + domain analysis) | Best at web search, multi-source synthesis, large context |
| 2 | Codex CLI | Code audit (Gradle structure analysis) | Fast file traversal, code-native, can run commands |
| 3 | Codex CLI | Deeper code audit (module dependency mapping) | Continuation of iter 2, same tool strengths |
| 4 | Antigravity | Architecture synthesis (26-module design) | Required web research (Gradle best practices) + large synthesis |
| 5 | Claude Code | Server reconnaissance (production infra audit) | MCP server access, interactive SSH, real-time queries |
| 6 | Antigravity | Domain modeling (port-interface contracts) | Large context window needed for cross-module design |
| 7 | Antigravity | Build-logic research (convention plugins) | Web research on Gradle conventions + synthesis |
| 8 | Antigravity | Final synthesis (implementation strategy) | Cross-iteration synthesis requiring full context |

Patterns observed:

1. Agent choice correlates with iteration TYPE, not with sequential position
2. Three agent archetypes emerge: Web Researcher (Antigravity), Code Auditor (Codex), Infra Operator (Claude Code)
3. Consecutive iterations with same agent (2→3, 6→7→8) suggest "continuation" is a natural pattern
4. Agent switching happened at domain boundaries, not arbitrary points
5. No iteration used more than one agent: agent is per-iteration, not per-stage
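
The continuation pattern can be checked mechanically against the AFD-2 table above. The sketch below is illustrative only: the agent sequence is copied from that table, while the `agent_runs` helper is a hypothetical name introduced here.

```python
from itertools import groupby

# Agent per iteration, copied from the AFD-2 table (iterations 1..8).
agents = [
    "Antigravity", "Codex CLI", "Codex CLI", "Antigravity",
    "Claude Code", "Antigravity", "Antigravity", "Antigravity",
]


def agent_runs(seq):
    """Group consecutive iterations that share an agent.

    Returns a list of (agent, [iteration numbers]) tuples, 1-indexed.
    """
    runs = []
    position = 1
    for agent, group in groupby(seq):
        length = len(list(group))
        runs.append((agent, list(range(position, position + length))))
        position += length
    return runs


runs = agent_runs(agents)
# Runs longer than one iteration are the "continuations".
continuations = [nums for _, nums in runs if len(nums) > 1]
# Every boundary between two runs is an agent switch.
switches = len(runs) - 1

print(continuations)  # [[2, 3], [6, 7, 8]]
print(switches)       # 4
```

The two continuation runs match the 2→3 and 6→7→8 sequences in pattern 3, and the four switches are the domain boundaries referred to in pattern 4.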

G3: TFW coordinator workflow analysis (internal)

Current plan.md Step 6b creates iterations.yaml. The coordinator decides:

- How many iterations
- What each iteration investigates (focus + hypotheses)
- Status tracking

Missing from current workflow:

- No prompt to consider agent assignment
- No guidance on WHEN different agents are useful
- No mechanism to express "this iteration needs web research" vs "this needs code audit"

The agent field in HL-TFW-42's proposed schema is a string — free text. This matches AFD-2's actual usage where agents were named informally.

G4: Industry methodology patterns (external research)

Supervisor-Worker is the dominant pattern for high-stakes multi-agent research:

- Central coordinator decomposes work
- Specialized workers execute subtasks
- Results flow back to coordinator for synthesis

This maps directly to TFW's model: Coordinator (plan.md) → Researcher (research/base.md) → back to Coordinator.

Human-in-the-loop patterns:

- "Pause points" / structured gates: TFW already has these (🛑 WAIT gates)
- "Progressive autonomy": TFW supports via CL/AG modes
- "Role-based specialization": TFW has Role Lock

Key insight from methodology research: The industry consensus is that agent selection should be coordinator-driven with guidance, not automated. Auto-detection requires capability models that don't exist yet for AI coding tools. Best practice: provide a decision framework, let the human/coordinator choose.
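
One way to read "coordinator-driven with guidance" is a lookup that only returns a hint and never assigns an agent itself. The sketch below assumes this reading. The archetypes and agents come from G2; the iteration-kind labels, the agent spellings, and the `suggest_agent` helper are hypothetical.

```python
from typing import Optional

# Hypothetical guidance table: iteration kind -> (archetype, example agent).
# Archetypes and agents are taken from G2; the kind labels are invented here.
GUIDANCE = {
    "web_research": ("Web Researcher", "antigravity"),
    "code_audit": ("Code Auditor", "codex-cli"),
    "infra_audit": ("Infra Operator", "claude-code"),
}


def suggest_agent(kind: str) -> Optional[str]:
    """Return a hint for the coordinator, or None when no guidance exists."""
    entry = GUIDANCE.get(kind)
    return entry[1] if entry else None


# The hint is advisory; the coordinator records the final decision separately.
hint = suggest_agent("code_audit")
final_agent = "claude-code"  # coordinator's choice, free to differ from the hint
unknown = suggest_agent("domain_modeling")  # no guidance available for this kind
```

Because the function never writes the agent field, it stays on the guidance side of the line and does not become the auto-detection that the methodology research advises against.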

G5: Counter-evidence — when agent field adds overhead (deep mode requirement)

Counter-argument: Most TFW projects use a single agent. Adding an agent field creates noise for the common case.

Evidence for counter:

- Of all TFW-42's predecessor tasks (TFW-2 through TFW-41), ALL used a single agent (Antigravity or Claude Code)
- Multi-agent research is observed only in AFD-2 (a large, multi-domain project)
- Single-agent projects would write agent: antigravity on every iteration, which is pure noise

Mitigation found in HL §7 P5: "Optional enrichment — new iterations.yaml fields are optional." This principle already addresses the counter-argument. The field exists for those who need it and doesn't burden those who don't.
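
A minimal sketch of what "optional" could mean for a consumer of iterations.yaml, assuming the entries have already been parsed into plain dicts. The focus and agent fields are from the source; the `id` key, the agent spellings, and the `Iteration` / `load_iterations` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Iteration:
    id: int
    focus: str
    # Optional enrichment: single-agent projects simply omit the field.
    agent: Optional[str] = None


def load_iterations(raw: list[dict]) -> list[Iteration]:
    """Build Iteration objects from already-parsed YAML entries (plain dicts)."""
    return [
        Iteration(id=entry["id"], focus=entry["focus"], agent=entry.get("agent"))
        for entry in raw
    ]


# Common case: no agent field anywhere, nothing extra to write.
single_agent = load_iterations([
    {"id": 1, "focus": "Initial investigation"},
    {"id": 2, "focus": "Code audit"},
])

# Multi-agent case: the field appears only where the coordinator sets it.
multi_agent = load_iterations([
    {"id": 1, "focus": "Initial investigation", "agent": "antigravity"},
    {"id": 2, "focus": "Code audit", "agent": "codex-cli"},
])
```

With a default of None, existing single-agent files stay valid without modification, which is the property P5 relies on.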

Checkpoint

| Found | Remaining |
|---|---|
| 3 dimensions identified with 4 alternatives each | None (space well-mapped) |
| 4 external frameworks analyzed + 1 methodology survey | None |
| AFD-2 empirical data mapped (8 iterations, 3 agent archetypes) | None |
| Counter-evidence found and addressed (single-agent overhead) | None |

Sufficiency:

- [x] External source used? (5 web searches: CrewAI, AutoGen, LangGraph, MetaGPT, methodology patterns)
- [x] Briefing gap closed? (H1 evidence gathered from both external and internal sources)
- [x] Dimensions identified? (3 dimensions × 4 alternatives each)

Deep mode criteria:

- [x] Hypothesis tested? (H1: agent field in iterations.yaml vs separate mechanism; evidence gathered)
- [x] Counter-evidence sought? (G5: single-agent overhead)

Metacognitive check: I discovered something NEW — the distinction between agent definition and task assignment. TFW doesn't need agent profiles (unlike CrewAI/MetaGPT) because agents are external tools. This simplifies the design significantly. I also discovered the 3-archetype pattern from AFD-2 data, which is actionable for framework guidance.

Stage complete: YES → User decision: proceed (autonomous mode)