title: "Gather — "What do we NOT know?" (Iteration 2)" source: "tasks/TFW-42__research_cycle_restructure/research2/gather.md"

Gather: "What do we NOT know?" (Iteration 2)

Parent: HL-TFW-42
Goal: Map agent capabilities for research subtasks and formalize how TFW guides coordinators in agent selection.

Dimensions

Dimension	Alt A	Alt B	Alt C
D1: Guidance specificity	Generic archetypes (Web Researcher / Code Auditor / Infra Operator)	Tool-specific recommendations (Claude Code for X, Codex for Y)	Capability-based (needs web search? → tool with web search)
D2: Guidance location	Comment in iterations.yaml template	Table in conventions.md (reference)	Prompt in plan.md Step 6b (workflow)
D3: Decision model	Human decides, no recommendation	Human decides, framework provides decision table	Framework suggests based on focus keywords

Findings

G1: AI coding tool capability matrix (external research, 4 web searches)

Built from external research on 5 major AI coding tools:

Capability	Antigravity	Claude Code	Codex CLI	Cursor	Aider
Web search	✅ Native (Google)	✅ Native WebSearch/WebFetch + MCP	❌ Sandboxed (no internet)	⚠️ Limited (doc lookup)	❌ None
MCP integration	✅ Full (Google Sheets, ClickHouse, etc.)	✅ Full (extensible)	❌ None	⚠️ Partial	❌ None
Context window	~1M tokens (Gemini)	200K standard, ~1M preview	~128K (o3/o4-mini)	Varies by model	Varies
File traversal	✅ Project-wide	✅ Project-wide	✅ Sandboxed clone	✅ IDE-native	✅ Git-aware
Shell commands	✅ Via run_command	✅ Native terminal	✅ Sandboxed	⚠️ Integrated terminal	✅ Native
Browser automation	✅ Built-in	⚠️ Via MCP only	❌ None	❌ None	❌ None
Multi-file editing	✅ Structured tools	✅ Native	✅ Native	✅ Composer mode	✅ Native
Image generation	✅ Built-in	❌ None	❌ None	❌ None	❌ None
Parallel agents	✅ Agent Manager	❌ Sequential	✅ Background tasks	❌ Sequential	❌ Sequential
Async (fire-and-forget)	❌ Interactive	❌ Interactive	✅ Cloud sandbox, PR output	❌ Interactive	❌ Interactive
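
To make the matrix queryable, it can be held as a small data structure. The sketch below is illustrative only: it encodes a subset of the rows above, and the capability keys (`web_search`, `mcp_integration`, and so on) and the `tools_with` helper are hypothetical names, not part of TFW.

```python
from enum import Enum


class Support(Enum):
    YES = "yes"          # ✅ in the matrix
    PARTIAL = "partial"  # ⚠️ in the matrix
    NO = "no"            # ❌ in the matrix


Y, P, N = Support.YES, Support.PARTIAL, Support.NO

TOOLS = ["Antigravity", "Claude Code", "Codex CLI", "Cursor", "Aider"]

# Subset of the G1 matrix. Keys are hypothetical identifiers for the rows.
CAPABILITIES = {
    "web_search":      dict(zip(TOOLS, [Y, Y, N, P, N])),
    "mcp_integration": dict(zip(TOOLS, [Y, Y, N, P, N])),
    "shell_commands":  dict(zip(TOOLS, [Y, Y, Y, P, Y])),
    "parallel_agents": dict(zip(TOOLS, [Y, N, Y, N, N])),
    "async_execution": dict(zip(TOOLS, [N, N, Y, N, N])),
}


def tools_with(required, allow_partial=False):
    """Return tools (in matrix column order) that support every required capability."""
    accepted = {Support.YES, Support.PARTIAL} if allow_partial else {Support.YES}
    return [
        tool
        for tool in TOOLS
        if all(CAPABILITIES[cap][tool] in accepted for cap in required)
    ]


print(tools_with(["web_search", "mcp_integration"]))
print(tools_with(["web_search"], allow_partial=True))
```

Treating partial support as a separate state, rather than rounding it to yes or no, keeps the "Limited" and "Partial" cells of the matrix visible to whoever reads the result.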

G2: Research subtask type mapping

What kinds of work happen within TFW research iterations? Mapped from AFD-2 + TFW-42 iter 1:

Research subtask	Description	Key capability needed
Web research	Searching external sources, documentation, best practices	Web search, large context for synthesis
Code audit	Analyzing existing codebase structure, dependencies, patterns	File traversal, shell commands, fast navigation
Architecture synthesis	Designing systems from gathered findings	Large context window, multi-source integration
Infra reconnaissance	Querying live servers, databases, APIs	MCP integration, shell access, interactive sessions
Competitive analysis	Comparing external tools, frameworks, approaches	Web search, structured comparison
Data analysis	Querying databases, analyzing datasets	MCP (ClickHouse, PostgreSQL), data tools
Document review	Reading and synthesizing existing project artifacts	File traversal, large context
Prototype validation	Building small proofs-of-concept	Shell commands, file editing, test execution

G3: Subtask-to-tool mapping (combining G1 + G2)

Research subtask	Best-fit tools	Why
Web research	Antigravity, Claude Code	Both have native web search. Antigravity has larger context for synthesis
Code audit	Codex CLI, Claude Code, Cursor	Fast file traversal + shell. Codex excels at autonomous code analysis
Architecture synthesis	Antigravity, Claude Code	Large context needed. Antigravity edge: ~1M tokens
Infra reconnaissance	Claude Code, Antigravity	MCP integration for live server access. Claude Code = terminal-native
Competitive analysis	Antigravity, Claude Code	Web search + structured output
Data analysis	Antigravity, Claude Code	MCP servers (ClickHouse, PostgreSQL). Antigravity has Google Sheets MCP
Document review	Any tool	All can read files. Context window matters for large doc sets
Prototype validation	Codex CLI, Claude Code, Cursor	Shell commands + test execution. Codex = async background

Key observation: No tool is universally best. The choice depends on the iteration's PRIMARY subtask type. Most iterations involve 2-3 subtask types, so the coordinator picks the tool that fits the dominant one.
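
The "pick by dominant subtask" rule can be sketched in a few lines. This is a minimal illustration, not a TFW feature: the `pick_tools` function and the idea of giving each subtask a numeric effort estimate are assumptions made for the example, while the tool lists are copied from the G3 table.

```python
# G3 table: research subtask -> best-fit tools, in the order listed there.
BEST_FIT = {
    "web research": ["Antigravity", "Claude Code"],
    "code audit": ["Codex CLI", "Claude Code", "Cursor"],
    "architecture synthesis": ["Antigravity", "Claude Code"],
    "infra reconnaissance": ["Claude Code", "Antigravity"],
    "competitive analysis": ["Antigravity", "Claude Code"],
    "data analysis": ["Antigravity", "Claude Code"],
    # G3 says "Any tool" for document review; all five G1 tools are listed here.
    "document review": ["Antigravity", "Claude Code", "Codex CLI", "Cursor", "Aider"],
    "prototype validation": ["Codex CLI", "Claude Code", "Cursor"],
}


def pick_tools(effort_by_subtask):
    """Return (dominant subtask, best-fit tools) for one iteration.

    effort_by_subtask maps subtask names to a rough effort estimate
    (hypothetical input; any comparable numbers work). On a tie, the
    subtask that appears first in the mapping wins.
    """
    if not effort_by_subtask:
        raise ValueError("an iteration needs at least one subtask type")
    dominant = max(effort_by_subtask, key=effort_by_subtask.get)
    return dominant, BEST_FIT[dominant]


iteration = {"web research": 0.6, "code audit": 0.3, "document review": 0.1}
print(pick_tools(iteration))
```

The result is a suggestion only; the coordinator still makes the final choice.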

G4: Where should guidance live? Analysis of TFW locations

Location	Pros	Cons	Maintenance burden
Comment in iterations.yaml template	Visible at decision point. No extra file to read.	Limited space. Can't include full table.	Low — update template only
Table in conventions.md	Authoritative reference. Full detail.	Not visible at decision point. Coordinator must remember to check.	Medium — update when tools evolve
Prompt in plan.md Step 6b	Active guidance. Forces coordinator to consider.	Adds workflow weight. May feel prescriptive.	Medium — update workflow
Separate reference doc	Full space for detailed guidance.	New file to maintain. Low discoverability.	High — easily forgotten

G5: Tool-agnostic vs tool-specific guidance tension

TFW's core principle: tool-agnostic (.tfw/ works with any AI tool). Naming specific tools (Claude Code, Codex CLI) would:
- Break tool-agnosticism
- Become stale as tools evolve (new tools appear, old tools gain capabilities)
- Bind TFW to a specific ecosystem

Counter-argument: Generic archetypes ("Web Researcher") are too vague. Coordinators need concrete examples to act.

Resolution pattern found: Use a two-tier approach:
1. Conventions define CAPABILITY CATEGORIES (tool-agnostic): "web search", "code audit", "MCP integration"
2. Project-level config or KNOWLEDGE.md maps capabilities to SPECIFIC TOOLS (project-specific): "Antigravity = web search + MCP + large context"

This mirrors how TFW handles other tool-specific data (e.g., project_config.yaml has build commands, not tool names in conventions).
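
A rough sketch of how the two tiers could fit together, under stated assumptions: `REQUIRED_CAPABILITIES`, `PROJECT_TOOLS`, and `resolve` are hypothetical names, the subtask requirements are a simplified reading of G2, and only the Antigravity entry comes from the example above (the Codex CLI entry is an illustrative guess based on G1). In practice the second dictionary would be loaded from project-level config rather than hard-coded.

```python
# Tier 1 (conventions, tool-agnostic): capabilities each subtask needs.
# No tool names appear at this level.
REQUIRED_CAPABILITIES = {
    "web research": {"web search", "large context"},
    "code audit": {"file traversal", "shell commands"},
    "infra reconnaissance": {"MCP integration", "shell commands"},
}

# Tier 2 (project-level, tool-specific): what each available tool provides.
PROJECT_TOOLS = {
    "Antigravity": {"web search", "MCP integration", "large context"},
    "Codex CLI": {"file traversal", "shell commands"},  # assumed entry
}


def resolve(subtask, project_tools):
    """Return the project's tools that cover every capability the subtask needs."""
    needed = REQUIRED_CAPABILITIES[subtask]
    return sorted(
        tool for tool, provided in project_tools.items() if needed <= provided
    )


print(resolve("web research", PROJECT_TOOLS))
print(resolve("code audit", PROJECT_TOOLS))
print(resolve("infra reconnaissance", PROJECT_TOOLS))
```

An empty result means no single configured tool covers the subtask, which is a prompt for the coordinator to decide, not an error. Because tool names live only in the second tier, adding or retiring a tool changes project config and leaves the conventions untouched.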

G6: Counter-evidence: does agent guidance add value? (deep mode)

Scenario: experienced coordinator. Already knows which tool to use. Guidance = noise.
- AFD-2 coordinator chose agents intuitively based on experience. No guidance framework needed.
- Counter: guidance helps NEW coordinators or multi-person teams where not everyone knows all tools.

Scenario: single-tool user. Has only Claude Code. Guidance about tool selection = irrelevant.
- This is the MAJORITY case among TFW users.
- Counter: guidance still serves as inspiration ("maybe I should try another tool for this iteration").

Verdict: Guidance should be LIGHT and OPTIONAL. A brief comment in the template + a reference to a capability table. Never prescriptive, never required.

Checkpoint

Found	Remaining
5-tool capability matrix with 10 capabilities each	None
8 research subtask types mapped to best-fit tools	None
4 guidance locations analyzed with pros/cons	None
Tool-agnostic resolution: capability categories + project-level mapping	None

Sufficiency:
- [x] External source used? (4 web searches on tool capabilities)
- [x] Briefing gap closed? (capability mapping + formalization location answered)
- [x] Dimensions identified? (3 dimensions × 3 alternatives)

Deep mode criteria:
- [x] Hypothesis tested? (H1 extended: agent field works, guidance formalization explored)
- [x] Counter-evidence sought? (G6: experienced coordinator, single-tool user)

Metacognitive check: NEW discovery — the two-tier resolution (capability categories in conventions vs specific tools in project config). This preserves TFW's tool-agnosticism while giving concrete guidance. Also discovered that research subtask types map cleanly to tool capabilities, making the guidance table actionable.

Stage complete: YES → User decision: proceed (autonomous mode)