title: "Extract — "What do we NOT see?"" source: "tasks/TFW-29__consistency_audit/research/extract.md"
Extract — "What do we NOT see?"
Parent: HL-TFW-29

Goal: Reference files (conventions.md, glossary.md) and 11 workflows are free from redundancy, so agents load minimum tokens for maximum signal.
Findings
E1: Section Usage Matrix — conventions.md Sections × Workflows
Legend: R = reads/references at runtime, D = duplicates content from it, — = ignores
| § | Section | plan | research | handoff | review | resume | docs | knowledge | release | update | config | init |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Purpose | — | — | — | — | — | — | — | — | — | — | — |
| 2 | Required Artifacts | — | — | — | — | — | — | — | — | — | — | — |
| 3 | Artifact Types | — | — | — | — | — | — | — | — | — | — | — |
| 4 | Task Numbering | R | — | — | — | — | — | — | — | — | — | — |
| 5 | Task Statuses | R | — | — | — | — | — | — | — | — | — | — |
| 6 | Scope Budgets | R | — | — | — | — | — | — | — | — | R | — |
| 7 | Execution Modes | — | — | — | — | — | — | — | — | — | — | — |
| 8 | Workflows | — | — | — | — | — | — | — | — | — | — | — |
| 9 | Tool Adapter | — | — | — | — | — | — | — | — | R | — | R |
| 10 | Context Loading | R | R | D | D | — | — | — | — | — | — | — |
| 10.1 | Fact Categories | — | — | — | — | — | — | R | — | — | — | — |
| 10.2 | Knowledge Infra | — | — | — | — | — | — | — | — | — | — | — |
| 11 | Quality Standard | — | — | — | — | — | — | — | — | — | R | — |
| 12 | Safety | — | — | — | — | — | — | — | — | — | — | — |
| 13 | Trace Discipline | — | — | — | — | — | — | — | — | — | — | — |
| 14 | Anti-patterns | R | R | D | D | — | — | — | — | — | — | — |
| 15 | Role Lock | R | — | — | — | — | — | — | — | — | — | — |
| 16 | Compilable Contract | — | — | — | — | — | — | — | — | — | — | — |
| 16.1 | Source Manifest | — | — | — | — | — | — | — | — | — | — | — |
| 16.2 | Reference Format | — | — | — | — | — | — | — | — | — | — | — |
| 16.3 | Frontmatter | — | — | — | — | — | — | — | — | — | — | — |
| 16.4 | Output Nav | — | — | — | — | — | — | — | — | — | — | — |
Dead sections (never R or D by any workflow at runtime):
- §1 Purpose
- §2 Required Artifacts (list of files; onboarding/init reference only)
- §3 Artifact Types (loaded as background knowledge, never explicitly referenced)
- §7 Execution Modes (only referenced via AGENTS.md, not via conventions.md)
- §8 Workflows (table, never used at runtime)
- §10.2 Knowledge Infrastructure (table, knowledge.md reads the files directly)
- §12 Safety
- §13 Trace Discipline
- §16 entire block (all 4 sub-sections)

Active sections (referenced at least once):
- §4, §5, §6, §9, §10, §10.1, §11, §14, §15
That's 9 active sections out of the 22 rows in the matrix (counting sub-sections separately), so 13 rows, about 59% of conventions.md sections, are never read by a workflow at runtime.
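The active/dead split can be re-derived mechanically from the matrix. The sketch below is a minimal Python illustration, not part of the audit tooling: `ALL_SECTIONS` and `USAGE` are hand-transcribed from the table above (only cells marked R or D are listed), the helper names are hypothetical, and it assumes each row, including sub-sections such as 10.1 and 16.1, counts as one section.

```python
# Hand-transcribed from the Section Usage Matrix; all names are illustrative.
ALL_SECTIONS = [
    "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "10.1", "10.2",
    "11", "12", "13", "14", "15", "16", "16.1", "16.2", "16.3", "16.4",
]

# Only cells marked R (reads at runtime) or D (duplicates) are recorded.
USAGE = {
    "4": {"plan": "R"},
    "5": {"plan": "R"},
    "6": {"plan": "R", "config": "R"},
    "9": {"update": "R", "init": "R"},
    "10": {"plan": "R", "research": "R", "handoff": "D", "review": "D"},
    "10.1": {"knowledge": "R"},
    "11": {"config": "R"},
    "14": {"plan": "R", "research": "R", "handoff": "D", "review": "D"},
    "15": {"plan": "R"},
}


def split_sections(all_sections, usage):
    """Return (active, dead): sections with at least one R/D mark, and the rest."""
    active = [s for s in all_sections if usage.get(s)]
    dead = [s for s in all_sections if not usage.get(s)]
    return active, dead


def duplicating_workflows(usage):
    """Return {workflow: [sections]} for every cell marked D."""
    result = {}
    for section, marks in usage.items():
        for workflow, mark in marks.items():
            if mark == "D":
                result.setdefault(workflow, []).append(section)
    return result


active, dead = split_sections(ALL_SECTIONS, USAGE)
print("active:", ", ".join(active))
print("dead:", ", ".join(dead))
print("duplicated by:", duplicating_workflows(USAGE))
```

Changing the counting convention (for example, treating §16 and its sub-sections as a single entry) changes the totals, so any ratio should be quoted together with the convention used.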
E2: Glossary Dependency Map — Term Categories
Based on G4 data, glossary terms fall into 3 clear tiers:
Tier 1: Unique terms (15). MUST stay in glossary; no duplicate elsewhere:
- Stage (Research), Pass (Research), Read-only AG
- Phase, Scope Budget
- Fact Candidate, Strategic Insight, Topic File, Knowledge Gate, Consolidation
- Config Sync Registry
- Roles (User, Coordinator, Researcher, Executor, Reviewer): 5 role definitions
- Tool Adapter

Tier 2: Duplicated terms (10). conventions.md has the SoT; the glossary duplicates it:
- HL, RES, TS, RF, ONB, REVIEW (full definitions; conventions §3 is authoritative)
- CL/AG modes (conventions §7 is authoritative)
- Task Naming (conventions §4 is authoritative)
- Status Flow (conventions §5 is authoritative)
- TECH_DEBT.md, KNOWLEDGE.md (conventions §2 is authoritative)

Tier 3: Meta/unused terms (6). Never referenced by any workflow at runtime:
- Concept Taxonomy (nice-to-have for humans, agents never read it)
- Workflow (canonical): self-referential; workflows don't need a definition of "workflow"
- .tfw/ Directory: likewise meta
- Execution Engine, Progress Reporting: config-level, never used
- tfw-init, tfw-release, tfw-update workflow descriptions: workflows self-describe, so these are redundant
Compression plan:

| Tier | Current lines | Proposed | Savings |
|------|--------------|----------|---------|
| Tier 1 (unique) | ~85 lines | Keep as-is | 0 |
| Tier 2 (duplicated) | ~60 lines | 1-liner + "→ conventions.md §N" per item | ~45 lines saved |
| Tier 3 (meta/unused) | ~25 lines | Remove entirely | ~25 lines saved |
| Compilable Contract terms | ~12 lines | 1-liner + "→ conventions.md §16" | ~8 lines saved |
| Total | ~182 lines | ~104 lines | ~78 lines saved |
Result: ~104 lines — larger than the HL's 80-line estimate, but the user flagged 80 as potentially too small. 104 preserves all Tier 1 terms at full length.
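The totals in the compression plan follow from simple addition, which the short Python sketch below re-checks. The `PLAN` list is a hypothetical transcription of the approximate per-tier figures above; it is only an arithmetic cross-check, not a measurement of the actual glossary file.

```python
# (tier, current_lines, lines_saved): approximate figures copied from the plan.
PLAN = [
    ("Tier 1 (unique)", 85, 0),
    ("Tier 2 (duplicated)", 60, 45),
    ("Tier 3 (meta/unused)", 25, 25),
    ("Compilable Contract terms", 12, 8),
]

current_total = sum(current for _, current, _ in PLAN)
saved_total = sum(saved for _, _, saved in PLAN)
proposed_total = current_total - saved_total

for name, current, saved in PLAN:
    print(f"{name}: ~{current} -> ~{current - saved} lines")
print(f"Total: ~{current_total} -> ~{proposed_total} lines (~{saved_total} saved)")
```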
E3: Common Spine vs Divergence Report
Common Spine (shared by plan, research, handoff, review):
1. AGENTS.md
2. .tfw/conventions.md
3. .tfw/glossary.md
4. KNOWLEDGE.md (if exists)
5. Relevant task artifacts (HL/TS/RF)
Divergence points:
| Workflow | Divergence from spine | Intentional? |
|---|---|---|
| plan.md | Says "Read §10" (1-line ref) — trusts spine is in conventions | ✅ Intentional (Pattern A ref-inside-step) |
| research/base.md | Says "Read §10. Verify loaded:" — lists what to check | ✅ Intentional (same pattern) |
| handoff.md | Full expanded list (9 items, includes Phase HL, Code files) — duplicates §10 + adds executor-specific items | ⚠️ Mostly intentional (executor needs more context) but duplicates the first 4 items verbatim |
| review.md | Full expanded list (10 items, includes RF mandatory) — duplicates §10 + adds reviewer-specific items | ⚠️ Same pattern as handoff — duplicates core 4, adds role-specific items |
| resume.md | No Context Loading section at all | ❓ Unclear — coordinator workflow without a loading step |
| docs.md | No Context Loading section at all | ⚠️ Problematic — runs after review, should load something |
| knowledge.md | Own Prerequisites section (reads YAML, doesn't load conventions/glossary) | ✅ Intentional — knowledge workflow needs specific files, not general context |
| release.md | Own Prerequisites (RELEASE.md, CHANGELOG, VERSION) | ✅ Intentional — release-specific context |
| update.md | Own Prerequisites (CONFIG, upstream fetch) | ✅ Intentional — update-specific context |
| config.md | No Context Loading section | ⚠️ Reads CONFIG on demand in step — works but inconsistent |
| init.md | No Context Loading (creates everything from scratch) | ✅ Intentional — nothing exists yet |
Pattern: Workflows divide into 2 architecture styles:
1. "Spine + extend" (plan, research, handoff, review): reference or expand the 4-step spine
2. "Own prerequisites" (knowledge, release, update, config, init): task-specific context, no general spine
The resume and docs workflows are in neither camp — they have no explicit context loading at all.
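The two-style grouping can be expressed as a small lookup that also surfaces the workflows belonging to neither style. This is a minimal Python sketch under the grouping stated above; the `WORKFLOWS` list, `STYLES` mapping, and `unclassified` helper are hypothetical names introduced only for illustration.

```python
# The 11 workflows, in the column order of the usage matrix.
WORKFLOWS = [
    "plan", "research", "handoff", "review", "resume", "docs",
    "knowledge", "release", "update", "config", "init",
]

# Grouping as stated in the pattern summary.
STYLES = {
    "spine + extend": {"plan", "research", "handoff", "review"},
    "own prerequisites": {"knowledge", "release", "update", "config", "init"},
}


def unclassified(workflows, styles):
    """Return the workflows that appear in none of the style groups."""
    covered = set().union(*styles.values())
    return [w for w in workflows if w not in covered]


print(unclassified(WORKFLOWS, STYLES))
```

A check like this could run whenever a workflow is added, so that a new workflow without any context-loading style is flagged rather than discovered during a later audit.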
E4: Anti-patterns Architecture Decision
From G3, the blocks divide cleanly:
| Source | Role | Items | Unique to source? |
|---|---|---|---|
| conventions.md §14 | All | 16 | 13 generic + 3 explicit role-lock |
| handoff.md | Executor | 13 | 2 unique (STOP after RF, continues past Phase 3) |
| review.md | Reviewer | 7 | 2 unique (reviews without reading RF, same-session reviewer) |
| .tfw/README.md | All | 9 | 0 unique — pure subset of §14 |
| resume.md | Coordinator | 6 | 6 unique (all resume-specific) |
| init.md | Coordinator | 6 | 6 unique (all init-specific) |
| config.md | Coordinator | 4 | 4 unique (all config-specific) |
Decision analysis:
- Option A: Unified list in §14 — merge handoff/review unique items into §14, tag by role. Simple governance, single source. But §14 grows to ~20 items.
- Option B: §14 = generic, workflows own role-specific — §14 keeps 16 generic items. Each workflow keeps only its unique items. .tfw/README.md block → "→ conventions.md §14".
- Option C: Separate §14 by role (user's lean) — §14.1 Executor, §14.2 Coordinator, §14.3 Reviewer. Workflows say "→ §14.{role}".
Recommendation: Option B. Reasons:
1. Handoff/review unique items are truly role-specific: "Executor STOP after RF" only makes sense inline in handoff. Putting it in conventions adds cognitive load for non-executors.
2. Resume/init/config anti-patterns are contextual; they'd be noise in a generic list.
3. §14 stays compact (16 items). README block becomes a ref.
4. This matches the "DNA/Library" pattern (D25): enforcement-critical stays inline (the role-specific anti-patterns at point of use), reference data goes via ref (generic anti-patterns via §14).
E5: Token Cost Estimation
Current duplication cost per session (agent loads conventions + glossary + workflow):
| Duplication category | Approx words duplicated | Tokens (~0.75 words/token) |
|---|---|---|
| Artifact defs (conventions §3 + glossary) | ~300 words | ~400 tokens |
| Status flow (conventions §5 + glossary) | ~100 words | ~133 tokens |
| Anti-patterns (.tfw/README = copy of §14) | ~150 words | ~200 tokens |
| Context Loading (§10 + handoff/review expansions) | ~100 words | ~133 tokens |
| CL/AG modes (conventions §7 + glossary + AGENTS.md) | ~80 words | ~107 tokens |
| Compilable Contract terms (glossary repeats §16 defs) | ~80 words | ~107 tokens |
| TECH_DEBT/KNOWLEDGE defs (§2 + glossary) | ~40 words | ~53 tokens |
| Total per session | ~850 words | ~1,133 tokens |
The per-session cost is modest, but it recurs in every session that loads conventions, glossary, and a workflow. The larger cost is confusion: agents see the same concept defined slightly differently in 2-3 places, which increases misinterpretation risk.
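The token column is a straight conversion from the word estimates at roughly 0.75 words per token. The Python sketch below reproduces that conversion; the `DUPLICATED_WORDS` dictionary is a hypothetical transcription of the table, and the 0.75 ratio is the rough rule of thumb used above, not a measured tokenizer value.

```python
# Rule-of-thumb ratio from the table header; real tokenizers will differ.
WORDS_PER_TOKEN = 0.75

# Approximate duplicated word counts, copied from the table.
DUPLICATED_WORDS = {
    "Artifact defs": 300,
    "Status flow": 100,
    "Anti-patterns": 150,
    "Context Loading": 100,
    "CL/AG modes": 80,
    "Compilable Contract terms": 80,
    "TECH_DEBT/KNOWLEDGE defs": 40,
}


def words_to_tokens(words, words_per_token=WORDS_PER_TOKEN):
    """Convert a word count to an estimated token count."""
    return round(words / words_per_token)


tokens = {name: words_to_tokens(w) for name, w in DUPLICATED_WORDS.items()}
total_words = sum(DUPLICATED_WORDS.values())
total_tokens = words_to_tokens(total_words)

for name, count in tokens.items():
    print(f"{name}: ~{count} tokens")
print(f"Total: ~{total_words} words, ~{total_tokens} tokens per session")
```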
Checkpoint
| Found | Remaining |
|---|---|
| Section Usage Matrix: 13 of 22 conventions.md sections (about 59%) are dead at runtime | — |
| Glossary: 15 unique, 10 duplicated, 6 unused → can compress to ~104 lines | — |
| Common Spine: real pattern for 4 workflows, 2 styles, 2 workflows missing loading | — |
| Anti-patterns: Option B recommended (§14 stays generic, workflows own role-specific) | — |
| ~1,133 tokens/session wasted on duplication | — |
Sufficiency:
- [x] External source used? (Section Usage Matrix built from full workflow corpus)
- [x] Briefing gap closed? (All 4 Extract bullets from briefing covered)
Stage complete: YES → User decision: proceed to Challenge