RES - TFW-38: Quality Enforcement (Iteration 3)
Date: 2026-04-14
Author: Researcher
Status: 🔬 RES - Complete
Parent HL: HL-TFW-38
Predecessors: RES iter 1, RES iter 2
Mode: Pipeline
Research Context
Iteration 3 addressed two user-directed threads: (A) a TS over-specification audit (are coordinators writing implementation instead of requirements?) and (B) a complete naming refinement for review stages, modes, and all downstream references.
Decisions

| # | Decision | Rationale |
|---|---|---|
| D11 | Review stage names: Map → Verify → Judge → Decide | "Naming Creates Behavior" (TFW value): if you must explain it, it's named wrong. User rejected "comprehend" (unclear) and "assess" (vague; prefers "judge" as more intentional). All names are 1-2 syllables, active verbs from established disciplines (navigation, engineering, legal, management). The chain reads as a sentence: "Map the work, Verify the claims, Judge the quality, Decide the verdict." |
| D12 | Review mode names: code / docs / spec | User rejected "prose" (meaningless). Modes describe output type (noun), not domain. All 1 syllable. "docs" replaces "prose" for writing/documentation tasks: clear, short, no namespace collision with project docs/ folders. |
| D13 | TS over-specification is a tendency, not a systemic defect. Out of TFW-38 scope. | 4 TS sampled: 49-71% code. Coordinators write full code for complex/critical logic (SLA algorithms, security), signatures for standard patterns, pseudocode for UI. This varies naturally. The issue is missing guidance on WHEN to use which level (L1-L4). Recommend a separate task for TS template conventions. |

Hypothesis Status (Cumulative)

| # | Hypothesis | Status | Iteration |
|---|---|---|---|
| H1 | Explicit §6-8 enumeration stops skipping | 🟢 confirmed | 1 |
| H2 | Audit step changes reviewer behavior | 🟢 superseded by H3 | 1 |
| H3 | Domain-specific review stages produce more reliable reviews | 🟢 confirmed | 2 |
| H4 | "Naming Creates Behavior" applied to review stages eliminates comprehension friction | 🟢 confirmed | 3 (user test: "comprehend" fails, "map" passes) |

Final Naming Inventory
Review Stages

| Stage | Name | REVIEW Section | Cognitive Mode | Verbs Chain |
|---|---|---|---|---|
| 1 | Map | §1 Map | "Do I understand what was done?" | Map the work |
| 2 | Verify | §2 Verify | "Are the claims true?" | Verify the claims |
| 3 | Judge | §3 Judge | "Is the quality sufficient?" | Judge the quality |
| 4 | Decide | §4 Decide | "What's the verdict?" | Decide the verdict |

Review Modes

| Mode | Tasks | Checklist Items | Verify Actions |
|---|---|---|---|
| code | Implementation | 10 items (6 universal + 4 domain) | Spot-check 2-3 files, check tests |
| docs | Writing, docs, design | 8 items (6 universal + 2 domain) | Verify deliverable existence, check structure |
| spec | Analytical, research | 8 items (6 universal + 2 domain) | Verify deliverables, check source citations |

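
As a rough illustration of the Review Modes table, the mode configuration could be captured in a small lookup. This is only a sketch: the names `ReviewMode`, `REVIEW_MODES`, `UNIVERSAL_ITEMS`, and `get_mode` are hypothetical and not part of TFW. Only the mode names, item counts, and verify actions come from the table.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumption for illustration: every mode shares the same 6 universal items.
UNIVERSAL_ITEMS = 6


@dataclass(frozen=True)
class ReviewMode:
    """One review mode, mirroring a row of the Review Modes table."""
    name: str
    tasks: str
    domain_items: int                 # mode-specific checklist items
    verify_actions: Tuple[str, ...]   # what the Verify stage does in this mode

    @property
    def checklist_items(self) -> int:
        # Total checklist size = universal items + domain items.
        return UNIVERSAL_ITEMS + self.domain_items


REVIEW_MODES = {
    mode.name: mode
    for mode in (
        ReviewMode("code", "Implementation", 4,
                   ("Spot-check 2-3 files", "Check tests")),
        ReviewMode("docs", "Writing, docs, design", 2,
                   ("Verify deliverable existence", "Check structure")),
        ReviewMode("spec", "Analytical, research", 2,
                   ("Verify deliverables", "Check source citations")),
    )
}


def get_mode(name: str) -> ReviewMode:
    """Look up a mode by name; unknown names (e.g. the rejected 'prose') fail."""
    try:
        return REVIEW_MODES[name]
    except KeyError:
        raise ValueError(f"unknown review mode: {name!r}") from None
```

With this layout, `get_mode("code").checklist_items` yields the 10 items listed for the code mode, and the retired name "prose" is rejected rather than silently accepted.
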
File Paths
Cross-Reference: Review vs Research Naming

| Research | Review | Structural Parallel |
|---|---|---|
| Briefing | Map | Prepare/understand |
| Gather | Verify | Collect evidence |
| Extract | Judge | Apply cognitive framework |
| Challenge | - | (no challenge in review) |
| RES | Decide | Synthesize conclusion |

HL Update Recommendations

| # | What to update | Source |
|---|---|---|
| 1 | Phase A: rename all stage references from Comprehend/Verify/Assess/Synthesize to Map/Verify/Judge/Decide | D11 |
| 2 | Phase A: rename mode "prose" to "docs" | D12 |
| 3 | Add new task recommendation to HL §7 Parking Lot: "TS Template Conventions: L1-L4 specification level guidance" | D13, G2-G6 |
| 4 | HL §10: add H4 (naming validation) | D11 |

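
The two Phase A renames can be sketched as a whole-word substitution. This is a hedged illustration, not a TFW tool: `STAGE_RENAMES`, `MODE_RENAMES`, and `rename_terms` are hypothetical names, and the old-to-new pairs are the only parts taken from D11 and D12.

```python
import re
from typing import Dict

# Old -> new names. "Verify" is unchanged by D11, so it needs no entry.
STAGE_RENAMES = {
    "Comprehend": "Map",
    "Assess": "Judge",
    "Synthesize": "Decide",
}
MODE_RENAMES = {"prose": "docs"}


def rename_terms(text: str, renames: Dict[str, str]) -> str:
    """Replace whole-word, case-sensitive occurrences of each old name."""
    if not renames:
        return text
    # \b keeps longer words such as "Assessment" untouched.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, renames)) + r")\b")
    return pattern.sub(lambda match: renames[match.group(1)], text)


old_chain = "Comprehend -> Verify -> Assess -> Synthesize"
new_chain = rename_terms(old_chain, STAGE_RENAMES)
```

A blind substitution of "prose" would also rewrite the ordinary English word, so in practice the mode rename is probably best applied only where the term appears as a mode identifier and then reviewed by hand.
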
Fact Candidates

| # | Category | Candidate | Source | Confidence |
|---|---|---|---|---|
| F10 | convention | Review stages: Map → Verify → Judge → Decide. All 1-2 syllables, active verbs, from navigation/engineering/legal/management disciplines. Validated by the TFW "Naming Creates Behavior" principle: "comprehend" failed the user test (unclear), "assess" failed the user test (too vague), "judge" passed (intentional, direct). | User, 2026-04-14, Challenge C1-C5 | High |
| F11 | convention | Review modes: code / docs / spec. All 1-syllable nouns describing output type. "prose" rejected by user (meaningless). Modes configure the Judge-stage checklist, not the domain. | User, 2026-04-14, G10, C4 | High |
| F12 | process | TS files across helpdesk and atamat contain 49-71% ready-to-paste code (average 58%). Coordinators write full implementation (L4) for complex/critical logic and method signatures (L3) for standard patterns. This is natural risk-averse behavior, not pathological. Missing: explicit guidance on when to use L1-L4 in the TS template. | G2-G6, C6, 4 TS sampled | High |
| F13 | philosophy | User: "Check that the coordinator does not specify the task in the TS in so much detail that they have almost completed it already. Do they leave room for the executor's creativity?" TS over-specification suppresses executor creativity and makes ONB trivial. Tokens are spent twice when the coordinator writes code without testing it and the executor copies it. | User, 2026-04-14 | High |

Strategic Insights

| # | Category | Insight | Source | Confidence |
|---|---|---|---|---|
| SS2 | process | TS content quality has a specification level spectrum (L1: Goal → L2: Requirement → L3: Design → L4: Implementation). Most TS files are at L4 for complex steps and L2-L3 for simple steps. The missing piece is explicit guidance: templates should recommend which specification level to use based on step complexity. This is a separate methodology improvement, not a quality enforcement fix. | G2-G6, E3-E4, C6 | ★★★ |

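
The L1-L4 spectrum from SS2 can be written down as an ordered enumeration. This is a minimal sketch under stated assumptions: `SpecLevel`, `OBSERVED_LEVELS`, and `spectrum` are hypothetical names, and the mapping records only the tendency observed in the sampled TS files, not the guidance that is still missing.

```python
from enum import IntEnum


class SpecLevel(IntEnum):
    """Specification levels as named in SS2, ordered from least to most detailed."""
    GOAL = 1            # L1
    REQUIREMENT = 2     # L2
    DESIGN = 3          # L3
    IMPLEMENTATION = 4  # L4

    @property
    def label(self) -> str:
        # e.g. "L3: Design"
        return f"L{self.value}: {self.name.capitalize()}"


# Observed tendency only (from SS2); not a recommendation.
OBSERVED_LEVELS = {
    "complex": (SpecLevel.IMPLEMENTATION,),
    "simple": (SpecLevel.REQUIREMENT, SpecLevel.DESIGN),
}


def spectrum() -> str:
    """Render the full spectrum in level order."""
    return " -> ".join(level.label for level in SpecLevel)
```

Using an ordered type makes statements such as "this step is specified above L2" directly checkable, which is the kind of comparison a future TS template convention would likely need.
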
Findings Map
```
ITERATION 3 SCOPE
═════════════════

Thread A: TS Over-Spec                                            Thread B: Naming
──────────────────────                                            ────────────────
4 TS sampled:                                                     User feedback:
  HD PhaseD:  71% code                                              ❌ "comprehend" = unclear
  HD PhaseF:  56% code                                              ❌ "prose" = meaningless
  AT PhaseH:  55% code                                              ✅ "judge" = intentional
  AT PhaseF2: 49% code
        │                                                                 │
Average: 58% code                                                 "Naming Creates Behavior"
        │                                                         filter applied:
Tendency, not defect                                                      │
  Complex → L4 (natural)                                          Stages: Map → Verify → Judge → Decide
  Simple → L2-L3 (natural)                                        Modes: code / docs / spec
        │                                                                 │
Missing: L1-L4 guidance                                           All 1-2 syllables
→ Separate task, not [TFW-38](HL-TFW-38__quality_enforcement.md)  All active, need no explanation
                                                                          │
                                                                  ✅ User test passed

CUMULATIVE RESEARCH (3 iterations)
══════════════════════════════════

Iter 1: Template-workflow disconnect → §6-8 skip rate 96-100%
        Fix: explicit enumeration in handoff.md ([D1](../../knowledge-index.md#architecture-decisions))
        Reviewer trust-chain → audit step needed ([D2](../../knowledge-index.md#architecture-decisions))
Iter 2: Review should have staged structure ([D6](../../knowledge-index.md#architecture-decisions))
        3 modes based on output type ([D7](../../knowledge-index.md#architecture-decisions))
        Stages as REVIEW sections, not files ([D8](../../knowledge-index.md#architecture-decisions))
        Template must restructure to match ([D9](../../knowledge-index.md#architecture-decisions))
        Diagrams → index, not copy ([D10](../../knowledge-index.md#architecture-decisions))
Iter 3: Stages renamed: Map → Verify → Judge → Decide ([D11](../../knowledge-index.md#architecture-decisions))
        Modes renamed: code / docs / spec ([D12](../../knowledge-index.md#architecture-decisions))
        TS over-spec = tendency, separate task ([D13](../../knowledge-index.md#architecture-decisions))
```

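
The Thread A figures in the findings map can be checked with a few lines of arithmetic. This is an illustrative sketch: the variable names are hypothetical, and the four percentages are the only values taken from the report.

```python
from statistics import mean

# Share of ready-to-paste code (percent) in each sampled TS file.
CODE_SHARE = {
    "HD PhaseD": 71,
    "HD PhaseF": 56,
    "AT PhaseH": 55,
    "AT PhaseF2": 49,
}

average = mean(CODE_SHARE.values())   # (71 + 56 + 55 + 49) / 4 = 57.75
low = min(CODE_SHARE.values())
high = max(CODE_SHARE.values())

# Rounded to a whole percent, this matches the reported "49-71%, average 58%".
summary = f"{low}-{high}% code, average {round(average)}%"
```

The exact mean is 57.75%, so the reported 58% is a rounded figure over only four samples, which supports reading it as a tendency rather than a measured rate.
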
Iteration Status
- Iteration: 3 (minimum 2, maximum 3)
- Hypotheses tested: H1 (🟢), H2 (🟢 → H3), H3 (🟢), H4 (🟢)
- Hypotheses deferred: None
- Gaps discovered: TS specification level guidance needed (out of scope)
- Superseded decisions: D7 mode "prose" → D12 mode "docs"
Open Threads

| # | Thread | Status |
|---|---|---|
| 1 | Staged review design | ✅ Resolved: D6, D8, D9, D11 |
| 2 | Review modes | ✅ Resolved: D7 → D12 |
| 3 | Naming validation | ✅ Resolved: D11, D12, user test |
| 4 | TS over-specification | ✅ Resolved: D13, out of scope, separate task |
| 5 | Diagram indexing | ✅ Resolved: D10 |

Recommendation
- [x] SUFFICIENT: proceed to /tfw-plan to update HL and write TS
- [ ] MORE NEEDED
3 iterations completed (at max). All hypotheses tested. All open threads resolved. Naming validated by user. TS over-spec scoped out cleanly. Ready for TS specification.
Conclusion
Three research iterations produced a comprehensive evidence base for TFW-38 quality enforcement. Iteration 1 empirically confirmed the template-workflow disconnect (96-100% §6-8 skip rate across 80+ files) and the reviewer trust-chain failure. Iteration 2 designed a 4-stage domain-agnostic review flow with 3 output-type modes. Iteration 3 refined naming via TFW's own "Naming Creates Behavior" principle (final stages Map → Verify → Judge → Decide, modes code / docs / spec) and identified the TS over-specification tendency as a separate improvement track (L1-L4 guidance). The research is ready for HL update and TS specification.