Research Log: Agent-EA v2 Engagement Playbook
Target: production-lines/agent-ea/playbook/engagement.md
Eval criteria: auto-research/eval-criteria.md (12 binary questions)
Iteration #0 — 2026-05-31 (baseline)
Score: 8/12 Status: Baseline
Criteria results
| # | Question (short) | Score | Justification |
|---|---|---|---|
| 1 | Golden rule up front | YES | stated on line 5 |
| 2 | A-codes defined/linked | NO | Routing table lists A251/A170/A370… with no definition or link |
| 3 | Stages in order, Routing at head | NO | stages table starts at Contract; Routing described below, not integrated |
| 4 | Gate + owner per stage | YES | all 5 rows have both |
| 5 | Exact command per automated stage | NO | node commands live in README, not here nor cross-linked |
| 6 | First action of a new request | YES | "Contract scoping — Routing (do this first)" |
| 7 | Data contract is a checklist | YES | concrete - [ ] checklist |
| 8 | Free of TODO/placeholder | YES | parity proof filled, no gaps |
| 9 | Follow-on mandates covered | YES | "Adding a mandate" paragraph |
| 10 | Conventions bind to D1 model | YES | kind object/co, change_type junction, epicId FK |
| 11 | Routing table = canonical bundles | YES | exact match to framework-guide scenarios |
| 12 | Open issues + spec/README links | NO | TES-CO vs PROJ/CHG not surfaced; no spec/plan/README link |
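The baseline score can be reproduced by tallying the binary answers in the table above. The following is a minimal sketch, not the eval tooling itself: the `baseline` dictionary and the `tally` helper are hypothetical names introduced here, and the answers are transcribed by hand from the table.

```python
# Baseline answers transcribed from the criteria table (question number -> answer).
# The dictionary layout and the helper name are assumptions for illustration.
baseline = {
    1: "YES", 2: "NO", 3: "NO", 4: "YES", 5: "NO", 6: "YES",
    7: "YES", 8: "YES", 9: "YES", 10: "YES", 11: "YES", 12: "NO",
}


def tally(results):
    """Return (passed, total, failing question numbers) for binary results."""
    failing = sorted(q for q, answer in results.items() if answer != "YES")
    return len(results) - len(failing), len(results), failing


passed, total, failing = tally(baseline)
print(f"Score: {passed}/{total}; failing: {failing}")
# prints "Score: 8/12; failing: [2, 3, 5, 12]"
```

The failing list doubles as the work queue for the next iteration.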
Observations
Structure, fidelity and completeness are strong. The gaps are about operability (no commands) and traceability (A-codes unexplained, Routing not in the stage table, open modeling issue and source links missing). All four NO results can be fixed by adding content rather than rewriting existing content, so regression risk is low.
Next direction (iteration #1)
- Q3: add Routing as stage 0 in the Stages & gates table (gate = A-code bundle selected; owner = Consulting intake).
- Q5: add the exact node commands per automated stage (or cross-link README quickstart).
- Q2: add a one-line pointer to the canonical A-code source (framework-guide) + compact legend.
- Q12: add a short "References & open issues" footer (spec/plan, engine README, TES-CO vs PROJ/CHG note).
Iteration #1 — 2026-05-31
Score: 12/12 (was 8/12, +4) Status: Kept
Changes applied
- A. Added Scope (Routing) as stage 0 in the Stages & gates table (gate = engagement type → A-code bundle; owner = Consulting intake).
- B. Added "Running the automated stages (commands)" block: exact seed.mjs/publish.mjs/d1-export.mjs invocations + README cross-link + note that Contract/Review are human gates.
- C. Added A-code reference line (15 code meanings + link to the canonical Macroscope framework-guide.md).
- D. Added "References & open issues" section (spec/plan, engine README, schema; open: TES-CO vs PROJ/CHG duplication, canonical seed location).
Criteria results
| # | Question (short) | Before | After | Delta |
|---|---|---|---|---|
| 1 | Golden rule up front | YES | YES | — |
| 2 | A-codes defined/linked | NO | YES | +1 |
| 3 | Stages in order, Routing at head | NO | YES | +1 |
| 4 | Gate + owner per stage | YES | YES | — |
| 5 | Exact command per automated stage | NO | YES | +1 |
| 6 | First action of a new request | YES | YES | — |
| 7 | Data contract is a checklist | YES | YES | — |
| 8 | Free of TODO/placeholder | YES | YES | — |
| 9 | Follow-on mandates covered | YES | YES | — |
| 10 | Conventions bind to D1 model | YES | YES | — |
| 11 | Routing table = canonical bundles | YES | YES | — |
| 12 | Open issues + spec/README links | NO | YES | +1 |
Observations
All four fixes were additive — no regressions on the 8 prior YES. The Scope (Routing) row (A) and the commands block (B) together turn the playbook from a "policy doc" into an executable runbook. The A-code legend (C) removes the implicit dependency on tribal knowledge of Macroscope codes.
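The "no regressions" claim can be checked mechanically by comparing the Before and After columns. This is a sketch under stated assumptions: `before`, `after` and `compare` are hypothetical names, and the values are transcribed from the table rather than read from any eval output.

```python
# Before/After columns from the iteration #1 table, as booleans (True = YES).
# Names and data layout are assumptions for illustration.
failed_at_baseline = {2, 3, 5, 12}
before = {q: q not in failed_at_baseline for q in range(1, 13)}
after = {q: True for q in range(1, 13)}


def compare(before, after):
    """Return (score delta, questions gained, questions regressed)."""
    gained = sorted(q for q in before if not before[q] and after[q])
    regressed = sorted(q for q in before if before[q] and not after[q])
    delta = sum(after.values()) - sum(before.values())
    return delta, gained, regressed


delta, gained, regressed = compare(before, after)
print(f"delta={delta:+d} gained={gained} regressed={regressed}")
# prints "delta=+4 gained=[2, 3, 5, 12] regressed=[]"
```

An empty `regressed` list is what the observation above asserts: the net delta alone would not reveal a regression that was masked by an equal gain elsewhere.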
Next direction
Score is perfect (12/12) — loop stopped. Future eval evolution: if the engine grows more renderer formats (word/pdf/bi, TFD-0029), add a criterion on whether the playbook documents the manifest.formats selection. Resolving the TES-CO vs PROJ/CHG open issue should later be reflected back here.
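The keep/stop decision recorded in this log can be sketched as a small rule. The rule below is an assumption inferred from the two statuses seen here (an improved score was Kept, a perfect score stopped the loop); the `decide` function and the "Reverted" label are hypothetical and are not taken from the eval tooling.

```python
def decide(previous_score, new_score, total):
    """Return (status, continue_loop) for one iteration.

    Assumed rule: keep a change only if the score strictly improves,
    and keep looping only while the best score is below the maximum.
    """
    status = "Kept" if new_score > previous_score else "Reverted"  # "Reverted" is a hypothetical label
    best = max(previous_score, new_score)
    return status, best < total


print(decide(8, 12, 12))  # iteration #1 in this log
# prints "('Kept', False)"
```

If a criterion is added later, `total` grows and the same rule would reopen the loop for a playbook that previously scored perfectly.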