Production Brief — WO-0008: Maya (Conversational KB Talent)

Stage: 1, Intake (internal pilot, dogfooding)
Owner: Pablo (Production Line Architect)
Date: 2026-05-26
Decision ref: TFD-0023
Technical canon: RD-0017 (Karpathy LLM Wiki)


Client Profile (M1 — internal pilot)

Field                  | Value
-----------------------|------------------------------------------------------------
Client                 | Talent Factory (internal, dogfooding)
Segment                | Camille (CS triage), Riley (R&D retrieval), Oscar (CEO knowledge queries)
Domain                 | Factory knowledge: decisions, requests, KB learnings, HTML deliverables
Methodology            | Karpathy LLM Wiki pattern (RD-0017)
Language               | FR primary, EN mixed
AI platform            | Claude API (Haiku retrieval, Opus/Sonnet synthesis), pluggable per model-config-pattern
Corpus path            | C:\Projects\talent-factory\ (filtered; see below)
Documentation platform | Same repo (markdown + HTML)
Naming                 | maya-* skills, WO-0008 per TFD-0009

Discovery Summary

Business Context

The factory has accumulated ~150 markdown files (TFDs, requests, R&D analyses, KB) plus the intranet's 405 Astro pages and ~30 HTML EA deliverables. Camille currently does client triage by manual grep + memory. Riley re-reads the same RD analyses every R&D session. The CEO asks recurring questions whose answers are spread across 4-6 files at a time. JSM-Confluence deflection (CON-0006 Stack A) was the prior plan; TFD-0023 supersedes it with Maya.

Pain Points

  1. No semantic retrieval over the corpus — keyword search misses bilingual rephrasings and cross-file concepts.
  2. No connection layer — each query starts from zero; nothing compounds.
  3. Confluence cost + format mismatch — paid per seat, can't render the factory's HTML deliverables (capsules, diagrams), weak FR/EN.
  4. No deflection mechanism — every question becomes a CEO interrupt.

Current State

  • Markdown corpus: company/decisions/, departments/*/requests/, references/videos/RD-*/, KB lessons.
  • HTML deliverables: intranet/dist/, EA pages under client OneDrive (out of scope for Maya-Factory M1; in scope for Maya-STM M2).
  • Beta-portal live (project memory: beta-portal) — host for the Maya widget.
  • JSM live on jackson-creek-tech.atlassian.net — ticket route target for deflection.

Product Definition

Product Type

A digital talent packaged as a deployable bundle: agent definition + skills + widget + corpus config. Two deployment profiles share the same core (see TFD-0023 Action 1):

  • M1 — Maya-Factory: corpus = factory repo (filtered). Users = factory team. Host = beta-portal widget.
  • M2 — Maya-Client (STM POC): corpus = OneDrive-STM/agent-ea/. Users = STM staff via JCT portal. Bundled with EA handover #1.

Architecture (RD-0017 canonical pattern)

maya/
├── raw/                 ← read-only sources (symlink or copy from corpus_path)
├── wiki/                ← agent-maintained markdown KB (the synthesized layer)
├── CLAUDE.md            ← schema: purpose, folders, ingest workflow, formatting, QA
├── corpus.config.yaml   ← {corpus_path, filters, language, deflection_target}
├── manifest.json        ← generated index (titles, summaries, paths)
└── widget/              ← embeddable JS (Astro + standalone bundle)
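
As an illustration of the corpus.config.yaml entry above, here is a minimal Python sketch of the four named keys as a typed object. The key names come from the tree; the value shapes (an include/exclude dict for filters, the strings "telegram" and "jsm" for deflection_target, "fr" for language) are assumptions made for this sketch, not decisions recorded in this brief.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CorpusConfig:
    """One Maya deployment; mirrors the four keys named for corpus.config.yaml."""
    corpus_path: str
    # Assumed shape: {"include": [...], "exclude": [...]}
    filters: dict = field(default_factory=dict)
    # Assumed encoding of "FR primary, EN mixed"
    language: str = "fr"
    # Assumed values: "telegram" (M1) or "jsm" (M2)
    deflection_target: str = "telegram"

    def __post_init__(self):
        if self.deflection_target not in {"telegram", "jsm"}:
            raise ValueError(f"unknown deflection target: {self.deflection_target}")


# Two profiles sharing the same core: only the config differs.
MAYA_FACTORY = CorpusConfig(
    corpus_path=r"C:\Projects\talent-factory",
    # Exclusions taken from the Stage 2 proposal, still to be confirmed.
    filters={"exclude": [".claude/", "node_modules/", "intranet/dist/"]},
    deflection_target="telegram",
)
MAYA_STM = CorpusConfig(
    corpus_path="OneDrive-STM/agent-ea",
    deflection_target="jsm",
)
```

Rejecting an unknown deflection target at load time is one way to keep a misconfigured deployment from silently dropping escalations.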

Three behaviors driven by the schema:

  1. Ingest — on add/update of raw source: extract concepts, update existing wiki pages, create new pages, link, log changes.
  2. Query — multi-turn FR/EN; consult wiki first (not raw); cite source paths; flag uncertainty.
  3. Lint (/lint-wiki): flag contradictions, orphan pages, outdated claims, and concepts that lack a page. Folded into the RD-0031 toolkit-catalog bundle per TFD-0023.
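
Two of the lint checks listed above (orphan pages and concepts without a page) are purely structural and can be sketched without a model call. The sketch below assumes wiki pages link to each other with a [[page-name]] syntax and that one page named "index" is the entry point; both are hypothetical conventions, since the CLAUDE.md schema is not yet frozen. Contradictions and outdated claims would presumably need model judgment and are not covered here.

```python
import re

# Assumed link syntax: [[page-name]], optionally [[page-name|label]] or [[page-name#section]]
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def lint_links(pages: dict[str, str], index_page: str = "index") -> dict[str, list[str]]:
    """Report orphan pages and link targets that have no page yet.

    `pages` maps a page name to its markdown body.
    """
    linked: set[str] = set()
    for name, body in pages.items():
        for target in WIKI_LINK.findall(body):
            target = target.strip()
            if target != name:  # a self-link does not rescue an orphan
                linked.add(target)
    orphans = sorted(p for p in pages if p not in linked and p != index_page)
    missing = sorted(t for t in linked if t not in pages)
    return {"orphans": orphans, "missing_pages": missing}


report = lint_links({
    "index": "See [[deflection]] and [[citations]].",
    "deflection": "Routes unanswered questions to a human channel.",
    "stale-note": "Nothing links here.",
})
print(report)  # {'orphans': ['stale-note'], 'missing_pages': ['citations']}
```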

Core Capabilities (from order.md, re-prioritized for M1)

# | Capability                                   | M1 priority | Notes
--|----------------------------------------------|-------------|------------------------------------------------
1 | Corpus ingestion → manifest + wiki           | Must        | Manifest-based, no vector DB <10k docs
2 | Conversational retrieval (multi-turn FR/EN)  | Must        | Claude long context + manifest
3 | Citation by paragraph                        | Must        | Native Claude API citations
4 | Deflection → Telegram (M1) / JSM (M2)        | Must        | M1 uses Telegram (existing channel); M2 uses JSM
5 | Embeddable widget                            | Must        | Astro component for beta-portal
6 | Bilingual native                             | Must        | No language toggle
7 | Per-deployment corpus                        | Must        | corpus.config.yaml is the only thing that changes between deployments
8 | Re-indexing on commit                        | Should      | Git hook → manifest refresh <60s
9 | /lint-wiki skill                             | Should      | Co-developed with RD-0031
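
Capability 1 relies on a generated manifest of titles, summaries, and paths rather than a vector index. A minimal sketch of such a generator follows. It assumes the title is the first markdown heading and the summary is the first non-heading line, truncated to 200 characters; these heuristics are placeholders for this sketch, and the actual manifest schema is a Stage 2 deliverable.

```python
import json
from pathlib import Path


def manifest_entry(path: Path, root: Path) -> dict:
    """Derive title, summary, and corpus-relative path from one markdown file."""
    lines = path.read_text(encoding="utf-8").splitlines()
    # Assumed heuristic: first heading line is the title, else the file stem.
    title = next((l.lstrip("# ").strip() for l in lines if l.startswith("#")), path.stem)
    # Assumed heuristic: first non-empty, non-heading line stands in for a summary.
    summary = next((l.strip() for l in lines if l.strip() and not l.startswith("#")), "")
    return {
        "title": title,
        "summary": summary[:200],
        "path": path.relative_to(root).as_posix(),
    }


def build_manifest(root: Path) -> list[dict]:
    """Walk the corpus root and index every markdown file, in stable order."""
    return [manifest_entry(p, root) for p in sorted(root.rglob("*.md"))]


def write_manifest(root: Path, out_file: Path) -> None:
    """Serialize the manifest; ensure_ascii=False keeps French accents readable."""
    out_file.write_text(
        json.dumps(build_manifest(root), ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
```

Because the walk is a plain directory scan, the same function could be called from a git hook for the re-indexing capability, though whether it meets the <60s target on the full corpus would need to be measured.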

Out of M1 scope

Vector DB, authentication (host portal handles it), analytics dashboard (JSM handles deflection rate), HTML deliverables ingestion beyond markdown extraction (M2 problem).

Scope

  • In: Conversational RAG with citations, deflection routing, embeddable widget, FR/EN, manifest-based indexing, wiki ingest workflow, /lint-wiki.
  • Out (v1): Vector DB, auth, analytics, multi-tenant (each Maya is single-corpus by design).
  • Compliance: Each instance reads only its configured corpus. No cross-tenant leakage. Citations always include source path.
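
The compliance rule above (each instance reads only its configured corpus) can be enforced at the file-access layer. The sketch below is one possible guard, assuming every read goes through a single helper; the function name resolve_in_corpus is hypothetical.

```python
from pathlib import Path


def resolve_in_corpus(corpus_root: Path, relative: str) -> Path:
    """Resolve a requested path and refuse anything outside the configured corpus."""
    root = corpus_root.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / relative).resolve()
    if not candidate.is_relative_to(root):  # Path.is_relative_to needs Python 3.9+
        raise PermissionError(f"{relative!r} is outside the configured corpus")
    return candidate
```

Resolving before comparing matters: a naive string-prefix check would accept a path such as "../other-client/file.md" that escapes the root once normalized.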

Feasibility Assessment

Risk: Low. The pattern is proven by RD-0017. The stack is factory-native (markdown + Claude Code + Astro) and needs no new infrastructure. order.md is fully specified (AC + DoD already written). The 4-week sequence per TFD-0023 has slack: the week-4 STM POC depends only on M1 and on STM corpus access, which is already available.

Open Questions for Stage 2

  1. Corpus filters for M1 — which paths in talent-factory/ are in vs out? Proposal: include company/, departments/*/requests/, references/videos/, production-lines/orders/*/order.md. Exclude .claude/, node_modules/, intranet/dist/, OneDrive client folders.
  2. Wiki location — does the wiki live in the repo (maya/wiki/ committed) or in a sibling folder? Pablo to decide based on git noise tolerance.
  3. Re-index cadence — git hook (every commit) or scheduled (hourly)? Cheap to try both.
  4. Widget styling — match Anthropic warm cream / Trustworthy Blue per docs-design-system?
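
To make open question 1 concrete, the proposed include/exclude paths can be expressed as glob patterns and tested against corpus-relative paths. This is a sketch of the proposal only, not a decided filter: the pattern syntax (fnmatch-style, where "*" also matches "/") is an assumption, and the OneDrive client folders are left out because their paths are not given in this brief.

```python
from fnmatch import fnmatchcase

# Patterns transcribed from the proposal in open question 1.
INCLUDE = [
    "company/*",
    "departments/*/requests/*",
    "references/videos/*",
    "production-lines/orders/*/order.md",
]
EXCLUDE = [
    ".claude/*",
    "node_modules/*",
    "*/node_modules/*",   # assumption: nested node_modules are excluded too
    "intranet/dist/*",
    # OneDrive client folders: add patterns once their paths are known.
]


def in_corpus(rel_path: str) -> bool:
    """Return True if a POSIX-style, corpus-relative path passes the filter.

    Exclusions win over inclusions.
    """
    if any(fnmatchcase(rel_path, pat) for pat in EXCLUDE):
        return False
    return any(fnmatchcase(rel_path, pat) for pat in INCLUDE)


print(in_corpus("company/decisions/TFD-0023-maya-load-bearing-infrastructure-talent.md"))  # True
print(in_corpus("intranet/dist/index.html"))                                               # False
```

Checking exclusions first means a later, broader include pattern cannot accidentally pull an excluded folder back in.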

4-Week Sequence (per TFD-0023)

Week                 | Stage                                         | Owner         | Output
---------------------|-----------------------------------------------|---------------|-----------------------------------------------------
1 (now → 2026-06-02) | Stage 1 close + Stage 2 design                | Pablo + Riley | Sandbox proof (3 TFDs) + Stage 2 solution spec
2 (2026-06-03 → 09)  | Stage 3 pattern selection + Stage 4 build start | Pablo       | Schema CLAUDE.md frozen, ingest + query skills built
3 (2026-06-10 → 16)  | Stage 4 build complete + Stage 5 QA           | Pablo + Quinn | Maya-Factory live for Camille; QA cert
4 (2026-06-17 → 23)  | Stage 6 deploy + Stage 7 delivery (STM POC)   | Diego + Dana  | Maya-STM bundled with EA handover #1

Week-1 Action List (Pablo)

  1. Run Riley's RD-0017 sandbox (~25 min): create process/sandbox/{raw,wiki,CLAUDE.md}, ingest TFD-0019, TFD-0021, and TFD-0022, then run one cross-cutting question and /lint-wiki. Capture the transcript in process/sandbox/sandbox-report.md. This is the Stage 1 acceptance gate.
  2. Decide the 4 open questions above — drop a one-pager process/stage-1-decisions.md.
  3. Adapt CLAUDE.md schema from Karpathy starter to factory context (FR primary, citation format machine-parseable for widget, deflection to Telegram for M1, JSM for M2).
  4. Open Stage 2: create process/stage-2-solution-spec.md covering manifest schema, query loop (multi-turn + citation), widget contract, corpus filter spec, and deflection payload format.
  5. Sync with Diego on RD-0031 bundle convention so /lint-wiki ships in the right channel from day 1.
  6. Sync with Riley on what RD-0017 did not answer — write any residual unknowns as REQ-EXEC tickets, do not absorb silently into the build.

References

  • production-lines/orders/WO-0008-maya-conversational-kb-talent/order.md — full AC + DoD
  • company/decisions/TFD-0023-maya-load-bearing-infrastructure-talent.md — sequencing & ownership
  • references/videos/RD-0017/out/flashcard_rd-0017.md — technical pattern (canon)
  • references/videos/RD-0017/intrants/transcript_rd-0017.md — source material
  • TFD-0009 (request folder standard), TFD-0012 (R&D pipeline), TFD-0021 (toolkit-catalog Go)
  • Memory: maya-rag-wiki-pattern, delivery-model-foundry-not-hosted, documentation-format, model-config-pattern

Stage 1 Acceptance Gate

Stage 1 closes when:

  • Sandbox proof runs end-to-end on 3 TFDs and produces a non-trivial wiki + lint report
  • 4 open questions answered in stage-1-decisions.md
  • Adapted CLAUDE.md schema committed as process/stage-1-claude-md-draft.md
  • Stage 2 solution spec opened (even empty header)

Target close date: 2026-06-02 EOD.