amduat-api/docs/ai-plan.md
2026-02-08 07:55:43 +01:00

AI v2 Plan

Goal

Ship one reliable AI vertical slice on top of the v2 graph API:

  1. ingest deterministic graph facts,
  2. retrieve graph context for a root,
  3. answer with grounding evidence,
  4. execute a minimal planner loop with persisted run state.

Scope Rules

  • Prioritize app-level AI workflow work in this repo.
  • Treat backend fault investigation as out of scope unless it blocks the vertical slice.
  • Keep vendor/amduat-api pinned while iterating on prompts/evals.

Working Lane

  • Use branch: feat/ai-v2-experiments.
  • Keep core command stable: ./scripts/v2_app.sh ai-vertical-slice.
  • Track prompt/eval tweaks under ai/.

Acceptance Criteria

  • ./scripts/v2_app.sh ai-vertical-slice passes on a running daemon with Ollama.
  • Output contains non-empty answer text with grounding.has_evidence == true.
  • tests/ai_eval.sh and tests/ai_answer_eval.sh pass in the same environment.
  • ./scripts/v2_app.sh ai-agent --json 'doc-ai-1' 'What domain is doc-ai-1 in?' 'ms.within_domain' writes checkpoint state under ai/runs/.
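
The second criterion can be checked mechanically. The sketch below is a rough check, not the project's own test: it assumes the slice output is JSON captured on stdin and that the answer text lives in a string field named answer (a hypothetical name; only grounding.has_evidence is named in this plan). The grep patterns are deliberately crude, and a real JSON parser would be more robust.

```shell
#!/bin/sh
# Rough acceptance check for captured output, read from stdin.
# ASSUMPTIONS: the output is JSON; the answer is a string field called
# "answer" (hypothetical name); evidence shows up as "has_evidence": true.
check_slice_output() (
  out=$(cat)

  # Non-empty answer: the opening quote must be followed by a non-quote char.
  printf '%s\n' "$out" | grep -Eq '"answer"[[:space:]]*:[[:space:]]*"[^"]' || {
    echo "FAIL: empty or missing answer"
    exit 1
  }

  # Grounding flag must be the literal JSON value true.
  printf '%s\n' "$out" | grep -Eq '"has_evidence"[[:space:]]*:[[:space:]]*true' || {
    echo "FAIL: grounding.has_evidence is not true"
    exit 1
  }

  echo "OK"
)
```

Assuming the slice prints its result to stdout, it could be used as ./scripts/v2_app.sh ai-vertical-slice | check_slice_output.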

Quick Run Sequence

  1. Start the daemon (or skip this step and let the vertical slice auto-start it): ./scripts/dev_start_daemon.sh
  2. Run the AI vertical slice: ./scripts/v2_app.sh ai-vertical-slice
  3. If the daemon may not be running, run this instead of step 2: ./scripts/v2_app.sh ai-vertical-slice --auto-start-daemon
  4. Run minimal agent loop: ./scripts/v2_app.sh ai-agent --json --auto-start-daemon 'doc-ai-1' 'What domain is doc-ai-1 in?' 'ms.within_domain'
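
The sequence above can be wrapped in one function so a failed slice stops before the agent loop runs. This is a minimal sketch, not a script that exists in the repo: run_quick_sequence and the V2_APP override variable are hypothetical names, while the commands and arguments are the ones listed in this plan.

```shell
#!/bin/sh
# Hypothetical wrapper around the documented commands.
# V2_APP is an assumed override hook so the entry point can be swapped
# or stubbed; by default it points at the script named in this plan.
V2_APP="${V2_APP:-./scripts/v2_app.sh}"

run_quick_sequence() {
  # Steps 1-3: run the slice and let it start the daemon if needed.
  "$V2_APP" ai-vertical-slice --auto-start-daemon || return 1

  # Step 4: minimal agent loop with the example arguments from this plan.
  "$V2_APP" ai-agent --json --auto-start-daemon \
    'doc-ai-1' 'What domain is doc-ai-1 in?' 'ms.within_domain' || return 1
}
```

Calling run_quick_sequence from the repo root runs both commands in order and returns non-zero as soon as either one fails.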

Stop Conditions

  • If startup, ingest, or retrieve fails because of a backend regression, log the failure and pause AI iteration until it is fixed.
  • Do not switch scope to broad backend cleanup without an explicit decision.
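
The first stop condition can be enforced with a small guard around each stage. The sketch below is one possible shape under stated assumptions: run_stage and the FAIL_LOG path are hypothetical, since this plan does not say where failures are logged, and the guard cannot tell a backend regression from any other failure, so that judgment stays with whoever reads the log.

```shell
#!/bin/sh
# ASSUMPTION: FAIL_LOG is a hypothetical location for the failure log.
FAIL_LOG="${FAIL_LOG:-./backend-failures.log}"

# Run one stage. On failure, append a log line and return the stage's
# exit status so the caller can stop instead of iterating further.
run_stage() {
  stage_name=$1
  shift

  "$@" && return 0
  stage_status=$?

  printf '%s stage=%s exit=%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$stage_name" "$stage_status" >> "$FAIL_LOG"
  echo "Failure in $stage_name; pausing AI iteration." >&2
  return "$stage_status"
}
```

A caller might write run_stage startup ./scripts/dev_start_daemon.sh || exit 1, so the run halts at the first failing stage and leaves a timestamped record behind.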