# AI v2 Plan
## Goal
Ship one reliable AI vertical slice on top of the v2 graph API:
1. ingest deterministic graph facts,
2. retrieve graph context for a root,
3. answer with grounding evidence,
4. execute a minimal planner loop with persisted run state.
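
Item 4 can be pictured as a loop that records each finished step, so an interrupted run resumes instead of starting over. The sketch below is not this repo's planner: `run_plan`, the step names, and the one-step-per-line state file are all hypothetical, chosen only to illustrate persisted run state.

```shell
# Hypothetical resumable loop; the state-file format is an assumption.
# run_plan <state-file> <step>...
# Each step is a command or function name. Finished steps are appended to
# the state file, so a rerun skips them and resumes at the first
# unfinished step.
run_plan() {
  local state="$1" step
  shift
  mkdir -p "$(dirname "$state")"
  touch "$state"
  for step in "$@"; do
    if grep -qxF -- "$step" "$state"; then
      continue            # already completed in an earlier run
    fi
    "$step" || return 1   # stop here; the state file keeps progress so far
    echo "$step" >> "$state"
  done
}
```

With caller-defined functions named `ingest`, `retrieve`, and `answer`, a call such as `run_plan ai/runs/demo.state ingest retrieve answer` would run them in order and pick up where it left off after a failure.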
## Scope Rules
- Prioritize app-level AI workflow work in this repo.
- Treat backend fault investigation as out of scope unless it blocks the vertical slice.
- Keep `vendor/amduat-api` pinned while iterating on prompts/evals.
## Working Lane
- Use branch: `feat/ai-v2-experiments`.
- Keep core command stable: `./scripts/v2_app.sh ai-vertical-slice`.
- Track prompt/eval tweaks under `ai/`.
## Acceptance Criteria
- `./scripts/v2_app.sh ai-vertical-slice` passes against a running daemon with Ollama available.
- Output contains non-empty answer text with `grounding.has_evidence == true`.
- `tests/ai_eval.sh` and `tests/ai_answer_eval.sh` pass in the same environment.
- `./scripts/v2_app.sh ai-agent --json 'doc-ai-1' 'What domain is doc-ai-1 in?' 'ms.within_domain'` writes checkpoint state under `ai/runs/`.
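
The grounding and checkpoint criteria can be checked mechanically. The sketch below rests on several assumptions: the command prints a single JSON object, the answer text lives in a field named `answer` (a hypothetical name; only `grounding.has_evidence` appears in this plan), `jq` is installed, and a successful agent run adds at least one new file under `ai/runs/`. Adjust field names and paths to match the real output.

```shell
# Hypothetical acceptance helpers; not part of scripts/v2_app.sh.

# Succeeds when stdin is a JSON object with non-empty answer text and
# grounding.has_evidence == true. The "answer" field name is an assumption.
has_grounded_answer() {
  jq -e '(.answer | type == "string" and length > 0)
         and (.grounding.has_evidence == true)' >/dev/null
}

# Prints the number of regular files under a directory (0 if it is missing).
count_run_files() {
  local dir="$1"
  if [ -d "$dir" ]; then
    find "$dir" -type f | wc -l | tr -d '[:space:]'
  else
    echo 0
  fi
}

# Succeeds when the directory holds more files than the earlier count.
wrote_checkpoint() {
  local dir="$1" before="$2"
  [ "$(count_run_files "$dir")" -gt "$before" ]
}
```

For example, record `before=$(count_run_files ai/runs)`, run the `ai-agent` command from the criteria above, then call `wrote_checkpoint ai/runs "$before"`. Piping the vertical-slice output into `has_grounded_answer` covers the grounding criterion, provided the output really has this shape.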
## Quick Run Sequence
1. Start the daemon (or skip this step and let the vertical slice auto-start it, as in step 3):
`./scripts/dev_start_daemon.sh`
2. Run the AI vertical slice:
`./scripts/v2_app.sh ai-vertical-slice`
3. If the daemon may not be running, use this variant of step 2 instead:
`./scripts/v2_app.sh ai-vertical-slice --auto-start-daemon`
4. Run the minimal agent loop:
`./scripts/v2_app.sh ai-agent --json --auto-start-daemon 'doc-ai-1' 'What domain is doc-ai-1 in?' 'ms.within_domain'`
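
The sequence above can be wrapped in one function so that a failing step stops the run. This is only a sketch: the `V2_APP` override variable and the `run_quick_sequence` function are hypothetical conveniences, while the subcommands, flags, and arguments are copied from the steps above. It relies on `--auto-start-daemon` rather than calling `./scripts/dev_start_daemon.sh` directly.

```shell
# Hypothetical wrapper around the quick run sequence.
# V2_APP can be overridden (for example with a stub) to dry-run the sequence.
V2_APP="${V2_APP:-./scripts/v2_app.sh}"

run_quick_sequence() {
  # Vertical slice, letting it start the daemon if needed.
  "$V2_APP" ai-vertical-slice --auto-start-daemon || return 1

  # Minimal agent loop with the same arguments as in the plan.
  "$V2_APP" ai-agent --json --auto-start-daemon \
    'doc-ai-1' 'What domain is doc-ai-1 in?' 'ms.within_domain' || return 1
}
```

Because the agent loop only runs after the vertical slice succeeds, a failure in the first command surfaces before any agent checkpoint is written.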
## Stop Conditions
- If startup, ingest, or retrieval fails because of a backend regression, log the failure and pause AI iteration until it is fixed.
- Do not switch scope to broad backend cleanup without an explicit decision.
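
One way to make the first stop condition mechanical is to wrap each stage so that a failure is logged and leaves a pause marker behind. The sketch below is an assumption about how that could look: the `run_stage` helper, the `AI_LOG` and `AI_PAUSE_FILE` variables, and their default paths are hypothetical and do not exist in this repo. It also cannot tell a backend regression from any other failure, so that judgment stays with whoever reads the log.

```shell
# Hypothetical guard for the stop conditions; names and paths are assumptions.
AI_LOG="${AI_LOG:-ai/failures.log}"
AI_PAUSE_FILE="${AI_PAUSE_FILE:-ai/PAUSED}"

# run_stage <stage-name> <command...>
# Refuses to run while the pause marker exists (exit status 2). On failure,
# appends a timestamped line to the log and creates the pause marker.
run_stage() {
  local stage="$1" status=0
  shift
  if [ -e "$AI_PAUSE_FILE" ]; then
    echo "paused: $AI_PAUSE_FILE exists, skipping $stage" >&2
    return 2
  fi
  "$@" || status=$?
  if [ "$status" -ne 0 ]; then
    mkdir -p "$(dirname "$AI_LOG")" "$(dirname "$AI_PAUSE_FILE")"
    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) stage=$stage exit=$status" >> "$AI_LOG"
    : > "$AI_PAUSE_FILE"
  fi
  return "$status"
}
```

A call such as `run_stage vertical-slice ./scripts/v2_app.sh ai-vertical-slice` would then log and pause on failure; deleting the marker file after the fix resumes iteration.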