amduat-api/docs/v2-app-developer-guide.md

292 lines
9.2 KiB
Markdown
Raw Normal View History

# Amduat v2 App Developer Guide
This is the compact handoff guide for building an application against `amduatd` v2.
For machine-readable contracts, see `registry/amduatd-api-contract.v2.json`.
## 1) Runtime Model
- One daemon instance serves one ASL store root.
- Transport is local Unix socket HTTP.
- Auth is currently filesystem/socket permission based.
- All graph APIs are under `/v2/graph/*`.
Minimal local run:
```sh
./vendor/amduat/build/amduat-asl index init --root .amduat-asl
./build/amduatd --root .amduat-asl --sock amduatd.sock --store-backend index
```
If you run with `--store-backend index`, initialize the root with `index init`
instead of `init`.
## 2) Request Conventions
- Use `X-Amduat-Space: <space_id>` for app isolation.
- Treat graph cursors as opaque tokens (`g1_*`), do not parse.
- Use `as_of` for snapshot-consistent reads.
- Use `include_tombstoned=true` only when you explicitly want retracted facts.
## 3) Core App Flows
### A. High-throughput ingest
Use `POST /v2/graph/batch` with:
- `idempotency_key` for deterministic retries
- `mode=continue_on_error` for partial-apply behavior
- per-item `metadata_ref` or `provenance` for trust/debug
Expect response:
- `ok` (overall)
- `applied` aggregate counts
- `results[]` with `{kind,index,status,code,error}`
### B. Multi-hop retrieval for agents
Primary endpoints:
- `GET /v2/graph/subgraph`
- `POST /v2/graph/retrieve`
- `POST /v2/graph/query` for declarative filtering
Use:
- `as_of` for stable reasoning snapshots
- `max_depth`, `max_fanout`, `limit_nodes`, `limit_edges`, `max_result_bytes`
- provenance filters where needed (`provenance_ref` / `provenance_min_confidence`)
### C. Incremental sync loop
Use `GET /v2/graph/changes`:
- Start with `since_cursor` (or bootstrap with `since_as_of`)
- Persist returned `next_cursor` after successful processing
- Handle `410` as replay-window expiry (full resync required)
- Optional long poll: `wait_ms`
### D. Fact correction
- Edge retraction: `POST /v2/graph/edges/tombstone`
- Node-version retraction: `POST /v2/graph/nodes/{name}/versions/tombstone`
Reads default to exclude tombstoned facts on retrieval surfaces unless `include_tombstoned=true`.
## 4) Endpoint Map (what to use when)
- Write node: `POST /v2/graph/nodes`
- Write version: `POST /v2/graph/nodes/{name}/versions`
- Write edge: `POST /v2/graph/edges`
- Batch write: `POST /v2/graph/batch`
- Point-ish read: `GET /v2/graph/nodes/{name}`
- Edge scan: `GET /v2/graph/edges`
- Neighbor scan: `GET /v2/graph/nodes/{name}/neighbors`
- Path lookup: `GET /v2/graph/paths`
- Subgraph: `GET /v2/graph/subgraph`
- Declarative query: `POST /v2/graph/query`
- Agent retrieval: `POST /v2/graph/retrieve`
- Changes feed: `GET /v2/graph/changes`
- Export: `POST /v2/graph/export`
- Import: `POST /v2/graph/import`
- Predicate policy: `GET/POST /v2/graph/schema/predicates`
- Health/readiness/metrics: `GET /v2/healthz`, `GET /v2/readyz`, `GET /v2/metrics`
- Graph runtime/capability: `GET /v2/graph/stats`, `GET /v2/graph/capabilities`
## 5) Provenance and Policy
Provenance object fields for writes:
- required: `source_uri`, `extractor`, `observed_at`, `ingested_at`, `trace_id`
- optional: `confidence`, `license`
Policy endpoint:
- `POST /v2/graph/schema/predicates`
Key modes:
- predicate validation: `strict|warn|off`
- provenance enforcement: `optional|required`
## 6) Error Handling and Retry Rules
- Retry-safe writes: only retries with same `idempotency_key` and identical payload.
- Validation failures: `400` or `422` (do not blind-retry).
- Not found for references/nodes: `404`.
- Cursor window expired: `410` on `/changes` (rebootstrap sync state).
- Result guard triggered: `422` (`max_result_bytes` or traversal/search limits).
- Internal errors: `500` (retry with backoff).
## 7) Performance and Safety Defaults
Recommended client defaults:
- Set explicit `limit` on scans.
- Always pass `max_result_bytes` on large retrieval requests.
- Keep `max_depth` conservative (start with 2-4).
- Enable `include_stats=true` in development to monitor scanned/returned counts and selected plan.
- Call `/v2/graph/capabilities` once at startup for feature/limit negotiation.
## 8) Minimal Startup Checklist (for external app)
1. Probe `GET /v2/readyz`.
2. Read `GET /v2/graph/capabilities`.
3. Configure schema/provenance policy (`POST /v2/graph/schema/predicates`) if your app owns policy.
4. Start ingest path (`/v2/graph/batch` idempotent).
5. Start change-consumer loop (`/v2/graph/changes`).
6. Serve retrieval via `/v2/graph/retrieve` and `/v2/graph/subgraph`.
7. Monitor `/v2/metrics` and `/v2/graph/stats`.
## 9) Useful Local Helpers
- `scripts/graph_client_helpers.sh` contains practical shell helpers for:
- idempotent batch ingest
- one-step changes sync
- subgraph retrieval
For integration tests/examples:
- `scripts/test_graph_queries.sh`
- `scripts/test_graph_contract.sh`
- `scripts/changes_consumer.sh` (durable changes loop with event handler hook)
- `tests/changes_consumer_410.sh` (forced cursor-expiry path)
- `tests/changes_consumer_handler.sh` (cursor advances only on handler success)
Canonical CLI entrypoint for ongoing sync:
- `./scripts/v2_app.sh consume-changes`
## 10) Copy/Paste Integration Skeleton
Set local defaults:
```sh
SOCK="amduatd.sock"
SPACE="app1"
BASE="http://localhost"
```
Startup probes:
```sh
curl --unix-socket "${SOCK}" -sS "${BASE}/v2/readyz"
curl --unix-socket "${SOCK}" -sS "${BASE}/v2/graph/capabilities"
```
Idempotent batch ingest:
```sh
curl --unix-socket "${SOCK}" -sS -X POST "${BASE}/v2/graph/batch" \
-H "Content-Type: application/json" \
-H "X-Amduat-Space: ${SPACE}" \
-d '{
"idempotency_key":"app1-batch-0001",
"mode":"continue_on_error",
"nodes":[{"name":"doc:1"}],
"edges":[
{"subject":"doc:1","predicate":"ms.within_domain","object":"topic:alpha",
"provenance":{"source_uri":"urn:app:seed","extractor":"app-loader","observed_at":1,"ingested_at":2,"trace_id":"trace-1"}}
]
}'
```
Incremental changes loop (bash skeleton):
```sh
cursor=""
while true; do
if [ -n "${cursor}" ]; then
path="/v2/graph/changes?since_cursor=${cursor}&limit=200&wait_ms=15000"
else
path="/v2/graph/changes?limit=200&wait_ms=15000"
fi
# Capture response body and status for explicit 410 handling.
raw="$(curl --unix-socket "${SOCK}" -sS -w '\n%{http_code}' "${BASE}${path}" -H "X-Amduat-Space: ${SPACE}")" || break
code="$(printf '%s\n' "${raw}" | tail -n1)"
resp="$(printf '%s\n' "${raw}" | sed '$d')"
if [ "${code}" = "410" ]; then
echo "changes cursor expired; rebootstrap required" >&2
cursor="" # Option: switch to a since_as_of bootstrap strategy here.
sleep 1
continue
fi
[ "${code}" = "200" ] || { echo "changes failed: HTTP ${code}" >&2; sleep 1; continue; }
# Process each event. Replace this with app-specific handler logic.
ok=1
while IFS= read -r ev; do
kind="$(printf '%s' "${ev}" | jq -r '.kind // "unknown"')"
ref="$(printf '%s' "${ev}" | jq -r '.edge_ref // .node_ref // "n/a"')"
echo "apply event kind=${kind} ref=${ref}"
# handle_event "${ev}" || { ok=0; break; }
done < <(printf '%s' "${resp}" | jq -c '.events[]?')
# Advance cursor only after all events in this batch are handled successfully.
if [ "${ok}" = "1" ]; then
next="$(printf '%s' "${resp}" | jq -r '.next_cursor // empty')"
[ -n "${next}" ] && cursor="${next}"
fi
done
```
Failure semantics for the loop above:
- Keep `cursor` in durable storage (file/db) and load it at process startup.
- Update stored cursor only after successful event processing for that response.
- On `410 Gone`, your stored cursor is outside replay retention; reset and rebootstrap with `since_as_of` or a full sync.
Agent retrieval call:
```sh
curl --unix-socket "${SOCK}" -sS -X POST "${BASE}/v2/graph/retrieve" \
-H "Content-Type: application/json" \
-H "X-Amduat-Space: ${SPACE}" \
-d '{
"roots":["doc:1"],
"goal_predicates":["ms.within_domain"],
"max_depth":2,
"max_fanout":1024,
"limit_nodes":200,
"limit_edges":400,
"max_result_bytes":1048576
}'
```
Subgraph snapshot read:
```sh
curl --unix-socket "${SOCK}" -sS \
"${BASE}/v2/graph/subgraph?roots[]=doc:1&max_depth=2&dir=outgoing&limit_nodes=200&limit_edges=400&include_stats=true&max_result_bytes=1048576" \
-H "X-Amduat-Space: ${SPACE}"
```
Edge correction (tombstone):
```sh
EDGE_REF="<edge_ref_to_retract>"
curl --unix-socket "${SOCK}" -sS -X POST "${BASE}/v2/graph/edges/tombstone" \
-H "Content-Type: application/json" \
-H "X-Amduat-Space: ${SPACE}" \
-d "{\"edge_ref\":\"${EDGE_REF}\"}"
```
## 11) AI Answer Wrapper (Grounded)
For local app usage via this scaffold:
```sh
./scripts/v2_app.sh ai-answer 'doc-1' 'What topic is doc-1 in?' 'ms.within_domain'
./scripts/v2_app.sh ai-answer --json 'doc-1' 'What topic is doc-1 in?' 'ms.within_domain'
./scripts/v2_app.sh ai-answer --json --require-evidence 'doc-1' 'What topic is doc-1 in?' 'ms.within_domain'
```
Behavior notes:
- The command retrieves graph context first (`/v2/graph/retrieve` with `/v2/graph/subgraph` fallback).
- JSON output includes an `evidence[]` array with normalized triplets and refs (`predicate_ref` retained, `predicate_name` preferred when resolvable).
- `--require-evidence` enforces strict grounding: command exits non-zero when no supporting edges are found.