amduat/registry/README.md
2025-12-20 12:54:32 +01:00


# Registry Model (draft)
This directory documents how Amduat registries are represented in the repo
before the full stack and code generation are available. The goal is to keep
registry entries stable and machine-readable so they can drive docs, code, and
graph mappings later without rewriting identifiers.
## Core idea
- **Registry keys are concepts.** The key itself (for example a `HashId`) has a
stable handle like `amduat.hash.asl1.id.0001@1`.
- **Registry values are data.** Each entry has a descriptor row that is data
(name, digest length, status, notes). Values are not treated as concepts.
## Naming conventions
- **Spec series** use the `/1` form (e.g. `ASL/1-CORE`, `ASL/1-STORE`).
- **Family/registry tokens** use the compact form (e.g. `HASH/ASL1`,
`ENC/ASL1-CORE`, `TYPE/ASL1`). This keeps registry IDs short and aligns with
the existing hash/encoding identifiers.

Treating keys as concepts and values as data matches MS/1: concept handles are
stable identifiers, while the descriptor bytes are data nodes that can be
hashed and referenced.
## JSONL manifests
Each registry has a `registry/<name>.jsonl` manifest. Each line is one entry.
The manifest is the source of truth for codegen and documentation tables.
Common fields (registry-specific schemas may add fields):
- `registry`: registry name (e.g. `HASH/ASL1`).
- `hash_id` (or other key field): registry key as hex string.
- `handle`: concept handle for the key.
- `name`, `digest_len`, `status`, `spec_ref`, `notes`: descriptor fields.
- `descriptor_sha256`: digest of the canonical descriptor (see below).
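
As a rough illustration of how a tool might consume such a manifest, the sketch below parses JSONL text and checks for a few of the common fields. The sample row, the `parse_manifest` helper, and the choice of which fields to require are assumptions made for this example; the actual per-registry schemas may differ.

```python
import json

# Hypothetical manifest line; the values are illustrative, not real assignments.
SAMPLE_LINE = (
    '{"registry": "HASH/ASL1", "hash_id": "0001", '
    '"handle": "amduat.hash.asl1.id.0001@1", "name": "EXAMPLE", '
    '"digest_len": 32, "status": "reserved", "spec_ref": "", "notes": ""}'
)

# Assumption: these common fields are treated as required by this sketch.
REQUIRED_FIELDS = ("registry", "handle", "name", "status")


def parse_manifest(text):
    """Parse JSONL text into a list of entry dicts, one per non-blank line."""
    entries = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between entries
        entry = json.loads(line)
        missing = [field for field in REQUIRED_FIELDS if field not in entry]
        if missing:
            raise ValueError(f"line {lineno}: missing fields {missing}")
        entries.append(entry)
    return entries


entries = parse_manifest(SAMPLE_LINE + "\n")
print(entries[0]["handle"])  # amduat.hash.asl1.id.0001@1
```
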
## Handle naming scheme
Handles are opaque identifiers in the `amduat` namespace and do **not** depend
on DNS:
```
amduat.<domain>.<registry>.<kind>.<id>@<version>
```
Example (HASH/ASL1):
```
amduat.hash.asl1.id.0001@1
```
`id` is the canonical key (often lowercase hex, zero-padded). `version` is
incremented only when the meaning of that key changes (rare).
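
The handle layout can be checked mechanically. The following is a minimal sketch, assuming each dot-separated segment is limited to lowercase letters, digits, and hyphens and that the version is a decimal integer; neither restriction is stated in this document, and `parse_handle` and `format_handle` are hypothetical helper names.

```python
import re

# Assumed character set per segment; the README does not define one.
HANDLE_RE = re.compile(
    r"amduat\.(?P<domain>[a-z0-9-]+)\.(?P<registry>[a-z0-9-]+)"
    r"\.(?P<kind>[a-z0-9-]+)\.(?P<id>[a-z0-9-]+)@(?P<version>[0-9]+)"
)


def parse_handle(handle):
    """Split a handle into its parts; raise ValueError if it does not fit the layout."""
    match = HANDLE_RE.fullmatch(handle)
    if match is None:
        raise ValueError(f"not a valid amduat handle: {handle!r}")
    parts = match.groupdict()
    parts["version"] = int(parts["version"])
    return parts


def format_handle(domain, registry, kind, key_id, version=1):
    """Build a handle string from its parts."""
    return f"amduat.{domain}.{registry}.{kind}.{key_id}@{version}"


parts = parse_handle("amduat.hash.asl1.id.0001@1")
print(parts["id"], parts["version"])  # 0001 1
```
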
## Descriptor digest rule
`descriptor_sha256` is computed over a canonical JSON object with fields in a
fixed order. Each registry schema defines the exact ordering; for HASH/ASL1
see `registry/hash-asl1.schema.md`.
This digest can be used as a stable Data-node hash to anchor graphs, evidence,
or generated artifacts without requiring a full ASL encoder yet.
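
To make the rule concrete, here is a sketch of a fixed-order digest. The field order, the compact separators, and the UTF-8 encoding are all assumptions chosen for illustration; the authoritative ordering and serialization live in each registry schema, so digests produced by this sketch should not be expected to match the manifests.

```python
import hashlib
import json

# Assumed order for illustration only; each registry schema defines the real one.
ASSUMED_FIELD_ORDER = ("name", "digest_len", "status", "spec_ref", "notes")


def descriptor_sha256(entry, field_order=ASSUMED_FIELD_ORDER):
    """Hash the descriptor fields of an entry, serialized in a fixed order."""
    # Rebuilding the dict fixes the key order regardless of how the entry was loaded.
    ordered = {field: entry[field] for field in field_order}
    canonical = json.dumps(ordered, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical entry; not a real HASH/ASL1 assignment.
example_entry = {
    "notes": "",
    "status": "reserved",
    "name": "EXAMPLE",
    "spec_ref": "",
    "digest_len": 32,
}
digest = descriptor_sha256(example_entry)
```

Because the fields are re-ordered before serialization, two manifests that list the same descriptor fields in a different order yield the same digest.
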
## Canonical key encoding
Each registry defines the byte encoding of its key. This is the encoding used
when building an ASL registry artifact:
- HASH/ASL1: `hash_id` is encoded as big-endian `u16` (see
`amduat_hash_asl1_key`).
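
For reference, a big-endian `u16` key can be produced as in the sketch below. The helper names are hypothetical and this is not the `amduat_hash_asl1_key` implementation; it only illustrates the byte layout, assuming `hash_id` arrives as a hex string as in the manifest.

```python
import struct


def encode_hash_id_key(hash_id_hex):
    """Encode a hex hash_id string as a 2-byte big-endian unsigned integer."""
    value = int(hash_id_hex, 16)
    if not 0 <= value <= 0xFFFF:
        raise ValueError(f"hash_id out of u16 range: {hash_id_hex!r}")
    return struct.pack(">H", value)


def decode_hash_id_key(key_bytes):
    """Decode a 2-byte big-endian key back to zero-padded lowercase hex."""
    (value,) = struct.unpack(">H", key_bytes)
    return f"{value:04x}"


key = encode_hash_id_key("0001")
print(key.hex())  # 0001
```
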
## Mapping into ASL
When the stack is available:
1. Encode each descriptor row as an ASL Artifact (data bytes).
2. Store it to obtain a `Reference`.
3. Build a registry value where each entry maps:
```
key_bytes -> value_ref
```
This uses the generic registry container in `amduat/asl/registry`.
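
The three steps can be sketched with stand-ins until the stack exists. Everything below is a placeholder: `InMemoryStore` is a toy content-addressed store, the "artifact" is compact JSON rather than a real ASL encoding, the `Reference` is approximated by a raw SHA-256 digest, and the key encoding assumes the HASH/ASL1 big-endian `u16` rule.

```python
import hashlib
import json
import struct

# Descriptor fields as listed under "JSONL manifests".
DESCRIPTOR_FIELDS = ("name", "digest_len", "status", "spec_ref", "notes")


class InMemoryStore:
    """Toy stand-in for an ASL store: blobs addressed by their SHA-256 digest."""

    def __init__(self):
        self._blobs = {}

    def put(self, data):
        ref = hashlib.sha256(data).digest()
        self._blobs[ref] = data
        return ref

    def get(self, ref):
        return self._blobs[ref]


def build_registry(entries, store):
    """Map key_bytes -> value_ref for each manifest entry."""
    registry = {}
    for entry in entries:
        # Step 1: encode the descriptor row as bytes (placeholder encoding).
        descriptor = {field: entry[field] for field in DESCRIPTOR_FIELDS}
        artifact = json.dumps(descriptor, separators=(",", ":")).encode("utf-8")
        # Step 2: store it to obtain a reference.
        value_ref = store.put(artifact)
        # Step 3: map the encoded key to that reference.
        key_bytes = struct.pack(">H", int(entry["hash_id"], 16))
        registry[key_bytes] = value_ref
    return registry


# Hypothetical entry; not a real assignment.
sample = {
    "hash_id": "0001", "name": "EXAMPLE", "digest_len": 32,
    "status": "reserved", "spec_ref": "", "notes": "",
}
store = InMemoryStore()
registry = build_registry([sample], store)
```
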
## Current manifests
- `registry/hash-asl1.jsonl` — ASL1 HashId assignments.
- `registry/enc-asl1-core.jsonl` — ASL1 core encoding profile assignments.
- `registry/type-tag.jsonl` — ASL1 TypeTag assignments.
- `registry/enc-pel1.jsonl` — PEL encoding profile assignments.
## Design constraints
- Handles MUST be stable and immutable once published.
- Reserved IDs are explicit rows with `status = reserved`.
- The manifest should be sufficient to regenerate code tables and docs without
additional context.
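
One way the stability constraint could be enforced is a check that compares a previously published manifest against a candidate revision. This is a sketch only: `check_handles_stable` is a hypothetical helper, and it assumes "stable" means a published handle must remain present and bound to the same key.

```python
def check_handles_stable(published, candidate, key_field="hash_id"):
    """Return a list of problems where a published handle was removed or rebound."""
    candidate_by_handle = {entry["handle"]: entry for entry in candidate}
    problems = []
    for old in published:
        new = candidate_by_handle.get(old["handle"])
        if new is None:
            problems.append(f"handle removed: {old['handle']}")
        elif new[key_field] != old[key_field]:
            problems.append(f"handle rebound: {old['handle']}")
    return problems


# Hypothetical rows for illustration.
published = [{"handle": "amduat.hash.asl1.id.0001@1", "hash_id": "0001"}]
extended = published + [{"handle": "amduat.hash.asl1.id.0002@1", "hash_id": "0002"}]
rebound = [{"handle": "amduat.hash.asl1.id.0001@1", "hash_id": "0009"}]

print(check_handles_stable(published, extended))  # []
```

Adding new rows passes the check, while removing or rebinding an existing handle is reported.
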