diff --git a/tier1/asl-core-index.md b/tier1/asl-core-index.md index 99a6d16..48a901e 100644 --- a/tier1/asl-core-index.md +++ b/tier1/asl-core-index.md @@ -1,214 +1,5 @@ -# ASL/1-CORE-INDEX — Semantic Index Model +# ASL/1-CORE-INDEX — moved -Status: Draft -Owner: Niklas Rydberg -Version: 0.1.0 -SoT: No -Last Updated: 2025-11-16 -Tags: [deterministic, index, semantics] +Canonical spec: `vendor/amduat/tier1/asl-core-index-1.md`. -**Document ID:** `ASL/1-CORE-INDEX` -**Layer:** L0.5 — Semantic mapping over ASL/1-CORE values (no storage / encoding / lifecycle) - -**Depends on (normative):** - -* `ASL/1-CORE` -* `ASL/1-STORE` - -**Informative references:** - -* `ASL-STORE-INDEX` — store lifecycle and replay contracts -* `ENC-ASL-CORE-INDEX` — bytes-on-disk encoding profile (`tier1/enc-asl-core-index.md`) -* `ASL/INDEX-ACCEL/1` — acceleration semantics (routing, filters, sharding) -* `ASL/LOG/1` — append-only semantic log (segment visibility) -* `TGK/1` — TGK edge visibility and traversal alignment -* `ASL/SYSTEM/1` — unified system view (PEL/TGK/federation alignment) - ---- - -## 0. Conventions - -The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119. - -ASL/1-CORE-INDEX defines **semantic meaning only**. It does not define storage formats, on-disk encoding, or operational lifecycle. Those belong to ASL-STORE-INDEX, ASL/LOG/1, and ENC-ASL-CORE-INDEX. - ---- - -## 1. Purpose & Non-Goals - -### 1.1 Purpose - -ASL/1-CORE-INDEX defines the **semantic model** for indexing artifacts: - -* It specifies what it means to map an artifact identity to a byte location. -* It defines visibility, immutability, and shadowing semantics. -* It ensures deterministic lookup for a fixed snapshot and log prefix. - -### 1.2 Non-goals - -ASL/1-CORE-INDEX explicitly does **not** define: - -* On-disk layouts, segment files, or memory representations. -* Block allocation, packing, GC, or lifecycle rules. 
-* Snapshot implementation details, checkpoints, or log storage. -* Performance optimizations (bloom filters, sharding, SIMD). -* Federation, provenance, or execution semantics. - ---- - -## 2. Terminology - -* **Artifact** — ASL/1 immutable value defined in ASL/1-CORE. -* **Reference** — ASL/1 content address of an Artifact (hash_id + digest). -* **StoreConfig** — `{ encoding_profile, hash_id }` fixed per StoreSnapshot (ASL/1-STORE). -* **Block** — immutable storage unit containing artifact bytes. -* **BlockID** — opaque identifier for a block. -* **ArtifactExtent** — `(BlockID, offset, length)` identifying a byte slice within a block. -* **ArtifactLocation** — ordered list of `ArtifactExtent` values that, when concatenated, produce the artifact bytes. -* **Snapshot** — a checkpointed StoreSnapshot (ASL/1-STORE) used as a base state. -* **Append-Only Log** — ordered sequence of index-visible mutations after a snapshot. -* **CURRENT** — effective state after replaying a log prefix on a snapshot. - ---- - -## 3. Core Mapping Semantics - -### 3.1 Index Mapping - -The index defines a semantic mapping: - -``` -Reference -> ArtifactLocation -``` - -For any visible `Reference`, there is exactly one `ArtifactLocation` at a given CURRENT state. - -### 3.2 Determinism - -For a fixed `{StoreConfig, Snapshot, LogPrefix}`, lookup results MUST be deterministic. No nondeterministic input may affect index semantics. - -### 3.3 StoreConfig Consistency - -All references in an index view are interpreted under a fixed StoreConfig. Implementations MAY store only the digest portion in the index when `hash_id` is fixed by StoreConfig, but the semantic key is always a full `Reference`. Encoding profiles MUST allow variable-length digests; the digest length MUST be either explicit in the encoding or derivable from `hash_id` and StoreConfig. - ---- - -## 4. ArtifactLocation Semantics - -* An ArtifactLocation is an **ordered list** of ArtifactExtents. 
-* Each extent references immutable bytes within a block. -* The artifact bytes are defined by **concatenating extents in order**. -* A visible ArtifactLocation MUST be **non-empty** and MUST fully cover the artifact byte sequence with no gaps or extra bytes. -* Tombstone entries are visible but MUST have no ArtifactLocation; they only shadow prior entries. -* Extents MUST have `length > 0` and MUST reference valid byte ranges within their blocks. -* Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact. -* An ArtifactLocation is valid only while all referenced blocks are retained. -* ASL/1-CORE-INDEX does not define how blocks are allocated or sealed; it only requires that referenced bytes are immutable for the lifetime of the mapping. - ---- - -## 5. Visibility Model - -An index entry is **visible** at CURRENT if and only if: - -1. The entry is contained in a sealed segment whose seal record is admitted in the ordered log prefix for CURRENT (or anchored in the snapshot). -2. The referenced bytes are immutable (e.g., the underlying block is sealed by store rules). - -Visibility is binary; entries are either visible or not visible. - ---- - -## 6. Snapshot and Log Semantics - -Snapshots provide a base mapping of sealed segments; the append-only log admits later segment seals and policy records that define subsequent changes. - -The index state for a given CURRENT is defined as: - -``` -Index(CURRENT) = Index(snapshot) + replay(log_prefix) -``` - -Replay is strictly ordered, deterministic, and idempotent. Snapshot and log entries are semantically equivalent once replayed. - ---- - -## 7. Immutability and Shadowing - -### 7.1 Immutability - -* Index entries are never mutated. -* Once visible, an entry’s meaning does not change. -* Referenced bytes are immutable for the lifetime of the entry. - -### 7.2 Shadowing - -* Later entries MAY shadow earlier entries with the same Reference. 
-* Precedence is determined solely by log order. -* Snapshot boundaries do not alter shadowing semantics. - ---- - -## 8. Tombstones (Optional) - -Tombstone entries MAY be used to invalidate prior mappings. - -* A tombstone shadows earlier entries for the same Reference. -* Visibility rules are identical to regular entries. -* Encoding is optional and defined by ENC-ASL-CORE-INDEX if used. - ---- - -## 9. Determinism Guarantees - -For fixed: - -* StoreConfig -* Snapshot -* Log prefix - -ASL/1-CORE-INDEX guarantees: - -* Deterministic lookup results -* Deterministic shadowing resolution -* Deterministic visibility - ---- - -## 10. Normative Invariants - -Conforming implementations MUST enforce: - -1. No visibility without a sealed segment whose seal record is log-admitted (or snapshot-anchored). -2. No mutation of visible index entries. -3. Referenced bytes remain immutable for the entry’s lifetime. -4. Shadowing follows strict log order. -5. Snapshot + log replay uniquely defines CURRENT. -6. Visible ArtifactLocations are non-empty and byte-exact (no gaps, no overrun), except for tombstones which have no ArtifactLocation. - -Violation of any invariant constitutes index corruption. - ---- - -## 11. Relationship to Other Specifications - -| Layer | Responsibility | -| ------------------ | ---------------------------------------------------------- | -| ASL/1-CORE | Artifact semantics and identity | -| ASL/1-STORE | StoreSnapshot and put/get logical model | -| ASL/1-CORE-INDEX | Semantic mapping of Reference → ArtifactLocation | -| ASL-STORE-INDEX | Lifecycle, replay, and visibility contracts | -| ENC-ASL-CORE-INDEX | On-disk encoding for index segments and records | - ---- - -## 12. Summary - -ASL/1-CORE-INDEX specifies the semantic meaning of the index: - -* It maps artifact References to byte locations deterministically. -* It defines visibility and shadowing rules across snapshot + log replay. -* It guarantees immutability and deterministic lookup. 
- -It answers one question: - -> *Given a Reference and a CURRENT state, where are the bytes?* +This placeholder avoids drift between repos. diff --git a/tier1/asl-index-accel-1.md b/tier1/asl-index-accel-1.md index e68a083..ae64810 100644 --- a/tier1/asl-index-accel-1.md +++ b/tier1/asl-index-accel-1.md @@ -1,276 +1,5 @@ -# ASL/INDEX-ACCEL/1 — Index Acceleration Semantics +# ASL/INDEX-ACCEL/1 — moved -Status: Draft -Owner: Niklas Rydberg -Version: 0.1.0 -SoT: No -Last Updated: 2025-11-16 -Tags: [deterministic, index, acceleration] +Canonical spec: `vendor/amduat/tier1/asl-index-accel-1.md`. -**Document ID:** `ASL/INDEX-ACCEL/1` -**Layer:** L1 — Acceleration rules over index semantics (no storage / encoding) - -**Depends on (normative):** - -* `ASL/1-CORE-INDEX` - -**Informative references:** - -* `ASL-STORE-INDEX` — store lifecycle and replay contracts -* `ENC-ASL-CORE-INDEX` — bytes-on-disk encoding profile (`tier1/enc-asl-core-index.md`) -* `TGK/1` — TGK semantics and visibility alignment -* `TGK/1-CORE` — EdgeBody and EdgeTypeId definitions - ---- - -## 0. Conventions - -The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119. - -ASL/INDEX-ACCEL/1 defines **acceleration semantics only**. It MUST NOT change index meaning defined by ASL/1-CORE-INDEX. - ---- - -## 1. Purpose - -ASL/INDEX-ACCEL/1 defines **acceleration mechanisms** used by ASL-based indexes, including: - -* Routing keys -* Sharding -* Filters (Bloom, XOR, Ribbon, etc.) -* SIMD execution -* Hash recasting - -All mechanisms defined herein are **observationally invisible** to ASL/1-CORE-INDEX semantics. - ---- - -## 2. Scope - -Applies to: - -* Artifact indexes (ASL) -* Projection and graph indexes (e.g., TGK) -* Any index layered on ASL/1-CORE-INDEX semantics - -Does **not** define: - -* Artifact or edge identity -* Snapshot semantics -* Storage lifecycle -* Encoding details - ---- - -## 3. 
Canonical Key vs Routing Key - -### 3.1 Canonical Key - -The **Canonical Key** uniquely identifies an indexable entity. - -Examples: - -* Artifact: `Reference` -* TGK Edge: canonical key defined by `TGK/1` and `TGK/1-CORE` (opaque here) - -Properties: - -* Defines semantic identity -* Used for equality, shadowing, and tombstones -* Stable and immutable -* Fully compared on index match - -### 3.2 Routing Key - -The **Routing Key** is a **derived, advisory key** used exclusively for acceleration. - -Properties: - -* Derived deterministically from Canonical Key and optional attributes -* MAY be used for sharding, filters, SIMD layouts -* MUST NOT affect index semantics -* MUST be verified by full Canonical Key comparison on match - -Formal rule: - -``` -CanonicalKey determines correctness -RoutingKey determines performance -``` - ---- - -## 4. Filter Semantics - -### 4.1 Advisory Nature - -All filters are **advisory only**. - -Rules: - -* False positives are permitted -* False negatives are forbidden -* Filter behavior MUST NOT affect correctness - -Invariant: - -``` -Filter miss => key is definitely absent -Filter hit => key may be present -``` - -### 4.2 Filter Inputs - -Filters operate over **Routing Keys**, not Canonical Keys. - -A Routing Key MAY incorporate: - -* Hash of Canonical Key -* Artifact type tag (if present) -* TGK `EdgeTypeId` or other immutable classification attributes (TGK/1-CORE) -* Direction, role, or other immutable classification attributes - -Absence of optional attributes MUST be encoded explicitly. - -### 4.3 Filter Construction - -* Filters are built only over **sealed, immutable segments** -* Filters are immutable once built -* Filter construction MUST be deterministic -* Filter state MUST be covered by segment checksums -* Filters SHOULD be snapshot-scoped or versioned with their segment to avoid - unbounded false-positive accumulation over time - ---- - -## 5. 
Sharding Semantics - -### 5.1 Observational Invisibility - -Sharding is a **mechanical partitioning** of the index. - -Invariant: - -``` -LogicalIndex = union(all shards) -``` - -Rules: - -* Shards MUST NOT affect lookup results -* Shard count and boundaries may change over time -* Rebalancing MUST preserve lookup semantics - -### 5.2 Shard Assignment - -Shard assignment MAY be based on: - -* Hash of Canonical Key -* Routing Key -* Composite routing strategies - -Shard selection MUST be deterministic per snapshot. - ---- - -## 6. Hashing and Hash Recasting - -### 6.1 Hashing - -Hashes MAY be used for routing, filtering, or SIMD layout. - -Hashes MUST NOT be treated as identity. - -### 6.2 Hash Recasting - -Hash recasting (changing hash functions or seeds) is permitted if: - -1. It is deterministic -2. It does not change Canonical Keys -3. It does not affect index semantics - -Recasting is equivalent to rebuilding acceleration structures. - ---- - -## 7. SIMD Execution - -SIMD operations MAY be used to: - -* Evaluate filters -* Compare routing keys -* Accelerate scans - -Rules: - -* SIMD must operate only on immutable data -* SIMD must not short-circuit semantic checks -* SIMD must preserve deterministic behavior - ---- - -## 8. Multi-Dimensional Routing Examples (Normative) - -### 8.1 Artifact Index - -* Canonical Key: `Reference` -* Routing Key components: - - * `H(Reference)` - * `type_tag` (if present) - * `has_typetag` - -### 8.2 TGK Edge Index - -* Canonical Key: defined by `TGK/1` and `TGK/1-CORE` (opaque here) -* Routing Key components: - - * `H(CanonicalEdgeKey)` - * `EdgeTypeId` (if present in the TGK profile) - * Direction or role (optional) - ---- - -## 9. Snapshot Interaction - -Acceleration structures: - -* MUST respect snapshot visibility rules -* MUST operate over the same sealed segments visible to the snapshot -* MUST NOT bypass tombstones or shadowing - -Snapshot cuts apply **after** routing and filtering. - ---- - -## 10. 
Normative Invariants - -1. Canonical Keys define identity and correctness -2. Routing Keys are advisory only -3. Filters may never introduce false negatives -4. Sharding is observationally invisible -5. Hashes are not identity -6. SIMD is an execution strategy, not a semantic construct -7. All acceleration is deterministic per snapshot - ---- - -## 11. Non-Goals - -ASL/INDEX-ACCEL/1 does not define: - -* Specific filter algorithms -* Memory layout -* CPU instruction selection -* Encoding formats -* Federation policies - ---- - -## 12. Summary - -ASL/INDEX-ACCEL/1 establishes a strict contract: - -> All acceleration exists to make the index faster, never different. - -It formalizes Canonical vs Routing keys and constrains filters, sharding, hashing, and SIMD so that correctness is preserved under all optimizations. +This placeholder avoids drift between repos. diff --git a/tier1/asl-indexes-1.md b/tier1/asl-indexes-1.md index 62cfa6d..fed6b26 100644 --- a/tier1/asl-indexes-1.md +++ b/tier1/asl-indexes-1.md @@ -1,119 +1,5 @@ -# ASL/INDEXES/1 -- Index Taxonomy and Relationships +# ASL/INDEXES/1 — moved -Status: Draft -Owner: Architecture -Version: 0.1.0 -SoT: No -Last Updated: 2025-01-17 -Tags: [indexes, content, structural, materialization] +Canonical spec: `vendor/amduat/tier1/asl-indexes-1.md`. -**Document ID:** `ASL/INDEXES/1` -**Layer:** L2 -- Index taxonomy (no encoding) - -**Depends on (normative):** - -* `ASL/1-CORE-INDEX` -* `ASL-STORE-INDEX` - -**Informative references:** - -* `ASL/SYSTEM/1` -* `TGK/1` - ---- - -## 0. Conventions - -The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119. - -ASL/INDEXES/1 defines index roles and relationships. It does not define encodings or storage layouts. - ---- - -## 1. Purpose - -This document defines the minimal set of indexes used by ASL systems and their dependency relationships. - ---- - -## 2. 
Index Taxonomy (Normative) - -ASL systems use three distinct indexes: - -### 2.1 Content Index - -Purpose: map semantic identity to bytes. - -``` -ArtifactKey -> ArtifactLocation -``` - -Properties: - -* Snapshot-relative and append-only -* Deterministic replay -* Optional tombstone shadowing - -This is the ASL/1-CORE-INDEX and is the only index that governs visibility. - -### 2.2 Structural Index - -Purpose: map structural identity to a derivation DAG node. - -``` -SID -> DAG node -``` - -Properties: - -* Deterministic and rebuildable -* Does not imply materialization -* May be in-memory or persisted - -### 2.3 Materialization Cache - -Purpose: record previously materialized content for a structural identity. - -``` -SID -> ArtifactKey -``` - -Properties: - -* Redundant and safe to drop -* Recomputable from DAG + content index -* Pure performance optimization - ---- - -## 3. Dependency Rules (Normative) - -Dependencies MUST follow this direction: - -``` -Structural Index -> Materialization Cache -> Content Index -``` - -Rules: - -* The Content Index MUST NOT depend on the Structural Index. -* The Structural Index MUST NOT depend on stored bytes. -* The Materialization Cache MAY depend on both. - ---- - -## 4. PUT/GET Interaction (Informative) - -* PUT registers structure (if used), resolves to an ArtifactKey, and updates the Content Index. -* GET consults only the Content Index and reads bytes from the store. -* The Structural Index and Materialization Cache are optional optimizations for PUT. - ---- - -## 5. Non-Goals - -ASL/INDEXES/1 does not define: - -* Encodings for any index -* Storage layout or sharding -* Query operators or traversal semantics +This placeholder avoids drift between repos. 
diff --git a/tier1/asl-log-1.md b/tier1/asl-log-1.md index 1c0fa90..9803562 100644 --- a/tier1/asl-log-1.md +++ b/tier1/asl-log-1.md @@ -1,295 +1,5 @@ -# ASL/LOG/1 — Append-Only Semantic Log +# ASL/LOG/1 — moved -Status: Draft -Owner: Niklas Rydberg -Version: 0.1.0 -SoT: No -Last Updated: 2025-11-16 -Tags: [deterministic, log, snapshot] +Canonical spec: `vendor/amduat/tier1/asl-log-1.md`. -**Document ID:** `ASL/LOG/1` -**Layer:** L1 — Domain log semantics (no transport) - -**Depends on (normative):** - -* `ASL-STORE-INDEX` - -**Informative references:** - -* `ASL/1-CORE-INDEX` — index semantics -* `TGK/1` — TGK edge visibility and traversal alignment -* `ENC-ASL-LOG` — bytes-on-disk encoding profile (`tier1/enc-asl-log.md`) -* `ENC-ASL-CORE-INDEX` — index segment encoding (`tier1/enc-asl-core-index.md`) -* `ASL/SYSTEM/1` — unified system view (PEL/TGK/federation alignment) - ---- - -## 0. Conventions - -The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119. - -ASL/LOG/1 defines **semantic log behavior**. It does not define transport, replication protocols, or storage layout. - ---- - -## 1. Purpose - -ASL/LOG/1 defines the **authoritative, append-only log** for an ASL domain. - -The log records **semantic commits** that affect: - -* Index segment visibility -* Tombstone policy -* Snapshot anchoring -* Optional publication metadata - -The log is the **sole source of truth** for reconstructing CURRENT state. - ---- - -## 2. Core Properties (Normative) - -An ASL log MUST be: - -1. Append-only -2. Strictly ordered -3. Deterministically replayable -4. Hash-chained -5. Snapshot-anchorable -6. Binary encoded per `ENC-ASL-LOG` -7. Forward-compatible - ---- - -## 3. 
Log Model - -### 3.1 Log Sequence - -Each record has a monotonically increasing `logseq`: - -``` -logseq: uint64 -``` - -* Assigned by the domain authority -* Total order within a domain -* Never reused - -### 3.2 Hash Chain - -Each record commits to the previous record: - -``` -record_hash = H(prev_record_hash || logseq || record_type || payload_len || payload) -``` - -This enables tamper detection, witness signing, and federation verification. - -### 3.3 Record Envelope - -All log records share a common envelope whose **exact byte layout** is defined -in `ENC-ASL-LOG`. The envelope MUST include: - -* `logseq` (monotonic sequence number) -* `record_type` (type tag) -* `payload_len` (bytes) -* `payload` (type-specific bytes) -* `record_hash` (hash-chained integrity) - ---- - -## 4. Record Types (Normative) - -## 4.0 Common Payload Encoding (Informative) - -The byte-level payload schemas are defined in `ENC-ASL-LOG`. The shared -artifact reference encoding is: - -```c -typedef struct { - uint32_t hash_id; - uint16_t digest_len; - uint16_t reserved0; // must be 0 - uint8_t digest[digest_len]; -} ArtifactRef; -``` - -### 4.1 SEGMENT_SEAL - -Declares an index segment visible. - -Payload (encoding): - -```c -typedef struct { - uint64_t segment_id; - uint8_t segment_hash[32]; -} SegmentSealPayload; -``` - -Semantics: - -* From this `logseq` onward, the referenced segment is visible for lookup and replay. -* Segment MUST be immutable. -* All referenced blocks MUST already be sealed. -* Segment contents are not re-logged. - -### 4.2 TOMBSTONE - -Declares an artifact inadmissible under domain policy. - -Payload (encoding): - -```c -typedef struct { - ArtifactRef artifact; - uint32_t scope; - uint32_t reason_code; -} TombstonePayload; -``` - -Semantics: - -* Does not delete data. -* Shadows prior visibility. -* Applies from this logseq onward. - -### 4.3 TOMBSTONE_LIFT - -Supersedes a previous tombstone. 
- -Payload (encoding): - -```c -typedef struct { - ArtifactRef artifact; - uint64_t tombstone_logseq; -} TombstoneLiftPayload; -``` - -Semantics: - -* References an earlier TOMBSTONE. -* Does not erase history. -* Only affects CURRENT at or above this logseq. - -### 4.4 SNAPSHOT_ANCHOR - -Binds semantic state to a snapshot. - -Payload (encoding): - -```c -typedef struct { - uint64_t snapshot_id; - uint8_t root_hash[32]; -} SnapshotAnchorPayload; -``` - -Semantics: - -* Defines a replay checkpoint. -* Enables log truncation below anchor with care. - -### 4.5 ARTIFACT_PUBLISH (Optional) - -Marks an artifact as published. - -Payload (encoding): - -```c -typedef struct { - ArtifactRef artifact; -} ArtifactPublishPayload; -``` - -Semantics: - -* Publication is domain-local. -* Federation layers may interpret this metadata. - -### 4.6 ARTIFACT_UNPUBLISH (Optional) - -Withdraws publication. - -Payload (encoding): - -```c -typedef struct { - ArtifactRef artifact; -} ArtifactUnpublishPayload; -``` - ---- - -## 5. Replay Semantics (Normative) - -To reconstruct CURRENT: - -1. Load latest snapshot anchor (if any). -2. Initialize visible segments from that snapshot. -3. Replay all log records with `logseq > snapshot.logseq`. -4. Apply records in order: - - * SEGMENT_SEAL -> add segment - * TOMBSTONE -> update policy state - * TOMBSTONE_LIFT -> override policy - * PUBLISH/UNPUBLISH -> update visibility metadata - -Replay MUST be deterministic. - ---- - -## 6. Index Interaction - -* Index segments contain index entries. -* The log never records individual index entries. -* Visibility is controlled solely by SEGMENT_SEAL. -* Index rebuild = scan visible segments + apply policy. - ---- - -## 7. Garbage Collection Constraints - -* A segment may be GC'd only if: - - * No snapshot references it. - * No log replay <= CURRENT requires it. - -* Log truncation is only safe at SNAPSHOT_ANCHOR boundaries. - ---- - -## 8. 
Versioning & Extensibility - -* Unknown record types MUST be skipped and MUST NOT break replay. -* Payloads are opaque outside their type. -* New record types may be added in later versions. - ---- - -## 9. Non-Goals - -ASL/LOG/1 does not define: - -* Federation protocols -* Network replication -* Witness signatures -* Block-level events -* Hydration / eviction -* Execution receipts - ---- - -## 10. Invariant (Informative) - -> If it affects visibility, admissibility, or authority, it goes in the log. -> If it affects layout or performance, it does not. - ---- - -## 10. Summary - -ASL/LOG/1 defines the minimal semantic log needed to reconstruct CURRENT. - -If it affects visibility or admissibility, it goes in the log. If it affects layout or performance, it does not. +This placeholder avoids drift between repos. diff --git a/tier1/asl-store-index.md b/tier1/asl-store-index.md index a69a9f3..39a1305 100644 --- a/tier1/asl-store-index.md +++ b/tier1/asl-store-index.md @@ -1,375 +1,5 @@ -# ASL-STORE-INDEX +# ASL/STORE-INDEX/1 — moved -### Store Semantics and Contracts for ASL Core Index (Tier1) +Canonical spec: `vendor/amduat/tier1/asl-store-index-1.md`. ---- - -## 1. Purpose - -This document defines the **operational and store-level semantics** required to implement ASL-CORE-INDEX. - -It specifies: - -* **Block lifecycle**: creation, sealing, retention, GC -* **Index segment lifecycle**: creation, append, seal, visibility -* **Snapshot identity and log positions** for deterministic replay -* **Append-only log semantics** -* **Lookup, visibility, and crash recovery rules** -* **Small vs large block handling** - -It **does not define encoding** (see ENC-ASL-CORE-INDEX at `tier1/enc-asl-core-index.md`) or semantic mapping (see ASL/1-CORE-INDEX). - -**Informative references:** - -* `ASL/SYSTEM/1` — unified system view (PEL/TGK/federation alignment) -* `TGK/1` — TGK semantics and visibility alignment -* `TGK/1-CORE` — EdgeBody and EdgeTypeId definitions - ---- - -## 2. 
Scope - -Covers: - -* Lifecycle of **blocks** and **index entries** -* Snapshot and CURRENT consistency guarantees -* Deterministic replay and recovery -* GC and tombstone semantics -* Packing policy for small vs large artifacts - -Excludes: - -* Disk-level encoding -* Sharding or acceleration strategies (see ASL/INDEX-ACCEL/1) -* Memory residency or caching -* Federation, PEL, or TGK semantics (see `TGK/1` and `TGK/1-CORE`) - ---- - -## 3. Core Concepts - -### 3.1 Block - -* **Definition:** Immutable storage unit containing artifact bytes. -* **Identifier:** BlockID (opaque, unique). -* **Properties:** - - * Once sealed, contents never change. - * Can be referenced by multiple artifacts. - * May be pinned by snapshots for retention. - * Allocation method is implementation-defined (e.g., hash or sequence). - -### 3.2 Index Segment - -Segments group index entries and provide **persistence and recovery units**. - -* **Open segment:** accepting new index entries, not visible for lookup. -* **Sealed segment:** closed for append, log-visible, snapshot-pinnable. -* **Segment components:** header, optional bloom filter, index records, footer. -* **Segment visibility:** only after seal and log append. - -### 3.3 Append-Only Log - -All store-visible mutations are recorded in a **strictly ordered, append-only log**: - -* Entries include: - - * Index additions - * Tombstones - * Segment seals -* Log is replayable to reconstruct CURRENT. -* Log semantics are defined in `ASL/LOG/1`. - -### 3.4 Snapshot Identity and Log Position - -To make CURRENT referencable and replayable, ASL-STORE-INDEX defines: - -* **SnapshotID**: opaque, immutable identifier for a snapshot. -* **LogPosition**: monotonic integer position in the append-only log. -* **IndexState**: `(SnapshotID, LogPosition)`. 
- -Deterministic replay is defined as: - -``` -Index(SnapshotID, LogPosition) = Snapshot[SnapshotID] + replay(log[0:LogPosition]) -``` - -Snapshots and log positions are required for checkpointing, federation, and deterministic recovery. - -### 3.5 Artifact Location - -* **ArtifactExtent**: `(BlockID, offset, length)` identifying a byte slice within a block. -* **ArtifactLocation**: ordered list of `ArtifactExtent` values that, when concatenated, produce the artifact bytes. -* Multi-extent locations allow a single artifact to be striped across multiple blocks. - ---- - -## 4. PUT/GET Contract (Normative) - -### 4.1 PUT Signature - -``` -put(artifact) -> (ArtifactKey, IndexState) -``` - -* `ArtifactKey` is the content identity (ASL/1-CORE-INDEX). -* `IndexState = (SnapshotID, LogPosition)` after the PUT is admitted. - -### 4.2 PUT Semantics - -1. **Structural registration (if applicable)**: if a structural index (SID -> DAG) exists, it MUST register the artifact and reuse existing SID entries. -2. **Materialization (if applicable)**: if the artifact is lazy, materialize deterministically to derive `ArtifactKey`. -3. **Deduplication**: lookup `ArtifactKey` at CURRENT. If present, PUT MUST succeed without writing bytes or adding a new index entry. -4. **Storage**: if absent, write bytes to one or more sealed blocks and produce `ArtifactLocation`. -5. **Index mutation**: append an index entry mapping `ArtifactKey -> ArtifactLocation` and record visibility via log order. - -### 4.3 PUT Guarantees - -* PUT is idempotent for identical artifacts. -* No visible index entry points to mutable or missing bytes. -* Visibility follows log order and seal rules defined in this document. - -### 4.4 GET Signature - -``` -get(ArtifactKey, IndexState?) -> bytes | NOT_FOUND -``` - -* `IndexState` defaults to CURRENT when omitted. - -### 4.5 GET Semantics - -1. Resolve `ArtifactKey -> ArtifactLocation` using `Index(snapshot, log_prefix)`. -2. If no entry exists, return `NOT_FOUND`. -3. 
Otherwise, read exactly the referenced `(BlockID, offset, length)` bytes and return them verbatim. - -GET MUST NOT mutate state or trigger materialization. - -### 4.6 Failure Semantics - -* Partial writes MUST NOT become visible. -* Replay of snapshot + log after crash MUST reconstruct a valid CURRENT. -* Implementations MAY use caching, but MUST preserve determinism. - ---- - -## 5. Block Lifecycle Semantics - -| Event | Description | Semantic Guarantees | -| ------------------ | ------------------------------------- | ------------------------------------------------------------- | -| Creation | Block allocated; bytes may be written | Not visible to index until sealed | -| Sealing | Block is finalized and immutable | Sealed blocks are stable and safe to reference from index | -| Retention | Block remains accessible | Blocks referenced by snapshots or CURRENT must not be removed | -| Garbage Collection | Block may be deleted | Only unpinned, unreachable blocks may be removed | - -Notes: - -* Sealing ensures any index entry referencing the block is immutable. -* Retention is driven by snapshot and log visibility rules. -* GC must **never violate CURRENT reconstruction guarantees**. - ---- - -## 6. Segment Lifecycle Semantics - -### 5.1 Creation - -* Open segment is allocated. -* Index entries appended in log order. -* Entries are invisible until segment seal and log append. - -### 5.2 Seal - -* Segment is closed to append. -* Seal record is written to append-only log. -* Segment becomes visible for lookup. -* Sealed segment may be snapshot-pinned. - -### 5.3 Snapshot Interaction - -* Snapshots capture sealed segments. -* Open segments need not survive snapshot. -* Segments below snapshot are replay anchors. - ---- - -## 7. Visibility and Lookup Semantics - -### 6.1 Visibility Rules - -* Entry visible **iff**: - - * The block is sealed. - * Log record exists at position ≤ CURRENT. - * Segment seal recorded in log. 
- -* Entries above CURRENT or referencing unsealed blocks are invisible. - -### 6.2 Lookup Semantics - -To resolve an `ArtifactKey`: - -1. Identify all visible segments ≤ CURRENT. -2. Search segments in **reverse seal-log order** (highest seal log position first). -3. Return first matching entry. -4. Respect tombstones to shadow prior entries. - -Determinism: - -* Lookup results are identical across platforms given the same snapshot and log prefix. -* Accelerations (bloom filters, sharding, SIMD) **do not alter correctness**. - ---- - -## 8. Snapshot Interaction - -* Snapshots capture the set of **sealed blocks** and **sealed index segments** at a point in time. -* Blocks referenced by a snapshot are **pinned** and cannot be garbage-collected until snapshot expiration. -* CURRENT is reconstructed as: - -``` -CURRENT = snapshot_state + replay(log) -``` - -Segment and block visibility rules: - -| Entity | Visible in snapshot | Visible in CURRENT | -| -------------------- | ---------------------------- | ------------------------------ | -| Open segment/block | No | Only after seal and log append | -| Sealed segment/block | Yes, if included in snapshot | Yes, replayed from log | -| Tombstone | Yes, if log-recorded | Yes, shadows prior entries | - ---- - -## 9. Garbage Collection - -Eligibility for GC: - -* Segments: sealed, no references from CURRENT or snapshots. -* Blocks: unpinned, unreferenced by any segment or artifact. - -Rules: - -* GC is safe **only on sealed segments and blocks**. -* Must respect snapshot pins. -* Tombstones may aid in invalidating unreachable blocks. -* Snapshots retained for provenance or receipt verification MUST remain pinned. - -Outcome: - -* GC never violates CURRENT reconstruction. -* Blocks can be reclaimed without breaking provenance. - ---- - -## 10. Tombstone Semantics - -* Optional marker to invalidate prior mappings. -* Visibility rules identical to regular index entries. 
-* Used to maintain deterministic CURRENT in face of shadowing or deletions. - ---- - -## 11. Small vs Large Block Handling - -### 11.1 Definitions - -| Term | Meaning | -| ----------------- | --------------------------------------------------------------------- | -| **Small block** | Block containing artifact bytes below a threshold `T_small`. | -| **Large block** | Block containing artifact bytes ≥ `T_small`. | -| **Mixed segment** | Segment containing both small and large blocks (discouraged). | -| **Packing** | Combining multiple small artifacts into a single physical block. | -| **BlockID** | Opaque identifier for a block; addressing is identical for all sizes. | - -Small vs large classification is **store-level only** and transparent to ASL-CORE and index layers. -`T_small` is configurable per deployment. - -### 11.2 Packing Rules - -1. **Small blocks may be packed together** to reduce storage overhead. -2. **Large blocks are never packed with other artifacts**. -3. Mixed segments are **allowed but discouraged**; implementations MAY warn when mixing occurs. - -### 11.3 Segment Allocation Rules - -1. Small blocks are allocated into segments optimized for packing efficiency. -2. Large blocks are allocated into segments optimized for sequential I/O. -3. Segment sealing and visibility rules remain unchanged. - -### 11.4 Indexing and Addressing - -All blocks are addressed uniformly: - -``` -ArtifactExtent = (BlockID, offset, length) -ArtifactLocation = [ArtifactExtent...] -``` - -Packing does **not** affect index semantics or determinism. Multi-extent ArtifactLocations are allowed. - -### 11.5 GC and Retention - -1. Packed small blocks can be reclaimed only when **all contained artifacts** are unreachable. -2. Large blocks are reclaimed per block. - -Invariant: GC must never remove bytes still referenced by CURRENT or snapshots. - ---- - -## 12. Crash and Recovery Semantics - -* Open segments or unsealed blocks may be lost; no invariant is broken. 
-* Recovery procedure: - - 1. Mount last checkpoint snapshot. - 2. Replay append-only log from checkpoint. - 3. Reconstruct CURRENT. - -* Recovery is **deterministic and idempotent**. -* Segments and blocks **never partially visible** after crash. - ---- - -## 13. Normative Invariants - -1. Sealed blocks are immutable. -2. Index entries referencing blocks are immutable once visible. -3. Shadowing follows strict log order. -4. Replay of snapshot + log uniquely reconstructs CURRENT. -5. GC cannot remove blocks or segments needed by snapshot or CURRENT. -6. Tombstones shadow prior entries without deleting underlying blocks prematurely. -7. IndexState `(SnapshotID, LogPosition)` uniquely identifies CURRENT. - ---- - -## 14. Non-Goals - -* Disk-level encoding (ENC-ASL-CORE-INDEX). -* Memory layout or caching. -* Sharding or performance heuristics. -* Federation / multi-domain semantics (handled elsewhere). -* Block packing strategies beyond the policy rules here. - ---- - -## 15. Relationship to Other Layers - -| Layer | Responsibility | -| ------------------ | ---------------------------------------------------------------------------- | -| ASL-CORE | Artifact semantics, existence of blocks, immutability | -| ASL-CORE-INDEX | Semantic mapping of ArtifactKey → ArtifactLocation | -| ASL-STORE-INDEX | Lifecycle and operational contracts for blocks and segments | -| ENC-ASL-CORE-INDEX | Bytes-on-disk layout for segments, index records, and optional bloom filters | - ---- - -## 16. Summary - -The tier1 ASL-STORE-INDEX specification: - -* Defines **block lifecycle** and **segment lifecycle**. -* Makes **snapshot identity and log positions** explicit for replay. -* Ensures deterministic visibility, lookup, and crash recovery. -* Formalizes GC safety and tombstone behavior. -* Adds clear **small vs large block** handling without changing core semantics. +This placeholder avoids drift between repos. 
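
Reviewer sketch (not part of the patch): the lookup rules in the removed `asl-core-index.md` text say to take the segments visible at CURRENT, search them in reverse seal-log order, return the first match, and let tombstones shadow earlier entries. A minimal Python illustration follows. The `Entry` and `Segment` types, the `lookup` name, and the "later entry within a segment wins" rule are assumptions made for this example. The spec text defines none of them, and block sealing is assumed to have been checked already.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entry:
    key: str                      # stand-in for a Reference / ArtifactKey (assumption)
    location: Optional[Tuple]     # tuple of (block_id, offset, length); None marks a tombstone


@dataclass(frozen=True)
class Segment:
    seal_logseq: int              # log position of the segment's seal record
    entries: Tuple[Entry, ...]    # entries in log order


def lookup(segments, key, current_logseq):
    """Resolve `key` against the segments visible at `current_logseq`."""
    # Only segments whose seal record lies at or below CURRENT are visible.
    visible = [s for s in segments if s.seal_logseq <= current_logseq]
    # Highest seal log position first, so newer entries shadow older ones.
    for seg in sorted(visible, key=lambda s: s.seal_logseq, reverse=True):
        # Assumption: within one segment, the later entry in log order wins.
        for entry in reversed(seg.entries):
            if entry.key == key:
                return entry.location  # None when a tombstone shadows the key
    return None  # key was never made visible


# Hypothetical data: an entry, then a tombstone, then a re-added entry.
segments = [
    Segment(1, (Entry("a", (("b1", 0, 4),)),)),
    Segment(3, (Entry("a", None),)),
    Segment(5, (Entry("a", (("b2", 8, 4),)),)),
]
```

Because the result depends only on the segment set and `current_logseq`, the same `(snapshot, log prefix)` pair always yields the same answer, which is the determinism property the removed text requires. Accelerators such as bloom filters would only prune `visible` and must not change the result.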
diff --git a/tier1/asl-system-1.md b/tier1/asl-system-1.md index cb4a1e0..8731c1e 100644 --- a/tier1/asl-system-1.md +++ b/tier1/asl-system-1.md @@ -1,194 +1,5 @@ -# ASL/SYSTEM/1 — Unified ASL + TGK + PEL System View +# ASL/SYSTEM/1 — moved -Status: Draft -Owner: Architecture -Version: 0.1.0 -SoT: No -Last Updated: 2025-01-17 -Tags: [deterministic, federation, pel, tgk, index] +Canonical spec: `vendor/amduat/tier1/asl-system-1.md`. -**Document ID:** `ASL/SYSTEM/1` -**Layer:** L2 — Cross-cutting system view (no new encodings) - -**Depends on (normative):** - -* `ASL/1-CORE` -* `ASL/1-CORE-INDEX` -* `ASL-STORE-INDEX` -* `ASL/LOG/1` -* `ENC-ASL-CORE-INDEX` - -**Informative references:** - -* `ASL/INDEX-ACCEL/1` -* `TGK/1` — Trace Graph Kernel semantics -* PEL draft specs (program DAG, execution receipts) -* `ASL/FEDERATION/1` — core federation semantics -* `ASL/FEDERATION-REPLAY/1` — cross-node deterministic replay -* `ASL/DAP/1` — domain admission -* `ASL/POLICY-HASH/1` — policy binding - ---- - -## 0. Conventions - -The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are -to be interpreted as in RFC 2119. - -ASL/SYSTEM/1 is an integration view. It does not define new encodings or -storage formats; those remain in the underlying layer specs. - ---- - -## 1. Purpose & Scope - -This document aligns the cross-cutting semantics of: - -* ASL index and log behavior -* PEL deterministic execution -* TGK edge semantics and traversal -* Federation visibility and replay - -It ensures a single, consistent model for determinism, snapshot bounds, and -domain visibility. - -Non-goals: - -* New on-disk encodings -* New execution operators -* Domain policy or governance rules - ---- - -## 2. Core Objects (Unified View) - -* **Artifact**: immutable byte value (ASL/1-CORE). -* **PER**: PEL Execution Receipt stored as an artifact. -* **TGK Edge**: immutable edge record linking artifacts and/or PERs. 
-* **Snapshot + Log Prefix**: boundary for deterministic visibility and replay. -* **Domain Visibility**: internal vs published visibility embedded in index - records (ENC-ASL-CORE-INDEX). - -All of these objects are addressed and stored via the same index semantics. - ---- - -## 3. Determinism & Snapshot Boundaries - -For a fixed `(SnapshotID, LogPrefix)`: - -* Index lookup is deterministic (ASL/1-CORE-INDEX). -* TGK traversal is deterministic when bounded by the same snapshot/log prefix. -* PEL execution is deterministic when its inputs are bounded by the same - snapshot/log prefix. - -PEL MUST read only snapshot-scoped artifacts and receipts. It MUST NOT depend -on storage layout, block packing, or non-snapshot metadata. - -PEL outputs (artifacts and PERs) become visible only through normal index -admission and log ordering. - -PEL MUST NOT depend on physical storage metadata. It MAY read only: - -* snapshot identity -* execution configuration that is itself snapshot-scoped and immutable - ---- - -## 4. One PEL Principle (Resolution) - -There is exactly one PEL: a deterministic, snapshot-bound, authority-aware -derivation language mapping artifacts to artifacts. - -Distinctions such as "PEL-S" vs "PEL-P" are not separate languages. They are -policy decisions about how outputs are treated: - -* **Promotion** (truth vs view) is a domain policy decision. -* **Publication** (internal vs published) is a visibility decision encoded in - index metadata. -* **Retention** (store, cache, discard, recompute) is a store policy decision. - -Implementations MUST NOT fork PEL semantics into separate dialects. Any -classification of outputs MUST be expressed via policy, publication flags, or -receipt annotations, not by changing the execution language. - ---- - -## 5. PEL, PERs, and TGK Integration - -* PEL programs consume artifacts and/or PERs. -* PEL execution produces artifacts and a PER describing the run. 
-* TGK edges may reference artifacts, PERs, or projections derived from them. - ---- - -## 5.1 PERs and Snapshot State (Clarification) - -PERs are artifacts that bind deterministic execution to a specific snapshot -and log prefix. They do not introduce a separate storage layer: - -* The sequential log and snapshot define CURRENT. -* A PER records that execution observed CURRENT at a specific log prefix. -* Replay uses the same snapshot + log prefix to reconstruct inputs. -* PERs are artifacts and MAY be used as inputs, but programs embedded in - receipts MUST NOT be executed implicitly. - -TGK remains a semantic graph layer; it does not alter PEL determinism and does -not bypass the index. - ---- - -## 6. Federation Alignment - -Federation operates over the same immutable artifacts, PERs, and TGK edges. -Cross-domain visibility is governed by index metadata: - -* `domain_id` identifies the owning domain. -* `visibility` marks internal vs published. -* `cross_domain_source` preserves provenance for imported artifacts. - -Deterministic replay across nodes MUST respect: - -* Snapshot boundaries -* Log order -* Domain visibility rules - -Federation does not change PEL semantics. It propagates artifacts and receipts -that were already deterministically produced. - -Admission and policy compatibility gate foreign state: only admitted domains and -policy-compatible published state may be included in a federation view. - ---- - -## 7. Index Alignment - -The index is the shared substrate: - -* Artifacts, PERs, and TGK edges are all indexed via the same lookup semantics. -* Sharding, SIMD, and filters (ASL/INDEX-ACCEL/1) are advisory and MUST NOT - change correctness. -* Tombstones and shadowing remain the only visibility overrides. - ---- - -## 8. Glossary and Terminology Alignment (Informative) - -To prevent drift across layers, the following terms map as: - -* **EdgeBody** (`TGK/1-CORE`) — logical edge content (`from[]`, `to[]`, `payload`, `type`). 
-* **EdgeArtifact** (`TGK/1-CORE`) — ASL Artifact whose payload encodes an EdgeBody. -* **EdgeRef** (`TGK/1-CORE`) — ASL Reference to an EdgeArtifact. -* **TGK index record** (`TGK/1`, `ASL/1-CORE-INDEX`) — index entry that makes an EdgeRef visible under snapshot/log rules; contains no edge payload. -* **TGK traversal result** (`TGK/1`) — snapshot/log-bounded set of visible edges (EdgeRefs) and/or node references derived from indexed EdgeArtifacts. - ---- - -## 9. Summary - -ASL/SYSTEM/1 provides a single, consistent view: - -* One PEL, with policy-based output treatment -* TGK and PEL both bounded by snapshot + log determinism -* Federation mediated by index-level domain metadata -* Index semantics remain the core substrate for all objects +This placeholder avoids drift between repos. diff --git a/tier1/asl-tgk-execution-plan-1.md b/tier1/asl-tgk-execution-plan-1.md index 306a166..4bcf963 100644 --- a/tier1/asl-tgk-execution-plan-1.md +++ b/tier1/asl-tgk-execution-plan-1.md @@ -1,231 +1,5 @@ -# ASL/TGK-EXEC-PLAN/1 -- Unified Execution Plan Semantics +# ASL/TGK-EXEC-PLAN/1 — moved -Status: Draft -Owner: Architecture -Version: 0.1.0 -SoT: No -Last Updated: 2025-01-17 -Tags: [execution, query, tgk, determinism] +Canonical spec: `vendor/amduat/tier1/asl-tgk-execution-plan-1.md`. -**Document ID:** `ASL/TGK-EXEC-PLAN/1` -**Layer:** L2 -- Execution plan semantics (no encoding) - -**Depends on (normative):** - -* `ASL/1-CORE-INDEX` -* `ASL/LOG/1` -* `ASL/INDEX-ACCEL/1` -* `TGK/1` - -**Informative references:** - -* `ASL/SYSTEM/1` -* `ENC-ASL-CORE-INDEX` - ---- - -## 0. Conventions - -The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119. - -ASL/TGK-EXEC-PLAN/1 defines execution plan semantics for querying artifacts and TGK edges. It does not define encoding, transport, or runtime scheduling. - ---- - -## 1. 
Purpose - -This document defines the operator model and determinism rules for executing queries over ASL artifacts and TGK edges using snapshot-bounded visibility. - ---- - -## 2. Execution Plan Model (Normative) - -An execution plan is a DAG of operators: - -``` -Plan = { nodes: [Op], edges: [(Op -> Op)] } -``` - -Each operator includes: - -* `op_id`: unique identifier -* `op_type`: operator type -* `inputs`: upstream operator outputs -* `snapshot`: `(SnapshotID, LogPrefix)` -* `constraints`: canonical filters -* `projections`: output fields -* `traversal`: optional traversal parameters -* `aggregation`: optional aggregation parameters - ---- - -## 2.1 Query Abstraction (Informative) - -A query can be represented as: - -``` -Q = { - snapshot: S, - constraints: C, - projections: P, - traversal: optional, - aggregation: optional -} -``` - -Where: - -* `constraints` describe canonical filters (artifact keys, type tags, edge types, roles, node IDs). -* `projections` select output fields. -* `traversal` declares TGK traversal depth and direction. -* `aggregation` defines deterministic reduction operations. - ---- - -## 3. Deterministic Ordering (Normative) - -All operator outputs MUST be ordered by: - -1. `logseq` ascending -2. canonical key ascending (tie-breaker) - -Parallel execution MUST preserve this order. - ---- - -## 4. Visibility Rules (Normative) - -Records are visible if and only if: - -* `record.logseq <= snapshot.log_prefix` -* The record is not shadowed by a later tombstone - -Unknown record types MUST be skipped without breaking determinism. - ---- - -## 5. Operator Types (Normative) - -### 5.1 SegmentScan - -* Inputs: sealed segments -* Outputs: raw record references -* Rules: - * Only segments with `segment.logseq_min <= snapshot.log_prefix` are scanned. - * Advisory filters MAY be applied but MUST NOT introduce false negatives. - * Shard routing MAY be applied prior to scan if deterministic. 
- -### 5.2 IndexFilter - -* Inputs: record stream -* Outputs: filtered record stream -* Rules: - * Applies canonical constraints (artifact key, type tag, TGK edge type, roles). - * Filters MUST be exact; advisory filters are not sufficient. - -### 5.3 TombstoneShadow - -* Inputs: record stream + tombstone stream -* Outputs: visible records only -* Rules: - * Later tombstones shadow earlier entries with the same canonical key. - -### 5.4 Merge - -* Inputs: multiple ordered streams -* Outputs: single ordered stream -* Rules: - * Order is `logseq` then canonical key. - * Merge MUST be deterministic regardless of shard order. - -### 5.5 Projection - -* Inputs: record stream -* Outputs: projected fields -* Rules: - * Projection MUST preserve input order. - -### 5.6 TGKTraversal - -* Inputs: seed node set -* Outputs: edge and/or node stream -* Rules: - * Expansion MUST respect snapshot bounds. - * Traversal depth MUST be explicit. - * Order MUST follow deterministic ordering rules. - -### 5.7 Aggregation (Optional) - -* Inputs: record stream -* Outputs: aggregate results -* Rules: - * Aggregation MUST be deterministic given identical inputs and snapshot. - -### 5.8 LimitOffset (Optional) - -* Inputs: ordered record stream -* Outputs: ordered slice -* Rules: - * Applies pagination or top-N selection. - * MUST preserve deterministic order from upstream operators. - -### 5.9 ShardDispatch (Optional) - -* Inputs: shard-local streams -* Outputs: ordered global stream -* Rules: - * Shard execution MAY be parallel. - * Merge MUST preserve deterministic ordering by `logseq` then canonical key. - -### 5.10 SIMDFilter (Optional) - -* Inputs: record stream -* Outputs: filtered record stream -* Rules: - * SIMD filters are advisory accelerators. - * Canonical checks MUST still be applied before output. - ---- - -## 6. Acceleration Constraints (Normative) - -Acceleration mechanisms (filters, routing, SIMD) MUST be observationally invisible: - -* False positives are permitted. 
-* False negatives are forbidden. -* Canonical checks MUST always be applied before returning results. - ---- - -## 7. Plan Serialization (Optional) - -Execution plans MAY be serialized for reuse or deterministic replay. - -```c -struct exec_plan { - uint32_t plan_version; - uint32_t operator_count; - struct operator_def operators[]; - struct operator_edge edges[]; -}; -``` - -Serialization MUST preserve operator parameters, snapshot bounds, and DAG edges. - ---- - -## 8. GC Safety (Informative) - -Records and edges MUST NOT be removed if they appear in a snapshot or are -reachable via traversal at that snapshot. - ---- - -## 9. Non-Goals - -ASL/TGK-EXEC-PLAN/1 does not define: - -* Runtime scheduling or parallelization strategy -* Encoding of operator plans -* Query languages or APIs -* Operator cost models +This placeholder avoids drift between repos. diff --git a/tier1/dds.md b/tier1/dds.md index 77b7da2..3b5efef 100644 --- a/tier1/dds.md +++ b/tier1/dds.md @@ -1,909 +1,5 @@ -# AMDUAT-DDS — Detailed Design Specification +# AMDUAT-DDS — moved -Status: Approved | Owner: Niklas Rydberg | Version: 0.5.0 | Last Updated: 2025-11-11 | SoT: Yes -Tags: [design, cas, composition] +Canonical spec: `vendor/amduat/tier1/dds.md`. -> **Note (scope):** -> This DDS covers **Phase 01 (Kheper CAS)** byte semantics and, where necessary, the canonical **binary encodings** for higher deterministic layers (FCS/1, PCB1, FER/1, FCT/1). -> **Behavioural semantics live in SRS.** This document governs the **bytes**. - -**Normative references:** ADR-001, ADR-003, ADR-006, SRS. - ---- - -## 1 – Content ID (CID) - -**Rule.** - -``` -CID = algo_id || H("CAS:OBJ\0" || payload_bytes) -``` - -* `algo_id`: 1-byte or VARINT identifier (default `0x01` = SHA-256). -* `H`: selected hash over **exact payload bytes**. -* Domain separation prefix must be present verbatim: `"CAS:OBJ\0"`. - -**Properties.** - -* Deterministic: identical payload → identical CID. -* Implementation-independent (SRS NFR-001). 
-* Crypto-agile via `algo_id`. - -**Errors.** - -* `ERR_ALGO_UNSUPPORTED` when `algo_id` not registered. -* Empty payload is allowed and canonical. - ---- - -## 2. Canonical Object Record (COR/1) - -COR/1 is the **only** canonical import/export envelope for CAS objects. Exact bytes are consensus; on-disk layout is not. - -### 2.1 Envelope Layout (exact bytes) - -``` -Header (7 bytes total): - MAGIC : 4 bytes = "CAS1" (0x43 0x41 0x53 0x31) - VERSION : 1 byte = 0x01 - FLAGS : 1 byte = 0x00 (reserved; MUST be 0) - RSV : 1 byte = 0x00 (reserved; MUST be 0) - -Body (strict TLV order; no padding): - 0x10 algo_id (VARINT) - 0x11 size (VARINT) - 0x12 payload (BYTES; length == size) -``` - -**Notes** - -* Fixed header invariants; any mismatch is rejection. -* No alignment/padding anywhere. - -### 2.2 Tag Semantics - -| Tag | Name | Type | Card. | Notes | -| ---: | ------- | ------ | ----: | ----------------------------------------------- | -| 0x10 | algo_id | VARINT | 1 | MUST equal algorithm used for the object’s CID. | -| 0x11 | size | VARINT | 1 | **Minimal VARINT**; MUST equal payload length. | -| 0x12 | payload | BYTES | 1 | Raw bytes; never normalized. | - -### 2.3 Canonicalization Rules (strict) - -1. **Order & uniqueness:** `0x10`, `0x11`, `0x12`, each exactly once. -2. **VARINTS:** Unsigned LEB128 **minimal** form only. -3. **BYTES:** `VARINT(len) || len bytes`, with `len == size`. -4. **No extras:** No unknown tags, no trailing bytes. -5. **Header invariants:** `MAGIC="CAS1"`, `VERSION=0x01`, `FLAGS=RSV=0x00`. -6. **Policy domain:** `size ≤ max_object_size` when enforced (ICD/1 §3). -7. **Raw byte semantics** (SRS FR-010). - -### 2.4 Decoder Validation Algorithm (normative) - -1. Validate header ⇒ else `ERR_COR_HEADER_INVALID`. -2. Read `0x10` minimal VARINT ⇒ else `ERR_COR_TAG_ORDER` / `ERR_VARINT_NON_MINIMAL`. -3. Read `0x11` minimal VARINT ⇒ same error rules. -4. Read `0x12` BYTES (length minimal VARINT) ⇒ else `ERR_VARINT_NON_MINIMAL`. -5. 
Enforce `size == len(payload)` ⇒ `ERR_COR_LENGTH_MISMATCH` on failure. -6. Ensure **no trailing bytes** ⇒ `ERR_TRAILING_BYTES`. -7. Recompute CID and compare ⇒ mismatch `ERR_CORRUPT_OBJECT`. - -### 2.5 Consistency with CID (normative) - -* **Export:** set `algo_id` to CID algorithm. -* **Import:** verify `algo_id` and hash component against expected CID. -* Mismatch ⇒ `ERR_ALGO_MISMATCH` / `ERR_CORRUPT_OBJECT`. - -### 2.6 Round-Trip Identity - -`import(COR/1) → export(CID)` MUST produce **byte-identical** envelope (SRS FR-005). Re-encoding is forbidden. - -### 2.7 Rejection Matrix (normative) - -| Violation | Example | Error | -| ------------------ | -------------------------------- | ------------------------- | -| Bad header | Wrong MAGIC/VERSION/FLAGS/RSV | `ERR_COR_HEADER_INVALID` | -| Unknown/extra tag | Any tag not 0x10/0x11/0x12 | `ERR_COR_UNKNOWN_TAG` | -| Out-of-order | `0x11` before `0x10` | `ERR_COR_TAG_ORDER` | -| Duplicate tag | Two `0x10` entries | `ERR_COR_DUPLICATE_TAG` | -| Non-minimal VARINT | Over-long algo/size/bytes length | `ERR_VARINT_NON_MINIMAL` | -| Length mismatch | `size != len(payload)` | `ERR_COR_LENGTH_MISMATCH` | -| Trailing bytes | Any bytes after payload | `ERR_TRAILING_BYTES` | -| Algo mismatch | `algo_id` conflicts with CID | `ERR_ALGO_MISMATCH` | -| Hash mismatch | Recomputed hash ≠ expected | `ERR_CORRUPT_OBJECT` | - ---- - -## 3. Instance Descriptor (ICD/1) - -ICD/1 publishes canonical instance configuration; its bytes are consensus. - -### 3.1 Envelope - -``` -Header: - MAGIC : "ICD1" - VERSION : 0x01 - -TLV (strict order; minimal VARINTs; no duplicates): - 0x20 algo_default (VARINT) - 0x21 max_object_size (VARINT) - 0x22 cor_version (VARINT) # 0x01 => COR/1 v1 - 0x23 gc_policy_id (VARINT; 0 if none) - 0x24 impl_id (BYTES; optional build/impl descriptor CID) -``` - -### 3.2 Derived Identity - -``` -instance_id = SHA-256("CAS:ICD\0" || bytes(ICD/1)) -``` - -**Rules:** Ordering/minimal VARINTs mirror COR/1. 
Exporters preserve canonical bytes; `instance_id` is stable. - ---- - -## 4. Encodings - -* **VARINT (unsigned LEB128)** — minimal form only; else `ERR_VARINT_NON_MINIMAL`. -* **BYTES** — `VARINT(length) || length bytes`. -* **Fixed-width integers** — big-endian if present. -* **No padding/alignment** in canonical encodings. - ---- - -## 5. Algorithm Registry - -**Default** - -* `0x01` → SHA-256 - -**Reserved** - -* `0x02` → SHA-512/256 -* `0x03` → BLAKE3 - -**Policy** - -* New entries require ADR + test vectors. Backward compatible by design. - ---- - -## 6. Filesystem Considerations (Informative) - -``` -cas/ -├─ sha256/ -│ ├─ aa/.. # fan-out by CID prefix (implementation detail) -│ └─ ff/.. -└─ amduat/ - └─ / - ├─ amduatcas - ├─ sha256/.. # private runtime state; never a put() target - ├─ interface/ - │ └─ libamduatcas.current - ├─ HEAD - └─ meta/ -``` - -**Rule:** Public CAS API acts only on `cas/sha256/`. The per-instance subtree is private and MUST NOT receive `put()` writes. - ---- - -## 7. Error Conditions & Higher-Layer Layouts (Normative) - -### 7.1 COR/1 & ICD/1 Enforcement (codes) - -* `ERR_COR_HEADER_INVALID`, `ERR_COR_UNKNOWN_TAG`, `ERR_COR_TAG_ORDER`, `ERR_COR_DUPLICATE_TAG`, - `ERR_COR_LENGTH_MISMATCH`, `ERR_VARINT_NON_MINIMAL`, `ERR_ALGO_UNSUPPORTED`, - `ERR_ALGO_MISMATCH`, `ERR_TRAILING_BYTES`, `ERR_CORRUPT_OBJECT`. - ---- - -### 7.2 FCS/1 Descriptor Layout — v1-min (Normative) - -> **Design principle:** *FCS/1 describes the deterministic execution recipe only.* -> Intent, roles, scope, authority, and registry policy are **not** encoded in FCS; they are captured at **certification time** in FCT/1. - -Header: `MAGIC="FCS1" VERSION=0x01 FLAGS=RSV=0x00` - -| Tag | Field | Type | Card. 
| Notes | -| ---: | ----------------- | ------ | ----: | ------------------------------------------ | -| 0x30 | `function_ptr` | CID | 1 | FPS/1 primitive or nested FCS/1 descriptor | -| 0x31 | `parameter_block` | CID | 1 | CID of PCB1 parameter block | -| 0x32 | `arity` | VARINT | 1 | Expected parameter slots | - -**Validation rules** - -1. Strict TLV order; duplicates/out-of-order → `ERR_FCS_TAG_ORDER`. -2. `parameter_block` MUST be valid PCB1 → `ERR_FCS_PARAMETER_FORMAT`. -3. `arity` MUST match slot count → `ERR_PCB_ARITY_MISMATCH`. -4. Descriptor graph MUST be acyclic → `ERR_FCS_CYCLE_DETECTED`. -5. **Any unknown or legacy governance tag** (`registry_policy 0x33`, `intent_vector 0x34`, `provenance_edge 0x35`, `notes 0x36`, or unregistered fields) → `ERR_FCS_UNKNOWN_TAG`. Such tags MUST never be tolerated in canonical streams. - ---- - -### 7.3 PCB1 Parameter Blocks (Normative) - -PCB1 payloads are COR/1 envelopes with header `MAGIC="PCB1"`, `VERSION=0x01`, `FLAGS=RSV=0x00`. - -| Tag | Field | Type | Notes | -| ---: | --------------- | ----- | ----------------------------------------------------- | -| 0x50 | `slot_manifest` | BCF/1 | Canonical slot descriptors `{index,name,type,digest}` | -| 0x51 | `slot_data` | BYTES | Packed slot bytes respecting manifest order | - -**Rules:** -Slots appear in ascending `index`. Numeric slots default to `0` when omitted. -Digest mismatches ⇒ `ERR_PCB_DIGEST_MISMATCH`. Non-deterministic ordering ⇒ `ERR_PCB_MANIFEST_ORDER`. -Arity mismatch vs FCS/1 ⇒ `ERR_PCB_ARITY_MISMATCH`. - ---- - -### 7.4 **FER/1 Receipt Layout (Normative)** - -FER/1 receipts reuse COR/1 framing with header `"FER1"` and are byte-deterministic. - -**Strict TLV order (no padding):** - -| Tag | Field | Type | Cardinality | Notes | -| ---- | --------------------- | ----------- | ----------- | ----- | -| 0x40 | `function_cid` | CID | 1 | Evaluated FCS/1 descriptor (must decode to v1-min). 
| -| 0x41 | `input_manifest` | CID | 1 | MUST decode to GS/1 BCF/1 set list (deduped, byte-lexicographic). | -| 0x42 | `environment` | CID | 1 | ICD/1 snapshot or PH03 environment capsule. | -| 0x43 | `evaluator_id` | BYTES | 1 | Stable evaluator identity (DID/descriptor CID). | -| 0x44 | `executor_set` | BCF/1 map | 1 | Map of executors → impl metadata (language/version/build); keys sorted. | -| 0x4F | `executor_fingerprint`| CID | 0–1 | SBOM/attestation CID feeding `run_id`; REQUIRED when `run_id` present. | -| 0x45 | `output_cid` | CID | 1 | Canonical output CID (single-output invariant). | -| 0x46 | `parity_vector` | BCF/1 list | 1 | Sorted by executor key; each entry carries `{executor, output, digest, sbom_cid}`. | -| 0x47 | `logs` | LIST | 0–1 | Typed log capsules (`kind`, `cid`, `sha256`). | -| 0x51 | `determinism_level` | ENUM | 0–1 | `"D1_bit_exact"` (default) or `"D2_numeric_stable"`. | -| 0x50 | `rng_seed` | BYTES | 0–1 | 0–32 byte seed REQUIRED when determinism ≠ D1. | -| 0x52 | `limits` | BCF/1 map | 0–1 | Resource envelope (`cpu_ms`, `wall_ms`, `max_rss_kib`, `io_reads`, `io_writes`). | -| 0x48 | `started_at` | UINT64 | 1 | Epoch seconds (FR-020 start bound). | -| 0x49 | `completed_at` | UINT64 | 1 | Epoch seconds ≥ `started_at`. | -| 0x53 | `parent` | CID | 0–1 | Optional lineage pointer for follow-up runs. | -| 0x4A | `context` | BCF/1 map | 0–1 | Optional scheduling hooks (WT/1 ticket, TA/1 branch tip, notes ref). | -| 0x4B | `witnesses` | BCF/1 list | 0–1 | Optional observer descriptors / co-signers. | -| 0x4E | `run_id` | BYTES[32] | 0–1 | Deterministic dedup anchor (`H("AMDUAT:RUN\0" || function || manifest || env || fingerprint)`). | -| 0x4C | `signature` | BCF/1 map | 1 | Primary Ed25519 signature over `H("AMDUAT:FER\0" || canonical bytes)`. | -| 0x4D | `signature_ext` | BCF/1 list | 0–1 | Reserved slot for multi-sig / threshold proofs (future). | - -**Validation:** - -1. 
TLV order strict; unknown tags ⇒ `ERR_FER_TAG_ORDER` / `ERR_FER_UNKNOWN_TAG`. -2. `function_cid` must decode to valid FCS/1 ⇒ `ERR_FER_FUNCTION_MISMATCH` otherwise. -3. `input_manifest` MUST decode to GS/1 set list (deduped + byte-lexicographic). Violations ⇒ `ERR_FER_INPUT_MANIFEST_SHAPE`. -4. `executor_set` keys MUST be byte-lexicographic and align with `parity_vector` entries. Ordering mismatches ⇒ `ERR_IMPL_PARITY_ORDER`; missing executors or divergent outputs ⇒ `ERR_IMPL_PARITY`. -5. Each parity entry MUST declare `sbom_cid` referencing the executor’s mini-SBOM CID. -6. `determinism_level` defaults to `D1_bit_exact`; when set to any other value a 0–32 byte `rng_seed` is REQUIRED ⇒ `ERR_FER_RNG_REQUIRED`. -7. `limits` (when present) MUST supply non-negative integers for `cpu_ms`, `wall_ms`, `max_rss_kib`, `io_reads`, `io_writes`. -8. `logs` (when present) MUST contain objects with `kind ∈ {stderr, stdout, metrics, trace}`, `cid`, and `sha256` (both 32-byte hex strings). -9. `run_id` (when present) MUST equal `H("AMDUAT:RUN\0" || function_cid || manifest_cid || environment_cid || executor_fingerprint)`; missing fingerprint ⇒ `ERR_FER_UNKNOWN_TAG`. -10. `completed_at < started_at` ⇒ `ERR_FER_TIMESTAMP` (FR-020 envelope enforcement). -11. Signatures MUST verify against `H("AMDUAT:FER\0" || canonical bytes)` ⇒ failure ⇒ `ERR_FER_SIGNATURE`. - -> **Manifest note:** `input_manifest` bytes MUST be the GS/1 canonical list; ingestion MUST reject producer-specific ordering. -> **Log capsule note:** `logs` entries bind `kind`, `cid`, and `sha256` together to avoid stdout/stderr hash confusion. -> **Dedup note:** `run_id` enables idempotent FER ingestion across registries while keeping the FER CID authoritative. -> **Provenance note:** FER/1 remains the exclusive home for run-time provenance and parity outcomes; governance stays in FCT/1. 
- -> **Graph note:** Ingestors emit `realizes`, `produced_by`, `consumed_by`, and (optionally) `fulfills` edges based solely on FER content. - ---- - -### 7.5 **FCT/1 Transaction Envelope (Normative)** - -> **Design principle:** *FCT/1 is the canonical home for **intent**, **domain scope**, **roles/authority**, and **policy snapshot*** captured at certification/publication time. - -FCT/1 serializes as ADR-003 BCF/1 map with canonical keys: - -| Key | Type | Notes | -| --------------------- | ----------- | ------------------------------------------------------- | -| `fct.version` | UINT8 | MUST be `1` | -| `fct.registry_policy` | UINT8 | Publication policy snapshot (0=Open,1=Curated,2=Locked) | -| `fct.function` | CID | Certified FCS/1 descriptor | -| `fct.receipts` | LIST | One or more FER/1 CIDs | -| `fct.authority_role` | ENUM | ADR-010C role | -| `fct.domain_scope` | ENUM | ADR-010B scope | -| `fct.intent` | SET | ADR-010 intents | -| `fct.constraints` | LIST | Optional constraint set | -| `fct.attestations` | LIST | Required when policy ≠ Open | -| `fct.timestamp` | UINT64 | Epoch seconds | -| `fct.publication` | CID | Optional ADR-007 digest | - -**Validation:** - -1. All receipts reference the same `function_cid` ⇒ else `ERR_FCT_RECEIPT_MISMATCH`. -2. If `registry_policy ≠ 0` then `attestations` **required** ⇒ `ERR_FCT_ATTESTATION_REQUIRED`. -3. All signatures/attestations verify ⇒ `ERR_FCT_SIGNATURE` on failure. -4. Receipt timestamps must be monotonic ⇒ `ERR_FCT_TIMESTAMP`. - ---- - -### 7.6 FPD/1 Publication Digest (Normative) - -> **Design principle:** *Federation publishes exactly one deterministic digest per event (ADR-007, SRS FR-022).* - -FPD/1 serializes as an ADR-003 BCF/1 map with canonical keys: - -| Key | Type | Notes | -| --------------- | ---------- | --------------------------------------------------------------------- | -| `fpd.version` | UINT8 | MUST be `1`. 
| -| `fpd.members` | LIST | Deterministic, byte-lexicographic list of member artefact CIDs. | -| `fpd.parent` | CID (opt) | Previous FPD/1 digest for the domain publication chain (or `null`). | -| `fpd.timestamp` | UINT64 | Epoch seconds aligned with `fct.timestamp` monotonic ordering. | -| `fpd.digest` | CID | Canonical digest over `{FCT/1 bytes, FER/1 receipts, governance edges}`. | - -**Construction:** - -1. Normalize and sign the FCT/1 record (per §7.5) writing canonical bytes to the payload area (PA). -2. Collect referenced FER/1 receipts and governance edges (`certifies`, `attests`, `publishes`) as canonical byte arrays. -3. Build `fpd.members` as the byte-lexicographic list of CIDs for the certified FCT/1 record, every FER/1 receipt, and the edge batch capsule. -4. Hash the concatenated canonical payloads using the federation digest algorithm (default `CIDv1/BCF`). Persist the resulting bytes and record the CID in `fpd.digest`. -5. If a prior publication exists, set `fpd.parent` to the previous digest CID; otherwise omit. -6. Emit the FPD/1 map, persist alongside the FCT/1 payload under `/logs/ph03/evidence/fct/`, and update `fct.publication` with the FPD/1 CID. - -**Validation:** - -* `fpd.members` MUST include exactly one FCT/1 CID and the full set of FER/1 receipt CIDs referenced by that transaction. -* Recomputing the digest from the persisted canonical payloads MUST yield `fpd.digest`; mismatches ⇒ `ERR_FPD_DIGEST` (registered under ADR-006). -* `fpd.timestamp` MUST be ≥ the largest FER/1 `completed_at` and ≥ the prior `fpd.timestamp` when `fpd.parent` is present ⇒ violations raise `ERR_FPD_TIMESTAMP`. -* Graph emitters MUST log governance edges via `lib/g1-emitter/` using the canonical digests referenced above. - -> **Graph note:** Publication surfaces emit `publishes(fct,fpd)` edges binding certification state to digest lineage for PH04 FLS/1 integration. 
- -### 7.7 Error Surface Registration (consolidated) - -All FCS/1, PCB1, FER/1, and FCT/1 errors map to ADR-006. -Additions since v0.3.0: - -| Code | Meaning | -| --------------------- | -------------------------------------------------------------------------------------- | -| `ERR_FCS_UNKNOWN_TAG` | Descriptor contained a tag outside the v1-min set (`0x30-0x32`). Rejected per ADR-006. | -| `ERR_EXEC_TIMEOUT` | Executor exceeded deterministic time envelope (Maat’s Balance). | -| `ERR_IMPL_PARITY` | Executor outputs/parity metadata diverged (missing executor, mismatched `output_cid`). | -| `ERR_IMPL_PARITY_ORDER` | Parity vector ordering did not match the canonical executor ordering. | -| `ERR_FER_UNKNOWN_TAG` | FER/1 payload contained an unknown tag or cardinality violation. | -| `ERR_FER_INPUT_MANIFEST_SHAPE` | `input_manifest` failed GS/1 set decoding (not deduped or unsorted). | -| `ERR_FER_RNG_REQUIRED` | `determinism_level` demanded an `rng_seed` but none was provided. | -| `ERR_FPD_DIGEST` | Recomputed federation digest did not match `fpd.digest` (non-deterministic publication). | -| `ERR_FPD_TIMESTAMP` | Publication timestamp regressed relative to receipts or parent digest. | -| `ERR_FPD_PARENT_REQUIRED` | Policy-enforced lineage expected `fpd.parent` but none was provided. | -| `ERR_FPD_MEMBER_DUP` | Duplicate member CID detected in the canonical set ordering. | -| `ERR_WT_UNKNOWN_KEY` | WT/1 map contained a key outside the v1-min schema. | -| `ERR_WT_VERSION_UNSUPPORTED` | `wt.version` not equal to `1`. | -| `ERR_WT_INTENT_EMPTY` | `wt.intent` list empty. | -| `ERR_WT_INTENT_DUP` | Duplicate ADR-010 intents detected in `wt.intent`. | -| `ERR_WT_TIMESTAMP` | `wt.timestamp` regressed relative to the previous ticket from the same author. | -| `ERR_WT_SIGNATURE` | Signature validation over `"AMDUAT:WT\0"` failed. | -| `ERR_WT_KEY_UNBOUND` | Declared `wt.pubkey` is not authorized for `wt.author` via the predicate registry. 
| -| `ERR_WT_INTENT_UNREGISTERED` | `wt.intent` entry not registered in ADR-010 predicate registry. | -| `ERR_WT_SCOPE_UNAUTHORIZED` | Router policy rejected the declared domain scope. | -| `ERR_WT_PARENT_UNKNOWN` | Optional `wt.parent` reference could not be resolved. | -| `ERR_WT_PARENT_REQUIRED` | Policy required `wt.parent` but the field was omitted. | -| `ERR_SOS_UNKNOWN_KEY` | SOS/1 map contained a key outside the v1-min schema. | -| `ERR_SOS_VERSION_UNSUPPORTED` | `sos.version` not equal to `1`. | -| `ERR_SOS_PREDICATE_UNREGISTERED` | Overlay predicate not registered in the CRS predicate registry. | -| `ERR_SOS_POLICY_INCOMPATIBLE` | `sos.policy` outside `{0,1,2}` or disallowed for the deployment lane. | -| `ERR_SOS_SIGNATURE_INVALID` | Signature validation over `"AMDUAT:SOS\0"` failed. | -| `ERR_SOS_COMPAT_EVIDENCE_REQUIRED` | Compat overlays missing MPR/1 + IER/1 references. | -| `ERR_SOS_TIMESTAMP_REGRESSION` | Overlay timestamp regressed relative to policy baseline. | - -### 7.8 FLS/1 and CRS/1 Byte Semantics - -Phase 04 establishes deterministic linkage between FLS/1 envelopes and CRS/1 concept graphs. ADR-018 governs the linkage envelope; ADR-020 governs concept and relation payloads. CI harnesses (`tools/ci/run_vectors.py`, `tools/ci/gs_snapshot.py`) provide conformance evidence. - -#### 7.8.1 FLS/1 Envelope TLVs (Draft) - -> **Scope:** Draft wire image aligned with ADR-018 v0.5.0. Stewardship will finalize signature semantics alongside multi-surface publication work. - -| Tag | Field | Type | Card. | Notes | -| ------ | -------------------- | ------ | ----- | ----- | -| `0x60` | `source_cid` | CID | 1 | Deterministic sender artefact/surface. | -| `0x61` | `target_cid` | CID | 1 | Deterministic recipient artefact/surface. | -| `0x62` | `payload_cid` | CID | 1 | Content payload (COR/1 capsule, CRS/1 concept, or CRR/1 relation). | -| `0x63` | `routing_policy_cid` | CID | 0-1 | Optional deterministic policy capsule. 
| -| `0x64` | `timestamp` | UINT64 | 0-1 | Optional bounded timing evidence (big-endian). | -| `0x65` | `signature` | BYTES | 0-1 | Optional Ed25519 signature with `"AMDUAT:FLS\0"` domain separator. | - -**Envelope rules (draft):** - -* Header MUST present `MAGIC="FLS1"`, `VERSION=0x01`, and zeroed `FLAGS/RSV` bytes. -* TLVs MUST appear in strictly increasing tag order. Duplicate tags ⇒ `ERR_FLS_DUPLICATE_TAG`; reordering ⇒ `ERR_FLS_TAG_ORDER`. -* Unknown tags are rejected until ADR updates extend this table (`ERR_FLS_UNKNOWN_TAG`). -* CID TLVs MUST present 32-byte payloads aligned with ADR-001 ⇒ `ERR_FLS_CID_LENGTH`. -* `timestamp` MUST be exactly eight bytes (UINT64, network byte order) ⇒ `ERR_FLS_TIMESTAMP_LENGTH`. -* `signature` MUST start with `"AMDUAT:FLS\0"` and carry a 64-byte Ed25519 signature ⇒ `ERR_FLS_SIGNATURE_DOMAIN` / `ERR_FLS_SIGNATURE_LENGTH`; failing Ed25519 verification raises `ERR_FLS_SIGNATURE`. -* When supplied, CRS payload bytes MUST hash to the declared `payload_cid` using `SHA-256("CAS:OBJ\0" || payload)` ⇒ `ERR_FLS_PAYLOAD_CID_MISMATCH`. -* CRS payload headers MUST match `CRS1` (concept) or `CRR1` (relation) when linkage metadata declares the type ⇒ `ERR_FLS_PAYLOAD_KIND`. -* Payloads MAY be CRS/1 concepts or CRR/1 relations; FLS/1 envelopes never mutate CRS graphs. - -#### 7.8.2 CRS/1 Concept & Relation TLVs (Normative) - -> **Scope:** Deterministic CRS/1 byte layout as ratified by ADR-020 v1.1.0. All TLVs -> use single-byte tags + single-byte lengths with fixed 32-byte payloads. - -**Concept Header** — `MAGIC="CRS1"`, `VERSION=0x01`, `FLAGS=0x00`, `RSV=0x00`. - -| Tag | Field | Type | Card. | Notes | -| ------ | ------------------ | ---- | ----- | ----- | -| `0x40` | `description_cid` | CID | 1 | Canonical COR/1/BCF descriptor for the concept text/essence. | -| `0x41` | `relations_cid` | CID | 1 | Deterministic list CID of outbound relation CIDs. | - -**Relation Header** — `MAGIC="CRR1"`, `VERSION=0x01`, `FLAGS=0x00`, `RSV=0x00`. 
- -| Tag | Field | Type | Card. | Notes | -| ------ | ----------------- | ---- | ----- | ----- | -| `0x42` | `source_cid` | CID | 1 | Originating Concept CID. | -| `0x43` | `target_cid` | CID | 1 | Destination Concept or artefact CID. | -| `0x44` | `predicate_cid` | CID | 1 | Registered predicate Concept CID. | - -**Validation rules** - -* Headers MUST match the values above; mismatches reject as malformed. -* TLVs MUST appear exactly once in the order listed. Missing or out-of-order - TLVs ⇒ `ERR_CRS_TAG_ORDER` (concept) or `ERR_CRR_TAG_ORDER` (relation). -* Duplicate relation tags ⇒ `ERR_CRR_DUPLICATE_TAG`. -* TLV payloads MUST be exactly 32 bytes ⇒ `ERR_CRS_LENGTH_MISMATCH` / `ERR_CRR_LENGTH_MISMATCH`. -* Unknown tags are rejected ⇒ `ERR_CRS_UNKNOWN_TAG` / `ERR_CRR_UNKNOWN_TAG`. -* `predicate_cid` MUST reference a CRS Concept (`ERR_CRR_PREDICATE_NOT_CONCEPT`). When a predicate taxonomy exists, predicates MUST declare `is_a → Predicate` (`ERR_CRR_PREDICATE_CLASS_MISSING`). - -**Error mapping (ADR-006)** - -| Code | Condition | -| ---- | --------- | -| `ERR_CRS_TAG_ORDER` | Concept TLVs missing, duplicated, or out of order. | -| `ERR_CRS_LENGTH_MISMATCH` | Concept TLV payload not exactly 32 bytes. | -| `ERR_CRS_UNKNOWN_TAG` | Concept TLV tag outside `0x40–0x41`. | -| `ERR_CRR_TAG_ORDER` | Relation TLVs missing, duplicated, or out of order. | -| `ERR_CRR_LENGTH_MISMATCH` | Relation TLV payload not exactly 32 bytes. | -| `ERR_CRR_UNKNOWN_TAG` | Relation TLV tag outside `0x42–0x44`. | -| `ERR_CRR_DUPLICATE_TAG` | Duplicate relation TLV encountered. | -| `ERR_CRR_PREDICATE_NOT_CONCEPT` | `predicate_cid` did not resolve to a CRS Concept. | -| `ERR_CRR_PREDICATE_CLASS_MISSING` | Predicate Concept missing `is_a → Predicate` taxonomy edge. 
| - -**CID derivation** - -``` -concept_cid = SHA-256("CAS:OBJ\0" || bytes(CRS/1 concept record)) -relation_cid = SHA-256("CAS:OBJ\0" || bytes(CRR/1 relation record)) -``` - -Byte-identical records MUST yield identical CIDs; any mutation requires a new -record. - -### 7.9 WT/1 Audited Ticket Intake (Normative) - -WT/1 (ADR-023) captures auditable intent-to-change tickets as an ADR-003 BCF/1 -map. Keys are UTF-8 strings sorted lexicographically; values use canonical BCF -types. - -| Key | Type | Cardinality | Notes | -| -------------- | ----------------- | ----------- | ----- | -| `wt.version` | UINT8 | 1 | MUST equal `1`. | -| `wt.author` | CID (hex string) | 1 | CRS Concept or DID capsule representing the submitting actor. | -| `wt.scope` | CID (hex string) | 1 | ADR-010B domain scope concept CID. | -| `wt.intent` | LIST | 1 | Non-empty ADR-010 intent identifiers; deduped and byte-lexicographically sorted. | -| `wt.payload` | CID (hex string) | 1 | CRS manifest, change plan, or opaque payload describing proposed work. | -| `wt.timestamp` | UINT64 | 1 | Epoch seconds; MUST be monotonic per `wt.author`. | -| `wt.pubkey` | BYTES[32] | 1 | Ed25519 public key used to verify `wt.signature`; MUST bind to `wt.author`. | -| `wt.signature` | BYTES[64] | 1 | Ed25519 signature over `H("AMDUAT:WT\0" || canonical_bytes_without_signature)`. | -| `wt.parent` | CID (hex string) | 0–1 | Optional lineage pointer to the previous WT/1 ticket for the same author. | - -**Encoding rules** - -1. `wt.intent` MUST be encoded as a list of unique UTF-8 strings sorted - lexicographically; duplicates ⇒ `ERR_WT_INTENT_DUP`; entries not registered in - ADR-010 ⇒ `ERR_WT_INTENT_UNREGISTERED`. -2. CIDs serialize as lowercase hex strings (32 bytes → 64 hex chars) matching - `SHA-256("CAS:OBJ\0" || payload)` outputs. -3. `wt.signature` is a 64-byte Ed25519 signature; `wt.pubkey` supplies the - 32-byte verification key. 
The signature domain-separates with - `"AMDUAT:WT\0"` and excludes the `wt.signature` field from the canonical byte - stream hashed for verification. - -**Validation** - -1. Unknown keys ⇒ `ERR_WT_UNKNOWN_KEY`. -2. `wt.version != 1` ⇒ `ERR_WT_VERSION_UNSUPPORTED`. -3. Empty `wt.intent` ⇒ `ERR_WT_INTENT_EMPTY`. -4. `wt.timestamp` less than the prior accepted ticket for the same `wt.author` - ⇒ `ERR_WT_TIMESTAMP`. When `wt.parent` is provided, its timestamp MUST NOT - exceed the child timestamp; violations ⇒ `ERR_WT_TIMESTAMP`. -5. Signature verification failure ⇒ `ERR_WT_SIGNATURE`. -6. Routers MUST verify `has_pubkey(wt.author, wt.pubkey)` (or registered - equivalent) ⇒ missing edge raises `ERR_WT_KEY_UNBOUND`. -7. Unknown ADR-010 intent ⇒ `ERR_WT_INTENT_UNREGISTERED`. -8. Router policy rejection of `wt.scope` ⇒ `ERR_WT_SCOPE_UNAUTHORIZED`. -9. Provided `wt.parent` that cannot be resolved ⇒ `ERR_WT_PARENT_UNKNOWN`. -10. Policy required lineage but omitted `wt.parent` ⇒ `ERR_WT_PARENT_REQUIRED`. - -**Router integration** - -* `POST /wt` (Protected Area) accepts WT/1 payloads, verifies signatures against - `wt.pubkey`, enforces ADR-010 intent membership, validates optional - `wt.parent` lineage, and rejects timestamp regressions. -* `GET /wt/:cid` returns canonical WT/1 bytes for replay. -* `GET /wt?after=&limit=` paginates deterministically by CID - (byte-lexicographic). `after` is an exclusive bound; routers enforce - `1 ≤ limit ≤ Nmax` and MUST preserve stable replay windows. -* Responses MUST include canonical WT/1 bytes; no rewriting or reformatting is - permitted. - -**Evidence & vectors** - -* `/amduat/logs/ph04/evidence/wt1/PH04-EV-WT-001/summary.md` — validator run linking - router behaviour to vectors. -* `/amduat/vectors/ph04/wt1/` — fixtures `TV-WT-001…009` covering success, - unknown key, signature failure, timestamp regression, key unbound, intent - unregistered, parent timestamp inversion, scope policy rejection, and - unresolved parent lineage. 
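The `GET /wt?after=&limit=` pagination rule in §7.9 can be illustrated with a short sketch. It assumes the router already holds the ticket CIDs as a sorted list of lowercase hex strings of equal length (for which string order matches byte-lexicographic order of the underlying bytes); `N_MAX` and `page_after` are hypothetical names, and the value of `Nmax` is not given in the text.

```python
import bisect

N_MAX = 100  # hypothetical upper bound; the spec only names it "Nmax"


def page_after(sorted_cids, after, limit):
    """Return up to `limit` CIDs strictly greater than `after` (exclusive bound)."""
    if not 1 <= limit <= N_MAX:
        raise ValueError("limit must satisfy 1 <= limit <= Nmax")
    # bisect_right skips any entry equal to `after`, which makes the bound exclusive.
    start = 0 if not after else bisect.bisect_right(sorted_cids, after)
    return sorted_cids[start:start + limit]
```

Because the window is defined by CID order rather than by arrival order, replaying the same `after`/`limit` pair over an unchanged store returns the same page.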
- -### 7.10 CT/1 Header (Normative) - -CT/1 headers serialize as ADR-003 BCF/1 maps with fixed key ordering. Keys and -types: - -| Key | Type | Notes | -| --------------------- | -------- | ----- | -| `ct.version` | `UINT8` | MUST equal `1`. | -| `ct.rcs_version` | `UINT8` | RCS/1 core schema version; MUST equal `1`. | -| `ct.topology` | `CID` | CRS/1 topology or manifest CID. | -| `ct.ac` | `CID` | AC/1 descriptor CID (ADR-028). | -| `ct.dtf` | `CID` | DTF/1 policy CID (ADR-028). | -| `ct.determinism_level`| `UINT8` | `0` = D1 (bit-exact), `1` = D2 (numeric stable). | -| `ct.kernel_cfg` | `CID` | Opaque kernel/tolerance configuration manifest. | -| `ct.tick` | `UINT64` | Monotonically increasing replay sequence number. | -| `ct.signature` | `BYTES` | 64-byte Ed25519 signature payload. | - -**Validation** - -1. BCF decode failures ⇒ `ERR_CT_MALFORMED`. -2. Key set/order mismatches ⇒ `ERR_CT_UNKNOWN_KEY`. -3. `ct.version` or `ct.rcs_version` ≠ `1` ⇒ `ERR_CT_VERSION`. -4. `ct.determinism_level ∉ {0,1}` ⇒ `ERR_CT_DET_LEVEL`. -5. Non-canonical CID strings ⇒ `ERR_CT_CID`. -6. `ct.tick` outside `UINT64` range or non-monotone progression ⇒ - `ERR_CT_FIELD_TYPE` / `ERR_CT_TICK`. -7. `ct.signature` length mismatch or Ed25519 verification failure ⇒ - `ERR_CT_SIGNATURE`. - -**Signature rules** - -`ct.signature` signs `H("AMDUAT:CT\0" || canonical_bytes_without_signature)`. Public -keys are registered in the determinism catalogue (this section) and referenced by -`ct.kernel_cfg` as needed for tolerance disclosure. - -**Evidence & vectors** - -* `/amduat/tools/validate/ct1_validator.py` — validation helper covering CT/1, - AC/1, and DTF/1 schemas. -* `/amduat/vectors/ph05/ct1/` — fixtures `TV-CT1-001…004`, `TV-AC1-001…002`, - `TV-DTF1-001…002`. -* `/amduat/tools/ci/ct_replay.py` — replay harness producing - `/amduat/logs/ph05/evidence/ct1/PH05-EV-CT1-REPLAY-001/` (D1 parity + D2 - tolerance runs). 
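A few of the CT/1 validation steps in §7.10 can be sketched over an already-decoded header. This is an illustration under stated assumptions: the header is modelled as a plain `dict` standing in for the decoded BCF/1 map, "monotonically increasing" is read as strictly increasing, and the range failure is mapped to `ERR_CT_FIELD_TYPE` while the ordering failure is mapped to `ERR_CT_TICK`. BCF decoding, CID checks, and Ed25519 verification are left out.

```python
UINT64_MAX = 2**64 - 1


def check_ct_header(header, previous_tick=None):
    """Return the first applicable error code name, or None if these checks pass."""
    if header["ct.version"] != 1 or header["ct.rcs_version"] != 1:
        return "ERR_CT_VERSION"
    if header["ct.determinism_level"] not in (0, 1):
        return "ERR_CT_DET_LEVEL"
    tick = header["ct.tick"]
    if not isinstance(tick, int) or not 0 <= tick <= UINT64_MAX:
        return "ERR_CT_FIELD_TYPE"
    # Assumption: a repeated tick counts as a non-monotone progression.
    if previous_tick is not None and tick <= previous_tick:
        return "ERR_CT_TICK"
    if len(header["ct.signature"]) != 64:
        return "ERR_CT_SIGNATURE"  # length check only; verification not shown
    return None
```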
- -### 7.11 SOS/1 Semantic Overlays (Normative) - -SOS/1 (ADR-024) attaches typed overlays to CRS Concepts or Relations via an -ADR-003 BCF/1 map signed with the `"AMDUAT:SOS\0"` domain separator. - -| Key | Type | Cardinality | Notes | -| -------------- | ------------ | ----------- | ----- | -| `sos.version` | UINT8 | 1 | MUST equal `1`. | -| `sos.subject` | CID (hex) | 1 | CRS Concept or Relation CID receiving the overlay. | -| `sos.predicate`| CID (hex) | 1 | Registered predicate concept describing overlay semantics. | -| `sos.value` | CID (hex) | 1 | Opaque payload (text capsule, BCF/1 manifest, etc.). | -| `sos.policy` | ENUM | 1 | `0=open`, `1=curated`, `2=compat`. | -| `sos.timestamp`| UINT64 | 1 | Epoch seconds when authored. | -| `sos.signature`| BYTES[64] | 1 | Ed25519 signature over `H("AMDUAT:SOS\0" || canonical_bytes_without_signature)`. | - -**Validation** - -1. Unknown keys ⇒ `ERR_SOS_UNKNOWN_KEY`. -2. `sos.version != 1` ⇒ `ERR_SOS_VERSION_UNSUPPORTED`. -3. `sos.predicate` MUST resolve to a registered CRS predicate ⇒ - `ERR_SOS_PREDICATE_UNREGISTERED`. -4. `sos.policy` outside `{0,1,2}` or disallowed for deployment ⇒ - `ERR_SOS_POLICY_INCOMPATIBLE`. -5. Epoch-second timestamps that regress relative to policy baseline MAY raise - `ERR_SOS_TIMESTAMP_REGRESSION`. -6. Signature verification failure ⇒ `ERR_SOS_SIGNATURE_INVALID`. -7. Compat overlays (`sos.policy = 2`) MUST reference MPR/1 + IER/1 artefacts in - certification evidence ⇒ missing references raise - `ERR_SOS_COMPAT_EVIDENCE_REQUIRED`. - -**Router integration** - -* `POST /sos` (Protected Area) validates predicate registry membership, policy - lane, timestamp discipline, and signatures. -* `GET /sos/:cid` returns canonical SOS/1 bytes for replay. -* `GET /sos?subject=&after=&limit=` paginates overlays - deterministically by CID with stable replay windows. -* Compat responses MUST surface referenced MPR/1 hashes and IER/1 fingerprints - for auditors. 
- -**Evidence & vectors** - -* `/amduat/logs/ph04/evidence/sos1/PH04-EV-SOS-001/summary.md` — validator run covering - `TV-SOS-001…006`. -* `/amduat/vectors/ph04/sos1/` — canonical overlay fixtures exercising success, - unregistered predicate, policy mismatch, signature failure, timestamp - regression, and compat evidence gaps. - -### 7.12 MPR/1 Model Provenance (Normative) - -MPR/1 (ADR-025 v1.0.0) captures canonical model fingerprint triples for compat -policy lanes. - -| Key | Type | Cardinality | Notes | -| ------------------ | ------------ | ----------- | ----- | -| `mpr.version` | UINT8 | 1 | MUST equal `1`. | -| `mpr.model_hash` | HEX | 1 | Lowercase hex digest (≥64 chars) of model artefact. | -| `mpr.weights_hash` | HEX | 1 | Lowercase hex digest (≥64 chars) of weights bundle. | -| `mpr.tokenizer_hash` | HEX | 1 | Lowercase hex digest (≥64 chars) of tokenizer assets. | -| `mpr.build_info` | CID *(optional)* | 0..1 | Immutable build metadata capsule. | -| `mpr.signature` | BYTES[64] *(optional)* | 0..1 | Ed25519 signature over `"AMDUAT:MPR\0" || canonical_bytes_without_signature`. | - -**Validation** - -1. Unknown keys ⇒ `ERR_MPR_UNKNOWN_KEY`. -2. `mpr.version != 1` ⇒ `ERR_MPR_VERSION`. -3. Missing hash fields ⇒ `ERR_MPR_MISSING_FIELD`. -4. Hash fields not lowercase hex (≥64) ⇒ `ERR_MPR_HASH_FORMAT`; zero digests ⇒ `ERR_MPR_HASH_ZERO`. -5. `mpr.build_info` malformed ⇒ `ERR_MPR_BUILD_INFO`. -6. Signature verification failure ⇒ `ERR_MPR_SIGNATURE`. - -**Evidence & vectors** - -* `/amduat/logs/ph04/evidence/mpr1/PH04-EV-MPR-001/pass.jsonl` — validator harness (`python tools/ci/run_mpr_vectors.py`) covering `TV-MPR-001…003` with summary in `summary.md`. -* `/amduat/vectors/ph04/mpr1/` — fixtures exercising valid record, missing weights hash, and signature domain mismatch. - -### 7.13 IER/1 Inference Evidence (Normative) - -IER/1 (ADR-026 v1.0.0) binds FER/1 receipts to compat policy envelopes and MPR/1 fingerprints. 
- -| Key | Type | Cardinality | Notes | -| ------------------------ | --------------- | ----------- | ----- | -| `ier.version` | UINT8 | 1 | MUST equal `1`. | -| `ier.fer_cid` | CID | 1 | Referenced FER/1 receipt. | -| `ier.executor_fingerprint` | CID | 1 | MUST equal linked MPR/1 CID. | -| `ier.determinism_level` | ENUM | 1 | FER/1 determinism indicator. | -| `ier.rng_seed` | HEX *(conditional)* | 0..1 | Required (hex) when determinism ≠ `D1`. | -| `ier.policy_cid` | CID | 1 | Compat policy capsule authorising run. | -| `ier.log_digest` | HEX | 1 | `H("AMDUAT:IER:LOG\0" || concat(log.sha256))`. | -| `ier.log_manifest` | MAP *(optional)* | 0..1 | Non-empty list of log entries with `sha256`. | -| `ier.attestations` | LIST *(optional)* | 0..1 | Policy attestations (Ed25519 signatures). | - -**Validation** - -1. Unknown keys ⇒ `ERR_IER_UNKNOWN_KEY`. -2. `ier.version != 1` ⇒ `ERR_IER_VERSION`. -3. Malformed CIDs ⇒ `ERR_IER_POLICY`. -4. `ier.executor_fingerprint` mismatch ⇒ `ERR_IER_FINGERPRINT`. -5. Missing RNG seed when determinism ≠ `D1` ⇒ `ERR_FER_RNG_REQUIRED`. -6. `ier.log_digest` mismatch or malformed manifest ⇒ `ERR_IER_LOG_HASH` / `ERR_IER_LOG_MANIFEST`. -7. Attestation payloads not raw bytes ⇒ `ERR_IER_MALFORMED`. - -**Evidence & vectors** - -* `/amduat/logs/ph04/evidence/ier1/PH04-EV-IER-001/pass.jsonl` — validator harness (`python tools/ci/run_ier_vectors.py`) covering `TV-IER-001…004` with manifest summary in `summary.md`. -* `/amduat/vectors/ph04/ier1/` — fixtures exercising success, missing RNG seed, fingerprint mismatch, and log digest mismatch. - ---- - -## 8 – Test Vectors & Conformance - -### 8.1 COR/1 & ICD/1 - -* Payload → CID (algo `0x01`). -* COR/1 streams → CID and back (round-trip identity). -* ICD/1 → `instance_id`. - -### 8.2 FCS/1 v1-min - -* Positive: `{0x30,0x31,0x32}` only, strict order, valid PCB1, acyclic. -* Negative: any pre-v1-min tags (`0x33/0x34/0x35/0x36`) ⇒ reject per §7.2. -* Arity/PCB mismatch ⇒ `ERR_PCB_ARITY_MISMATCH`. 
-* Cycle ⇒ `ERR_FCS_CYCLE_DETECTED`. -* Negative: legacy tags (`0x33-0x36`) → `ERR_FCS_UNKNOWN_TAG` per §7.2. - -### 8.3 FER/1 - -* Signed receipt with monotonic timestamps; verify signature, executor set ↔ parity alignment, and linkage to FCS/1. -* Negative: timestamp inversion ⇒ `ERR_FER_TIMESTAMP`; bad signature ⇒ `ERR_FER_SIGNATURE`. -* Negative: parity drift (mismatched executor keys or output digests) ⇒ `ERR_IMPL_PARITY`. -* Negative: unknown TLV tag/cardinality ⇒ `ERR_FER_UNKNOWN_TAG`. - -### 8.4 FCT/1 - -* Multiple FER/1 receipts for same function; verify attestation coverage by policy. -* Negative: mismatched receipt function ⇒ `ERR_FCT_RECEIPT_MISMATCH`. -* Negative: missing attestation when policy ≠ Open ⇒ `ERR_FCT_ATTESTATION_REQUIRED`. - -### 8.5 FPD/1 - -* Deterministic reconstruction of `fpd.digest` over `{FCT/1 bytes, FER/1 receipts, governance edge capsule}` on repeated runs. -* Negative: perturbation of member ordering ⇒ `ERR_FPD_DIGEST`. -* Negative: timestamp regression versus FER receipts or parent digest ⇒ `ERR_FPD_TIMESTAMP`. - -**CI Requirements** - -* Import/export **byte-identity** round-trip for COR/1/FCS/1/FER/1. -* Canonical TLV/BCF ordering across descriptors. -* Multi-platform reproducibility (≥3) including signature verification parity. -* Timing evidence captured per SRS FR-020 (deterministic envelope). -* Federation digest fixture verifies stable FPD/1 CID under `tools/ci/fct_publish_check.py`. - ---- - -## 9. Security Considerations - -* Domain separation strings MUST be exact. -* Hash **exact payload bytes**, never decoded structures. -* Canonical rejection prevents ambiguous encodings. -* Certification places policy/intent in signed FCT/1, not in execution recipes. - ---- - -## 10. Change Management - -* **Behavioural semantics are in SRS.** -* Changes here require ADR + CCP approval. -* Versioning follows semantic versioning of encodings. -* On approval, update IDX and SRS references accordingly. - ---- - -## 11. 
ByteStore API & Persistence Discipline - -ByteStore is the canonical persistence boundary layered over COR/1 and ICD/1. -Implementations **must** honour the behaviours in this section; deviations are -governed by ADR-030. - -### 11.1 API Surface - -| API | Signature | Behaviour | Error Surfaces (ADR-006) | -| -------------------- | ---------------------------------------------- | ---------------------------------------------------------------------------------- | ---------------------------------------------------- | -| `put` | `(payload: bytes) → cid_hex` | Persist raw payload under CID derived from `H("CAS:OBJ\0" || payload)`. | `ERR_POLICY_SIZE`, `ERR_IDENTITY_MISMATCH` | -| `put_stream` | `(chunks: Iterable[bytes]) → cid_hex` | Deterministic chunked ingest; concatenated bytes hash to the same CID as `put`. | `ERR_STREAM_ORDER`, `ERR_STREAM_TRUNCATED` | -| `import_cor` | `(envelope: bytes) → cid_hex` | Validate COR/1, enforce policy, persist canonical envelope without re-encoding. | `ERR_POLICY_SIZE`, COR/1 decoder errors | -| `export_cor` | `(cid_hex: str) → envelope` | Return stored COR/1 bytes; must match the original import byte-for-byte. | `ERR_STORE_MISSING`, `ERR_IDENTITY_MISMATCH` | -| `get` | `(cid_hex: str) → bytes` | Return stored bytes (payload or COR envelope) exactly as persisted. | `ERR_STORE_MISSING` | -| `stat` | `(cid_hex: str) → {present: bool, size: int}` | Probe object presence and payload/envelope size without mutating state. | `ERR_STORE_MISSING` (absence reported via `present`) | -| `assert_area_isolation` | `(public_root: Path, secure_root: Path) → None` | Enforce SA/PA separation; raise if roots overlap or share ancestry. | `ERR_AREA_VIOLATION` | - -### 11.2 Deterministic Identity - -Canonical identity is derived per COR/1/SRS: - -``` -cid = algo_id || H("CAS:OBJ\0" || payload) -``` - -`algo_id` defaults to `0x01` (SHA-256). ByteStore **must** reuse the exact -domain separator and hash to remain compatible with CAS and DDS §1. 
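The identity rule in §11.2 can be sketched with the standard library. This is a minimal illustration, not the ByteStore implementation: `derive_cid` and `derive_cid_stream` are hypothetical names, and encoding `algo_id` as a single leading byte is an assumption (the text only gives `algo_id || H(...)` and the default value `0x01`).

```python
import hashlib

DOMAIN = b"CAS:OBJ\x00"  # exact domain separator, including the trailing NUL
ALGO_SHA256 = 0x01       # assumed to occupy one byte in front of the digest


def derive_cid(payload: bytes) -> bytes:
    """cid = algo_id || SHA-256("CAS:OBJ\\0" || payload)"""
    return bytes([ALGO_SHA256]) + hashlib.sha256(DOMAIN + payload).digest()


def derive_cid_stream(chunks) -> bytes:
    """Hash chunks incrementally; the result matches derive_cid on the joined bytes."""
    h = hashlib.sha256(DOMAIN)
    for chunk in chunks:
        h.update(chunk)
    return bytes([ALGO_SHA256]) + h.digest()
```

Hashing the chunks incrementally is what lets a chunked ingest such as `put_stream` arrive at the same CID as `put` without buffering the whole payload.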
- -### 11.3 COR/1 Round-Trip Identity - -`import_cor()` decodes the envelope, enforces policy (size ≤ ICD/1 -`max_object_size`), and persists the canonical bytes. `export_cor()` returns the -exact stored envelope; re-encoding is forbidden. Derived CID **must** equal the -envelope’s CID (DDS §2.5, SRS FR-BS-004). - -### 11.4 Atomic fsync Ladder - -All writes follow the deterministic ladder: - -1. Write payload/envelope to a unique `.tmp-` file in the shard. -2. `fsync(tmp)` to guarantee payload durability. -3. `rename(tmp, final)`. -4. `fsync(shard directory)` and then `fsync(ByteStore root)`. - -Crash-window simulation is exposed via `AMDUAT_BYTESTORE_CRASH_STEP` (“before_rename”). -Implementations **must** honour the hook and leave PA consistent on recovery -(DDS §11.8; vectors TV-BS-005, evidence bundle PH05-EV-BS-001). - -### 11.5 SA/PA Isolation & Pathing - -Public area (PA) payloads live under case-stable two-level fan-out (`/aa/bb/cid…`). -Secure area (SA) metadata is held outside the PA tree. `assert_area_isolation()` -enforces: - -* `public_root != secure_root` -* neither root is an ancestor of the other - -Violations raise `ERR_AREA_VIOLATION` and **must** be surfaced by callers. - -### 11.6 Chunked Ingest Determinism & Policy - -`put_stream()` concatenates byte chunks in order, rejecting non-bytes input or -missing data. The resulting CID **must** equal `put(payload)` for the same -payload (SRS FR-BS-005). ByteStore enforces ICD/1 `max_object_size` prior to -persisting data; exceeding the limit raises `ERR_POLICY_SIZE`. - -### 11.7 Error Mapping - -| Condition | Error Code | Notes | -| ---------------------------------- | --------------------- | -------------------------------------------------------------- | -| Payload exceeds policy limit | `ERR_POLICY_SIZE` | ICD/1 `max_object_size` (ADR-006 policy lane). | -| Streaming chunk type/order invalid | `ERR_STREAM_ORDER` | Non-bytes or out-of-order chunks (deterministic rejection). 
| -| Streaming missing payload | `ERR_STREAM_TRUNCATED`| Zero-length stream without payload. | -| Stored bytes mismatch CID | `ERR_IDENTITY_MISMATCH` | Raised when existing bytes conflict with derived identity. | -| SA/PA overlap | `ERR_AREA_VIOLATION` | Shared roots or ancestry (secure/public crossing). | -| Crash-window hook triggered | `ERR_CRASH_SIMULATION`| Simulated crash prior to rename/fsync ladder completion. | -| Missing object | `ERR_STORE_MISSING` | Reported when an object path is absent. | - -All other errors bubble from COR/1 decoding and map to existing ADR-006 codes -(see §2.7). - -### 11.8 Conformance & Evidence - -* Vectors: `/amduat/vectors/ph05/bytestore/` (`TV-BS-001…005`). -* Runner: `/amduat/tools/ci/bs_check.py` (dual-run determinism; emits JSONL). -* Evidence: `/amduat/logs/ph05/evidence/bytestore/PH05-EV-BS-001/` (runA/runB + - crash summary). -* Linked ADR: ADR-030 (ByteStore Persistence Contract). - ---- - -## Appendix A — Surface Version Table - -| Surface | Version | Notes | -| ------- | ------- | ----- | -| FCS/1 | v1-min | Execution-only descriptor (ADR-016); governance fields live in FCT/1. | -| FER/1 | v1.1 | Parity-first receipts with run_id dedup, executor fingerprints, typed logs, RNG envelope (ADR-017). | -| FCT/1 | v1.0 | Certification transactions binding policy/intent/attestations; publishes FER/1 receipts. | -| FPD/1 | v1.0 | Single-digest publication capsule linking FCT/1 and FER/1 sets. | - ---- - -**End of DDS 0.5.0** - ---- - -## Document History - -* 0.2.1 (2025-10-26) — Updated Phase Pack references; byte semantics unchanged; ADR-012 no-normalization. - -* 0.2.2 (2025-10-26) — Promoted PH01 design surfaces to Approved; synchronized anchors. - -* 0.2.3 (2025-10-27) — Marked DDS scope as PH01-only and referenced FPS/1 surfaces. - -* **0.2.4 (2025-11-14):** Added FCS/1 & PCB1 TLVs plus FER/1 receipt and FCT/1 transaction schemas with rejection mapping. 
- -* **0.2.5 (2025-11-15):** Registered PCB1 header invariants and arity/cycle validation errors. - -* **0.2.6 (2025-11-19):** Registered `ERR_EXEC_TIMEOUT` for deterministic timing envelope. - -* **0.3.0 (2025-11-02):** Trimmed **FCS/1 to v1-min** (execution recipe only: `function_ptr`, `parameter_block`, `arity`). Moved **intent/roles/scope/policy** to **FCT/1**; clarified provenance lives in **FER/1**. Added rejection guidance for legacy FCS tags. - -* **0.3.1 (2025-11-20):** Registered `ERR_FCS_UNKNOWN_TAG`; clarified that any legacy governance tag in FCS/1 is a hard rejection. No other layout changes. -* **0.3.2 (2025-11-21):** Adopted parity-first FER/1 TLVs (executor set, parity vector, context/witness hooks), registered `ERR_IMPL_PARITY` and `ERR_FER_UNKNOWN_TAG`, and refreshed conformance guidance. -* **0.3.3 (2025-11-22):** Added FPD/1 publication digest schema, registered federation digest/timestamp errors, and wired CI fixtures to deterministic publish checks. - -* **0.3.5 (2025-11-07):** Added surface version table and aligned FER/1 v1.1 maintenance metadata for Phase 04 handoff. - -* **0.3.6 (2025-11-08):** Seeded PH04 linkage & semantic placeholder section (DDS §7.8). - -* **0.3.7 (2025-11-08):** Seeded FLS/1 placeholder TLV table aligned with ADR-018 v0.3.0. -* **0.3.8 (2025-11-08):** Registered FLS/1 TLV registry (0x60–0x65), error mapping, and conformance vectors aligned with ADR-018 v0.4.0. -* **0.3.9 (2025-11-09):** Locked CRS/1 concept/relation TLVs and registered FLS payload CID/type errors with conformance evidence. - -* **0.4.0 (2025-11-08):** Promoted §7.8 FLS/1 & CRS/1 TLVs with error mapping and GS/1 snapshot evidence. - -* **0.4.1 (2025-11-09):** Extended CRS predicate rules and mapped new validation errors -* **0.4.2 (2025-11-09):** Registered router error codes (`ERR_FLS_UNKNOWN_TAG`, `ERR_FLS_TAG_ORDER`, `ERR_FLS_SIGNATURE`) and FPD parent-policy errors with GS diff evidence pointer. 
-* **0.4.3 (2025-11-09):** Added WT/1 intake layout, validation errors, and router API integration (§7.9). -* **0.4.4 (2025-11-20):** Refined WT/1 (§7.9) with `wt.pubkey`, signature preimage exclusion, lineage/policy errors, and - expanded validator vector coverage. - -* **0.4.6 (2025-11-22):** WT/1 and SOS/1 conformance evidence sealed via PH04-M4/M5 audit bundles. -* **0.4.5 (2025-11-21):** Registered SOS/1 overlays (§7.10) with compat evidence enforcement, aligned WT/1 error mapping (`ERR_WT_KEY_UNBOUND`, `ERR_WT_INTENT_UNREGISTERED`, `ERR_WT_PARENT_REQUIRED`), and expanded vector coverage to `TV-WT-001…009`. - -* **0.4.7 (2025-11-23):** Documented MPR/1 and IER/1 schemas, error surfaces, and validator evidence for compat policy lane. - -* **0.4.8 (2025-11-24):** Added §7.10 CT/1 header schema with error codes and renumbered downstream sections for PH05 replay. - -* **0.5.0 (2025-11-11):** Added §11 ByteStore API & Persistence discipline covering API surface, fsync ladder, SA/PA isolation, streaming determinism, and ADR-006 error mapping. +This placeholder avoids drift between repos. diff --git a/tier1/enc-asl-core-index.md b/tier1/enc-asl-core-index.md index 3c95aaa..a37a8f8 100644 --- a/tier1/enc-asl-core-index.md +++ b/tier1/enc-asl-core-index.md @@ -1,321 +1,5 @@ -# ENC-ASL-CORE-INDEX +# ENC/ASL-CORE-INDEX/1 — moved -### Encoding Specification for ASL Core Index +Canonical spec: `vendor/amduat/tier1/enc-asl-core-index-1.md`. ---- - -## 1. Purpose - -This document defines the **exact encoding of ASL index segments** and records for storage and interoperability. - -It translates the **semantic model of ASL/1-CORE-INDEX** and **store contracts of ASL-STORE-INDEX** into a deterministic **bytes-on-disk layout**. -Variable-length digest requirements are defined in ASL/1-CORE-INDEX (`tier1/asl-core-index.md`). -This document incorporates the federation encoding addendum. 
- -It is intended for: - -* C libraries -* Tools -* API frontends -* Memory-mapped access - -It does **not** define: - -* Index semantics (see ASL/1-CORE-INDEX) -* Store lifecycle behavior (see ASL-STORE-INDEX) -* Acceleration semantics (see ASL/INDEX-ACCEL/1) -* TGK edge semantics or encodings (see `TGK/1` and `TGK/1-CORE`) -* Federation semantics (see federation/domain policy layers) - ---- - -## 2. Encoding Principles - -1. **Little-endian** representation -2. **Fixed-width fields** for deterministic access -3. **No pointers or references**; all offsets are file-relative -4. **Packed structures**; no compiler-introduced padding -5. **Forward compatibility** via version field -6. **CRC or checksum protection** for corruption detection -7. **Federation metadata** embedded in index records for deterministic cross-domain replay - -All multi-byte integers are little-endian unless explicitly noted. - ---- - -## 3. Segment Layout - -Each index segment file is laid out as follows: - -``` -+------------------+ -| SegmentHeader | -+------------------+ -| BloomFilter[] | (optional, opaque to semantics) -+------------------+ -| IndexRecord[] | -+------------------+ -| DigestBytes[] | -+------------------+ -| ExtentRecord[] | -+------------------+ -| SegmentFooter | -+------------------+ -``` - -* **SegmentHeader**: fixed-size, mandatory -* **BloomFilter**: optional, opaque, segment-local -* **IndexRecord[]**: array of index entries -* **DigestBytes[]**: concatenated digest bytes referenced by IndexRecord -* **ExtentRecord[]**: concatenated extent lists referenced by IndexRecord -* **SegmentFooter**: fixed-size, mandatory - -Offsets in the header define locations of Bloom filter and index records. 
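The segment layout above can be turned into file offsets with simple arithmetic. The sketch below uses the fixed struct sizes that §3.1 of this document lists (112, 48, 16, and 24 bytes); the function name is hypothetical, and placing the Bloom filter directly after the header is inferred from the layout diagram rather than stated as a formula.

```python
HEADER_SIZE = 112
INDEX_RECORD_SIZE = 48
EXTENT_RECORD_SIZE = 16
FOOTER_SIZE = 24


def section_offsets(bloom_size, record_count, digests_size, extent_count):
    """Compute gap-free, file-relative section offsets for one index segment."""
    records_offset = HEADER_SIZE + bloom_size
    digests_offset = records_offset + record_count * INDEX_RECORD_SIZE
    extents_offset = digests_offset + digests_size
    footer_offset = extents_offset + extent_count * EXTENT_RECORD_SIZE
    offsets = {
        # 0 means "no Bloom filter"; otherwise it follows the header (inferred).
        "bloom_offset": HEADER_SIZE if bloom_size else 0,
        "records_offset": records_offset,
        "digests_offset": digests_offset,
        "extents_offset": extents_offset,
        "footer_offset": footer_offset,
        "file_size": footer_offset + FOOTER_SIZE,
    }
    for name, value in offsets.items():
        if value % 8 != 0:
            raise ValueError(f"{name} is not 8-byte aligned: {value}")
    return offsets
```

Since the record and extent sizes are multiples of 8, alignment in this sketch can only be broken by `bloom_size` or `digests_size`, so a writer would need to pad those two sections.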
- -### 3.1 Fixed Constants and Sizes - -**Magic bytes (SegmentHeader.magic):** `ASLIDX03` - -* ASCII bytes: `0x41 0x53 0x4c 0x49 0x44 0x58 0x30 0x33` -* Little-endian uint64 value: `0x33305844494c5341` - -**Current encoding version:** `3` - -**Fixed struct sizes (bytes):** - -* `SegmentHeader`: 112 -* `IndexRecord`: 48 -* `ExtentRecord`: 16 -* `SegmentFooter`: 24 - -**Section packing (no gaps):** - -* `records_offset = header_size + bloom_size` -* `digests_offset = records_offset + (record_count * sizeof(IndexRecord))` -* `extents_offset = digests_offset + digests_size` -* `SegmentFooter` starts at `extents_offset + (extent_count * sizeof(ExtentRecord))` - -All offsets MUST be file-relative, 8-byte aligned, and point to their respective arrays exactly as above. - -### 3.2 Federation Defaults - -This encoding integrates federation metadata into segments and records. - -Legacy segments without federation fields MUST be treated as: - -* `segment_domain_id = local` -* `segment_visibility = internal` -* `domain_id = local` -* `visibility = internal` -* `has_cross_domain_source = 0` -* `cross_domain_source = 0` - ---- - -## 4. 
SegmentHeader - -```c -#pragma pack(push,1) -typedef struct { - uint64_t magic; // Unique magic number identifying segment file type - uint16_t version; // Encoding version - uint16_t shard_id; // Optional shard identifier - uint32_t header_size; // Total size of header including fields below - - uint64_t snapshot_min; // Minimum snapshot ID for which segment entries are valid - uint64_t snapshot_max; // Maximum snapshot ID - - uint64_t record_count; // Number of index entries - uint64_t records_offset; // File offset of IndexRecord array - - uint64_t bloom_offset; // File offset of bloom filter (0 if none) - uint64_t bloom_size; // Size of bloom filter (0 if none) - - uint64_t digests_offset; // File offset of DigestBytes array - uint64_t digests_size; // Total size in bytes of DigestBytes - - uint64_t extents_offset; // File offset of ExtentRecord array - uint64_t extent_count; // Total number of ExtentRecord entries - - uint32_t segment_domain_id; // Domain owning this segment - uint8_t segment_visibility; // 0 = internal, 1 = published - uint8_t federation_version; // 0 if unused - uint16_t reserved0; // Reserved (must be 0) - - uint64_t flags; // Segment flags (must be 0 in version 3) -} SegmentHeader; -#pragma pack(pop) -``` - -**Notes:** - -* `magic` ensures the reader validates the segment type. -* `version` allows forward-compatible extension. -* `snapshot_min` / `snapshot_max` are reserved for future use and carry no visibility semantics in version 3. -* `segment_domain_id` identifies the owning domain for all records in this segment. -* `segment_visibility` MUST be the maximum visibility of all records in the segment. -* `federation_version` MUST be `0` unless a future federation encoding version is defined. -* `reserved0` MUST be `0`. -* `header_size` MUST be `112`. -* `flags` MUST be `0`. Readers MUST reject non-zero values. - ---- - -## 5. 
IndexRecord - -```c -#pragma pack(push,1) -typedef struct { - uint32_t hash_id; // Hash algorithm identifier - uint16_t digest_len; // Digest length in bytes - uint16_t reserved0; // Reserved for alignment/future use - uint64_t digest_offset; // File offset of digest bytes for this entry - - uint64_t extents_offset; // File offset of first ExtentRecord for this entry - uint32_t extent_count; // Number of ExtentRecord entries for this artifact - uint32_t total_length; // Total artifact length in bytes - - uint32_t domain_id; // Domain identifier for this artifact - uint8_t visibility; // 0 = internal, 1 = published - uint8_t has_cross_domain_source; // 0 or 1 - uint16_t reserved1; // Reserved (must be 0) - - uint32_t cross_domain_source; // Source domain if imported (valid if has_cross_domain_source=1) - uint32_t flags; // Optional flags (tombstone, reserved, etc.) -} IndexRecord; -#pragma pack(pop) -``` - -**Notes:** - -* `hash_id` + `digest_len` + `digest_offset` store the artifact key deterministically. -* `digest_len` MUST be explicit in the encoding and MUST match the length implied by `hash_id` and StoreConfig. -* `digest_offset` MUST be within `[digests_offset, digests_offset + digests_size)`. -* `extents_offset` references the first ExtentRecord for this entry. -* `extent_count` defines how many extents to read (may be 0 for tombstones; see ASL/1-CORE-INDEX in `tier1/asl-core-index.md`). -* `total_length` is the exact artifact size in bytes. -* Flags may indicate tombstone or other special status. -* `domain_id` MUST be present and stable across replay. -* `visibility` MUST be `0` or `1`. -* `has_cross_domain_source` MUST be `0` or `1`. -* `cross_domain_source` MUST be `0` when `has_cross_domain_source=0`. -* `reserved0` and `reserved1` MUST be `0`. - -### 5.1 IndexRecord Flags - -``` -IDX_FLAG_TOMBSTONE = 0x00000001 -``` - -* If `IDX_FLAG_TOMBSTONE` is set, then `extent_count`, `total_length`, and `extents_offset` MUST be `0`. 
-* All other bits are reserved and MUST be `0`. Readers MUST reject unknown flag bits. -* Tombstones MUST retain valid `domain_id` and `visibility` to ensure domain-local shadowing. - ---- - -## 6. ExtentRecord - -```c -#pragma pack(push,1) -typedef struct { - uint64_t block_id; // ASL block identifier - uint32_t offset; // Offset within block - uint32_t length; // Length of this extent -} ExtentRecord; -#pragma pack(pop) -``` - -**Notes:** - -* Extents are concatenated in order to produce artifact bytes. -* `extent_count` MUST be > 0 for visible (non-tombstone) entries. -* `total_length` MUST equal the sum of `length` across the extents. -* `offset` and `length` MUST describe a contiguous slice within the referenced block. - ---- - -## 7. SegmentFooter - -```c -#pragma pack(push,1) -typedef struct { - uint64_t crc64; // CRC over header + bloom filter + index records + digest bytes + extents - uint64_t seal_snapshot; // Snapshot ID when segment was sealed - uint64_t seal_time_ns; // High-resolution seal timestamp -} SegmentFooter; -#pragma pack(pop) -``` - -**Notes:** - -* CRC ensures corruption detection during reads, covering all segment contents except the footer. -* Seal information allows deterministic reconstruction of CURRENT state. - ---- - -## 8. DigestBytes - -* Digest bytes are concatenated in a single byte array. -* Each IndexRecord references its digest via `digest_offset` and `digest_len`. -* The digest bytes MUST be immutable once the segment is sealed. - ---- - -## 9. Bloom Filter - -* The bloom filter is **optional** and opaque to semantics. -* Its purpose is **lookup acceleration**. -* Must be deterministic: same entries → same bloom representation. -* Segment-local only; no global assumptions. - ---- - -## 10. Versioning and Compatibility - -* `version` field in header defines encoding. -* Readers must **reject unsupported versions**. -* New fields may be added in future versions only via version bump. 
-* Existing fields must **never change meaning**. -* Version `1` implies single-extent layout (legacy). -* Version `2` introduces `ExtentRecord` lists and `extents_offset` / `extent_count`. -* Version `3` introduces variable-length digest bytes with `hash_id` and `digest_offset`. -* Version `3` also integrates federation metadata in segment headers and index records. - -### 10.1 Federation Compatibility Rules - -* Legacy segments without federation fields are treated as local/internal (see 3.2). -* Tombstones MUST NOT shadow artifacts from other domains; domain matching is required. - ---- - -## 11. Alignment and Packing - -* All structures are **packed** (no compiler padding) -* Multi-byte integers are **little-endian** -* Memory-mapped readers can directly index `IndexRecord[]` using `records_offset`. -* Extents are accessed via `IndexRecord.extents_offset` relative to the file base. - ---- - -## 12. Summary of Encoding Guarantees - -The ENC-ASL-CORE-INDEX specification ensures: - -1. **Deterministic layout** across platforms -2. **Direct mapping from semantic model** (ArtifactKey → ArtifactLocation) -3. **Immutability of sealed segments** -4. **Integrity validation** via CRC -5. **Forward-compatible extensibility** - ---- - -## 13. Relationship to Other Layers - -| Layer | Responsibility | -| ------------------ | ---------------------------------------------------------- | -| ASL/1-CORE-INDEX | Defines semantic meaning of artifact → location mapping | -| ASL-STORE-INDEX | Defines lifecycle, visibility, and replay contracts | -| ASL/INDEX-ACCEL/1 | Defines routing, filters, sharding (observationally inert) | -| ENC-ASL-CORE-INDEX | Defines exact bytes-on-disk format for segment persistence | - -This completes the stack: **semantics → store behavior → encoding**. +This placeholder avoids drift between repos. 
diff --git a/tier1/enc-asl-log.md b/tier1/enc-asl-log.md index 909bcb4..489b276 100644 --- a/tier1/enc-asl-log.md +++ b/tier1/enc-asl-log.md @@ -1,213 +1,5 @@ -# ENC-ASL-LOG +# ENC/ASL-LOG/1 — moved -### Encoding Specification for ASL Append-Only Log +Canonical spec: `vendor/amduat/tier1/enc-asl-log-1.md`. ---- - -## 1. Purpose - -This document defines the **exact encoding** of the ASL append-only log. - -It translates **ASL/LOG/1** semantics into a deterministic **bytes-on-disk** format. - -It does **not** define log semantics (see `ASL/LOG/1`). - ---- - -## 2. Encoding Principles - -1. **Little-endian** integers -2. **Packed structures** (no compiler padding) -3. **Forward-compatible** versioning via header fields -4. **Deterministic serialization**: identical log content -> identical bytes -5. **Hash-chained integrity** as defined by ASL/LOG/1 - ---- - -## 3. Log File Layout - -``` -+----------------+ -| LogHeader | -+----------------+ -| LogRecord[] | -+----------------+ -``` - -* **LogHeader**: fixed-size, mandatory, begins file -* **LogRecord[]**: append-only entries, variable number - ---- - -## 4. LogHeader - -```c -#pragma pack(push,1) -typedef struct { - uint64_t magic; // "ASLLOG01" - uint32_t version; // Encoding version (1) - uint32_t header_size; // Total header bytes including this struct - uint64_t flags; // Reserved, must be zero for v1 -} LogHeader; -#pragma pack(pop) -``` - -Notes: - -* `magic` is ASCII bytes: `0x41 0x53 0x4c 0x4c 0x4f 0x47 0x30 0x31` -* `version` allows forward compatibility - ---- - -## 5. 
LogRecord Envelope - -Each record is encoded as: - -```c -#pragma pack(push,1) -typedef struct { - uint64_t logseq; // Monotonic sequence number - uint32_t record_type; // Record type tag - uint32_t payload_len; // Payload byte length - uint8_t payload[payload_len]; - uint8_t record_hash[32]; // Hash-chained integrity (SHA-256) -} LogRecord; -#pragma pack(pop) -``` - -Hash chain rule (normative): - -``` -record_hash = H(prev_record_hash || logseq || record_type || payload_len || payload) -``` - -* `prev_record_hash` is the previous record's `record_hash` -* For the first record, `prev_record_hash` is 32 bytes of zero -* `H` is SHA-256 for v1 - -Readers MUST skip unknown `record_type` values using `payload_len` and MUST -continue replay without failure. - ---- - -## 6. Record Type IDs (v1) - -These type IDs bind the ASL/LOG/1 semantics to bytes-on-disk: - -| Type ID | Record Type | -| ------- | ------------------ | -| 0x01 | SEGMENT_SEAL | -| 0x10 | TOMBSTONE | -| 0x11 | TOMBSTONE_LIFT | -| 0x20 | SNAPSHOT_ANCHOR | -| 0x30 | ARTIFACT_PUBLISH | -| 0x31 | ARTIFACT_UNPUBLISH | - ---- - -## 6.1 Payload Schemas (v1) - -All payloads are little-endian and packed. Variable-length fields are encoded -inline and accounted for by `payload_len`. - -### 6.1.1 ArtifactRef - -```c -#pragma pack(push,1) -typedef struct { - uint32_t hash_id; // Hash algorithm identifier - uint16_t digest_len; // Digest length in bytes - uint16_t reserved0; // Must be 0 - uint8_t digest[digest_len]; -} ArtifactRef; -#pragma pack(pop) -``` - -Notes: - -* `digest_len` MUST be > 0. -* If StoreConfig fixes the hash, `digest_len` MUST match that hash's length. 
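
Because `digest` is variable-length, `ArtifactRef` cannot be read by casting a pointer to a fixed struct. The sketch below shows one way a reader might decode it from a payload buffer, applying the two checks the schema states (`digest_len > 0`, `reserved0 == 0`). The `artifact_ref_view` type and the function names are hypothetical; checking `digest_len` against the StoreConfig hash is assumed to happen elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view; `digest` points into the caller's buffer. */
typedef struct {
    uint32_t hash_id;
    uint16_t digest_len;
    const uint8_t *digest;
} artifact_ref_view;

/* Portable little-endian reads (no alignment or host-endianness assumptions). */
static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decodes one ArtifactRef from buf[0..len).
 * Returns the number of bytes consumed, or 0 if the input is malformed
 * (truncated, zero-length digest, or non-zero reserved field). */
static size_t artifact_ref_parse(const uint8_t *buf, size_t len,
                                 artifact_ref_view *out)
{
    const size_t fixed = 8; /* hash_id (4) + digest_len (2) + reserved0 (2) */
    if (len < fixed)
        return 0;

    uint32_t hash_id    = rd_le32(buf);
    uint16_t digest_len = rd_le16(buf + 4);
    uint16_t reserved0  = rd_le16(buf + 6);

    if (digest_len == 0 || reserved0 != 0)
        return 0;
    if (len - fixed < digest_len)
        return 0;

    out->hash_id    = hash_id;
    out->digest_len = digest_len;
    out->digest     = buf + fixed;
    return fixed + digest_len;
}
```

Returning the consumed length lets a payload decoder continue with the fields that follow the reference, such as `scope` and `reason_code` in a tombstone payload.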
- -### 6.1.2 SEGMENT_SEAL (Type 0x01) - -```c -#pragma pack(push,1) -typedef struct { - uint64_t segment_id; // Store-local segment identifier - uint8_t segment_hash[32]; // SHA-256 over the segment file bytes -} SegmentSealPayload; -#pragma pack(pop) -``` - -### 6.1.3 TOMBSTONE (Type 0x10) - -```c -#pragma pack(push,1) -typedef struct { - ArtifactRef artifact; - uint32_t scope; // Opaque to ASL/LOG/1 - uint32_t reason_code; // Opaque to ASL/LOG/1 -} TombstonePayload; -#pragma pack(pop) -``` - -### 6.1.4 TOMBSTONE_LIFT (Type 0x11) - -```c -#pragma pack(push,1) -typedef struct { - ArtifactRef artifact; - uint64_t tombstone_logseq; // logseq of the tombstone being lifted -} TombstoneLiftPayload; -#pragma pack(pop) -``` - -### 6.1.5 SNAPSHOT_ANCHOR (Type 0x20) - -```c -#pragma pack(push,1) -typedef struct { - uint64_t snapshot_id; - uint8_t root_hash[32]; // Hash of snapshot-visible state -} SnapshotAnchorPayload; -#pragma pack(pop) -``` - -### 6.1.6 ARTIFACT_PUBLISH (Type 0x30) - -```c -#pragma pack(push,1) -typedef struct { - ArtifactRef artifact; -} ArtifactPublishPayload; -#pragma pack(pop) -``` - -### 6.1.7 ARTIFACT_UNPUBLISH (Type 0x31) - -```c -#pragma pack(push,1) -typedef struct { - ArtifactRef artifact; -} ArtifactUnpublishPayload; -#pragma pack(pop) -``` - ---- - -## 7. Versioning Rules - -* `version = 1` for this specification. -* New record types MAY be added without bumping the version. -* Layout changes to `LogHeader` or `LogRecord` require a new version. - ---- - -## 8. Relationship to Other Layers - -| Layer | Responsibility | -| ---------------- | ------------------------------------------------ | -| ASL/LOG/1 | Semantic log behavior and replay rules | -| ASL-STORE-INDEX | Store lifecycle and snapshot/log contracts | -| ENC-ASL-LOG | Exact byte layout for log encoding (this doc) | -| ENC-ASL-CORE-INDEX | Exact byte layout for index segments | +This placeholder avoids drift between repos. 
diff --git a/tier1/enc-asl-tgk-exec-plan-1.md b/tier1/enc-asl-tgk-exec-plan-1.md index 35f7687..cee876c 100644 --- a/tier1/enc-asl-tgk-exec-plan-1.md +++ b/tier1/enc-asl-tgk-exec-plan-1.md @@ -1,183 +1,5 @@ -# ENC-ASL-TGK-EXEC-PLAN/1 -- Execution Plan Encoding +# ENC/ASL-TGK-EXEC-PLAN/1 — moved -Status: Draft -Owner: Architecture -Version: 0.1.0 -SoT: No -Last Updated: 2025-01-17 -Tags: [encoding, execution, tgk] +Canonical spec: `vendor/amduat/tier1/enc-asl-tgk-exec-plan-1.md`. -**Document ID:** `ENC-ASL-TGK-EXEC-PLAN/1` -**Layer:** L2 -- Execution plan encoding (bytes-on-disk) - -**Depends on (normative):** - -* `ASL/TGK-EXEC-PLAN/1` - -**Informative references:** - -* `ENC-ASL-CORE-INDEX` - ---- - -## 0. Conventions - -The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119. - -ENC-ASL-TGK-EXEC-PLAN/1 defines the byte-level encoding for serialized execution plans. It does not define operator semantics. - ---- - -## 1. Operator Type Enumeration - -```c -typedef enum { - OP_SEGMENT_SCAN, - OP_INDEX_FILTER, - OP_MERGE, - OP_PROJECTION, - OP_TGK_TRAVERSAL, - OP_AGGREGATION, - OP_LIMIT_OFFSET, - OP_SHARD_DISPATCH, - OP_SIMD_FILTER, - OP_TOMBSTONE_SHADOW -} operator_type_t; -``` - ---- - -## 2. Operator Flags - -```c -typedef enum { - OP_FLAG_NONE = 0x00, - OP_FLAG_PARALLEL = 0x01, // shard or SIMD capable - OP_FLAG_OPTIONAL = 0x02 // optional operator (acceleration) -} operator_flags_t; -``` - ---- - -## 3. Snapshot Range Structure - -```c -typedef struct { - uint64_t logseq_min; // inclusive - uint64_t logseq_max; // inclusive -} snapshot_range_t; -``` - ---- - -## 4. 
Operator Parameter Union - -```c -typedef struct { - // SegmentScan parameters - struct { - uint8_t is_asl_segment; // 1 = ASL, 0 = TGK - uint64_t segment_start_id; - uint64_t segment_end_id; - } segment_scan; - - // IndexFilter parameters - struct { - uint32_t artifact_type_tag; - uint8_t has_type_tag; - uint32_t edge_type_key; - uint8_t has_edge_type; - uint8_t role; // 0=none, 1=from, 2=to, 3=both - } index_filter; - - // Merge parameters - struct { - uint8_t deterministic; // 1 = logseq ascending + canonical key - } merge; - - // Projection parameters - struct { - uint8_t project_artifact_id; - uint8_t project_tgk_edge_id; - uint8_t project_node_id; - uint8_t project_type_tag; - } projection; - - // TGKTraversal parameters - struct { - uint64_t start_node_id; - uint32_t traversal_depth; - uint8_t direction; // 1=from, 2=to, 3=both - } tgk_traversal; - - // Aggregation parameters - struct { - uint8_t agg_count; - uint8_t agg_union; - uint8_t agg_sum; - } aggregation; - - // LimitOffset parameters - struct { - uint64_t limit; - uint64_t offset; - } limit_offset; - - // ShardDispatch & SIMDFilter are handled via flags -} operator_params_t; -``` - ---- - -## 5. Operator Definition Structure - -```c -typedef struct operator_def { - uint32_t op_id; // unique operator ID - operator_type_t op_type; // operator type - operator_flags_t flags; // parallel/optional flags - snapshot_range_t snapshot; // snapshot bounds for deterministic execution - operator_params_t params; // operator-specific parameters - - uint32_t input_count; // number of upstream operators - uint32_t inputs[8]; // list of op_ids for input edges (DAG) -} operator_def_t; -``` - -Notes: - -* `inputs` defines DAG dependencies. -* The maximum input fan-in is 8 for v1. - ---- - -## 6. 
Execution Plan Structure - -```c -typedef struct exec_plan { - uint32_t plan_version; // version of plan encoding - uint32_t operator_count; // number of operators - operator_def_t *operators; // array of operator definitions -} exec_plan_t; -``` - -Operators SHOULD be serialized in topological order when possible. - ---- - -## 7. Serialization Rules (Normative) - -* All integers are little-endian. -* Operators MUST be serialized in a deterministic order. -* `operator_count` MUST match the serialized operator array length. -* `inputs[]` MUST reference valid `op_id` values within the plan. - ---- - -## 8. Non-Goals - -ENC-ASL-TGK-EXEC-PLAN/1 does not define: - -* Runtime scheduling or execution -* Query languages or APIs -* Operator semantics beyond parameter layout +This placeholder avoids drift between repos. diff --git a/tier1/srs.md b/tier1/srs.md index 20967e0..b6dff28 100644 --- a/tier1/srs.md +++ b/tier1/srs.md @@ -1,518 +1,5 @@ -# AMDUAT-SRS — Detailed Requirements Specification +# AMDUAT-SRS — moved -Status: Approved | Owner: Niklas Rydberg | Version: 0.4.0 | Last Updated: 2025-11-11 | SoT: Yes -Tags: [requirements, cas, kheper] +Canonical spec: `vendor/amduat/tier1/srs.md`. -> **Purpose:** Capture normative behavioural requirements for Phase PH01 (Kheper) and beyond. Long-lived semantics live here (not in Phase Packs). - ---- - -## 1. Objectives (from Tier-0 Charter; elaborated) - -* Deterministic addressing: identical payload bytes **MUST** yield identical CIDs. -* Immutability: new bytes → new CID; objects MUST NOT be mutated in place. -* Integrity by design: `verify()` MUST detect corruption; zero false positives. -* Instance isolation: storage layout and runtime state are implementation detail. -* Binary canonical substrate: COR/1 is the normative import/export envelope. -* Instance identity: ICD/1 defines stable `instance_id` for future transaction bindings. -* Crypto agility: default SHA-256; algorithm IDs extensible. 
-* Minimal tooling: reference CLI (`amduatcas`) and C library. -* Conformance: golden vectors and cross-impl CI enforce byte-identity. - ---- - -## 2. Scope (Behavioural) - -### 2.1 In Scope - -* Local, single-node Content-Addressable Storage (CAS) -* Deterministic hashing with domain separation -* Canonical envelopes (COR/1) and instance descriptor (ICD/1) -* CRUD-adjacent operations: put/get/stat/exists/verify -* Import/export of canonical bytestreams -* Optional listing/gc semantics - -### 2.2 Out of Scope (for PH01) - -* Networking, replication, consensus -* Multi-object transactions -* Semantic/provenance graphing -* Encryption/ACLs (layer externally) - ---- - -## 3. Functional Requirements - -### FR-001 Deterministic CID Production - -Given identical payload bytes and algo_id, the CID **MUST** match across compliant implementations. - -### FR-002 Immutability - -Objects **MUST NOT** be mutated; new payload → new CID. - -### FR-003 Idempotent Put - -Concurrent `put()` of identical payload MUST yield one canonical object; object integrity preserved. - -### FR-004 Verification - -`verify(CID)` MUST recompute the CID and detect corruption; zero false positives. - -### FR-005 Import/Export Canonicality - -Importing COR/1 and then exporting it MUST yield byte-identical bytestreams. - -### FR-006 Size Validation - -`get()` MUST validate payload length according to COR/1. - -### FR-007 Optional Verify-on-Read Policy - -Policy MAY require verify for cold reads; MUST NOT corrupt payload if disabled. - -### FR-008 Canonical Rejection - -CAS decoders MUST reject: - -* out-of-order TLV tags -* duplicate TLV tags -* extraneous tags -* trailing bytes -* malformed or over-long VARINT encodings -* payload length mismatches - -Rejection MUST be deterministic and symbolic. - -### FR-009 Concurrency Discipline - -Concurrent `put()` operations for identical payloads MUST NOT yield divergent COR/1 envelopes. Only one canonical envelope may result. 
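
The tag-ordering and duplicate-tag rejections listed under FR-008 might be checked as below. This is a sketch under assumptions: it treats canonical order as strictly ascending tag values and takes the tags as an already-extracted array, since the COR/1 TLV wire format is not described in this text. The status names are hypothetical stand-ins for whatever symbolic errors the real decoder defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbolic results; the real error names are not given here. */
typedef enum {
    TLV_OK = 0,
    TLV_ERR_DUPLICATE_TAG,
    TLV_ERR_OUT_OF_ORDER
} tlv_order_status;

/* Checks that tags appear in strictly ascending order.
 * An adjacent repeat is reported as a duplicate; any other decrease
 * (including a non-adjacent repeat such as 1,2,1) is reported as
 * out-of-order. Either way the envelope is rejected, and the same
 * input always yields the same status. */
static tlv_order_status tlv_check_tag_order(const uint32_t *tags, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        if (tags[i] == tags[i - 1])
            return TLV_ERR_DUPLICATE_TAG;
        if (tags[i] < tags[i - 1])
            return TLV_ERR_OUT_OF_ORDER;
    }
    return TLV_OK;
}
```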
- -### FR-010 Raw Byte Semantics - -CAS MUST operate strictly over exact payload bytes. No normalization (newline, whitespace, UTF-8 interpretation, or Unicode equivalence) SHALL occur. - -### FR-011 Filesystem Independence - -Consensus behaviour MUST NOT depend on: - -* directory entry ordering -* timestamp metadata -* filesystem case sensitivity -* locale or regional configuration - -### FR-012 Deterministic Failure - -Malformed objects MUST be rejected. CAS MUST NOT auto-repair or normalize COR/1 envelopes. - -### FR-013 Resource Boundaries - -Resource exhaustion (disk full, allocation failure) MUST fail atomically and leave no partial objects visible. - -### FR-014 FCS/1 Descriptor Determinism (v1-min) - -Composite and custom functions MUST be expressed as canonical **FCS/1** descriptors that contain **only the execution recipe**: -`function_ptr`, `parameter_block (PCB1)`, and `arity`. -Identical descriptors SHALL hash to identical CIDs and MUST remain immutable after publication. **No policy/intent/notes** appear in FCS/1. - -### FR-015 Registry Determinism (Descriptor Admission) - -Functional registries MUST admit **only canonical FCS/1 descriptors** (per FR-014) and enforce descriptor validation (TLV order, PCB1 arity, acyclicity). -Registries MUST NOT infer or embed policy/intent into descriptors; publication governance is handled at certification time (FR-017). - -### FR-016 Evaluation Receipt Integrity (FER/1) - -Every execution of a composite function under curated or locked policies MUST emit a **FER/1** receipt. The receipt SHALL encode, in canonical TLV order, at least the following evidence: - -1. `function_cid` → evaluated FCS/1 descriptor (v1-min) preserving CIP indirection. -2. `input_manifest` → GS/1 BCF/1 set of consumed input CIDs (deduped and byte-lexicographic). -3. `environment` → ICD/1 (or PH03 env capsule) snapshot pinning toolchain/runtime state. -4. `evaluator_id` → stable evaluator identity bytes. -5. 
`executor_set` → implementations that executed the recipe, keyed in canonical byte order. -6. `parity_vector` → per-executor digests with matching `executor` ordering, shared `output` (`== output_cid`), and `sbom_cid` entries. -7. `executor_fingerprint` + `run_id` → optional SBOM fingerprint CID and deterministic dedup hash (`H("AMDUAT:RUN\0" || function || manifest || env || fingerprint)`). -8. `logs` → typed evidence capsules binding `kind`, `cid`, and `sha256` for stdout/stderr/metrics traces. -9. `limits` → declared execution envelope (`cpu_ms`, `wall_ms`, `max_rss_kib`, `io_reads`, `io_writes`). -10. `determinism_level` / `rng_seed` → declared determinism class (`D1_bit_exact` default, `D2_numeric_stable` requires a 0–32 byte seed). -11. `output_cid` → single canonical output CID for the run. -12. `started_at` / `completed_at` → epoch-second timestamps satisfying FR-020 bounds. -13. `signature` → Ed25519 metadata verifying `H("AMDUAT:FER\0" || canonical bytes)`. - -Receipts MAY include optional `logs` (typed capsules), `context`, `witnesses`, `parent`, and `signature_ext` TLVs but MUST NOT leak policy/intent (those belong to FCT/1). - -From Phase 04 onwards, governance and runtime layers MUST require FER/1 v1.1 receipts; ER/1 artefacts remain valid only as historical evidence and SHALL NOT satisfy FR-016 compliance gates. - -Parity discipline is mandatory: unsorted executor keys or mismatched parity orderings SHALL raise `ERR_IMPL_PARITY_ORDER`; divergent outputs or missing executors SHALL raise `ERR_IMPL_PARITY`. Unknown TLVs or cardinality violations SHALL raise `ERR_FER_UNKNOWN_TAG`. GS/1 manifest violations emit `ERR_FER_INPUT_MANIFEST_SHAPE`; missing RNG seed when determinism ≠ D1 emits `ERR_FER_RNG_REQUIRED`. All signatures MUST verify against the domain-separated hash (`ERR_FER_SIGNATURE` on failure). 
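
The parity discipline above can be illustrated with a small check over already-decoded values. It is a sketch, not the validator's actual logic: the `byte_str` and `parity_entry` types are hypothetical, and the order in which the two errors are tested (ordering first, then divergence) is an assumption, since the text does not say which takes precedence.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded forms of executor keys and parity entries. */
typedef struct { const uint8_t *bytes; size_t len; } byte_str;
typedef struct { byte_str executor; byte_str output; } parity_entry;

typedef enum { PARITY_OK = 0, ERR_IMPL_PARITY_ORDER, ERR_IMPL_PARITY } parity_status;

/* Byte-lexicographic comparison; a shorter string sorts before its extensions. */
static int byte_str_cmp(byte_str a, byte_str b)
{
    size_t n = a.len < b.len ? a.len : b.len;
    int c = n ? memcmp(a.bytes, b.bytes, n) : 0;
    if (c != 0)
        return c;
    return (a.len > b.len) - (a.len < b.len);
}

static parity_status fer_check_parity(const byte_str *executors, size_t executor_count,
                                      const parity_entry *parity, size_t parity_count,
                                      byte_str output_cid)
{
    /* Executor keys must be strictly ascending in byte order. */
    for (size_t i = 1; i < executor_count; i++)
        if (byte_str_cmp(executors[i - 1], executors[i]) >= 0)
            return ERR_IMPL_PARITY_ORDER;

    /* Every executor needs exactly one parity entry. */
    if (parity_count != executor_count)
        return ERR_IMPL_PARITY;

    for (size_t i = 0; i < executor_count; i++) {
        /* Parity entries must follow the executor ordering. */
        if (byte_str_cmp(parity[i].executor, executors[i]) != 0)
            return ERR_IMPL_PARITY_ORDER;
        /* All executors must agree on the single output CID. */
        if (byte_str_cmp(parity[i].output, output_cid) != 0)
            return ERR_IMPL_PARITY;
    }
    return PARITY_OK;
}
```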
- -### FR-017 Certification Transactions (FCT/1: Policy & Intent) - -Certification events MUST be recorded as **FCT/1** transactions that aggregate one or more FER/1 receipts and bind **registry policy, intent, domain scope, and authority role**. -Transactions MUST include attestations whenever `registry_policy != 0` and SHALL expose publication pointers when federated. -**All intent/scope/role/authority metadata lives in FCT/1 (not in FCS/1).** - -### FR-BS-001 ByteStore Deterministic Identity - -ByteStore SHALL derive CIDs using the canonical CAS domain separator: `CID = algo || H("CAS:OBJ\0" || payload)`. -The derived CID returned by `put()` and `import_cor()` MUST match the CID embedded in COR/1 envelopes and SHALL remain stable across runs, implementations, and ingest modes (DDS §11.2; ADR-030). - -### FR-BS-002 Atomic Durability Ladder - -ByteStore persistence MUST follow the atomic write ladder: write → `fsync(tmp)` → `rename` → `fsync(shard)` → `fsync(root)`. -Crash-window simulations triggered via `AMDUAT_BYTESTORE_CRASH_STEP` MUST leave the public area consistent upon recovery, with no visible partial objects (DDS §11.4; ADR-030; evidence PH05-EV-BS-001). - -### FR-BS-003 Secure/Public Area Isolation - -ByteStore SHALL enforce SA/PA isolation such that public payload roots and secure state roots are disjoint and non-overlapping. -Violations MUST raise `ERR_AREA_VIOLATION` and SHALL be surfaced to callers (DDS §11.5; ADR-030). - -### FR-BS-004 COR/1 Round-Trip Identity - -Importing COR/1 bytes via ByteStore and exporting the same CID MUST yield a byte-identical envelope. -Any mismatch between stored bytes and derived CID SHALL raise `ERR_IDENTITY_MISMATCH` (DDS §11.3; ADR-030). - -### FR-BS-005 Streaming Determinism & Policy Enforcement - -Chunked ingestion (`put_stream`) MUST produce the same CID as single-shot `put` for equivalent payloads and reject non-bytes or missing data with deterministic errors (`ERR_STREAM_ORDER`, `ERR_STREAM_TRUNCATED`). 
-ByteStore SHALL enforce ICD/1 `max_object_size` for all ingest paths, raising `ERR_POLICY_SIZE` when exceeded (DDS §11.6–11.7; ADR-030). - -### FR-022 Federation Publication Digest (FPD/1) - -Every publish event emerging from an FCT/1 certification MUST emit exactly one **FPD/1** digest satisfying ADR-007 single-digest guarantees. -The digest SHALL canonically hash the certified FCT/1 record, all attested FER/1 receipts, and the emitted governance edges (`certifies`, `attests`, `publishes`). -Implementations MUST persist the FPD/1 bytes alongside the FCT/1 payload under `/logs/ph03/evidence/fct/` (or successor evidence path) and reference the resulting CID from `fct.publication`. -Repeated invocations over identical inputs SHALL reproduce the same digest; mismatches SHALL be treated as certification failures. - -### FR-018 Provenance Enforcement - -Caching or replay layers MUST validate FER/1 receipts and FCT/1 transactions before serving composite outputs. Serving uncertified artefacts when policy requires certification is forbidden. - -### FR-019 Transaction Envelope Rejection - -Systems MUST reject FER/1 or FCT/1 envelopes whose CID lineage does not match the referenced FCS/1 descriptor, whose timestamps are non-monotonic, or whose signatures/attestations fail verification. - -### FR-020 Deterministic Execution Envelope - -| ID | Statement | Verification | Notes | -| --------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ | -| **FR-020 — Deterministic Execution Envelope** | Each executor SHALL complete within a bounded deterministic time envelope (default 5 s). 
Execution time SHALL be measured and logged as evidence. Non-termination SHALL yield symbolic error `ERR_EXEC_TIMEOUT`. | Verified via CI parity harness and evidence file `/logs/ph03/evidence/-execution-times.jsonl`. | Implements Maat’s Balance principle. Tags: [deterministic-timing, evidence, maat-balance]. | - -### FR-021 Acyclic Composition - -FCS/1 descriptors referencing FPS/1 primitives, PCB1 parameter blocks, or nested FCS/1 descriptors MUST form an acyclic graph. -Registries SHALL reject submissions introducing self-references or cycles and emit `ERR_FCS_CYCLE_DETECTED` or -`ERR_PCB_ARITY_MISMATCH` when arity metadata conflicts with PCB1 manifests. - -### FR-028 Concept-Native Domain Materialization - -Federated domain manifests SHALL be materialized exclusively from CRS Concepts -and Relations. Given a DomainNode Concept, registries MUST traverse -`hasManifest` → `ManifestEntry` Concepts, extract `entryName` and -`entryChildVersion` relations, dedupe the `(name, version)` set, and compute the -GS/1 domain state deterministically. Duplicated pairs trigger `ERR_DG_DUP_ENTRY`; -missing relations trigger `ERR_DG_ENTRY_INCOMPLETE`; self references or -ancestor loops raise `ERR_DG_CYCLE`. Evidence: `tools/ci/dg_snapshot.py` -→ `logs/ph04/evidence/dg1/PH04-EV-DG-001/`. - -Operational linkage: router listings (`GET /links`) MUST return entries sorted -lexicographically by `fls_cid` and treat `since` query parameters as exclusive -lower bounds, ensuring deterministic replay of linkage events. - -### FR-029 Publication Recursion Discipline - -Publication Concepts SHALL declare their supporting FPD/1 digest, GS/1 cover -state, endorsed member FPD CIDs, and optional lineage parent using CRS -relations (`covers`, `endorses`, `parent`). Validators MUST recompute GS/1 from -the FPD payload, enforce duplicate-free membership, and detect recursive -cycles (`ERR_FPD_CYCLE`). Timestamp regressions raise `ERR_FPD_TIMESTAMP`; state -mismatches raise `ERR_PUB_STATE_MISMATCH`. 
Evidence: `tools/ci/pub_validate.py` -→ `logs/ph04/evidence/pub1/PH04-EV-PUB-001/`. - -Operational linkage: non-genesis publications SHOULD enable the parent-required -policy, supplying `fpd.parent` and guaranteeing strictly monotonic -`fpd.timestamp` to align with ADR-019 v1.2.1 and PH04 parent-policy harnesses. - -### FR-030 Predicate Concepts - -Every CRR/1 relation predicate MUST resolve to a CRS Concept. When the -taxonomy defines a `Predicate` Concept, predicate entries SHALL expose an -`is_a` edge into that class. Missing predicate Concepts raise -`ERR_CRR_PREDICATE_NOT_CONCEPT`; missing taxonomy membership raises -`ERR_CRR_PREDICATE_CLASS_MISSING`. Evidence: CRS validator vectors and -`logs/ph04/evidence/crs1/PH04-EV-CRS-001.md`. - -Operational linkage: FPD feed endpoints SHALL implement stateless, content-anchored pagination over parent-chained publications. `GET /feed/fpd` MUST traverse the publisher’s current tip toward genesis until either the caller-provided `limit` is satisfied or the supplied `since` CID is encountered; identical `publisher_id`, `since`, and `limit` inputs SHALL yield identical CID sequences. Detail lookups (`GET /feed/fpd/:cid`) SHALL expose publisher, members, parent, and state metadata without server-side session state. Evidence: `tools/ci/feeds_check.py` → `/amduat/logs/ph04/evidence/feeds/PH04-EV-FEEDS-001/pass.jsonl`. - -### FR-031 Authority Anchoring via CRS & FPD - -Publishing authorities SHALL represent identities as CRS Concepts linked via -`owns` and `hasRole` relations to key material and governance roles. Signatures -remain confined to FCT/1 and FPD/1 surfaces; CRS layers stay unsigned. FLS/1 -transport MAY carry Concept or Relation payloads but MUST NOT mutate them and -MUST perform payload-kind checks when requested (`--check-crs-payload`). 
- -Operational linkage: FLS router deployments SHALL expose `POST /fls`, -`GET /fls/:cid`, `GET /links`, `GET /healthz`, and `GET /readyz` endpoints and -enforce SA/PA separation (`ERR_AREA_VIOLATION` if misconfigured) so that public -ingest never mutates state areas directly. Audited ticket intake SHALL be -implemented via WT/1 (ADR-023) with: - -* `POST /wt` (Protected Area) accepting WT/1 BCF/1 payloads, validating - `has_pubkey(wt.author, wt.pubkey)` (or registered equivalent), verifying - signatures over `H("AMDUAT:WT\0" || canonical_bytes_without_signature)`, - enforcing registered ADR-010 intents (deduped + byte-lexicographically - sorted), ensuring monotonic `wt.timestamp` per `wt.author`, and optionally - chaining `wt.parent` lineage. Violations yield `ERR_WT_SIGNATURE`, - `ERR_WT_KEY_UNBOUND`, `ERR_WT_INTENT_UNREGISTERED`, `ERR_WT_INTENT_DUP`, - `ERR_WT_INTENT_EMPTY`, `ERR_WT_TIMESTAMP`, `ERR_WT_PARENT_UNKNOWN`, or - `ERR_WT_PARENT_REQUIRED`. Router policy MUST surface scope denials as - `ERR_WT_SCOPE_UNAUTHORIZED` and log the governing policy capsule. -* `GET /wt/:cid` returning the canonical WT/1 bytes for any accepted ticket. -* Deterministic pagination (`GET /wt?after=&limit=`) that emits WT/1 - entries in byte-lexicographic CID order with stable page boundaries. The - `after` parameter is an exclusive bound and routers SHALL enforce - `1 ≤ limit ≤ Nmax` to guarantee replay stability. - -Evidence: `/amduat/logs/ph04/evidence/wt1/PH04-EV-WT-001/summary.md` captures the -validator run over vectors `TV-WT-001…009`, ensuring unknown keys, signature -failures, timestamp regressions (including parent inversions), unbound keys, -unregistered intents, policy rejections, and unresolved parents reject as -specified. - -Compat overlays SHALL reference ADR-025 MPR/1 provenance capsules and ADR-026 -IER/1 inference evidence when operating in policy lane `compat`. 
Routers MUST -validate that `executor_fingerprint` equals the supplied MPR/1 CID, enforce -`determinism_level` plus `rng_seed` (raising `ERR_FER_RNG_REQUIRED` when -omitted), and verify log digests via the IER/1 manifest before accepting -overlays (`ERR_IER_LOG_HASH`/`ERR_IER_LOG_MANIFEST`). Evidence surfaces -`/amduat/logs/ph04/evidence/mpr1/PH04-EV-MPR-001/pass.jsonl` and -`/amduat/logs/ph04/evidence/ier1/PH04-EV-IER-001/pass.jsonl` prove vector -coverage `TV-MPR-001…003` (hash triple, missing weights, signature domain) and -`TV-IER-001…004` (ok, missing seed, fingerprint mismatch, log digest mismatch) -respectively with scenario summaries in accompanying `summary.md` files. - -### FR-032 CT/1 Deterministic Replay (D1) - -Given identical AC/1 + DTF/1 + topology inputs, executing the runtime twice in -isolation MUST produce byte-identical CT/1 snapshots (header and payload) with -matching CIDs whenever `ct.determinism_level = 0`. Evidence: -`tools/ci/ct_replay.py` (`runA`/`runB`) → -`/amduat/logs/ph05/evidence/ct1/PH05-EV-CT1-REPLAY-001/`. - -### FR-033 CT/1 Numeric Stability (D2) - -When `ct.determinism_level = 1`, numeric observables MAY diverge, but the -maximum absolute delta MUST remain within the tolerance documented by -`ct.kernel_cfg`. Evidence: `tools/ci/ct_replay.py` D2 replay outputs and kernel -configuration manifests in the same evidence set. - -### FR-034 CT/1 Header Integrity - -CT/1 headers MUST follow ADR-027: canonical BCF/1 key ordering, rejection of -unknown keys, monotonic `ct.tick`, canonical `cid:` formatting for topology and -AC/1/DTF/1 pointers (ADR-028), and Ed25519 signatures over -`H("AMDUAT:CT\0" || canonical_bytes_without_signature)`. Evidence: -`tools/validate/ct1_validator.py` with vectors -`/amduat/vectors/ph05/ct1/TV-CT1-001…004` and AC/DTF fixtures -`TV-AC1-001…002`, `TV-DTF1-001…002`. - ---- - -## 4. Non-Functional Requirements - -### NFR-001 Determinism - -Platform/language differences MUST NOT affect CID. 
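The two replay rules in FR-032 and FR-033 above can be illustrated with a small comparison routine. This is only a sketch: `replay_matches` and its parameters are hypothetical names, and the tolerance is passed in directly here, whereas the requirement says it is documented by `ct.kernel_cfg`.

```python
from typing import Sequence


def replay_matches(
    level: int,
    snapshot_a: bytes,
    snapshot_b: bytes,
    observables_a: Sequence[float] = (),
    observables_b: Sequence[float] = (),
    tolerance: float = 0.0,
) -> bool:
    """Compare two isolated runs under a given determinism level.

    Level 0 (FR-032): snapshots must be byte-identical.
    Level 1 (FR-033): numeric observables may diverge, but the maximum
    absolute delta must stay within `tolerance` (an assumed parameter).
    """
    if level == 0:
        return snapshot_a == snapshot_b
    if level == 1:
        if len(observables_a) != len(observables_b):
            return False
        max_delta = max(
            (abs(a - b) for a, b in zip(observables_a, observables_b)),
            default=0.0,
        )
        return max_delta <= tolerance
    raise ValueError(f"unsupported determinism level: {level}")
```

Level 0 is checked on raw bytes rather than on decoded fields, since a byte-identical snapshot necessarily yields a matching CID.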
- -### NFR-002 Performance - -Put/get latency MUST remain within configured OPS budgets. - -### NFR-003 Reliability - -CAS operations MUST be atomic; partial writes MUST NOT be visible. - -### NFR-004 Portability - -Implementations MUST operate on common filesystems. - -### NFR-005 Security Posture - -Domain separation strings MUST be applied for all hashed surfaces. - -### 4.3 Future Scope Alignment (Informative) - -Phase 02 introduces deterministic transformation primitives (**FPS/1**) extending the Kheper CAS model defined herein. -See `/amduat/arc/adrs/adr-015.md` and `/amduat/tier1/fps.md` for details. -No behavioural changes apply retroactively to PH01 surfaces. - ---- - -## 5. Data Model (Behavioural View) - -* CAS objects identified strictly by CID. -* COR/1 envelope provides size, payload, algo_id. -* ICD/1 descriptor provides instance configuration. - -> See DDS §2 (COR/1) and §3 (ICD/1) for normative byte layouts. - ---- - -## 6. API Semantics - -### `put(payload_bytes, algo_id=default) → CID` - -* Compute CID using domain separation: `CID = algo_id || H("CAS:OBJ\0" || payload_bytes)` -* If CID exists: return existing CID (idempotent) -* If absent: write canonical COR/1 envelope atomically -* Reject on size limit breach, malformed payload, non-canonical COR/1, I/O errors -* Writes MUST be atomic: temp file → fsync → rename → fsync parent dir - -### `get(CID) → payload_bytes` - -* Retrieve raw payload bytes -* MUST validate canonical COR/1 envelope -* Implementation MAY verify hash on read by policy -* Reject on missing object, hash mismatch - -### `exists(CID) → bool` - -* Return true if object is present and canonical - -### `stat(CID) → { present, size, algo_id }` - -* MUST return canonical metadata - -### `verify(CID) → { ok|error, expected:CID, actual:CID }` - -* Recompute CID from canonical bytes -* MUST detect corruption and reject non-canonical encodings - -### `import(stream_COR1) → CID` - -* Validate canonical TLV ordering -* Reject duplicate 
tags, extraneous tags, malformed VARINTs -* MUST round-trip to identical CID - -### `export(CID) → stream_COR1` - -* Emit canonical envelope; re-encoding MUST preserve canonical bytes - -### Deterministic Errors - -Errors MUST be emitted as stable symbolic codes including but not limited to: - -* `E_CID_NOT_FOUND` -* `E_CORRUPT_OBJECT` -* `E_CANONICALITY_VIOLATION` -* `E_IO_FAILURE` - ---- - -## 7. Success Criteria - -* Byte-for-byte CID agreement (≥ 3 platforms) -* Zero false positives in `verify()` -* Idempotent concurrent `put()` -* COR/1 import/export round-trips cleanly - ---- - -## 8. GC Semantics (Behavioural) - -* Reachability from configured roots -* Dry-run mode MUST NOT delete -* Removal MUST be atomic per object - ---- - -## 9. Acceptance Criteria (Phase Exit) - -* Golden vectors published -* Cross-impl CI passing -* COR/1 and ICD/1 documented in DDS -* Security posture validated by SEC - ---- - -## 10. Traceability - -* Requirements link to tests/defects in Phase Packs -* ADRs reference affected FR/NFR IDs - ---- - -## 11. Future Phases - -* Multi-object transactions bind to `instance_id` -* Provenance graph consumes COR/1 metadata - ---- - -## 12. Functional Primitive Surface (FPS/1) - -> Defines the canonical deterministic operations over canonical payloads. -> Each primitive produces exactly one payload and one CID. - -| Primitive | Signature | Description | Determinism / Errors | -| ------------- | ------------------------------ | ------------------------------------------- | ---------------------------------------------- | -| `put` | `(payload_bytes) → CID` | Canonical write, atomic fsync ladder. | ADR-006 `ERR_IO_FAILURE`, `ERR_NORMALIZATION`. | -| `get` | `(CID) → payload_bytes` | Fetch canonical bytes. | `ERR_CID_NOT_FOUND`. | -| `slice` | `(CID, offset, length) → CID` | Extract contiguous bytes. | `ERR_SLICE_RANGE`. | -| `concatenate` | `([CID₁,…,CIDₙ]) → CID` | Sequential join of payloads. | `ERR_EMPTY_INPUTS`. 
| -| `reverse` | `(CID, level) → CID` | Reverse payload order (bit/byte/word/long). | `ERR_REV_ALIGNMENT`, `ERR_INVALID_LEVEL`. | -| `splice` | `(CID_a, offset, CID_b) → CID` | Insert payload b into a at offset. | `ERR_SPLICE_RANGE`. | - -**Determinism:** identical inputs → identical outputs. -**Immutability:** inputs never mutated. -**Closure:** outputs valid for reuse as inputs to any primitive. -**Error handling:** all symbolic per ADR-006. - ---- - -## Appendix A — Surface Version Table - -| Surface | Version | Notes | -| ------- | ------- | ----- | -| FCS/1 | v1-min | Canonical execution descriptors; governance captured in FCT/1. | -| FER/1 | v1.1 | Receipts enforce parity-first evidence, run_id dedup, typed logs, and RNG discipline (ADR-017). | -| FCT/1 | v1.0 | Certification transactions binding policy/intent/attestations with FER/1 sets. | -| FPD/1 | v1.0 | Publication digest linking FCT/1 to FER/1 receipts for federation replay. | - ---- - -## Document History - -* 0.2.1 (2025-10-26) — Phase Pack pointer updated; no semantic changes; archival preserves historical lineage per ADR-002. -* 0.2.2 (2025-10-26) — Promoted PH01 baseline to Approved; synchronized Phase Pack §1 anchors and closure snapshot. -* 0.2.3 (2025-10-27) — Added future scope alignment note pointing to FPS/1 and ADR-015; PH01 semantics remain unchanged. -* **0.2.4 (2025-11-14):** Added FR-014–FR-019 for FCS/1 composition, FER/1 receipts, and FCT/1 certification policies. -* **0.2.5 (2025-11-15):** Added FR-021 (formerly FR-020) enforcing acyclic FCS/1 composition and PCB1 arity validation. -* **0.2.6 (2025-11-19):** Registered FR-020 Deterministic Execution Envelope (Maat’s Balance) with timing evidence tags. -* **0.3.0 (2025-11-02):** Trimmed FCS/1 to execution-only (v1-min) under FR-014/FR-015; moved policy/intent/scope/role/authority to FCT/1 (FR-017); clarified registry admission behaviour and kept FER/1 unchanged. 
-* **0.3.1 (2025-11-21):** Updated FR-016 to require parity-first FER/1 receipts with executor sets, parity vectors, and FR-020 aligned timestamps. -* **0.3.2 (2025-11-22):** Registered FR-022 Federation Publication Digest (FPD/1) requirement tying FCT/1 publications to single-digest evidence and canonical logging. - -* **0.3.4 (2025-11-07):** Recorded FER/1 v1.1 requirement for Phase 04 and added surface version table. - -* **0.3.5 (2025-11-08):** Registered PH04 linkage & semantic placeholder requirements (FR-028…031). -* **0.3.6 (2025-11-09):** Promoted FR-028…031 to normative linkage requirements with CRS/1 validator enforcement. - -* **0.3.7 (2025-11-08):** Finalized FR-028…031 with CRS/1 immutability, GS/1 linkage, and certification coverage. - -* **0.3.8 (2025-11-09):** Promoted FR-028…FR-031 for concept-native domain and publication validation. -* **0.3.9 (2025-11-09):** Documented operational linkage: router endpoints, deterministic `/links`, and parent-required publish policy guidance. -* **0.3.10 (2025-11-11):** Registered FR-030 stateless, content-anchored FPD feed pagination requirement. - -* **0.3.11 (2025-11-09):** Extended FR-031 with WT/1 intake endpoints, validation, and evidence log references. -* **0.3.12 (2025-11-20):** Tightened FR-031 with `wt.pubkey` bindings, signature preimage exclusion, lineage/policy errors, and - expanded WT/1 vector evidence coverage. - -* **0.3.13 (2025-11-21):** Updated FR-031 for `has_pubkey` bindings (`ERR_WT_KEY_UNBOUND`), intent registry enforcement (`ERR_WT_INTENT_UNREGISTERED`), lineage policy rejection (`ERR_WT_PARENT_REQUIRED`), and expanded WT/1 vectors `TV-WT-001…009`. -* **0.3.14 (2025-11-22):** WT/1 intake and SOS/1 compat overlays proven with PH04-M4/M5 audit evidence. -* **0.3.15 (2025-11-22):** Recorded ADR-025/026 compat path requirements and evidence anchors for FR-031. 
- -* **0.3.16 (2025-11-23):** Compat lane now enforces ADR-025/026 validators (MPR/1 hash triple, IER/1 replay) with updated evidence surfaces. - -* **0.3.17 (2025-11-24):** Added FR-032–FR-034 for CT/1 replay determinism, numeric stability, and header integrity (ADR-027/028). - -* **0.4.0 (2025-11-11):** Added FR-BS-001…005 for ByteStore identity, atomic durability, SA/PA isolation, COR round-trip, and streaming determinism linked to DDS §11 / ADR-030. +This placeholder avoids drift between repos. diff --git a/tier1/tgk-1.md b/tier1/tgk-1.md index f5ade72..1076b40 100644 --- a/tier1/tgk-1.md +++ b/tier1/tgk-1.md @@ -1,138 +1,5 @@ -# TGK/1 — Trace Graph Kernel Semantics +# TGK/1 — moved -Status: Draft -Owner: Architecture -Version: 0.1.0 -SoT: No -Last Updated: 2025-11-30 -Tags: [tgk, determinism, index, federation] +Canonical spec: `vendor/amduat/tier1/tgk-1.md`. -**Document ID:** `TGK/1` -**Layer:** L1 — Semantic graph layer over ASL artifacts and PERs (no encodings) - -**Depends on (normative):** - -* `ASL/1-CORE` -* `ASL/1-CORE-INDEX` -* `ASL/LOG/1` -* `ASL/SYSTEM/1` - -**Informative references:** - -* `ENC-TGK1-EDGE` (core edge encoding, if present) -* `ENC-TGK-INDEX` (index encoding draft) -* `ASL/INDEX-ACCEL/1` -* `ENC-ASL-CORE-INDEX` - ---- - -## 0. Conventions - -The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119. - -TGK/1 defines semantic meaning only. It does not define storage formats, on-disk encodings, or execution operators. - ---- - -## 1. Purpose & Scope - -TGK/1 defines the **semantic layer** for Trace Graph Kernel (TGK) edges that relate ASL artifacts and PERs. -It keeps TGK thin and deterministic by reusing ASL index and log semantics. - -Non-goals: - -* New encodings for edges or indexes -* Query operators or execution plans -* Federation protocols or transport -* Re-definition of ASL or PEL semantics - ---- - -## 2. 
TGK Objects - -### 2.1 TGK Edge - -A TGK Edge is an **immutable record** representing a directed relationship between ASL artifacts and/or PERs. -TGK edges are semantic overlays and **MUST NOT** redefine or bypass ASL identity. -TGK/1-CORE defines the EdgeBody structure with ordered `from`/`to` lists; TGK/1 -does not further constrain cardinality. - -### 2.2 Canonical Edge Key - -Each TGK edge has a **Canonical Edge Key** that uniquely identifies it. -The Canonical Edge Key MUST be derived from the logical `EdgeBody` defined in -`TGK/1-CORE`, preserving list order and multiplicity: - -* `from`: ordered list of source node identifiers (MAY be empty) -* `to`: ordered list of destination node identifiers (MAY be empty) -* `payload`: reference carried by the edge -* `type`: edge type identifier -* Projection context (for example, PER or execution identity) when not already - captured by the edge payload or type profile - -Classification attributes (edge type keys, labels) **MUST NOT** affect canonical identity. - ---- - -## 3. Index and Visibility (Normative) - -TGK edges are **indexed objects** and inherit visibility from the ASL index and log: - -1. A TGK edge becomes visible only when its index record is admitted by a sealed segment and log order (ASL/LOG/1). -2. TGK traversal and lookup **MUST NOT** bypass index visibility or log ordering. -3. For a fixed `{Snapshot, LogPrefix}`, TGK edge lookup and shadowing **MUST** be deterministic (ASL/1-CORE-INDEX). -4. Tombstones and shadowing semantics follow ASL/1-CORE-INDEX and ASL/LOG/1 replay order. - -Index records MUST reference TGK/1-CORE edge identities. Index encodings MUST -NOT re-encode edge structure (`from[]`, `to[]`); they reference TGK/1-CORE edges -and carry only routing/filter metadata. - ---- - -## 4. Deterministic Traversal (Normative) - -TGK traversal operates over a snapshot/log-bounded view: - -* Inputs: `{Snapshot, LogPrefix}` and a seed set (nodes or edges). 
-* Outputs: only edges visible under the same `{Snapshot, LogPrefix}`. -* Traversal **MUST** be deterministic and replay-compatible with ASL/LOG/1. - -Deterministic ordering for traversal output MUST be: - -1. `logseq` ascending -2. Canonical Edge Key as tie-break - -Acceleration structures MAY be used but MUST NOT change semantics. - ---- - -## 5. Federation Alignment (Normative) - -Federation does not change TGK semantics. It only propagates edges and artifacts that are already visible under index rules. - -* Domain visibility and publication status are enforced via index metadata (ENC-ASL-CORE-INDEX). -* TGK edges keep canonical identity across domains. -* Cross-domain propagation MUST preserve snapshot/log determinism. - ---- - -## 6. Non-Goals - -TGK/1 does not define: - -* Edge encoding or storage layout -* Index segment formats -* Query languages or execution plans -* Acceleration rules beyond ASL/INDEX-ACCEL/1 - ---- - -## 7. Normative Invariants - -Conforming implementations MUST enforce: - -1. TGK edges are immutable and indexed objects. -2. No TGK visibility without index admission and log ordering. -3. Traversal is snapshot/log bounded and deterministic. -4. Federation does not alter TGK semantics; it only propagates visible edges. -5. Edge classification is not part of canonical identity. +This placeholder avoids drift between repos. diff --git a/vendor/amduat b/vendor/amduat index 0fc1fbd..3886716 160000 --- a/vendor/amduat +++ b/vendor/amduat @@ -1 +1 @@ -Subproject commit 0fc1fbd980292fda121834ad5672950a5449c840 +Subproject commit 3886716799d6019f092f3d002e42201c0772b669