Add core tier1 specs for ASL/TGK
parent 0fc1fbd980
commit 3886716799
233
tier1/asl-core-index-1.md
Normal file
@@ -0,0 +1,233 @@
# ASL/1-CORE-INDEX — Semantic Index Model
|
||||
|
||||
Status: Draft
|
||||
Owner: Niklas Rydberg
|
||||
Version: 0.1.0
|
||||
SoT: No
|
||||
Last Updated: 2025-11-16
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [deterministic, index, semantics]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/asl-core-index.md | Canonical: /amduat/tier1/asl-core-index-1.md -->
|
||||
|
||||
**Document ID:** `ASL/1-CORE-INDEX`
|
||||
**Layer:** L0.5 — Semantic mapping over ASL/1-CORE values (no storage / encoding / lifecycle)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/1-CORE`
|
||||
* `ASL/1-STORE`
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ASL/STORE-INDEX/1` — store lifecycle and replay contracts
|
||||
* `ENC/ASL-CORE-INDEX/1` — bytes-on-disk encoding profile
|
||||
* `ASL/INDEX-ACCEL/1` — acceleration semantics (routing, filters, sharding)
|
||||
* `ASL/LOG/1` — append-only semantic log (segment visibility)
|
||||
* `TGK/1` — TGK edge visibility and traversal alignment
|
||||
* `ASL/SYSTEM/1` — unified system view (PEL/TGK/federation alignment)
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 0. Conventions
|
||||
|
||||
The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.
|
||||
|
||||
ASL/1-CORE-INDEX defines **semantic meaning only**. It does not define storage formats, on-disk encoding, or operational lifecycle; those belong to `ASL/STORE-INDEX/1`, `ASL/LOG/1`, and `ENC/ASL-CORE-INDEX/1`.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose & Non-Goals
|
||||
|
||||
### 1.1 Purpose
|
||||
|
||||
ASL/1-CORE-INDEX defines the **semantic model** for indexing artifacts:
|
||||
|
||||
* It specifies what it means to map an artifact identity to a byte location.
|
||||
* It defines visibility, immutability, and shadowing semantics.
|
||||
* It ensures deterministic lookup for a fixed snapshot and log prefix.
|
||||
|
||||
### 1.2 Non-goals
|
||||
|
||||
ASL/1-CORE-INDEX explicitly does **not** define:
|
||||
|
||||
* On-disk layouts, segment files, or memory representations.
|
||||
* Block allocation, packing, GC, or lifecycle rules.
|
||||
* Snapshot implementation details, checkpoints, or log storage.
|
||||
* Performance optimizations (bloom filters, sharding, SIMD).
|
||||
* Federation, provenance, or execution semantics.
|
||||
|
||||
---
|
||||
|
||||
## 2. Terminology
|
||||
|
||||
* **Artifact** — ASL/1 immutable value defined in ASL/1-CORE.
|
||||
* **Reference** — ASL/1 content address of an Artifact (hash_id + digest).
|
||||
* **StoreConfig** — `{ encoding_profile, hash_id }` fixed per StoreSnapshot (ASL/1-STORE).
|
||||
* **Block** — immutable storage unit containing artifact bytes.
|
||||
* **BlockID** — opaque identifier for a block.
|
||||
* **ArtifactExtent** — `(BlockID, offset, length)` identifying a byte slice within a block.
|
||||
* **ArtifactLocation** — ordered list of `ArtifactExtent` values that, when concatenated, produce the artifact bytes.
|
||||
* **Snapshot** — a checkpointed StoreSnapshot (ASL/1-STORE) used as a base state.
|
||||
* **Append-Only Log** — ordered sequence of index-visible mutations after a snapshot.
|
||||
* **CURRENT** — effective state after replaying a log prefix on a snapshot.
|
||||
|
||||
---
|
||||
|
||||
## 3. Core Mapping Semantics
|
||||
|
||||
### 3.1 Index Mapping
|
||||
|
||||
The index defines a semantic mapping:
|
||||
|
||||
```
|
||||
Reference -> ArtifactLocation
|
||||
```
|
||||
|
||||
For any visible `Reference` whose latest entry is not a tombstone, lookup resolves to exactly one `ArtifactLocation` at a given CURRENT state.
|
||||
|
||||
### 3.2 Determinism
|
||||
|
||||
For a fixed `{StoreConfig, Snapshot, LogPrefix}`, lookup results MUST be deterministic. Nondeterministic inputs MUST NOT affect index semantics.
|
||||
|
||||
### 3.3 StoreConfig Consistency
|
||||
|
||||
All references in an index view are interpreted under a fixed StoreConfig. Implementations MAY store only the digest portion in the index when `hash_id` is fixed by StoreConfig, but the semantic key is always a full `Reference`. Encoding profiles MUST allow variable-length digests; the digest length MUST be either explicit in the encoding or derivable from `hash_id` and StoreConfig.
|
||||
|
||||
---
|
||||
|
||||
## 4. ArtifactLocation Semantics
|
||||
|
||||
* An ArtifactLocation is an **ordered list** of ArtifactExtents.
|
||||
* Each extent references immutable bytes within a block.
|
||||
* The artifact bytes are defined by **concatenating extents in order**.
|
||||
* A visible ArtifactLocation MUST be **non-empty** and MUST fully cover the artifact byte sequence with no gaps or extra bytes.
|
||||
* Tombstone entries are visible but MUST have no ArtifactLocation; they only shadow prior entries.
|
||||
* Extents MUST have `length > 0` and MUST reference valid byte ranges within their blocks.
|
||||
* Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact.
|
||||
* An ArtifactLocation is valid only while all referenced blocks are retained.
|
||||
* ASL/1-CORE-INDEX does not define how blocks are allocated or sealed; it only requires that referenced bytes are immutable for the lifetime of the mapping.
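
The extent rules above can be sketched as a validity check. This is a minimal illustration, not part of the specification: the `ArtifactExtent` layout, the 64-bit `block_id`, and the idea of using `block_id` as an index into a table of block sizes are all assumptions made for the example (the spec keeps BlockID opaque).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory extent; the spec only names the three fields. */
typedef struct {
    uint64_t block_id; /* assumed: index into block_sizes[] */
    uint64_t offset;   /* byte offset within the block */
    uint64_t length;   /* byte count, must be > 0 */
} ArtifactExtent;

/* Returns true if ext[0..n) is a valid location for an artifact of
 * artifact_len bytes: non-empty, every extent inside its block, and the
 * concatenated lengths cover the artifact exactly. */
static bool location_is_valid(const ArtifactExtent *ext, size_t n,
                              uint64_t artifact_len,
                              const uint64_t *block_sizes, size_t n_blocks)
{
    assert(block_sizes != NULL || n_blocks == 0);
    if (ext == NULL || n == 0) return false; /* visible locations are non-empty */

    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (ext[i].block_id >= n_blocks) return false; /* unknown block */
        uint64_t size = block_sizes[ext[i].block_id];
        if (ext[i].length == 0) return false;
        if (ext[i].offset > size) return false;
        if (ext[i].length > size - ext[i].offset) return false; /* overrun */
        if (ext[i].length > UINT64_MAX - total) return false;   /* overflow */
        total += ext[i].length;
    }
    return total == artifact_len; /* no gaps, no extra bytes */
}
```

The same block may appear in several extents; the check treats each extent independently and only the ordered sum has to match the artifact length.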
|
||||
|
||||
---
|
||||
|
||||
## 5. Visibility Model
|
||||
|
||||
An index entry is **visible** at CURRENT if and only if:
|
||||
|
||||
1. The entry is contained in a sealed segment whose seal record is admitted in the ordered log prefix for CURRENT (or anchored in the snapshot).
|
||||
2. The referenced bytes are immutable (e.g., the underlying block is sealed by store rules).
|
||||
|
||||
Visibility is binary; entries are either visible or not visible.
|
||||
|
||||
---
|
||||
|
||||
## 6. Snapshot and Log Semantics
|
||||
|
||||
Snapshots provide a base mapping of sealed segments; the append-only log admits later segment seals and policy records that define subsequent changes.
|
||||
|
||||
The index state for a given CURRENT is defined as:
|
||||
|
||||
```
|
||||
Index(CURRENT) = Index(snapshot) + replay(log_prefix)
|
||||
```
|
||||
|
||||
Replay is strictly ordered, deterministic, and idempotent. Snapshot and log entries are semantically equivalent once replayed.
|
||||
|
||||
---
|
||||
|
||||
## 7. Immutability and Shadowing
|
||||
|
||||
### 7.1 Immutability
|
||||
|
||||
* Index entries are never mutated.
|
||||
* Once visible, an entry’s meaning does not change.
|
||||
* Referenced bytes are immutable for the lifetime of the entry.
|
||||
|
||||
### 7.2 Shadowing
|
||||
|
||||
* Later entries MAY shadow earlier entries with the same Reference.
|
||||
* Precedence is determined solely by log order.
|
||||
* Snapshot boundaries do not alter shadowing semantics.
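
A minimal sketch of shadowing resolution, assuming the visible entries are available as an array in log order (oldest first). The `IndexEntry` shape, the fixed 32-byte digest, and the `location_id` handle are hypothetical simplifications for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIGEST_LEN 32 /* assumed fixed digest length for the sketch */

typedef struct {
    uint8_t  digest[DIGEST_LEN]; /* digest part of the Reference */
    bool     tombstone;          /* tombstones carry no location */
    uint64_t location_id;        /* hypothetical ArtifactLocation handle */
} IndexEntry;

/* Scans from newest to oldest so the latest entry for a Reference wins.
 * Returns true and sets *out only when that entry is a regular mapping. */
static bool lookup_current(const IndexEntry *entries, size_t n,
                           const uint8_t digest[DIGEST_LEN], uint64_t *out)
{
    assert(out != NULL);
    for (size_t i = n; i-- > 0; ) {
        if (memcmp(entries[i].digest, digest, DIGEST_LEN) != 0) continue;
        if (entries[i].tombstone) return false; /* shadowed by tombstone */
        *out = entries[i].location_id;
        return true;
    }
    return false; /* no visible entry */
}
```

Passing a shorter `n` models a shorter log prefix: the same function then answers the lookup for an earlier CURRENT.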
|
||||
|
||||
---
|
||||
|
||||
## 8. Tombstones (Optional)
|
||||
|
||||
Tombstone entries MAY be used to invalidate prior mappings.
|
||||
|
||||
* A tombstone shadows earlier entries for the same Reference.
|
||||
* Visibility rules are identical to regular entries.
|
||||
* Encoding is optional and, if used, is defined by `ENC/ASL-CORE-INDEX/1`.
|
||||
|
||||
---
|
||||
|
||||
## 9. Determinism Guarantees
|
||||
|
||||
For fixed:
|
||||
|
||||
* StoreConfig
|
||||
* Snapshot
|
||||
* Log prefix
|
||||
|
||||
ASL/1-CORE-INDEX guarantees:
|
||||
|
||||
* Deterministic lookup results
|
||||
* Deterministic shadowing resolution
|
||||
* Deterministic visibility
|
||||
|
||||
---
|
||||
|
||||
## 10. Normative Invariants
|
||||
|
||||
Conforming implementations MUST enforce:
|
||||
|
||||
1. No visibility without a sealed segment whose seal record is log-admitted (or snapshot-anchored).
|
||||
2. No mutation of visible index entries.
|
||||
3. Referenced bytes remain immutable for the entry’s lifetime.
|
||||
4. Shadowing follows strict log order.
|
||||
5. Snapshot + log replay uniquely defines CURRENT.
|
||||
6. Visible ArtifactLocations are non-empty and byte-exact (no gaps, no overrun), except for tombstones which have no ArtifactLocation.
|
||||
|
||||
Violation of any invariant constitutes index corruption.
|
||||
|
||||
---
|
||||
|
||||
## 11. Relationship to Other Specifications
|
||||
|
||||
| Layer                  | Responsibility                                    |
| ---------------------- | ------------------------------------------------- |
| `ASL/1-CORE`           | Artifact semantics and identity                   |
| `ASL/1-STORE`          | StoreSnapshot and put/get logical model           |
| `ASL/1-CORE-INDEX`     | Semantic mapping of Reference → ArtifactLocation  |
| `ASL/STORE-INDEX/1`    | Lifecycle, replay, and visibility contracts       |
| `ENC/ASL-CORE-INDEX/1` | On-disk encoding for index segments and records   |
|
||||
|
||||
---
|
||||
|
||||
## 12. Summary
|
||||
|
||||
ASL/1-CORE-INDEX specifies the semantic meaning of the index:
|
||||
|
||||
* It maps artifact References to byte locations deterministically.
|
||||
* It defines visibility and shadowing rules across snapshot + log replay.
|
||||
* It guarantees immutability and deterministic lookup.
|
||||
|
||||
It answers one question:
|
||||
|
||||
> *Given a Reference and a CURRENT state, where are the bytes?*
|
||||
296
tier1/asl-index-accel-1.md
Normal file
@@ -0,0 +1,296 @@
# ASL/INDEX-ACCEL/1 — Index Acceleration Semantics
|
||||
|
||||
Status: Draft
|
||||
Owner: Niklas Rydberg
|
||||
Version: 0.1.0
|
||||
SoT: No
|
||||
Last Updated: 2025-11-16
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [deterministic, index, acceleration]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/asl-index-accel-1.md | Canonical: /amduat/tier1/asl-index-accel-1.md -->
|
||||
|
||||
**Document ID:** `ASL/INDEX-ACCEL/1`
|
||||
**Layer:** L1 — Acceleration rules over index semantics (no storage / encoding)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/1-CORE-INDEX`
|
||||
* `ASL/LOG/1`
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ASL/STORE-INDEX/1` — store lifecycle and replay contracts
|
||||
* `ENC/ASL-CORE-INDEX/1` — bytes-on-disk encoding profile
|
||||
* `TGK/1` — TGK semantics and visibility alignment
|
||||
* `TGK/1-CORE` — EdgeBody and EdgeTypeId definitions
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 0. Conventions
|
||||
|
||||
The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.
|
||||
|
||||
ASL/INDEX-ACCEL/1 defines **acceleration semantics only**. It MUST NOT change index meaning defined by ASL/1-CORE-INDEX.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
ASL/INDEX-ACCEL/1 defines **acceleration mechanisms** used by ASL-based indexes, including:
|
||||
|
||||
* Routing keys
|
||||
* Sharding
|
||||
* Filters (Bloom, XOR, Ribbon, etc.)
|
||||
* SIMD execution
|
||||
* Hash recasting
|
||||
|
||||
All mechanisms defined herein are **observationally invisible** to ASL/1-CORE-INDEX semantics.
|
||||
|
||||
---
|
||||
|
||||
## 2. Scope
|
||||
|
||||
Applies to:
|
||||
|
||||
* Artifact indexes (ASL)
|
||||
* Projection and graph indexes (e.g., TGK)
|
||||
* Any index layered on ASL/1-CORE-INDEX semantics
|
||||
|
||||
Does **not** define:
|
||||
|
||||
* Artifact or edge identity
|
||||
* Snapshot semantics
|
||||
* Storage lifecycle
|
||||
* Encoding details
|
||||
|
||||
---
|
||||
|
||||
## 3. Canonical Key vs Routing Key
|
||||
|
||||
### 3.1 Canonical Key
|
||||
|
||||
The **Canonical Key** uniquely identifies an indexable entity.
|
||||
|
||||
Examples:
|
||||
|
||||
* Artifact: `Reference`
|
||||
* TGK Edge: canonical key defined by `TGK/1` and `TGK/1-CORE` (opaque here)
|
||||
|
||||
Properties:
|
||||
|
||||
* Defines semantic identity
|
||||
* Used for equality, shadowing, and tombstones
|
||||
* Stable and immutable
|
||||
* Fully compared on index match
|
||||
|
||||
### 3.2 Routing Key
|
||||
|
||||
The **Routing Key** is a **derived, advisory key** used exclusively for acceleration.
|
||||
|
||||
Properties:
|
||||
|
||||
* Derived deterministically from Canonical Key and optional attributes
|
||||
* MAY be used for sharding, filters, SIMD layouts
|
||||
* MUST NOT affect index semantics
|
||||
* MUST be verified by full Canonical Key comparison on match
|
||||
|
||||
Formal rule:
|
||||
|
||||
```
CanonicalKey determines correctness
RoutingKey determines performance
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Filter Semantics
|
||||
|
||||
### 4.1 Advisory Nature
|
||||
|
||||
All filters are **advisory only**.
|
||||
|
||||
Rules:
|
||||
|
||||
* False positives are permitted
|
||||
* False negatives are forbidden
|
||||
* Filter behavior MUST NOT affect correctness
|
||||
|
||||
Invariant:
|
||||
|
||||
```
|
||||
Filter miss => key is definitely absent
|
||||
Filter hit => key may be present
|
||||
```
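
The advisory role of a filter can be illustrated with a small sketch. Everything here is an assumption made for the example: the 256-bit Bloom-style filter, the use of FNV-1a as the routing hash, and the flat array of 32-byte keys standing in for a sealed segment. The point is only that a miss short-circuits the lookup, while a hit is always confirmed by a full Canonical Key comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_LEN 32u      /* assumed canonical key size */
#define FILTER_BITS 256u /* assumed filter size */

typedef struct { uint8_t bits[FILTER_BITS / 8]; } Filter;

/* FNV-1a, used here only as a stand-in routing hash. */
static uint64_t fnv1a64(const uint8_t *p, size_t n)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < n; i++) { h ^= p[i]; h *= 0x100000001b3ULL; }
    return h;
}

static void filter_positions(const uint8_t *key, uint32_t *a, uint32_t *b)
{
    uint64_t h = fnv1a64(key, KEY_LEN);
    *a = (uint32_t)(h & 0xFFFFFFFFu) % FILTER_BITS;
    *b = (uint32_t)(h >> 32) % FILTER_BITS;
}

static void filter_add(Filter *f, const uint8_t *key)
{
    assert(f != NULL && key != NULL);
    uint32_t a, b;
    filter_positions(key, &a, &b);
    f->bits[a / 8] |= (uint8_t)(1u << (a % 8));
    f->bits[b / 8] |= (uint8_t)(1u << (b % 8));
}

static bool filter_may_contain(const Filter *f, const uint8_t *key)
{
    uint32_t a, b;
    filter_positions(key, &a, &b);
    return ((f->bits[a / 8] >> (a % 8)) & 1u) &&
           ((f->bits[b / 8] >> (b % 8)) & 1u);
}

/* keys holds n consecutive KEY_LEN-byte canonical keys.
 * Returns the index of key, or -1 if it is absent. */
static long segment_find(const Filter *f, const uint8_t *keys, size_t n,
                         const uint8_t *key)
{
    if (!filter_may_contain(f, key)) return -1; /* miss: definitely absent */
    for (size_t i = 0; i < n; i++)
        if (memcmp(keys + i * KEY_LEN, key, KEY_LEN) == 0) return (long)i;
    return -1; /* filter hit was a false positive */
}
```

Because the final answer always comes from the full comparison, even a saturated filter (all bits set) changes only the cost of a lookup, not its result.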
|
||||
|
||||
### 4.2 Filter Inputs
|
||||
|
||||
Filters operate over **Routing Keys**, not Canonical Keys.
|
||||
|
||||
A Routing Key MAY incorporate:
|
||||
|
||||
* Hash of Canonical Key
|
||||
* Artifact type tag (if present)
|
||||
* TGK `EdgeTypeId` or other immutable classification attributes (TGK/1-CORE)
|
||||
* Direction, role, or other immutable classification attributes
|
||||
|
||||
Absence of optional attributes MUST be encoded explicitly.
|
||||
|
||||
### 4.3 Filter Construction
|
||||
|
||||
* Filters are built only over **sealed, immutable segments**
|
||||
* Filters are immutable once built
|
||||
* Filter construction MUST be deterministic
|
||||
* Filter state MUST be covered by segment checksums
|
||||
* Filters SHOULD be snapshot-scoped or versioned with their segment to avoid unbounded false-positive accumulation over time
|
||||
|
||||
---
|
||||
|
||||
## 5. Sharding Semantics
|
||||
|
||||
### 5.1 Observational Invisibility
|
||||
|
||||
Sharding is a **mechanical partitioning** of the index.
|
||||
|
||||
Invariant:
|
||||
|
||||
```
|
||||
LogicalIndex = union(all shards)
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
* Shards MUST NOT affect lookup results
|
||||
* Shard count and boundaries MAY change over time
|
||||
* Rebalancing MUST preserve lookup semantics
|
||||
|
||||
### 5.2 Shard Assignment
|
||||
|
||||
Shard assignment MAY be based on:
|
||||
|
||||
* Hash of Canonical Key
|
||||
* Routing Key
|
||||
* Composite routing strategies
|
||||
|
||||
Shard selection MUST be deterministic per snapshot.
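
As a rough illustration of observational invisibility, the sketch below routes a lookup to one shard and shows that the answer does not depend on the shard count. The 64-bit keys standing in for Canonical Keys, the multiplicative `mix64` hash, and the modulo assignment are assumptions for the example, not requirements of this specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in routing hash over a 64-bit key (multiplicative mixing). */
static uint64_t mix64(uint64_t k) { return k * 0x9E3779B97F4A7C15ULL; }

/* Deterministic shard assignment for a given shard_count. */
static uint32_t shard_of(uint64_t canonical_key, uint32_t shard_count)
{
    assert(shard_count > 0);
    return (uint32_t)((mix64(canonical_key) >> 32) % shard_count);
}

/* Looks only at entries that live in the key's shard, then confirms the
 * match with a full key comparison. */
static bool sharded_contains(const uint64_t *keys, size_t n,
                             uint32_t shard_count, uint64_t key)
{
    uint32_t target = shard_of(key, shard_count);
    for (size_t i = 0; i < n; i++) {
        if (shard_of(keys[i], shard_count) != target) continue; /* other shard */
        if (keys[i] == key) return true;
    }
    return false;
}
```

Rebalancing corresponds to calling the same functions with a different `shard_count`; the set of keys found is unchanged.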
|
||||
|
||||
---
|
||||
|
||||
## 6. Hashing and Hash Recasting
|
||||
|
||||
### 6.1 Hashing
|
||||
|
||||
Hashes MAY be used for routing, filtering, or SIMD layout.
|
||||
|
||||
Hashes MUST NOT be treated as identity.
|
||||
|
||||
### 6.2 Hash Recasting
|
||||
|
||||
Hash recasting (changing hash functions or seeds) is permitted if:
|
||||
|
||||
1. It is deterministic
|
||||
2. It does not change Canonical Keys
|
||||
3. It does not affect index semantics
|
||||
|
||||
Recasting is equivalent to rebuilding acceleration structures.
|
||||
|
||||
---
|
||||
|
||||
## 7. SIMD Execution
|
||||
|
||||
SIMD operations MAY be used to:
|
||||
|
||||
* Evaluate filters
|
||||
* Compare routing keys
|
||||
* Accelerate scans
|
||||
|
||||
Rules:
|
||||
|
||||
* SIMD MUST operate only on immutable data
* SIMD MUST NOT short-circuit semantic checks
* SIMD MUST preserve deterministic behavior
|
||||
|
||||
---
|
||||
|
||||
## 8. Multi-Dimensional Routing Examples (Normative)
|
||||
|
||||
### 8.1 Artifact Index
|
||||
|
||||
* Canonical Key: `Reference`
|
||||
* Routing Key components:
|
||||
|
||||
* `H(Reference)`
|
||||
* `type_tag` (if present)
|
||||
* `has_typetag`
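
The artifact routing key components above could be combined as in the following sketch. The `RoutingKey` struct, the choice of FNV-1a for `H`, and the normalization of an absent tag to zero are assumptions for illustration; the explicit `has_typetag` flag is what keeps "no tag" distinct from "tag 0".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t ref_hash;    /* H(Reference), stand-in hash */
    uint32_t type_tag;    /* meaningful only when has_typetag == 1 */
    uint8_t  has_typetag; /* explicit presence flag */
} RoutingKey;

/* FNV-1a, used here only as a stand-in for H. */
static uint64_t fnv1a64(const uint8_t *p, size_t n)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < n; i++) { h ^= p[i]; h *= 0x100000001b3ULL; }
    return h;
}

/* Derives the routing key deterministically from the encoded Reference
 * bytes and the optional type tag. */
static RoutingKey routing_key(const uint8_t *reference, size_t len,
                              bool has_tag, uint32_t tag)
{
    assert(reference != NULL || len == 0);
    RoutingKey k;
    k.ref_hash = fnv1a64(reference, len);
    k.has_typetag = has_tag ? 1 : 0;
    k.type_tag = has_tag ? tag : 0; /* normalize so absent tags compare equal */
    return k;
}

/* Field-wise comparison; avoids comparing struct padding bytes. */
static bool routing_key_eq(RoutingKey a, RoutingKey b)
{
    return a.ref_hash == b.ref_hash && a.has_typetag == b.has_typetag &&
           a.type_tag == b.type_tag;
}
```

A routing key match is still only a candidate; the Canonical Key comparison decides whether the entry is the one requested.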
|
||||
|
||||
### 8.2 TGK Edge Index
|
||||
|
||||
* Canonical Key: defined by `TGK/1` and `TGK/1-CORE` (opaque here)
|
||||
* Routing Key components:
|
||||
|
||||
* `H(CanonicalEdgeKey)`
|
||||
* `EdgeTypeId` (if present in the TGK profile)
|
||||
* Direction or role (optional)
|
||||
|
||||
---
|
||||
|
||||
## 9. Snapshot Interaction
|
||||
|
||||
Acceleration structures:
|
||||
|
||||
* MUST respect snapshot visibility rules
|
||||
* MUST operate over the same sealed segments visible to the snapshot
|
||||
* MUST NOT bypass tombstones or shadowing
|
||||
|
||||
Snapshot cuts apply **after** routing and filtering.
|
||||
|
||||
---
|
||||
|
||||
## 10. Normative Invariants
|
||||
|
||||
1. Canonical Keys define identity and correctness
|
||||
2. Routing Keys are advisory only
|
||||
3. Filters may never introduce false negatives
|
||||
4. Sharding is observationally invisible
|
||||
5. Hashes are not identity
|
||||
6. SIMD is an execution strategy, not a semantic construct
|
||||
7. All acceleration is deterministic per snapshot
|
||||
|
||||
---
|
||||
|
||||
## 11. Non-Goals
|
||||
|
||||
ASL/INDEX-ACCEL/1 does not define:
|
||||
|
||||
* Specific filter algorithms
|
||||
* Memory layout
|
||||
* CPU instruction selection
|
||||
* Encoding formats
|
||||
* Federation policies
|
||||
|
||||
---
|
||||
|
||||
## 12. Summary
|
||||
|
||||
ASL/INDEX-ACCEL/1 establishes a strict contract:
|
||||
|
||||
> All acceleration exists to make the index faster, never different.
|
||||
|
||||
It formalizes Canonical vs Routing keys and constrains filters, sharding, hashing, and SIMD so that correctness is preserved under all optimizations.
|
||||
139
tier1/asl-indexes-1.md
Normal file
@@ -0,0 +1,139 @@
# ASL/INDEXES/1 -- Index Taxonomy and Relationships
|
||||
|
||||
Status: Draft
|
||||
Owner: Architecture
|
||||
Version: 0.1.0
|
||||
SoT: No
|
||||
Last Updated: 2025-01-17
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [indexes, content, structural, materialization]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/asl-indexes-1.md | Canonical: /amduat/tier1/asl-indexes-1.md -->
|
||||
|
||||
**Document ID:** `ASL/INDEXES/1`
|
||||
**Layer:** L2 -- Index taxonomy (no encoding)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/1-CORE-INDEX`
|
||||
* `ASL/STORE-INDEX/1`
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ASL/SYSTEM/1`
|
||||
* `TGK/1`
|
||||
* `ENC/ASL-CORE-INDEX/1`
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 0. Conventions
|
||||
|
||||
The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.
|
||||
|
||||
ASL/INDEXES/1 defines index roles and relationships. It does not define encodings or storage layouts.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
This document defines the minimal set of indexes used by ASL systems and their dependency relationships.
|
||||
|
||||
---
|
||||
|
||||
## 2. Index Taxonomy (Normative)
|
||||
|
||||
ASL systems use three distinct indexes:
|
||||
|
||||
### 2.1 Content Index
|
||||
|
||||
Purpose: map semantic identity to bytes.
|
||||
|
||||
```
|
||||
ArtifactKey -> ArtifactLocation
|
||||
```
|
||||
|
||||
Properties:
|
||||
|
||||
* Snapshot-relative and append-only
|
||||
* Deterministic replay
|
||||
* Optional tombstone shadowing
|
||||
|
||||
This is the index defined by ASL/1-CORE-INDEX, and it is the only index that governs visibility.
|
||||
|
||||
### 2.2 Structural Index
|
||||
|
||||
Purpose: map structural identity to a derivation DAG node.
|
||||
|
||||
```
|
||||
SID -> DAG node
|
||||
```
|
||||
|
||||
Properties:
|
||||
|
||||
* Deterministic and rebuildable
|
||||
* Does not imply materialization
|
||||
* May be in-memory or persisted
|
||||
|
||||
### 2.3 Materialization Cache
|
||||
|
||||
Purpose: record previously materialized content for a structural identity.
|
||||
|
||||
```
|
||||
SID -> ArtifactKey
|
||||
```
|
||||
|
||||
Properties:
|
||||
|
||||
* Redundant and safe to drop
|
||||
* Recomputable from DAG + content index
|
||||
* Pure performance optimization
|
||||
|
||||
---
|
||||
|
||||
## 3. Dependency Rules (Normative)
|
||||
|
||||
Dependencies MUST follow this direction:
|
||||
|
||||
```
|
||||
Structural Index -> Materialization Cache -> Content Index
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
* The Content Index MUST NOT depend on the Structural Index.
|
||||
* The Structural Index MUST NOT depend on stored bytes.
|
||||
* The Materialization Cache MAY depend on both.
|
||||
|
||||
---
|
||||
|
||||
## 4. PUT/GET Interaction (Informative)
|
||||
|
||||
* PUT registers structure (if used), resolves to an ArtifactKey, and updates the Content Index.
|
||||
* GET consults only the Content Index and reads bytes from the store.
|
||||
* The Structural Index and Materialization Cache are optional optimizations for PUT.
|
||||
|
||||
---
|
||||
|
||||
## 5. Non-Goals
|
||||
|
||||
ASL/INDEXES/1 does not define:
|
||||
|
||||
* Encodings for any index
|
||||
* Storage layout or sharding
|
||||
* Query operators or traversal semantics
|
||||
314
tier1/asl-log-1.md
Normal file
@@ -0,0 +1,314 @@
# ASL/LOG/1 — Append-Only Semantic Log
|
||||
|
||||
Status: Draft
|
||||
Owner: Niklas Rydberg
|
||||
Version: 0.1.0
|
||||
SoT: No
|
||||
Last Updated: 2025-11-16
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [deterministic, log, snapshot]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/asl-log-1.md | Canonical: /amduat/tier1/asl-log-1.md -->
|
||||
|
||||
**Document ID:** `ASL/LOG/1`
|
||||
**Layer:** L1 — Domain log semantics (no transport)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/STORE-INDEX/1` — store lifecycle and replay contracts (pending spec)
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ASL/1-CORE-INDEX` — index semantics
|
||||
* `TGK/1` — TGK edge visibility and traversal alignment
|
||||
* `ENC/ASL-LOG/1` — bytes-on-disk encoding profile
|
||||
* `ENC/ASL-CORE-INDEX/1` — index segment encoding
|
||||
* `ASL/SYSTEM/1` — unified system view (PEL/TGK/federation alignment)
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 0. Conventions
|
||||
|
||||
The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.
|
||||
|
||||
ASL/LOG/1 defines **semantic log behavior**. It does not define transport, replication protocols, or storage layout.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
ASL/LOG/1 defines the **authoritative, append-only log** for an ASL domain.
|
||||
|
||||
The log records **semantic commits** that affect:
|
||||
|
||||
* Index segment visibility
|
||||
* Tombstone policy
|
||||
* Snapshot anchoring
|
||||
* Optional publication metadata
|
||||
|
||||
The log is the **sole source of truth** for reconstructing CURRENT state.
|
||||
|
||||
---
|
||||
|
||||
## 2. Core Properties (Normative)
|
||||
|
||||
An ASL log MUST be:
|
||||
|
||||
1. Append-only
|
||||
2. Strictly ordered
|
||||
3. Deterministically replayable
|
||||
4. Hash-chained
|
||||
5. Snapshot-anchorable
|
||||
6. Binary encoded per `ENC/ASL-LOG/1`
|
||||
7. Forward-compatible
|
||||
|
||||
---
|
||||
|
||||
## 3. Log Model
|
||||
|
||||
### 3.1 Log Sequence
|
||||
|
||||
Each record has a monotonically increasing `logseq`:
|
||||
|
||||
```
|
||||
logseq: uint64
|
||||
```
|
||||
|
||||
* Assigned by the domain authority
|
||||
* Total order within a domain
|
||||
* Never reused
|
||||
|
||||
### 3.2 Hash Chain
|
||||
|
||||
Each record commits to the previous record:
|
||||
|
||||
```
|
||||
record_hash = H(prev_record_hash || logseq || record_type || payload_len || payload)
|
||||
```
|
||||
|
||||
This enables tamper detection, witness signing, and federation verification.
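
A minimal sketch of chain verification follows. It is illustrative only: `H` is replaced by 64-bit FNV-1a, which is not a cryptographic hash and would not be suitable for real tamper detection, and the little-endian serialization of the integer fields is an assumption (the exact byte layout belongs to the encoding profile). The `LogRecord` struct and the `genesis` value for the first record's predecessor are likewise hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t       logseq;
    uint32_t       record_type;
    uint32_t       payload_len;
    const uint8_t *payload;
    uint64_t       record_hash; /* stand-in width; a real hash is wider */
} LogRecord;

/* Incremental FNV-1a over a byte range. */
static uint64_t fnv_feed(uint64_t h, const uint8_t *p, size_t n)
{
    for (size_t i = 0; i < n; i++) { h ^= p[i]; h *= 0x100000001b3ULL; }
    return h;
}

/* Feeds the low `width` bytes of v in little-endian order (assumed). */
static uint64_t fnv_feed_le(uint64_t h, uint64_t v, size_t width)
{
    for (size_t i = 0; i < width; i++) {
        uint8_t byte = (uint8_t)(v >> (8 * i));
        h = fnv_feed(h, &byte, 1);
    }
    return h;
}

/* H(prev_record_hash || logseq || record_type || payload_len || payload) */
static uint64_t chain_hash(uint64_t prev, const LogRecord *r)
{
    assert(r->payload != NULL || r->payload_len == 0);
    uint64_t h = 0xcbf29ce484222325ULL;
    h = fnv_feed_le(h, prev, 8);
    h = fnv_feed_le(h, r->logseq, 8);
    h = fnv_feed_le(h, r->record_type, 4);
    h = fnv_feed_le(h, r->payload_len, 4);
    return fnv_feed(h, r->payload, r->payload_len);
}

/* Checks strictly increasing logseq and the hash chain from genesis. */
static bool chain_verify(const LogRecord *recs, size_t n, uint64_t genesis)
{
    uint64_t prev = genesis;
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && recs[i].logseq <= recs[i - 1].logseq) return false;
        if (chain_hash(prev, &recs[i]) != recs[i].record_hash) return false;
        prev = recs[i].record_hash;
    }
    return true;
}
```

Because each hash covers the previous one, altering any record invalidates every later `record_hash` unless the whole suffix is recomputed.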
|
||||
|
||||
### 3.3 Record Envelope
|
||||
|
||||
All log records share a common envelope whose **exact byte layout** is defined
|
||||
in `ENC-ASL-LOG`. The envelope MUST include:
|
||||
|
||||
* `logseq` (monotonic sequence number)
|
||||
* `record_type` (type tag)
|
||||
* `payload_len` (bytes)
|
||||
* `payload` (type-specific bytes)
|
||||
* `record_hash` (hash-chained integrity)
|
||||
|
||||
---
|
||||
|
||||
## 4. Record Types (Normative)
|
||||
|
||||
### 4.0 Common Payload Encoding (Informative)
|
||||
|
||||
The byte-level payload schemas are defined in `ENC-ASL-LOG`. The shared
|
||||
artifact reference encoding is:
|
||||
|
||||
```c
#include <stdint.h>

typedef struct {
    uint32_t hash_id;
    uint16_t digest_len;
    uint16_t reserved0;  // must be 0
    uint8_t  digest[];   // digest_len bytes follow (flexible array member)
} ArtifactRef;
```
|
||||
|
||||
### 4.1 SEGMENT_SEAL
|
||||
|
||||
Declares an index segment visible.
|
||||
|
||||
Payload (encoding):
|
||||
|
||||
```c
typedef struct {
    uint64_t segment_id;
    uint8_t  segment_hash[32];
} SegmentSealPayload;
```
|
||||
|
||||
Semantics:
|
||||
|
||||
* From this `logseq` onward, the referenced segment is visible for lookup and replay.
|
||||
* Segment MUST be immutable.
|
||||
* All referenced blocks MUST already be sealed.
|
||||
* Segment contents are not re-logged.
|
||||
|
||||
### 4.2 TOMBSTONE
|
||||
|
||||
Declares an artifact inadmissible under domain policy.
|
||||
|
||||
Payload (encoding):
|
||||
|
||||
```c
typedef struct {
    ArtifactRef artifact;    // variable-length; read as schema notation, not a literal C layout
    uint32_t    scope;
    uint32_t    reason_code;
} TombstonePayload;
```
|
||||
|
||||
Semantics:
|
||||
|
||||
* Does not delete data.
|
||||
* Shadows prior visibility.
|
||||
* Applies from this logseq onward.
|
||||
|
||||
### 4.3 TOMBSTONE_LIFT
|
||||
|
||||
Supersedes a previous tombstone.
|
||||
|
||||
Payload (encoding):
|
||||
|
||||
```c
typedef struct {
    ArtifactRef artifact;          // variable-length; read as schema notation, not a literal C layout
    uint64_t    tombstone_logseq;  // logseq of the TOMBSTONE being lifted
} TombstoneLiftPayload;
```
|
||||
|
||||
Semantics:
|
||||
|
||||
* References an earlier TOMBSTONE.
|
||||
* Does not erase history.
|
||||
* Only affects CURRENT at or above this logseq.
|
||||
|
||||
### 4.4 SNAPSHOT_ANCHOR
|
||||
|
||||
Binds semantic state to a snapshot.
|
||||
|
||||
Payload (encoding):
|
||||
|
||||
```c
typedef struct {
    uint64_t snapshot_id;
    uint8_t  root_hash[32];
} SnapshotAnchorPayload;
```
|
||||
|
||||
Semantics:
|
||||
|
||||
* Defines a replay checkpoint.
|
||||
* Enables log truncation below the anchor, subject to the garbage collection constraints in Section 7.
|
||||
|
||||
### 4.5 ARTIFACT_PUBLISH (Optional)
|
||||
|
||||
Marks an artifact as published.
|
||||
|
||||
Payload (encoding):
|
||||
|
||||
```c
typedef struct {
    ArtifactRef artifact;
} ArtifactPublishPayload;
```
|
||||
|
||||
Semantics:
|
||||
|
||||
* Publication is domain-local.
|
||||
* Federation layers may interpret this metadata.
|
||||
|
||||
### 4.6 ARTIFACT_UNPUBLISH (Optional)
|
||||
|
||||
Withdraws publication.
|
||||
|
||||
Payload (encoding):
|
||||
|
||||
```c
typedef struct {
    ArtifactRef artifact;
} ArtifactUnpublishPayload;
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Replay Semantics (Normative)
|
||||
|
||||
To reconstruct CURRENT:
|
||||
|
||||
1. Load latest snapshot anchor (if any).
|
||||
2. Initialize visible segments from that snapshot.
|
||||
3. Replay all log records with `logseq > snapshot.logseq`.
|
||||
4. Apply records in order:
|
||||
|
||||
* SEGMENT_SEAL -> add segment
|
||||
* TOMBSTONE -> update policy state
|
||||
* TOMBSTONE_LIFT -> override policy
|
||||
* ARTIFACT_PUBLISH / ARTIFACT_UNPUBLISH -> update publication metadata
|
||||
|
||||
Replay MUST be deterministic.
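
The replay steps above can be sketched as a single pass over the records. This is a simplified model under stated assumptions: the numeric record type values are invented for the example, segments and artifacts are reduced to small ids (0 to 63) tracked in bitmasks, and TOMBSTONE_LIFT clears the tombstone for its artifact without checking the referenced `tombstone_logseq`. Unknown record types are skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed numeric values; the real tags come from the encoding profile. */
enum {
    REC_SEGMENT_SEAL = 1,
    REC_TOMBSTONE = 2,
    REC_TOMBSTONE_LIFT = 3,
    REC_ARTIFACT_PUBLISH = 4,
    REC_ARTIFACT_UNPUBLISH = 5
};

typedef struct {
    uint64_t logseq;
    uint32_t type;
    uint32_t subject; /* hypothetical small segment or artifact id (0..63) */
} Rec;

typedef struct {
    uint64_t visible_segments; /* bit i set: segment i is visible */
    uint64_t tombstoned;       /* bit i set: artifact i is tombstoned */
    uint64_t published;        /* bit i set: artifact i is published */
} State;

/* Applies, in order, every record after the snapshot position to the
 * state restored from that snapshot. */
static State replay(State base, uint64_t snapshot_logseq,
                    const Rec *recs, size_t n)
{
    assert(recs != NULL || n == 0);
    State s = base;
    for (size_t i = 0; i < n; i++) {
        if (recs[i].logseq <= snapshot_logseq) continue; /* covered by snapshot */
        uint64_t bit = 1ULL << (recs[i].subject & 63u);
        switch (recs[i].type) {
        case REC_SEGMENT_SEAL:       s.visible_segments |= bit; break;
        case REC_TOMBSTONE:          s.tombstoned |= bit;       break;
        case REC_TOMBSTONE_LIFT:     s.tombstoned &= ~bit;      break;
        case REC_ARTIFACT_PUBLISH:   s.published |= bit;        break;
        case REC_ARTIFACT_UNPUBLISH: s.published &= ~bit;       break;
        default: break; /* unknown record type: skipped */
        }
    }
    return s;
}
```

Replaying the full log from an empty state and replaying only the tail from a snapshot should produce the same state, which is one way to test an implementation.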
|
||||
|
||||
---
|
||||
|
||||
## 6. Index Interaction
|
||||
|
||||
* Index segments contain index entries.
|
||||
* The log never records individual index entries.
|
||||
* Visibility is controlled solely by SEGMENT_SEAL.
|
||||
* Index rebuild = scan visible segments + apply policy.
|
||||
|
||||
---
|
||||
|
||||
## 7. Garbage Collection Constraints
|
||||
|
||||
* A segment may be GC'd only if:
|
||||
|
||||
* No snapshot references it.
|
||||
* No replay of a log prefix up to CURRENT requires it.
|
||||
|
||||
* Log truncation is only safe at SNAPSHOT_ANCHOR boundaries.
|
||||
|
||||
---
|
||||
|
||||
## 8. Versioning & Extensibility
|
||||
|
||||
* Unknown record types MUST be skipped and MUST NOT break replay.
|
||||
* Payloads are opaque outside their type.
|
||||
* New record types may be added in later versions.
|
||||
|
||||
---
|
||||
|
||||
## 9. Non-Goals
|
||||
|
||||
ASL/LOG/1 does not define:
|
||||
|
||||
* Federation protocols
|
||||
* Network replication
|
||||
* Witness signatures
|
||||
* Block-level events
|
||||
* Hydration / eviction
|
||||
* Execution receipts
|
||||
|
||||
---
|
||||
|
||||
## 10. Invariant (Informative)
|
||||
|
||||
> If it affects visibility, admissibility, or authority, it goes in the log.
|
||||
> If it affects layout or performance, it does not.
|
||||
|
||||
---
|
||||
|
||||
## 11. Summary
|
||||
|
||||
ASL/LOG/1 defines the minimal semantic log needed to reconstruct CURRENT.
|
||||
|
||||
If it affects visibility or admissibility, it goes in the log. If it affects layout or performance, it does not.
|
||||
414
tier1/asl-store-index-1.md
Normal file
@@ -0,0 +1,414 @@
# ASL/STORE-INDEX/1 — Store Semantics and Contracts for ASL Core Index
|
||||
|
||||
Status: Draft
|
||||
Owner: Niklas Rydberg
|
||||
Version: 0.1.0
|
||||
SoT: No
|
||||
Last Updated: 2025-11-16
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [deterministic, index, log, storage]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/asl-store-index.md | Canonical: /amduat/tier1/asl-store-index-1.md -->
|
||||
|
||||
**Document ID:** `ASL/STORE-INDEX/1`
|
||||
**Layer:** L1 — Store lifecycle and replay contracts (no encoding)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/1-CORE-INDEX` — semantic index model
|
||||
* `ASL/LOG/1` — append-only log semantics
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ENC/ASL-CORE-INDEX/1` — index segment encoding
|
||||
* `ASL/SYSTEM/1` — unified system view (PEL/TGK/federation alignment)
|
||||
* `TGK/1` — TGK semantics and visibility alignment
|
||||
* `TGK/1-CORE` — EdgeBody and EdgeTypeId definitions
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
This document defines the **operational and store-level semantics** required to implement ASL-CORE-INDEX.
|
||||
|
||||
It specifies:
|
||||
|
||||
* **Block lifecycle**: creation, sealing, retention, GC
|
||||
* **Index segment lifecycle**: creation, append, seal, visibility
|
||||
* **Snapshot identity and log positions** for deterministic replay
|
||||
* **Append-only log semantics**
|
||||
* **Lookup, visibility, and crash recovery rules**
|
||||
* **Small vs large block handling**
|
||||
|
||||
It **does not define encoding** (see `ENC/ASL-CORE-INDEX/1`) or semantic mapping (see `ASL/1-CORE-INDEX`).
|
||||
|
||||
---
|
||||
|
||||
## 2. Scope
|
||||
|
||||
Covers:
|
||||
|
||||
* Lifecycle of **blocks** and **index entries**
|
||||
* Snapshot and CURRENT consistency guarantees
|
||||
* Deterministic replay and recovery
|
||||
* GC and tombstone semantics
|
||||
* Packing policy for small vs large artifacts
|
||||
|
||||
Excludes:
|
||||
|
||||
* Disk-level encoding
|
||||
* Sharding or acceleration strategies (see ASL/INDEX-ACCEL/1)
|
||||
* Memory residency or caching
|
||||
* Federation, PEL, or TGK semantics (see `TGK/1` and `TGK/1-CORE`)
|
||||
|
||||
---
|
||||
|
||||
## 3. Core Concepts
|
||||
|
||||
### 3.1 Block
|
||||
|
||||
* **Definition:** Immutable storage unit containing artifact bytes.
|
||||
* **Identifier:** BlockID (opaque, unique).
|
||||
* **Properties:**
  * Once sealed, contents never change.
  * Can be referenced by multiple artifacts.
  * May be pinned by snapshots for retention.
  * Allocation method is implementation-defined (e.g., hash or sequence).
|
||||
|
||||
### 3.2 Index Segment
|
||||
|
||||
Segments group index entries and provide **persistence and recovery units**.
|
||||
|
||||
* **Open segment:** accepting new index entries, not visible for lookup.
|
||||
* **Sealed segment:** closed for append, log-visible, snapshot-pinnable.
|
||||
* **Segment components:** header, optional bloom filter, index records, footer.
|
||||
* **Segment visibility:** only after seal and log append.
|
||||
|
||||
### 3.3 Append-Only Log
|
||||
|
||||
All store-visible mutations are recorded in a **strictly ordered, append-only log**:
|
||||
|
||||
* Entries include:
  * Index additions
  * Tombstones
  * Segment seals
|
||||
* Log is replayable to reconstruct CURRENT.
|
||||
* Log semantics are defined in `ASL/LOG/1`.
|
||||
|
||||
### 3.4 Snapshot Identity and Log Position
|
||||
|
||||
To make CURRENT referencable and replayable, ASL-STORE-INDEX defines:
|
||||
|
||||
* **SnapshotID**: opaque, immutable identifier for a snapshot.
|
||||
* **LogPosition**: monotonic integer position in the append-only log.
|
||||
* **IndexState**: `(SnapshotID, LogPosition)`.
|
||||
|
||||
Deterministic replay is defined as:
|
||||
|
||||
```
Index(SnapshotID, LogPosition) = Snapshot[SnapshotID] + replay(log[0:LogPosition])
```
|
||||
|
||||
Snapshots and log positions are required for checkpointing, federation, and deterministic recovery.
|
||||
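The replay definition above can be sketched as follows. This is a minimal illustration, not the normative record set: the `LogRecord` and `IndexView` shapes and the two record kinds used here are assumptions for the example, and the actual record types are defined in `ASL/LOG/1`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogRecord:
    kind: str    # hypothetical kinds: "SEGMENT_SEAL", "TOMBSTONE"
    target: str  # segment id or artifact key, depending on kind


@dataclass
class IndexView:
    visible_segments: list = field(default_factory=list)  # in seal order
    tombstoned: set = field(default_factory=set)


def replay(snapshot: IndexView, log: list, log_position: int) -> IndexView:
    """Rebuild Index(SnapshotID, LogPosition) from a snapshot plus a log prefix."""
    # Copy the snapshot state so replay never mutates the snapshot itself.
    view = IndexView(list(snapshot.visible_segments), set(snapshot.tombstoned))
    for record in log[:log_position]:
        if record.kind == "SEGMENT_SEAL":
            # Idempotent: a segment already captured by the snapshot is kept once.
            if record.target not in view.visible_segments:
                view.visible_segments.append(record.target)
        elif record.kind == "TOMBSTONE":
            view.tombstoned.add(record.target)
        # Unknown record kinds are skipped and do not break replay.
    return view
```

Because the result depends only on the snapshot and the log prefix, two nodes that replay the same `(SnapshotID, LogPosition)` obtain the same view.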
|
||||
### 3.5 Artifact Location
|
||||
|
||||
* **ArtifactExtent**: `(BlockID, offset, length)` identifying a byte slice within a block.
|
||||
* **ArtifactLocation**: ordered list of `ArtifactExtent` values that, when concatenated, produce the artifact bytes.
|
||||
* Multi-extent locations allow a single artifact to be striped across multiple blocks.
|
||||
|
||||
---
|
||||
|
||||
## 4. PUT/GET Contract (Normative)
|
||||
|
||||
### 4.1 PUT Signature
|
||||
|
||||
```
put(artifact) -> (ArtifactKey, IndexState)
```
|
||||
|
||||
* `ArtifactKey` is the content identity (ASL/1-CORE-INDEX).
|
||||
* `IndexState = (SnapshotID, LogPosition)` after the PUT is admitted.
|
||||
|
||||
### 4.2 PUT Semantics
|
||||
|
||||
1. **Structural registration (if applicable)**: if a structural index (SID -> DAG) exists, PUT MUST register the artifact in it and reuse existing SID entries.
|
||||
2. **Materialization (if applicable)**: if the artifact is lazy, materialize deterministically to derive `ArtifactKey`.
|
||||
3. **Deduplication**: lookup `ArtifactKey` at CURRENT. If present, PUT MUST succeed without writing bytes or adding a new index entry.
|
||||
4. **Storage**: if absent, write bytes to one or more blocks that are sealed before the index entry becomes visible, and produce `ArtifactLocation`.
|
||||
5. **Index mutation**: append an index entry mapping `ArtifactKey -> ArtifactLocation` and record visibility via log order.
|
||||
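Steps 3 to 5 above can be sketched with a toy in-memory store. Everything here is an assumption made for illustration: the `TinyStore` class, SHA-256 as the `ArtifactKey` derivation, one block per artifact, and a log position that simply counts admitted index mutations. Segments, seals, and structural registration are left out.

```python
import hashlib


class TinyStore:
    """Toy in-memory store; names and layout are illustrative only."""

    def __init__(self):
        self.blocks = {}        # BlockID -> sealed (immutable) bytes
        self.index = {}         # ArtifactKey -> [(BlockID, offset, length)]
        self.log_position = 0   # grows by one per admitted index mutation
        self.snapshot_id = "snap-0"

    def put(self, artifact: bytes):
        key = hashlib.sha256(artifact).hexdigest()  # assumed key derivation
        if key not in self.index:                   # step 3: deduplication
            block_id = f"block-{len(self.blocks)}"
            # Step 4: store an immutable copy, standing in for write + seal.
            self.blocks[block_id] = bytes(artifact)
            # Step 5: index mutation, ordered by log position.
            self.index[key] = [(block_id, 0, len(artifact))]
            self.log_position += 1
        return key, (self.snapshot_id, self.log_position)
```

A second `put` of identical bytes returns the same key and the same `IndexState` without writing a new block, which is the idempotence property stated in the guarantees.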
|
||||
### 4.3 PUT Guarantees
|
||||
|
||||
* PUT is idempotent for identical artifacts.
|
||||
* No visible index entry points to mutable or missing bytes.
|
||||
* Visibility follows log order and seal rules defined in this document.
|
||||
|
||||
### 4.4 GET Signature
|
||||
|
||||
```
get(ArtifactKey, IndexState?) -> bytes | NOT_FOUND
```
|
||||
|
||||
* `IndexState` defaults to CURRENT when omitted.
|
||||
|
||||
### 4.5 GET Semantics
|
||||
|
||||
1. Resolve `ArtifactKey -> ArtifactLocation` using `Index(SnapshotID, LogPosition)` for the requested `IndexState`.
|
||||
2. If no entry exists, return `NOT_FOUND`.
|
||||
3. Otherwise, read exactly the bytes of each referenced `(BlockID, offset, length)` extent, concatenate them in order, and return the result verbatim.
|
||||
|
||||
GET MUST NOT mutate state or trigger materialization.
|
||||
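The GET steps above can be sketched as a pure function over an already reconstructed index view. The dictionary shapes and the `NOT_FOUND` sentinel are assumptions for this example; a real store would first resolve the view for the requested `IndexState`.

```python
NOT_FOUND = object()  # sentinel standing in for the NOT_FOUND result


def get(index: dict, blocks: dict, artifact_key: str):
    """Resolve a key against an index view and read its bytes.

    index:  ArtifactKey -> list of (BlockID, offset, length) extents
    blocks: BlockID -> sealed block bytes
    """
    location = index.get(artifact_key)
    if location is None:
        return NOT_FOUND
    # Read each extent exactly and concatenate in order; nothing is mutated.
    return b"".join(
        blocks[block_id][offset:offset + length]
        for block_id, offset, length in location
    )
```

The function only reads from its arguments, which mirrors the requirement that GET must not mutate state or trigger materialization.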
|
||||
### 4.6 Failure Semantics
|
||||
|
||||
* Partial writes MUST NOT become visible.
|
||||
* Replay of snapshot + log after crash MUST reconstruct a valid CURRENT.
|
||||
* Implementations MAY use caching, but MUST preserve determinism.
|
||||
|
||||
---
|
||||
|
||||
## 5. Block Lifecycle Semantics
|
||||
|
||||
| Event              | Description                           | Semantic Guarantees                                           |
| ------------------ | ------------------------------------- | ------------------------------------------------------------- |
| Creation           | Block allocated; bytes may be written | Not visible to index until sealed                             |
| Sealing            | Block is finalized and immutable      | Sealed blocks are stable and safe to reference from index     |
| Retention          | Block remains accessible              | Blocks referenced by snapshots or CURRENT must not be removed |
| Garbage Collection | Block may be deleted                  | Only unpinned, unreachable blocks may be removed              |
|
||||
|
||||
Notes:
|
||||
|
||||
* Sealing ensures any index entry referencing the block is immutable.
|
||||
* Retention is driven by snapshot and log visibility rules.
|
||||
* GC must **never violate CURRENT reconstruction guarantees**.
|
||||
|
||||
---
|
||||
|
||||
## 6. Segment Lifecycle Semantics
|
||||
|
||||
### 6.1 Creation
|
||||
|
||||
* Open segment is allocated.
|
||||
* Index entries appended in log order.
|
||||
* Entries are invisible until segment seal and log append.
|
||||
|
||||
### 6.2 Seal
|
||||
|
||||
* Segment is closed to append.
|
||||
* Seal record is written to append-only log.
|
||||
* Segment becomes visible for lookup.
|
||||
* Sealed segment may be snapshot-pinned.
|
||||
|
||||
### 6.3 Snapshot Interaction
|
||||
|
||||
* Snapshots capture sealed segments.
|
||||
* Open segments need not survive snapshot.
|
||||
* Segments below snapshot are replay anchors.
|
||||
|
||||
---
|
||||
|
||||
## 7. Visibility and Lookup Semantics
|
||||
|
||||
### 7.1 Visibility Rules
|
||||
|
||||
* Entry visible **iff**:
  * The block is sealed.
  * Log record exists at position ≤ CURRENT.
  * Segment seal recorded in log.
|
||||
|
||||
* Entries above CURRENT or referencing unsealed blocks are invisible.
|
||||
|
||||
### 7.2 Lookup Semantics
|
||||
|
||||
To resolve an `ArtifactKey`:
|
||||
|
||||
1. Identify all visible segments ≤ CURRENT.
|
||||
2. Search segments in **reverse seal-log order** (highest seal log position first).
|
||||
3. Return first matching entry.
|
||||
4. Respect tombstones to shadow prior entries.
|
||||
|
||||
Determinism:
|
||||
|
||||
* Lookup results are identical across platforms given the same snapshot and log prefix.
|
||||
* Accelerations (bloom filters, sharding, SIMD) **do not alter correctness**.
|
||||
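The lookup steps above can be sketched as follows. The representation is an assumption for this example: each visible segment is a pair of its seal log position and a dictionary of entries, and a tombstone is modelled as a marker entry stored alongside regular entries.

```python
TOMBSTONE = object()  # marker entry that shadows earlier mappings


def lookup(segments: list, artifact_key: str):
    """Resolve a key over the segments visible at CURRENT.

    segments: list of (seal_log_position, entries); entries maps
              ArtifactKey -> location, or TOMBSTONE.
    Returns the location, or None when the key is absent or shadowed.
    """
    # Search in reverse seal-log order: highest seal position first.
    ordered = sorted(segments, key=lambda seg: seg[0], reverse=True)
    for _seal_position, entries in ordered:
        if artifact_key in entries:
            hit = entries[artifact_key]
            # The first match wins; a tombstone hides every older entry.
            return None if hit is TOMBSTONE else hit
    return None
```

Sorting by seal position inside the function makes the result independent of the order in which segments are supplied, so the outcome depends only on the visible set.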
|
||||
---
|
||||
|
||||
## 8. Snapshot Interaction
|
||||
|
||||
* Snapshots capture the set of **sealed blocks** and **sealed index segments** at a point in time.
|
||||
* Blocks referenced by a snapshot are **pinned** and cannot be garbage-collected until snapshot expiration.
|
||||
* CURRENT is reconstructed as:
|
||||
|
||||
```
CURRENT = snapshot_state + replay(log)
```
|
||||
|
||||
Segment and block visibility rules:
|
||||
|
||||
| Entity               | Visible in snapshot          | Visible in CURRENT             |
| -------------------- | ---------------------------- | ------------------------------ |
| Open segment/block   | No                           | Only after seal and log append |
| Sealed segment/block | Yes, if included in snapshot | Yes, replayed from log         |
| Tombstone            | Yes, if log-recorded         | Yes, shadows prior entries     |
|
||||
|
||||
---
|
||||
|
||||
## 9. Garbage Collection
|
||||
|
||||
Eligibility for GC:
|
||||
|
||||
* Segments: sealed, no references from CURRENT or snapshots.
|
||||
* Blocks: unpinned, unreferenced by any segment or artifact.
|
||||
|
||||
Rules:
|
||||
|
||||
* GC is safe **only on sealed segments and blocks**.
|
||||
* Must respect snapshot pins.
|
||||
* Tombstones may aid in invalidating unreachable blocks.
|
||||
* Snapshots retained for provenance or receipt verification MUST remain pinned.
|
||||
|
||||
Outcome:
|
||||
|
||||
* GC never violates CURRENT reconstruction.
|
||||
* Blocks can be reclaimed without breaking provenance.
|
||||
|
||||
---
|
||||
|
||||
## 10. Tombstone Semantics
|
||||
|
||||
* Optional marker to invalidate prior mappings.
|
||||
* Visibility rules identical to regular index entries.
|
||||
* Used to maintain deterministic CURRENT in face of shadowing or deletions.
|
||||
|
||||
---
|
||||
|
||||
## 11. Small vs Large Block Handling
|
||||
|
||||
### 11.1 Definitions
|
||||
|
||||
| Term              | Meaning                                                               |
| ----------------- | --------------------------------------------------------------------- |
| **Small block**   | Block containing artifact bytes below a threshold `T_small`.          |
| **Large block**   | Block containing artifact bytes ≥ `T_small`.                          |
| **Mixed segment** | Segment containing both small and large blocks (discouraged).         |
| **Packing**       | Combining multiple small artifacts into a single physical block.      |
| **BlockID**       | Opaque identifier for a block; addressing is identical for all sizes. |
|
||||
|
||||
Small vs large classification is **store-level only** and transparent to ASL-CORE and index layers.
|
||||
`T_small` is configurable per deployment.
|
||||
|
||||
### 11.2 Packing Rules
|
||||
|
||||
1. **Small blocks may be packed together** to reduce storage overhead.
|
||||
2. **Large blocks are never packed with other artifacts**.
|
||||
3. Mixed segments are **allowed but discouraged**; implementations MAY warn when mixing occurs.
|
||||
|
||||
### 11.3 Segment Allocation Rules
|
||||
|
||||
1. Small blocks are allocated into segments optimized for packing efficiency.
|
||||
2. Large blocks are allocated into segments optimized for sequential I/O.
|
||||
3. Segment sealing and visibility rules remain unchanged.
|
||||
|
||||
### 11.4 Indexing and Addressing
|
||||
|
||||
All blocks are addressed uniformly:
|
||||
|
||||
```
ArtifactExtent = (BlockID, offset, length)
ArtifactLocation = [ArtifactExtent...]
```
|
||||
|
||||
Packing does **not** affect index semantics or determinism. Multi-extent ArtifactLocations are allowed.
|
||||
|
||||
### 11.5 GC and Retention
|
||||
|
||||
1. Packed small blocks can be reclaimed only when **all contained artifacts** are unreachable.
|
||||
2. Large blocks are reclaimed per block.
|
||||
|
||||
Invariant: GC must never remove bytes still referenced by CURRENT or snapshots.
|
||||
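The reclamation rule for packed blocks can be sketched as a set check. The input shapes are assumptions for this example: a mapping from each BlockID to the artifact keys stored in it, and the set of keys still reachable from CURRENT or any retained snapshot.

```python
def reclaimable_blocks(block_contents: dict, reachable_keys: set) -> set:
    """Return the BlockIDs that may be garbage-collected.

    block_contents: BlockID -> set of ArtifactKeys stored in that block
    reachable_keys: keys still referenced by CURRENT or a retained snapshot
    """
    return {
        block_id
        for block_id, keys in block_contents.items()
        # A block is reclaimable only when none of its artifacts is reachable.
        if keys.isdisjoint(reachable_keys)
    }
```

A large block holding one artifact is the single-element case of the same check, so the same function covers both rules.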
|
||||
---
|
||||
|
||||
## 12. Crash and Recovery Semantics
|
||||
|
||||
* Open segments or unsealed blocks may be lost; no invariant is broken.
|
||||
* Recovery procedure:
  1. Mount last checkpoint snapshot.
  2. Replay append-only log from checkpoint.
  3. Reconstruct CURRENT.
|
||||
|
||||
* Recovery is **deterministic and idempotent**.
|
||||
* Segments and blocks **never partially visible** after crash.
|
||||
|
||||
---
|
||||
|
||||
## 13. Normative Invariants
|
||||
|
||||
1. Sealed blocks are immutable.
|
||||
2. Index entries referencing blocks are immutable once visible.
|
||||
3. Shadowing follows strict log order.
|
||||
4. Replay of snapshot + log uniquely reconstructs CURRENT.
|
||||
5. GC cannot remove blocks or segments needed by snapshot or CURRENT.
|
||||
6. Tombstones shadow prior entries without deleting underlying blocks prematurely.
|
||||
7. IndexState `(SnapshotID, LogPosition)` uniquely identifies CURRENT.
|
||||
|
||||
---
|
||||
|
||||
## 14. Non-Goals
|
||||
|
||||
* Disk-level encoding (ENC-ASL-CORE-INDEX).
|
||||
* Memory layout or caching.
|
||||
* Sharding or performance heuristics.
|
||||
* Federation / multi-domain semantics (handled elsewhere).
|
||||
* Block packing strategies beyond the policy rules here.
|
||||
|
||||
---
|
||||
|
||||
## 15. Relationship to Other Layers
|
||||
|
||||
| Layer              | Responsibility                                                               |
| ------------------ | ---------------------------------------------------------------------------- |
| ASL-CORE           | Artifact semantics, existence of blocks, immutability                        |
| ASL-CORE-INDEX     | Semantic mapping of ArtifactKey → ArtifactLocation                           |
| ASL-STORE-INDEX    | Lifecycle and operational contracts for blocks and segments                  |
| ENC-ASL-CORE-INDEX | Bytes-on-disk layout for segments, index records, and optional bloom filters |
|
||||
|
||||
---
|
||||
|
||||
## 16. Summary
|
||||
|
||||
The tier1 ASL-STORE-INDEX specification:
|
||||
|
||||
* Defines **block lifecycle** and **segment lifecycle**.
|
||||
* Makes **snapshot identity and log positions** explicit for replay.
|
||||
* Ensures deterministic visibility, lookup, and crash recovery.
|
||||
* Formalizes GC safety and tombstone behavior.
|
||||
* Adds clear **small vs large block** handling without changing core semantics.
|
||||
213
tier1/asl-system-1.md
Normal file
|
|
@ -0,0 +1,213 @@
|
|||
# ASL/SYSTEM/1 — Unified ASL + TGK + PEL System View
|
||||
|
||||
Status: Draft
|
||||
Owner: Architecture
|
||||
Version: 0.1.0
|
||||
SoT: No
|
||||
Last Updated: 2025-01-17
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [deterministic, federation, pel, tgk, index]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/asl-system-1.md | Canonical: /amduat/tier1/asl-system-1.md -->
|
||||
|
||||
**Document ID:** `ASL/SYSTEM/1`
|
||||
**Layer:** L2 — Cross-cutting system view (no new encodings)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/1-CORE`
|
||||
* `ASL/1-CORE-INDEX`
|
||||
* `ASL/STORE-INDEX/1`
|
||||
* `ASL/LOG/1`
|
||||
* `ENC/ASL-CORE-INDEX/1`
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ASL/INDEX-ACCEL/1`
|
||||
* `TGK/1` — Trace Graph Kernel semantics
|
||||
* PEL draft specs (program DAG, execution receipts)
|
||||
* `ASL/FEDERATION/1` — core federation semantics
|
||||
* `ASL/FEDERATION-REPLAY/1` — cross-node deterministic replay
|
||||
* `ASL/DAP/1` — domain admission
|
||||
* `ASL/POLICY-HASH/1` — policy binding
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 0. Conventions
|
||||
|
||||
The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are
|
||||
to be interpreted as in RFC 2119.
|
||||
|
||||
ASL/SYSTEM/1 is an integration view. It does not define new encodings or
|
||||
storage formats; those remain in the underlying layer specs.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose & Scope
|
||||
|
||||
This document aligns the cross-cutting semantics of:
|
||||
|
||||
* ASL index and log behavior
|
||||
* PEL deterministic execution
|
||||
* TGK edge semantics and traversal
|
||||
* Federation visibility and replay
|
||||
|
||||
It ensures a single, consistent model for determinism, snapshot bounds, and
|
||||
domain visibility.
|
||||
|
||||
Non-goals:
|
||||
|
||||
* New on-disk encodings
|
||||
* New execution operators
|
||||
* Domain policy or governance rules
|
||||
|
||||
---
|
||||
|
||||
## 2. Core Objects (Unified View)
|
||||
|
||||
* **Artifact**: immutable byte value (ASL/1-CORE).
|
||||
* **PER**: PEL Execution Receipt stored as an artifact.
|
||||
* **TGK Edge**: immutable edge record linking artifacts and/or PERs.
|
||||
* **Snapshot + Log Prefix**: boundary for deterministic visibility and replay.
|
||||
* **Domain Visibility**: internal vs published visibility embedded in index
|
||||
records (ENC-ASL-CORE-INDEX).
|
||||
|
||||
All of these objects are addressed and stored via the same index semantics.
|
||||
|
||||
---
|
||||
|
||||
## 3. Determinism & Snapshot Boundaries
|
||||
|
||||
For a fixed `(SnapshotID, LogPrefix)`:
|
||||
|
||||
* Index lookup is deterministic (ASL/1-CORE-INDEX).
|
||||
* TGK traversal is deterministic when bounded by the same snapshot/log prefix.
|
||||
* PEL execution is deterministic when its inputs are bounded by the same
|
||||
snapshot/log prefix.
|
||||
|
||||
PEL MUST read only snapshot-scoped artifacts and receipts. It MUST NOT depend
|
||||
on storage layout, block packing, or non-snapshot metadata.
|
||||
|
||||
PEL outputs (artifacts and PERs) become visible only through normal index
|
||||
admission and log ordering.
|
||||
|
||||
PEL MUST NOT depend on physical storage metadata. Apart from snapshot-scoped artifacts and receipts, it MAY read only:
|
||||
|
||||
* snapshot identity
|
||||
* execution configuration that is itself snapshot-scoped and immutable
|
||||
|
||||
---
|
||||
|
||||
## 4. One PEL Principle (Resolution)
|
||||
|
||||
There is exactly one PEL: a deterministic, snapshot-bound, authority-aware
|
||||
derivation language mapping artifacts to artifacts.
|
||||
|
||||
Distinctions such as "PEL-S" vs "PEL-P" are not separate languages. They are
|
||||
policy decisions about how outputs are treated:
|
||||
|
||||
* **Promotion** (truth vs view) is a domain policy decision.
|
||||
* **Publication** (internal vs published) is a visibility decision encoded in
|
||||
index metadata.
|
||||
* **Retention** (store, cache, discard, recompute) is a store policy decision.
|
||||
|
||||
Implementations MUST NOT fork PEL semantics into separate dialects. Any
|
||||
classification of outputs MUST be expressed via policy, publication flags, or
|
||||
receipt annotations, not by changing the execution language.
|
||||
|
||||
---
|
||||
|
||||
## 5. PEL, PERs, and TGK Integration
|
||||
|
||||
* PEL programs consume artifacts and/or PERs.
|
||||
* PEL execution produces artifacts and a PER describing the run.
|
||||
* TGK edges may reference artifacts, PERs, or projections derived from them.
|
||||
|
||||
---
|
||||
|
||||
### 5.1 PERs and Snapshot State (Clarification)
|
||||
|
||||
PERs are artifacts that bind deterministic execution to a specific snapshot
|
||||
and log prefix. They do not introduce a separate storage layer:
|
||||
|
||||
* The sequential log and snapshot define CURRENT.
|
||||
* A PER records that execution observed CURRENT at a specific log prefix.
|
||||
* Replay uses the same snapshot + log prefix to reconstruct inputs.
|
||||
* PERs are artifacts and MAY be used as inputs, but programs embedded in
|
||||
receipts MUST NOT be executed implicitly.
|
||||
|
||||
TGK remains a semantic graph layer; it does not alter PEL determinism and does
|
||||
not bypass the index.
|
||||
|
||||
---
|
||||
|
||||
## 6. Federation Alignment
|
||||
|
||||
Federation operates over the same immutable artifacts, PERs, and TGK edges.
|
||||
Cross-domain visibility is governed by index metadata:
|
||||
|
||||
* `domain_id` identifies the owning domain.
|
||||
* `visibility` marks internal vs published.
|
||||
* `cross_domain_source` preserves provenance for imported artifacts.
|
||||
|
||||
Deterministic replay across nodes MUST respect:
|
||||
|
||||
* Snapshot boundaries
|
||||
* Log order
|
||||
* Domain visibility rules
|
||||
|
||||
Federation does not change PEL semantics. It propagates artifacts and receipts
|
||||
that were already deterministically produced.
|
||||
|
||||
Admission and policy compatibility gate foreign state: only admitted domains and
|
||||
policy-compatible published state may be included in a federation view.
|
||||
|
||||
---
|
||||
|
||||
## 7. Index Alignment
|
||||
|
||||
The index is the shared substrate:
|
||||
|
||||
* Artifacts, PERs, and TGK edges are all indexed via the same lookup semantics.
|
||||
* Sharding, SIMD, and filters (ASL/INDEX-ACCEL/1) are advisory and MUST NOT
|
||||
change correctness.
|
||||
* Tombstones and shadowing remain the only visibility overrides.
|
||||
|
||||
---
|
||||
|
||||
## 8. Glossary and Terminology Alignment (Informative)
|
||||
|
||||
To prevent drift across layers, the following terms map as:
|
||||
|
||||
* **EdgeBody** (`TGK/1-CORE`) — logical edge content (`from[]`, `to[]`, `payload`, `type`).
|
||||
* **EdgeArtifact** (`TGK/1-CORE`) — ASL Artifact whose payload encodes an EdgeBody.
|
||||
* **EdgeRef** (`TGK/1-CORE`) — ASL Reference to an EdgeArtifact.
|
||||
* **TGK index record** (`TGK/1`, `ASL/1-CORE-INDEX`) — index entry that makes an EdgeRef visible under snapshot/log rules; contains no edge payload.
|
||||
* **TGK traversal result** (`TGK/1`) — snapshot/log-bounded set of visible edges (EdgeRefs) and/or node references derived from indexed EdgeArtifacts.
|
||||
|
||||
---
|
||||
|
||||
## 9. Summary
|
||||
|
||||
ASL/SYSTEM/1 provides a single, consistent view:
|
||||
|
||||
* One PEL, with policy-based output treatment
|
||||
* TGK and PEL both bounded by snapshot + log determinism
|
||||
* Federation mediated by index-level domain metadata
|
||||
* Index semantics remain the core substrate for all objects
|
||||
251
tier1/asl-tgk-execution-plan-1.md
Normal file
|
|
@ -0,0 +1,251 @@
|
|||
# ASL/TGK-EXEC-PLAN/1 -- Unified Execution Plan Semantics
|
||||
|
||||
Status: Draft
|
||||
Owner: Architecture
|
||||
Version: 0.1.0
|
||||
SoT: No
|
||||
Last Updated: 2025-01-17
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [execution, query, tgk, determinism]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/asl-tgk-execution-plan-1.md | Canonical: /amduat/tier1/asl-tgk-execution-plan-1.md -->
|
||||
|
||||
**Document ID:** `ASL/TGK-EXEC-PLAN/1`
|
||||
**Layer:** L2 -- Execution plan semantics (no encoding)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/1-CORE-INDEX`
|
||||
* `ASL/LOG/1`
|
||||
* `ASL/INDEX-ACCEL/1`
|
||||
* `TGK/1`
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ASL/SYSTEM/1`
|
||||
* `ENC/ASL-CORE-INDEX/1`
|
||||
* `ENC/ASL-TGK-EXEC-PLAN/1`
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 0. Conventions
|
||||
|
||||
The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.
|
||||
|
||||
ASL/TGK-EXEC-PLAN/1 defines execution plan semantics for querying artifacts and TGK edges. It does not define encoding, transport, or runtime scheduling.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
This document defines the operator model and determinism rules for executing queries over ASL artifacts and TGK edges using snapshot-bounded visibility.
|
||||
|
||||
---
|
||||
|
||||
## 2. Execution Plan Model (Normative)
|
||||
|
||||
An execution plan is a DAG of operators:
|
||||
|
||||
```
Plan = { nodes: [Op], edges: [(Op -> Op)] }
```
|
||||
|
||||
Each operator includes:
|
||||
|
||||
* `op_id`: unique identifier
|
||||
* `op_type`: operator type
|
||||
* `inputs`: upstream operator outputs
|
||||
* `snapshot`: `(SnapshotID, LogPrefix)`
|
||||
* `constraints`: canonical filters
|
||||
* `projections`: output fields
|
||||
* `traversal`: optional traversal parameters
|
||||
* `aggregation`: optional aggregation parameters
|
||||
|
||||
---
|
||||
|
||||
### 2.1 Query Abstraction (Informative)
|
||||
|
||||
A query can be represented as:
|
||||
|
||||
```
Q = {
  snapshot: S,
  constraints: C,
  projections: P,
  traversal: optional,
  aggregation: optional
}
```
|
||||
|
||||
Where:
|
||||
|
||||
* `constraints` describe canonical filters (artifact keys, type tags, edge types, roles, node IDs).
|
||||
* `projections` select output fields.
|
||||
* `traversal` declares TGK traversal depth and direction.
|
||||
* `aggregation` defines deterministic reduction operations.
|
||||
|
||||
---
|
||||
|
||||
## 3. Deterministic Ordering (Normative)
|
||||
|
||||
All operator outputs MUST be ordered by:
|
||||
|
||||
1. `logseq` ascending
|
||||
2. canonical key ascending (tie-breaker)
|
||||
|
||||
Parallel execution MUST preserve this order.
|
||||
|
||||
---
|
||||
|
||||
## 4. Visibility Rules (Normative)
|
||||
|
||||
Records are visible if and only if:
|
||||
|
||||
* `record.logseq <= snapshot.log_prefix`
|
||||
* The record is not shadowed by a later tombstone
|
||||
|
||||
Unknown record types MUST be skipped without breaking determinism.
|
||||
|
||||
---
|
||||
|
||||
## 5. Operator Types (Normative)
|
||||
|
||||
### 5.1 SegmentScan
|
||||
|
||||
* Inputs: sealed segments
|
||||
* Outputs: raw record references
|
||||
* Rules:
  * Only segments with `segment.logseq_min <= snapshot.log_prefix` are scanned.
  * Advisory filters MAY be applied but MUST NOT introduce false negatives.
  * Shard routing MAY be applied prior to scan if deterministic.
|
||||
|
||||
### 5.2 IndexFilter
|
||||
|
||||
* Inputs: record stream
|
||||
* Outputs: filtered record stream
|
||||
* Rules:
  * Applies canonical constraints (artifact key, type tag, TGK edge type, roles).
  * Filters MUST be exact; advisory filters are not sufficient.
|
||||
|
||||
### 5.3 TombstoneShadow
|
||||
|
||||
* Inputs: record stream + tombstone stream
|
||||
* Outputs: visible records only
|
||||
* Rules:
  * Later tombstones shadow earlier entries with the same canonical key.
|
||||
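The shadowing rule can be sketched as follows. The tuple shape `(logseq, canonical_key)` is an assumption for this example, as is the premise that both streams have already been cut to the snapshot's log prefix and that `logseq` values are non-negative.

```python
def tombstone_shadow(records, tombstones):
    """Drop records shadowed by a later tombstone with the same key.

    records, tombstones: iterables of (logseq, canonical_key).
    The relative order of the surviving records is preserved.
    """
    latest = {}  # canonical_key -> highest tombstone logseq seen
    for logseq, key in tombstones:
        if logseq > latest.get(key, -1):
            latest[key] = logseq
    # A record survives when no tombstone for its key comes after it.
    return [
        (logseq, key)
        for logseq, key in records
        if latest.get(key, -1) <= logseq
    ]
```

A record written after the most recent tombstone for its key stays visible, which matches shadowing in strict log order.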
|
||||
### 5.4 Merge
|
||||
|
||||
* Inputs: multiple ordered streams
|
||||
* Outputs: single ordered stream
|
||||
* Rules:
  * Order is `logseq` then canonical key.
  * Merge MUST be deterministic regardless of shard order.
|
||||
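A k-way merge with these properties can be sketched using the standard library. The record shape `(logseq, canonical_key, payload)` is an assumption for this example, and the sketch relies on each `(logseq, canonical_key)` pair being unique across streams; with duplicate pairs the output would depend on the order of the inputs.

```python
import heapq


def merge_streams(*streams):
    """Merge streams that are each already ordered by (logseq, canonical_key)."""
    # heapq.merge performs a lazy k-way merge over pre-sorted inputs.
    return list(heapq.merge(*streams, key=lambda r: (r[0], r[1])))
```

Because the sort key is total under the uniqueness assumption, swapping the order of the input streams does not change the output.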
|
||||
### 5.5 Projection
|
||||
|
||||
* Inputs: record stream
|
||||
* Outputs: projected fields
|
||||
* Rules:
  * Projection MUST preserve input order.
|
||||
|
||||
### 5.6 TGKTraversal
|
||||
|
||||
* Inputs: seed node set
|
||||
* Outputs: edge and/or node stream
|
||||
* Rules:
  * Expansion MUST respect snapshot bounds.
  * Traversal depth MUST be explicit.
  * Order MUST follow deterministic ordering rules.
|
||||
|
||||
### 5.7 Aggregation (Optional)
|
||||
|
||||
* Inputs: record stream
|
||||
* Outputs: aggregate results
|
||||
* Rules:
  * Aggregation MUST be deterministic given identical inputs and snapshot.
|
||||
|
||||
### 5.8 LimitOffset (Optional)
|
||||
|
||||
* Inputs: ordered record stream
|
||||
* Outputs: ordered slice
|
||||
* Rules:
  * Applies pagination or top-N selection.
  * MUST preserve deterministic order from upstream operators.
|
||||
|
||||
### 5.9 ShardDispatch (Optional)
|
||||
|
||||
* Inputs: shard-local streams
|
||||
* Outputs: ordered global stream
|
||||
* Rules:
  * Shard execution MAY be parallel.
  * Merge MUST preserve deterministic ordering by `logseq` then canonical key.
|
||||
|
||||
### 5.10 SIMDFilter (Optional)
|
||||
|
||||
* Inputs: record stream
|
||||
* Outputs: filtered record stream
|
||||
* Rules:
  * SIMD filters are advisory accelerators.
  * Canonical checks MUST still be applied before output.
|
||||
|
||||
---
|
||||
|
||||
## 6. Acceleration Constraints (Normative)
|
||||
|
||||
Acceleration mechanisms (filters, routing, SIMD) MUST be observationally invisible:
|
||||
|
||||
* False positives are permitted.
|
||||
* False negatives are forbidden.
|
||||
* Canonical checks MUST always be applied before returning results.
|
||||
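The constraint can be illustrated with an advisory pre-filter followed by the canonical check. The function and parameter names are hypothetical; the advisory predicate stands in for any accelerator (bloom filter, routing key, SIMD comparison) that may admit extra records but never rejects a true match.

```python
def accelerated_filter(records, wanted_key, advisory_maybe_match):
    """Filter (logseq, canonical_key) records for wanted_key.

    advisory_maybe_match(key) may return True spuriously (false positive)
    but must never return False for a key equal to wanted_key.
    """
    out = []
    for record in records:
        key = record[1]
        if not advisory_maybe_match(key):
            continue  # safe skip: the advisory filter has no false negatives
        if key == wanted_key:  # the canonical check always decides the result
            out.append(record)
    return out
```

Replacing the advisory predicate with one that always returns `True` yields the same output, which is what observational invisibility requires.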
|
||||
---
|
||||
|
||||
## 7. Plan Serialization (Optional)
|
||||
|
||||
Execution plans MAY be serialized for reuse or deterministic replay.
|
||||
|
||||
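```c
#include <stdint.h>

struct operator_def;   /* operator parameters; layout defined elsewhere */
struct operator_edge;  /* DAG edge between two operators */

struct exec_plan {
    uint32_t plan_version;
    uint32_t operator_count;
    struct operator_def  *operators;  /* operator_count entries */
    struct operator_edge *edges;      /* DAG edges; count not specified here */
};
```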
|
||||
|
||||
Serialization MUST preserve operator parameters, snapshot bounds, and DAG edges.
|
||||
|
||||
---
|
||||
|
||||
## 8. GC Safety (Informative)
|
||||
|
||||
Records and edges MUST NOT be removed if they appear in a snapshot or are
|
||||
reachable via traversal at that snapshot.
|
||||
|
||||
---
|
||||
|
||||
## 9. Non-Goals
|
||||
|
||||
ASL/TGK-EXEC-PLAN/1 does not define:
|
||||
|
||||
* Runtime scheduling or parallelization strategy
|
||||
* Encoding of operator plans
|
||||
* Query languages or APIs
|
||||
* Operator cost models
|
||||
944
tier1/dds.md
Normal file
|
|
@ -0,0 +1,944 @@
|
|||
# AMDUAT-DDS — Detailed Design Specification
|
||||
|
||||
Status: Approved
|
||||
Owner: Niklas Rydberg
|
||||
Version: 0.5.0
|
||||
SoT: Yes
|
||||
Last Updated: 2025-11-11
|
||||
Linked Phase Pack: PH01
|
||||
Tags: [design, cas, composition]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/dds.md | Canonical: /amduat/tier1/dds.md -->
|
||||
|
||||
**Document ID:** `AMDUAT-DDS`
|
||||
**Layer:** L0.1 — Byte-level design (CAS + deterministic envelopes)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `AMDUAT-SRS` — behavioural requirements
|
||||
* ADR-001 — CAS identity
|
||||
* ADR-003 — canonical encoding discipline
|
||||
* ADR-006 — deterministic error semantics
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* ADR-015 — rejection governance
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
> **Note (scope):**
|
||||
> This DDS covers **Phase 01 (Kheper CAS)** byte semantics and, where necessary, the canonical **binary encodings** for higher deterministic layers (FCS/1, PCB1, FER/1, FCT/1).
|
||||
> **Behavioural semantics live in SRS.** This document governs the **bytes**.
|
||||
|
||||
---
|
||||
|
||||
## 1. Content ID (CID)
|
||||
|
||||
**Rule.**
|
||||
|
||||
```
CID = algo_id || H("CAS:OBJ\0" || payload_bytes)
```
|
||||
|
||||
* `algo_id`: 1-byte or VARINT identifier (default `0x01` = SHA-256).
|
||||
* `H`: selected hash over **exact payload bytes**.
|
||||
* Domain separation prefix must be present verbatim: `"CAS:OBJ\0"`.
|
||||
|
||||
**Properties.**
|
||||
|
||||
* Deterministic: identical payload → identical CID.
|
||||
* Implementation-independent (SRS NFR-001).
|
||||
* Crypto-agile via `algo_id`.
|
||||
|
||||
**Errors.**
|
||||
|
||||
* `ERR_ALGO_UNSUPPORTED` when `algo_id` not registered.
|
||||
* Empty payload is allowed and canonical.
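
The rule above can be sketched with the standard library. This is an illustration under stated assumptions: only `0x01` (SHA-256) is implemented, `algo_id` is emitted as a single byte (for `0x01` the one-byte and VARINT forms coincide), and the names `compute_cid` and `AlgoUnsupported` are hypothetical.

```python
import hashlib

DOMAIN_PREFIX = b"CAS:OBJ\x00"   # domain separation prefix, verbatim
ALGO_SHA256 = 0x01


class AlgoUnsupported(Exception):
    """Stands in for ERR_ALGO_UNSUPPORTED."""


def compute_cid(payload: bytes, algo_id: int = ALGO_SHA256) -> bytes:
    if algo_id != ALGO_SHA256:
        raise AlgoUnsupported("ERR_ALGO_UNSUPPORTED")
    # Hash the exact payload bytes behind the domain prefix.
    digest = hashlib.sha256(DOMAIN_PREFIX + payload).digest()
    return bytes([algo_id]) + digest


empty_cid = compute_cid(b"")          # empty payload is allowed
hello_cid = compute_cid(b"hello")
```

Under these assumptions a CID is 33 bytes: one `algo_id` byte followed by the 32-byte digest.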
|
||||
|
||||
---
|
||||
|
||||
## 2. Canonical Object Record (COR/1)
|
||||
|
||||
COR/1 is the **only** canonical import/export envelope for CAS objects. Exact bytes are consensus; on-disk layout is not.
|
||||
|
||||
### 2.1 Envelope Layout (exact bytes)
|
||||
|
||||
```
Header (7 bytes total):
  MAGIC   : 4 bytes = "CAS1" (0x43 0x41 0x53 0x31)
  VERSION : 1 byte = 0x01
  FLAGS   : 1 byte = 0x00 (reserved; MUST be 0)
  RSV     : 1 byte = 0x00 (reserved; MUST be 0)

Body (strict TLV order; no padding):
  0x10 algo_id (VARINT)
  0x11 size (VARINT)
  0x12 payload (BYTES; length == size)
```
|
||||
|
||||
**Notes**
|
||||
|
||||
* Fixed header invariants; any mismatch is rejection.
|
||||
* No alignment/padding anywhere.
|
||||
|
||||
### 2.2 Tag Semantics
|
||||
|
||||
| Tag | Name | Type | Card. | Notes |
| ---: | ------- | ------ | ----: | ----------------------------------------------- |
| 0x10 | algo_id | VARINT | 1 | MUST equal algorithm used for the object’s CID. |
| 0x11 | size | VARINT | 1 | **Minimal VARINT**; MUST equal payload length. |
| 0x12 | payload | BYTES | 1 | Raw bytes; never normalized. |
|
||||
|
||||
### 2.3 Canonicalization Rules (strict)
|
||||
|
||||
1. **Order & uniqueness:** `0x10`, `0x11`, `0x12`, each exactly once.
|
||||
2. **VARINTS:** Unsigned LEB128 **minimal** form only.
|
||||
3. **BYTES:** `VARINT(len) || len bytes`, with `len == size`.
|
||||
4. **No extras:** No unknown tags, no trailing bytes.
|
||||
5. **Header invariants:** `MAGIC="CAS1"`, `VERSION=0x01`, `FLAGS=RSV=0x00`.
|
||||
6. **Policy domain:** `size ≤ max_object_size` when enforced (ICD/1 §3).
|
||||
7. **Raw byte semantics** (SRS FR-010).
|
||||
|
||||
### 2.4 Decoder Validation Algorithm (normative)
|
||||
|
||||
1. Validate header ⇒ else `ERR_COR_HEADER_INVALID`.
|
||||
2. Read `0x10` minimal VARINT ⇒ else `ERR_COR_TAG_ORDER` / `ERR_VARINT_NON_MINIMAL`.
|
||||
3. Read `0x11` minimal VARINT ⇒ same error rules.
|
||||
4. Read `0x12` BYTES (length minimal VARINT) ⇒ else `ERR_VARINT_NON_MINIMAL`.
|
||||
5. Enforce `size == len(payload)` ⇒ `ERR_COR_LENGTH_MISMATCH` on failure.
|
||||
6. Ensure **no trailing bytes** ⇒ `ERR_TRAILING_BYTES`.
|
||||
7. Recompute CID and compare ⇒ mismatch `ERR_CORRUPT_OBJECT`.
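
The seven steps can be sketched as a single-pass decoder. This is a hedged illustration, not a reference implementation: it assumes single-byte tags, supports only `algo_id = 0x01`, and maps truncated input to `ERR_COR_LENGTH_MISMATCH` and a missing tag to `ERR_COR_TAG_ORDER`, neither of which this section specifies. The names `decode_cor1`, `CorError`, and `error_code` are hypothetical.

```python
import hashlib
from typing import Optional, Set, Tuple

HEADER = b"CAS1\x01\x00\x00"          # MAGIC, VERSION, FLAGS, RSV
TAGS = (0x10, 0x11, 0x12)


class CorError(Exception):
    """Message is one of the error codes named in this section."""


def _varint(buf: bytes, pos: int) -> Tuple[int, int]:
    value, shift, start = 0, 0, pos
    while True:
        if pos >= len(buf):
            raise CorError("ERR_COR_LENGTH_MISMATCH")   # truncation: assumed code
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    if pos - start > 1 and byte == 0x00:                # padded with a zero group
        raise CorError("ERR_VARINT_NON_MINIMAL")
    return value, pos


def _expect_tag(buf: bytes, pos: int, want: int, seen: Set[int]) -> int:
    if pos >= len(buf):
        raise CorError("ERR_COR_TAG_ORDER")             # missing tag: assumed code
    got = buf[pos]
    if got != want:
        if got in seen:
            raise CorError("ERR_COR_DUPLICATE_TAG")
        raise CorError("ERR_COR_TAG_ORDER" if got in TAGS else "ERR_COR_UNKNOWN_TAG")
    seen.add(got)
    return pos + 1


def decode_cor1(buf: bytes, expected_cid: Optional[bytes] = None) -> Tuple[int, bytes]:
    if buf[:7] != HEADER:                               # step 1
        raise CorError("ERR_COR_HEADER_INVALID")
    seen: Set[int] = set()
    pos = _expect_tag(buf, 7, 0x10, seen)               # step 2
    algo_id, pos = _varint(buf, pos)
    pos = _expect_tag(buf, pos, 0x11, seen)             # step 3
    size, pos = _varint(buf, pos)
    pos = _expect_tag(buf, pos, 0x12, seen)             # step 4
    length, pos = _varint(buf, pos)
    payload = buf[pos:pos + length]
    if len(payload) != length or length != size:        # step 5
        raise CorError("ERR_COR_LENGTH_MISMATCH")
    if pos + length != len(buf):                        # step 6
        raise CorError("ERR_TRAILING_BYTES")
    if expected_cid is not None:                        # step 7
        if algo_id != 0x01:
            raise CorError("ERR_ALGO_UNSUPPORTED")
        cid = bytes([algo_id]) + hashlib.sha256(b"CAS:OBJ\x00" + payload).digest()
        if cid != expected_cid:
            raise CorError("ERR_CORRUPT_OBJECT")
    return algo_id, payload


def error_code(buf: bytes) -> Optional[str]:
    """Return the rejection code for buf, or None when it decodes cleanly."""
    try:
        decode_cor1(buf)
    except CorError as exc:
        return str(exc)
    return None


envelope = HEADER + b"\x10\x01" + b"\x11\x02" + b"\x12\x02hi"
decoded = decode_cor1(envelope)
```

Checking order matters for determinism: two decoders that test the same conditions in a different order could report different codes for the same malformed input.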
|
||||
|
||||
### 2.5 Consistency with CID (normative)
|
||||
|
||||
* **Export:** set `algo_id` to CID algorithm.
|
||||
* **Import:** verify `algo_id` and hash component against expected CID.
|
||||
* Mismatch ⇒ `ERR_ALGO_MISMATCH` / `ERR_CORRUPT_OBJECT`.
|
||||
|
||||
### 2.6 Round-Trip Identity
|
||||
|
||||
`import(COR/1) → export(CID)` MUST produce **byte-identical** envelope (SRS FR-005). Re-encoding is forbidden.
|
||||
|
||||
### 2.7 Rejection Matrix (normative)
|
||||
|
||||
| Violation | Example | Error |
| ------------------ | -------------------------------- | ------------------------- |
| Bad header | Wrong MAGIC/VERSION/FLAGS/RSV | `ERR_COR_HEADER_INVALID` |
| Unknown/extra tag | Any tag not 0x10/0x11/0x12 | `ERR_COR_UNKNOWN_TAG` |
| Out-of-order | `0x11` before `0x10` | `ERR_COR_TAG_ORDER` |
| Duplicate tag | Two `0x10` entries | `ERR_COR_DUPLICATE_TAG` |
| Non-minimal VARINT | Over-long algo/size/bytes length | `ERR_VARINT_NON_MINIMAL` |
| Length mismatch | `size != len(payload)` | `ERR_COR_LENGTH_MISMATCH` |
| Trailing bytes | Any bytes after payload | `ERR_TRAILING_BYTES` |
| Algo mismatch | `algo_id` conflicts with CID | `ERR_ALGO_MISMATCH` |
| Hash mismatch | Recomputed hash ≠ expected | `ERR_CORRUPT_OBJECT` |
|
||||
|
||||
---
|
||||
|
||||
## 3. Instance Descriptor (ICD/1)
|
||||
|
||||
ICD/1 publishes canonical instance configuration; its bytes are consensus.
|
||||
|
||||
### 3.1 Envelope
|
||||
|
||||
```
Header:
  MAGIC   : "ICD1"
  VERSION : 0x01

TLV (strict order; minimal VARINTs; no duplicates):
  0x20 algo_default (VARINT)
  0x21 max_object_size (VARINT)
  0x22 cor_version (VARINT) # 0x01 => COR/1 v1
  0x23 gc_policy_id (VARINT; 0 if none)
  0x24 impl_id (BYTES; optional build/impl descriptor CID)
```
|
||||
|
||||
### 3.2 Derived Identity
|
||||
|
||||
```
instance_id = SHA-256("CAS:ICD\0" || bytes(ICD/1))
```
|
||||
|
||||
**Rules:** Ordering/minimal VARINTs mirror COR/1. Exporters preserve canonical bytes; `instance_id` is stable.
|
||||
|
||||
---
|
||||
|
||||
## 4. Encodings
|
||||
|
||||
* **VARINT (unsigned LEB128)** — minimal form only; else `ERR_VARINT_NON_MINIMAL`.
|
||||
* **BYTES** — `VARINT(length) || length bytes`.
|
||||
* **Fixed-width integers** — big-endian if present.
|
||||
* **No padding/alignment** in canonical encodings.
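
A minimal sketch of the VARINT rule, assuming nothing beyond unsigned LEB128 itself: an encoding is non-minimal exactly when it has more than one byte and its final byte is `0x00`. The function and exception names are hypothetical.

```python
from typing import Tuple


class VarintNonMinimal(ValueError):
    """Stands in for ERR_VARINT_NON_MINIMAL."""


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as minimal unsigned LEB128."""
    if n < 0:
        raise ValueError("VARINT is unsigned")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more groups follow
        else:
            out.append(byte)          # final group, continuation bit clear
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one VARINT at pos; return (value, next_pos)."""
    value, shift, start = 0, 0, pos
    while True:
        if pos >= len(buf):
            raise ValueError("truncated VARINT")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    if pos - start > 1 and byte == 0x00:
        raise VarintNonMinimal("ERR_VARINT_NON_MINIMAL")
    return value, pos


def encode_bytes(data: bytes) -> bytes:
    """BYTES: VARINT(length) followed by the raw bytes."""
    return encode_varint(len(data)) + data
```

For example, `300` encodes to the two bytes `0xAC 0x02`, while `0x80 0x00` is rejected because it is a padded form of `0`.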
|
||||
|
||||
---
|
||||
|
||||
## 5. Algorithm Registry
|
||||
|
||||
**Default**
|
||||
|
||||
* `0x01` → SHA-256
|
||||
|
||||
**Reserved**
|
||||
|
||||
* `0x02` → SHA-512/256
|
||||
* `0x03` → BLAKE3
|
||||
|
||||
**Policy**
|
||||
|
||||
* New entries require ADR + test vectors. Backward compatible by design.
|
||||
|
||||
---
|
||||
|
||||
## 6. Filesystem Considerations (Informative)
|
||||
|
||||
```
cas/
├─ sha256/
│  ├─ aa/.. # fan-out by CID prefix (implementation detail)
│  └─ ff/..
└─ amduat/
   └─ <instance-id>/
      ├─ amduatcas
      ├─ sha256/.. # private runtime state; never a put() target
      ├─ interface/
      │  └─ libamduatcas.current
      ├─ HEAD
      └─ meta/
```
|
||||
|
||||
**Rule:** Public CAS API acts only on `cas/sha256/`. The per-instance subtree is private and MUST NOT receive `put()` writes.
|
||||
|
||||
---
|
||||
|
||||
## 7. Error Conditions & Higher-Layer Layouts (Normative)
|
||||
|
||||
### 7.1 COR/1 & ICD/1 Enforcement (codes)
|
||||
|
||||
* `ERR_COR_HEADER_INVALID`, `ERR_COR_UNKNOWN_TAG`, `ERR_COR_TAG_ORDER`, `ERR_COR_DUPLICATE_TAG`,
|
||||
`ERR_COR_LENGTH_MISMATCH`, `ERR_VARINT_NON_MINIMAL`, `ERR_ALGO_UNSUPPORTED`,
|
||||
`ERR_ALGO_MISMATCH`, `ERR_TRAILING_BYTES`, `ERR_CORRUPT_OBJECT`.
|
||||
|
||||
---
|
||||
|
||||
### 7.2 FCS/1 Descriptor Layout — v1-min (Normative)
|
||||
|
||||
> **Design principle:** *FCS/1 describes the deterministic execution recipe only.*
|
||||
> Intent, roles, scope, authority, and registry policy are **not** encoded in FCS; they are captured at **certification time** in FCT/1.
|
||||
|
||||
Header: `MAGIC="FCS1" VERSION=0x01 FLAGS=RSV=0x00`
|
||||
|
||||
| Tag | Field | Type | Card. | Notes |
| ---: | ----------------- | ------ | ----: | ------------------------------------------ |
| 0x30 | `function_ptr` | CID | 1 | FPS/1 primitive or nested FCS/1 descriptor |
| 0x31 | `parameter_block` | CID | 1 | CID of PCB1 parameter block |
| 0x32 | `arity` | VARINT | 1 | Expected parameter slots |
|
||||
|
||||
**Validation rules**
|
||||
|
||||
1. Strict TLV order; duplicates/out-of-order → `ERR_FCS_TAG_ORDER`.
|
||||
2. `parameter_block` MUST be valid PCB1 → `ERR_FCS_PARAMETER_FORMAT`.
|
||||
3. `arity` MUST match slot count → `ERR_PCB_ARITY_MISMATCH`.
|
||||
4. Descriptor graph MUST be acyclic → `ERR_FCS_CYCLE_DETECTED`.
|
||||
5. **Any unknown or legacy governance tag** (`registry_policy 0x33`, `intent_vector 0x34`, `provenance_edge 0x35`, `notes 0x36`, or unregistered fields) → `ERR_FCS_UNKNOWN_TAG`. Such tags MUST never be tolerated in canonical streams.
|
||||
|
||||
---
|
||||
|
||||
### 7.3 PCB1 Parameter Blocks (Normative)
|
||||
|
||||
PCB1 payloads are COR/1 envelopes with header `MAGIC="PCB1"`, `VERSION=0x01`, `FLAGS=RSV=0x00`.
|
||||
|
||||
| Tag | Field | Type | Notes |
| ---: | --------------- | ----- | ----------------------------------------------------- |
| 0x50 | `slot_manifest` | BCF/1 | Canonical slot descriptors `{index,name,type,digest}` |
| 0x51 | `slot_data` | BYTES | Packed slot bytes respecting manifest order |
|
||||
|
||||
**Rules:**
|
||||
Slots appear in ascending `index`. Numeric slots default to `0` when omitted.
|
||||
Digest mismatches ⇒ `ERR_PCB_DIGEST_MISMATCH`. Non-deterministic ordering ⇒ `ERR_PCB_MANIFEST_ORDER`.
|
||||
Arity mismatch vs FCS/1 ⇒ `ERR_PCB_ARITY_MISMATCH`.
|
||||
|
||||
---
|
||||
|
||||
### 7.4 FER/1 Receipt Layout (Normative)
|
||||
|
||||
FER/1 receipts reuse COR/1 framing with header `"FER1"` and are byte-deterministic.
|
||||
|
||||
**Strict TLV order (no padding):**
|
||||
|
||||
| Tag | Field | Type | Cardinality | Notes |
| ---- | --------------------- | ----------- | ----------- | ----- |
| 0x40 | `function_cid` | CID | 1 | Evaluated FCS/1 descriptor (must decode to v1-min). |
| 0x41 | `input_manifest` | CID | 1 | MUST decode to GS/1 BCF/1 set list (deduped, byte-lexicographic). |
| 0x42 | `environment` | CID | 1 | ICD/1 snapshot or PH03 environment capsule. |
| 0x43 | `evaluator_id` | BYTES | 1 | Stable evaluator identity (DID/descriptor CID). |
| 0x44 | `executor_set` | BCF/1 map | 1 | Map of executors → impl metadata (language/version/build); keys sorted. |
| 0x4F | `executor_fingerprint`| CID | 0–1 | SBOM/attestation CID feeding `run_id`; REQUIRED when `run_id` present. |
| 0x45 | `output_cid` | CID | 1 | Canonical output CID (single-output invariant). |
| 0x46 | `parity_vector` | BCF/1 list | 1 | Sorted by executor key; each entry carries `{executor, output, digest, sbom_cid}`. |
| 0x47 | `logs` | `LIST<BCF/1>` | 0–1 | Typed log capsules (`kind`, `cid`, `sha256`). |
| 0x51 | `determinism_level` | ENUM | 0–1 | `"D1_bit_exact"` (default) or `"D2_numeric_stable"`. |
| 0x50 | `rng_seed` | BYTES | 0–1 | 0–32 byte seed REQUIRED when determinism ≠ D1. |
| 0x52 | `limits` | BCF/1 map | 0–1 | Resource envelope (`cpu_ms`, `wall_ms`, `max_rss_kib`, `io_reads`, `io_writes`). |
| 0x48 | `started_at` | UINT64 | 1 | Epoch seconds (FR-020 start bound). |
| 0x49 | `completed_at` | UINT64 | 1 | Epoch seconds ≥ `started_at`. |
| 0x53 | `parent` | CID | 0–1 | Optional lineage pointer for follow-up runs. |
| 0x4A | `context` | BCF/1 map | 0–1 | Optional scheduling hooks (WT/1 ticket, TA/1 branch tip, notes ref). |
| 0x4B | `witnesses` | BCF/1 list | 0–1 | Optional observer descriptors / co-signers. |
| 0x4E | `run_id` | BYTES[32] | 0–1 | Deterministic dedup anchor (`H("AMDUAT:RUN\0" \|\| function \|\| manifest \|\| env \|\| fingerprint)`). |
| 0x4C | `signature` | BCF/1 map | 1 | Primary Ed25519 signature over `H("AMDUAT:FER\0" \|\| canonical bytes)`. |
| 0x4D | `signature_ext` | BCF/1 list | 0–1 | Reserved slot for multi-sig / threshold proofs (future). |
|
||||
|
||||
**Validation:**
|
||||
|
||||
1. TLV order is strict: out-of-order tags ⇒ `ERR_FER_TAG_ORDER`; unknown tags ⇒ `ERR_FER_UNKNOWN_TAG`.
|
||||
2. `function_cid` must decode to valid FCS/1 ⇒ `ERR_FER_FUNCTION_MISMATCH` otherwise.
|
||||
3. `input_manifest` MUST decode to GS/1 set list (deduped + byte-lexicographic). Violations ⇒ `ERR_FER_INPUT_MANIFEST_SHAPE`.
|
||||
4. `executor_set` keys MUST be byte-lexicographic and align with `parity_vector` entries. Ordering mismatches ⇒ `ERR_IMPL_PARITY_ORDER`; missing executors or divergent outputs ⇒ `ERR_IMPL_PARITY`.
|
||||
5. Each parity entry MUST declare `sbom_cid` referencing the executor’s mini-SBOM CID.
|
||||
6. `determinism_level` defaults to `D1_bit_exact`; when set to any other value a 0–32 byte `rng_seed` is REQUIRED ⇒ `ERR_FER_RNG_REQUIRED`.
|
||||
7. `limits` (when present) MUST supply non-negative integers for `cpu_ms`, `wall_ms`, `max_rss_kib`, `io_reads`, `io_writes`.
|
||||
8. `logs` (when present) MUST contain objects with `kind ∈ {stderr, stdout, metrics, trace}`, `cid`, and `sha256` (both 32-byte hex strings).
|
||||
9. `run_id` (when present) MUST equal `H("AMDUAT:RUN\0" || function_cid || manifest_cid || environment_cid || executor_fingerprint)`; missing fingerprint ⇒ `ERR_FER_UNKNOWN_TAG`.
|
||||
10. `completed_at < started_at` ⇒ `ERR_FER_TIMESTAMP` (FR-020 envelope enforcement).
|
||||
11. Signatures MUST verify against `H("AMDUAT:FER\0" || canonical bytes)` ⇒ failure ⇒ `ERR_FER_SIGNATURE`.
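
Rules 6, 9, and 10 can be sketched over an already-decoded receipt. This is an illustration under loud assumptions: `H` is taken to be SHA-256, the CIDs are concatenated as raw bytes, the receipt is modelled as a plain dictionary keyed by field name, and the names `compute_run_id` and `check_receipt` are hypothetical. A `run_id` that does not match has no error code named in this section, so the sketch returns a descriptive string instead of inventing one.

```python
import hashlib
from typing import Optional

DEFAULT_LEVEL = "D1_bit_exact"


def compute_run_id(function_cid: bytes, manifest_cid: bytes,
                   environment_cid: bytes, fingerprint: bytes) -> bytes:
    h = hashlib.sha256()              # assumption: H is SHA-256
    h.update(b"AMDUAT:RUN\x00")       # domain separator
    for part in (function_cid, manifest_cid, environment_cid, fingerprint):
        h.update(part)                # assumption: raw-byte concatenation
    return h.digest()


def check_receipt(fer: dict) -> Optional[str]:
    """Return the first violated code, or None when these three rules pass."""
    # Rule 6: any level other than D1 requires an rng_seed.
    if fer.get("determinism_level", DEFAULT_LEVEL) != DEFAULT_LEVEL:
        if "rng_seed" not in fer:
            return "ERR_FER_RNG_REQUIRED"
    # Rule 9: run_id requires the fingerprint and must match the recomputation.
    if "run_id" in fer:
        if "executor_fingerprint" not in fer:
            return "ERR_FER_UNKNOWN_TAG"
        expected = compute_run_id(fer["function_cid"], fer["input_manifest"],
                                  fer["environment"], fer["executor_fingerprint"])
        if fer["run_id"] != expected:
            return "run_id mismatch (no error code is named for this case)"
    # Rule 10: completion must not precede start.
    if fer["completed_at"] < fer["started_at"]:
        return "ERR_FER_TIMESTAMP"
    return None


receipt = {
    "function_cid": bytes([1]) * 32,
    "input_manifest": bytes([2]) * 32,
    "environment": bytes([3]) * 32,
    "executor_fingerprint": bytes([4]) * 32,
    "started_at": 100,
    "completed_at": 105,
}
receipt["run_id"] = compute_run_id(receipt["function_cid"], receipt["input_manifest"],
                                   receipt["environment"], receipt["executor_fingerprint"])
```

Signature verification and the parity checks are left out because they need an Ed25519 library and a BCF/1 decoder, which are outside this sketch.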
|
||||
|
||||
> **Manifest note:** `input_manifest` bytes MUST be the GS/1 canonical list; ingestion MUST reject producer-specific ordering.
|
||||
> **Log capsule note:** `logs` entries bind `kind`, `cid`, and `sha256` together to avoid stdout/stderr hash confusion.
|
||||
> **Dedup note:** `run_id` enables idempotent FER ingestion across registries while keeping the FER CID authoritative.
|
||||
> **Provenance note:** FER/1 remains the exclusive home for run-time provenance and parity outcomes; governance stays in FCT/1.
|
||||
|
||||
> **Graph note:** Ingestors emit `realizes`, `produced_by`, `consumed_by`, and (optionally) `fulfills` edges based solely on FER content.
|
||||
|
||||
---
|
||||
|
||||
### 7.5 FCT/1 Transaction Envelope (Normative)
|
||||
|
||||
> **Design principle:** *FCT/1 is the canonical home for **intent**, **domain scope**, **roles/authority**, and **policy snapshot** captured at certification/publication time.*
|
||||
|
||||
FCT/1 serializes as ADR-003 BCF/1 map with canonical keys:
|
||||
|
||||
| Key | Type | Notes |
| --------------------- | ----------- | ------------------------------------------------------- |
| `fct.version` | UINT8 | MUST be `1` |
| `fct.registry_policy` | UINT8 | Publication policy snapshot (0=Open,1=Curated,2=Locked) |
| `fct.function` | CID | Certified FCS/1 descriptor |
| `fct.receipts` | `LIST<CID>` | One or more FER/1 CIDs |
| `fct.authority_role` | ENUM | ADR-010C role |
| `fct.domain_scope` | ENUM | ADR-010B scope |
| `fct.intent` | `SET<ENUM>` | ADR-010 intents |
| `fct.constraints` | `LIST<BCF/1>` | Optional constraint set |
| `fct.attestations` | `LIST<BYTES>` | Required when policy ≠ Open |
| `fct.timestamp` | UINT64 | Epoch seconds |
| `fct.publication` | CID | Optional ADR-007 digest |
|
||||
|
||||
**Validation:**
|
||||
|
||||
1. All receipts reference the same `function_cid` ⇒ else `ERR_FCT_RECEIPT_MISMATCH`.
|
||||
2. If `registry_policy ≠ 0` then `attestations` **required** ⇒ `ERR_FCT_ATTESTATION_REQUIRED`.
|
||||
3. All signatures/attestations verify ⇒ `ERR_FCT_SIGNATURE` on failure.
|
||||
4. Receipt timestamps must be monotonic ⇒ `ERR_FCT_TIMESTAMP`.
|
||||
|
||||
---
|
||||
|
||||
### 7.6 FPD/1 Publication Digest (Normative)
|
||||
|
||||
> **Design principle:** *Federation publishes exactly one deterministic digest per event (ADR-007, SRS FR-022).*
|
||||
|
||||
FPD/1 serializes as an ADR-003 BCF/1 map with canonical keys:
|
||||
|
||||
| Key | Type | Notes |
| --------------- | ---------- | --------------------------------------------------------------------- |
| `fpd.version` | UINT8 | MUST be `1`. |
| `fpd.members` | `LIST<CID>` | Deterministic, byte-lexicographic list of member artefact CIDs. |
| `fpd.parent` | CID (opt) | Previous FPD/1 digest for the domain publication chain (or `null`). |
| `fpd.timestamp` | UINT64 | Epoch seconds aligned with `fct.timestamp` monotonic ordering. |
| `fpd.digest` | CID | Canonical digest over `{FCT/1 bytes, FER/1 receipts, governance edges}`. |
|
||||
|
||||
**Construction:**
|
||||
|
||||
1. Normalize and sign the FCT/1 record (per §7.5) writing canonical bytes to the payload area (PA).
|
||||
2. Collect referenced FER/1 receipts and governance edges (`certifies`, `attests`, `publishes`) as canonical byte arrays.
|
||||
3. Build `fpd.members` as the byte-lexicographic list of CIDs for the certified FCT/1 record, every FER/1 receipt, and the edge batch capsule.
|
||||
4. Hash the concatenated canonical payloads using the federation digest algorithm (default `CIDv1/BCF`). Persist the resulting bytes and record the CID in `fpd.digest`.
|
||||
5. If a prior publication exists, set `fpd.parent` to the previous digest CID; otherwise omit.
|
||||
6. Emit the FPD/1 map, persist alongside the FCT/1 payload under `/logs/ph03/evidence/fct/`, and update `fct.publication` with the FPD/1 CID.
|
||||
|
||||
**Validation:**
|
||||
|
||||
* `fpd.members` MUST include exactly one FCT/1 CID and the full set of FER/1 receipt CIDs referenced by that transaction.
|
||||
* Recomputing the digest from the persisted canonical payloads MUST yield `fpd.digest`; mismatches ⇒ `ERR_FPD_DIGEST` (registered under ADR-006).
|
||||
* `fpd.timestamp` MUST be ≥ the largest FER/1 `completed_at` and ≥ the prior `fpd.timestamp` when `fpd.parent` is present ⇒ violations raise `ERR_FPD_TIMESTAMP`.
|
||||
* Graph emitters MUST log governance edges via `lib/g1-emitter/` using the canonical digests referenced above.
|
||||
|
||||
> **Graph note:** Publication surfaces emit `publishes(fct,fpd)` edges binding certification state to digest lineage for PH04 FLS/1 integration.
|
||||
|
||||
### 7.7 Error Surface Registration (consolidated)
|
||||
|
||||
All FCS/1, PCB1, FER/1, and FCT/1 errors map to ADR-006.
|
||||
Additions since v0.3.0:
|
||||
|
||||
| Code | Meaning |
| --------------------- | -------------------------------------------------------------------------------------- |
| `ERR_FCS_UNKNOWN_TAG` | Descriptor contained a tag outside the v1-min set (`0x30-0x32`). Rejected per ADR-006. |
| `ERR_EXEC_TIMEOUT` | Executor exceeded deterministic time envelope (Maat’s Balance). |
| `ERR_IMPL_PARITY` | Executor outputs/parity metadata diverged (missing executor, mismatched `output_cid`). |
| `ERR_IMPL_PARITY_ORDER` | Parity vector ordering did not match the canonical executor ordering. |
| `ERR_FER_UNKNOWN_TAG` | FER/1 payload contained an unknown tag or cardinality violation. |
| `ERR_FER_INPUT_MANIFEST_SHAPE` | `input_manifest` failed GS/1 set decoding (not deduped or unsorted). |
| `ERR_FER_RNG_REQUIRED` | `determinism_level` demanded an `rng_seed` but none was provided. |
| `ERR_FPD_DIGEST` | Recomputed federation digest did not match `fpd.digest` (non-deterministic publication). |
| `ERR_FPD_TIMESTAMP` | Publication timestamp regressed relative to receipts or parent digest. |
| `ERR_FPD_PARENT_REQUIRED` | Policy-enforced lineage expected `fpd.parent` but none was provided. |
| `ERR_FPD_MEMBER_DUP` | Duplicate member CID detected in the canonical set ordering. |
| `ERR_WT_UNKNOWN_KEY` | WT/1 map contained a key outside the v1-min schema. |
| `ERR_WT_VERSION_UNSUPPORTED` | `wt.version` not equal to `1`. |
| `ERR_WT_INTENT_EMPTY` | `wt.intent` list empty. |
| `ERR_WT_INTENT_DUP` | Duplicate ADR-010 intents detected in `wt.intent`. |
| `ERR_WT_TIMESTAMP` | `wt.timestamp` regressed relative to the previous ticket from the same author. |
| `ERR_WT_SIGNATURE` | Signature validation over `"AMDUAT:WT\0"` failed. |
| `ERR_WT_KEY_UNBOUND` | Declared `wt.pubkey` is not authorized for `wt.author` via the predicate registry. |
| `ERR_WT_INTENT_UNREGISTERED` | `wt.intent` entry not registered in ADR-010 predicate registry. |
| `ERR_WT_SCOPE_UNAUTHORIZED` | Router policy rejected the declared domain scope. |
| `ERR_WT_PARENT_UNKNOWN` | Optional `wt.parent` reference could not be resolved. |
| `ERR_WT_PARENT_REQUIRED` | Policy required `wt.parent` but the field was omitted. |
| `ERR_SOS_UNKNOWN_KEY` | SOS/1 map contained a key outside the v1-min schema. |
| `ERR_SOS_VERSION_UNSUPPORTED` | `sos.version` not equal to `1`. |
| `ERR_SOS_PREDICATE_UNREGISTERED` | Overlay predicate not registered in the CRS predicate registry. |
| `ERR_SOS_POLICY_INCOMPATIBLE` | `sos.policy` outside `{0,1,2}` or disallowed for the deployment lane. |
| `ERR_SOS_SIGNATURE_INVALID` | Signature validation over `"AMDUAT:SOS\0"` failed. |
| `ERR_SOS_COMPAT_EVIDENCE_REQUIRED` | Compat overlays missing MPR/1 + IER/1 references. |
| `ERR_SOS_TIMESTAMP_REGRESSION` | Overlay timestamp regressed relative to policy baseline. |
|
||||
|
||||
### 7.8 FLS/1 and CRS/1 Byte Semantics
|
||||
|
||||
Phase 04 establishes deterministic linkage between FLS/1 envelopes and CRS/1 concept graphs. ADR-018 governs the linkage envelope; ADR-020 governs concept and relation payloads. CI harnesses (`tools/ci/run_vectors.py`, `tools/ci/gs_snapshot.py`) provide conformance evidence.
|
||||
|
||||
#### 7.8.1 FLS/1 Envelope TLVs (Draft)
|
||||
|
||||
> **Scope:** Draft wire image aligned with ADR-018 v0.5.0. Stewardship will finalize signature semantics alongside multi-surface publication work.
|
||||
|
||||
| Tag | Field | Type | Card. | Notes |
| ------ | -------------------- | ------ | ----- | ----- |
| `0x60` | `source_cid` | CID | 1 | Deterministic sender artefact/surface. |
| `0x61` | `target_cid` | CID | 1 | Deterministic recipient artefact/surface. |
| `0x62` | `payload_cid` | CID | 1 | Content payload (COR/1 capsule, CRS/1 concept, or CRR/1 relation). |
| `0x63` | `routing_policy_cid` | CID | 0-1 | Optional deterministic policy capsule. |
| `0x64` | `timestamp` | UINT64 | 0-1 | Optional bounded timing evidence (big-endian). |
| `0x65` | `signature` | BYTES | 0-1 | Optional Ed25519 signature with `"AMDUAT:FLS\0"` domain separator. |
|
||||
|
||||
**Envelope rules (draft):**
|
||||
|
||||
* Header MUST present `MAGIC="FLS1"`, `VERSION=0x01`, and zeroed `FLAGS/RSV` bytes.
|
||||
* TLVs MUST appear in strictly increasing tag order. Duplicate tags ⇒ `ERR_FLS_DUPLICATE_TAG`; reordering ⇒ `ERR_FLS_TAG_ORDER`.
|
||||
* Unknown tags are rejected until ADR updates extend this table (`ERR_FLS_UNKNOWN_TAG`).
|
||||
* CID TLVs MUST present 32-byte payloads aligned with ADR-001 ⇒ `ERR_FLS_CID_LENGTH`.
|
||||
* `timestamp` MUST be exactly eight bytes (UINT64, network byte order) ⇒ `ERR_FLS_TIMESTAMP_LENGTH`.
|
||||
* `signature` MUST start with `"AMDUAT:FLS\0"` and carry a 64-byte Ed25519 signature ⇒ `ERR_FLS_SIGNATURE_DOMAIN` / `ERR_FLS_SIGNATURE_LENGTH`; failing Ed25519 verification raises `ERR_FLS_SIGNATURE`.
|
||||
* When supplied, CRS payload bytes MUST hash to the declared `payload_cid` using `SHA-256("CAS:OBJ\0" || payload)` ⇒ `ERR_FLS_PAYLOAD_CID_MISMATCH`.
|
||||
* CRS payload headers MUST match `CRS1` (concept) or `CRR1` (relation) when linkage metadata declares the type ⇒ `ERR_FLS_PAYLOAD_KIND`.
|
||||
* Payloads MAY be CRS/1 concepts or CRR/1 relations; FLS/1 envelopes never mutate CRS graphs.
|
||||
|
||||
#### 7.8.2 CRS/1 Concept & Relation TLVs (Normative)
|
||||
|
||||
> **Scope:** Deterministic CRS/1 byte layout as ratified by ADR-020 v1.1.0. All TLVs
|
||||
> use single-byte tags + single-byte lengths with fixed 32-byte payloads.
|
||||
|
||||
**Concept Header** — `MAGIC="CRS1"`, `VERSION=0x01`, `FLAGS=0x00`, `RSV=0x00`.
|
||||
|
||||
| Tag | Field | Type | Card. | Notes |
| ------ | ------------------ | ---- | ----- | ----- |
| `0x40` | `description_cid` | CID | 1 | Canonical COR/1/BCF descriptor for the concept text/essence. |
| `0x41` | `relations_cid` | CID | 1 | Deterministic list CID of outbound relation CIDs. |
|
||||
|
||||
**Relation Header** — `MAGIC="CRR1"`, `VERSION=0x01`, `FLAGS=0x00`, `RSV=0x00`.
|
||||
|
||||
| Tag | Field | Type | Card. | Notes |
| ------ | ----------------- | ---- | ----- | ----- |
| `0x42` | `source_cid` | CID | 1 | Originating Concept CID. |
| `0x43` | `target_cid` | CID | 1 | Destination Concept or artefact CID. |
| `0x44` | `predicate_cid` | CID | 1 | Registered predicate Concept CID. |
|
||||
|
||||
**Validation rules**
|
||||
|
||||
* Headers MUST match the values above; mismatches reject as malformed.
|
||||
* TLVs MUST appear exactly once in the order listed. Missing or out-of-order
|
||||
TLVs ⇒ `ERR_CRS_TAG_ORDER` (concept) or `ERR_CRR_TAG_ORDER` (relation).
|
||||
* Duplicate relation tags ⇒ `ERR_CRR_DUPLICATE_TAG`.
|
||||
* TLV payloads MUST be exactly 32 bytes ⇒ `ERR_CRS_LENGTH_MISMATCH` / `ERR_CRR_LENGTH_MISMATCH`.
|
||||
* Unknown tags are rejected ⇒ `ERR_CRS_UNKNOWN_TAG` / `ERR_CRR_UNKNOWN_TAG`.
|
||||
* `predicate_cid` MUST reference a CRS Concept (`ERR_CRR_PREDICATE_NOT_CONCEPT`). When a predicate taxonomy exists, predicates MUST declare `is_a → Predicate` (`ERR_CRR_PREDICATE_CLASS_MISSING`).
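
The concept-record rules can be sketched as a fixed-layout parser. The sketch assumes the header is the same 7-byte MAGIC/VERSION/FLAGS/RSV layout as COR/1, which this section implies but does not restate, and it treats a trailing known tag as `ERR_CRS_TAG_ORDER` (duplicated). The names `parse_concept`, `concept_cid`, and `CrsError` are hypothetical.

```python
import hashlib
from typing import Dict

CONCEPT_HEADER = b"CRS1\x01\x00\x00"   # assumed 7-byte header layout
CONCEPT_FIELDS = ((0x40, "description_cid"), (0x41, "relations_cid"))
CONCEPT_TAGS = {tag for tag, _ in CONCEPT_FIELDS}


class CrsError(Exception):
    """Message is an error code from this section where one is named."""


def parse_concept(buf: bytes) -> Dict[str, bytes]:
    if buf[:7] != CONCEPT_HEADER:
        raise CrsError("malformed header")          # no code is named for this
    pos = 7
    fields: Dict[str, bytes] = {}
    for want, name in CONCEPT_FIELDS:
        if pos + 2 > len(buf):
            raise CrsError("ERR_CRS_TAG_ORDER")     # TLV missing
        tag, length = buf[pos], buf[pos + 1]        # single-byte tag and length
        if tag not in CONCEPT_TAGS:
            raise CrsError("ERR_CRS_UNKNOWN_TAG")
        if tag != want:
            raise CrsError("ERR_CRS_TAG_ORDER")
        if length != 32 or pos + 34 > len(buf):
            raise CrsError("ERR_CRS_LENGTH_MISMATCH")
        fields[name] = buf[pos + 2:pos + 34]
        pos += 34
    if pos != len(buf):
        extra = buf[pos]
        raise CrsError("ERR_CRS_TAG_ORDER" if extra in CONCEPT_TAGS
                       else "ERR_CRS_UNKNOWN_TAG")
    return fields


def concept_cid(record: bytes) -> bytes:
    """SHA-256 over the domain prefix and the exact record bytes."""
    return hashlib.sha256(b"CAS:OBJ\x00" + record).digest()


description = bytes([0x11]) * 32
relations = bytes([0x22]) * 32
record = CONCEPT_HEADER + b"\x40\x20" + description + b"\x41\x20" + relations
parsed = parse_concept(record)
```

A relation record would follow the same shape with `MAGIC="CRR1"`, tags `0x42` to `0x44`, and the `ERR_CRR_*` codes.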
|
||||
|
||||
**Error mapping (ADR-006)**
|
||||
|
||||
| Code | Condition |
| ---- | --------- |
| `ERR_CRS_TAG_ORDER` | Concept TLVs missing, duplicated, or out of order. |
| `ERR_CRS_LENGTH_MISMATCH` | Concept TLV payload not exactly 32 bytes. |
| `ERR_CRS_UNKNOWN_TAG` | Concept TLV tag outside `0x40–0x41`. |
| `ERR_CRR_TAG_ORDER` | Relation TLVs missing, duplicated, or out of order. |
| `ERR_CRR_LENGTH_MISMATCH` | Relation TLV payload not exactly 32 bytes. |
| `ERR_CRR_UNKNOWN_TAG` | Relation TLV tag outside `0x42–0x44`. |
| `ERR_CRR_DUPLICATE_TAG` | Duplicate relation TLV encountered. |
| `ERR_CRR_PREDICATE_NOT_CONCEPT` | `predicate_cid` did not resolve to a CRS Concept. |
| `ERR_CRR_PREDICATE_CLASS_MISSING` | Predicate Concept missing `is_a → Predicate` taxonomy edge. |
|
||||
|
||||
**CID derivation**
|
||||
|
||||
```
concept_cid = SHA-256("CAS:OBJ\0" || bytes(CRS/1 concept record))
relation_cid = SHA-256("CAS:OBJ\0" || bytes(CRR/1 relation record))
```
|
||||
|
||||
Byte-identical records MUST yield identical CIDs; any mutation requires a new
|
||||
record.
|
||||
|
||||
### 7.9 WT/1 Audited Ticket Intake (Normative)
|
||||
|
||||
WT/1 (ADR-023) captures auditable intent-to-change tickets as an ADR-003 BCF/1
|
||||
map. Keys are UTF-8 strings sorted lexicographically; values use canonical BCF
|
||||
types.
|
||||
|
||||
| Key | Type | Cardinality | Notes |
| -------------- | ----------------- | ----------- | ----- |
| `wt.version` | UINT8 | 1 | MUST equal `1`. |
| `wt.author` | CID (hex string) | 1 | CRS Concept or DID capsule representing the submitting actor. |
| `wt.scope` | CID (hex string) | 1 | ADR-010B domain scope concept CID. |
| `wt.intent` | `LIST<STRING>` | 1 | Non-empty ADR-010 intent identifiers; deduped and byte-lexicographically sorted. |
| `wt.payload` | CID (hex string) | 1 | CRS manifest, change plan, or opaque payload describing proposed work. |
| `wt.timestamp` | UINT64 | 1 | Epoch seconds; MUST be monotonic per `wt.author`. |
| `wt.pubkey` | BYTES[32] | 1 | Ed25519 public key used to verify `wt.signature`; MUST bind to `wt.author`. |
| `wt.signature` | BYTES[64] | 1 | Ed25519 signature over `H("AMDUAT:WT\0" \|\| canonical_bytes_without_signature)`. |
| `wt.parent` | CID (hex string) | 0–1 | Optional lineage pointer to the previous WT/1 ticket for the same author. |
|
||||
|
||||
**Encoding rules**
|
||||
|
||||
1. `wt.intent` MUST be encoded as a list of unique UTF-8 strings sorted
|
||||
lexicographically; duplicates ⇒ `ERR_WT_INTENT_DUP`; entries not registered in
|
||||
ADR-010 ⇒ `ERR_WT_INTENT_UNREGISTERED`.
|
||||
2. CIDs serialize as lowercase hex strings (32 bytes → 64 hex chars) matching
|
||||
`SHA-256("CAS:OBJ\0" || payload)` outputs.
|
||||
3. `wt.signature` is a 64-byte Ed25519 signature; `wt.pubkey` supplies the
|
||||
32-byte verification key. The signature domain-separates with
|
||||
`"AMDUAT:WT\0"` and excludes the `wt.signature` field from the canonical byte
|
||||
stream hashed for verification.
|
||||
|
||||
**Validation**
|
||||
|
||||
1. Unknown keys ⇒ `ERR_WT_UNKNOWN_KEY`.
|
||||
2. `wt.version != 1` ⇒ `ERR_WT_VERSION_UNSUPPORTED`.
|
||||
3. Empty `wt.intent` ⇒ `ERR_WT_INTENT_EMPTY`.
|
||||
4. `wt.timestamp` less than the prior accepted ticket for the same `wt.author`
|
||||
⇒ `ERR_WT_TIMESTAMP`. When `wt.parent` is provided, its timestamp MUST NOT
|
||||
exceed the child timestamp; violations ⇒ `ERR_WT_TIMESTAMP`.
|
||||
5. Signature verification failure ⇒ `ERR_WT_SIGNATURE`.
|
||||
6. Routers MUST verify `has_pubkey(wt.author, wt.pubkey)` (or registered
|
||||
equivalent) ⇒ missing edge raises `ERR_WT_KEY_UNBOUND`.
|
||||
7. Unknown ADR-010 intent ⇒ `ERR_WT_INTENT_UNREGISTERED`.
|
||||
8. Router policy rejection of `wt.scope` ⇒ `ERR_WT_SCOPE_UNAUTHORIZED`.
|
||||
9. Provided `wt.parent` that cannot be resolved ⇒ `ERR_WT_PARENT_UNKNOWN`.
|
||||
10. Policy required lineage but omitted `wt.parent` ⇒ `ERR_WT_PARENT_REQUIRED`.
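
The checks that need no cryptography or graph lookup (rules 1 to 4, 7, and 10, plus the duplicate rule from the encoding section) can be sketched over a decoded ticket. The dictionary representation, the `validate_ticket` name, and the intent identifiers `"fix"` and `"refactor"` are hypothetical; signature verification, key binding, scope policy, and parent resolution are deliberately omitted.

```python
from typing import Optional, Set

ALLOWED_KEYS = {
    "wt.version", "wt.author", "wt.scope", "wt.intent", "wt.payload",
    "wt.timestamp", "wt.pubkey", "wt.signature", "wt.parent",
}


def validate_ticket(wt: dict, registered_intents: Set[str],
                    prior_timestamp: Optional[int] = None,
                    parent_required: bool = False) -> Optional[str]:
    """Return the first violated error code, or None when these checks pass."""
    if set(wt) - ALLOWED_KEYS:
        return "ERR_WT_UNKNOWN_KEY"
    if wt.get("wt.version") != 1:
        return "ERR_WT_VERSION_UNSUPPORTED"
    intents = wt.get("wt.intent", [])
    if not intents:
        return "ERR_WT_INTENT_EMPTY"
    if len(set(intents)) != len(intents):
        return "ERR_WT_INTENT_DUP"
    # prior_timestamp: timestamp of the last accepted ticket from this author.
    if prior_timestamp is not None and wt["wt.timestamp"] < prior_timestamp:
        return "ERR_WT_TIMESTAMP"
    if any(intent not in registered_intents for intent in intents):
        return "ERR_WT_INTENT_UNREGISTERED"
    if parent_required and "wt.parent" not in wt:
        return "ERR_WT_PARENT_REQUIRED"
    return None


registry = {"fix", "refactor"}        # hypothetical ADR-010 intent identifiers
ticket = {
    "wt.version": 1,
    "wt.author": "aa" * 32,
    "wt.scope": "bb" * 32,
    "wt.intent": ["fix", "refactor"],
    "wt.payload": "cc" * 32,
    "wt.timestamp": 1_700_000_000,
    "wt.pubkey": bytes(32),
    "wt.signature": bytes(64),
}
```

Returning the first failure in a fixed order keeps the reported code deterministic when a ticket violates several rules at once.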
|
||||
|
||||
**Router integration**
|
||||
|
||||
* `POST /wt` (Protected Area) accepts WT/1 payloads, verifies signatures against
|
||||
`wt.pubkey`, enforces ADR-010 intent membership, validates optional
|
||||
`wt.parent` lineage, and rejects timestamp regressions.
|
||||
* `GET /wt/:cid` returns canonical WT/1 bytes for replay.
|
||||
* `GET /wt?after=<cid>&limit=<n>` paginates deterministically by CID
|
||||
(byte-lexicographic). `after` is an exclusive bound; routers enforce
|
||||
`1 ≤ limit ≤ Nmax` and MUST preserve stable replay windows.
|
||||
* Responses MUST include canonical WT/1 bytes; no rewriting or reformatting is
|
||||
permitted.
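
As a rough sketch of the pagination contract above, the following orders CIDs byte-lexicographically and treats `after` as an exclusive bound. `N_MAX` and `page_after` are hypothetical names; the spec only names the bound `Nmax` and does not prescribe an implementation.

```python
from typing import List, Optional

N_MAX = 100  # hypothetical value; the spec only requires 1 <= limit <= Nmax


def page_after(cids_hex: List[str], after: Optional[str], limit: int) -> List[str]:
    """Return up to `limit` CIDs strictly greater than `after`, in byte order."""
    if not 1 <= limit <= N_MAX:
        raise ValueError("limit out of range")
    # Compare decoded bytes so ordering does not depend on string collation.
    ordered = sorted(bytes.fromhex(c) for c in cids_hex)
    if after is not None:
        bound = bytes.fromhex(after)
        ordered = [c for c in ordered if c > bound]  # exclusive bound
    return [c.hex() for c in ordered[:limit]]
```

Because the order depends only on the CID bytes, repeating a request with the same `after` value yields the same window as long as no smaller CIDs are inserted behind the cursor.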
|
||||
|
||||
**Evidence & vectors**
|
||||
|
||||
* `/amduat/logs/ph04/evidence/wt1/PH04-EV-WT-001/summary.md` — validator run linking
|
||||
router behaviour to vectors.
|
||||
* `/amduat/vectors/ph04/wt1/` — fixtures `TV-WT-001…009` covering success,
|
||||
unknown key, signature failure, timestamp regression, key unbound, intent
|
||||
unregistered, parent timestamp inversion, scope policy rejection, and
|
||||
unresolved parent lineage.
|
||||
|
||||
### 7.10 CT/1 Header (Normative)
|
||||
|
||||
CT/1 headers serialize as ADR-003 BCF/1 maps with fixed key ordering. Keys and
|
||||
types:
|
||||
|
||||
| Key | Type | Notes |
|
||||
| --------------------- | -------- | ----- |
|
||||
| `ct.version` | `UINT8` | MUST equal `1`. |
|
||||
| `ct.rcs_version` | `UINT8` | RCS/1 core schema version; MUST equal `1`. |
|
||||
| `ct.topology` | `CID` | CRS/1 topology or manifest CID. |
|
||||
| `ct.ac` | `CID` | AC/1 descriptor CID (ADR-028). |
|
||||
| `ct.dtf` | `CID` | DTF/1 policy CID (ADR-028). |
|
||||
| `ct.determinism_level`| `UINT8` | `0` = D1 (bit-exact), `1` = D2 (numeric stable). |
|
||||
| `ct.kernel_cfg` | `CID` | Opaque kernel/tolerance configuration manifest. |
|
||||
| `ct.tick` | `UINT64` | Monotonically increasing replay sequence number. |
|
||||
| `ct.signature` | `BYTES` | 64-byte Ed25519 signature payload. |
|
||||
|
||||
**Validation**
|
||||
|
||||
1. BCF decode failures ⇒ `ERR_CT_MALFORMED`.
|
||||
2. Key set/order mismatches ⇒ `ERR_CT_UNKNOWN_KEY`.
|
||||
3. `ct.version` or `ct.rcs_version` ≠ `1` ⇒ `ERR_CT_VERSION`.
|
||||
4. `ct.determinism_level ∉ {0,1}` ⇒ `ERR_CT_DET_LEVEL`.
|
||||
5. Non-canonical CID strings ⇒ `ERR_CT_CID`.
|
||||
6. `ct.tick` outside `UINT64` range or non-monotone progression ⇒
|
||||
`ERR_CT_FIELD_TYPE` / `ERR_CT_TICK`.
|
||||
7. `ct.signature` length mismatch or Ed25519 verification failure ⇒
|
||||
`ERR_CT_SIGNATURE`.
|
||||
|
||||
**Signature rules**
|
||||
|
||||
`ct.signature` signs `H("AMDUAT:CT\0" || canonical_bytes_without_signature)`. Public
|
||||
keys are registered in the determinism catalogue (this section) and referenced by
|
||||
`ct.kernel_cfg` as needed for tolerance disclosure.
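
A minimal sketch of the signing preimage follows. It assumes `H` is SHA-256 (the default `algo_id` `0x01` elsewhere in this document) and treats the canonical BCF/1 bytes as an opaque input; the helper name is illustrative, and Ed25519 signing itself is left to whichever library an implementation uses.

```python
import hashlib

CT_DOMAIN = b"AMDUAT:CT\x00"  # domain separator, including the trailing NUL


def ct_signing_digest(canonical_bytes_without_signature: bytes) -> bytes:
    """Digest that `ct.signature` is assumed to cover (H taken as SHA-256)."""
    return hashlib.sha256(CT_DOMAIN + canonical_bytes_without_signature).digest()
```

The same header bytes hashed under a different separator (for example `"AMDUAT:WT\0"`) give an unrelated digest, which is the point of domain separation.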
|
||||
|
||||
**Evidence & vectors**
|
||||
|
||||
* `/amduat/tools/validate/ct1_validator.py` — validation helper covering CT/1,
|
||||
AC/1, and DTF/1 schemas.
|
||||
* `/amduat/vectors/ph05/ct1/` — fixtures `TV-CT1-001…004`, `TV-AC1-001…002`,
|
||||
`TV-DTF1-001…002`.
|
||||
* `/amduat/tools/ci/ct_replay.py` — replay harness producing
|
||||
`/amduat/logs/ph05/evidence/ct1/PH05-EV-CT1-REPLAY-001/` (D1 parity + D2
|
||||
tolerance runs).
|
||||
|
||||
### 7.11 SOS/1 Semantic Overlays (Normative)
|
||||
|
||||
SOS/1 (ADR-024) attaches typed overlays to CRS Concepts or Relations via an
|
||||
ADR-003 BCF/1 map signed with the `"AMDUAT:SOS\0"` domain separator.
|
||||
|
||||
| Key | Type | Cardinality | Notes |
|
||||
| -------------- | ------------ | ----------- | ----- |
|
||||
| `sos.version` | UINT8 | 1 | MUST equal `1`. |
|
||||
| `sos.subject` | CID (hex) | 1 | CRS Concept or Relation CID receiving the overlay. |
|
||||
| `sos.predicate`| CID (hex) | 1 | Registered predicate concept describing overlay semantics. |
|
||||
| `sos.value` | CID (hex) | 1 | Opaque payload (text capsule, BCF/1 manifest, etc.). |
|
||||
| `sos.policy` | ENUM | 1 | `0=open`, `1=curated`, `2=compat`. |
|
||||
| `sos.timestamp`| UINT64 | 1 | Epoch seconds when authored. |
|
||||
| `sos.signature`| BYTES[64] | 1 | Ed25519 signature over `H("AMDUAT:SOS\0" || canonical_bytes_without_signature)`. |
|
||||
|
||||
**Validation**
|
||||
|
||||
1. Unknown keys ⇒ `ERR_SOS_UNKNOWN_KEY`.
|
||||
2. `sos.version != 1` ⇒ `ERR_SOS_VERSION_UNSUPPORTED`.
|
||||
3. `sos.predicate` MUST resolve to a registered CRS predicate ⇒
|
||||
`ERR_SOS_PREDICATE_UNREGISTERED`.
|
||||
4. `sos.policy` outside `{0,1,2}` or disallowed for deployment ⇒
|
||||
`ERR_SOS_POLICY_INCOMPATIBLE`.
|
||||
5. Epoch-second timestamps that regress relative to policy baseline MAY raise
|
||||
`ERR_SOS_TIMESTAMP_REGRESSION`.
|
||||
6. Signature verification failure ⇒ `ERR_SOS_SIGNATURE_INVALID`.
|
||||
7. Compat overlays (`sos.policy = 2`) MUST reference MPR/1 + IER/1 artefacts in
|
||||
certification evidence ⇒ missing references raise
|
||||
`ERR_SOS_COMPAT_EVIDENCE_REQUIRED`.
|
||||
|
||||
**Router integration**
|
||||
|
||||
* `POST /sos` (Protected Area) validates predicate registry membership, policy
|
||||
lane, timestamp discipline, and signatures.
|
||||
* `GET /sos/:cid` returns canonical SOS/1 bytes for replay.
|
||||
* `GET /sos?subject=<cid>&after=<cid?>&limit=<n>` paginates overlays
|
||||
deterministically by CID with stable replay windows.
|
||||
* Compat responses MUST surface referenced MPR/1 hashes and IER/1 fingerprints
|
||||
for auditors.
|
||||
|
||||
**Evidence & vectors**
|
||||
|
||||
* `/amduat/logs/ph04/evidence/sos1/PH04-EV-SOS-001/summary.md` — validator run covering
|
||||
`TV-SOS-001…006`.
|
||||
* `/amduat/vectors/ph04/sos1/` — canonical overlay fixtures exercising success,
|
||||
unregistered predicate, policy mismatch, signature failure, timestamp
|
||||
regression, and compat evidence gaps.
|
||||
|
||||
### 7.12 MPR/1 Model Provenance (Normative)
|
||||
|
||||
MPR/1 (ADR-025 v1.0.0) captures canonical model fingerprint triples for compat
|
||||
policy lanes.
|
||||
|
||||
| Key | Type | Cardinality | Notes |
|
||||
| ------------------ | ------------ | ----------- | ----- |
|
||||
| `mpr.version` | UINT8 | 1 | MUST equal `1`. |
|
||||
| `mpr.model_hash` | HEX | 1 | Lowercase hex digest (≥64 chars) of model artefact. |
|
||||
| `mpr.weights_hash` | HEX | 1 | Lowercase hex digest (≥64 chars) of weights bundle. |
|
||||
| `mpr.tokenizer_hash` | HEX | 1 | Lowercase hex digest (≥64 chars) of tokenizer assets. |
|
||||
| `mpr.build_info` | CID *(optional)* | 0..1 | Immutable build metadata capsule. |
|
||||
| `mpr.signature` | BYTES[64] *(optional)* | 0..1 | Ed25519 signature over `"AMDUAT:MPR\0" || canonical_bytes_without_signature`. |
|
||||
|
||||
**Validation**
|
||||
|
||||
1. Unknown keys ⇒ `ERR_MPR_UNKNOWN_KEY`.
|
||||
2. `mpr.version != 1` ⇒ `ERR_MPR_VERSION`.
|
||||
3. Missing hash fields ⇒ `ERR_MPR_MISSING_FIELD`.
|
||||
4. Hash fields not lowercase hex (≥64) ⇒ `ERR_MPR_HASH_FORMAT`; zero digests ⇒ `ERR_MPR_HASH_ZERO`.
|
||||
5. `mpr.build_info` malformed ⇒ `ERR_MPR_BUILD_INFO`.
|
||||
6. Signature verification failure ⇒ `ERR_MPR_SIGNATURE`.
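
Rules 3 and 4 can be sketched as a single field check. The helper name is hypothetical, and the sketch does not enforce an even number of hex characters because the table above only states a minimum length.

```python
import re
from typing import Optional

_LOWER_HEX = re.compile(r"[0-9a-f]{64,}")


def check_mpr_hash(value: Optional[str]) -> Optional[str]:
    """Return the MPR/1 error code for a hash field, or None if it is acceptable."""
    if value is None:
        return "ERR_MPR_MISSING_FIELD"
    # Lowercase hex only, at least 64 characters.
    if not _LOWER_HEX.fullmatch(value):
        return "ERR_MPR_HASH_FORMAT"
    # An all-zero digest is well-formed but rejected explicitly.
    if set(value) == {"0"}:
        return "ERR_MPR_HASH_ZERO"
    return None
```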
|
||||
|
||||
**Evidence & vectors**
|
||||
|
||||
* `/amduat/logs/ph04/evidence/mpr1/PH04-EV-MPR-001/pass.jsonl` — validator harness (`python tools/ci/run_mpr_vectors.py`) covering `TV-MPR-001…003` with summary in `summary.md`.
|
||||
* `/amduat/vectors/ph04/mpr1/` — fixtures exercising valid record, missing weights hash, and signature domain mismatch.
|
||||
|
||||
### 7.13 IER/1 Inference Evidence (Normative)
|
||||
|
||||
IER/1 (ADR-026 v1.0.0) binds FER/1 receipts to compat policy envelopes and MPR/1 fingerprints.
|
||||
|
||||
| Key | Type | Cardinality | Notes |
|
||||
| ------------------------ | --------------- | ----------- | ----- |
|
||||
| `ier.version` | UINT8 | 1 | MUST equal `1`. |
|
||||
| `ier.fer_cid` | CID | 1 | Referenced FER/1 receipt. |
|
||||
| `ier.executor_fingerprint` | CID | 1 | MUST equal linked MPR/1 CID. |
|
||||
| `ier.determinism_level` | ENUM | 1 | FER/1 determinism indicator. |
|
||||
| `ier.rng_seed` | HEX *(conditional)* | 0..1 | Required (hex) when determinism ≠ `D1`. |
|
||||
| `ier.policy_cid` | CID | 1 | Compat policy capsule authorising run. |
|
||||
| `ier.log_digest` | HEX | 1 | `H("AMDUAT:IER:LOG\0" || concat(log.sha256))`. |
|
||||
| `ier.log_manifest` | MAP *(optional)* | 0..1 | Non-empty list of log entries with `sha256`. |
|
||||
| `ier.attestations` | LIST<BYTES> *(optional)* | 0..1 | Policy attestations (Ed25519 signatures). |
|
||||
|
||||
**Validation**
|
||||
|
||||
1. Unknown keys ⇒ `ERR_IER_UNKNOWN_KEY`.
|
||||
2. `ier.version != 1` ⇒ `ERR_IER_VERSION`.
|
||||
3. Malformed CIDs ⇒ `ERR_IER_POLICY`.
|
||||
4. `ier.executor_fingerprint` mismatch ⇒ `ERR_IER_FINGERPRINT`.
|
||||
5. Missing RNG seed when determinism ≠ `D1` ⇒ `ERR_FER_RNG_REQUIRED`.
|
||||
6. `ier.log_digest` mismatch or malformed manifest ⇒ `ERR_IER_LOG_HASH` / `ERR_IER_LOG_MANIFEST`.
|
||||
7. Attestation payloads not raw bytes ⇒ `ERR_IER_MALFORMED`.
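
The `ier.log_digest` formula can be sketched as below. Several details are assumptions not fixed by the table: `H` is taken as SHA-256, each `log.sha256` value is decoded from hex to raw bytes before concatenation, and the result is rendered as lowercase hex.

```python
import hashlib
from typing import Iterable

IER_LOG_DOMAIN = b"AMDUAT:IER:LOG\x00"


def ier_log_digest(log_sha256_hex: Iterable[str]) -> str:
    """Hash the domain separator followed by the log digests in manifest order."""
    h = hashlib.sha256(IER_LOG_DOMAIN)
    for entry in log_sha256_hex:
        h.update(bytes.fromhex(entry))  # assumption: raw digest bytes, not hex text
    return h.hexdigest()
```

Entry order matters: reordering the manifest changes the digest, so a validator would recompute it in the order the manifest lists its entries.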
|
||||
|
||||
**Evidence & vectors**
|
||||
|
||||
* `/amduat/logs/ph04/evidence/ier1/PH04-EV-IER-001/pass.jsonl` — validator harness (`python tools/ci/run_ier_vectors.py`) covering `TV-IER-001…004` with manifest summary in `summary.md`.
|
||||
* `/amduat/vectors/ph04/ier1/` — fixtures exercising success, missing RNG seed, fingerprint mismatch, and log digest mismatch.
|
||||
|
||||
---
|
||||
|
||||
## 8. Test Vectors & Conformance
|
||||
|
||||
### 8.1 COR/1 & ICD/1
|
||||
|
||||
* Payload → CID (algo `0x01`).
|
||||
* COR/1 streams → CID and back (round-trip identity).
|
||||
* ICD/1 → `instance_id`.
|
||||
|
||||
### 8.2 FCS/1 v1-min
|
||||
|
||||
* Positive: `{0x30,0x31,0x32}` only, strict order, valid PCB1, acyclic.
|
||||
* Negative: any pre-v1-min (legacy) tag (`0x33` through `0x36`) ⇒ `ERR_FCS_UNKNOWN_TAG` per §7.2.
|
||||
* Arity/PCB mismatch ⇒ `ERR_PCB_ARITY_MISMATCH`.
|
||||
* Cycle ⇒ `ERR_FCS_CYCLE_DETECTED`.
|
||||
|
||||
### 8.3 FER/1
|
||||
|
||||
* Signed receipt with monotonic timestamps; verify signature, executor set ↔ parity alignment, and linkage to FCS/1.
|
||||
* Negative: timestamp inversion ⇒ `ERR_FER_TIMESTAMP`; bad signature ⇒ `ERR_FER_SIGNATURE`.
|
||||
* Negative: parity drift (mismatched executor keys or output digests) ⇒ `ERR_IMPL_PARITY`.
|
||||
* Negative: unknown TLV tag/cardinality ⇒ `ERR_FER_UNKNOWN_TAG`.
|
||||
|
||||
### 8.4 FCT/1
|
||||
|
||||
* Multiple FER/1 receipts for same function; verify attestation coverage by policy.
|
||||
* Negative: mismatched receipt function ⇒ `ERR_FCT_RECEIPT_MISMATCH`.
|
||||
* Negative: missing attestation when policy ≠ Open ⇒ `ERR_FCT_ATTESTATION_REQUIRED`.
|
||||
|
||||
### 8.5 FPD/1
|
||||
|
||||
* Deterministic reconstruction of `fpd.digest` over `{FCT/1 bytes, FER/1 receipts, governance edge capsule}` on repeated runs.
|
||||
* Negative: perturbation of member ordering ⇒ `ERR_FPD_DIGEST`.
|
||||
* Negative: timestamp regression versus FER receipts or parent digest ⇒ `ERR_FPD_TIMESTAMP`.
|
||||
|
||||
**CI Requirements**
|
||||
|
||||
* Import/export **byte-identity** round-trip for COR/1, FCS/1, and FER/1.
|
||||
* Canonical TLV/BCF ordering across descriptors.
|
||||
* Multi-platform reproducibility (≥3) including signature verification parity.
|
||||
* Timing evidence captured per SRS FR-020 (deterministic envelope).
|
||||
* Federation digest fixture verifies stable FPD/1 CID under `tools/ci/fct_publish_check.py`.
|
||||
|
||||
---
|
||||
|
||||
## 9. Security Considerations
|
||||
|
||||
* Domain separation strings MUST be exact.
|
||||
* Hash **exact payload bytes**, never decoded structures.
|
||||
* Canonical rejection prevents ambiguous encodings.
|
||||
* Certification places policy/intent in signed FCT/1, not in execution recipes.
|
||||
|
||||
---
|
||||
|
||||
## 10. Change Management
|
||||
|
||||
* **Behavioural semantics are in SRS.**
|
||||
* Changes here require ADR + CCP approval.
|
||||
* Versioning follows semantic versioning of encodings.
|
||||
* On approval, update IDX and SRS references accordingly.
|
||||
|
||||
---
|
||||
|
||||
## 11. ByteStore API & Persistence Discipline
|
||||
|
||||
ByteStore is the canonical persistence boundary layered over COR/1 and ICD/1.
|
||||
Implementations **must** honour the behaviours in this section; deviations are
|
||||
governed by ADR-030.
|
||||
|
||||
### 11.1 API Surface
|
||||
|
||||
| API | Signature | Behaviour | Error Surfaces (ADR-006) |
|
||||
| -------------------- | ---------------------------------------------- | ---------------------------------------------------------------------------------- | ---------------------------------------------------- |
|
||||
| `put` | `(payload: bytes) → cid_hex` | Persist raw payload under CID derived from `H("CAS:OBJ\0" || payload)`. | `ERR_POLICY_SIZE`, `ERR_IDENTITY_MISMATCH` |
|
||||
| `put_stream` | `(chunks: Iterable[bytes]) → cid_hex` | Deterministic chunked ingest; concatenated bytes hash to the same CID as `put`. | `ERR_STREAM_ORDER`, `ERR_STREAM_TRUNCATED` |
|
||||
| `import_cor` | `(envelope: bytes) → cid_hex` | Validate COR/1, enforce policy, persist canonical envelope without re-encoding. | `ERR_POLICY_SIZE`, COR/1 decoder errors |
|
||||
| `export_cor` | `(cid_hex: str) → envelope` | Return stored COR/1 bytes; must match the original import byte-for-byte. | `ERR_STORE_MISSING`, `ERR_IDENTITY_MISMATCH` |
|
||||
| `get` | `(cid_hex: str) → bytes` | Return stored bytes (payload or COR envelope) exactly as persisted. | `ERR_STORE_MISSING` |
|
||||
| `stat` | `(cid_hex: str) → {present: bool, size: int}` | Probe object presence and payload/envelope size without mutating state. | `ERR_STORE_MISSING` (absence reported via `present`) |
|
||||
| `assert_area_isolation` | `(public_root: Path, secure_root: Path) → None` | Enforce SA/PA separation; raise if roots overlap or share ancestry. | `ERR_AREA_VIOLATION` |
|
||||
|
||||
### 11.2 Deterministic Identity
|
||||
|
||||
Canonical identity is derived per COR/1/SRS:
|
||||
|
||||
```
|
||||
cid = algo_id || H("CAS:OBJ\0" || payload)
|
||||
```
|
||||
|
||||
`algo_id` defaults to `0x01` (SHA-256). ByteStore **must** reuse the exact
|
||||
domain separator and hash to remain compatible with CAS and DDS §1.
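
A minimal sketch of the identity formula, assuming SHA-256 for `algo_id = 0x01` as stated above. The function name is illustrative, and the sketch returns raw bytes because the exact rendering of `cid_hex` is not restated in this section.

```python
import hashlib

CAS_DOMAIN = b"CAS:OBJ\x00"  # domain separator, including the trailing NUL
ALGO_SHA256 = 0x01


def derive_cid(payload: bytes) -> bytes:
    """cid = algo_id || H("CAS:OBJ\\0" || payload), with H = SHA-256."""
    digest = hashlib.sha256(CAS_DOMAIN + payload).digest()
    return bytes([ALGO_SHA256]) + digest
```

The payload is hashed exactly as received; decoding or normalising it first would break compatibility with CAS identities.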
|
||||
|
||||
### 11.3 COR/1 Round-Trip Identity
|
||||
|
||||
`import_cor()` decodes the envelope, enforces policy (size ≤ ICD/1
|
||||
`max_object_size`), and persists the canonical bytes. `export_cor()` returns the
|
||||
exact stored envelope; re-encoding is forbidden. Derived CID **must** equal the
|
||||
envelope’s CID (DDS §2.5, SRS FR-BS-004).
|
||||
|
||||
### 11.4 Atomic fsync Ladder
|
||||
|
||||
All writes follow the deterministic ladder:
|
||||
|
||||
1. Write payload/envelope to a unique `.tmp-<suffix>` file in the shard.
|
||||
2. `fsync(tmp)` to guarantee payload durability.
|
||||
3. `rename(tmp, final)`.
|
||||
4. `fsync(shard directory)` and then `fsync(ByteStore root)`.
|
||||
|
||||
Crash-window simulation is exposed via `AMDUAT_BYTESTORE_CRASH_STEP` (“before_rename”).
|
||||
Implementations **must** honour the hook and leave PA consistent on recovery
|
||||
(DDS §11.8; vectors TV-BS-005, evidence bundle PH05-EV-BS-001).
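
The ladder can be sketched for a POSIX filesystem as follows. `atomic_write` and its parameters are hypothetical, directory `fsync` via `os.open` is a POSIX assumption, and the `AMDUAT_BYTESTORE_CRASH_STEP` hook is deliberately not modelled.

```python
import os
import tempfile


def atomic_write(root: str, shard: str, name: str, data: bytes) -> str:
    """Write `data` to root/shard/name using the tmp, fsync, rename, fsync ladder."""
    shard_dir = os.path.join(root, shard)
    os.makedirs(shard_dir, exist_ok=True)
    final_path = os.path.join(shard_dir, name)

    # Step 1: unique .tmp-<suffix> file inside the shard directory.
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp-", dir=shard_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # Step 2: make the payload durable.
        os.rename(tmp_path, final_path)  # Step 3: atomic publish.
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # Step 4: persist the directory entries, shard first, then the store root.
    for directory in (shard_dir, root):
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
    return final_path
```

A crash before step 3 leaves only a `.tmp-` file, which recovery can discard; a crash after step 3 leaves a complete object, so readers never observe a partial payload under the final name.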
|
||||
|
||||
### 11.5 SA/PA Isolation & Pathing
|
||||
|
||||
Public area (PA) payloads live under case-stable two-level fan-out (`/aa/bb/cid…`).
|
||||
Secure area (SA) metadata is held outside the PA tree. `assert_area_isolation()`
|
||||
enforces:
|
||||
|
||||
* `public_root != secure_root`
|
||||
* neither root is an ancestor of the other
|
||||
|
||||
Violations raise `ERR_AREA_VIOLATION` and **must** be surfaced by callers.
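
A sketch of the two isolation checks and the fan-out pathing is shown below. The exception class and `fanout_path` are hypothetical, and the sketch assumes that the `aa` and `bb` levels are the first two hex byte pairs of the CID and that case stability means lowercasing.

```python
from pathlib import Path


class AreaViolation(Exception):
    """Hypothetical exception carrying the ADR-006 code."""

    code = "ERR_AREA_VIOLATION"


def assert_area_isolation(public_root: Path, secure_root: Path) -> None:
    """Raise if the roots are equal or one is an ancestor of the other."""
    pub = public_root.resolve()
    sec = secure_root.resolve()
    if pub == sec or pub in sec.parents or sec in pub.parents:
        raise AreaViolation(f"{pub} and {sec} overlap")


def fanout_path(public_root: Path, cid_hex: str) -> Path:
    """Two-level fan-out: <root>/aa/bb/<cid> (prefix choice is an assumption)."""
    cid_hex = cid_hex.lower()
    return public_root / cid_hex[0:2] / cid_hex[2:4] / cid_hex
```

Resolving both roots before comparing them means a symlinked or relative path cannot hide an overlap.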
|
||||
|
||||
### 11.6 Chunked Ingest Determinism & Policy
|
||||
|
||||
`put_stream()` concatenates byte chunks in order, rejecting non-bytes input or
|
||||
missing data. The resulting CID **must** equal `put(payload)` for the same
|
||||
payload (SRS FR-BS-005). ByteStore enforces ICD/1 `max_object_size` prior to
|
||||
persisting data; exceeding the limit raises `ERR_POLICY_SIZE`.
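
The equivalence between `put_stream()` and `put()` follows from incremental hashing, sketched here. `StoreError`, `stream_digest`, and the `MAX_OBJECT_SIZE` value are illustrative assumptions; the error codes follow the mapping in §11.7.

```python
import hashlib
from typing import Iterable

CAS_DOMAIN = b"CAS:OBJ\x00"
MAX_OBJECT_SIZE = 1 << 20  # hypothetical ICD/1 max_object_size


class StoreError(Exception):
    """Hypothetical exception carrying an ADR-006 error code."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


def stream_digest(chunks: Iterable[bytes]) -> str:
    """Hash chunks in order; the result equals hashing the concatenated payload."""
    h = hashlib.sha256(CAS_DOMAIN)
    total = 0
    for chunk in chunks:
        if not isinstance(chunk, (bytes, bytearray)):
            raise StoreError("ERR_STREAM_ORDER")
        total += len(chunk)
        if total > MAX_OBJECT_SIZE:  # enforce policy before anything is persisted
            raise StoreError("ERR_POLICY_SIZE")
        h.update(chunk)
    if total == 0:
        raise StoreError("ERR_STREAM_TRUNCATED")
    return h.hexdigest()
```

Chunk boundaries do not affect the digest, so any split of the same payload yields the same identity.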
|
||||
|
||||
### 11.7 Error Mapping
|
||||
|
||||
| Condition | Error Code | Notes |
|
||||
| ---------------------------------- | --------------------- | -------------------------------------------------------------- |
|
||||
| Payload exceeds policy limit | `ERR_POLICY_SIZE` | ICD/1 `max_object_size` (ADR-006 policy lane). |
|
||||
| Streaming chunk type/order invalid | `ERR_STREAM_ORDER` | Non-bytes or out-of-order chunks (deterministic rejection). |
|
||||
| Streaming missing payload | `ERR_STREAM_TRUNCATED`| Zero-length stream without payload. |
|
||||
| Stored bytes mismatch CID | `ERR_IDENTITY_MISMATCH` | Raised when existing bytes conflict with derived identity. |
|
||||
| SA/PA overlap | `ERR_AREA_VIOLATION` | Shared roots or ancestry (secure/public crossing). |
|
||||
| Crash-window hook triggered | `ERR_CRASH_SIMULATION`| Simulated crash prior to rename/fsync ladder completion. |
|
||||
| Missing object | `ERR_STORE_MISSING` | Reported when an object path is absent. |
|
||||
|
||||
All other errors bubble from COR/1 decoding and map to existing ADR-006 codes
|
||||
(see §2.7).
|
||||
|
||||
### 11.8 Conformance & Evidence
|
||||
|
||||
* Vectors: `/amduat/vectors/ph05/bytestore/` (`TV-BS-001…005`).
|
||||
* Runner: `/amduat/tools/ci/bs_check.py` (dual-run determinism; emits JSONL).
|
||||
* Evidence: `/amduat/logs/ph05/evidence/bytestore/PH05-EV-BS-001/` (runA/runB +
|
||||
crash summary).
|
||||
* Linked ADR: ADR-030 (ByteStore Persistence Contract).
|
||||
|
||||
---
|
||||
|
||||
## Appendix A — Surface Version Table
|
||||
|
||||
| Surface | Version | Notes |
|
||||
| ------- | ------- | ----- |
|
||||
| FCS/1 | v1-min | Execution-only descriptor (ADR-016); governance fields live in FCT/1. |
|
||||
| FER/1 | v1.1 | Parity-first receipts with run_id dedup, executor fingerprints, typed logs, RNG envelope (ADR-017). |
|
||||
| FCT/1 | v1.0 | Certification transactions binding policy/intent/attestations; publishes FER/1 receipts. |
|
||||
| FPD/1 | v1.0 | Single-digest publication capsule linking FCT/1 and FER/1 sets. |
|
||||
|
||||
---
|
||||
|
||||
**End of DDS 0.5.0**
|
||||
|
||||
---
|
||||
|
||||
## Document History
|
||||
|
||||
* 0.2.1 (2025-10-26) — Updated Phase Pack references; byte semantics unchanged; ADR-012 no-normalization.
|
||||
|
||||
* 0.2.2 (2025-10-26) — Promoted PH01 design surfaces to Approved; synchronized anchors.
|
||||
|
||||
* 0.2.3 (2025-10-27) — Marked DDS scope as PH01-only and referenced FPS/1 surfaces.
|
||||
|
||||
* **0.2.4 (2025-11-14):** Added FCS/1 & PCB1 TLVs plus FER/1 receipt and FCT/1 transaction schemas with rejection mapping.
|
||||
|
||||
* **0.2.5 (2025-11-15):** Registered PCB1 header invariants and arity/cycle validation errors.
|
||||
|
||||
* **0.2.6 (2025-11-19):** Registered `ERR_EXEC_TIMEOUT` for deterministic timing envelope.
|
||||
|
||||
* **0.3.0 (2025-11-02):** Trimmed **FCS/1 to v1-min** (execution recipe only: `function_ptr`, `parameter_block`, `arity`). Moved **intent/roles/scope/policy** to **FCT/1**; clarified provenance lives in **FER/1**. Added rejection guidance for legacy FCS tags.
|
||||
|
||||
* **0.3.1 (2025-11-20):** Registered `ERR_FCS_UNKNOWN_TAG`; clarified that any legacy governance tag in FCS/1 is a hard rejection. No other layout changes.
|
||||
* **0.3.2 (2025-11-21):** Adopted parity-first FER/1 TLVs (executor set, parity vector, context/witness hooks), registered `ERR_IMPL_PARITY` and `ERR_FER_UNKNOWN_TAG`, and refreshed conformance guidance.
|
||||
* **0.3.3 (2025-11-22):** Added FPD/1 publication digest schema, registered federation digest/timestamp errors, and wired CI fixtures to deterministic publish checks.
|
||||
|
||||
* **0.3.5 (2025-11-07):** Added surface version table and aligned FER/1 v1.1 maintenance metadata for Phase 04 handoff.
|
||||
|
||||
* **0.3.6 (2025-11-08):** Seeded PH04 linkage & semantic placeholder section (DDS §7.8).
|
||||
|
||||
* **0.3.7 (2025-11-08):** Seeded FLS/1 placeholder TLV table aligned with ADR-018 v0.3.0.
|
||||
* **0.3.8 (2025-11-08):** Registered FLS/1 TLV registry (0x60–0x65), error mapping, and conformance vectors aligned with ADR-018 v0.4.0.
|
||||
* **0.3.9 (2025-11-09):** Locked CRS/1 concept/relation TLVs and registered FLS payload CID/type errors with conformance evidence.
|
||||
|
||||
* **0.4.0 (2025-11-08):** Promoted §7.8 FLS/1 & CRS/1 TLVs with error mapping and GS/1 snapshot evidence.
|
||||
|
||||
* **0.4.1 (2025-11-09):** Extended CRS predicate rules and mapped new validation errors.
|
||||
* **0.4.2 (2025-11-09):** Registered router error codes (`ERR_FLS_UNKNOWN_TAG`, `ERR_FLS_TAG_ORDER`, `ERR_FLS_SIGNATURE`) and FPD parent-policy errors with GS diff evidence pointer.
|
||||
* **0.4.3 (2025-11-09):** Added WT/1 intake layout, validation errors, and router API integration (§7.9).
|
||||
* **0.4.4 (2025-11-20):** Refined WT/1 (§7.9) with `wt.pubkey`, signature preimage exclusion, lineage/policy errors, and
|
||||
expanded validator vector coverage.
|
||||
|
||||
* **0.4.5 (2025-11-21):** Registered SOS/1 overlays (§7.10) with compat evidence enforcement, aligned WT/1 error mapping (`ERR_WT_KEY_UNBOUND`, `ERR_WT_INTENT_UNREGISTERED`, `ERR_WT_PARENT_REQUIRED`), and expanded vector coverage to `TV-WT-001…009`.
|
||||
* **0.4.6 (2025-11-22):** WT/1 and SOS/1 conformance evidence sealed via PH04-M4/M5 audit bundles.
|
||||
|
||||
* **0.4.7 (2025-11-23):** Documented MPR/1 and IER/1 schemas, error surfaces, and validator evidence for compat policy lane.
|
||||
|
||||
* **0.4.8 (2025-11-24):** Added §7.10 CT/1 header schema with error codes and renumbered downstream sections for PH05 replay.
|
||||
|
||||
* **0.5.0 (2025-11-11):** Added §11 ByteStore API & Persistence discipline covering API surface, fsync ladder, SA/PA isolation, streaming determinism, and ADR-006 error mapping.
|
||||
357
tier1/enc-asl-core-index-1.md
Normal file
|
|
@ -0,0 +1,357 @@
|
|||
# ENC/ASL-CORE-INDEX/1 — Encoding Specification for ASL Core Index
|
||||
|
||||
Status: Draft
|
||||
Owner: Niklas Rydberg
|
||||
Version: 0.1.0
|
||||
SoT: No
|
||||
Last Updated: 2025-11-16
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [encoding, index, deterministic]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/enc-asl-core-index.md | Canonical: /amduat/tier1/enc-asl-core-index-1.md -->
|
||||
|
||||
**Document ID:** `ENC/ASL-CORE-INDEX/1`
|
||||
**Layer:** Index Encoding Profile (on top of ASL/1-CORE-INDEX + ASL/STORE-INDEX/1)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/1-CORE-INDEX` — semantic index model
|
||||
* `ASL/STORE-INDEX/1` — store lifecycle and replay contracts
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ASL/LOG/1` — append-only log semantics
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
This document defines the **exact encoding of ASL index segments** and records for storage and interoperability.
|
||||
|
||||
It translates the **semantic model of ASL/1-CORE-INDEX** and the **store contracts of ASL/STORE-INDEX/1** into a deterministic **bytes-on-disk layout**.
|
||||
Variable-length digest requirements are defined in ASL/1-CORE-INDEX (`tier1/asl-core-index-1.md`).
|
||||
This document incorporates the federation encoding addendum.
|
||||
|
||||
It is intended for:
|
||||
|
||||
* C libraries
|
||||
* Tools
|
||||
* API frontends
|
||||
* Memory-mapped access
|
||||
|
||||
It does **not** define:
|
||||
|
||||
* Index semantics (see ASL/1-CORE-INDEX)
|
||||
* Store lifecycle behavior (see ASL/STORE-INDEX/1)
|
||||
* Acceleration semantics (see ASL/INDEX-ACCEL/1)
|
||||
* TGK edge semantics or encodings (see `TGK/1` and `TGK/1-CORE`)
|
||||
* Federation semantics (see federation/domain policy layers)
|
||||
|
||||
---
|
||||
|
||||
## 2. Encoding Principles
|
||||
|
||||
1. **Little-endian** representation
|
||||
2. **Fixed-width fields** for deterministic access
|
||||
3. **No pointers or references**; all offsets are file-relative
|
||||
4. **Packed structures**; no compiler-introduced padding
|
||||
5. **Forward compatibility** via version field
|
||||
6. **CRC or checksum protection** for corruption detection
|
||||
7. **Federation metadata** embedded in index records for deterministic cross-domain replay
|
||||
|
||||
All multi-byte integers are little-endian unless explicitly noted.
|
||||
|
||||
---
|
||||
|
||||
## 3. Segment Layout
|
||||
|
||||
Each index segment file is laid out as follows:
|
||||
|
||||
```
|
||||
+------------------+
|
||||
| SegmentHeader |
|
||||
+------------------+
|
||||
| BloomFilter[] | (optional, opaque to semantics)
|
||||
+------------------+
|
||||
| IndexRecord[] |
|
||||
+------------------+
|
||||
| DigestBytes[] |
|
||||
+------------------+
|
||||
| ExtentRecord[] |
|
||||
+------------------+
|
||||
| SegmentFooter |
|
||||
+------------------+
|
||||
```
|
||||
|
||||
* **SegmentHeader**: fixed-size, mandatory
|
||||
* **BloomFilter**: optional, opaque, segment-local
|
||||
* **IndexRecord[]**: array of index entries
|
||||
* **DigestBytes[]**: concatenated digest bytes referenced by IndexRecord
|
||||
* **ExtentRecord[]**: concatenated extent lists referenced by IndexRecord
|
||||
* **SegmentFooter**: fixed-size, mandatory
|
||||
|
||||
Offsets in the header define the locations of the Bloom filter, index records, digest bytes, and extent records.
|
||||
|
||||
### 3.1 Fixed Constants and Sizes
|
||||
|
||||
**Magic bytes (SegmentHeader.magic):** `ASLIDX03`
|
||||
|
||||
* ASCII bytes: `0x41 0x53 0x4c 0x49 0x44 0x58 0x30 0x33`
|
||||
* Little-endian uint64 value: `0x33305844494c5341`
|
||||
|
||||
**Current encoding version:** `3`
|
||||
|
||||
**Fixed struct sizes (bytes):**
|
||||
|
||||
* `SegmentHeader`: 112
|
||||
* `IndexRecord`: 48
|
||||
* `ExtentRecord`: 16
|
||||
* `SegmentFooter`: 24
|
||||
|
||||
**Section packing (no gaps):**
|
||||
|
||||
* `records_offset = header_size + bloom_size`
|
||||
* `digests_offset = records_offset + (record_count * sizeof(IndexRecord))`
|
||||
* `extents_offset = digests_offset + digests_size`
|
||||
* `SegmentFooter` starts at `extents_offset + (extent_count * sizeof(ExtentRecord))`
|
||||
|
||||
All offsets MUST be file-relative, 8-byte aligned, and point to their respective arrays exactly as above.
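
The packing rules can be checked with a small calculation. The function name and the returned dictionary shape are illustrative; the sizes and formulas are the ones listed above.

```python
HEADER_SIZE = 112
INDEX_RECORD_SIZE = 48
EXTENT_RECORD_SIZE = 16
FOOTER_SIZE = 24


def segment_offsets(
    bloom_size: int, record_count: int, digests_size: int, extent_count: int
) -> dict:
    """Compute gap-free section offsets and verify 8-byte alignment."""
    records_offset = HEADER_SIZE + bloom_size
    digests_offset = records_offset + record_count * INDEX_RECORD_SIZE
    extents_offset = digests_offset + digests_size
    footer_offset = extents_offset + extent_count * EXTENT_RECORD_SIZE
    offsets = {
        "records_offset": records_offset,
        "digests_offset": digests_offset,
        "extents_offset": extents_offset,
    }
    for name, value in offsets.items():
        if value % 8 != 0:
            raise ValueError(f"{name}={value} is not 8-byte aligned")
    offsets["footer_offset"] = footer_offset
    offsets["file_size"] = footer_offset + FOOTER_SIZE
    return offsets
```

For example, a segment with no bloom filter, two records, 64 digest bytes, and three extents places its records at 112, digests at 208, extents at 272, and footer at 320, for a 344-byte file. Because the sections are packed without gaps, the alignment rule only holds when `bloom_size` and `digests_size` are multiples of 8.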
|
||||
|
||||
### 3.2 Federation Defaults
|
||||
|
||||
This encoding integrates federation metadata into segments and records.
|
||||
|
||||
Legacy segments without federation fields MUST be treated as:
|
||||
|
||||
* `segment_domain_id = local`
|
||||
* `segment_visibility = internal`
|
||||
* `domain_id = local`
|
||||
* `visibility = internal`
|
||||
* `has_cross_domain_source = 0`
|
||||
* `cross_domain_source = 0`
|
||||
|
||||
---
|
||||
|
||||
## 4. SegmentHeader
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
uint64_t magic; // Unique magic number identifying segment file type
|
||||
uint16_t version; // Encoding version
|
||||
uint16_t shard_id; // Optional shard identifier
|
||||
uint32_t header_size; // Total size of header including fields below
|
||||
|
||||
uint64_t snapshot_min; // Minimum snapshot ID for which segment entries are valid
|
||||
uint64_t snapshot_max; // Maximum snapshot ID
|
||||
|
||||
uint64_t record_count; // Number of index entries
|
||||
uint64_t records_offset; // File offset of IndexRecord array
|
||||
|
||||
uint64_t bloom_offset; // File offset of bloom filter (0 if none)
|
||||
uint64_t bloom_size; // Size of bloom filter (0 if none)
|
||||
|
||||
uint64_t digests_offset; // File offset of DigestBytes array
|
||||
uint64_t digests_size; // Total size in bytes of DigestBytes
|
||||
|
||||
uint64_t extents_offset; // File offset of ExtentRecord array
|
||||
uint64_t extent_count; // Total number of ExtentRecord entries
|
||||
|
||||
uint32_t segment_domain_id; // Domain owning this segment
|
||||
uint8_t segment_visibility; // 0 = internal, 1 = published
|
||||
uint8_t federation_version; // 0 if unused
|
||||
uint16_t reserved0; // Reserved (must be 0)
|
||||
|
||||
uint64_t flags; // Segment flags (must be 0 in version 3)
|
||||
} SegmentHeader;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
**Notes:**
|
||||
|
||||
* `magic` ensures the reader validates the segment type.
|
||||
* `version` allows forward-compatible extension.
|
||||
* `snapshot_min` / `snapshot_max` are reserved for future use and carry no visibility semantics in version 3.
|
||||
* `segment_domain_id` identifies the owning domain for all records in this segment.
|
||||
* `segment_visibility` MUST be the maximum visibility of all records in the segment.
|
||||
* `federation_version` MUST be `0` unless a future federation encoding version is defined.
|
||||
* `reserved0` MUST be `0`.
|
||||
* `header_size` MUST be `112`.
|
||||
* `flags` MUST be `0`. Readers MUST reject non-zero values.
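
For tooling outside C, the header can be read with Python's `struct` module. This is a sketch under the layout above: the format string mirrors the packed struct field by field, while the function name and the choice of `ValueError` are assumptions.

```python
import struct

# "<" = little-endian, no padding; fields follow the C struct order.
SEGMENT_HEADER = struct.Struct("<QHHI10QIBBHQ")
MAGIC = int.from_bytes(b"ASLIDX03", "little")

_FIELDS = (
    "magic", "version", "shard_id", "header_size",
    "snapshot_min", "snapshot_max", "record_count", "records_offset",
    "bloom_offset", "bloom_size", "digests_offset", "digests_size",
    "extents_offset", "extent_count",
    "segment_domain_id", "segment_visibility", "federation_version",
    "reserved0", "flags",
)


def parse_segment_header(buf: bytes) -> dict:
    """Decode and validate a version 3 SegmentHeader from the start of `buf`."""
    if len(buf) < SEGMENT_HEADER.size:
        raise ValueError("buffer shorter than SegmentHeader")
    hdr = dict(zip(_FIELDS, SEGMENT_HEADER.unpack_from(buf, 0)))
    if hdr["magic"] != MAGIC:
        raise ValueError("bad magic")
    if hdr["version"] != 3:
        raise ValueError("unsupported encoding version")
    if hdr["header_size"] != SEGMENT_HEADER.size:
        raise ValueError("header_size must be 112")
    if hdr["federation_version"] != 0 or hdr["reserved0"] != 0 or hdr["flags"] != 0:
        raise ValueError("reserved fields and flags must be 0")
    return hdr
```

`SEGMENT_HEADER.size` evaluates to 112, which gives a quick cross-check that the format string and the packed C struct agree.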
|
||||
|
||||
---
|
||||
|
||||
## 5. IndexRecord
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
uint32_t hash_id; // Hash algorithm identifier
|
||||
uint16_t digest_len; // Digest length in bytes
|
||||
uint16_t reserved0; // Reserved for alignment/future use
|
||||
uint64_t digest_offset; // File offset of digest bytes for this entry
|
||||
|
||||
uint64_t extents_offset; // File offset of first ExtentRecord for this entry
|
||||
uint32_t extent_count; // Number of ExtentRecord entries for this artifact
|
||||
uint32_t total_length; // Total artifact length in bytes
|
||||
|
||||
uint32_t domain_id; // Domain identifier for this artifact
|
||||
uint8_t visibility; // 0 = internal, 1 = published
|
||||
uint8_t has_cross_domain_source; // 0 or 1
|
||||
uint16_t reserved1; // Reserved (must be 0)
|
||||
|
||||
uint32_t cross_domain_source; // Source domain if imported (valid if has_cross_domain_source=1)
|
||||
uint32_t flags; // Optional flags (tombstone, reserved, etc.)
|
||||
} IndexRecord;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
**Notes:**
|
||||
|
||||
* `hash_id` + `digest_len` + `digest_offset` store the artifact key deterministically.
|
||||
* `digest_len` MUST be explicit in the encoding and MUST match the length implied by `hash_id` and StoreConfig.
|
||||
* `digest_offset` MUST be within `[digests_offset, digests_offset + digests_size)`.
|
||||
* `extents_offset` references the first ExtentRecord for this entry.
|
||||
* `extent_count` defines how many extents to read (may be 0 for tombstones; see ASL/1-CORE-INDEX in `tier1/asl-core-index-1.md`).
|
||||
* `total_length` is the exact artifact size in bytes.
|
||||
* Flags may indicate tombstone or other special status.
|
||||
* `domain_id` MUST be present and stable across replay.
|
||||
* `visibility` MUST be `0` or `1`.
|
||||
* `has_cross_domain_source` MUST be `0` or `1`.
|
||||
* `cross_domain_source` MUST be `0` when `has_cross_domain_source=0`.
|
||||
* `reserved0` and `reserved1` MUST be `0`.
|
||||
|
||||
### 5.1 IndexRecord Flags
|
||||
|
||||
```
|
||||
IDX_FLAG_TOMBSTONE = 0x00000001
|
||||
```
|
||||
|
||||
* If `IDX_FLAG_TOMBSTONE` is set, then `extent_count`, `total_length`, and `extents_offset` MUST be `0`.
|
||||
* All other bits are reserved and MUST be `0`. Readers MUST reject unknown flag bits.
|
||||
* Tombstones MUST retain valid `domain_id` and `visibility` to ensure domain-local shadowing.
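
The record layout and flag rules can be sketched together. The format string mirrors the packed `IndexRecord`; the function name and error type are assumptions, and the non-tombstone `extent_count > 0` check reflects the ExtentRecord notes in §6.

```python
import struct

INDEX_RECORD = struct.Struct("<IHHQQIIIBBHII")  # 48 bytes, little-endian, packed
IDX_FLAG_TOMBSTONE = 0x00000001

_FIELDS = (
    "hash_id", "digest_len", "reserved0", "digest_offset",
    "extents_offset", "extent_count", "total_length",
    "domain_id", "visibility", "has_cross_domain_source", "reserved1",
    "cross_domain_source", "flags",
)


def parse_index_record(buf: bytes, offset: int = 0) -> dict:
    """Decode one IndexRecord and enforce the field and flag constraints."""
    rec = dict(zip(_FIELDS, INDEX_RECORD.unpack_from(buf, offset)))
    if rec["reserved0"] != 0 or rec["reserved1"] != 0:
        raise ValueError("reserved fields must be 0")
    if rec["flags"] & ~IDX_FLAG_TOMBSTONE:
        raise ValueError("unknown flag bits")
    if rec["visibility"] not in (0, 1) or rec["has_cross_domain_source"] not in (0, 1):
        raise ValueError("visibility and has_cross_domain_source must be 0 or 1")
    if rec["has_cross_domain_source"] == 0 and rec["cross_domain_source"] != 0:
        raise ValueError("cross_domain_source must be 0 when not imported")
    if rec["flags"] & IDX_FLAG_TOMBSTONE:
        if rec["extent_count"] or rec["total_length"] or rec["extents_offset"]:
            raise ValueError("tombstone must not reference extents")
    elif rec["extent_count"] == 0:
        raise ValueError("non-tombstone entry needs at least one extent")
    return rec
```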
|
||||
|
||||
---
|
||||
|
||||
## 6. ExtentRecord
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
uint64_t block_id; // ASL block identifier
|
||||
uint32_t offset; // Offset within block
|
||||
uint32_t length; // Length of this extent
|
||||
} ExtentRecord;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
**Notes:**
|
||||
|
||||
* Extents are concatenated in order to produce artifact bytes.
|
||||
* `extent_count` MUST be > 0 for visible (non-tombstone) entries.
|
||||
* `total_length` MUST equal the sum of `length` across the extents.
|
||||
* `offset` and `length` MUST describe a contiguous slice within the referenced block.
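
The length rule in the notes above can be sketched as a small check over an extent list. This is an illustration only: `extents_match_total` is a hypothetical name, `total_length` is assumed to be 64-bit, and the struct is read through host-endian member access, which matches the on-disk bytes only on a little-endian host.

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push,1)
typedef struct {
    uint64_t block_id;  /* ASL block identifier */
    uint32_t offset;    /* Offset within block */
    uint32_t length;    /* Length of this extent */
} ExtentRecord;
#pragma pack(pop)

/* Returns 1 if a non-tombstone entry's extents add up to total_length.
 * Hypothetical helper; tombstones (extent_count == 0) are not handled here. */
int extents_match_total(const ExtentRecord *extents, uint32_t extent_count,
                        uint64_t total_length)
{
    if (extents == NULL || extent_count == 0) {
        return 0;
    }
    uint64_t sum = 0;
    for (uint32_t i = 0; i < extent_count; i++) {
        /* At most 2^32 extents of at most 2^32 bytes each: no 64-bit overflow. */
        sum += extents[i].length;
    }
    return sum == total_length;
}
```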
|
||||
|
||||
---
|
||||
|
||||
## 7. SegmentFooter
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
uint64_t crc64; // CRC over header + bloom filter + index records + digest bytes + extents
|
||||
uint64_t seal_snapshot; // Snapshot ID when segment was sealed
|
||||
uint64_t seal_time_ns; // High-resolution seal timestamp
|
||||
} SegmentFooter;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
**Notes:**
|
||||
|
||||
* CRC ensures corruption detection during reads, covering all segment contents except the footer.
|
||||
* Seal information allows deterministic reconstruction of CURRENT state.
|
||||
|
||||
---
|
||||
|
||||
## 8. DigestBytes
|
||||
|
||||
* Digest bytes are concatenated in a single byte array.
|
||||
* Each IndexRecord references its digest via `digest_offset` and `digest_len`.
|
||||
* The digest bytes MUST be immutable once the segment is sealed.
|
||||
|
||||
---
|
||||
|
||||
## 9. Bloom Filter
|
||||
|
||||
* The bloom filter is **optional** and opaque to semantics.
|
||||
* Its purpose is **lookup acceleration**.
|
||||
* The filter MUST be deterministic: the same entries produce the same bloom representation.
|
||||
* Segment-local only; no global assumptions.
|
||||
|
||||
---
|
||||
|
||||
## 10. Versioning and Compatibility
|
||||
|
||||
* The `version` field in the segment header identifies the encoding version.
|
||||
* Readers must **reject unsupported versions**.
|
||||
* New fields may be added in future versions only via version bump.
|
||||
* Existing fields must **never change meaning**.
|
||||
* Version `1` implies single-extent layout (legacy).
|
||||
* Version `2` introduces `ExtentRecord` lists and `extents_offset` / `extent_count`.
|
||||
* Version `3` introduces variable-length digest bytes with `hash_id` and `digest_offset`.
|
||||
* Version `3` also integrates federation metadata in segment headers and index records.
|
||||
|
||||
### 10.1 Federation Compatibility Rules
|
||||
|
||||
* Legacy segments without federation fields are treated as local/internal (see 3.2).
|
||||
* Tombstones MUST NOT shadow artifacts from other domains; domain matching is required.
|
||||
|
||||
---
|
||||
|
||||
## 11. Alignment and Packing
|
||||
|
||||
* All structures are **packed** (no compiler padding).
|
||||
* Multi-byte integers are **little-endian**.
|
||||
* Memory-mapped readers can directly index `IndexRecord[]` using `records_offset`.
|
||||
* Extents are accessed via `IndexRecord.extents_offset` relative to the file base.
|
||||
|
||||
---
|
||||
|
||||
## 12. Summary of Encoding Guarantees
|
||||
|
||||
The `ENC/ASL-CORE-INDEX/1` specification ensures:
|
||||
|
||||
1. **Deterministic layout** across platforms
|
||||
2. **Direct mapping from semantic model** (ArtifactKey → ArtifactLocation)
|
||||
3. **Immutability of sealed segments**
|
||||
4. **Integrity validation** via CRC
|
||||
5. **Forward-compatible extensibility**
|
||||
|
||||
---
|
||||
|
||||
## 13. Relationship to Other Layers
|
||||
|
||||
| Layer | Responsibility |
|
||||
| ------------------ | ---------------------------------------------------------- |
|
||||
| ASL/1-CORE-INDEX | Defines semantic meaning of artifact → location mapping |
|
||||
| ASL-STORE-INDEX | Defines lifecycle, visibility, and replay contracts |
|
||||
| ASL/INDEX-ACCEL/1 | Defines routing, filters, sharding (observationally inert) |
|
||||
| ENC-ASL-CORE-INDEX | Defines exact bytes-on-disk format for segment persistence |
|
||||
|
||||
This completes the stack: **semantics → store behavior → encoding**.
|
||||
248
tier1/enc-asl-log-1.md
Normal file
|
|
@ -0,0 +1,248 @@
|
|||
# ENC/ASL-LOG/1 — Encoding Specification for ASL Append-Only Log
|
||||
|
||||
Status: Draft
|
||||
Owner: Niklas Rydberg
|
||||
Version: 0.1.0
|
||||
SoT: No
|
||||
Last Updated: 2025-11-16
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [encoding, log, deterministic]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/enc-asl-log.md | Canonical: /amduat/tier1/enc-asl-log-1.md -->
|
||||
|
||||
**Document ID:** `ENC/ASL-LOG/1`
|
||||
**Layer:** Log Encoding Profile (on top of ASL/LOG/1)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/LOG/1` — semantic log behavior and replay rules
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ASL/STORE-INDEX/1` — store lifecycle and replay contracts
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
This document defines the **exact encoding** of the ASL append-only log.
|
||||
|
||||
It translates **ASL/LOG/1** semantics into a deterministic **bytes-on-disk** format.
|
||||
|
||||
It does **not** define log semantics (see `ASL/LOG/1`).
|
||||
|
||||
---
|
||||
|
||||
## 2. Encoding Principles
|
||||
|
||||
1. **Little-endian** integers
|
||||
2. **Packed structures** (no compiler padding)
|
||||
3. **Forward-compatible** versioning via header fields
|
||||
4. **Deterministic serialization**: identical log content -> identical bytes
|
||||
5. **Hash-chained integrity** as defined by ASL/LOG/1
|
||||
|
||||
---
|
||||
|
||||
## 3. Log File Layout
|
||||
|
||||
```
|
||||
+----------------+
|
||||
| LogHeader |
|
||||
+----------------+
|
||||
| LogRecord[] |
|
||||
+----------------+
|
||||
```
|
||||
|
||||
* **LogHeader**: fixed-size, mandatory, begins file
|
||||
* **LogRecord[]**: append-only entries, variable number
|
||||
|
||||
---
|
||||
|
||||
## 4. LogHeader
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
uint64_t magic; // "ASLLOG01"
|
||||
uint32_t version; // Encoding version (1)
|
||||
uint32_t header_size; // Total header bytes including this struct
|
||||
uint64_t flags; // Reserved, must be zero for v1
|
||||
} LogHeader;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
Notes:
|
||||
|
||||
* `magic` is ASCII bytes: `0x41 0x53 0x4c 0x4c 0x4f 0x47 0x30 0x31`
|
||||
* `version` identifies the encoding version, so readers can detect layouts they do not support
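
A reader-side check of the header might look like the sketch below. It is not normative: `log_header_check` and its negative return codes are hypothetical, and it assumes the `magic` bytes appear on disk in the order listed in the notes above. The fixed struct is 24 bytes (8 + 4 + 4 + 8).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOG_HEADER_MIN_SIZE 24u

/* Decode little-endian integers byte by byte, independent of host order. */
static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t rd_le64(const uint8_t *p)
{
    return (uint64_t)rd_le32(p) | ((uint64_t)rd_le32(p + 4) << 32);
}

/* Returns 0 if buf starts with a valid v1 LogHeader, else a negative code.
 * The codes are illustrative, not defined by this specification. */
int log_header_check(const uint8_t *buf, size_t len)
{
    static const uint8_t magic[8] = {0x41, 0x53, 0x4c, 0x4c,
                                     0x4f, 0x47, 0x30, 0x31}; /* "ASLLOG01" */
    if (len < LOG_HEADER_MIN_SIZE) return -1;        /* too short */
    if (memcmp(buf, magic, sizeof magic) != 0) return -2;
    if (rd_le32(buf + 8) != 1) return -3;            /* unsupported version */
    uint32_t header_size = rd_le32(buf + 12);
    if (header_size < LOG_HEADER_MIN_SIZE || header_size > len) return -4;
    if (rd_le64(buf + 16) != 0) return -5;           /* flags reserved in v1 */
    return 0;
}
```

On success, the first `LogRecord` would start at byte offset `header_size`.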
|
||||
|
||||
---
|
||||
|
||||
## 5. LogRecord Envelope
|
||||
|
||||
Each record is encoded as:
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
uint64_t logseq; // Monotonic sequence number
|
||||
uint32_t record_type; // Record type tag
|
||||
uint32_t payload_len; // Payload byte length
|
||||
uint8_t payload[payload_len]; // Variable-length payload (pseudo-C; size given by payload_len)
|
||||
uint8_t record_hash[32]; // Hash-chained integrity (SHA-256)
|
||||
} LogRecord;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
Hash chain rule (normative):
|
||||
|
||||
```
|
||||
record_hash = H(prev_record_hash || logseq || record_type || payload_len || payload)
|
||||
```
|
||||
|
||||
* `prev_record_hash` is the previous record's `record_hash`
|
||||
* For the first record, `prev_record_hash` is 32 bytes of zero
|
||||
* `H` is SHA-256 for v1
|
||||
|
||||
Readers MUST skip unknown `record_type` values using `payload_len` and MUST
|
||||
continue replay without failure.
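
The skip rule can be sketched as a scan over the record area that steps from one envelope to the next using `payload_len`. This is an illustration under assumptions: `log_scan` is a hypothetical name, hash-chain verification is left out, and the set of known types is taken from the v1 type table in section 6.

```c
#include <stddef.h>
#include <stdint.h>

#define LOG_REC_PREFIX 16u  /* logseq (8) + record_type (4) + payload_len (4) */
#define LOG_REC_HASH   32u  /* trailing record_hash */

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* v1 record types; anything else is skipped, not treated as an error. */
static int is_known_type(uint32_t t)
{
    switch (t) {
    case 0x01: case 0x10: case 0x11:
    case 0x20: case 0x30: case 0x31:
        return 1;
    default:
        return 0;
    }
}

/* Walks recs[0..len) and counts known and skipped records.
 * Returns 0 on success, -1 if the last record is truncated. */
int log_scan(const uint8_t *recs, size_t len, size_t *known, size_t *skipped)
{
    size_t pos = 0;
    *known = 0;
    *skipped = 0;
    while (pos < len) {
        if (len - pos < LOG_REC_PREFIX) return -1;
        uint32_t type = rd_le32(recs + pos + 8);
        uint32_t payload_len = rd_le32(recs + pos + 12);
        size_t remaining = len - pos - LOG_REC_PREFIX;
        if (remaining < LOG_REC_HASH || remaining - LOG_REC_HASH < payload_len) {
            return -1;
        }
        if (is_known_type(type)) {
            (*known)++;
        } else {
            (*skipped)++;   /* unknown type: step over it and keep replaying */
        }
        pos += LOG_REC_PREFIX + (size_t)payload_len + LOG_REC_HASH;
    }
    return 0;
}
```

A full reader would also recompute `record_hash` for every record, including skipped ones, since the chain covers all records regardless of type.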
|
||||
|
||||
---
|
||||
|
||||
## 6. Record Type IDs (v1)
|
||||
|
||||
These type IDs bind the ASL/LOG/1 semantics to bytes-on-disk:
|
||||
|
||||
| Type ID | Record Type |
|
||||
| ------- | ------------------ |
|
||||
| 0x01 | SEGMENT_SEAL |
|
||||
| 0x10 | TOMBSTONE |
|
||||
| 0x11 | TOMBSTONE_LIFT |
|
||||
| 0x20 | SNAPSHOT_ANCHOR |
|
||||
| 0x30 | ARTIFACT_PUBLISH |
|
||||
| 0x31 | ARTIFACT_UNPUBLISH |
|
||||
|
||||
---
|
||||
|
||||
## 6.1 Payload Schemas (v1)
|
||||
|
||||
All payloads are little-endian and packed. Variable-length fields are encoded
|
||||
inline and accounted for by `payload_len`.
|
||||
|
||||
### 6.1.1 ArtifactRef
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
uint32_t hash_id; // Hash algorithm identifier
|
||||
uint16_t digest_len; // Digest length in bytes
|
||||
uint16_t reserved0; // Must be 0
|
||||
uint8_t digest[digest_len]; // Variable-length digest (pseudo-C; size given by digest_len)
|
||||
} ArtifactRef;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
Notes:
|
||||
|
||||
* `digest_len` MUST be > 0.
|
||||
* If StoreConfig fixes the hash, `digest_len` MUST match that hash's length.
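
Because `ArtifactRef` is variable-length, decoders usually parse it from a byte buffer instead of casting to a struct. The sketch below shows one way to do that; `ArtifactRefView` and `artifact_ref_parse` are hypothetical names, and rejecting a nonzero `reserved0` is an assumed reader policy. The StoreConfig length check is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded view; digest points into the caller's buffer (no copy). */
typedef struct {
    uint32_t hash_id;
    uint16_t digest_len;
    const uint8_t *digest;
} ArtifactRefView;

/* Parses one ArtifactRef from buf. Returns the number of bytes consumed
 * (8 + digest_len), or 0 if the encoding is invalid or truncated. */
size_t artifact_ref_parse(const uint8_t *buf, size_t len, ArtifactRefView *out)
{
    if (len < 8) return 0;
    uint32_t hash_id = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
                       ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
    uint16_t digest_len = (uint16_t)(buf[4] | (buf[5] << 8));
    uint16_t reserved0 = (uint16_t)(buf[6] | (buf[7] << 8));
    if (digest_len == 0) return 0;          /* digest_len MUST be > 0 */
    if (reserved0 != 0) return 0;           /* assumed: reject nonzero reserved */
    if (len - 8 < digest_len) return 0;     /* digest runs past the buffer */
    out->hash_id = hash_id;
    out->digest_len = digest_len;
    out->digest = buf + 8;
    return 8u + (size_t)digest_len;
}
```

The returned size lets payload decoders such as `TombstonePayload` locate the fields that follow the embedded reference.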
|
||||
|
||||
### 6.1.2 SEGMENT_SEAL (Type 0x01)
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
uint64_t segment_id; // Store-local segment identifier
|
||||
uint8_t segment_hash[32]; // SHA-256 over the segment file bytes
|
||||
} SegmentSealPayload;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
### 6.1.3 TOMBSTONE (Type 0x10)
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
ArtifactRef artifact;
|
||||
uint32_t scope; // Opaque to ASL/LOG/1
|
||||
uint32_t reason_code; // Opaque to ASL/LOG/1
|
||||
} TombstonePayload;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
### 6.1.4 TOMBSTONE_LIFT (Type 0x11)
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
ArtifactRef artifact;
|
||||
uint64_t tombstone_logseq; // logseq of the tombstone being lifted
|
||||
} TombstoneLiftPayload;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
### 6.1.5 SNAPSHOT_ANCHOR (Type 0x20)
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
uint64_t snapshot_id;
|
||||
uint8_t root_hash[32]; // Hash of snapshot-visible state
|
||||
} SnapshotAnchorPayload;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
### 6.1.6 ARTIFACT_PUBLISH (Type 0x30)
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
ArtifactRef artifact;
|
||||
} ArtifactPublishPayload;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
### 6.1.7 ARTIFACT_UNPUBLISH (Type 0x31)
|
||||
|
||||
```c
|
||||
#pragma pack(push,1)
|
||||
typedef struct {
|
||||
ArtifactRef artifact;
|
||||
} ArtifactUnpublishPayload;
|
||||
#pragma pack(pop)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Versioning Rules
|
||||
|
||||
* `version = 1` for this specification.
|
||||
* New record types MAY be added without bumping the version.
|
||||
* Layout changes to `LogHeader` or `LogRecord` require a new version.
|
||||
|
||||
---
|
||||
|
||||
## 8. Relationship to Other Layers
|
||||
|
||||
| Layer | Responsibility |
|
||||
| ---------------- | ------------------------------------------------ |
|
||||
| ASL/LOG/1 | Semantic log behavior and replay rules |
|
||||
| ASL-STORE-INDEX | Store lifecycle and snapshot/log contracts |
|
||||
| ENC-ASL-LOG | Exact byte layout for log encoding (this doc) |
|
||||
| ENC-ASL-CORE-INDEX | Exact byte layout for index segments |
|
||||
202
tier1/enc-asl-tgk-exec-plan-1.md
Normal file
|
|
@ -0,0 +1,202 @@
|
|||
# ENC/ASL-TGK-EXEC-PLAN/1 — Execution Plan Encoding
|
||||
|
||||
Status: Draft
|
||||
Owner: Architecture
|
||||
Version: 0.1.0
|
||||
SoT: No
|
||||
Last Updated: 2025-01-17
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [encoding, execution, tgk]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/enc-asl-tgk-exec-plan-1.md | Canonical: /amduat/tier1/enc-asl-tgk-exec-plan-1.md -->
|
||||
|
||||
**Document ID:** `ENC/ASL-TGK-EXEC-PLAN/1`
|
||||
**Layer:** L2 — Execution plan encoding (bytes-on-disk)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/TGK-EXEC-PLAN/1`
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ENC/ASL-CORE-INDEX/1`
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 0. Conventions
|
||||
|
||||
The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.
|
||||
|
||||
ENC/ASL-TGK-EXEC-PLAN/1 defines the byte-level encoding for serialized execution plans. It does not define operator semantics.
|
||||
|
||||
---
|
||||
|
||||
## 1. Operator Type Enumeration
|
||||
|
||||
```c
|
||||
typedef enum {
|
||||
OP_SEGMENT_SCAN,
|
||||
OP_INDEX_FILTER,
|
||||
OP_MERGE,
|
||||
OP_PROJECTION,
|
||||
OP_TGK_TRAVERSAL,
|
||||
OP_AGGREGATION,
|
||||
OP_LIMIT_OFFSET,
|
||||
OP_SHARD_DISPATCH,
|
||||
OP_SIMD_FILTER,
|
||||
OP_TOMBSTONE_SHADOW
|
||||
} operator_type_t;
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Operator Flags
|
||||
|
||||
```c
|
||||
typedef enum {
|
||||
OP_FLAG_NONE = 0x00,
|
||||
OP_FLAG_PARALLEL = 0x01, // shard or SIMD capable
|
||||
OP_FLAG_OPTIONAL = 0x02 // optional operator (acceleration)
|
||||
} operator_flags_t;
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Snapshot Range Structure
|
||||
|
||||
```c
|
||||
typedef struct {
|
||||
uint64_t logseq_min; // inclusive
|
||||
uint64_t logseq_max; // inclusive
|
||||
} snapshot_range_t;
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Operator Parameter Union
|
||||
|
||||
```c
|
||||
typedef struct {
|
||||
// SegmentScan parameters
|
||||
struct {
|
||||
uint8_t is_asl_segment; // 1 = ASL, 0 = TGK
|
||||
uint64_t segment_start_id;
|
||||
uint64_t segment_end_id;
|
||||
} segment_scan;
|
||||
|
||||
// IndexFilter parameters
|
||||
struct {
|
||||
uint32_t artifact_type_tag;
|
||||
uint8_t has_type_tag;
|
||||
uint32_t edge_type_key;
|
||||
uint8_t has_edge_type;
|
||||
uint8_t role; // 0=none, 1=from, 2=to, 3=both
|
||||
} index_filter;
|
||||
|
||||
// Merge parameters
|
||||
struct {
|
||||
uint8_t deterministic; // 1 = logseq ascending + canonical key
|
||||
} merge;
|
||||
|
||||
// Projection parameters
|
||||
struct {
|
||||
uint8_t project_artifact_id;
|
||||
uint8_t project_tgk_edge_id;
|
||||
uint8_t project_node_id;
|
||||
uint8_t project_type_tag;
|
||||
} projection;
|
||||
|
||||
// TGKTraversal parameters
|
||||
struct {
|
||||
uint64_t start_node_id;
|
||||
uint32_t traversal_depth;
|
||||
uint8_t direction; // 1=from, 2=to, 3=both
|
||||
} tgk_traversal;
|
||||
|
||||
// Aggregation parameters
|
||||
struct {
|
||||
uint8_t agg_count;
|
||||
uint8_t agg_union;
|
||||
uint8_t agg_sum;
|
||||
} aggregation;
|
||||
|
||||
// LimitOffset parameters
|
||||
struct {
|
||||
uint64_t limit;
|
||||
uint64_t offset;
|
||||
} limit_offset;
|
||||
|
||||
// ShardDispatch & SIMDFilter are handled via flags
|
||||
} operator_params_t;
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Operator Definition Structure
|
||||
|
||||
```c
|
||||
typedef struct operator_def {
|
||||
uint32_t op_id; // unique operator ID
|
||||
operator_type_t op_type; // operator type
|
||||
operator_flags_t flags; // parallel/optional flags
|
||||
snapshot_range_t snapshot; // snapshot bounds for deterministic execution
|
||||
operator_params_t params; // operator-specific parameters
|
||||
|
||||
uint32_t input_count; // number of upstream operators
|
||||
uint32_t inputs[8]; // list of op_ids for input edges (DAG)
|
||||
} operator_def_t;
|
||||
```
|
||||
|
||||
Notes:
|
||||
|
||||
* `inputs` defines DAG dependencies.
|
||||
* The maximum input fan-in is 8 for v1.
|
||||
|
||||
---
|
||||
|
||||
## 6. Execution Plan Structure
|
||||
|
||||
```c
|
||||
typedef struct exec_plan {
|
||||
uint32_t plan_version; // version of plan encoding
|
||||
uint32_t operator_count; // number of operators
|
||||
operator_def_t *operators; // array of operator definitions
|
||||
} exec_plan_t;
|
||||
```
|
||||
|
||||
Operators SHOULD be serialized in topological order when possible.
|
||||
|
||||
---
|
||||
|
||||
## 7. Serialization Rules (Normative)
|
||||
|
||||
* All integers are little-endian.
|
||||
* Operators MUST be serialized in a deterministic order.
|
||||
* `operator_count` MUST match the serialized operator array length.
|
||||
* `inputs[]` MUST reference valid `op_id` values within the plan.
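
The reference rule for `inputs[]` can be sketched as a validation pass over a decoded plan. This is a simplified illustration: `plan_op_t` is a hypothetical trimmed view of `operator_def_t` that keeps only the fields needed here, and `plan_inputs_valid` is not a name defined by this specification.

```c
#include <stddef.h>
#include <stdint.h>

#define PLAN_MAX_INPUTS 8u  /* maximum fan-in for v1 */

/* Hypothetical trimmed view of operator_def_t. */
typedef struct {
    uint32_t op_id;
    uint32_t input_count;
    uint32_t inputs[PLAN_MAX_INPUTS];
} plan_op_t;

static int plan_has_op(const plan_op_t *ops, uint32_t n, uint32_t id)
{
    for (uint32_t i = 0; i < n; i++) {
        if (ops[i].op_id == id) return 1;
    }
    return 0;
}

/* Returns 1 if op_ids are unique, fan-in is within bounds, and every
 * input references an op_id present in the plan; otherwise 0. */
int plan_inputs_valid(const plan_op_t *ops, uint32_t n)
{
    for (uint32_t i = 0; i < n; i++) {
        if (ops[i].input_count > PLAN_MAX_INPUTS) return 0;
        for (uint32_t j = 0; j < i; j++) {
            if (ops[j].op_id == ops[i].op_id) return 0;  /* duplicate op_id */
        }
        for (uint32_t k = 0; k < ops[i].input_count; k++) {
            if (!plan_has_op(ops, n, ops[i].inputs[k])) return 0;
        }
    }
    return 1;
}
```

The quadratic scan keeps the sketch short; it does not check for cycles, which a complete validator of the DAG would also need to do.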
|
||||
|
||||
---
|
||||
|
||||
## 8. Non-Goals
|
||||
|
||||
ENC/ASL-TGK-EXEC-PLAN/1 does not define:
|
||||
|
||||
* Runtime scheduling or execution
|
||||
* Query languages or APIs
|
||||
* Operator semantics beyond parameter layout
|
||||
554
tier1/srs.md
Normal file
|
|
@ -0,0 +1,554 @@
|
|||
# AMDUAT-SRS — Detailed Requirements Specification
|
||||
|
||||
Status: Approved
|
||||
Owner: Niklas Rydberg
|
||||
Version: 0.4.0
|
||||
SoT: Yes
|
||||
Last Updated: 2025-11-11
|
||||
Linked Phase Pack: PH01
|
||||
Tags: [requirements, cas, kheper]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/srs.md | Canonical: /amduat/tier1/srs.md -->
|
||||
|
||||
**Document ID:** `AMDUAT-SRS`
|
||||
**Layer:** L0 — Requirements baseline (CAS + deterministic composition)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* None (requirements baseline)
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `AMDUAT-DDS` — byte-level design specification
|
||||
* ADR-006 — deterministic error semantics
|
||||
* ADR-015 — CAS rejection matrix alignment
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
> **Purpose:** Capture normative behavioural requirements for Phase PH01 (Kheper) and beyond. Long-lived semantics live here (not in Phase Packs).
|
||||
|
||||
---
|
||||
|
||||
## 1. Objectives (from Tier-0 Charter; elaborated)
|
||||
|
||||
* Deterministic addressing: identical payload bytes **MUST** yield identical CIDs.
|
||||
* Immutability: new bytes → new CID; objects MUST NOT be mutated in place.
|
||||
* Integrity by design: `verify()` MUST detect corruption; zero false positives.
|
||||
* Instance isolation: storage layout and runtime state are implementation detail.
|
||||
* Binary canonical substrate: COR/1 is the normative import/export envelope.
|
||||
* Instance identity: ICD/1 defines stable `instance_id` for future transaction bindings.
|
||||
* Crypto agility: default SHA-256; algorithm IDs extensible.
|
||||
* Minimal tooling: reference CLI (`amduatcas`) and C library.
|
||||
* Conformance: golden vectors and cross-impl CI enforce byte-identity.
|
||||
|
||||
---
|
||||
|
||||
## 2. Scope (Behavioural)
|
||||
|
||||
### 2.1 In Scope
|
||||
|
||||
* Local, single-node Content-Addressable Storage (CAS)
|
||||
* Deterministic hashing with domain separation
|
||||
* Canonical envelopes (COR/1) and instance descriptor (ICD/1)
|
||||
* CRUD-adjacent operations: put/get/stat/exists/verify
|
||||
* Import/export of canonical bytestreams
|
||||
* Optional listing/gc semantics
|
||||
|
||||
### 2.2 Out of Scope (for PH01)
|
||||
|
||||
* Networking, replication, consensus
|
||||
* Multi-object transactions
|
||||
* Semantic/provenance graphing
|
||||
* Encryption/ACLs (layer externally)
|
||||
|
||||
---
|
||||
|
||||
## 3. Functional Requirements
|
||||
|
||||
### FR-001 Deterministic CID Production
|
||||
|
||||
Given identical payload bytes and algo_id, the CID **MUST** match across compliant implementations.
|
||||
|
||||
### FR-002 Immutability
|
||||
|
||||
Objects **MUST NOT** be mutated; new payload → new CID.
|
||||
|
||||
### FR-003 Idempotent Put
|
||||
|
||||
Concurrent `put()` calls with identical payloads MUST yield exactly one canonical object, and object integrity MUST be preserved.
|
||||
|
||||
### FR-004 Verification
|
||||
|
||||
`verify(CID)` MUST recompute the CID and detect corruption; zero false positives.
|
||||
|
||||
### FR-005 Import/Export Canonicality
|
||||
|
||||
Importing COR/1 and then exporting it MUST yield byte-identical bytestreams.
|
||||
|
||||
### FR-006 Size Validation
|
||||
|
||||
`get()` MUST validate payload length according to COR/1.
|
||||
|
||||
### FR-007 Optional Verify-on-Read Policy
|
||||
|
||||
A policy MAY require `verify()` on cold reads; when verify-on-read is disabled, reads MUST NOT corrupt the payload.
|
||||
|
||||
### FR-008 Canonical Rejection
|
||||
|
||||
CAS decoders MUST reject:
|
||||
|
||||
* out-of-order TLV tags
|
||||
* duplicate TLV tags
|
||||
* extraneous tags
|
||||
* trailing bytes
|
||||
* malformed or over-long VARINT encodings
|
||||
* payload length mismatches
|
||||
|
||||
Rejection MUST be deterministic and symbolic.
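
As an illustration of deterministic, symbolic rejection, the first two cases in the list above can be covered by a single ordering pass. The sketch assumes that canonical order means strictly ascending numeric tag values and that tags have already been decoded into an array; both are assumptions, as are the `tlv_order_result` codes and the function name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbolic results; the real error symbols are defined elsewhere. */
typedef enum {
    TLV_OK = 0,
    TLV_ERR_ORDER,
    TLV_ERR_DUPLICATE
} tlv_order_result;

/* Checks that tags are strictly ascending. The same input always yields
 * the same result, with no attempt to repair or reorder the envelope. */
tlv_order_result tlv_check_tag_order(const uint32_t *tags, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (tags[i] == tags[i - 1]) return TLV_ERR_DUPLICATE;
        if (tags[i] < tags[i - 1]) return TLV_ERR_ORDER;
    }
    return TLV_OK;
}
```

A duplicate that is not adjacent to its twin (for example `1, 2, 1`) is reported as an ordering error; the envelope is rejected either way.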
|
||||
|
||||
### FR-009 Concurrency Discipline
|
||||
|
||||
Concurrent `put()` operations for identical payloads MUST NOT yield divergent COR/1 envelopes. Only one canonical envelope may result.
|
||||
|
||||
### FR-010 Raw Byte Semantics
|
||||
|
||||
CAS MUST operate strictly over exact payload bytes. No normalization (newline, whitespace, UTF-8 interpretation, or Unicode equivalence) SHALL occur.
|
||||
|
||||
### FR-011 Filesystem Independence
|
||||
|
||||
Consensus behaviour MUST NOT depend on:
|
||||
|
||||
* directory entry ordering
|
||||
* timestamp metadata
|
||||
* filesystem case sensitivity
|
||||
* locale or regional configuration
|
||||
|
||||
### FR-012 Deterministic Failure
|
||||
|
||||
Malformed objects MUST be rejected. CAS MUST NOT auto-repair or normalize COR/1 envelopes.
|
||||
|
||||
### FR-013 Resource Boundaries
|
||||
|
||||
Resource exhaustion (disk full, allocation failure) MUST fail atomically and leave no partial objects visible.
|
||||
|
||||
### FR-014 FCS/1 Descriptor Determinism (v1-min)
|
||||
|
||||
Composite and custom functions MUST be expressed as canonical **FCS/1** descriptors that contain **only the execution recipe**:
|
||||
`function_ptr`, `parameter_block (PCB1)`, and `arity`.
|
||||
Identical descriptors SHALL hash to identical CIDs and MUST remain immutable after publication. **No policy/intent/notes** appear in FCS/1.
|
||||
|
||||
### FR-015 Registry Determinism (Descriptor Admission)
|
||||
|
||||
Functional registries MUST admit **only canonical FCS/1 descriptors** (per FR-014) and enforce descriptor validation (TLV order, PCB1 arity, acyclicity).
|
||||
Registries MUST NOT infer or embed policy/intent into descriptors; publication governance is handled at certification time (FR-017).
|
||||
|
||||
### FR-016 Evaluation Receipt Integrity (FER/1)
|
||||
|
||||
Every execution of a composite function under curated or locked policies MUST emit a **FER/1** receipt. The receipt SHALL encode, in canonical TLV order, at least the following evidence:
|
||||
|
||||
1. `function_cid` → evaluated FCS/1 descriptor (v1-min) preserving CIP indirection.
|
||||
2. `input_manifest` → GS/1 BCF/1 set of consumed input CIDs (deduped and byte-lexicographic).
|
||||
3. `environment` → ICD/1 (or PH03 env capsule) snapshot pinning toolchain/runtime state.
|
||||
4. `evaluator_id` → stable evaluator identity bytes.
|
||||
5. `executor_set` → implementations that executed the recipe, keyed in canonical byte order.
|
||||
6. `parity_vector` → per-executor digests with matching `executor` ordering, shared `output` (`== output_cid`), and `sbom_cid` entries.
|
||||
7. `executor_fingerprint` + `run_id` → optional SBOM fingerprint CID and deterministic dedup hash (`H("AMDUAT:RUN\0" || function || manifest || env || fingerprint)`).
|
||||
8. `logs` → typed evidence capsules binding `kind`, `cid`, and `sha256` for stdout/stderr/metrics traces.
|
||||
9. `limits` → declared execution envelope (`cpu_ms`, `wall_ms`, `max_rss_kib`, `io_reads`, `io_writes`).
|
||||
10. `determinism_level` / `rng_seed` → declared determinism class (`D1_bit_exact` default, `D2_numeric_stable` requires a 0–32 byte seed).
|
||||
11. `output_cid` → single canonical output CID for the run.
|
||||
12. `started_at` / `completed_at` → epoch-second timestamps satisfying FR-020 bounds.
|
||||
13. `signature` → Ed25519 metadata verifying `H("AMDUAT:FER\0" || canonical bytes)`.
|
||||
|
||||
Receipts MAY include optional `logs` (typed capsules), `context`, `witnesses`, `parent`, and `signature_ext` TLVs but MUST NOT leak policy/intent (those belong to FCT/1).
|
||||
|
||||
From Phase 04 onwards, governance and runtime layers MUST require FER/1 v1.1 receipts; ER/1 artefacts remain valid only as historical evidence and SHALL NOT satisfy FR-016 compliance gates.
|
||||
|
||||
Parity discipline is mandatory: unsorted executor keys or mismatched parity orderings SHALL raise `ERR_IMPL_PARITY_ORDER`; divergent outputs or missing executors SHALL raise `ERR_IMPL_PARITY`. Unknown TLVs or cardinality violations SHALL raise `ERR_FER_UNKNOWN_TAG`. GS/1 manifest violations emit `ERR_FER_INPUT_MANIFEST_SHAPE`; missing RNG seed when determinism ≠ D1 emits `ERR_FER_RNG_REQUIRED`. All signatures MUST verify against the domain-separated hash (`ERR_FER_SIGNATURE` on failure).
|
||||
|
||||
### FR-017 Certification Transactions (FCT/1: Policy & Intent)
|
||||
|
||||
Certification events MUST be recorded as **FCT/1** transactions that aggregate one or more FER/1 receipts and bind **registry policy, intent, domain scope, and authority role**.
|
||||
Transactions MUST include attestations whenever `registry_policy != 0` and SHALL expose publication pointers when federated.
|
||||
**All intent/scope/role/authority metadata lives in FCT/1 (not in FCS/1).**
|
||||
|
||||
### FR-BS-001 ByteStore Deterministic Identity
|
||||
|
||||
ByteStore SHALL derive CIDs using the canonical CAS domain separator: `CID = algo || H("CAS:OBJ\0" || payload)`.
|
||||
The derived CID returned by `put()` and `import_cor()` MUST match the CID embedded in COR/1 envelopes and SHALL remain stable across runs, implementations, and ingest modes (DDS §11.2; ADR-030).
|
||||
|
||||
### FR-BS-002 Atomic Durability Ladder
|
||||
|
||||
ByteStore persistence MUST follow the atomic write ladder: write → `fsync(tmp)` → `rename` → `fsync(shard)` → `fsync(root)`.
|
||||
Crash-window simulations triggered via `AMDUAT_BYTESTORE_CRASH_STEP` MUST leave the public area consistent upon recovery, with no visible partial objects (DDS §11.4; ADR-030; evidence PH05-EV-BS-001).
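
The ladder can be sketched with POSIX calls as follows. This is a minimal illustration, not the ByteStore implementation: `atomic_put`, the `<root>/<shard>/<name>` layout, and the `.tmp` suffix are all assumptions, and crash-step injection is omitted.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* fsync a directory so that a rename inside it becomes durable. */
static int fsync_dir(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    int rc = fsync(fd);
    close(fd);
    return rc;
}

/* Write the whole buffer, looping over short writes. */
static int write_all(int fd, const unsigned char *p, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: write -> fsync(tmp) -> rename -> fsync(shard) -> fsync(root).
 * Returns 0 on success, -1 on failure. */
int atomic_put(const char *root, const char *shard, const char *name,
               const void *data, size_t len)
{
    char shard_dir[1024], tmp[1200], dst[1200];
    int n = snprintf(shard_dir, sizeof shard_dir, "%s/%s", root, shard);
    if (n < 0 || (size_t)n >= sizeof shard_dir) return -1;
    n = snprintf(tmp, sizeof tmp, "%s/%s.tmp", shard_dir, name);
    if (n < 0 || (size_t)n >= sizeof tmp) return -1;
    n = snprintf(dst, sizeof dst, "%s/%s", shard_dir, name);
    if (n < 0 || (size_t)n >= sizeof dst) return -1;

    if (mkdir(shard_dir, 0755) != 0 && errno != EEXIST) return -1;

    int fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0) return -1;
    if (write_all(fd, data, len) != 0 || fsync(fd) != 0) {   /* write, fsync(tmp) */
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0) { unlink(tmp); return -1; }
    if (rename(tmp, dst) != 0) { unlink(tmp); return -1; }   /* atomic publish */
    if (fsync_dir(shard_dir) != 0) return -1;                /* fsync(shard) */
    return fsync_dir(root);                                  /* fsync(root) */
}
```

The object only becomes visible under its final name at the `rename` step, which is why a crash before that point leaves no partial object in the public area.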
|
||||
|
||||
### FR-BS-003 Secure/Public Area Isolation
|
||||
|
||||
ByteStore SHALL enforce SA/PA isolation such that public payload roots and secure state roots are disjoint and non-overlapping.
|
||||
Violations MUST raise `ERR_AREA_VIOLATION` and SHALL be surfaced to callers (DDS §11.5; ADR-030).
|
||||
|
||||
### FR-BS-004 COR/1 Round-Trip Identity
|
||||
|
||||
Importing COR/1 bytes via ByteStore and exporting the same CID MUST yield a byte-identical envelope.
|
||||
Any mismatch between stored bytes and derived CID SHALL raise `ERR_IDENTITY_MISMATCH` (DDS §11.3; ADR-030).
|
||||
|
||||
### FR-BS-005 Streaming Determinism & Policy Enforcement
|
||||
|
||||
Chunked ingestion (`put_stream`) MUST produce the same CID as single-shot `put` for equivalent payloads and reject non-bytes or missing data with deterministic errors (`ERR_STREAM_ORDER`, `ERR_STREAM_TRUNCATED`).
|
||||
ByteStore SHALL enforce ICD/1 `max_object_size` for all ingest paths, raising `ERR_POLICY_SIZE` when exceeded (DDS §11.6–11.7; ADR-030).
|
||||
|
||||
### FR-022 Federation Publication Digest (FPD/1)
|
||||
|
||||
Every publish event emerging from an FCT/1 certification MUST emit exactly one **FPD/1** digest satisfying ADR-007 single-digest guarantees.
|
||||
The digest SHALL canonically hash the certified FCT/1 record, all attested FER/1 receipts, and the emitted governance edges (`certifies`, `attests`, `publishes`).
|
||||
Implementations MUST persist the FPD/1 bytes alongside the FCT/1 payload under `/logs/ph03/evidence/fct/` (or successor evidence path) and reference the resulting CID from `fct.publication`.
|
||||
Repeated invocations over identical inputs SHALL reproduce the same digest; mismatches SHALL be treated as certification failures.
|
||||
|
||||
### FR-018 Provenance Enforcement
|
||||
|
||||
Caching or replay layers MUST validate FER/1 receipts and FCT/1 transactions before serving composite outputs. Serving uncertified artefacts when policy requires certification is forbidden.
|
||||
|
||||
### FR-019 Transaction Envelope Rejection
|
||||
|
||||
Systems MUST reject FER/1 or FCT/1 envelopes whose CID lineage does not match the referenced FCS/1 descriptor, whose timestamps are non-monotonic, or whose signatures/attestations fail verification.
|
||||
|
||||
### FR-020 Deterministic Execution Envelope
|
||||
|
||||
| ID | Statement | Verification | Notes |
|
||||
| --------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ |
|
||||
| **FR-020 — Deterministic Execution Envelope** | Each executor SHALL complete within a bounded deterministic time envelope (default 5 s). Execution time SHALL be measured and logged as evidence. Non-termination SHALL yield symbolic error `ERR_EXEC_TIMEOUT`. | Verified via CI parity harness and evidence file `/logs/ph03/evidence/<date>-execution-times.jsonl`. | Implements Maat’s Balance principle. Tags: [deterministic-timing, evidence, maat-balance]. |
|
||||
|
||||
### FR-021 Acyclic Composition
|
||||
|
||||
FCS/1 descriptors referencing FPS/1 primitives, PCB1 parameter blocks, or nested FCS/1 descriptors MUST form an acyclic graph.
|
||||
Registries SHALL reject submissions that introduce self-references or cycles with `ERR_FCS_CYCLE_DETECTED`,
|
||||
and SHALL emit `ERR_PCB_ARITY_MISMATCH` when arity metadata conflicts with PCB1 manifests.
|
||||
|
||||
### FR-028 Concept-Native Domain Materialization
|
||||
|
||||
Federated domain manifests SHALL be materialized exclusively from CRS Concepts
|
||||
and Relations. Given a DomainNode Concept, registries MUST traverse
|
||||
`hasManifest` → `ManifestEntry` Concepts, extract `entryName` and
|
||||
`entryChildVersion` relations, dedupe the `(name, version)` set, and compute the
|
||||
GS/1 domain state deterministically. Duplicated pairs trigger `ERR_DG_DUP_ENTRY`;
|
||||
missing relations trigger `ERR_DG_ENTRY_INCOMPLETE`; self references or
|
||||
ancestor loops raise `ERR_DG_CYCLE`. Evidence: `tools/ci/dg_snapshot.py`
|
||||
→ `logs/ph04/evidence/dg1/PH04-EV-DG-001/`.
|
||||
|
||||
Operational linkage: router listings (`GET /links`) MUST return entries sorted
|
||||
lexicographically by `fls_cid` and treat `since` query parameters as exclusive
|
||||
lower bounds, ensuring deterministic replay of linkage events.
|
||||
|
||||
### FR-029 Publication Recursion Discipline
|
||||
|
||||
Publication Concepts SHALL declare their supporting FPD/1 digest, GS/1 cover
|
||||
state, endorsed member FPD CIDs, and optional lineage parent using CRS
|
||||
relations (`covers`, `endorses`, `parent`). Validators MUST recompute GS/1 from
|
||||
the FPD payload, enforce duplicate-free membership, and detect recursive
|
||||
cycles (`ERR_FPD_CYCLE`). Timestamp regressions raise `ERR_FPD_TIMESTAMP`; state
|
||||
mismatches raise `ERR_PUB_STATE_MISMATCH`. Evidence: `tools/ci/pub_validate.py`
|
||||
→ `logs/ph04/evidence/pub1/PH04-EV-PUB-001/`.
|
||||
|
||||
Operational linkage: non-genesis publications SHOULD enable the parent-required
|
||||
policy, supplying `fpd.parent` and guaranteeing strictly monotonic
|
||||
`fpd.timestamp` to align with ADR-019 v1.2.1 and PH04 parent-policy harnesses.
|
||||
|
||||
### FR-030 Predicate Concepts
|
||||
|
||||
Every CRR/1 relation predicate MUST resolve to a CRS Concept. When the
|
||||
taxonomy defines a `Predicate` Concept, predicate entries SHALL expose an
|
||||
`is_a` edge into that class. Missing predicate Concepts raise
|
||||
`ERR_CRR_PREDICATE_NOT_CONCEPT`; missing taxonomy membership raises
|
||||
`ERR_CRR_PREDICATE_CLASS_MISSING`. Evidence: CRS validator vectors and
|
||||
`logs/ph04/evidence/crs1/PH04-EV-CRS-001.md`.
|
||||
|
||||
Operational linkage: FPD feed endpoints SHALL implement stateless, content-anchored pagination over parent-chained publications. `GET /feed/fpd` MUST traverse the publisher’s current tip toward genesis until either the caller-provided `limit` is satisfied or the supplied `since` CID is encountered; identical `publisher_id`, `since`, and `limit` inputs SHALL yield identical CID sequences. Detail lookups (`GET /feed/fpd/:cid`) SHALL expose publisher, members, parent, and state metadata without server-side session state. Evidence: `tools/ci/feeds_check.py` → `/amduat/logs/ph04/evidence/feeds/PH04-EV-FEEDS-001/pass.jsonl`.
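
The tip-to-genesis walk described above can be sketched as follows. The `parents` mapping (CID to parent CID, `None` at genesis) and the function name `feed_page` are hypothetical, and the sketch assumes that the `since` CID itself is excluded from the page, which the text does not state explicitly.

```python
from typing import Dict, List, Optional


def feed_page(
    parents: Dict[str, Optional[str]],
    tip: str,
    since: Optional[str],
    limit: int,
) -> List[str]:
    """Walk parent links from `tip` toward genesis.

    Stops when `limit` CIDs were collected, when `since` is reached
    (assumed exclusive), or when the genesis publication has no parent.
    The result depends only on the inputs, so no session state is needed.
    """
    page: List[str] = []
    cursor: Optional[str] = tip
    while cursor is not None and len(page) < limit:
        if cursor == since:
            break
        page.append(cursor)
        cursor = parents.get(cursor)
    return page
```

Because the cursor is a content address rather than an offset, repeating a call with the same `since` and `limit` against the same tip returns the same CID sequence.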
|
||||
|
||||
### FR-031 Authority Anchoring via CRS & FPD
|
||||
|
||||
Publishing authorities SHALL represent identities as CRS Concepts linked via
|
||||
`owns` and `hasRole` relations to key material and governance roles. Signatures
|
||||
remain confined to FCT/1 and FPD/1 surfaces; CRS layers stay unsigned. FLS/1
|
||||
transport MAY carry Concept or Relation payloads but MUST NOT mutate them and
|
||||
MUST perform payload-kind checks when requested (`--check-crs-payload`).
|
||||
|
||||
Operational linkage: FLS router deployments SHALL expose `POST /fls`,
|
||||
`GET /fls/:cid`, `GET /links`, `GET /healthz`, and `GET /readyz` endpoints and
|
||||
enforce SA/PA separation (`ERR_AREA_VIOLATION` if misconfigured) so that public
|
||||
ingest never mutates state areas directly. Audited ticket intake SHALL be
|
||||
implemented via WT/1 (ADR-023) with:
|
||||
|
||||
* `POST /wt` (Protected Area) accepting WT/1 BCF/1 payloads, validating
|
||||
`has_pubkey(wt.author, wt.pubkey)` (or registered equivalent), verifying
|
||||
signatures over `H("AMDUAT:WT\0" || canonical_bytes_without_signature)`,
|
||||
enforcing registered ADR-010 intents (deduped + byte-lexicographically
|
||||
sorted), ensuring monotonic `wt.timestamp` per `wt.author`, and optionally
|
||||
chaining `wt.parent` lineage. Violations yield `ERR_WT_SIGNATURE`,
|
||||
`ERR_WT_KEY_UNBOUND`, `ERR_WT_INTENT_UNREGISTERED`, `ERR_WT_INTENT_DUP`,
|
||||
`ERR_WT_INTENT_EMPTY`, `ERR_WT_TIMESTAMP`, `ERR_WT_PARENT_UNKNOWN`, or
|
||||
`ERR_WT_PARENT_REQUIRED`. Router policy MUST surface scope denials as
|
||||
`ERR_WT_SCOPE_UNAUTHORIZED` and log the governing policy capsule.
|
||||
* `GET /wt/:cid` returning the canonical WT/1 bytes for any accepted ticket.
|
||||
* Deterministic pagination (`GET /wt?after=<cid>&limit=<n>`) that emits WT/1
|
||||
entries in byte-lexicographic CID order with stable page boundaries. The
|
||||
`after` parameter is an exclusive bound and routers SHALL enforce
|
||||
`1 ≤ limit ≤ Nmax` to guarantee replay stability.
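
The pagination rule in the last bullet can be sketched like this. `N_MAX = 100` is a placeholder value, `wt_page` is a hypothetical name, and raising `ValueError` stands in for whatever symbolic error a router would actually emit for an out-of-range `limit`.

```python
import bisect
from typing import Iterable, List, Optional

N_MAX = 100  # hypothetical upper bound; the real Nmax is router policy


def wt_page(cids: Iterable[bytes], after: Optional[bytes], limit: int) -> List[bytes]:
    """Return up to `limit` CIDs in byte-lexicographic order, strictly after `after`."""
    if not 1 <= limit <= N_MAX:
        raise ValueError("limit must satisfy 1 <= limit <= N_MAX")
    ordered = sorted(cids)  # Python compares bytes byte-lexicographically
    # bisect_right skips past `after` itself, which makes the bound exclusive
    start = 0 if after is None else bisect.bisect_right(ordered, after)
    return ordered[start:start + limit]
```

Feeding the last CID of one page back as `after` yields the next page without gaps or repeats, which is what keeps page boundaries stable under replay.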
|
||||
|
||||
Evidence: `/amduat/logs/ph04/evidence/wt1/PH04-EV-WT-001/summary.md` captures the
|
||||
validator run over vectors `TV-WT-001…009`, ensuring unknown keys, signature
|
||||
failures, timestamp regressions (including parent inversions), unbound keys,
|
||||
unregistered intents, policy rejections, and unresolved parents reject as
|
||||
specified.
|
||||
|
||||
Compat overlays SHALL reference ADR-025 MPR/1 provenance capsules and ADR-026
|
||||
IER/1 inference evidence when operating in policy lane `compat`. Routers MUST
|
||||
validate that `executor_fingerprint` equals the supplied MPR/1 CID, enforce
|
||||
`determinism_level` plus `rng_seed` (raising `ERR_FER_RNG_REQUIRED` when
|
||||
omitted), and verify log digests via the IER/1 manifest before accepting
|
||||
overlays (`ERR_IER_LOG_HASH`/`ERR_IER_LOG_MANIFEST`). Evidence surfaces
|
||||
`/amduat/logs/ph04/evidence/mpr1/PH04-EV-MPR-001/pass.jsonl` and
|
||||
`/amduat/logs/ph04/evidence/ier1/PH04-EV-IER-001/pass.jsonl` prove vector
|
||||
coverage `TV-MPR-001…003` (hash triple, missing weights, signature domain) and
|
||||
`TV-IER-001…004` (ok, missing seed, fingerprint mismatch, log digest mismatch)
|
||||
respectively with scenario summaries in accompanying `summary.md` files.
|
||||
|
||||
### FR-032 CT/1 Deterministic Replay (D1)
|
||||
|
||||
Given identical AC/1 + DTF/1 + topology inputs, executing the runtime twice in
|
||||
isolation MUST produce byte-identical CT/1 snapshots (header and payload) with
|
||||
matching CIDs whenever `ct.determinism_level = 0`. Evidence:
|
||||
`tools/ci/ct_replay.py` (`runA`/`runB`) →
|
||||
`/amduat/logs/ph05/evidence/ct1/PH05-EV-CT1-REPLAY-001/`.
|
||||
|
||||
### FR-033 CT/1 Numeric Stability (D2)
|
||||
|
||||
When `ct.determinism_level = 1`, numeric observables MAY diverge, but the
|
||||
maximum absolute delta MUST remain within the tolerance documented by
|
||||
`ct.kernel_cfg`. Evidence: `tools/ci/ct_replay.py` D2 replay outputs and kernel
|
||||
configuration manifests in the same evidence set.
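
A D2 comparison can be sketched as a maximum absolute delta over paired observables. The flat list representation and the function names are assumptions for illustration; the tolerance is passed in directly because the layout of `ct.kernel_cfg` is not described here, and "within" is read as less than or equal.

```python
from typing import Sequence


def max_abs_delta(run_a: Sequence[float], run_b: Sequence[float]) -> float:
    """Largest absolute difference between paired observables of two runs."""
    if len(run_a) != len(run_b):
        raise ValueError("runs must expose the same number of observables")
    return max((abs(a - b) for a, b in zip(run_a, run_b)), default=0.0)


def within_tolerance(
    run_a: Sequence[float], run_b: Sequence[float], tolerance: float
) -> bool:
    """True when the divergence stays inside the documented tolerance."""
    return max_abs_delta(run_a, run_b) <= tolerance
```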
|
||||
|
||||
### FR-034 CT/1 Header Integrity
|
||||
|
||||
CT/1 headers MUST follow ADR-027: canonical BCF/1 key ordering, rejection of
|
||||
unknown keys, monotonic `ct.tick`, canonical `cid:` formatting for topology and
|
||||
AC/1/DTF/1 pointers (ADR-028), and Ed25519 signatures over
|
||||
`H("AMDUAT:CT\0" || canonical_bytes_without_signature)`. Evidence:
|
||||
`tools/validate/ct1_validator.py` with vectors
|
||||
`/amduat/vectors/ph05/ct1/TV-CT1-001…004` and AC/DTF fixtures
|
||||
`TV-AC1-001…002`, `TV-DTF1-001…002`.
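
The signing preimage can be illustrated with a short sketch. The hash function `H` is not named in this section, so SHA-256 is assumed purely for the example; `signing_digest` is a hypothetical helper, and the Ed25519 step itself is left out because it needs a library outside the standard one.

```python
import hashlib

CT_DOMAIN = b"AMDUAT:CT\x00"  # domain separation string for CT/1 headers


def signing_digest(domain: bytes, canonical_bytes_without_signature: bytes) -> bytes:
    """Compute H(domain || canonical bytes), with H assumed to be SHA-256.

    The signature field must already be stripped from the canonical bytes,
    otherwise the signature would have to cover itself.
    """
    return hashlib.sha256(domain + canonical_bytes_without_signature).digest()
```

The prefix means a digest computed for one surface cannot be replayed as a valid digest for another, even when the canonical bytes happen to coincide.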
|
||||
|
||||
---
|
||||
|
||||
## 4. Non-Functional Requirements
|
||||
|
||||
### NFR-001 Determinism
|
||||
|
||||
Platform and language differences MUST NOT affect the computed CID.
|
||||
|
||||
### NFR-002 Performance
|
||||
|
||||
Put/get latency MUST remain within configured OPS budgets.
|
||||
|
||||
### NFR-003 Reliability
|
||||
|
||||
CAS operations MUST be atomic; partial writes MUST NOT be visible.
|
||||
|
||||
### NFR-004 Portability
|
||||
|
||||
Implementations MUST operate on common filesystems.
|
||||
|
||||
### NFR-005 Security Posture
|
||||
|
||||
Domain separation strings MUST be applied for all hashed surfaces.
|
||||
|
||||
### 4.3 Future Scope Alignment (Informative)
|
||||
|
||||
Phase 02 introduces deterministic transformation primitives (**FPS/1**) extending the Kheper CAS model defined herein.
|
||||
See `/amduat/arc/adrs/adr-015.md` and `/amduat/tier1/fps.md` for details.
|
||||
No behavioural changes apply retroactively to PH01 surfaces.
|
||||
|
||||
---
|
||||
|
||||
## 5. Data Model (Behavioural View)
|
||||
|
||||
* CAS objects identified strictly by CID.
|
||||
* COR/1 envelope provides size, payload, algo_id.
|
||||
* ICD/1 descriptor provides instance configuration.
|
||||
|
||||
> See DDS §2 (COR/1) and §3 (ICD/1) for normative byte layouts.
|
||||
|
||||
---
|
||||
|
||||
## 6. API Semantics
|
||||
|
||||
### `put(payload_bytes, algo_id=default) → CID`
|
||||
|
||||
* Compute CID using domain separation: `CID = algo_id || H("CAS:OBJ\0" || payload_bytes)`
|
||||
* If CID exists: return existing CID (idempotent)
|
||||
* If absent: write canonical COR/1 envelope atomically
|
||||
* Reject on size limit breach, malformed payload, non-canonical COR/1, or I/O errors
|
||||
* Writes MUST be atomic: temp file → fsync → rename → fsync parent dir
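
The `put` steps above can be sketched as follows, under several assumptions: `H` is taken to be SHA-256, `ALGO_ID` is a made-up one-byte identifier, objects are stored as bare payload files named by hex CID rather than as COR/1 envelopes, and the directory fsync relies on POSIX behaviour.

```python
import hashlib
import os
import tempfile
from pathlib import Path

ALGO_ID = b"\x01"        # hypothetical algo_id; the real registry is not shown here
DOMAIN = b"CAS:OBJ\x00"  # domain separation string from the rule above


def compute_cid(payload: bytes, algo_id: bytes = ALGO_ID) -> bytes:
    # CID = algo_id || H(domain || payload_bytes), with H assumed to be SHA-256
    return algo_id + hashlib.sha256(DOMAIN + payload).digest()


def put(root: Path, payload: bytes) -> bytes:
    cid = compute_cid(payload)
    target = root / cid.hex()
    if target.exists():
        return cid  # idempotent: the object is already stored

    # temp file -> fsync -> rename -> fsync parent dir
    fd, tmp_name = tempfile.mkstemp(dir=root)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)  # a real store would write the COR/1 envelope
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, target)  # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)  # never leave a partial write behind
        raise

    dir_fd = os.open(root, os.O_RDONLY)  # POSIX: makes the rename itself durable
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return cid
```

Creating the temp file inside `root` keeps the rename on a single filesystem, which is what makes `os.replace` atomic.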
|
||||
|
||||
### `get(CID) → payload_bytes`
|
||||
|
||||
* Retrieve raw payload bytes
|
||||
* MUST validate canonical COR/1 envelope
|
||||
* Implementation MAY verify hash on read by policy
|
||||
* Reject on missing object or hash mismatch
|
||||
|
||||
### `exists(CID) → bool`
|
||||
|
||||
* Return true if object is present and canonical
|
||||
|
||||
### `stat(CID) → { present, size, algo_id }`
|
||||
|
||||
* MUST return canonical metadata
|
||||
|
||||
### `verify(CID) → { ok|error, expected:CID, actual:CID }`
|
||||
|
||||
* Recompute CID from canonical bytes
|
||||
* MUST detect corruption and reject non-canonical encodings
|
||||
|
||||
### `import(stream_COR1) → CID`
|
||||
|
||||
* Validate canonical TLV ordering
|
||||
* Reject duplicate tags, extraneous tags, malformed VARINTs
|
||||
* MUST round-trip to identical CID
|
||||
|
||||
### `export(CID) → stream_COR1`
|
||||
|
||||
* Emit canonical envelope; re-encoding MUST preserve canonical bytes
|
||||
|
||||
### Deterministic Errors
|
||||
|
||||
Errors MUST be emitted as stable symbolic codes including but not limited to:
|
||||
|
||||
* `E_CID_NOT_FOUND`
|
||||
* `E_CORRUPT_OBJECT`
|
||||
* `E_CANONICALITY_VIOLATION`
|
||||
* `E_IO_FAILURE`
|
||||
|
||||
---
|
||||
|
||||
## 7. Success Criteria
|
||||
|
||||
* Byte-for-byte CID agreement (≥ 3 platforms)
|
||||
* Zero false positives in `verify()`
|
||||
* Idempotent concurrent `put()`
|
||||
* COR/1 import/export round-trips cleanly
|
||||
|
||||
---
|
||||
|
||||
## 8. GC Semantics (Behavioural)
|
||||
|
||||
* Liveness is determined by reachability from configured roots
|
||||
* Dry-run mode MUST NOT delete
|
||||
* Removal MUST be atomic per object
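
These rules can be sketched as a mark-and-sweep pass. How one CAS object references another is not defined in this section, so the `objects` mapping (CID to referenced CIDs) and the name `collect` are hypothetical, and deleting a dict key stands in for an atomic per-object removal.

```python
from typing import Dict, Iterable, List, Set


def collect(
    objects: Dict[str, List[str]], roots: Iterable[str], dry_run: bool = True
) -> List[str]:
    """Return unreachable CIDs in sorted order; delete them unless dry_run is set."""
    reachable: Set[str] = set()
    stack = [root for root in roots if root in objects]
    while stack:  # mark phase: everything reachable from the configured roots
        cid = stack.pop()
        if cid in reachable:
            continue
        reachable.add(cid)
        stack.extend(child for child in objects[cid] if child in objects)

    garbage = sorted(set(objects) - reachable)
    if not dry_run:  # sweep phase: a dry run only reports
        for cid in garbage:
            del objects[cid]  # one object at a time
    return garbage
```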
|
||||
|
||||
---
|
||||
|
||||
## 9. Acceptance Criteria (Phase Exit)
|
||||
|
||||
* Golden vectors published
|
||||
* Cross-impl CI passing
|
||||
* COR/1 and ICD/1 documented in DDS
|
||||
* Security posture validated by SEC
|
||||
|
||||
---
|
||||
|
||||
## 10. Traceability
|
||||
|
||||
* Requirements link to tests/defects in Phase Packs
|
||||
* ADRs reference affected FR/NFR IDs
|
||||
|
||||
---
|
||||
|
||||
## 11. Future Phases
|
||||
|
||||
* Multi-object transactions bind to `instance_id`
|
||||
* Provenance graph consumes COR/1 metadata
|
||||
|
||||
---
|
||||
|
||||
## 12. Functional Primitive Surface (FPS/1)
|
||||
|
||||
> Defines the canonical deterministic operations over canonical payloads.
|
||||
> Each primitive produces exactly one payload and one CID.
|
||||
|
||||
| Primitive     | Signature                      | Description                                 | Determinism / Errors                           |
| ------------- | ------------------------------ | ------------------------------------------- | ---------------------------------------------- |
| `put`         | `(payload_bytes) → CID`        | Canonical write, atomic fsync ladder.       | ADR-006 `ERR_IO_FAILURE`, `ERR_NORMALIZATION`. |
| `get`         | `(CID) → payload_bytes`        | Fetch canonical bytes.                      | `ERR_CID_NOT_FOUND`.                           |
| `slice`       | `(CID, offset, length) → CID`  | Extract contiguous bytes.                   | `ERR_SLICE_RANGE`.                             |
| `concatenate` | `([CID₁,…,CIDₙ]) → CID`        | Sequential join of payloads.                | `ERR_EMPTY_INPUTS`.                            |
| `reverse`     | `(CID, level) → CID`           | Reverse payload order (bit/byte/word/long). | `ERR_REV_ALIGNMENT`, `ERR_INVALID_LEVEL`.      |
| `splice`      | `(CID_a, offset, CID_b) → CID` | Insert payload b into a at offset.          | `ERR_SPLICE_RANGE`.                            |
|
||||
|
||||
**Determinism:** identical inputs → identical outputs.
|
||||
**Immutability:** inputs never mutated.
|
||||
**Closure:** outputs valid for reuse as inputs to any primitive.
|
||||
**Error handling:** all symbolic per ADR-006.
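
The closure and immutability properties can be seen in a small in-memory sketch of four of the primitives. Everything here is an assumption for illustration: the class `MemoryCas`, the exception `FpsError`, SHA-256 as the hash, a CID without the `algo_id` prefix, and the exact range checks. `reverse` is omitted because the alignment rules for its levels are not given in this section.

```python
import hashlib
from typing import Dict, List


class FpsError(Exception):
    """Carries a symbolic error code as its message."""


class MemoryCas:
    def __init__(self) -> None:
        self._objects: Dict[bytes, bytes] = {}

    def put(self, payload: bytes) -> bytes:
        cid = hashlib.sha256(b"CAS:OBJ\x00" + payload).digest()
        self._objects.setdefault(cid, payload)  # idempotent write
        return cid

    def get(self, cid: bytes) -> bytes:
        try:
            return self._objects[cid]
        except KeyError:
            raise FpsError("ERR_CID_NOT_FOUND") from None

    def slice(self, cid: bytes, offset: int, length: int) -> bytes:
        data = self.get(cid)
        if offset < 0 or length < 0 or offset + length > len(data):
            raise FpsError("ERR_SLICE_RANGE")
        return self.put(data[offset:offset + length])

    def concatenate(self, cids: List[bytes]) -> bytes:
        if not cids:
            raise FpsError("ERR_EMPTY_INPUTS")
        return self.put(b"".join(self.get(cid) for cid in cids))

    def splice(self, cid_a: bytes, offset: int, cid_b: bytes) -> bytes:
        a, b = self.get(cid_a), self.get(cid_b)
        if not 0 <= offset <= len(a):
            raise FpsError("ERR_SPLICE_RANGE")
        return self.put(a[:offset] + b + a[offset:])
```

Every primitive returns a CID produced by `put`, so its output is a stored object that any other primitive can take as input, and the inputs are only ever read.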
|
||||
|
||||
---
|
||||
|
||||
## Appendix A — Surface Version Table
|
||||
|
||||
| Surface | Version | Notes |
| ------- | ------- | ----- |
| FCS/1   | v1-min  | Canonical execution descriptors; governance captured in FCT/1. |
| FER/1   | v1.1    | Receipts enforce parity-first evidence, run_id dedup, typed logs, and RNG discipline (ADR-017). |
| FCT/1   | v1.0    | Certification transactions binding policy/intent/attestations with FER/1 sets. |
| FPD/1   | v1.0    | Publication digest linking FCT/1 to FER/1 receipts for federation replay. |
|
||||
|
||||
---
|
||||
|
||||
## Document History
|
||||
|
||||
* **0.2.1 (2025-10-26):** Phase Pack pointer updated; no semantic changes; archival preserves historical lineage per ADR-002.
* **0.2.2 (2025-10-26):** Promoted PH01 baseline to Approved; synchronized Phase Pack §1 anchors and closure snapshot.
* **0.2.3 (2025-10-27):** Added future scope alignment note pointing to FPS/1 and ADR-015; PH01 semantics remain unchanged.
|
||||
* **0.2.4 (2025-11-14):** Added FR-014–FR-019 for FCS/1 composition, FER/1 receipts, and FCT/1 certification policies.
|
||||
* **0.2.5 (2025-11-15):** Added FR-021 (formerly FR-020) enforcing acyclic FCS/1 composition and PCB1 arity validation.
|
||||
* **0.2.6 (2025-11-19):** Registered FR-020 Deterministic Execution Envelope (Maat’s Balance) with timing evidence tags.
|
||||
* **0.3.0 (2025-11-02):** Trimmed FCS/1 to execution-only (v1-min) under FR-014/FR-015; moved policy/intent/scope/role/authority to FCT/1 (FR-017); clarified registry admission behaviour and kept FER/1 unchanged.
|
||||
* **0.3.1 (2025-11-21):** Updated FR-016 to require parity-first FER/1 receipts with executor sets, parity vectors, and FR-020 aligned timestamps.
|
||||
* **0.3.2 (2025-11-22):** Registered FR-022 Federation Publication Digest (FPD/1) requirement tying FCT/1 publications to single-digest evidence and canonical logging.
|
||||
|
||||
* **0.3.4 (2025-11-07):** Recorded FER/1 v1.1 requirement for Phase 04 and added surface version table.
|
||||
|
||||
* **0.3.5 (2025-11-08):** Registered PH04 linkage & semantic placeholder requirements (FR-028…031).
|
||||
* **0.3.6 (2025-11-09):** Promoted FR-028…031 to normative linkage requirements with CRS/1 validator enforcement.
|
||||
|
||||
* **0.3.7 (2025-11-08):** Finalized FR-028…031 with CRS/1 immutability, GS/1 linkage, and certification coverage.
|
||||
|
||||
* **0.3.8 (2025-11-09):** Promoted FR-028…FR-031 for concept-native domain and publication validation.
|
||||
* **0.3.9 (2025-11-09):** Documented operational linkage: router endpoints, deterministic `/links`, and parent-required publish policy guidance.
|
||||
* **0.3.10 (2025-11-11):** Registered FR-030 stateless, content-anchored FPD feed pagination requirement.
|
||||
|
||||
* **0.3.11 (2025-11-09):** Extended FR-031 with WT/1 intake endpoints, validation, and evidence log references.
|
||||
* **0.3.12 (2025-11-20):** Tightened FR-031 with `wt.pubkey` bindings, signature preimage exclusion, lineage/policy errors, and expanded WT/1 vector evidence coverage.
|
||||
|
||||
* **0.3.13 (2025-11-21):** Updated FR-031 for `has_pubkey` bindings (`ERR_WT_KEY_UNBOUND`), intent registry enforcement (`ERR_WT_INTENT_UNREGISTERED`), lineage policy rejection (`ERR_WT_PARENT_REQUIRED`), and expanded WT/1 vectors `TV-WT-001…009`.
|
||||
* **0.3.14 (2025-11-22):** WT/1 intake and SOS/1 compat overlays proven with PH04-M4/M5 audit evidence.
|
||||
* **0.3.15 (2025-11-22):** Recorded ADR-025/026 compat path requirements and evidence anchors for FR-031.
|
||||
|
||||
* **0.3.16 (2025-11-23):** Compat lane now enforces ADR-025/026 validators (MPR/1 hash triple, IER/1 replay) with updated evidence surfaces.
|
||||
|
||||
* **0.3.17 (2025-11-24):** Added FR-032–FR-034 for CT/1 replay determinism, numeric stability, and header integrity (ADR-027/028).
|
||||
|
||||
* **0.4.0 (2025-11-11):** Added FR-BS-001…005 for ByteStore identity, atomic durability, SA/PA isolation, COR round-trip, and streaming determinism linked to DDS §11 / ADR-030.
|
||||
158
tier1/tgk-1.md
Normal file
|
|
@ -0,0 +1,158 @@
|
|||
# TGK/1 — Trace Graph Kernel Semantics
|
||||
|
||||
Status: Draft
|
||||
Owner: Architecture
|
||||
Version: 0.1.0
|
||||
SoT: No
|
||||
Last Updated: 2025-11-30
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [tgk, determinism, index, federation]
|
||||
|
||||
<!-- Source: /amduat-api/tier1/tgk-1.md | Canonical: /amduat/tier1/tgk-1.md -->
|
||||
|
||||
**Document ID:** `TGK/1`
|
||||
**Layer:** L1 — Semantic graph layer over ASL artifacts and PERs (no encodings)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/1-CORE`
|
||||
* `ASL/1-CORE-INDEX`
|
||||
* `ASL/LOG/1`
|
||||
* `ASL/SYSTEM/1`
|
||||
* `TGK/1-CORE`
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ENC/TGK1-EDGE/1` — core edge encoding
|
||||
* `ENC/TGK-INDEX/1` — index encoding draft
|
||||
* `ASL/INDEX-ACCEL/1`
|
||||
* `ENC/ASL-CORE-INDEX/1`
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 0. Conventions
|
||||
|
||||
The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as described in RFC 2119.
|
||||
|
||||
TGK/1 defines semantic meaning only. It does not define storage formats, on-disk encodings, or execution operators.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose & Scope
|
||||
|
||||
TGK/1 defines the **semantic layer** for Trace Graph Kernel (TGK) edges that relate ASL artifacts and PERs.
|
||||
It keeps TGK thin and deterministic by reusing ASL index and log semantics.
|
||||
|
||||
Non-goals:
|
||||
|
||||
* New encodings for edges or indexes
|
||||
* Query operators or execution plans
|
||||
* Federation protocols or transport
|
||||
* Re-definition of ASL or PEL semantics
|
||||
|
||||
---
|
||||
|
||||
## 2. TGK Objects
|
||||
|
||||
### 2.1 TGK Edge
|
||||
|
||||
A TGK Edge is an **immutable record** representing a directed relationship between ASL artifacts and/or PERs.
|
||||
TGK edges are semantic overlays and **MUST NOT** redefine or bypass ASL identity.
|
||||
TGK/1-CORE defines the EdgeBody structure with ordered `from`/`to` lists; TGK/1
|
||||
does not further constrain cardinality.
|
||||
|
||||
### 2.2 Canonical Edge Key
|
||||
|
||||
Each TGK edge has a **Canonical Edge Key** that uniquely identifies it.
|
||||
The Canonical Edge Key MUST be derived from the logical `EdgeBody` defined in
|
||||
`TGK/1-CORE`, preserving list order and multiplicity:
|
||||
|
||||
* `from`: ordered list of source node identifiers (MAY be empty)
|
||||
* `to`: ordered list of destination node identifiers (MAY be empty)
|
||||
* `payload`: reference carried by the edge
|
||||
* `type`: edge type identifier
|
||||
* Projection context (for example, PER or execution identity) when not already
|
||||
captured by the edge payload or type profile
|
||||
|
||||
Classification attributes (edge type keys, labels) **MUST NOT** affect canonical identity.
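
How list order and multiplicity feed into the key can be illustrated with a sketch. TGK/1 defines no encoding, so the length-prefixed framing, the use of SHA-256, and the `EdgeBody` field names below are all assumptions made for the example; projection context is left out for brevity.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EdgeBody:
    """Hypothetical in-memory view of the logical EdgeBody from TGK/1-CORE."""
    type: int
    from_nodes: Tuple[bytes, ...]
    to_nodes: Tuple[bytes, ...]
    payload: bytes


def _frame(data: bytes) -> bytes:
    # Length prefix keeps field boundaries unambiguous
    return len(data).to_bytes(4, "big") + data


def _frame_list(items: Tuple[bytes, ...]) -> bytes:
    # Items are framed in the order given; duplicates are kept
    return len(items).to_bytes(4, "big") + b"".join(_frame(item) for item in items)


def canonical_edge_key(edge: EdgeBody) -> bytes:
    preimage = b"".join([
        edge.type.to_bytes(4, "big"),
        _frame_list(edge.from_nodes),
        _frame_list(edge.to_nodes),
        _frame(edge.payload),
    ])
    return hashlib.sha256(preimage).digest()
```

Because the lists are framed in order, swapping two sources or repeating one yields a different key, while labels that are not part of `EdgeBody` never enter the preimage.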
|
||||
|
||||
---
|
||||
|
||||
## 3. Index and Visibility (Normative)
|
||||
|
||||
TGK edges are **indexed objects** and inherit visibility from the ASL index and log:
|
||||
|
||||
1. A TGK edge becomes visible only when its index record is admitted by a sealed segment and log order (ASL/LOG/1).
|
||||
2. TGK traversal and lookup **MUST NOT** bypass index visibility or log ordering.
|
||||
3. For a fixed `{Snapshot, LogPrefix}`, TGK edge lookup and shadowing **MUST** be deterministic (ASL/1-CORE-INDEX).
|
||||
4. Tombstones and shadowing semantics follow ASL/1-CORE-INDEX and ASL/LOG/1 replay order.
|
||||
|
||||
Index records MUST reference TGK/1-CORE edge identities. Index encodings MUST
|
||||
NOT re-encode edge structure (`from[]`, `to[]`); they reference TGK/1-CORE edges
|
||||
and carry only routing/filter metadata.
|
||||
|
||||
---
|
||||
|
||||
## 4. Deterministic Traversal (Normative)
|
||||
|
||||
TGK traversal operates over a snapshot/log-bounded view:
|
||||
|
||||
* Inputs: `{Snapshot, LogPrefix}` and a seed set (nodes or edges).
|
||||
* Outputs: only edges visible under the same `{Snapshot, LogPrefix}`.
|
||||
* Traversal **MUST** be deterministic and replay-compatible with ASL/LOG/1.
|
||||
|
||||
Deterministic ordering for traversal output MUST be:
|
||||
|
||||
1. `logseq` ascending
|
||||
2. Canonical Edge Key as tie-break
|
||||
|
||||
Acceleration structures MAY be used but MUST NOT change semantics.
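
The output ordering can be sketched as a filter plus a two-part sort key. The record shape `IndexedEdge` and the function `visible_edges` are hypothetical, `LogPrefix` is modelled as the highest admitted `logseq`, and snapshot state, tombstones, and shadowing are ignored to keep the example short.

```python
from typing import Iterable, List, NamedTuple


class IndexedEdge(NamedTuple):
    logseq: int      # position at which the index record was admitted
    edge_key: bytes  # Canonical Edge Key


def visible_edges(records: Iterable[IndexedEdge], log_prefix: int) -> List[IndexedEdge]:
    """Edges admitted at or before `log_prefix`, in deterministic output order."""
    visible = [record for record in records if record.logseq <= log_prefix]
    # logseq ascending, then Canonical Edge Key as the tie-break
    return sorted(visible, key=lambda record: (record.logseq, record.edge_key))
```

The result does not depend on the order in which records arrive, so an accelerated index and a full scan have to agree on it.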
|
||||
|
||||
---
|
||||
|
||||
## 5. Federation Alignment (Normative)
|
||||
|
||||
Federation does not change TGK semantics. It only propagates edges and artifacts that are already visible under index rules.
|
||||
|
||||
* Domain visibility and publication status are enforced via index metadata (`ENC/ASL-CORE-INDEX/1`).
|
||||
* TGK edges keep canonical identity across domains.
|
||||
* Cross-domain propagation MUST preserve snapshot/log determinism.
|
||||
|
||||
---
|
||||
|
||||
## 6. Non-Goals
|
||||
|
||||
TGK/1 does not define:
|
||||
|
||||
* Edge encoding or storage layout
|
||||
* Index segment formats
|
||||
* Query languages or execution plans
|
||||
* Acceleration rules beyond ASL/INDEX-ACCEL/1
|
||||
|
||||
---
|
||||
|
||||
## 7. Normative Invariants
|
||||
|
||||
Conforming implementations MUST enforce:
|
||||
|
||||
1. TGK edges are immutable and indexed objects.
|
||||
2. No TGK visibility without index admission and log ordering.
|
||||
3. Traversal is snapshot/log bounded and deterministic.
|
||||
4. Federation does not alter TGK semantics; it only propagates visible edges.
|
||||
5. Edge classification is not part of canonical identity.
|
||||