amduat-api/tier1/asl-core-index.md

# ASL/1-CORE-INDEX — Semantic Index Model
Status: Draft
Owner: Niklas Rydberg
Version: 0.1.0
SoT: No
Last Updated: 2025-11-16
Tags: [deterministic, index, semantics]
**Document ID:** `ASL/1-CORE-INDEX`
**Layer:** L0.5 — Semantic mapping over ASL/1-CORE values (no storage / encoding / lifecycle)
**Depends on (normative):**
* `ASL/1-CORE`
* `ASL/1-STORE`
**Informative references:**
* `ASL-STORE-INDEX` — store lifecycle and replay contracts
* `ENC-ASL-CORE-INDEX` — bytes-on-disk encoding profile (`tier1/enc-asl-core-index.md`)
* `ASL/INDEX-ACCEL/1` — acceleration semantics (routing, filters, sharding)
* `ASL/LOG/1` — append-only semantic log (segment visibility)
---
## 0. Conventions
The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.
ASL/1-CORE-INDEX defines **semantic meaning only**. It does not define storage formats, on-disk encoding, or operational lifecycle. Those belong to ASL-STORE-INDEX, ASL/LOG/1, and ENC-ASL-CORE-INDEX.
---
## 1. Purpose & Non-Goals
### 1.1 Purpose
ASL/1-CORE-INDEX defines the **semantic model** for indexing artifacts:
* It specifies what it means to map an artifact identity to a byte location.
* It defines visibility, immutability, and shadowing semantics.
* It ensures deterministic lookup for a fixed snapshot and log prefix.
### 1.2 Non-goals
ASL/1-CORE-INDEX explicitly does **not** define:
* On-disk layouts, segment files, or memory representations.
* Block allocation, packing, GC, or lifecycle rules.
* Snapshot implementation details, checkpoints, or log storage.
* Performance optimizations (bloom filters, sharding, SIMD).
* Federation, provenance, or execution semantics.
---
## 2. Terminology
* **Artifact** — ASL/1 immutable value defined in ASL/1-CORE.
* **Reference** — ASL/1 content address of an Artifact (hash_id + digest).
* **StoreConfig** — `{ encoding_profile, hash_id }` fixed per StoreSnapshot (ASL/1-STORE).
* **Block** — immutable storage unit containing artifact bytes.
* **BlockID** — opaque identifier for a block.
* **ArtifactExtent** — `(BlockID, offset, length)` identifying a byte slice within a block.
* **ArtifactLocation** — ordered list of `ArtifactExtent` values that, when concatenated, produce the artifact bytes.
* **Snapshot** — a checkpointed StoreSnapshot (ASL/1-STORE) used as a base state.
* **Append-Only Log** — ordered sequence of index-visible mutations after a snapshot.
* **CURRENT** — effective state after replaying a log prefix on a snapshot.
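
Informative sketch (non-normative): the terms above can be modeled as plain immutable values. The field types are assumptions made for illustration only: `hash_id` is shown as an integer and `BlockID` as a string, although this document treats `BlockID` as opaque and leaves encodings to other specifications.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Reference:
    hash_id: int    # identifies the hash algorithm (int is an assumption)
    digest: bytes   # content digest under that algorithm


@dataclass(frozen=True)
class ArtifactExtent:
    block_id: str   # BlockID is opaque; str is only a stand-in
    offset: int     # byte offset within the block
    length: int     # number of bytes in the slice


# An ArtifactLocation is an ordered sequence of extents.
ArtifactLocation = Tuple[ArtifactExtent, ...]

ref = Reference(hash_id=1, digest=bytes.fromhex("ab" * 32))
loc: ArtifactLocation = (
    ArtifactExtent("blk-1", 0, 4),
    ArtifactExtent("blk-2", 16, 8),
)

# The index is, semantically, a mapping from Reference to ArtifactLocation.
index: Dict[Reference, ArtifactLocation] = {ref: loc}
```

Frozen dataclasses are used so that values are hashable and cannot be modified after construction, which mirrors the immutability the terminology implies.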
---
## 3. Core Mapping Semantics
### 3.1 Index Mapping
The index defines a semantic mapping:
```
Reference -> ArtifactLocation
```
For any visible `Reference` that is not shadowed by a tombstone, there is exactly one `ArtifactLocation` at a given CURRENT state.
### 3.2 Determinism
For a fixed `{StoreConfig, Snapshot, LogPrefix}`, lookup results MUST be deterministic. Nondeterministic inputs MUST NOT affect index semantics.
### 3.3 StoreConfig Consistency
All references in an index view are interpreted under a fixed StoreConfig. Implementations MAY store only the digest portion in the index when `hash_id` is fixed by StoreConfig, but the semantic key is always a full `Reference`. Encoding profiles MUST allow variable-length digests; the digest length MUST be either explicit in the encoding or derivable from `hash_id` and StoreConfig.
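
Informative sketch (non-normative): one way to store only digests while keeping the full `Reference` as the semantic key. The class `DigestKeyedIndex`, its method names, and the use of a string as a stand-in for a location are all hypothetical. Python `bytes` keys accommodate variable-length digests without extra bookkeeping.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Reference:
    hash_id: int
    digest: bytes


@dataclass(frozen=True)
class StoreConfig:
    encoding_profile: int
    hash_id: int


class DigestKeyedIndex:
    """Stores digests only; callers still pass a full Reference."""

    def __init__(self, config: StoreConfig) -> None:
        self._config = config
        self._by_digest: Dict[bytes, str] = {}

    def put(self, ref: Reference, location: str) -> None:
        # A Reference under another hash_id cannot belong to this view.
        if ref.hash_id != self._config.hash_id:
            raise ValueError("Reference hash_id does not match StoreConfig")
        self._by_digest[ref.digest] = location

    def get(self, ref: Reference) -> Optional[str]:
        # Same digest bytes under a different hash_id is a different key.
        if ref.hash_id != self._config.hash_id:
            return None
        return self._by_digest.get(ref.digest)


idx = DigestKeyedIndex(StoreConfig(encoding_profile=1, hash_id=1))
idx.put(Reference(1, b"\x01\x02"), "loc-a")
hit = idx.get(Reference(1, b"\x01\x02"))     # "loc-a"
miss = idx.get(Reference(2, b"\x01\x02"))    # None: other hash_id
```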
---
## 4. ArtifactLocation Semantics
* An ArtifactLocation is an **ordered list** of ArtifactExtents.
* Each extent references immutable bytes within a block.
* The artifact bytes are defined by **concatenating extents in order**.
* A visible ArtifactLocation MUST be **non-empty** and MUST fully cover the artifact byte sequence with no gaps or extra bytes.
* Tombstone entries are visible but MUST have no ArtifactLocation; they only shadow prior entries.
* Extents MUST have `length > 0` and MUST reference valid byte ranges within their blocks.
* Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact.
* An ArtifactLocation is valid only while all referenced blocks are retained.
* ASL/1-CORE-INDEX does not define how blocks are allocated or sealed; it only requires that referenced bytes are immutable for the lifetime of the mapping.
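
Informative sketch (non-normative): the rules above can be checked while resolving a location to bytes. The function `read_artifact`, the in-memory `blocks` dictionary, and the `expected_len` parameter are assumptions for illustration; this document does not say how an implementation learns the artifact length or reads blocks.

```python
from typing import Dict, List, Tuple

Extent = Tuple[str, int, int]  # (block_id, offset, length)


def read_artifact(
    location: List[Extent],
    blocks: Dict[str, bytes],
    expected_len: int,
) -> bytes:
    """Concatenate extents in order, rejecting invalid locations."""
    if not location:
        raise ValueError("visible ArtifactLocation must be non-empty")
    parts = []
    for block_id, offset, length in location:
        if length <= 0:
            raise ValueError("extent length must be > 0")
        block = blocks.get(block_id)
        if block is None:
            raise KeyError(f"block {block_id!r} is not retained")
        if offset < 0 or offset + length > len(block):
            raise ValueError("extent exceeds block bounds")
        parts.append(block[offset:offset + length])
    data = b"".join(parts)
    # No gaps and no extra bytes: total must match the artifact exactly.
    if len(data) != expected_len:
        raise ValueError("extents do not cover the artifact exactly")
    return data


blocks = {"b1": b"hello world", "b2": b"--index--"}
# The same BlockID ("b1") appears twice, which is permitted.
location = [("b1", 0, 5), ("b2", 1, 1), ("b1", 6, 5)]
artifact = read_artifact(location, blocks, expected_len=11)
```

Here `artifact` is `b"hello-world"`: five bytes from `b1`, one from `b2`, then five more from `b1`, in that order.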
---
## 5. Visibility Model
An index entry is **visible** at CURRENT if and only if:
1. The entry is contained in a sealed segment whose seal record is admitted in the ordered log prefix for CURRENT (or anchored in the snapshot).
2. The referenced bytes are immutable (e.g., the underlying block is sealed by store rules).
Visibility is binary; entries are either visible or not visible.
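
Informative sketch (non-normative): the two conditions can be read as a single boolean predicate. The `IndexEntry` shape and the three sets passed to `is_visible` are hypothetical; how an implementation tracks admitted seals and sealed blocks is outside this document.

```python
from dataclasses import dataclass
from typing import AbstractSet, FrozenSet


@dataclass(frozen=True)
class IndexEntry:
    segment_id: str            # segment that contains the entry
    block_ids: FrozenSet[str]  # blocks referenced by the entry's extents


def is_visible(
    entry: IndexEntry,
    snapshot_segments: AbstractSet[str],  # segments anchored in the snapshot
    admitted_seals: AbstractSet[str],     # segments sealed in the log prefix
    sealed_blocks: AbstractSet[str],      # blocks whose bytes are immutable
) -> bool:
    # Condition 1: the segment is snapshot-anchored or log-admitted.
    segment_admitted = (
        entry.segment_id in snapshot_segments
        or entry.segment_id in admitted_seals
    )
    # Condition 2: every referenced block is immutable.
    bytes_immutable = entry.block_ids <= sealed_blocks
    return segment_admitted and bytes_immutable


entry = IndexEntry("seg-2", frozenset({"blk-7"}))
seen = is_visible(entry, {"seg-1"}, {"seg-2"}, {"blk-7"})    # True
unsealed = is_visible(entry, {"seg-1"}, set(), {"blk-7"})    # False
mutable = is_visible(entry, {"seg-1"}, {"seg-2"}, set())     # False
```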
---
## 6. Snapshot and Log Semantics
Snapshots provide a base mapping of sealed segments; the append-only log admits later segment seals and policy records that define subsequent changes.
The index state for a given CURRENT is defined as:
```
Index(CURRENT) = Index(snapshot) + replay(log_prefix)
```
Replay is strictly ordered, deterministic, and idempotent. Snapshot and log entries are semantically equivalent once replayed.
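
Informative sketch (non-normative): a deliberately flattened model of replay, in which each log record carries one `(reference, location)` pair instead of a segment seal. The names `replay`, `snapshot`, and `log` are assumptions. The sketch shows the three properties named above: order, determinism, and idempotence.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (reference, location), simplified


def replay(base: Dict[str, str], log_prefix: List[Record]) -> Dict[str, str]:
    """Return a new state; the base mapping is never modified."""
    state = dict(base)
    for reference, location in log_prefix:  # strict log order
        state[reference] = location
    return state


snapshot = {"ref-a": "loc-1"}
log = [("ref-b", "loc-2"), ("ref-a", "loc-3")]

current = replay(snapshot, log)

# Idempotent: applying the same prefix to its own result changes nothing.
again = replay(current, log)

# Taking a snapshot partway through and replaying the rest gives the
# same CURRENT as replaying the whole prefix from the original snapshot.
midpoint = replay(snapshot, log[:1])
resumed = replay(midpoint, log[1:])
```

In this example `current`, `again`, and `resumed` are all equal to `{"ref-a": "loc-3", "ref-b": "loc-2"}`.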
---
## 7. Immutability and Shadowing
### 7.1 Immutability
* Index entries are never mutated.
* Once visible, an entry's meaning does not change.
* Referenced bytes are immutable for the lifetime of the entry.
### 7.2 Shadowing
* Later entries MAY shadow earlier entries with the same Reference.
* Precedence is determined solely by log order.
* Snapshot boundaries do not alter shadowing semantics.
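
Informative sketch (non-normative): shadowing can be resolved without mutating anything by keeping every entry in order and scanning from newest to oldest. The tuple-based entry shape and the `lookup` function are assumptions for illustration.

```python
from typing import Optional, Tuple

Entry = Tuple[str, str]  # (reference, location), simplified


def lookup(entries: Tuple[Entry, ...], reference: str) -> Optional[str]:
    """Return the location from the latest entry for the reference."""
    for ref, location in reversed(entries):
        if ref == reference:
            return location
    return None


snapshot_entries = (("ref-a", "loc-1"),)
log_entries = (("ref-a", "loc-2"), ("ref-b", "loc-3"))

# Entries are appended, never edited; the older ref-a entry still exists.
ordered = snapshot_entries + log_entries
winner = lookup(ordered, "ref-a")   # "loc-2": the later entry shadows

# Moving the snapshot boundary does not change the overall order,
# so it does not change which entry wins.
moved = (snapshot_entries + log_entries[:1]) + log_entries[1:]
same = lookup(moved, "ref-a")       # "loc-2"
```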
---
## 8. Tombstones (Optional)
Tombstone entries MAY be used to invalidate prior mappings.
* A tombstone shadows earlier entries for the same Reference.
* Visibility rules are identical to regular entries.
* Encoding is optional and defined by ENC-ASL-CORE-INDEX if used.
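
Informative sketch (non-normative): a tombstone modeled as a record with no location. The record shape `(kind, reference, location)` and the `effective_location` function are hypothetical. The last case, a later mapping shadowing an earlier tombstone, is an inference from the rule that precedence follows log order, not something this section states explicitly.

```python
from typing import List, Optional, Tuple

# kind is "put" or "tombstone"; a tombstone carries no location.
Record = Tuple[str, str, Optional[str]]


def effective_location(records: List[Record], reference: str) -> Optional[str]:
    """Resolve a reference against records in log order."""
    result: Optional[str] = None
    for kind, ref, location in records:
        if ref != reference:
            continue
        if kind == "tombstone":
            result = None        # shadows every earlier mapping
        else:
            result = location    # shadows earlier entries, tombstones included
    return result


records: List[Record] = [
    ("put", "ref-a", "loc-1"),
    ("tombstone", "ref-a", None),
    ("put", "ref-b", "loc-2"),
    ("tombstone", "ref-b", None),
    ("put", "ref-b", "loc-3"),
]

gone = effective_location(records, "ref-a")      # None: tombstoned
restored = effective_location(records, "ref-b")  # "loc-3"
```

This sketch returns `None` both for a tombstoned Reference and for one that was never mapped; an implementation that needs to tell the two apart would return a richer result.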
---
## 9. Determinism Guarantees
For fixed:
* StoreConfig
* Snapshot
* Log prefix
ASL/1-CORE-INDEX guarantees:
* Deterministic lookup results
* Deterministic shadowing resolution
* Deterministic visibility
---
## 10. Normative Invariants
Conforming implementations MUST enforce:
1. No visibility without a sealed segment whose seal record is log-admitted (or snapshot-anchored).
2. No mutation of visible index entries.
3. Referenced bytes remain immutable for the entry's lifetime.
4. Shadowing follows strict log order.
5. Snapshot + log replay uniquely defines CURRENT.
6. Visible ArtifactLocations are non-empty and byte-exact (no gaps, no overrun), except for tombstones which have no ArtifactLocation.
Violation of any invariant constitutes index corruption.
---
## 11. Relationship to Other Specifications
| Layer | Responsibility |
| ------------------ | ---------------------------------------------------------- |
| ASL/1-CORE | Artifact semantics and identity |
| ASL/1-STORE | StoreSnapshot and put/get logical model |
| ASL/1-CORE-INDEX | Semantic mapping of Reference → ArtifactLocation |
| ASL-STORE-INDEX | Lifecycle, replay, and visibility contracts |
| ENC-ASL-CORE-INDEX | On-disk encoding for index segments and records |
---
## 12. Summary
ASL/1-CORE-INDEX specifies the semantic meaning of the index:
* It maps artifact References to byte locations deterministically.
* It defines visibility and shadowing rules across snapshot + log replay.
* It guarantees immutability and deterministic lookup.
It answers one question:
> *Given a Reference and a CURRENT state, where are the bytes?*