# ASL/1-CORE-INDEX — Semantic Index Model
Status: Draft
Owner: Niklas Rydberg
Version: 0.1.0
SoT: No
Last Updated: 2025-11-16
Linked Phase Pack: N/A
Tags: [deterministic, index, semantics]
<!-- Source: /amduat-api/tier1/asl-core-index.md | Canonical: /amduat/tier1/asl-core-index-1.md -->
**Document ID:** `ASL/1-CORE-INDEX`
**Layer:** L0.5 — Semantic mapping over ASL/1-CORE values (no storage / encoding / lifecycle)
**Depends on (normative):**
* `ASL/1-CORE`
* `ASL/1-STORE`
**Informative references:**
* `ASL/STORE-INDEX/1` — store lifecycle and replay contracts
* `ENC/ASL-CORE-INDEX/1` — bytes-on-disk encoding profile
* `ASL/INDEX-ACCEL/1` — acceleration semantics (routing, filters, sharding)
* `ASL/LOG/1` — append-only semantic log (segment visibility)
* `TGK/1` — TGK edge visibility and traversal alignment
* `ASL/SYSTEM/1` — unified system view (PEL/TGK/federation alignment)
© 2025 Niklas Rydberg.
## License
Except where otherwise noted, this document (text and diagrams) is licensed under
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
Universal (CC0) to enable unrestricted reuse in implementations and derivative
specifications.
Code examples in this document are provided under the Apache License 2.0 unless
explicitly stated otherwise. Test vectors, where present, are dedicated to the
public domain under CC0 1.0.
---
## 0. Conventions
The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.
ASL/1-CORE-INDEX defines **semantic meaning only**. It does not define storage formats, on-disk encoding, or operational lifecycle. Those belong to `ASL/STORE-INDEX/1`, `ASL/LOG/1`, and `ENC/ASL-CORE-INDEX/1`.
---
## 1. Purpose & Non-Goals
### 1.1 Purpose
ASL/1-CORE-INDEX defines the **semantic model** for indexing artifacts:
* It specifies what it means to map an artifact identity to a byte location.
* It defines visibility, immutability, and shadowing semantics.
* It ensures deterministic lookup for a fixed snapshot and log position.
### 1.2 Non-goals
ASL/1-CORE-INDEX explicitly does **not** define:
* On-disk layouts, segment files, or memory representations.
* Block allocation, packing, GC, or lifecycle rules.
* Snapshot implementation details, checkpoints, or log storage.
* Performance optimizations (bloom filters, sharding, SIMD).
* Federation, provenance, or execution semantics.
---
## 2. Terminology
* **Artifact** — ASL/1 immutable value defined in ASL/1-CORE.
* **Reference** — ASL/1 content address of an Artifact (hash_id + digest).
* **StoreConfig** — `{ encoding_profile, hash_id }` fixed per StoreSnapshot (ASL/1-STORE).
* **Block** — immutable storage unit containing artifact bytes.
* **BlockID** — opaque identifier for a block.
* **ArtifactExtent** — `(BlockID, offset, length)` identifying a byte slice within a block.
* **ArtifactLocation** — ordered list of `ArtifactExtent` values that, when concatenated, produce the artifact bytes.
* **Degenerate store** — a store that treats each artifact as its own block,
with a single extent covering the entire blob.
* **Snapshot** — a checkpointed StoreSnapshot (ASL/1-STORE) used as a base state.
* **Append-Only Log** — ordered sequence of index-visible mutations after a snapshot.
* **CURRENT** — effective state after replaying a log position on a snapshot.
---
## 3. Core Mapping Semantics
### 3.1 Index Mapping
The index defines a semantic mapping:
```
Reference -> ArtifactLocation
```
For any visible `Reference`, there is exactly one `ArtifactLocation` at a given CURRENT state.
### 3.2 Determinism
For a fixed `{StoreConfig, Snapshot, LogPosition}`, lookup results MUST be deterministic. No nondeterministic input may affect index semantics.
### 3.3 StoreConfig Consistency
All references in an index view are interpreted under a fixed StoreConfig. Implementations MAY store only the digest portion in the index when `hash_id` is fixed by StoreConfig, but the semantic key is always a full `Reference`. Encoding profiles MUST allow variable-length digests; the digest length MUST be either explicit in the encoding or derivable from `hash_id` and StoreConfig.
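A minimal sketch of the digest-only key optimization described above. The `Reference` and `StoreConfig` classes, the integer `hash_id` values, and the `DIGEST_LEN` table are all hypothetical stand-ins chosen for illustration; real assignments come from the HashId registry and the encoding profile.

```python
from dataclasses import dataclass

# Hypothetical digest lengths per hash_id (illustration only; not the
# actual HashId registry).
DIGEST_LEN = {1: 32, 2: 64}

@dataclass(frozen=True)
class Reference:
    hash_id: int
    digest: bytes

@dataclass(frozen=True)
class StoreConfig:
    encoding_profile: int
    hash_id: int

def index_key(ref: Reference, config: StoreConfig) -> bytes:
    """Physical key: digest only, valid because hash_id is fixed by StoreConfig."""
    if ref.hash_id != config.hash_id:
        raise ValueError("reference hash_id does not match StoreConfig")
    if len(ref.digest) != DIGEST_LEN[ref.hash_id]:
        raise ValueError("digest length does not match hash_id")
    return ref.digest

def semantic_key(key: bytes, config: StoreConfig) -> Reference:
    """Recover the full Reference; the digest length is derived from hash_id."""
    if len(key) != DIGEST_LEN[config.hash_id]:
        raise ValueError("stored key has the wrong length for hash_id")
    return Reference(config.hash_id, key)

config = StoreConfig(encoding_profile=1, hash_id=1)
ref = Reference(hash_id=1, digest=bytes(32))
# Round trip: the stored digest plus StoreConfig yields the same Reference.
assert semantic_key(index_key(ref, config), config) == ref
```

In this sketch the digest length is derived from `hash_id`; an encoding that stores an explicit length prefix would satisfy the same requirement.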
---
## 4. ArtifactLocation Semantics
* An ArtifactLocation is an **ordered list** of ArtifactExtents.
* Each extent references immutable bytes within a block.
* The artifact bytes are defined by **concatenating extents in order**.
* A visible ArtifactLocation MUST be **non-empty** and MUST fully cover the artifact byte sequence with no gaps or extra bytes.
* Tombstone entries are visible but MUST have no ArtifactLocation; they only shadow prior entries.
* Extents MUST have `length > 0` and MUST reference valid byte ranges within their blocks.
* Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact.
* An ArtifactLocation is valid only while all referenced blocks are retained.
* ASL/1-CORE-INDEX does not define how blocks are allocated or sealed; it only requires that referenced bytes are immutable for the lifetime of the mapping.
* In a degenerate store, an ArtifactLocation consists of a single extent that
spans the full blob in its dedicated block.
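The rules above can be sketched as a small reader that validates each extent and concatenates in order. This is an illustration under stated assumptions: `ArtifactExtent` uses a string `block_id` purely for readability, `blocks` stands in for whatever retained-block lookup a store provides, and `expected_length` is a hypothetical input representing the known artifact size.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence

@dataclass(frozen=True)
class ArtifactExtent:
    block_id: str   # opaque BlockID; a string here only for illustration
    offset: int
    length: int

def read_artifact(location: Sequence[ArtifactExtent],
                  blocks: Mapping[str, bytes],
                  expected_length: int) -> bytes:
    """Concatenate extents in order, enforcing the rules in this section."""
    if not location:
        raise ValueError("a visible ArtifactLocation must be non-empty")
    parts = []
    for ext in location:
        if ext.length <= 0:
            raise ValueError("extent length must be > 0")
        block = blocks.get(ext.block_id)
        if block is None:
            raise ValueError(f"block {ext.block_id!r} is not retained")
        if ext.offset < 0 or ext.offset + ext.length > len(block):
            raise ValueError("extent exceeds block bounds")
        parts.append(block[ext.offset:ext.offset + ext.length])
    data = b"".join(parts)
    if len(data) != expected_length:
        raise ValueError("extents do not cover the artifact exactly")
    return data

blocks = {"b1": b"hello, world", "b2": b"!!"}
location = [
    ArtifactExtent("b1", 7, 5),   # b"world"
    ArtifactExtent("b2", 0, 1),   # b"!"
    ArtifactExtent("b1", 0, 5),   # b"hello"; the same block may appear again
]
assert read_artifact(location, blocks, expected_length=11) == b"world!hello"
```

A degenerate store fits the same shape: the location is a single extent `(block, 0, len(blob))`. A stricter reader could additionally verify the resulting bytes against the Reference digest, which this sketch leaves out.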
---
## 5. Visibility Model
An index entry is **visible** at CURRENT if and only if:
1. The entry is admitted by the store's visibility mechanism as defined in
`ASL/STORE-INDEX/1` (e.g., via sealed segments and an append-only log), for
the given snapshot/log position.
2. The referenced bytes are immutable (e.g., the underlying block is sealed by
store rules).
Visibility is binary; entries are either visible or not visible.
**Implementation note:** A store MAY implement a degenerate visibility
mechanism (e.g., a single implicit segment that is always sealed and a trivial
log position), which is sufficient for simple filesystem-backed stores such as
`asl_store_fs`.
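The two visibility conditions can be read as a simple predicate. The sketch below is one possible reading, not the mechanism defined by `ASL/STORE-INDEX/1`: `IndexEntry`, `StoreView`, and their fields are hypothetical names, and admission is modeled as plain set membership.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

@dataclass(frozen=True)
class IndexEntry:
    segment_id: str              # segment that carries the entry (assumed field)
    block_ids: Tuple[str, ...]   # blocks referenced by the entry's extents

@dataclass(frozen=True)
class StoreView:
    admitted_segments: FrozenSet[str]  # admitted at this snapshot/log position
    sealed_blocks: FrozenSet[str]      # blocks whose bytes can no longer change

def is_visible(entry: IndexEntry, view: StoreView) -> bool:
    """Binary visibility: both conditions hold, or the entry is not visible."""
    admitted = entry.segment_id in view.admitted_segments
    immutable = all(b in view.sealed_blocks for b in entry.block_ids)
    return admitted and immutable

view = StoreView(frozenset({"seg-1"}), frozenset({"blk-1", "blk-2"}))
assert is_visible(IndexEntry("seg-1", ("blk-1", "blk-2")), view)
assert not is_visible(IndexEntry("seg-2", ("blk-1",)), view)  # segment not admitted
assert not is_visible(IndexEntry("seg-1", ("blk-3",)), view)  # block not sealed
```

Under this model, a degenerate store corresponds to a view with one implicit segment that is always admitted and every block treated as sealed.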
---
## 6. Snapshot and Log Semantics
Snapshots provide a base mapping of sealed segments; the append-only log admits later segment seals and policy records that define subsequent changes.
The index state for a given CURRENT is defined as:
```
Index(CURRENT) = Index(snapshot) + replay(log_position)
```
Replay is strictly ordered, deterministic, and idempotent. Snapshot and log entries are semantically equivalent once replayed.
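A minimal sketch of replay as an ordered fold over a log prefix. It simplifies heavily: segment seals are flattened into individual `(reference, location)` pairs, both represented as opaque strings, and the log position is assumed to be a prefix length. None of these representations are prescribed by this specification.

```python
from typing import Dict, Iterable, Tuple

Entry = Tuple[str, str]   # (reference, location), opaque strings for illustration

def replay(base: Dict[str, str], log: Iterable[Entry]) -> Dict[str, str]:
    """Apply log entries in order on top of a copy of the base mapping."""
    state = dict(base)          # the base (snapshot) mapping is never modified
    for ref, loc in log:
        state[ref] = loc        # entries are applied strictly in log order
    return state

snapshot = {"ref-a": "loc-1"}
log = [("ref-b", "loc-2"), ("ref-a", "loc-3")]
log_position = 2                # assumed here to mean "first two log entries"

current = replay(snapshot, log[:log_position])

# Deterministic: the same snapshot and log prefix give the same mapping.
assert current == replay(snapshot, log[:log_position])
# Idempotent: re-applying the same prefix to the result changes nothing.
assert replay(current, log[:log_position]) == current
```

Because the result depends only on the base mapping and the ordered prefix, two replicas that agree on `{Snapshot, LogPosition}` compute the same CURRENT in this model.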
---
## 7. Immutability and Shadowing
### 7.1 Immutability
* Index entries are never mutated.
* Once visible, an entry's meaning does not change.
* Referenced bytes are immutable for the lifetime of the entry.
### 7.2 Shadowing
* Later entries MAY shadow earlier entries with the same Reference.
* Precedence is determined solely by log order.
* Snapshot boundaries do not alter shadowing semantics.
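The following sketch illustrates shadowing without mutation: entries are kept as an immutable, ordered sequence and lookup scans newest-first. Representing the snapshot as an ordered sequence of entries is an assumption made for the example, as are the string-typed references and locations.

```python
from typing import Optional, Sequence, Tuple

Entry = Tuple[str, str]  # (reference, location), opaque strings for illustration

def lookup(snapshot: Sequence[Entry],
           log: Sequence[Entry],
           ref: str) -> Optional[str]:
    """Resolve ref by scanning newest-first; no entry is ever rewritten."""
    for entry_ref, loc in reversed(tuple(snapshot) + tuple(log)):
        if entry_ref == ref:
            return loc          # the latest entry shadows all earlier ones
    return None

history = (("r1", "loc-A"), ("r2", "loc-B"), ("r1", "loc-C"))

# Moving the snapshot boundary k does not change what any lookup returns.
for k in range(len(history) + 1):
    assert lookup(history[:k], history[k:], "r1") == "loc-C"
    assert lookup(history[:k], history[k:], "r2") == "loc-B"
```

The loop checks every possible split of the same history into snapshot and log, which is one way to test that a snapshot implementation preserves log-order precedence.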
---
## 8. Tombstones (Optional)
Tombstones MAY be used to invalidate prior mappings.
* A tombstone shadows earlier entries for the same Reference.
* Tombstones are visibility policy records (see `ASL/LOG/1`) and are applied
during replay; they are not required to appear as index entries.
* If an encoding chooses to materialize tombstones in index segments, they MUST
have no `ArtifactLocation` and MUST follow the same visibility rules as other
entries.
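A small sketch of tombstones applied during replay. `Put` and `Tombstone` are hypothetical record types, not the record formats of `ASL/LOG/1`. The sketch also assumes that a mapping logged after a tombstone becomes visible again, which follows from log-order precedence but is not stated explicitly in this document.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Union

@dataclass(frozen=True)
class Put:
    ref: str
    location: str      # opaque stand-in for an ArtifactLocation

@dataclass(frozen=True)
class Tombstone:
    ref: str           # carries no ArtifactLocation

Record = Union[Put, Tombstone]

def replay(base: Dict[str, Optional[str]],
           log: Iterable[Record]) -> Dict[str, Optional[str]]:
    """Apply records in log order; a tombstone shadows earlier mappings."""
    state = dict(base)
    for rec in log:
        if isinstance(rec, Tombstone):
            state[rec.ref] = None          # shadowed: lookup finds no location
        else:
            state[rec.ref] = rec.location
    return state

def lookup(state: Dict[str, Optional[str]], ref: str) -> Optional[str]:
    """Return the location for ref, or None if absent or tombstoned."""
    return state.get(ref)

log = [Put("r1", "loc-A"), Tombstone("r1"), Put("r2", "loc-B")]
assert lookup(replay({}, log[:1]), "r1") == "loc-A"   # before the tombstone
assert lookup(replay({}, log), "r1") is None          # after the tombstone
assert lookup(replay({}, log), "r2") == "loc-B"       # other references unaffected
```

Storing `None` in the state corresponds to materializing the tombstone; an implementation could equally drop the key, since both give the same lookup result in this model.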
---
## 9. Determinism Guarantees
For fixed:
* StoreConfig
* Snapshot
* Log prefix
ASL/1-CORE-INDEX guarantees:
* Deterministic lookup results
* Deterministic shadowing resolution
* Deterministic visibility
---
## 10. Normative Invariants
Conforming implementations MUST enforce:
1. No visibility without a sealed segment whose seal record is log-admitted (or snapshot-anchored).
2. No mutation of visible index entries.
3. Referenced bytes remain immutable for the entry's lifetime.
4. Shadowing follows strict log order.
5. Snapshot + log replay uniquely defines CURRENT.
6. Visible ArtifactLocations are non-empty and byte-exact (no gaps, no overrun), except for tombstones which have no ArtifactLocation.
Violation of any invariant constitutes index corruption.
---
## 11. Relationship to Other Specifications
| Layer | Responsibility |
| ------------------ | ---------------------------------------------------------- |
| ASL/1-CORE | Artifact semantics and identity |
| ASL/1-STORE | StoreSnapshot and put/get logical model |
| ASL/1-CORE-INDEX | Semantic mapping of Reference → ArtifactLocation |
| ASL-STORE-INDEX | Lifecycle, replay, and visibility contracts |
| ENC-ASL-CORE-INDEX | On-disk encoding for index segments and records |
---
## 12. Summary
ASL/1-CORE-INDEX specifies the semantic meaning of the index:
* It maps artifact References to byte locations deterministically.
* It defines visibility and shadowing rules across snapshot + log replay.
* It guarantees immutability and deterministic lookup.
It answers one question:
> *Given a Reference and a CURRENT state, where are the bytes?*