# ASL/1-CORE-INDEX — Semantic Index Model

Status: Draft
Owner: Niklas Rydberg
Version: 0.1.0
SoT: No
Last Updated: 2025-11-16
Linked Phase Pack: N/A
Tags: [deterministic, index, semantics]

<!-- Source: /amduat-api/tier1/asl-core-index.md | Canonical: /amduat/tier1/asl-core-index-1.md -->

**Document ID:** `ASL/1-CORE-INDEX`
**Layer:** L0.5 — Semantic mapping over ASL/1-CORE values (no storage / encoding / lifecycle)

**Depends on (normative):**

* `ASL/1-CORE`
* `ASL/1-STORE`

**Informative references:**

* `ASL/STORE-INDEX/1` — store lifecycle and replay contracts
* `ENC/ASL-CORE-INDEX/1` — bytes-on-disk encoding profile
* `ASL/INDEX-ACCEL/1` — acceleration semantics (routing, filters, sharding)
* `ASL/LOG/1` — append-only semantic log (segment visibility)
* `TGK/1` — TGK edge visibility and traversal alignment
* `ASL/SYSTEM/1` — unified system view (PEL/TGK/federation alignment)

© 2025 Niklas Rydberg.

## License

Except where otherwise noted, this document (text and diagrams) is licensed under
the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
Universal (CC0) to enable unrestricted reuse in implementations and derivative
specifications.

Code examples in this document are provided under the Apache License 2.0 unless
explicitly stated otherwise. Test vectors, where present, are dedicated to the
public domain under CC0 1.0.

---

## 0. Conventions

The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.

ASL/1-CORE-INDEX defines **semantic meaning only**. It does not define storage formats, on-disk encoding, or operational lifecycle. Those belong to ASL-STORE-INDEX, ASL/LOG/1, and ENC-ASL-CORE-INDEX.

---

## 1. Purpose & Non-Goals

### 1.1 Purpose

ASL/1-CORE-INDEX defines the **semantic model** for indexing artifacts:

* It specifies what it means to map an artifact identity to a byte location.
* It defines visibility, immutability, and shadowing semantics.
* It ensures deterministic lookup for a fixed snapshot and log prefix.

### 1.2 Non-goals

ASL/1-CORE-INDEX explicitly does **not** define:

* On-disk layouts, segment files, or memory representations.
* Block allocation, packing, GC, or lifecycle rules.
* Snapshot implementation details, checkpoints, or log storage.
* Performance optimizations (bloom filters, sharding, SIMD).
* Federation, provenance, or execution semantics.

---

## 2. Terminology

* **Artifact** — ASL/1 immutable value defined in ASL/1-CORE.
* **Reference** — ASL/1 content address of an Artifact (hash_id + digest).
* **StoreConfig** — `{ encoding_profile, hash_id }` fixed per StoreSnapshot (ASL/1-STORE).
* **Block** — immutable storage unit containing artifact bytes.
* **BlockID** — opaque identifier for a block.
* **ArtifactExtent** — `(BlockID, offset, length)` identifying a byte slice within a block.
* **ArtifactLocation** — ordered list of `ArtifactExtent` values that, when concatenated, produce the artifact bytes.
* **Snapshot** — a checkpointed StoreSnapshot (ASL/1-STORE) used as a base state.
* **Append-Only Log** — ordered sequence of index-visible mutations after a snapshot.
* **CURRENT** — effective state after replaying a log prefix on a snapshot.
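
The terms above can be pictured as plain immutable value types. The following is a minimal sketch, not a normative layout: the class and field names, the use of `int` for `hash_id`, and the use of `bytes` for the opaque BlockID are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Reference:
    """Content address of an Artifact (hash_id + digest)."""
    hash_id: int   # assumed to be an integer; the real type is set by ASL/1-CORE
    digest: bytes


@dataclass(frozen=True)
class ArtifactExtent:
    """A byte slice (BlockID, offset, length) within one block."""
    block_id: bytes  # BlockID is opaque; bytes is an illustrative choice
    offset: int
    length: int


@dataclass(frozen=True)
class ArtifactLocation:
    """Ordered extents whose concatenation yields the artifact bytes."""
    extents: Tuple[ArtifactExtent, ...]

    def total_length(self) -> int:
        # Sum of extent lengths, i.e. the artifact size in bytes.
        return sum(e.length for e in self.extents)


# Hypothetical values, for illustration only.
ref = Reference(hash_id=1, digest=bytes.fromhex("ab" * 32))
loc = ArtifactLocation((
    ArtifactExtent(b"blk-A", 0, 100),
    ArtifactExtent(b"blk-B", 16, 28),
))
```

Frozen dataclasses are used here because index entries are never mutated; a tuple keeps the extent order fixed.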

---

## 3. Core Mapping Semantics

### 3.1 Index Mapping

The index defines a semantic mapping:

```
Reference -> ArtifactLocation
```

For any visible `Reference`, lookup yields exactly one `ArtifactLocation` at a given CURRENT state. A `Reference` whose most recent visible entry is a tombstone has no `ArtifactLocation` (see Section 8).

### 3.2 Determinism

For a fixed `{StoreConfig, Snapshot, LogPrefix}`, lookup results MUST be deterministic. No nondeterministic input may affect index semantics.

### 3.3 StoreConfig Consistency

All references in an index view are interpreted under a fixed StoreConfig. Implementations MAY store only the digest portion in the index when `hash_id` is fixed by StoreConfig, but the semantic key is always a full `Reference`. Encoding profiles MUST allow variable-length digests; the digest length MUST be either explicit in the encoding or derivable from `hash_id` and StoreConfig.
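
As an illustration of the digest-only storage option, the sketch below rebuilds the full semantic key from a stored digest and the StoreConfig. The `DIGEST_LENGTHS` table, its `hash_id` values, and the function name `full_reference` are hypothetical; real assignments come from the HashId registry, which this document does not reproduce.

```python
from dataclasses import dataclass

# Hypothetical table: hash_id -> digest length in bytes.
# These values are placeholders, not registry assignments.
DIGEST_LENGTHS = {0x0001: 32, 0x0002: 64}


@dataclass(frozen=True)
class StoreConfig:
    encoding_profile: int
    hash_id: int


@dataclass(frozen=True)
class Reference:
    hash_id: int
    digest: bytes


def full_reference(config: StoreConfig, stored_digest: bytes) -> Reference:
    """Rebuild the semantic key from a digest-only index record."""
    # The digest length is derived from hash_id, as Section 3.3 allows.
    expected = DIGEST_LENGTHS[config.hash_id]
    if len(stored_digest) != expected:
        raise ValueError(
            f"digest is {len(stored_digest)} bytes, expected {expected}"
        )
    return Reference(config.hash_id, stored_digest)
```

Whatever the physical record holds, comparisons and lookups in this sketch always go through the full `Reference`.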

---

## 4. ArtifactLocation Semantics

* An ArtifactLocation is an **ordered list** of ArtifactExtents.
* Each extent references immutable bytes within a block.
* The artifact bytes are defined by **concatenating extents in order**.
* A visible ArtifactLocation MUST be **non-empty** and MUST fully cover the artifact byte sequence with no gaps or extra bytes.
* Tombstone entries are visible but MUST have no ArtifactLocation; they only shadow prior entries.
* Extents MUST have `length > 0` and MUST reference valid byte ranges within their blocks.
* Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact.
* An ArtifactLocation is valid only while all referenced blocks are retained.
* ASL/1-CORE-INDEX does not define how blocks are allocated or sealed; it only requires that referenced bytes are immutable for the lifetime of the mapping.
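
The rules above can be sketched as a small resolver that checks each extent and concatenates the slices in order. This is an assumption-laden illustration: blocks are modelled as an in-memory `dict` from a string BlockID to `bytes`, and `read_artifact` is a hypothetical helper, not part of any ASL API.

```python
from typing import Dict, List, Tuple

# (block_id, offset, length); a simplified stand-in for ArtifactExtent.
Extent = Tuple[str, int, int]


def read_artifact(location: List[Extent], blocks: Dict[str, bytes]) -> bytes:
    """Resolve an ArtifactLocation to bytes, enforcing the extent rules."""
    if not location:
        raise ValueError("a visible ArtifactLocation must be non-empty")
    parts = []
    for block_id, offset, length in location:
        if block_id not in blocks:
            # The location is valid only while its blocks are retained.
            raise KeyError(f"block {block_id!r} is not retained")
        block = blocks[block_id]
        if length <= 0:
            raise ValueError("extent length must be > 0")
        if offset < 0 or offset + length > len(block):
            raise ValueError("extent lies outside its block")
        parts.append(block[offset:offset + length])
    # Artifact bytes are the extents concatenated in list order.
    return b"".join(parts)


# Hypothetical blocks and location, for illustration only.
blocks = {"b1": b"hello, ", "b2": b"xxworldxx"}
location = [("b1", 0, 7), ("b2", 2, 5)]
data = read_artifact(location, blocks)  # b"hello, world"
```

The same BlockID may appear in several extents; the result depends only on the extent order and the block contents.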

---

## 5. Visibility Model

An index entry is **visible** at CURRENT if and only if:

1. The entry is contained in a sealed segment whose seal record is admitted in the ordered log prefix for CURRENT (or anchored in the snapshot).
2. The referenced bytes are immutable (e.g., the underlying block is sealed by store rules).

Visibility is binary; entries are either visible or not visible.
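
Read as a predicate, the two conditions might look like the sketch below. The `Entry` shape, the set-based inputs, and the name `is_visible` are assumptions made for illustration; how seals are recorded and how blocks become sealed is left to the store and log specifications.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Entry:
    segment_id: str             # segment that contains this index entry
    block_ids: Tuple[str, ...]  # blocks holding the referenced bytes


def is_visible(
    entry: Entry,
    snapshot_segments: FrozenSet[str],  # segments anchored in the snapshot
    admitted_seals: FrozenSet[str],     # segments sealed within the log prefix
    sealed_blocks: FrozenSet[str],      # blocks whose bytes are immutable
) -> bool:
    # Condition 1: the segment seal is log-admitted or snapshot-anchored.
    admitted = (
        entry.segment_id in snapshot_segments
        or entry.segment_id in admitted_seals
    )
    # Condition 2: every referenced byte range lives in an immutable block.
    immutable = all(b in sealed_blocks for b in entry.block_ids)
    return admitted and immutable
```

The function returns a plain `bool`, matching the binary nature of visibility: there is no partially visible state.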

---

## 6. Snapshot and Log Semantics

Snapshots provide a base mapping of sealed segments; the append-only log admits later segment seals and policy records that define subsequent changes.

The index state for a given CURRENT is defined as:

```
Index(CURRENT) = Index(snapshot) + replay(log_prefix)
```

Replay is strictly ordered, deterministic, and idempotent. Snapshot and log entries are semantically equivalent once replayed.
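
A minimal sketch of the equation above, assuming the log prefix is reduced to an ordered list of sealed segment ids and each segment to an ordered list of `(reference, location)` pairs. The names `replay`, `segments`, and the string-typed keys are illustrative; policy records are omitted.

```python
from typing import Dict, List, Tuple

# One sealed segment: ordered (reference, location) pairs, simplified to strings.
Segment = List[Tuple[str, str]]


def replay(
    snapshot_index: Dict[str, str],
    log_prefix: List[str],
    segments: Dict[str, Segment],
) -> Dict[str, str]:
    """Compute Index(CURRENT) = Index(snapshot) + replay(log_prefix)."""
    index = dict(snapshot_index)  # never mutate the snapshot itself
    for segment_id in log_prefix:  # strictly in log order
        for reference, location in segments[segment_id]:
            index[reference] = location  # later entries take precedence
    return index


# Hypothetical data, for illustration only.
snapshot = {"ref-a": "loc-1"}
segments = {
    "seg-1": [("ref-b", "loc-2")],
    "seg-2": [("ref-a", "loc-3")],
}
log = ["seg-1", "seg-2"]

current = replay(snapshot, log, segments)
earlier = replay(snapshot, log[:1], segments)
```

In this sketch a shorter prefix yields an earlier CURRENT, and replaying the same prefix over its own result changes nothing, which is one way to read the idempotence requirement.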

---

## 7. Immutability and Shadowing

### 7.1 Immutability

* Index entries are never mutated.
* Once visible, an entry’s meaning does not change.
* Referenced bytes are immutable for the lifetime of the entry.

### 7.2 Shadowing

* Later entries MAY shadow earlier entries with the same Reference.
* Precedence is determined solely by log order.
* Snapshot boundaries do not alter shadowing semantics.

---

## 8. Tombstones (Optional)

Tombstone entries MAY be used to invalidate prior mappings.

* A tombstone shadows earlier entries for the same Reference.
* Visibility rules are identical to regular entries.
* Encoding is optional and defined by ENC-ASL-CORE-INDEX if used.
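
Shadowing and tombstones together amount to "the newest visible entry for a Reference wins". The sketch below assumes all visible entries (snapshot entries first, then log entries in log order) are available as one ordered list, and represents a tombstone as an entry whose location is `None`. Both choices, and the name `lookup`, are hypothetical.

```python
from typing import List, Optional, Tuple

# (reference, location); location None marks a tombstone in this sketch.
Entry = Tuple[str, Optional[str]]


def lookup(ordered_entries: List[Entry], reference: str) -> Optional[str]:
    """Return the location for `reference`, or None if absent or tombstoned."""
    # Scan newest to oldest: precedence is determined solely by order.
    for ref, location in reversed(ordered_entries):
        if ref == reference:
            return location  # a tombstone yields None and hides older entries
    return None


# Hypothetical entries, oldest first.
entries: List[Entry] = [
    ("r1", "loc-a"),
    ("r2", "loc-b"),
    ("r1", None),      # tombstone shadows ("r1", "loc-a")
    ("r2", "loc-c"),   # regular entry shadows ("r2", "loc-b")
]
```

With these entries, `lookup(entries, "r1")` returns `None` and `lookup(entries, "r2")` returns `"loc-c"`. Under the same ordering rule, a later regular entry for `r1` would shadow the tombstone in turn.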

---

## 9. Determinism Guarantees

For fixed:

* StoreConfig
* Snapshot
* Log prefix

ASL/1-CORE-INDEX guarantees:

* Deterministic lookup results
* Deterministic shadowing resolution
* Deterministic visibility

---

## 10. Normative Invariants

Conforming implementations MUST enforce:

1. No visibility without a sealed segment whose seal record is log-admitted (or snapshot-anchored).
2. No mutation of visible index entries.
3. Referenced bytes remain immutable for the entry’s lifetime.
4. Shadowing follows strict log order.
5. Snapshot + log replay uniquely defines CURRENT.
6. Visible ArtifactLocations are non-empty and byte-exact (no gaps, no overrun), except for tombstones which have no ArtifactLocation.

Violation of any invariant constitutes index corruption.

---

## 11. Relationship to Other Specifications

| Layer              | Responsibility                                             |
| ------------------ | ---------------------------------------------------------- |
| ASL/1-CORE         | Artifact semantics and identity                            |
| ASL/1-STORE        | StoreSnapshot and put/get logical model                    |
| ASL/1-CORE-INDEX   | Semantic mapping of Reference → ArtifactLocation           |
| ASL-STORE-INDEX    | Lifecycle, replay, and visibility contracts                |
| ENC-ASL-CORE-INDEX | On-disk encoding for index segments and records            |

---

## 12. Summary

ASL/1-CORE-INDEX specifies the semantic meaning of the index:

* It maps artifact References to byte locations deterministically.
* It defines visibility and shadowing rules across snapshot + log replay.
* It guarantees immutability and deterministic lookup.

It answers one question:

> *Given a Reference and a CURRENT state, where are the bytes?*