amduat-api/tier1/asl-core-index.md
2026-01-17 07:05:11 +01:00

ASL/1-CORE-INDEX — Semantic Index Model

Status: Draft
Owner: Niklas Rydberg
Version: 0.1.0
SoT: No
Last Updated: 2025-11-16
Tags: [deterministic, index, semantics]

Document ID: ASL/1-CORE-INDEX
Layer: L0.5, semantic mapping over ASL/1-CORE values (no storage / encoding / lifecycle)

Depends on (normative):

  • ASL/1-CORE
  • ASL/1-STORE

Informative references:

  • ASL-STORE-INDEX — store lifecycle and replay contracts
  • ENC-ASL-CORE-INDEX — bytes-on-disk encoding profile (tier1/enc-asl-core-index.md)
  • ASL/INDEX-ACCEL/1 — acceleration semantics (routing, filters, sharding)
  • ASL/LOG/1 — append-only semantic log (segment visibility)

0. Conventions

The key words MUST, MUST NOT, REQUIRED, SHOULD, and MAY are to be interpreted as in RFC 2119.

ASL/1-CORE-INDEX defines semantic meaning only. It does not define storage formats, on-disk encoding, or operational lifecycle. Those belong to ASL-STORE-INDEX, ASL/LOG/1, and ENC-ASL-CORE-INDEX.


1. Purpose & Non-Goals

1.1 Purpose

ASL/1-CORE-INDEX defines the semantic model for indexing artifacts:

  • It specifies what it means to map an artifact identity to a byte location.
  • It defines visibility, immutability, and shadowing semantics.
  • It ensures deterministic lookup for a fixed snapshot and log prefix.

1.2 Non-goals

ASL/1-CORE-INDEX explicitly does not define:

  • On-disk layouts, segment files, or memory representations.
  • Block allocation, packing, GC, or lifecycle rules.
  • Snapshot implementation details, checkpoints, or log storage.
  • Performance optimizations (bloom filters, sharding, SIMD).
  • Federation, provenance, or execution semantics.

2. Terminology

  • Artifact — ASL/1 immutable value defined in ASL/1-CORE.
  • Reference — ASL/1 content address of an Artifact (hash_id + digest).
  • StoreConfig{ encoding_profile, hash_id } fixed per StoreSnapshot (ASL/1-STORE).
  • Block — immutable storage unit containing artifact bytes.
  • BlockID — opaque identifier for a block.
  • ArtifactExtent(BlockID, offset, length) identifying a byte slice within a block.
  • ArtifactLocation — ordered list of ArtifactExtent values that, when concatenated, produce the artifact bytes.
  • Snapshot — a checkpointed StoreSnapshot (ASL/1-STORE) used as a base state.
  • Append-Only Log — ordered sequence of index-visible mutations after a snapshot.
  • CURRENT — effective state after replaying a log prefix on a snapshot.
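
The terms above can be pictured as plain value types. The following is a minimal Python sketch, not a normative representation: the field names, the integer `hash_id`, and the string `block_id` are assumptions made for illustration (the specification only requires that BlockID be opaque).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Reference:
    """Content address of an Artifact: hash_id + digest."""
    hash_id: int
    digest: bytes


@dataclass(frozen=True)
class ArtifactExtent:
    """A byte slice (offset, length) within the block named by block_id."""
    block_id: str  # BlockID is opaque; a string is an assumption of this sketch
    offset: int
    length: int


@dataclass(frozen=True)
class ArtifactLocation:
    """Ordered extents whose concatenation yields the artifact bytes."""
    extents: Tuple[ArtifactExtent, ...]

    def total_length(self) -> int:
        # Sum of extent lengths equals the artifact length for a valid location.
        return sum(e.length for e in self.extents)


# Example: an artifact of 16 bytes split across two blocks.
ref = Reference(hash_id=1, digest=bytes.fromhex("ab" * 32))
loc = ArtifactLocation((
    ArtifactExtent("blk-1", offset=0, length=10),
    ArtifactExtent("blk-2", offset=4, length=6),
))
```

Frozen dataclasses are used so that values are hashable and cannot be changed after construction, which mirrors the immutability the model relies on.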

3. Core Mapping Semantics

3.1 Index Mapping

The index defines a semantic mapping:

Reference -> ArtifactLocation

For any visible Reference, there is exactly one ArtifactLocation at a given CURRENT state.

3.2 Determinism

For a fixed {StoreConfig, Snapshot, LogPrefix}, lookup results MUST be deterministic. Nondeterministic inputs MUST NOT affect index semantics.

3.3 StoreConfig Consistency

All references in an index view are interpreted under a fixed StoreConfig. Implementations MAY store only the digest portion in the index when hash_id is fixed by StoreConfig, but the semantic key is always a full Reference. Encoding profiles MUST allow variable-length digests; the digest length MUST be either explicit in the encoding or derivable from hash_id and StoreConfig.
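
As an illustration of the digest-only storage option, the sketch below converts between a full Reference (the semantic key) and a stored key that omits `hash_id`. The function names `storage_key` and `semantic_key` and the integer field types are hypothetical; digest length is carried implicitly by the `bytes` value, which is one way of making it explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreConfig:
    encoding_profile: int
    hash_id: int


@dataclass(frozen=True)
class Reference:
    hash_id: int
    digest: bytes  # variable length; len(digest) is the explicit length


def storage_key(ref: Reference, config: StoreConfig) -> bytes:
    """Drop hash_id from the stored key; it is implied by StoreConfig."""
    if ref.hash_id != config.hash_id:
        raise ValueError("reference hash_id does not match StoreConfig")
    return ref.digest


def semantic_key(stored: bytes, config: StoreConfig) -> Reference:
    """Recover the full Reference from a digest-only stored key."""
    return Reference(config.hash_id, stored)


config = StoreConfig(encoding_profile=1, hash_id=1)
ref = Reference(hash_id=1, digest=b"\x12" * 32)
round_tripped = semantic_key(storage_key(ref, config), config)
```

The round trip returns the original Reference, so lookups can still be expressed in terms of the full semantic key even when only digests are stored.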


4. ArtifactLocation Semantics

  • An ArtifactLocation is an ordered list of ArtifactExtents.
  • Each extent references immutable bytes within a block.
  • The artifact bytes are defined by concatenating extents in order.
  • A visible ArtifactLocation MUST be non-empty and MUST fully cover the artifact byte sequence with no gaps or extra bytes.
  • Tombstone entries are visible but MUST have no ArtifactLocation; they only shadow prior entries.
  • Extents MUST have length > 0 and MUST reference valid byte ranges within their blocks.
  • Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact.
  • An ArtifactLocation is valid only while all referenced blocks are retained.
  • ASL/1-CORE-INDEX does not define how blocks are allocated or sealed; it only requires that referenced bytes are immutable for the lifetime of the mapping.
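
The rules above can be sketched as a validation step followed by in-order concatenation. This is an illustrative Python sketch only: blocks are modelled as an in-memory `dict` of bytes, extents as `(block_id, offset, length)` tuples, and the names `validate_location` and `read_artifact` are hypothetical.

```python
from typing import Dict, List, Tuple

Extent = Tuple[str, int, int]  # (block_id, offset, length)


def validate_location(extents: List[Extent], blocks: Dict[str, bytes],
                      artifact_length: int) -> None:
    """Raise ValueError if the location breaks a rule from this section."""
    if not extents:
        raise ValueError("visible ArtifactLocation must be non-empty")
    total = 0
    for block_id, offset, length in extents:
        if length <= 0:
            raise ValueError("extent length must be > 0")
        block = blocks.get(block_id)
        if block is None:
            raise ValueError(f"block {block_id!r} is not retained")
        if offset < 0 or offset + length > len(block):
            raise ValueError("extent is outside its block")
        total += length
    if total != artifact_length:
        raise ValueError("extents do not cover the artifact exactly")


def read_artifact(extents: List[Extent], blocks: Dict[str, bytes]) -> bytes:
    """Concatenate extents in order to produce the artifact bytes."""
    return b"".join(blocks[b][o:o + n] for b, o, n in extents)


blocks = {"blk-1": b"hello, world", "blk-2": b"....!...."}
# The same block may appear more than once; order defines the result.
location = [("blk-1", 0, 5), ("blk-1", 5, 7), ("blk-2", 4, 1)]
validate_location(location, blocks, artifact_length=13)
data = read_artifact(location, blocks)
```

Here `data` is the 13-byte sequence `b"hello, world!"`. An empty extent list, a zero-length extent, or an extent that runs past the end of its block is rejected before any bytes are read.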

5. Visibility Model

An index entry is visible at CURRENT if and only if:

  1. The entry is contained in a sealed segment whose seal record is admitted in the ordered log prefix for CURRENT (or anchored in the snapshot).
  2. The referenced bytes are immutable (e.g., the underlying block is sealed by store rules).

Visibility is binary; entries are either visible or not visible.
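
The two conditions can be read as a boolean predicate. The sketch below assumes a simplified model in which the log is reduced to an ordered list of sealed segment identifiers and CURRENT is selected by a prefix length; the `Entry` fields and the `is_visible` signature are hypothetical.

```python
from dataclasses import dataclass
from typing import AbstractSet, FrozenSet, Sequence


@dataclass(frozen=True)
class Entry:
    reference: str
    segment_id: str             # segment that contains this entry
    block_ids: FrozenSet[str]   # blocks referenced by the entry's extents


def is_visible(entry: Entry,
               snapshot_segments: AbstractSet[str],
               sealed_in_log: Sequence[str],
               prefix_len: int,
               sealed_blocks: AbstractSet[str]) -> bool:
    """Return True only if both visibility conditions hold at CURRENT."""
    admitted = set(sealed_in_log[:prefix_len])
    # Condition 1: segment seal admitted in the log prefix or anchored in the snapshot.
    segment_ok = entry.segment_id in snapshot_segments or entry.segment_id in admitted
    # Condition 2: every referenced block is immutable (sealed).
    bytes_ok = entry.block_ids <= sealed_blocks
    return segment_ok and bytes_ok


entry = Entry("ref-a", "seg-2", frozenset({"blk-7"}))
seal_log = ["seg-1", "seg-2", "seg-3"]
before = is_visible(entry, set(), seal_log, 1, {"blk-7"})  # seg-2 not yet admitted
after = is_visible(entry, set(), seal_log, 2, {"blk-7"})   # seg-2 admitted
```

The result is a plain boolean, matching the statement that visibility is binary: extending the log prefix by one seal record flips the entry from not visible to visible.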


6. Snapshot and Log Semantics

Snapshots provide a base mapping of sealed segments; the append-only log admits later segment seals and policy records that define subsequent changes.

The index state for a given CURRENT is defined as:

Index(CURRENT) = Index(snapshot) + replay(log_prefix)

Replay is strictly ordered, deterministic, and idempotent. Once replayed, an entry has the same meaning whether it was anchored in the snapshot or admitted through the log.
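
A minimal sketch of this composition, assuming the index is reduced to a dictionary and each log record to a `(reference, location)` pair (both simplifications, as are the names `replay`, `Index`, and `LogRecord`):

```python
from typing import Dict, List, Tuple

Index = Dict[str, str]       # reference -> location (both simplified to strings)
LogRecord = Tuple[str, str]  # (reference, location) admitted by a segment seal


def replay(base: Index, log_prefix: List[LogRecord]) -> Index:
    """Compute the effective mapping: base state plus ordered log records."""
    current = dict(base)                    # the base state itself is not modified
    for reference, location in log_prefix:  # strict log order
        current[reference] = location       # later records take precedence
    return current


snapshot = {"ref-a": "loc-1"}
log = [("ref-b", "loc-2"), ("ref-a", "loc-3")]

current = replay(snapshot, log)
# Idempotent: replaying the same prefix again changes nothing.
again = replay(current, log)
# Checkpointing after the first record and replaying the rest gives the same state.
via_checkpoint = replay(replay(snapshot, log[:1]), log[1:])
```

All three results are equal, which illustrates why a checkpoint taken partway through the log can stand in for the records it absorbed. The dictionary models only the effective mapping; the underlying entries are never rewritten.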


7. Immutability and Shadowing

7.1 Immutability

  • Index entries are never mutated.
  • Once visible, an entry's meaning does not change.
  • Referenced bytes are immutable for the lifetime of the entry.

7.2 Shadowing

  • Later entries MAY shadow earlier entries with the same Reference.
  • Precedence is determined solely by log order.
  • Snapshot boundaries do not alter shadowing semantics.
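
Shadowing can be sketched as a newest-first scan over immutable entries. In this illustrative Python model the snapshot and log are both plain ordered lists of `(reference, location)` pairs, and `lookup` is a hypothetical helper.

```python
from typing import List, Optional, Tuple

Entry = Tuple[str, str]  # (reference, location)


def lookup(snapshot_entries: List[Entry], log_entries: List[Entry],
           reference: str) -> Optional[str]:
    """Return the location from the latest entry for reference, if any."""
    ordered = snapshot_entries + log_entries  # snapshot first, then log order
    for ref, location in reversed(ordered):   # newest entry wins
        if ref == reference:
            return location
    return None


# Two entries for ref-a: the later one shadows the earlier one.
result = lookup([("ref-a", "loc-1")],
                [("ref-a", "loc-2"), ("ref-b", "loc-3")],
                "ref-a")

# Moving the snapshot boundary past the second entry does not change the answer.
moved = lookup([("ref-a", "loc-1"), ("ref-a", "loc-2")],
               [("ref-b", "loc-3")],
               "ref-a")
```

Both calls return `"loc-2"`. The earlier entry is still present and unmodified; it is simply no longer the one that lookup resolves to.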

8. Tombstones (Optional)

Tombstone entries MAY be used to invalidate prior mappings.

  • A tombstone shadows earlier entries for the same Reference.
  • Visibility rules are identical to regular entries.
  • Encoding is optional and defined by ENC-ASL-CORE-INDEX if used.
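
To illustrate how a tombstone differs from a Reference that was never indexed, the sketch below returns a three-way result. Representing a tombstone as an entry whose location is `None`, and the state names `"present"`, `"tombstoned"`, and `"absent"`, are assumptions of this example.

```python
from typing import List, Optional, Tuple

# (reference, location); a location of None marks a tombstone in this sketch.
Entry = Tuple[str, Optional[str]]


def resolve(entries: List[Entry], reference: str) -> Tuple[str, Optional[str]]:
    """Resolve reference against entries given in log order."""
    for ref, location in reversed(entries):  # newest entry wins
        if ref == reference:
            if location is None:
                return ("tombstoned", None)  # shadows all earlier mappings
            return ("present", location)
    return ("absent", None)


entries: List[Entry] = [
    ("ref-a", "loc-1"),
    ("ref-a", None),      # tombstone for ref-a
    ("ref-b", "loc-2"),
]

state_a = resolve(entries, "ref-a")
state_b = resolve(entries, "ref-b")
state_c = resolve(entries, "ref-c")

# A later regular entry shadows the tombstone in turn.
state_a_again = resolve(entries + [("ref-a", "loc-9")], "ref-a")
```

A tombstone follows the same ordering rule as any other entry, so a later regular entry for the same Reference makes it resolvable again.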

9. Determinism Guarantees

For fixed:

  • StoreConfig
  • Snapshot
  • Log prefix

ASL/1-CORE-INDEX guarantees:

  • Deterministic lookup results
  • Deterministic shadowing resolution
  • Deterministic visibility

10. Normative Invariants

Conforming implementations MUST enforce:

  1. No visibility without a sealed segment whose seal record is log-admitted (or snapshot-anchored).
  2. No mutation of visible index entries.
  3. Referenced bytes remain immutable for the entry's lifetime.
  4. Shadowing follows strict log order.
  5. Snapshot + log replay uniquely defines CURRENT.
  6. Visible ArtifactLocations are non-empty and byte-exact (no gaps, no overrun), except for tombstones which have no ArtifactLocation.

Violation of any invariant constitutes index corruption.


11. Relationship to Other Specifications

Layer               Responsibility
------------------  ------------------------------------------------
ASL/1-CORE          Artifact semantics and identity
ASL/1-STORE         StoreSnapshot and put/get logical model
ASL/1-CORE-INDEX    Semantic mapping of Reference → ArtifactLocation
ASL-STORE-INDEX     Lifecycle, replay, and visibility contracts
ENC-ASL-CORE-INDEX  On-disk encoding for index segments and records

12. Summary

ASL/1-CORE-INDEX specifies the semantic meaning of the index:

  • It maps artifact References to byte locations deterministically.
  • It defines visibility and shadowing rules across snapshot + log replay.
  • It guarantees immutability and deterministic lookup.

It answers one question:

Given a Reference and a CURRENT state, where are the bytes?