Carl Niklas Rydberg f2225f7a73 cleaning up index documents.

2026-01-17 06:29:58 +01:00

ASL/1-CORE-INDEX — Semantic Index Model

Status: Draft
Owner: Niklas Rydberg
Version: 0.1.0
SoT: No
Last Updated: 2025-11-16
Tags: [deterministic, index, semantics]

Document ID: ASL/1-CORE-INDEX
Layer: L0.5 — Semantic mapping over ASL/1-CORE values (no storage / encoding / lifecycle)

Depends on (normative):

ASL/1-CORE
ASL/1-STORE

Informative references:

ASL-STORE-INDEX — store lifecycle and replay contracts
ENC-ASL-CORE-INDEX — bytes-on-disk encoding profile (tier1/enc-asl-core-index.md)
ASL/INDEX-ACCEL/1 — acceleration semantics (routing, filters, sharding)
ASL/LOG/1 — append-only semantic log (segment visibility)

0. Conventions

The key words MUST, MUST NOT, REQUIRED, SHOULD, and MAY are to be interpreted as described in RFC 2119.

ASL/1-CORE-INDEX defines semantic meaning only. It does not define storage formats, on-disk encoding, or operational lifecycle. Those belong to ASL-STORE-INDEX, ASL/LOG/1, and ENC-ASL-CORE-INDEX.

1. Purpose & Non-Goals

1.1 Purpose

ASL/1-CORE-INDEX defines the semantic model for indexing artifacts:

It specifies what it means to map an artifact identity to a byte location.
It defines visibility, immutability, and shadowing semantics.
It ensures deterministic lookup for a fixed snapshot and log prefix.

1.2 Non-goals

ASL/1-CORE-INDEX explicitly does not define:

On-disk layouts, segment files, or memory representations.
Block allocation, packing, GC, or lifecycle rules.
Snapshot implementation details, checkpoints, or log storage.
Performance optimizations (bloom filters, sharding, SIMD).
Federation, provenance, or execution semantics.

2. Terminology

Artifact — ASL/1 immutable value defined in ASL/1-CORE.
Reference — ASL/1 content address of an Artifact (hash_id + digest).
StoreConfig — { encoding_profile, hash_id } fixed per StoreSnapshot (ASL/1-STORE).
Block — immutable storage unit containing artifact bytes.
BlockID — opaque identifier for a block.
ArtifactExtent — (BlockID, offset, length) identifying a byte slice within a block.
ArtifactLocation — ordered list of ArtifactExtent values that, when concatenated, produce the artifact bytes.
Snapshot — a checkpointed StoreSnapshot (ASL/1-STORE) used as a base state.
Append-Only Log — ordered sequence of index-visible mutations after a snapshot.
CURRENT — effective state after replaying a log prefix on a snapshot.
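
To make the vocabulary above concrete, the following is a minimal sketch of the terms as Python types. The field types are assumptions chosen for illustration only (an integer hash_id, a string BlockID); this specification treats BlockID as opaque and leaves all representations to the encoding layer.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Reference:
    hash_id: int   # identifies the hash algorithm (int is an assumed type)
    digest: bytes  # digest of the artifact bytes


@dataclass(frozen=True)
class ArtifactExtent:
    block_id: str  # BlockID is opaque; str is an arbitrary illustrative choice
    offset: int    # byte offset within the block
    length: int    # number of bytes in the slice


# An ArtifactLocation is an ordered sequence of extents.
ArtifactLocation = Tuple[ArtifactExtent, ...]

ref = Reference(hash_id=1, digest=bytes.fromhex("ab" * 32))
loc: ArtifactLocation = (
    ArtifactExtent("block-7", 0, 4096),
    ArtifactExtent("block-9", 128, 512),
)

# The artifact length is the sum of the extent lengths, in order.
total_length = sum(extent.length for extent in loc)
```

Frozen dataclasses are used here so that a Reference is hashable and can act as a mapping key, which mirrors the Reference -> ArtifactLocation mapping defined in section 3.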

3. Core Mapping Semantics

3.1 Index Mapping

The index defines a semantic mapping:

Reference -> ArtifactLocation

For any visible Reference, there is exactly one ArtifactLocation at a given CURRENT state.

3.2 Determinism

For a fixed {StoreConfig, Snapshot, LogPrefix}, lookup results MUST be deterministic. Nondeterministic inputs MUST NOT affect index semantics.

3.3 StoreConfig Consistency

All references in an index view are interpreted under a fixed StoreConfig. Implementations MAY store only the digest portion in the index when hash_id is fixed by StoreConfig, but the semantic key is always a full Reference.
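
The digest-only optimization permitted above might look like the following sketch. The class name DigestKeyedIndex and its put/get methods are hypothetical, not part of any ASL specification; the point is only that callers still present a full Reference while the stored key is the digest.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class Reference:
    hash_id: int
    digest: bytes


class DigestKeyedIndex:
    """Stores digests only; hash_id is implied by the fixed StoreConfig."""

    def __init__(self, store_hash_id: int) -> None:
        self._hash_id = store_hash_id
        self._by_digest: Dict[bytes, Any] = {}

    def put(self, ref: Reference, location: Any) -> None:
        # The semantic key is the full Reference, so a mismatched hash_id
        # is rejected rather than silently truncated to its digest.
        if ref.hash_id != self._hash_id:
            raise ValueError("Reference hash_id differs from StoreConfig")
        self._by_digest[ref.digest] = location

    def get(self, ref: Reference) -> Optional[Any]:
        # A Reference under another hash_id can never match in this view.
        if ref.hash_id != self._hash_id:
            return None
        return self._by_digest.get(ref.digest)


index = DigestKeyedIndex(store_hash_id=1)
ref = Reference(hash_id=1, digest=b"\x01" * 32)
index.put(ref, [("block-0", 0, 10)])
found = index.get(ref)
foreign = index.get(Reference(hash_id=2, digest=b"\x01" * 32))
```

Here a lookup with the same digest but a different hash_id returns nothing, which keeps the observable behaviour identical to an index keyed by full References.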

4. ArtifactLocation Semantics

An ArtifactLocation is an ordered list of ArtifactExtents.
Each extent references immutable bytes within a block.
The artifact bytes are defined by concatenating extents in order.
A visible ArtifactLocation MUST be non-empty and MUST fully cover the artifact byte sequence with no gaps or extra bytes.
Extents MUST have length > 0 and MUST reference valid byte ranges within their blocks.
Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact.
An ArtifactLocation is valid only while all referenced blocks are retained.
ASL/1-CORE-INDEX does not define how blocks are allocated or sealed; it only requires that referenced bytes are immutable for the lifetime of the mapping.
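
The rules in this section can be sketched as a single resolution function. This is an illustrative sketch only: the name read_artifact, the in-memory blocks dictionary, and the expected_length parameter are assumptions, since this specification does not say how blocks are accessed or how the artifact length is known.

```python
from typing import Dict, List, Tuple

Extent = Tuple[str, int, int]  # (block_id, offset, length)


def read_artifact(
    location: List[Extent],
    blocks: Dict[str, bytes],
    expected_length: int,
) -> bytes:
    """Concatenate extents in order, enforcing the section 4 rules."""
    if not location:
        raise ValueError("a visible ArtifactLocation must be non-empty")
    parts = []
    for block_id, offset, length in location:
        if length <= 0:
            raise ValueError("extent length must be > 0")
        block = blocks.get(block_id)
        if block is None:
            raise KeyError(f"block {block_id!r} is not retained")
        if offset < 0 or offset + length > len(block):
            raise ValueError("extent falls outside its block")
        parts.append(block[offset:offset + length])
    data = b"".join(parts)
    # No gaps and no extra bytes: the total must match exactly.
    if len(data) != expected_length:
        raise ValueError("extents do not cover the artifact exactly")
    return data


blocks = {"a": b"hello world", "b": b"XYZ"}
# The same block may appear more than once; order defines the result.
location = [("a", 0, 5), ("b", 1, 2), ("a", 5, 6)]
artifact = read_artifact(location, blocks, expected_length=13)
```

In this example the block "a" is referenced twice, and the result is determined entirely by the order of the extents.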

5. Visibility Model

An index entry is visible at CURRENT if and only if both of the following hold:

The entry is admitted in the ordered log prefix for CURRENT.
The referenced bytes are immutable (e.g., the underlying block is sealed by store rules).

Visibility is binary; entries are either visible or not visible.
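
As a rough illustration, the two visibility conditions can be written as a boolean predicate. The LogEntry fields, the use of a sequence number to model the log prefix, and the sealed_blocks set are all assumptions made for this sketch; the actual admission and sealing rules are defined by the store specifications.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class LogEntry:
    seq: int                    # position in the log (hypothetical field)
    block_ids: Tuple[str, ...]  # blocks referenced by the entry's extents


def is_visible(
    entry: LogEntry,
    prefix_len: int,
    sealed_blocks: FrozenSet[str],
) -> bool:
    # Condition 1: the entry lies within the log prefix that defines CURRENT.
    admitted = 0 <= entry.seq < prefix_len
    # Condition 2: every referenced block is sealed, so its bytes are immutable.
    immutable = all(b in sealed_blocks for b in entry.block_ids)
    return admitted and immutable


sealed = frozenset({"b1", "b2"})
e0 = LogEntry(seq=0, block_ids=("b1",))
e1 = LogEntry(seq=1, block_ids=("b1", "b3"))  # "b3" is not sealed
e2 = LogEntry(seq=2, block_ids=("b2",))
```

The predicate returns a plain boolean, reflecting that there is no partially visible state.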

6. Snapshot and Log Semantics

Snapshots provide a base mapping; the append-only log defines subsequent changes.

The index state for a given CURRENT is defined as:

Index(CURRENT) = Index(snapshot) + replay(log_prefix)

Replay is strictly ordered, deterministic, and idempotent. Snapshot and log entries are semantically equivalent once replayed.
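
One way to read the formula above is as an ordered fold of the log prefix over the snapshot mapping. The sketch below assumes simplified stand-ins (bytes for a Reference, a tuple of extents for an ArtifactLocation) and a hypothetical replay function; it is not the store's replay contract, which belongs to ASL-STORE-INDEX.

```python
from typing import Dict, List, Tuple

Ref = bytes
Location = Tuple[Tuple[str, int, int], ...]
LogEntry = Tuple[Ref, Location]


def replay(
    base: Dict[Ref, Location],
    log_prefix: List[LogEntry],
) -> Dict[Ref, Location]:
    """Apply log entries to a copy of the base mapping, strictly in order."""
    state = dict(base)  # the snapshot itself is never modified
    for ref, location in log_prefix:
        state[ref] = location
    return state


snapshot = {b"r1": (("blk-a", 0, 4),)}
log = [
    (b"r2", (("blk-b", 0, 8),)),
    (b"r1", (("blk-c", 16, 4),)),
]
current = replay(snapshot, log)

# Replaying the same prefix over its own result changes nothing.
again = replay(current, log)

# Folding part of the log into a new base gives the same CURRENT.
checkpointed = replay(replay(snapshot, log[:1]), log[1:])
```

Under these assumptions, `again` and `checkpointed` both equal `current`, which is the sense in which replay is idempotent and snapshot entries are equivalent to replayed log entries.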

7. Immutability and Shadowing

7.1 Immutability

Index entries are never mutated.
Once visible, an entry’s meaning does not change.
Referenced bytes are immutable for the lifetime of the entry.

7.2 Shadowing

Later entries MAY shadow earlier entries with the same Reference.
Precedence is determined solely by log order.
Snapshot boundaries do not alter shadowing semantics.
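
Shadowing by log order can be sketched as a newest-first scan of the log prefix that falls back to the snapshot. The lookup function and the string stand-ins for References and ArtifactLocations are illustrative assumptions, not a prescribed lookup strategy.

```python
from typing import Dict, List, Optional, Tuple


def lookup(
    ref: str,
    snapshot: Dict[str, str],
    log_prefix: List[Tuple[str, str]],
) -> Optional[str]:
    # The latest entry in log order wins, so scan newest-first.
    for entry_ref, location in reversed(log_prefix):
        if entry_ref == ref:
            return location
    # No log entry shadows the snapshot mapping.
    return snapshot.get(ref)


snapshot = {"ref-a": "loc-0"}
log = [("ref-a", "loc-1"), ("ref-b", "loc-2"), ("ref-a", "loc-3")]

winner = lookup("ref-a", snapshot, log)
earlier = lookup("ref-a", snapshot, log[:1])
base_only = lookup("ref-a", snapshot, [])

# Moving the snapshot boundary forward by two entries does not change
# the answer: the folded snapshot plus the remaining log resolves the same.
folded_snapshot = {"ref-a": "loc-1", "ref-b": "loc-2"}
after_checkpoint = lookup("ref-a", folded_snapshot, log[2:])
```

The earlier entries are left untouched in the log; a shorter prefix simply resolves to an earlier mapping.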

8. Tombstones (Optional)

Tombstone entries MAY be used to invalidate prior mappings.

A tombstone shadows earlier entries for the same Reference.
Visibility rules are identical to regular entries.
Encoding is optional and defined by ENC-ASL-CORE-INDEX if used.
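
A tombstone can be modelled as an ordinary log entry that carries no location. In the sketch below, representing a tombstone as an Entry whose location is None is purely an assumption for illustration; the actual encoding, if used, is defined by ENC-ASL-CORE-INDEX.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Entry:
    ref: str
    location: Optional[str]  # None marks a tombstone (assumed representation)


def current_mapping(
    snapshot: Dict[str, str],
    log_prefix: List[Entry],
) -> Dict[str, str]:
    """Replay entries in order; a tombstone removes the shadowed mapping."""
    state = dict(snapshot)
    for entry in log_prefix:
        if entry.location is None:
            state.pop(entry.ref, None)  # invalidate any earlier mapping
        else:
            state[entry.ref] = entry.location
    return state


log = [
    Entry("ref-a", "loc-1"),
    Entry("ref-a", None),      # tombstone for ref-a
    Entry("ref-b", "loc-2"),
]
before = current_mapping({}, log[:1])
after = current_mapping({}, log[:2])
```

Because a tombstone follows the same log-order rules as any other entry, a lookup against a prefix that ends before the tombstone still resolves the earlier mapping.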

9. Determinism Guarantees

For fixed:

StoreConfig
Snapshot
Log prefix

ASL/1-CORE-INDEX guarantees:

Deterministic lookup results
Deterministic shadowing resolution
Deterministic visibility

10. Normative Invariants

Conforming implementations MUST enforce:

No visibility without a log-admitted entry.
No mutation of visible index entries.
Referenced bytes remain immutable for the entry’s lifetime.
Shadowing follows strict log order.
Snapshot + log replay uniquely defines CURRENT.
Visible ArtifactLocations are non-empty and byte-exact (no gaps, no overrun).

Violation of any invariant constitutes index corruption.

11. Relationship to Other Specifications

Layer	Responsibility
ASL/1-CORE	Artifact semantics and identity
ASL/1-STORE	StoreSnapshot and put/get logical model
ASL/1-CORE-INDEX	Semantic mapping of Reference → ArtifactLocation
ASL-STORE-INDEX	Lifecycle, replay, and visibility contracts
ENC-ASL-CORE-INDEX	On-disk encoding for index segments and records

12. Summary

ASL/1-CORE-INDEX specifies the semantic meaning of the index:

It maps artifact References to byte locations deterministically.
It defines visibility and shadowing rules across snapshot + log replay.
It guarantees immutability and deterministic lookup.

It answers one question:

Given a Reference and a CURRENT state, where are the bytes?
