# ASL/1-CORE-INDEX — Semantic Index Model

Status: Draft
Owner: Niklas Rydberg
Version: 0.1.0
SoT: No
Last Updated: 2025-11-16
Tags: [deterministic, index, semantics]

**Document ID:** `ASL/1-CORE-INDEX`
**Layer:** L0.5 — Semantic mapping over ASL/1-CORE values (no storage / encoding / lifecycle)

**Depends on (normative):**

* `ASL/1-CORE`
* `ASL/1-STORE`

**Informative references:**

* `ASL-STORE-INDEX` — store lifecycle and replay contracts
* `ENC-ASL-CORE-INDEX` — bytes-on-disk encoding profile (`tier1/enc-asl-core-index.md`)
* `ASL/INDEX-ACCEL/1` — acceleration semantics (routing, filters, sharding)
* `ASL/LOG/1` — append-only semantic log (segment visibility)

---

## 0. Conventions

The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.

ASL/1-CORE-INDEX defines **semantic meaning only**. It does not define storage formats, on-disk encoding, or operational lifecycle. Those belong to ASL-STORE-INDEX, ASL/LOG/1, and ENC-ASL-CORE-INDEX.

---

## 1. Purpose & Non-Goals

### 1.1 Purpose

ASL/1-CORE-INDEX defines the **semantic model** for indexing artifacts:

* It specifies what it means to map an artifact identity to a byte location.
* It defines visibility, immutability, and shadowing semantics.
* It ensures deterministic lookup for a fixed snapshot and log prefix.

### 1.2 Non-goals

ASL/1-CORE-INDEX explicitly does **not** define:

* On-disk layouts, segment files, or memory representations.
* Block allocation, packing, GC, or lifecycle rules.
* Snapshot implementation details, checkpoints, or log storage.
* Performance optimizations (bloom filters, sharding, SIMD).
* Federation, provenance, or execution semantics.

---

## 2. Terminology

* **Artifact** — ASL/1 immutable value defined in ASL/1-CORE.
* **Reference** — ASL/1 content address of an Artifact (hash_id + digest).
* **StoreConfig** — `{ encoding_profile, hash_id }` fixed per StoreSnapshot (ASL/1-STORE).
* **Block** — immutable storage unit containing artifact bytes.
* **BlockID** — opaque identifier for a block.
* **ArtifactExtent** — `(BlockID, offset, length)` identifying a byte slice within a block.
* **ArtifactLocation** — ordered list of `ArtifactExtent` values that, when concatenated, produce the artifact bytes.
* **Snapshot** — a checkpointed StoreSnapshot (ASL/1-STORE) used as a base state.
* **Append-Only Log** — ordered sequence of index-visible mutations after a snapshot.
* **CURRENT** — effective state after replaying a log prefix on a snapshot.
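
The terms above can be pictured as a handful of immutable value types. The following is a minimal sketch, not a normative representation: the class and field names are illustrative, `BlockID` is opaque in this specification and is modelled as an `int` purely for convenience, and the `hash_id` value used is a placeholder.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Reference:
    hash_id: int   # hash algorithm identifier; its value space belongs to ASL/1-CORE
    digest: bytes  # digest of the Artifact under that algorithm


@dataclass(frozen=True)
class ArtifactExtent:
    block_id: int  # BlockID is opaque; an int is an illustrative stand-in
    offset: int    # byte offset within the block
    length: int    # number of bytes in the slice


@dataclass(frozen=True)
class ArtifactLocation:
    extents: Tuple[ArtifactExtent, ...]  # ordered; concatenation yields the artifact bytes

    def total_length(self) -> int:
        return sum(extent.length for extent in self.extents)


# Placeholder values, for illustration only.
location = ArtifactLocation((ArtifactExtent(7, 0, 4), ArtifactExtent(9, 16, 6)))
reference = Reference(hash_id=1, digest=bytes(32))
```

Frozen dataclasses are used so that instances are hashable and cannot be reassigned after construction, which mirrors the immutability the terms imply.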

---

## 3. Core Mapping Semantics

### 3.1 Index Mapping

The index defines a semantic mapping:

```
Reference -> ArtifactLocation
```

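For any `Reference` whose effective (unshadowed) entry at a given CURRENT state is a regular entry, the mapping yields exactly one `ArtifactLocation`. A `Reference` whose effective entry is a tombstone (Section 8) has no `ArtifactLocation`.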

### 3.2 Determinism

For a fixed `{StoreConfig, Snapshot, LogPrefix}`, lookup results MUST be deterministic. Nondeterministic inputs (for example wall-clock time or concurrency timing) MUST NOT affect index semantics.

### 3.3 StoreConfig Consistency

All references in an index view are interpreted under a fixed StoreConfig. Implementations MAY store only the digest portion in the index when `hash_id` is fixed by StoreConfig, but the semantic key is always a full `Reference`. Encoding profiles MUST allow variable-length digests; the digest length MUST be either explicit in the encoding or derivable from `hash_id` and StoreConfig.
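
One way to read this rule is that the stored key and the semantic key may differ in shape but never in meaning. The sketch below assumes a hypothetical `DIGEST_LENGTHS` registry and placeholder `hash_id` values; neither is defined by this specification. Locations are represented as plain strings to keep the example short.

```python
from typing import Dict, NamedTuple, Optional


class Reference(NamedTuple):
    hash_id: int
    digest: bytes


# Hypothetical registry of digest lengths in bytes, keyed by hash_id.
# The real hash_id assignments are defined outside this document.
DIGEST_LENGTHS: Dict[int, int] = {1: 32, 2: 64}


class DigestKeyedIndex:
    """Stores digest-only keys but is always queried with a full Reference."""

    def __init__(self, store_hash_id: int) -> None:
        self.store_hash_id = store_hash_id
        self._by_digest: Dict[bytes, str] = {}

    def _physical_key(self, ref: Reference) -> Optional[bytes]:
        if ref.hash_id != self.store_hash_id:
            return None  # a different hash_id is a different semantic key
        if len(ref.digest) != DIGEST_LENGTHS[ref.hash_id]:
            raise ValueError("digest length does not match hash_id")
        return ref.digest

    def put(self, ref: Reference, location: str) -> None:
        key = self._physical_key(ref)
        if key is None:
            raise ValueError("Reference does not belong to this StoreConfig")
        self._by_digest[key] = location

    def get(self, ref: Reference) -> Optional[str]:
        key = self._physical_key(ref)
        return None if key is None else self._by_digest.get(key)


index = DigestKeyedIndex(store_hash_id=1)
index.put(Reference(1, bytes(32)), "location-A")
```

Here a lookup with the same digest bytes but a different `hash_id` misses, because the digest-only storage is an optimization and not a change to the key.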

---

## 4. ArtifactLocation Semantics

* An ArtifactLocation is an **ordered list** of ArtifactExtents.
* Each extent references immutable bytes within a block.
* The artifact bytes are defined by **concatenating extents in order**.
* A visible ArtifactLocation MUST be **non-empty** and MUST fully cover the artifact byte sequence with no gaps or extra bytes.
* Tombstone entries are visible but MUST have no ArtifactLocation; they only shadow prior entries.
* Extents MUST have `length > 0` and MUST reference valid byte ranges within their blocks.
* Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact.
* An ArtifactLocation is valid only while all referenced blocks are retained.
* ASL/1-CORE-INDEX does not define how blocks are allocated or sealed; it only requires that referenced bytes are immutable for the lifetime of the mapping.
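
The extent rules above can be checked mechanically while assembling the artifact bytes. The following is a minimal sketch under stated assumptions: blocks are held in an in-memory `dict`, extents are plain `(block_id, offset, length)` tuples, and `expected_length` is a hypothetical caller-supplied value standing in for whatever mechanism an implementation uses to confirm exact coverage.

```python
from typing import Dict, List, Tuple

Extent = Tuple[int, int, int]  # (block_id, offset, length)


def read_artifact(
    blocks: Dict[int, bytes],
    location: List[Extent],
    expected_length: int,  # hypothetical: how exact coverage is known is out of scope here
) -> bytes:
    if not location:
        raise ValueError("a visible ArtifactLocation must be non-empty")
    parts = []
    for block_id, offset, length in location:
        if length <= 0:
            raise ValueError("extent length must be > 0")
        block = blocks.get(block_id)
        if block is None:
            raise KeyError(f"block {block_id} is not retained")
        if offset < 0 or offset + length > len(block):
            raise ValueError("extent exceeds block bounds")
        parts.append(block[offset:offset + length])
    data = b"".join(parts)  # concatenate extents in list order
    if len(data) != expected_length:
        raise ValueError("extents do not cover the artifact exactly")
    return data


blocks = {1: b"hello world", 2: b"XXabcXX"}
# Block 1 is referenced twice, which the rules above permit.
location = [(1, 0, 5), (2, 2, 3), (1, 5, 6)]
artifact = read_artifact(blocks, location, expected_length=14)
```

With these inputs `artifact` is `b"helloabc world"`: the extents are concatenated strictly in list order, regardless of which block each one comes from.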

---

## 5. Visibility Model

An index entry is **visible** at CURRENT if and only if:

1. The entry is contained in a sealed segment whose seal record is admitted in the ordered log prefix for CURRENT (or anchored in the snapshot).
2. The referenced bytes are immutable (e.g., the underlying block is sealed by store rules).

Visibility is binary; entries are either visible or not visible.
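
The two conditions can be expressed as a single boolean predicate. This is a sketch only: `IndexEntry`, `is_visible`, and the three set arguments are hypothetical names, and how an implementation actually tracks sealed segments and sealed blocks is left to ASL-STORE-INDEX and ASL/LOG/1.

```python
from dataclasses import dataclass
from typing import AbstractSet, FrozenSet


@dataclass(frozen=True)
class IndexEntry:
    segment_id: int             # segment that contains this entry
    block_ids: FrozenSet[int]   # blocks referenced by the entry's extents


def is_visible(
    entry: IndexEntry,
    snapshot_segments: AbstractSet[int],  # sealed segments anchored in the snapshot
    admitted_seals: AbstractSet[int],     # segments whose seal record is in the log prefix
    sealed_blocks: AbstractSet[int],      # blocks whose bytes are immutable
) -> bool:
    # Condition 1: the containing segment is snapshot-anchored or log-admitted.
    segment_ok = entry.segment_id in snapshot_segments or entry.segment_id in admitted_seals
    # Condition 2: every referenced block is immutable.
    bytes_ok = entry.block_ids <= sealed_blocks
    return segment_ok and bytes_ok


entry = IndexEntry(segment_id=3, block_ids=frozenset({10, 11}))
visible = is_visible(entry, snapshot_segments={1, 2}, admitted_seals={3}, sealed_blocks={10, 11, 12})
```

The function returns a plain `bool`, reflecting that there is no intermediate state between visible and not visible.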

---

## 6. Snapshot and Log Semantics

Snapshots provide a base mapping of sealed segments; the append-only log admits later segment seals and policy records that define subsequent changes.

The index state for a given CURRENT is defined as:

```
Index(CURRENT) = Index(snapshot) + replay(log_prefix)
```

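Replay is strictly ordered, deterministic, and idempotent: replaying the same log prefix again on the resulting state yields the same state. Once replayed, an entry contributed by the snapshot is semantically indistinguishable from one contributed by the log.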
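
The replay formula can be sketched as a fold over the log prefix. The example below is illustrative and makes simplifying assumptions: the index is a plain `dict`, References and locations are strings, each seal record is a hypothetical `(sequence_number, entries)` tuple, and policy records are omitted.

```python
from typing import Dict, Mapping, Sequence, Tuple

# A seal record admits one sealed segment, which carries ordered
# (reference, location) entries. Strings stand in for the real types.
SealRecord = Tuple[int, Sequence[Tuple[str, str]]]


def replay(base: Mapping[str, str], log_prefix: Sequence[SealRecord]) -> Dict[str, str]:
    index = dict(base)  # copy, so the base state is never modified
    last_seq = -1
    for seq, entries in log_prefix:
        if seq <= last_seq:
            raise ValueError("log prefix is not strictly ordered")
        last_seq = seq
        for ref, location in entries:
            index[ref] = location  # a later entry takes precedence
    return index


snapshot = {"ref-a": "blk1@0+4"}
log = [
    (1, [("ref-b", "blk2@0+8")]),
    (2, [("ref-a", "blk3@0+4")]),
]
current = replay(snapshot, log)
```

Under these assumptions, `replay(current, log)` equals `current`, and replaying `log[:1]` followed by `log[1:]` also equals `current`, which is the sense in which a snapshot taken at any point is interchangeable with the log prefix it summarizes.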

---

## 7. Immutability and Shadowing

### 7.1 Immutability

* Index entries are never mutated.
* Once visible, an entry’s meaning does not change.
* Referenced bytes are immutable for the lifetime of the entry.

### 7.2 Shadowing

* Later entries MAY shadow earlier entries with the same Reference.
* Precedence is determined solely by log order.
* Snapshot boundaries do not alter shadowing semantics.
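
To illustrate why snapshot boundaries do not matter, the sketch below treats the history as one ordered, append-only sequence of entries and resolves a Reference by scanning from the newest entry backwards. The `resolve` function and the string-typed entries are hypothetical; real entries live in sealed segments.

```python
from typing import Optional, Sequence, Tuple

Entry = Tuple[str, str]  # (reference, location)


def resolve(
    snapshot_entries: Sequence[Entry],
    log_entries: Sequence[Entry],
    ref: str,
) -> Optional[str]:
    # Entries are never edited; precedence comes only from position in the order.
    ordered = list(snapshot_entries) + list(log_entries)
    for entry_ref, location in reversed(ordered):
        if entry_ref == ref:
            return location
    return None


history = [("r", "old"), ("s", "other"), ("r", "new")]
# Place the snapshot boundary at every possible position in the history.
results = [resolve(history[:k], history[k:], "r") for k in range(len(history) + 1)]
```

Every element of `results` is `"new"`: wherever the snapshot is cut, the same entry wins, because only the total order of entries is consulted.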

---

## 8. Tombstones (Optional)

Tombstone entries MAY be used to invalidate prior mappings.

* A tombstone shadows earlier entries for the same Reference.
* Visibility rules are identical to those for regular entries.
* Encoding is optional and defined by ENC-ASL-CORE-INDEX if used.
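
A lookup that honours tombstones might distinguish three outcomes. The sketch below is an illustrative design choice, not something this specification mandates: the `Outcome` enum is hypothetical, a tombstone is modelled as an entry whose location is `None`, and a regular entry that follows a tombstone is assumed to shadow it under the log-order rule of Section 7.2.

```python
from enum import Enum
from typing import Iterable, Optional, Tuple


class Outcome(Enum):
    ABSENT = "absent"          # no visible entry for the Reference
    TOMBSTONED = "tombstoned"  # the effective entry is a tombstone
    FOUND = "found"            # the effective entry carries a location


Entry = Tuple[str, Optional[str]]  # a location of None marks a tombstone


def lookup(ordered_entries: Iterable[Entry], ref: str) -> Tuple[Outcome, Optional[str]]:
    outcome, location = Outcome.ABSENT, None
    for entry_ref, entry_location in ordered_entries:
        if entry_ref != ref:
            continue
        if entry_location is None:
            outcome, location = Outcome.TOMBSTONED, None
        else:
            outcome, location = Outcome.FOUND, entry_location
    return outcome, location


entries = [("r", "loc-1"), ("r", None)]
after_tombstone = lookup(entries, "r")
after_rewrite = lookup(entries + [("r", "loc-2")], "r")
```

Here `after_tombstone` is `(Outcome.TOMBSTONED, None)` and `after_rewrite` is `(Outcome.FOUND, "loc-2")`. Whether callers need to tell `ABSENT` apart from `TOMBSTONED` is a policy decision for the layers above.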

---

## 9. Determinism Guarantees

For fixed:

* StoreConfig
* Snapshot
* Log prefix

ASL/1-CORE-INDEX guarantees:

* Deterministic lookup results
* Deterministic shadowing resolution
* Deterministic visibility

---

## 10. Normative Invariants

Conforming implementations MUST enforce:

1. No visibility without a sealed segment whose seal record is log-admitted (or snapshot-anchored).
2. No mutation of visible index entries.
3. Referenced bytes remain immutable for the entry’s lifetime.
4. Shadowing follows strict log order.
5. Snapshot + log replay uniquely defines CURRENT.
6. Visible ArtifactLocations are non-empty and byte-exact (no gaps, no overrun), except for tombstones which have no ArtifactLocation.

Violation of any invariant constitutes index corruption.

---

## 11. Relationship to Other Specifications

| Layer              | Responsibility                                             |
| ------------------ | ---------------------------------------------------------- |
| ASL/1-CORE         | Artifact semantics and identity                            |
| ASL/1-STORE        | StoreSnapshot and put/get logical model                    |
| ASL/1-CORE-INDEX   | Semantic mapping of Reference → ArtifactLocation           |
| ASL-STORE-INDEX    | Lifecycle, replay, and visibility contracts                |
| ENC-ASL-CORE-INDEX | On-disk encoding for index segments and records            |

---

## 12. Summary

ASL/1-CORE-INDEX specifies the semantic meaning of the index:

* It maps artifact References to byte locations deterministically.
* It defines visibility and shadowing rules across snapshot + log replay.
* It guarantees immutability and deterministic lookup.

It answers one question:

> *Given a Reference and a CURRENT state, where are the bytes?*