# ASL-CORE-INDEX
### Semantic Addendum to ASL-CORE
---
## 1. Purpose
This document defines the **semantic model of the ASL index**, extending ASL-CORE artifact semantics to include **mapping artifacts to storage locations**.
The ASL index provides a **deterministic, snapshot-relative mapping** from artifact identities to byte locations within **immutable storage blocks**.
It specifies **what the index means**, not:
* How the index is stored or encoded
* How blocks are allocated or packed
* Performance optimizations
* Garbage collection or memory strategies
Those are handled by:
* **ASL-STORE-INDEX** (store semantics and contracts)
* **ENC-ASL-CORE-INDEX** (bytes-on-disk encoding)
---
## 2. Scope
This document defines:
* Logical structure of index entries
* Visibility rules
* Snapshot and log interaction
* Immutability and shadowing semantics
* Determinism guarantees
* Required invariants
It does **not** define:
* On-disk formats
* Index segmentation or sharding
* Bloom filters or probabilistic structures
* Memory residency
* Performance targets
---
## 3. Terminology
* **Artifact**: An immutable sequence of bytes managed by ASL.
* **ArtifactKey**: Opaque identifier for an artifact (typically a hash).
* **Block**: Immutable storage unit containing artifact bytes.
* **BlockID**: Opaque, unique identifier for a block.
* **ArtifactLocation**: Tuple `(BlockID, offset, length)` identifying bytes within a block.
* **Snapshot**: Checkpoint capturing a consistent base state of ASL-managed storage and metadata.
* **Append-Only Log**: Strictly ordered log of index-visible mutations occurring after a snapshot.
* **CURRENT**: The effective system state obtained by replaying the append-only log on top of a checkpoint snapshot.
---
## 4. Block Semantics
ASL-CORE introduces **blocks** minimally:
1. Blocks are **existential storage atoms** for artifact bytes.
2. Each block is uniquely identified by a **BlockID**.
3. Blocks are **immutable once sealed**.
4. Addressing: `(BlockID, offset, length) → bytes`.
5. No block layout, allocation, packing, or size semantics are defined at the core level.
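
The addressing rule in item 4 can be illustrated with a small in-memory sketch. The `Block` class, its `append`/`seal`/`read` methods, and the string-typed `block_id` are assumptions made purely for illustration; ASL-CORE leaves block layout and allocation undefined, and restricting reads to sealed blocks is a choice of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactLocation:
    """(BlockID, offset, length) tuple; the field types are illustrative."""
    block_id: str
    offset: int
    length: int


class Block:
    """Hypothetical in-memory block: writable until sealed, read-only after."""

    def __init__(self, block_id: str) -> None:
        self.block_id = block_id
        self._buffer = bytearray()
        self._sealed_bytes = None  # replaced by an immutable bytes object in seal()

    @property
    def sealed(self) -> bool:
        return self._sealed_bytes is not None

    def append(self, data: bytes) -> ArtifactLocation:
        if self.sealed:
            raise ValueError("sealed blocks are immutable")
        location = ArtifactLocation(self.block_id, len(self._buffer), len(data))
        self._buffer += data
        return location

    def seal(self) -> None:
        # Freeze the contents; a bytes object cannot be modified in place.
        self._sealed_bytes = bytes(self._buffer)

    def read(self, location: ArtifactLocation) -> bytes:
        # Addressing: (BlockID, offset, length) -> bytes
        if not self.sealed:
            raise ValueError("this sketch only resolves locations in sealed blocks")
        if location.block_id != self.block_id:
            raise ValueError("location refers to a different block")
        end = location.offset + location.length
        if location.offset < 0 or location.length < 0 or end > len(self._sealed_bytes):
            raise ValueError("location is out of range for this block")
        return self._sealed_bytes[location.offset:end]
```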
---
## 5. Core Semantic Mapping
The ASL index defines a mapping that is **total over visible artifact keys**:
```
ArtifactKey → ArtifactLocation
```
Semantic guarantees:
* Each visible `ArtifactKey` maps to exactly one `ArtifactLocation`.
* Mapping is **immutable once visible**.
* Mapping is **snapshot-relative**.
* Mapping is **deterministic** given `(snapshot, log prefix)`.
---
## 6. ArtifactLocation Semantics
* `block_id` references an ASL block.
* `offset` and `length` define bytes within the block.
* A location is valid only for the lifetime of the referenced block.
* No interpretation of bytes is implied.
---
## 7. Visibility Model
An index entry is **visible** if and only if:
1. The referenced block is sealed.
2. A corresponding log record exists.
3. The log record's position is ≤ the CURRENT replay position.
**Consequences**:
* Entries referencing unsealed blocks are invisible.
* Entries whose log records lie beyond the CURRENT replay position are invisible.
* Visibility is binary (no gradual exposure).
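
The three conditions can be read as a single boolean predicate. The sketch below is one possible rendering, assuming a hypothetical `LogRecord` type whose integer `position` stands for the record's place in the append-only log, and a set of sealed block identifiers supplied by the caller; none of these names come from the specification.

```python
from dataclasses import dataclass
from typing import AbstractSet, Optional


@dataclass(frozen=True)
class LogRecord:
    position: int       # assumed: ordinal position in the append-only log
    artifact_key: str
    block_id: str


def is_visible(record: Optional[LogRecord],
               sealed_blocks: AbstractSet[str],
               current_position: int) -> bool:
    """Return True only when all three visibility conditions hold."""
    if record is None:
        return False                          # no log record, no visibility
    if record.block_id not in sealed_blocks:
        return False                          # referenced block is not sealed
    return record.position <= current_position  # at or before CURRENT
```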
---
## 8. Snapshot and Log Semantics
* Snapshots act as **checkpoints**, not full state representations.
* Index state at any time:
```
Index(CURRENT) = Index(snapshot) + replay(log)
```
* Replay is strictly ordered, deterministic, and idempotent.
* Snapshot and log entries are semantically equivalent once replayed.
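
As a rough illustration of `Index(snapshot) + replay(log)`, the sketch below folds log entries into a copy of the snapshot's index. Representing the index as a dictionary, giving each log entry an integer position, and the `replay` signature itself are all assumptions of this example. Skipping entries already covered by the snapshot is what makes a repeated replay harmless in this model.

```python
from typing import Dict, Iterable, Mapping, Tuple

Location = Tuple[str, int, int]        # (block_id, offset, length)
LogEntry = Tuple[int, str, Location]   # (log position, artifact key, location)


def replay(snapshot_index: Mapping[str, Location],
           snapshot_position: int,
           log: Iterable[LogEntry],
           current_position: int) -> Dict[str, Location]:
    """Compute Index(CURRENT) from a snapshot and a log prefix."""
    index = dict(snapshot_index)  # never mutate the snapshot itself
    for position, key, location in sorted(log, key=lambda entry: entry[0]):
        if position <= snapshot_position:
            continue   # already folded into the snapshot
        if position > current_position:
            break      # beyond CURRENT: not replayed
        index[key] = location
    return index
```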
---
## 9. Immutability and Shadowing
### 9.1 Immutability
* Index entries are never mutated.
* Once visible, an entry's meaning never changes.
* Blocks referenced by entries are immutable.
### 9.2 Shadowing
* Later entries may shadow earlier entries with the same `ArtifactKey`.
* Precedence is determined by log order.
* Snapshot boundaries do not alter shadowing semantics.
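
A lookup that respects shadowing can be sketched as a scan from the newest entry to the oldest. The example assumes, for illustration only, that both the snapshot and the log are available as ordered lists of `(key, location)` pairs, with snapshot entries preceding log entries; the `lookup` function is hypothetical. Because only the overall order matters, moving the snapshot boundary does not change the result.

```python
from typing import Optional, Sequence, Tuple

Location = Tuple[str, int, int]   # (block_id, offset, length)
Entry = Tuple[str, Location]      # (artifact key, location), in log order


def lookup(key: str,
           snapshot_entries: Sequence[Entry],
           log_entries: Sequence[Entry]) -> Optional[Location]:
    """Return the location from the latest entry for `key`, or None."""
    ordered = list(snapshot_entries) + list(log_entries)
    for entry_key, location in reversed(ordered):
        if entry_key == key:
            return location   # the newest entry shadows all earlier ones
    return None


# The same history split at every possible snapshot boundary.
history = [("a", ("b1", 0, 4)), ("b", ("b1", 4, 2)), ("a", ("b2", 0, 4))]
results = {lookup("a", history[:cut], history[cut:]) for cut in range(len(history) + 1)}
```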
---
## 10. Tombstones (Optional)
* Tombstone entries are allowed to invalidate prior mappings.
* Semantics:
  * Shadows previous entries for the same `ArtifactKey`.
  * Visibility follows the same rules as regular entries.
* Existence and encoding of tombstones are optional.
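
One way to picture tombstones is as ordinary entries whose location is absent. In the sketch below, `None` stands in for a tombstone; that marker, the `resolve` function, and the assumption that its input contains only visible entries in log order are illustrative choices, since the encoding of tombstones is left open here.

```python
from typing import Dict, Iterable, Optional, Tuple

Location = Tuple[str, int, int]               # (block_id, offset, length)
IndexEntry = Tuple[str, Optional[Location]]   # None is an assumed tombstone marker


def resolve(entries: Iterable[IndexEntry]) -> Dict[str, Location]:
    """Fold visible entries, given in log order, into the effective mapping."""
    latest: Dict[str, Optional[Location]] = {}
    for key, location in entries:
        latest[key] = location   # later entries shadow earlier ones
    # Keys whose latest entry is a tombstone have no mapping.
    return {key: loc for key, loc in latest.items() if loc is not None}
```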
---
## 11. Determinism Guarantees
For fixed:
* Snapshot
* Log prefix
* ASL configuration
* Hash algorithm
The index guarantees:
* Deterministic lookup results
* Deterministic shadowing resolution
* Deterministic visibility
No nondeterministic input may influence index semantics.
---
## 12. Separation of Concerns
* **ASL-CORE**: Defines artifact semantics and the existence of blocks as storage atoms.
* **ASL-CORE-INDEX**: Defines how artifact keys map to blocks, offsets, and lengths.
* **ASL-STORE-INDEX**: Defines lifecycle, replay, and visibility guarantees.
* **ENC-ASL-CORE-INDEX**: Defines exact bytes-on-disk representation.
Index semantics **do not** prescribe:
* Block allocation
* Packing strategies
* Performance optimizations
* Memory residency or caching
---
## 13. Normative Invariants
All conforming implementations must enforce:
1. No visibility without a log record.
2. No mutation of visible index entries.
3. No mutation of sealed blocks.
4. Shadowing follows strict log order.
5. Replay of snapshot + log uniquely defines CURRENT.
6. ArtifactLocation always resolves to immutable bytes.
Violation of any invariant constitutes index corruption.
---
## 14. Non-Goals (Explicit)
ASL-CORE-INDEX does **not** define:
* Disk layout or encoding
* Segment structure, sharding, or bloom filters
* GC policies or memory management
* Small vs. large block packing
* Federation or provenance mechanics
---
## 15. Relationship to Other Specifications
| Layer | Responsibility |
| ------------------ | ---------------------------------------------------------- |
| ASL-CORE | Defines artifact semantics and existence of blocks |
| ASL-CORE-INDEX | Defines semantic mapping of ArtifactKey → ArtifactLocation |
| ASL-STORE-INDEX | Defines store contracts to realize index semantics |
| ENC-ASL-CORE-INDEX | Defines exact encoding on disk |
---
## 16. Summary
The ASL index:
* Maps artifact identities to block locations deterministically
* Is immutable once entries are visible
* Resolves visibility via snapshots + append-only log
* Supports optional tombstones
* Provides a stable substrate for store, encoding, and higher layers like PEL
It answers **exactly one question**:
> *Given an artifact identity and a point in time, where are the bytes?*
Nothing more, nothing less.