amduat-api/notes/asl-core-index.md
2026-01-17 00:19:49 +01:00


ASL-CORE-INDEX

Semantic Addendum to ASL-CORE


1. Purpose

This document defines the semantic model of the ASL index, extending ASL-CORE artifact semantics to include mapping artifacts to storage locations.

The ASL index provides a deterministic, snapshot-relative mapping from artifact identities to byte locations within immutable storage blocks.

It specifies what the index means, not:

  • How the index is stored or encoded
  • How blocks are allocated or packed
  • Performance optimizations
  • Garbage collection or memory strategies

Those are handled by:

  • ASL-STORE-INDEX (store semantics and contracts)
  • ENC-ASL-CORE-INDEX (bytes-on-disk encoding)

2. Scope

This document defines:

  • Logical structure of index entries
  • Visibility rules
  • Snapshot and log interaction
  • Immutability and shadowing semantics
  • Determinism guarantees
  • Required invariants

It does not define:

  • On-disk formats
  • Index segmentation or sharding
  • Bloom filters or probabilistic structures
  • Memory residency
  • Performance targets

3. Terminology

  • Artifact: An immutable sequence of bytes managed by ASL.
  • ArtifactKey: Opaque identifier for an artifact (typically a hash).
  • Block: Immutable storage unit containing artifact bytes.
  • BlockID: Opaque, unique identifier for a block.
  • ArtifactLocation: Tuple (BlockID, offset, length) identifying bytes within a block.
  • Snapshot: Checkpoint capturing a consistent base state of ASL-managed storage and metadata.
  • Append-Only Log: Strictly ordered log of index-visible mutations occurring after a snapshot.
  • CURRENT: The effective system state obtained by replaying the append-only log on top of a checkpoint snapshot.
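
The terms above can be modelled as plain value types. The following is a minimal sketch, not part of the specification: the Python names mirror the terminology, while the choice of `bytes` for the opaque identifiers and the non-negativity check on `offset` and `length` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import NewType

# Opaque identifiers. Using bytes is an assumption; the spec only says "opaque".
ArtifactKey = NewType("ArtifactKey", bytes)
BlockID = NewType("BlockID", bytes)


@dataclass(frozen=True)
class ArtifactLocation:
    """Tuple (BlockID, offset, length) identifying bytes within a block."""

    block_id: BlockID
    offset: int
    length: int

    def __post_init__(self) -> None:
        # Assumption: negative offsets or lengths are never meaningful.
        if self.offset < 0 or self.length < 0:
            raise ValueError("offset and length must be non-negative")


# Example value: 4 bytes starting at offset 0 of a block named b"blk-1".
example_location = ArtifactLocation(BlockID(b"blk-1"), 0, 4)
```

Making the dataclass frozen keeps a location a pure value, in line with the immutability that the rest of this document requires.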

4. Block Semantics

ASL-CORE introduces blocks minimally:

  1. Blocks are existential storage atoms for artifact bytes.
  2. Each block is uniquely identified by a BlockID.
  3. Blocks are immutable once sealed.
  4. Addressing: (BlockID, offset, length) → bytes.
  5. No block layout, allocation, packing, or size semantics are defined at the core level.
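
As an illustration of points 3 and 4, the sketch below shows one way a block could expose (BlockID, offset, length) → bytes addressing and refuse writes after sealing. The `Block` class, its in-memory buffer, and its method names are hypothetical; the core level deliberately defines no layout or allocation semantics.

```python
class Block:
    """Hypothetical in-memory block: appendable until sealed, immutable after."""

    def __init__(self, block_id: bytes) -> None:
        self.block_id = block_id
        self._buf = bytearray()
        self._sealed = False

    @property
    def sealed(self) -> bool:
        return self._sealed

    def append(self, data: bytes) -> int:
        """Append bytes and return the offset at which they were written."""
        if self._sealed:
            raise RuntimeError("sealed blocks are immutable")
        offset = len(self._buf)
        self._buf += data
        return offset

    def seal(self) -> None:
        self._sealed = True

    def read(self, offset: int, length: int) -> bytes:
        """Resolve (offset, length) within this block to bytes."""
        if offset < 0 or length < 0 or offset + length > len(self._buf):
            raise ValueError("range lies outside the block")
        return bytes(self._buf[offset:offset + length])


block = Block(b"blk-1")
first = block.append(b"hello")   # written at offset 0
second = block.append(b"world")  # written at offset 5
block.seal()
payload = block.read(second, 5)
```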

5. Core Semantic Mapping

The ASL index defines a mapping that is total over the set of visible ArtifactKeys:

ArtifactKey → ArtifactLocation

Semantic guarantees:

  • Each visible ArtifactKey maps to exactly one ArtifactLocation.
  • Mapping is immutable once visible.
  • Mapping is snapshot-relative.
  • Mapping is deterministic given (snapshot, log prefix).

6. ArtifactLocation Semantics

  • block_id references an ASL block.
  • offset and length define bytes within the block.
  • An ArtifactLocation is valid only for the lifetime of the referenced block.
  • No interpretation of bytes is implied.

7. Visibility Model

An index entry is visible if and only if:

  1. The referenced block is sealed.
  2. A corresponding log record exists.
  3. The log record's position is ≤ the CURRENT replay position.

Consequences:

  • Entries referencing unsealed blocks are invisible.
  • Entries whose log records lie beyond the CURRENT replay position are invisible.
  • Visibility is binary (no gradual exposure).
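
The three conditions can be read as a single boolean predicate. The sketch below is one possible rendering, under assumptions: `LogRecord`, its integer `position` field, and the `sealed_blocks` set are hypothetical names introduced here, and log positions are assumed to be comparable integers.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class LogRecord:
    position: int   # assumed: integer position in the append-only log
    key: bytes      # ArtifactKey of the entry
    block_id: bytes # BlockID referenced by the entry


def is_visible(
    record: Optional[LogRecord],
    sealed_blocks: Set[bytes],
    current_position: int,
) -> bool:
    """Return True only if all three visibility conditions hold.

    `record` is None when no log record exists for the entry.
    """
    if record is None:
        return False  # condition 2: a log record must exist
    if record.block_id not in sealed_blocks:
        return False  # condition 1: the referenced block must be sealed
    return record.position <= current_position  # condition 3


sealed = {b"blk-1"}
rec_sealed = LogRecord(position=3, key=b"k1", block_id=b"blk-1")
rec_unsealed = LogRecord(position=2, key=b"k2", block_id=b"blk-2")
```

The function returns a plain `bool`, which matches the statement that visibility is binary.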

8. Snapshot and Log Semantics

  • Snapshots act as checkpoints, not full state representations.
  • Index state at any time:
      Index(CURRENT) = Index(snapshot) + replay(log)
  • Replay is strictly ordered, deterministic, and idempotent.
  • Snapshot and log entries are semantically equivalent once replayed.
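
The equation Index(CURRENT) = Index(snapshot) + replay(log) can be sketched as a pure function. This is an illustrative model only: representing the index as a dictionary and a log entry as a `(key, location)` pair are assumptions, and the sketch ignores sealing and tombstones.

```python
from typing import Dict, Iterable, Tuple

Location = Tuple[bytes, int, int]  # (block_id, offset, length)
Index = Dict[bytes, Location]      # ArtifactKey -> ArtifactLocation
LogEntry = Tuple[bytes, Location]  # (key, location), listed in log order


def replay(snapshot: Index, log: Iterable[LogEntry]) -> Index:
    """Return Index(snapshot) + replay(log) without mutating the snapshot."""
    index = dict(snapshot)
    for key, location in log:   # entries are applied in strict log order
        index[key] = location   # a later entry takes precedence
    return index


snapshot: Index = {b"k1": (b"blk-a", 0, 10)}
log = [
    (b"k2", (b"blk-b", 0, 5)),
    (b"k1", (b"blk-b", 5, 10)),
]

current = replay(snapshot, log)
prefix_state = replay(snapshot, log[:1])  # state for a shorter log prefix
replayed_twice = replay(current, log)     # replaying again changes nothing
```

Because `replay` depends only on its two arguments, the same (snapshot, log prefix) pair always produces the same index, and applying the same log a second time leaves the result unchanged.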

9. Immutability and Shadowing

9.1 Immutability

  • Index entries are never mutated.
  • Once visible, an entry's meaning never changes.
  • Blocks referenced by entries are immutable.

9.2 Shadowing

  • Later entries may shadow earlier entries with the same ArtifactKey.
  • Precedence is determined by log order.
  • Snapshot boundaries do not alter shadowing semantics.
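
To illustrate that precedence depends only on log order and not on where a snapshot was taken, the sketch below resolves a key by scanning the log newest-first and falling back to the snapshot. The helper names `fold` and `lookup`, and the tuple representation of entries, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Location = Tuple[bytes, int, int]  # (block_id, offset, length)
Entry = Tuple[bytes, Location]     # (key, location), listed in log order


def fold(base: Dict[bytes, Location], entries: List[Entry]) -> Dict[bytes, Location]:
    """Fold a run of entries into a new snapshot-like mapping."""
    out = dict(base)
    for key, location in entries:
        out[key] = location
    return out


def lookup(
    snapshot: Dict[bytes, Location], log: List[Entry], key: bytes
) -> Optional[Location]:
    """Resolve a key: the newest matching log entry shadows everything older."""
    for entry_key, location in reversed(log):
        if entry_key == key:
            return location
    return snapshot.get(key)


entries: List[Entry] = [
    (b"k", (b"blk-1", 0, 4)),
    (b"k", (b"blk-2", 0, 4)),
    (b"j", (b"blk-2", 4, 8)),
    (b"k", (b"blk-3", 0, 4)),
]

# Place the snapshot boundary at every possible position in the history.
results = [
    lookup(fold({}, entries[:cut]), entries[cut:], b"k")
    for cut in range(len(entries) + 1)
]
```

Every element of `results` is the location from the last entry for `b"k"`, regardless of how many entries were folded into the snapshot.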

10. Tombstones (Optional)

  • Tombstone entries are allowed to invalidate prior mappings.
  • Semantics:
    • Shadows previous entries for the same ArtifactKey.
    • Visibility follows the same rules as regular entries.
  • Existence and encoding of tombstones are optional.
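
A tombstone can be modelled as a log entry that carries no location. The sketch below is one possible interpretation: using `None` as the tombstone marker and the name `replay_with_tombstones` are assumptions, since this document leaves tombstone encoding to other layers.

```python
from typing import Dict, Iterable, Optional, Tuple

Location = Tuple[bytes, int, int]            # (block_id, offset, length)
# Assumption: a location of None marks a tombstone for the key.
LogEntry = Tuple[bytes, Optional[Location]]


def replay_with_tombstones(
    snapshot: Dict[bytes, Location], log: Iterable[LogEntry]
) -> Dict[bytes, Location]:
    """Replay a log in order; a tombstone shadows earlier entries for its key."""
    index = dict(snapshot)
    for key, location in log:
        if location is None:
            index.pop(key, None)   # tombstone: the key no longer resolves
        else:
            index[key] = location  # regular entry: shadows anything earlier
    return index


snapshot = {b"k1": (b"blk-a", 0, 10), b"k2": (b"blk-a", 10, 5)}
log = [
    (b"k1", None),               # tombstone invalidates the snapshot mapping
    (b"k2", None),               # tombstone for k2 ...
    (b"k2", (b"blk-b", 0, 5)),   # ... which a later entry shadows in turn
]

state = replay_with_tombstones(snapshot, log)
```

A tombstone takes part in shadowing like any other entry, so an entry that appears after it in the log makes the key resolvable again.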


11. Determinism Guarantees

For fixed:

  • Snapshot
  • Log prefix
  • ASL configuration
  • Hash algorithm

The index guarantees:

  • Deterministic lookup results
  • Deterministic shadowing resolution
  • Deterministic visibility

No nondeterministic input may influence index semantics.


12. Separation of Concerns

  • ASL-CORE: Defines artifact semantics and the existence of blocks as storage atoms.
  • ASL-CORE-INDEX: Defines how artifact keys map to blocks, offsets, and lengths.
  • ASL-STORE-INDEX: Defines lifecycle, replay, and visibility guarantees.
  • ENC-ASL-CORE-INDEX: Defines exact bytes-on-disk representation.

Index semantics do not prescribe:

  • Block allocation
  • Packing strategies
  • Performance optimizations
  • Memory residency or caching

13. Normative Invariants

All conforming implementations must enforce:

  1. No visibility without a log record.
  2. No mutation of visible index entries.
  3. No mutation of sealed blocks.
  4. Shadowing follows strict log order.
  5. Replay of snapshot + log uniquely defines CURRENT.
  6. ArtifactLocation always resolves to immutable bytes.

Violation of any invariant constitutes index corruption.


14. Non-Goals (Explicit)

ASL-CORE-INDEX does not define:

  • Disk layout or encoding
  • Segment structure, sharding, or bloom filters
  • GC policies or memory management
  • Small vs. large block packing
  • Federation or provenance mechanics

15. Relationship to Other Specifications

Layer                 Responsibility
ASL-CORE              Defines artifact semantics and existence of blocks
ASL-CORE-INDEX        Defines semantic mapping of ArtifactKey → ArtifactLocation
ASL-STORE-INDEX       Defines store contracts to realize index semantics
ENC-ASL-CORE-INDEX    Defines exact encoding on disk

16. Summary

The ASL index:

  • Maps artifact identities to block locations deterministically
  • Is immutable once entries are visible
  • Resolves visibility via snapshots + append-only log
  • Supports optional tombstones
  • Provides a stable substrate for store, encoding, and higher layers like PEL

It answers exactly one question:

Given an artifact identity and a point in time, where are the bytes?

Nothing more, nothing less.