Carl Niklas Rydberg 1d552bd46a Added some notes that needs to be analyzed.

2026-01-17 00:19:49 +01:00

ASL-STORE-INDEX

Store Semantics and Contracts for ASL Index

1. Purpose

This document defines the store-level responsibilities and contracts required to implement the ASL-CORE-INDEX semantics.

It bridges the gap between index meaning and physical storage, ensuring:

Deterministic replay
Snapshot-aware visibility
Immutable block guarantees
Idempotent recovery
Correctness of CURRENT state

It does not define exact encoding, memory layout, or acceleration structures (see ENC-ASL-CORE-INDEX).

2. Scope

This specification covers:

Index segment lifecycle
Interaction between index and ASL blocks
Append-only log semantics
Snapshot integration
Visibility and lookup rules
Crash safety and recovery
Garbage collection constraints

It does not cover:

Disk format details
Bloom filter algorithms
File system specifics
Placement heuristics beyond semantic guarantees

3. Core Concepts

3.1 Index Segment

A segment is a contiguous set of index entries written by the store.

Open while accepting new entries
Sealed when closed for append
Sealed segments are immutable
Sealed segments are snapshot-visible only after their seal record is written to the log

Segments are the unit of persistence, replay, and GC.

3.2 ASL Block Relationship

Each index entry references a sealed block via:

ArtifactKey → (BlockID, offset, length)

The store must ensure the block is sealed before the entry becomes log-visible
Blocks are immutable after seal
Open blocks may be abandoned without violating invariants
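
A minimal sketch of the mapping and the seal-before-visibility rule, in Python. The names `ArtifactLocation` fields, `BlockTable`, and `publish_entry` are hypothetical and chosen only for illustration; the specification does not prescribe any API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactLocation:
    """Where an artifact's bytes live: (BlockID, offset, length)."""
    block_id: str
    offset: int
    length: int


class BlockTable:
    """Hypothetical registry of sealed block IDs."""

    def __init__(self):
        self._sealed = set()

    def seal(self, block_id):
        self._sealed.add(block_id)

    def is_sealed(self, block_id):
        return block_id in self._sealed


def publish_entry(index, blocks, key, loc):
    """Make key -> loc log-visible, refusing entries whose block is unsealed."""
    if not blocks.is_sealed(loc.block_id):
        raise ValueError(f"block {loc.block_id!r} is not sealed")
    index[key] = loc
```

Because the location is a frozen dataclass, a published entry cannot be altered in place, which mirrors the immutability the store is expected to provide.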

3.3 Append-Only Log

All store-visible mutations are recorded in a strictly ordered, append-only log:

Entries include index additions, tombstones, and segment seals
Log is durable and replayable
The log defines visibility for everything recorded after the latest checkpoint snapshot

CURRENT state is derived as:

CURRENT = checkpoint_state + replay(log)
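
The derivation above can be sketched as a pure function. This is a simplified illustration, assuming log records are tuples tagged `"add"`, `"tombstone"`, or `"seal"` (a hypothetical record shape) and that the state is a plain key-to-location mapping.

```python
def replay(checkpoint_state, log):
    """Return CURRENT: the checkpoint state with every log record applied in order."""
    current = dict(checkpoint_state)  # copy, so the checkpoint itself is never mutated
    for record in log:
        kind = record[0]
        if kind == "add":
            _, key, location = record
            current[key] = location   # later records shadow earlier ones
        elif kind == "tombstone":
            _, key = record
            current.pop(key, None)    # invalidate any prior mapping
        elif kind == "seal":
            pass                      # seals govern segment visibility, not the mapping
        else:
            raise ValueError(f"unknown log record kind: {kind!r}")
    return current
```

Since the function reads only its arguments and applies records strictly in order, the same checkpoint and log always yield the same CURRENT.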

4. Segment Lifecycle

4.1 Creation

Open segment is allocated
Index entries appended in log order
Entries are invisible until segment seal and log append

4.2 Seal

Segment is closed to append
Seal record is written to append-only log
Segment becomes visible for lookup
Sealed segment may be snapshot-pinned
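
The creation and seal steps can be sketched as a small state machine. The `Segment` class, `seal_segment`, and `visible_segment_ids` are hypothetical names, and the log is modeled as an in-memory list purely for illustration.

```python
class Segment:
    """An index segment that is open for append until it is sealed."""

    def __init__(self, segment_id):
        self.segment_id = segment_id
        self.entries = []
        self.sealed = False

    def append(self, key, location):
        if self.sealed:
            raise RuntimeError("sealed segments are immutable")
        self.entries.append((key, location))


def seal_segment(segment, log):
    """Close the segment to append, then record the seal in the log."""
    segment.sealed = True
    log.append(("seal", segment.segment_id))


def visible_segment_ids(log):
    """A segment is visible for lookup only if its seal record is in the log."""
    return [record[1] for record in log if record[0] == "seal"]
```

Deriving visibility from the log rather than from the segment's own flag keeps a segment that was closed but never logged invisible, which is the behavior the seal rules call for.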

4.3 Snapshot Interaction

Snapshots capture sealed segments
Open segments need not survive snapshot
Segments at or below the snapshot serve as replay anchors

4.4 Garbage Collection

Only sealed and unreachable segments can be deleted
GC operates at segment granularity
GC must not break CURRENT or violate invariants

5. Lookup Semantics

To resolve an ArtifactKey:

Identify all visible segments ≤ CURRENT
Search segments in reverse creation order (newest first)
Return the first matching entry
Respect tombstone entries (if present)

Lookups may use memory-mapped structures, bloom filters, sharding, or SIMD, but correctness must be independent of acceleration strategies.
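
A minimal sketch of the resolution order, assuming each visible segment is a dict and the list is given in creation order (oldest first). The `TOMBSTONE` sentinel and the `lookup` function are illustrative names, not part of the specification.

```python
TOMBSTONE = object()  # hypothetical marker for a tombstone entry


def lookup(visible_segments, key):
    """Resolve key by scanning visible segments newest first.

    Returns the first matching location, or None when the key is
    absent or its newest entry is a tombstone.
    """
    for entries in reversed(visible_segments):
        if key in entries:
            value = entries[key]
            return None if value is TOMBSTONE else value
    return None
```

Stopping at the first match is what makes a tombstone shadow older entries: the scan never reaches the segments behind it.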

6. Visibility Guarantees

An entry is visible iff all of the following hold:
- The block it references is sealed
- Its log record exists at a position ≤ CURRENT
- The seal of its segment is recorded in the log
Entries above CURRENT, or entries referencing unsealed blocks, are invisible.
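
The visibility conditions can be written as a single predicate. This sketch assumes log positions are integers and that the segment seal must also sit at or below CURRENT; the second point is an assumption, since the text only requires the seal to be recorded in the log.

```python
def is_visible(entry_log_pos, block_sealed, segment_seal_pos, current):
    """Return True iff an index entry is visible at log position `current`.

    entry_log_pos:    log position of the entry's record
    block_sealed:     whether the referenced block is sealed
    segment_seal_pos: log position of the segment's seal record, or None
    """
    if segment_seal_pos is None:
        return False  # segment seal was never recorded
    return (
        block_sealed
        and entry_log_pos <= current
        and segment_seal_pos <= current
    )
```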

7. Crash and Recovery Semantics

7.1 Crash During Open Segment

Open segments may be lost
Index entries written to an open segment may be leaked (left behind without ever becoming visible)
No sealed segment may be corrupted

7.2 Recovery Procedure

Mount latest checkpoint snapshot
Replay append-only log from checkpoint
Rebuild CURRENT
Resume normal operation

Recovery must be deterministic and idempotent.
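
One way to illustrate idempotent recovery is to track the last applied log position alongside the state. The record shape `(seq, key, location)` and the `recover` function are assumptions made for this sketch, and the log is assumed to be ordered by `seq`.

```python
def recover(checkpoint, log):
    """Rebuild CURRENT from a checkpoint and an ordered log.

    checkpoint: {"state": dict, "last_seq": int}
    log:        iterable of (seq, key, location), ascending by seq
    Records at or below checkpoint["last_seq"] are skipped, so running
    recovery again on its own output changes nothing.
    """
    state = dict(checkpoint["state"])
    applied = checkpoint["last_seq"]
    for seq, key, location in log:
        if seq <= applied:
            continue  # already reflected in the state
        state[key] = location
        applied = seq
    return {"state": state, "last_seq": applied}
```

Feeding the result back in as the checkpoint with the same log returns an equal result, which is the idempotence property in miniature.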

8. Tombstone Semantics

Optional: tombstones may exist to invalidate prior mappings
Tombstones shadow prior entries with the same ArtifactKey
Tombstone visibility follows same rules as regular entries

9. Invariants (Normative)

The store must enforce:

No segment visible without seal log record
No mutation of sealed segment or block
Shadowing follows log order strictly
Replay uniquely reconstructs CURRENT
GC does not remove segments referenced by snapshot or log
ArtifactLocation always points to immutable bytes

10. Non-Goals

ASL-STORE-INDEX does not define:

Disk layout or encoding (ENC-ASL-CORE-INDEX)
Placement heuristics (small vs. large block packing)
Performance targets
Memory caching strategies
Federation or provenance mechanics

11. Relationship to Other Documents

| Layer | Responsibility |
| --- | --- |
| ASL-CORE-INDEX | Defines semantic meaning of mapping `ArtifactKey → ArtifactLocation` |
| ASL-STORE-INDEX | Defines contracts for store to realize those semantics |
| ENC-ASL-CORE-INDEX | Defines bytes-on-disk format |

12. Summary

The store-index layer guarantees:

Immutable, snapshot-safe segments
Deterministic and idempotent replay
Correct visibility semantics
Safe crash recovery
Garbage collection constraints

This specification ensures that ASL-CORE-INDEX semantics are faithfully realized in the store without constraining encoding or acceleration strategies.

The following is a refined version of ASL-STORE-INDEX. It incorporates block lifecycle, sealing, snapshot safety, retention, and GC rules, and is aligned with ASL-CORE-INDEX semantics.

ASL-STORE-INDEX

Store Semantics and Contracts for ASL Core Index (Refined)

1. Purpose

This document defines the operational and store-level semantics necessary to implement ASL-CORE-INDEX.

It specifies:

Block lifecycle: creation, sealing, retention
Index segment lifecycle: creation, append, seal, visibility
Snapshot interaction: pinning, deterministic visibility
Append-only log semantics
Garbage collection rules

It does not define encoding (see ENC-ASL-CORE-INDEX) or semantic mapping (see ASL-CORE-INDEX).

2. Scope

Covers:

Lifecycle of blocks and index entries
Snapshot and CURRENT consistency guarantees
Deterministic replay and recovery
GC and tombstone semantics

Excludes:

Disk-level encoding
Sharding strategies
Bloom filters or acceleration structures
Memory residency or caching
Federation or PEL semantics

3. Core Concepts

3.1 Block

Definition: Immutable storage unit containing artifact bytes.
Identifier: BlockID (opaque, unique)
Properties:
- Once sealed, contents never change
- Can be referenced by multiple artifacts
- May be pinned by snapshots for retention
Lifecycle Events:
1. Creation: block allocated but contents may still be written
2. Sealing: block is finalized, immutable, and log-visible
3. Retention: block remains accessible while pinned by snapshots or needed by CURRENT
4. Garbage collection: block may be deleted if no longer referenced and unpinned
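
The creation and sealing events can be sketched as follows. `Block`, `BlockState`, and the in-memory buffer are hypothetical stand-ins for whatever storage the store actually uses; retention and garbage collection are left out of this sketch.

```python
from enum import Enum


class BlockState(Enum):
    OPEN = "open"
    SEALED = "sealed"


class Block:
    """A block that accepts writes while open and is immutable once sealed."""

    def __init__(self, block_id):
        self.block_id = block_id
        self.state = BlockState.OPEN
        self._buf = bytearray()

    def write(self, payload):
        """Append bytes to an open block and return their (offset, length)."""
        if self.state is not BlockState.OPEN:
            raise RuntimeError("sealed blocks are immutable")
        offset = len(self._buf)
        self._buf += payload
        return offset, len(payload)

    def seal(self):
        """Finalize the block; its contents never change afterwards."""
        self._buf = bytes(self._buf)
        self.state = BlockState.SEALED

    def read(self, offset, length):
        """Read artifact bytes; only sealed blocks may be referenced."""
        if self.state is not BlockState.SEALED:
            raise RuntimeError("only sealed blocks may be referenced")
        return bytes(self._buf[offset:offset + length])
```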

3.2 Index Segment

Segments group index entries and provide persistence and recovery units.

Open segment: accepting new index entries, not visible for lookup
Sealed segment: closed for append, log-visible, snapshot-pinnable
Segment components: header, optional bloom filter, index records, footer
Segment visibility: only after seal and log append

3.3 Append-Only Log

All store operations affecting index visibility are recorded in a strictly ordered, append-only log:

Entries include:
- Index additions
- Tombstones
- Segment seals
Log is replayable to reconstruct CURRENT
Determinism: replay produces identical CURRENT from same snapshot and log prefix

4. Block Lifecycle Semantics

| Event | Description | Semantic Guarantees |
| --- | --- | --- |
| Creation | Block allocated; bytes may be written | Not visible to index until sealed |
| Sealing | Block is finalized and immutable | Sealed blocks are stable and safe to reference from index |
| Retention | Block remains accessible | Blocks referenced by snapshots or CURRENT must not be removed |
| Garbage Collection | Block may be deleted | Only unpinned, unreachable blocks may be removed |

Notes:

Sealing ensures that any index entry referencing the block is deterministic and immutable.
Retention is driven by snapshot and log visibility rules.
GC must never violate CURRENT reconstruction guarantees.

5. Snapshot Interaction

Snapshots capture the set of sealed blocks and sealed index segments at a point in time.
Blocks referenced by a snapshot are pinned and cannot be garbage-collected until snapshot expiration.
CURRENT is reconstructed as:

CURRENT = snapshot_state + replay(log)

Segment and block visibility rules:

| Entity | Visible in snapshot | Visible in CURRENT |
| --- | --- | --- |
| Open segment/block | No | Only after seal and log append |
| Sealed segment/block | Yes, if included in snapshot | Yes, replayed from log |
| Tombstone | Yes, if log-recorded | Yes, shadows prior entries |

6. Index Lookup Semantics

To resolve an ArtifactKey:

Identify all visible segments ≤ CURRENT
Search segments in reverse creation order (newest first)
Return first matching entry
Respect tombstones to shadow prior entries

Determinism:

Lookup results are identical across platforms given the same snapshot and log prefix
Accelerations (bloom filters, sharding, SIMD) do not alter correctness
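
To illustrate why an acceleration structure need not affect correctness, the sketch below compares a plain newest-first lookup with one that consults a small Bloom filter per segment. `TinyBloom`, its parameters, and both lookup functions are illustrative assumptions; tombstones are omitted to keep the example short.

```python
import hashlib


class TinyBloom:
    """Illustrative Bloom filter: false positives possible, false negatives not."""

    def __init__(self, bits=64, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.mask = 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for pos in self._positions(key):
            self.mask |= 1 << pos

    def may_contain(self, key):
        return all((self.mask >> pos) & 1 for pos in self._positions(key))


def lookup_plain(segments, key):
    """Newest-first scan with no acceleration."""
    for entries in reversed(segments):
        if key in entries:
            return entries[key]
    return None


def lookup_filtered(segments, filters, key):
    """Same scan, but skip segments whose filter rules the key out."""
    for entries, bloom in zip(reversed(segments), reversed(filters)):
        if bloom.may_contain(key) and key in entries:
            return entries[key]
    return None
```

The filter only decides whether a segment is worth searching. Since it never reports a present key as absent, the filtered lookup returns the same result as the plain one.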

7. Garbage Collection

Eligibility for GC:
- Segments: sealed, no references from CURRENT or snapshots
- Blocks: unpinned, unreferenced by any segment or artifact
Rules:
- GC is safe only on sealed segments and blocks
- Must respect snapshot pins
- Tombstones may aid in invalidating unreachable blocks
Outcome:
- GC never violates CURRENT reconstruction
- Blocks can be reclaimed without breaking provenance
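
The block eligibility rule can be sketched as a set computation. The function name and argument shapes are hypothetical: `current_refs` stands for the block IDs reachable from CURRENT, and `snapshot_pins` maps each live snapshot to the block IDs it pins.

```python
def collectable_blocks(sealed_blocks, current_refs, snapshot_pins):
    """Return sealed block IDs that are neither referenced by CURRENT nor pinned.

    sealed_blocks: iterable of sealed block IDs (open blocks are never passed in)
    current_refs:  block IDs referenced from CURRENT
    snapshot_pins: dict mapping snapshot ID -> set of pinned block IDs
    """
    live = set(current_refs)
    for pinned in snapshot_pins.values():
        live |= set(pinned)
    return sorted(b for b in sealed_blocks if b not in live)
```

Expiring a snapshot simply removes its entry from `snapshot_pins`, after which any block it alone was pinning becomes eligible.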

8. Tombstone Semantics

Optional marker to invalidate prior mappings
Visibility rules identical to regular index entries
Used to keep CURRENT deterministic in the face of shadowing or deletions

9. Crash and Recovery Semantics

Open segments or unsealed blocks may be lost; no invariant is broken
Recovery procedure:
1. Mount last checkpoint snapshot
2. Replay append-only log
3. Reconstruct CURRENT
Recovery is deterministic and idempotent
Segments and blocks are never partially visible after a crash

10. Normative Invariants

Sealed blocks are immutable
Index entries referencing blocks are immutable once visible
Shadowing follows strict log order
Replay of snapshot + log uniquely reconstructs CURRENT
GC cannot remove blocks or segments needed by snapshot or CURRENT
Tombstones shadow prior entries without deleting underlying blocks prematurely

11. Non-Goals

Disk-level encoding (ENC-ASL-CORE-INDEX)
Memory layout or caching
Sharding or performance heuristics
Federation / multi-domain semantics (handled elsewhere)
Block packing strategies (small vs large blocks)

12. Relationship to Other Layers

| Layer | Responsibility |
| --- | --- |
| ASL-CORE | Artifact semantics, existence of blocks, immutability |
| ASL-CORE-INDEX | Semantic mapping of ArtifactKey → ArtifactLocation |
| ASL-STORE-INDEX | Lifecycle and operational contracts for blocks and segments |
| ENC-ASL-CORE-INDEX | Bytes-on-disk layout for segments, index records, and optional bloom filters |

13. Summary

The refined ASL-STORE-INDEX:

Defines block lifecycle: creation, sealing, retention, GC
Ensures snapshot safety and deterministic visibility
Guarantees immutable, replayable, and recoverable CURRENT
Provides operational contracts to faithfully implement ASL-CORE-INDEX semantics
