amduat-api/notes/ASL-Block-Architecture-&-Specification.md
2026-01-17 00:19:49 +01:00

ASL Block Architecture & Specification

1. Purpose and Scope

The Artifact Storage Layer (ASL) is responsible for the physical storage, layout, and retrieval of immutable artifact bytes. ASL operates beneath CAS and above the storage substrate (ZFS).

ASL concerns itself with:

  • Efficient packaging of artifacts into blocks
  • Stable block addressing
  • Snapshot-safe immutability
  • Storage-local optimizations

ASL does not define:

  • Artifact identity
  • Hash semantics
  • Provenance
  • Interpretation
  • Indexing semantics

2. Core Abstractions

2.1 Artifact

An artifact is an immutable byte sequence produced or consumed by higher layers.

ASL treats artifacts as opaque bytes.


2.2 ASL Block

An ASL block is the smallest independently addressable, immutable unit of storage managed by ASL.

Properties:

  • Identified by an ASL Block ID
  • Contains one or more artifacts
  • Written sequentially
  • Immutable once sealed
  • Snapshot-safe

ASL blocks are the unit of:

  • Storage
  • Reachability
  • Garbage collection

2.3 ASL Block ID

An ASL Block ID is an opaque, stable identifier.

Invariants

  • Globally unique within an ASL instance
  • Never reused
  • Never mutated
  • Does not encode:
    • Artifact size
    • Placement
    • Snapshot
    • Storage topology
    • Policy decisions

Semantics

Block IDs identify logical blocks, not physical locations.

Higher layers must treat block IDs as uninterpretable tokens.
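
One way to make "uninterpretable token" concrete is a wrapper type that supports equality and hashing and nothing else. This is only a sketch: the name `BlockId` and the integer payload are assumptions made for illustration, not something the specification prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockId:
    """Hypothetical opaque token: callers may compare and hash it, nothing more."""
    _raw: int  # assumed internal representation, private by convention

    def __repr__(self) -> str:
        # Show the token only; no size, placement, or snapshot information exists to leak.
        return f"BlockId({self._raw})"


a, b = BlockId(41), BlockId(41)
assert a == b and hash(a) == hash(b)  # equal tokens name the same logical block
index = {a: "entry"}                  # usable as a dictionary key by higher layers
```

Because the dataclass is frozen, an ID cannot be mutated after creation, which mirrors the "never mutated" invariant.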


3. Addressing Model

ASL exposes a single addressing primitive:

(block_id, offset, length) → bytes

This is the only contract between CAS and ASL.

Notes:

  • offset and length are stable for the lifetime of the block
  • ASL guarantees that reads are deterministic per snapshot
  • No size-class or block-kind information is exposed
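
The primitive can be sketched as a single read function. The example below assumes sealed blocks are held in memory as byte strings keyed by an integer block ID; `BlockStore`, `put_sealed`, and the bounds-checking behaviour are hypothetical choices, not part of the ASL contract.

```python
class BlockStore:
    """Hypothetical in-memory stand-in for ASL's sealed-block storage."""

    def __init__(self) -> None:
        self._sealed: dict[int, bytes] = {}

    def put_sealed(self, block_id: int, data: bytes) -> None:
        # Sealed blocks are immutable: refuse to overwrite an existing ID.
        if block_id in self._sealed:
            raise ValueError(f"block {block_id} already sealed")
        self._sealed[block_id] = bytes(data)

    def read(self, block_id: int, offset: int, length: int) -> bytes:
        """(block_id, offset, length) -> bytes"""
        data = self._sealed[block_id]  # KeyError if the block is unknown
        if offset < 0 or length < 0 or offset + length > len(data):
            raise ValueError("requested range lies outside the block")
        return data[offset:offset + length]


store = BlockStore()
store.put_sealed(1, b"helloworld")
print(store.read(1, 5, 5))  # b'world'
```

Note that the caller learns nothing about how the block was packed or where it lives; it only supplies the triple and receives bytes.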

4. Block Allocation Model

4.1 Global Block Namespace

ASL maintains a single global block namespace.

Block IDs are allocated from a monotonically increasing sequence:

next_block_id := next_block_id + 1

Properties:

  • Allocation is append-only
  • Leaked IDs are permitted
  • No coordination with CAS is required
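
A minimal allocator along these lines might look as follows. The class name `BlockIdAllocator`, the starting value of 1, and the use of a lock are assumptions; persisting the counter across restarts is deliberately left out.

```python
import threading


class BlockIdAllocator:
    """Hypothetical monotonic allocator; IDs are never handed out twice."""

    def __init__(self, next_block_id: int = 1) -> None:
        self._next = next_block_id
        self._lock = threading.Lock()

    def allocate(self) -> int:
        # Hand out the current value, then advance:
        # next_block_id := next_block_id + 1
        with self._lock:
            block_id = self._next
            self._next += 1
            return block_id


alloc = BlockIdAllocator()
first = alloc.allocate()
leaked = alloc.allocate()    # never sealed: the ID is simply skipped
third = alloc.allocate()
print(first, leaked, third)  # 1 2 3
```

The allocator never looks at CAS state and never tries to reclaim the leaked ID, which is what keeps allocation append-only.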

4.2 Open Blocks

At any time, ASL may maintain one or more open blocks.

Open blocks:

  • Accept new artifact writes
  • Are not visible to readers
  • Are not referenced by the index
  • May be abandoned on crash

4.3 Sealed Blocks

A block becomes sealed when:

  • It reaches an internal fill threshold, or
  • ASL decides to finalize it for policy reasons

Once sealed:

  • No further writes are permitted
  • Offsets and lengths become permanent
  • The block becomes visible to CAS
  • The block may be referenced by index entries

Sealed blocks are immutable forever.
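
The open-to-sealed lifecycle can be sketched as a small state machine. Everything named here (`Block`, `fill_threshold`, the in-memory buffer) is a hypothetical illustration; a real implementation would write to storage and make sealing durable.

```python
class Block:
    """Hypothetical block with an open -> sealed lifecycle."""

    def __init__(self, block_id: int, fill_threshold: int) -> None:
        self.block_id = block_id
        self.fill_threshold = fill_threshold  # assumed policy knob, in bytes
        self._buf = bytearray()
        self.sealed = False

    def append(self, artifact: bytes) -> tuple[int, int]:
        """Write sequentially and return the artifact's (offset, length)."""
        if self.sealed:
            raise RuntimeError("sealed blocks accept no further writes")
        offset = len(self._buf)
        self._buf += artifact
        if len(self._buf) >= self.fill_threshold:
            self.seal()  # fill threshold reached
        return offset, len(artifact)

    def seal(self) -> None:
        # From here on, offsets and lengths are permanent.
        self.sealed = True

    def read(self, offset: int, length: int) -> bytes:
        if not self.sealed:
            raise RuntimeError("open blocks are not visible to readers")
        return bytes(self._buf[offset:offset + length])


blk = Block(block_id=1, fill_threshold=8)
print(blk.append(b"abc"))    # (0, 3)
print(blk.append(b"defgh"))  # (3, 5); the block seals at 8 bytes
print(blk.sealed)            # True
print(blk.read(3, 5))        # b'defgh'
```

Calling `seal()` directly models the second trigger, finalizing a block for policy reasons before the threshold is reached.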


5. Packaging Policy (Non-Semantic)

ASL applies packaging heuristics when choosing how to place artifacts into blocks.

Examples:

  • Prefer packing many small artifacts together
  • Prefer isolating very large artifacts
  • Where convenient, avoid mixing artifacts of vastly different sizes in the same block

Important rule

Packaging decisions are:

  • Best-effort
  • Local
  • Replaceable
  • Not part of the ASL contract

No higher layer may assume anything about block contents based on artifact size.
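
To illustrate how local and replaceable such a heuristic is, here is one possible size-based chooser. The function name, the category labels, and both thresholds are invented for this sketch; any other policy could be swapped in without changing what ASL promises.

```python
def choose_packing(artifact_size: int,
                   small_limit: int = 64 * 1024,
                   large_limit: int = 16 * 1024 * 1024) -> str:
    """Return a hypothetical packing target for an artifact of the given size.

    The thresholds are illustrative assumptions, not values from the spec.
    """
    if artifact_size >= large_limit:
        return "isolated"      # very large artifacts get a block of their own
    if artifact_size <= small_limit:
        return "shared-small"  # many small artifacts are packed together
    return "shared-mixed"      # everything in between


print(choose_packing(512))               # shared-small
print(choose_packing(1024 * 1024))       # shared-mixed
print(choose_packing(64 * 1024 * 1024))  # isolated
```

The result only steers where bytes are written. It is never recorded in the block ID or exposed through the read path.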


6. Storage Layout and Locality

6.1 Single Dataset, Structured Locality

ASL stores all blocks within a single ZFS dataset.

Within that dataset, ASL may organize blocks into subpaths to improve locality, e.g.:

asl/blocks/dense/
asl/blocks/sparse/

These subpaths:

  • Exist purely for storage optimization
  • May carry ZFS property overrides
  • Are not encoded into block identity

Block resolution does not depend on knowing which subpath was used.
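
A resolver that honours this rule could simply probe each subpath. The sketch below assumes one file per block, named by a zero-padded hexadecimal ID with a `.blk` suffix; that naming scheme and the function `resolve_block` are hypothetical.

```python
from pathlib import Path
from typing import Optional

SUBPATHS = ("dense", "sparse")  # taken from the example layout above


def resolve_block(root: Path, block_id: int) -> Optional[Path]:
    """Find a block file by ID alone, probing every subpath under root."""
    name = f"{block_id:016x}.blk"  # hypothetical naming scheme
    for sub in SUBPATHS:
        candidate = root / sub / name
        if candidate.is_file():
            return candidate
    return None  # unknown, unsealed, or already collected
```

Since the caller passes only the block ID, a block can be written to a different subpath in a later version of the policy without any change to higher layers.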


6.2 Placement Hints

At allocation time, ASL may apply placement hints, such as:

  • Preferred directory
  • Write size
  • Compression preference
  • Recordsize alignment

These hints:

  • Affect only physical layout
  • May change over time
  • Do not affect block identity or correctness

7. Snapshot Semantics

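ASL is snapshot-aware but does not manage snapshots itself: it relies on the snapshot capability of the underlying storage and leaves snapshot visibility to CAS.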

Rules:

  • ASL blocks live inside snapshot-capable storage
  • Snapshots naturally pin sealed blocks
  • ASL does not encode snapshot IDs into block IDs
  • CAS determines snapshot visibility

ASL guarantees:

  • Deterministic reads for a given snapshot
  • No mutation of sealed blocks across snapshots

8. Crash Safety and Recovery

8.1 Crash During Open Block

If a crash occurs:

  • Open blocks may be lost or abandoned
  • Block IDs allocated but not sealed may be leaked
  • No sealed block may be corrupted

Losing open blocks and leaking their IDs is acceptable and expected.


8.2 Recovery Rules

On startup, ASL:

  • Scans for sealed blocks
  • Ignores or cleans up abandoned open blocks
  • Resumes allocation from the next unused block ID

No global replay or rebuild is required.
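
The startup rules can be sketched as a single directory scan. This example assumes sealed blocks are files ending in `.blk` and abandoned open blocks are files ending in `.open`, both named by a hexadecimal block ID; those suffixes and the function `recover` are assumptions made for illustration.

```python
from pathlib import Path


def recover(root: Path) -> tuple[set[int], int]:
    """Return (sealed block IDs, next block ID to allocate) after a restart."""
    sealed: set[int] = set()
    highest = 0
    # Materialize the listing first so files can be deleted while looping.
    for path in list(root.rglob("*")):
        if not path.is_file():
            continue
        if path.suffix == ".blk":      # hypothetical marker: sealed block
            block_id = int(path.stem, 16)
            sealed.add(block_id)
            highest = max(highest, block_id)
        elif path.suffix == ".open":   # hypothetical marker: abandoned open block
            highest = max(highest, int(path.stem, 16))
            path.unlink()              # clean up; the ID stays leaked
    return sealed, highest + 1
```

A scan alone cannot see IDs that were allocated but never reached disk, nor IDs of blocks that have since been deleted. An implementation that must uphold the "never reused" invariant strictly would likely also persist a high-water mark; that detail is outside this sketch.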


9. Garbage Collection

ASL performs garbage collection at block granularity.

Rules:

  • A block is eligible for deletion if:
    • It is sealed, and
    • It is unreachable from all retained snapshots
  • ASL does not perform partial block mutation
  • Compaction (if any) rewrites artifacts into new blocks

Block deletion is irreversible.
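
The eligibility rule reduces to a set difference. In this sketch the reachable set for each retained snapshot is assumed to be supplied by the caller; the function name and the snapshot labels are hypothetical.

```python
def collectable_blocks(sealed: set[int],
                       reachable_by_snapshot: dict[str, set[int]]) -> set[int]:
    """Sealed blocks that no retained snapshot can reach."""
    reachable: set[int] = set()
    for block_ids in reachable_by_snapshot.values():
        reachable |= block_ids
    # Open blocks never appear in `sealed`, so they are never collected here.
    return sealed - reachable


sealed = {1, 2, 3, 4}
retained = {"snap-a": {1, 2}, "snap-b": {2, 4}}
print(collectable_blocks(sealed, retained))  # {3}
```

Block 3 is the only sealed block that neither snapshot references, so it is the only deletion candidate. Whole blocks are removed or kept; nothing is edited in place.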


10. Non-Goals (Explicit)

ASL explicitly does not provide:

  • Artifact identity management
  • Deduplication decisions
  • Provenance interpretation
  • Size-class semantics
  • Execution semantics

Those concerns belong to CAS, PEL, and higher layers.


11. Design Summary (Executive)

  • One block namespace
  • One addressing model
  • One read path
  • Placement is an optimization
  • Immutability is absolute
  • Snapshots provide safety
  • Size-aware packaging is a courtesy, not a contract