Carl Niklas Rydberg 1d552bd46a Added some notes that needs to be analyzed.

2026-01-17 00:19:49 +01:00

ASL Block Architecture & Specification

1. Purpose and Scope

The Artifact Storage Layer (ASL) is responsible for the physical storage, layout, and retrieval of immutable artifact bytes. ASL operates beneath CAS and above the storage substrate (ZFS).

ASL concerns itself with:

Efficient packaging of artifacts into blocks
Stable block addressing
Snapshot-safe immutability
Storage-local optimizations

ASL does not define:

Artifact identity
Hash semantics
Provenance
Interpretation
Indexing semantics

2. Core Abstractions

2.1 Artifact

An artifact is an immutable byte sequence produced or consumed by higher layers.

ASL treats artifacts as opaque bytes.

2.2 ASL Block

An ASL block is the smallest independently addressable, immutable unit of storage managed by ASL.

Properties:

Identified by an ASL Block ID
Contains one or more artifacts
Written sequentially
Immutable once sealed
Snapshot-safe

ASL blocks are the unit of:

Storage
Reachability
Garbage collection

2.3 ASL Block ID

An ASL Block ID is an opaque, stable identifier.

Invariants

Globally unique within an ASL instance
Never reused
Never mutated
Does not encode:
- Artifact size
- Placement
- Snapshot
- Storage topology
- Policy decisions

Semantics

Block IDs identify logical blocks, not physical locations.

Higher layers must treat block IDs as uninterpretable tokens.

3. Addressing Model

ASL exposes a single addressing primitive:

(block_id, offset, length) → bytes

This is the only contract between CAS and ASL.

Notes:

offset and length are stable for the lifetime of the block
ASL guarantees that reads are deterministic per snapshot
No size-class or block-kind information is exposed
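
The addressing primitive above can be illustrated with a minimal in-memory sketch. The class name InMemoryASL, the BlockNotFound error, and the use of integers for block IDs are assumptions made for illustration only; the specification does not prescribe an API shape.

```python
from dataclasses import dataclass, field


class BlockNotFound(KeyError):
    """Raised when a block ID does not name a sealed block (hypothetical error type)."""


@dataclass
class InMemoryASL:
    # Hypothetical stand-in for the storage substrate: block_id -> sealed block bytes.
    sealed: dict[int, bytes] = field(default_factory=dict)

    def read(self, block_id: int, offset: int, length: int) -> bytes:
        """Resolve (block_id, offset, length) to exactly `length` bytes."""
        try:
            data = self.sealed[block_id]
        except KeyError:
            raise BlockNotFound(block_id) from None
        # Reject ranges that do not lie fully inside the block instead of
        # silently returning a short read.
        if offset < 0 or length < 0 or offset + length > len(data):
            raise ValueError("range falls outside the sealed block")
        return data[offset:offset + length]
```

Because the caller passes only the triple, nothing about size class, placement, or block kind leaks through the interface.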

4. Block Allocation Model

4.1 Global Block Namespace

ASL maintains a single global block namespace.

Block IDs are allocated from a monotonically increasing sequence:

next_block_id := next_block_id + 1

Properties:

Allocation is append-only
Leaked IDs (allocated but never sealed) are permitted; gaps in the sequence are harmless
No coordination with CAS is required
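
A minimal sketch of such an allocator, assuming integer IDs and a single process. The class name BlockIdAllocator and the starting value of 1 are illustrative choices, not part of the specification.

```python
import threading


class BlockIdAllocator:
    """Hands out block IDs from a monotonically increasing counter."""

    def __init__(self, next_block_id: int = 1):
        # Assumption: the starting value is supplied by recovery (see section 8.2).
        self._next = next_block_id
        self._lock = threading.Lock()

    def allocate(self) -> int:
        # The lock makes read-and-increment atomic across threads, so no ID
        # is handed out twice within one process lifetime.
        with self._lock:
            block_id = self._next
            self._next += 1
            return block_id
```

An ID that is allocated and then abandoned is simply never used again, which is why the allocator has no release operation.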

4.2 Open Blocks

At any time, ASL may maintain one or more open blocks.

Open blocks:

Accept new artifact writes
Are not visible to readers
Are not referenced by the index
May be abandoned on crash

4.3 Sealed Blocks

A block becomes sealed when:

It reaches an internal fill threshold, or
ASL decides to finalize it for policy reasons

Once sealed:

No further writes are permitted
Offsets and lengths become permanent
The block becomes visible to CAS
The block may be referenced by index entries

Sealed blocks are immutable forever.
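
The open-to-sealed lifecycle can be sketched as a small state machine. The Block class, its fill_threshold parameter, and the error types are hypothetical; a real implementation would write to the storage substrate instead of an in-memory buffer.

```python
class SealedBlockError(Exception):
    """Raised on an attempt to write to a sealed block (hypothetical error type)."""


class Block:
    def __init__(self, block_id: int, fill_threshold: int):
        self.block_id = block_id
        self.fill_threshold = fill_threshold  # assumed to be a byte count
        self._buf = bytearray()
        self.sealed = False

    def append(self, artifact: bytes) -> tuple[int, int]:
        """Write an artifact sequentially and return its (offset, length)."""
        if self.sealed:
            raise SealedBlockError(f"block {self.block_id} is sealed")
        offset = len(self._buf)
        self._buf += artifact
        # Seal automatically once the fill threshold is reached.
        if len(self._buf) >= self.fill_threshold:
            self.seal()
        return offset, len(artifact)

    def seal(self) -> None:
        """Finalize the block; it may also be called early for policy reasons."""
        self.sealed = True

    def read(self, offset: int, length: int) -> bytes:
        # Open blocks are not visible to readers.
        if not self.sealed:
            raise LookupError("open blocks are not visible to readers")
        return bytes(self._buf[offset:offset + length])
```

The offsets returned by append never change afterwards, because the buffer only grows until the block is sealed and is never written again.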

5. Packaging Policy (Non-Semantic)

ASL applies packaging heuristics when choosing how to place artifacts into blocks.

Examples:

Prefer packing many small artifacts together
Prefer isolating very large artifacts
Where convenient, avoid mixing artifacts of vastly different sizes in the same block

Important rule

Packaging decisions are:

Best-effort
Local
Replaceable
Not part of the ASL contract

No higher layer may assume anything about block contents based on artifact size.
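
One possible heuristic, sketched under stated assumptions: large artifacts get a block of their own and small ones are packed greedily in arrival order. The function name plan_blocks and both thresholds are invented for illustration; since packaging is not part of the ASL contract, any other policy would be equally valid.

```python
def plan_blocks(sizes, capacity=4096, large_threshold=1024):
    """Group artifact indices into blocks.

    sizes: artifact sizes in bytes, in arrival order.
    Returns a list of blocks, each a list of indices into `sizes`.
    """
    plans = []
    current, used = [], 0
    for i, size in enumerate(sizes):
        if size >= large_threshold:
            # Isolate very large artifacts in a block of their own.
            plans.append([i])
            continue
        if current and used + size > capacity:
            # The shared block is full; close it and start a new one.
            plans.append(current)
            current, used = [], 0
        current.append(i)
        used += size
    if current:
        plans.append(current)
    return plans
```

Callers receive only (block_id, offset, length) for each artifact, so swapping this heuristic for another does not change anything a higher layer can observe.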

6. Storage Layout and Locality

6.1 Single Dataset, Structured Locality

ASL stores all blocks within a single ZFS dataset.

Within that dataset, ASL may organize blocks into subpaths to improve locality, e.g.:

asl/blocks/dense/
asl/blocks/sparse/

These subpaths:

Exist purely for storage optimization
May carry ZFS property overrides
Are not encoded into block identity

Block resolution does not depend on knowing which subpath was used.
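
A sketch of subpath-independent resolution, assuming one file per block. The file naming scheme (16 hex digits plus a .blk suffix) and the helper names are hypothetical; only the two directory names come from the example above.

```python
from pathlib import Path
from typing import Optional

# Subpaths taken from the example above; the set may change over time.
SUBPATHS = ("dense", "sparse")


def block_file_name(block_id: int) -> str:
    # Hypothetical naming scheme; the specification does not define one.
    return f"{block_id:016x}.blk"


def resolve_block(root: Path, block_id: int) -> Optional[Path]:
    """Find a block's file without knowing which subpath holds it."""
    for sub in SUBPATHS:
        candidate = root / "asl" / "blocks" / sub / block_file_name(block_id)
        if candidate.is_file():
            return candidate
    return None
```

The block ID alone is enough to find the block, so moving a block between subpaths does not invalidate any reference held by a higher layer.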

6.2 Placement Hints

At allocation time, ASL may apply placement hints, such as:

Preferred directory
Write size
Compression preference
Recordsize alignment

These hints:

Affect only physical layout
May change over time
Do not affect block identity or correctness
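
As an illustration, placement hints could be carried in a value object that is kept entirely separate from the block ID. The PlacementHints type, its field names, and the aligned_write_size helper are assumptions, not part of the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlacementHints:
    # All fields are optional: a hint may be absent and may change over time.
    preferred_dir: Optional[str] = None
    write_size: Optional[int] = None
    compression: Optional[str] = None
    recordsize: Optional[int] = None  # bytes


def aligned_write_size(nbytes: int, hints: PlacementHints) -> int:
    """Round nbytes up to a multiple of the recordsize hint, if one is set."""
    if not hints.recordsize:
        return nbytes
    rs = hints.recordsize
    # Ceiling division without floating point.
    return -(-nbytes // rs) * rs
```

Nothing here feeds into block ID allocation, so dropping or changing a hint alters only the physical layout.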

7. Snapshot Semantics

ASL is snapshot-aware but snapshot-agnostic: it relies on snapshot-capable storage but does not track or interpret individual snapshots.

Rules:

ASL blocks live inside snapshot-capable storage
Snapshots naturally pin sealed blocks
ASL does not encode snapshot IDs into block IDs
CAS determines snapshot visibility

ASL guarantees:

Deterministic reads for a given snapshot
No mutation of sealed blocks across snapshots

8. Crash Safety and Recovery

8.1 Crash During Open Block

If a crash occurs:

Open blocks may be lost or abandoned
Block IDs allocated but not sealed may be leaked
No sealed block may be corrupted

This is acceptable and expected.

8.2 Recovery Rules

On startup, ASL:

Scans for sealed blocks
Ignores or cleans up abandoned open blocks
Resumes allocation from the next unused block ID

No global replay or rebuild is required.
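
The startup steps above can be sketched as a single directory scan. The on-disk convention used here (sealed blocks as <hex id>.blk, open blocks as <hex id>.open, all in one directory) is an assumption made for the example.

```python
from pathlib import Path


def recover(block_dir: Path) -> int:
    """Clean up abandoned open blocks and return the next unused block ID."""
    highest = 0
    # Materialize the listing first so that deleting files does not
    # interfere with the iteration.
    for path in sorted(block_dir.iterdir()):
        if path.suffix == ".open":
            # The ID of an abandoned open block is leaked, never reused,
            # so it still counts toward the highest ID seen.
            highest = max(highest, int(path.stem, 16))
            path.unlink()
        elif path.suffix == ".blk":
            highest = max(highest, int(path.stem, 16))
    return highest + 1
```

This sketch only sees IDs that left a file behind. An implementation that can allocate an ID before creating any file would also need a persisted counter to guarantee that IDs are never reused.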

9. Garbage Collection

ASL performs garbage collection at block granularity.