ASL Block Architecture & Specification
1. Purpose and Scope
The Artifact Storage Layer (ASL) is responsible for the physical storage, layout, and retrieval of immutable artifact bytes. ASL operates beneath CAS and above the storage substrate (ZFS).
ASL concerns itself with:
- Efficient packaging of artifacts into blocks
- Stable block addressing
- Snapshot-safe immutability
- Storage-local optimizations
ASL does not define:
- Artifact identity
- Hash semantics
- Provenance
- Interpretation
- Indexing semantics
2. Core Abstractions
2.1 Artifact
An artifact is an immutable byte sequence produced or consumed by higher layers.
ASL treats artifacts as opaque bytes.
2.2 ASL Block
An ASL block is the smallest independently addressable, immutable unit of storage managed by ASL.
Properties:
- Identified by an ASL Block ID
- Contains one or more artifacts
- Written sequentially
- Immutable once sealed
- Snapshot-safe
ASL blocks are the unit of:
- Storage
- Reachability
- Garbage collection
2.3 ASL Block ID
An ASL Block ID is an opaque, stable identifier.
Invariants
- Globally unique within an ASL instance
- Never reused
- Never mutated
- Does not encode:
  - Artifact size
  - Placement
  - Snapshot
  - Storage topology
  - Policy decisions
Semantics
Block IDs identify logical blocks, not physical locations.
Higher layers must treat block IDs as uninterpretable tokens.
3. Addressing Model
ASL exposes a single addressing primitive:
(block_id, offset, length) → bytes
This is the only contract between CAS and ASL.
Notes:
- offset and length are stable for the lifetime of the block
- ASL guarantees that reads are deterministic per snapshot
- No size-class or block-kind information is exposed
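The addressing primitive can be illustrated with a minimal in-memory sketch. The class name SealedBlockStore, the put_sealed helper, and the use of plain integers as block IDs are assumptions made for illustration only; the specification defines none of them.

```python
from typing import Dict


class SealedBlockStore:
    """Toy in-memory stand-in for the ASL read path (hypothetical name)."""

    def __init__(self) -> None:
        self._blocks: Dict[int, bytes] = {}

    def put_sealed(self, block_id: int, data: bytes) -> None:
        # Illustration-only helper: registers the bytes of an already sealed block.
        if block_id in self._blocks:
            raise ValueError("sealed blocks are immutable")
        self._blocks[block_id] = bytes(data)

    def read(self, block_id: int, offset: int, length: int) -> bytes:
        # The single addressing primitive: (block_id, offset, length) -> bytes.
        data = self._blocks[block_id]
        if offset < 0 or length < 0 or offset + length > len(data):
            raise ValueError("range outside block")
        return data[offset:offset + length]


store = SealedBlockStore()
store.put_sealed(7, b"hello world")
print(store.read(7, 6, 5))  # b'world'
```

The caller supplies nothing beyond the triple, so no size-class or block-kind information leaks through the interface.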
4. Block Allocation Model
4.1 Global Block Namespace
ASL maintains a single global block namespace.
Block IDs are allocated from a monotonically increasing sequence:
next_block_id := next_block_id + 1
Properties:
- Allocation is append-only
- Leaked IDs are permitted
- No coordination with CAS is required
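A minimal sketch of such an allocator follows. The class name BlockIdAllocator, the integer representation, and the lock are assumptions; persisting the counter across restarts is out of scope here.

```python
import threading


class BlockIdAllocator:
    """Append-only ID source; IDs are never handed out twice (hypothetical name)."""

    def __init__(self, start: int = 0) -> None:
        self._next_block_id = start
        self._lock = threading.Lock()

    def allocate(self) -> int:
        # next_block_id := next_block_id + 1, done atomically.
        with self._lock:
            self._next_block_id += 1
            return self._next_block_id


alloc = BlockIdAllocator()
first = alloc.allocate()
leaked = alloc.allocate()   # never used for a block: a permitted leak
third = alloc.allocate()
print(first, leaked, third)  # 1 2 3
```

Because a leaked ID is simply skipped, the allocator never needs to reclaim anything or consult CAS.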
4.2 Open Blocks
At any time, ASL may maintain one or more open blocks.
Open blocks:
- Accept new artifact writes
- Are not visible to readers
- Are not referenced by the index
- May be abandoned on crash
4.3 Sealed Blocks
A block becomes sealed when:
- It reaches an internal fill threshold, or
- ASL decides to finalize it for policy reasons
Once sealed:
- No further writes are permitted
- Offsets and lengths become permanent
- The block becomes visible to CAS
- The block may be referenced by index entries
Sealed blocks are never modified. They can only be removed whole by garbage collection (Section 9).
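The open-to-sealed lifecycle can be sketched as a small state machine. The Block class, its fill_threshold parameter, and the in-memory buffer are hypothetical; a real implementation would write to the storage substrate and may seal for other policy reasons.

```python
from typing import Optional, Tuple


class Block:
    """Toy block that accepts sequential writes until it is sealed."""

    def __init__(self, block_id: int, fill_threshold: int) -> None:
        self.block_id = block_id
        self._fill_threshold = fill_threshold
        self._buf = bytearray()
        self._sealed_bytes: Optional[bytes] = None

    @property
    def sealed(self) -> bool:
        return self._sealed_bytes is not None

    def append(self, artifact: bytes) -> Tuple[int, int]:
        # Open blocks accept writes; sealed blocks reject them.
        if self.sealed:
            raise RuntimeError("block is sealed")
        offset = len(self._buf)
        self._buf += artifact
        if len(self._buf) >= self._fill_threshold:
            self.seal()
        return offset, len(artifact)

    def seal(self) -> None:
        # Freeze the contents; offsets and lengths are permanent from here on.
        if not self.sealed:
            self._sealed_bytes = bytes(self._buf)

    def read(self, offset: int, length: int) -> bytes:
        # Open blocks are not visible to readers.
        if self._sealed_bytes is None:
            raise RuntimeError("open blocks are not visible to readers")
        return self._sealed_bytes[offset:offset + length]


block = Block(block_id=1, fill_threshold=8)
print(block.append(b"abc"), block.sealed)    # (0, 3) False
print(block.append(b"defgh"), block.sealed)  # (3, 5) True
print(block.read(3, 5))                      # b'defgh'
```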
5. Packaging Policy (Non-Semantic)
ASL applies packaging heuristics when choosing how to place artifacts into blocks.
Examples:
- Prefer packing many small artifacts together
- Prefer isolating very large artifacts
- Avoid mixing artifacts of vastly different sizes in one block when convenient
Important rule
Packaging decisions are:
- Best-effort
- Local
- Replaceable
- Not part of the ASL contract
No higher layer may assume anything about block contents based on artifact size.
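As an illustration of how replaceable such a heuristic is, the sketch below reduces it to a single comparison. The function name choose_packing, the 1 MiB threshold, and the labels "shared" and "isolated" are all assumptions, not part of ASL.

```python
# Hypothetical cut-off; the specification names no concrete threshold.
LARGE_ARTIFACT_BYTES = 1 << 20  # 1 MiB


def choose_packing(artifact_size: int, large_threshold: int = LARGE_ARTIFACT_BYTES) -> str:
    """Best-effort hint: isolate very large artifacts, pack small ones together."""
    return "isolated" if artifact_size >= large_threshold else "shared"


print(choose_packing(4096))        # shared
print(choose_packing(8 << 20))     # isolated
```

The result only steers which open block receives the write. Readers still address the bytes by (block_id, offset, length) and cannot observe the choice.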
6. Storage Layout and Locality
6.1 Single Dataset, Structured Locality
ASL stores all blocks within a single ZFS dataset.
Within that dataset, ASL may organize blocks into subpaths to improve locality, e.g.:
asl/blocks/dense/
asl/blocks/sparse/
These subpaths:
- Exist purely for storage optimization
- May carry ZFS property overrides
- Are not encoded into block identity
Block resolution does not depend on knowing which subpath was used.
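One way to keep resolution independent of the subpath is to probe every known subpath for a file derived from the block ID alone. In the sketch below, the file naming scheme, the probe order, and the function names are assumptions; only the dense and sparse directory names come from the example above.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Subpaths named in this section; the search order is an assumption.
SUBPATHS = ("dense", "sparse")


def block_filename(block_id: int) -> str:
    # Hypothetical naming scheme: fixed-width hex plus a ".blk" suffix.
    return f"{block_id:016x}.blk"


def resolve_block(dataset_root: Path, block_id: int) -> Optional[Path]:
    """Find a block file by ID alone, probing every known subpath."""
    for sub in SUBPATHS:
        candidate = dataset_root / "asl" / "blocks" / sub / block_filename(block_id)
        if candidate.is_file():
            return candidate
    return None


# Usage: the caller never says which subpath holds the block.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for sub in SUBPATHS:
        (root / "asl" / "blocks" / sub).mkdir(parents=True)
    (root / "asl" / "blocks" / "sparse" / block_filename(42)).write_bytes(b"payload")
    found = resolve_block(root, 42)
    found_subpath = found.parent.name if found else None
    missing = resolve_block(root, 43)

print(found_subpath, missing)  # sparse None
```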
6.2 Placement Hints
At allocation time, ASL may apply placement hints, such as:
- Preferred directory
- Write size
- Compression preference
- Recordsize alignment
These hints:
- Affect only physical layout
- May change over time
- Do not affect block identity or correctness
7. Snapshot Semantics
ASL depends on snapshot-capable storage but is agnostic to snapshot identity: it never tracks or interprets individual snapshots.
Rules:
- ASL blocks live inside snapshot-capable storage
- Snapshots naturally pin sealed blocks
- ASL does not encode snapshot IDs into block IDs
- CAS determines snapshot visibility
ASL guarantees:
- Deterministic reads for a given snapshot
- No mutation of sealed blocks across snapshots
8. Crash Safety and Recovery
8.1 Crash During Open Block
If a crash occurs:
- Open blocks may be lost or abandoned
- Block IDs allocated but not sealed may be leaked
- No sealed block may be corrupted
Losing open blocks and leaking their IDs is acceptable and expected.
8.2 Recovery Rules
On startup, ASL:
- Scans for sealed blocks
- Ignores or cleans up abandoned open blocks
- Resumes allocation from the next unused block ID
No global replay or rebuild is required.
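The startup rules above can be sketched as a single pass over a directory listing. The ".blk" and ".open" suffixes and the recover function are hypothetical conventions chosen for the example. Deriving the next ID from the files on disk is also an assumption: an ID that was allocated but never reached disk leaves no trace, so a real implementation would likely persist the counter as well.

```python
from typing import Iterable, List, Tuple


def recover(filenames: Iterable[str]) -> Tuple[List[int], List[int], int]:
    """Classify block files and compute the next unused block ID."""
    sealed: List[int] = []
    abandoned: List[int] = []
    for name in filenames:
        stem, _, suffix = name.rpartition(".")
        if not (stem.isascii() and stem.isdigit()):
            continue  # not a block file
        if suffix == "blk":
            sealed.append(int(stem))     # sealed block: keep
        elif suffix == "open":
            abandoned.append(int(stem))  # open at crash time: ignore or clean up
    # Skip past every ID seen on disk so that none is ever reused.
    next_block_id = max(sealed + abandoned, default=0) + 1
    return sorted(sealed), sorted(abandoned), next_block_id


print(recover(["1.blk", "2.blk", "4.open", "notes.txt"]))  # ([1, 2], [4], 5)
```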
9. Garbage Collection
ASL performs garbage collection at block granularity.
Rules:
- A block is eligible for deletion if:
  - It is sealed, and
  - It is unreachable from all retained snapshots
- ASL does not perform partial block mutation
- Compaction (if any) rewrites artifacts into new blocks
Block deletion is irreversible.
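The eligibility rule reduces to a set difference, sketched below. The function name and the shape of snapshot_refs (a map from snapshot name to the block IDs it references) are assumptions; how reachability is actually computed is left to the layers that own the index.

```python
from typing import Dict, Set


def collectable_blocks(sealed: Set[int], snapshot_refs: Dict[str, Set[int]]) -> Set[int]:
    """Return sealed blocks that no retained snapshot references."""
    reachable: Set[int] = set()
    for refs in snapshot_refs.values():
        reachable |= refs
    # Only sealed blocks are candidates; open blocks are never passed in.
    return sealed - reachable


sealed_ids = {1, 2, 3, 4}
retained = {"snap-a": {1, 2}, "snap-b": {2, 4}}
print(sorted(collectable_blocks(sealed_ids, retained)))  # [3]
```

Whole blocks are either kept or deleted. Reclaiming space inside a partially live block would require compaction into new blocks, as stated above.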
10. Non-Goals (Explicit)
ASL explicitly does not provide:
- Artifact identity management
- Deduplication decisions
- Provenance interpretation
- Size-class semantics
- Execution semantics
Those concerns belong to CAS, PEL, and higher layers.
11. Design Summary (Executive)
- One block namespace
- One addressing model
- One read path
- Placement is an optimization
- Immutability is absolute
- Snapshots provide safety
- Size is a courtesy, not a contract