# ASL Block Architecture & Specification

## 1. Purpose and Scope

The **Artifact Storage Layer (ASL)** is responsible for the **physical storage, layout, and retrieval of immutable artifact bytes**.
ASL operates beneath CAS and above the storage substrate (ZFS).

ASL concerns itself with:

* Efficient packaging of artifacts into blocks
* Stable block addressing
* Snapshot-safe immutability
* Storage-local optimizations

ASL does **not** define:

* Artifact identity
* Hash semantics
* Provenance
* Interpretation
* Indexing semantics

---

## 2. Core Abstractions

### 2.1 Artifact

An **artifact** is an immutable byte sequence produced or consumed by higher layers.

ASL treats artifacts as opaque bytes.

---

### 2.2 ASL Block

An **ASL block** is the smallest independently addressable, immutable unit of storage managed by ASL.

Properties:

* Identified by an **ASL Block ID**
* Contains one or more artifacts
* Written sequentially
* Immutable once sealed
* Snapshot-safe

ASL blocks are the unit of:

* Storage
* Reachability
* Garbage collection

---

### 2.3 ASL Block ID

An **ASL Block ID** is an opaque, stable identifier.

#### Invariants

* Globally unique within an ASL instance
* Never reused
* Never mutated
* Does **not** encode:
  * Artifact size
  * Placement
  * Snapshot
  * Storage topology
  * Policy decisions

#### Semantics

Block IDs identify *logical blocks*, not physical locations.

Higher layers must treat block IDs as uninterpretable tokens.

---

## 3. Addressing Model

ASL exposes a single addressing primitive:

```
(block_id, offset, length) → bytes
```

This is the **only** contract between CAS and ASL.

Notes:

* `offset` and `length` are stable for the lifetime of the block
* ASL guarantees that reads are deterministic per snapshot
* No size-class or block-kind information is exposed

---

## 4. Block Allocation Model

### 4.1 Global Block Namespace

ASL maintains a **single global block namespace**.
Block IDs are allocated from a monotonically increasing sequence:

```
next_block_id := next_block_id + 1
```

Properties:

* Allocation is append-only
* Leaked IDs are permitted
* No coordination with CAS is required

---

### 4.2 Open Blocks

At any time, ASL may maintain one or more **open blocks**.

Open blocks:

* Accept new artifact writes
* Are not visible to readers
* Are not referenced by the index
* May be abandoned on crash

---

### 4.3 Sealed Blocks

A block becomes **sealed** when:

* It reaches an internal fill threshold, or
* ASL decides to finalize it for policy reasons

Once sealed:

* No further writes are permitted
* Offsets and lengths become permanent
* The block becomes visible to CAS
* The block may be referenced by index entries

Sealed blocks are immutable forever.

---

## 5. Packaging Policy (Non-Semantic)

ASL applies **packaging heuristics** when choosing how to place artifacts into blocks.

Examples:

* Prefer packing many small artifacts together
* Prefer isolating very large artifacts
* Avoid mixing vastly different sizes when convenient

### Important rule

Packaging decisions are:

* Best-effort
* Local
* Replaceable
* **Not part of the ASL contract**

No higher layer may assume anything about block contents based on artifact size.

---

## 6. Storage Layout and Locality

### 6.1 Single Dataset, Structured Locality

ASL stores all blocks within a **single ZFS dataset**.

Within that dataset, ASL may organize blocks into subpaths to improve locality, e.g.:

```
asl/blocks/dense/
asl/blocks/sparse/
```

These subpaths:

* Exist purely for storage optimization
* May carry ZFS property overrides
* Are not encoded into block identity

Block resolution does **not** depend on knowing which subpath was used.
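
One way to picture subpath-independent resolution is a lookup that probes every subpath for the block instead of deriving the location from the ID. The sketch below is illustrative only: the integer block ID, the `.blk` file naming, and the helpers `block_filename`, `place_block`, and `resolve_block` are assumptions made for this example, not part of the ASL contract.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Subpath names taken from the example layout; the set is an assumption
# and could change without affecting block identity.
SUBPATHS = ("dense", "sparse")


def block_filename(block_id: int) -> str:
    # Hypothetical naming scheme: zero-padded hex ID with a .blk suffix.
    return f"{block_id:016x}.blk"


def place_block(root: Path, block_id: int, data: bytes, hint: str) -> Path:
    """Write a block under the subpath chosen by a placement hint."""
    if hint not in SUBPATHS:
        raise ValueError(f"unknown placement hint: {hint}")
    target = root / hint / block_filename(block_id)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target


def resolve_block(root: Path, block_id: int) -> Optional[Path]:
    """Find a block by ID alone, without knowing which subpath holds it."""
    name = block_filename(block_id)
    for sub in SUBPATHS:
        candidate = root / sub / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        blocks_root = Path(tmp) / "asl" / "blocks"
        place_block(blocks_root, 7, b"artifact bytes", "sparse")
        # The caller supplies only the block ID, never the subpath.
        print(resolve_block(blocks_root, 7))
```

Because the caller never passes a subpath, the placement policy can move or re-categorize blocks without invalidating any block ID held by a higher layer.
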

---

### 6.2 Placement Hints

At allocation time, ASL may apply **placement hints**, such as:

* Preferred directory
* Write size
* Compression preference
* Recordsize alignment

These hints:

* Affect only physical layout
* May change over time
* Do not affect block identity or correctness

---

## 7. Snapshot Semantics

ASL is snapshot-aware but snapshot-agnostic.

Rules:

* ASL blocks live inside snapshot-capable storage
* Snapshots naturally pin sealed blocks
* ASL does not encode snapshot IDs into block IDs
* CAS determines snapshot visibility

ASL guarantees:

* Deterministic reads for a given snapshot
* No mutation of sealed blocks across snapshots

---

## 8. Crash Safety and Recovery

### 8.1 Crash During Open Block

If a crash occurs:

* Open blocks may be lost or abandoned
* Block IDs allocated but not sealed may be leaked
* No sealed block may be corrupted

This is acceptable and expected.

---

### 8.2 Recovery Rules

On startup, ASL:

* Scans for sealed blocks
* Ignores or cleans up abandoned open blocks
* Resumes allocation from the next unused block ID

No global replay or rebuild is required.

---

## 9. Garbage Collection

ASL performs garbage collection at **block granularity**.

Rules:

* A block is eligible for deletion if:
  * It is sealed, and
  * It is unreachable from all retained snapshots
* ASL does not perform partial block mutation
* Compaction (if any) rewrites artifacts into new blocks

Block deletion is irreversible.

---

## 10. Non-Goals (Explicit)

ASL explicitly does **not** provide:

* Artifact identity management
* Deduplication decisions
* Provenance interpretation
* Size-class semantics
* Execution semantics

Those concerns belong to CAS, PEL, and higher layers.

---

## 11. Design Summary (Executive)

* One block namespace
* One addressing model
* One read path
* Placement is an optimization
* Immutability is absolute
* Snapshots provide safety
* Size is a courtesy, not a contract
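
The summary above can be made concrete with a toy in-memory model of the block lifecycle: monotonic ID allocation, open blocks that readers cannot see, sealing, and the `(block_id, offset, length)` read primitive. This is a minimal sketch under stated assumptions, not the ASL implementation: the class `ToyASL`, its method names, integer block IDs, and the in-memory dictionary are all hypothetical, and real storage, snapshots, and garbage collection are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Block:
    data: bytearray = field(default_factory=bytearray)
    sealed: bool = False


class ToyASL:
    """In-memory model of the block lifecycle; not a storage implementation."""

    def __init__(self) -> None:
        self._next_block_id = 0
        self._blocks: Dict[int, Block] = {}

    def open_block(self) -> int:
        # IDs come from a monotonically increasing sequence and are never reused.
        self._next_block_id += 1
        block_id = self._next_block_id
        self._blocks[block_id] = Block()
        return block_id

    def append(self, block_id: int, artifact: bytes) -> Tuple[int, int]:
        """Append an artifact to an open block and return (offset, length)."""
        block = self._blocks[block_id]
        if block.sealed:
            raise ValueError("sealed blocks are immutable")
        offset = len(block.data)
        block.data.extend(artifact)
        return offset, len(artifact)

    def seal(self, block_id: int) -> None:
        # After sealing, offsets and lengths are permanent.
        self._blocks[block_id].sealed = True

    def read(self, block_id: int, offset: int, length: int) -> bytes:
        """The single addressing primitive: (block_id, offset, length) -> bytes."""
        block = self._blocks[block_id]
        if not block.sealed:
            raise LookupError("open blocks are not visible to readers")
        if offset < 0 or length < 0 or offset + length > len(block.data):
            raise ValueError("range lies outside the block")
        return bytes(block.data[offset:offset + length])

    def abandon_open_blocks(self) -> None:
        # Models crash recovery: open blocks are dropped, their IDs stay leaked.
        self._blocks = {bid: b for bid, b in self._blocks.items() if b.sealed}


if __name__ == "__main__":
    asl = ToyASL()
    bid = asl.open_block()
    offset, length = asl.append(bid, b"artifact")
    asl.seal(bid)
    print(asl.read(bid, offset, length))
```

Note that `abandon_open_blocks` never rewinds the ID counter, which mirrors the rule that leaked IDs are permitted and allocation resumes from the next unused ID.
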