# ASL Block Architecture & Specification

## 1. Purpose and Scope

The **Artifact Storage Layer (ASL)** is responsible for the **physical storage, layout, and retrieval of immutable artifact bytes**.
ASL operates beneath CAS and above the storage substrate (ZFS).

ASL concerns itself with:

* Efficient packaging of artifacts into blocks
* Stable block addressing
* Snapshot-safe immutability
* Storage-local optimizations

ASL does **not** define:

* Artifact identity
* Hash semantics
* Provenance
* Interpretation
* Indexing semantics

---

## 2. Core Abstractions

### 2.1 Artifact

An **artifact** is an immutable byte sequence produced or consumed by higher layers.

ASL treats artifacts as opaque bytes.

---

### 2.2 ASL Block

An **ASL block** is the smallest independently addressable, immutable unit of storage managed by ASL.

Properties:

* Identified by an **ASL Block ID**
* Contains one or more artifacts
* Written sequentially
* Immutable once sealed
* Snapshot-safe

ASL blocks are the unit of:

* Storage
* Reachability
* Garbage collection

---

### 2.3 ASL Block ID

An **ASL Block ID** is an opaque, stable identifier.

#### Invariants

* Unique across the entire ASL instance
* Never reused
* Never mutated
* Does **not** encode:

  * Artifact size
  * Placement
  * Snapshot
  * Storage topology
  * Policy decisions

#### Semantics

Block IDs identify *logical blocks*, not physical locations.

Higher layers must treat block IDs as uninterpretable tokens.

---

## 3. Addressing Model

ASL exposes a single addressing primitive:

```
(block_id, offset, length) → bytes
```

This is the **only** contract between CAS and ASL.

Notes:

* Once a block is sealed, `offset` and `length` remain valid for the lifetime of the block
* Reads are deterministic for a given snapshot: the same `(block_id, offset, length)` always returns the same bytes
* No size-class or block-kind information is exposed
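
As an illustration of this contract, here is a minimal in-memory sketch. The `InMemoryASL` class, its method names, and the use of `int` as the block ID representation are assumptions made for the example; the specification only fixes the shape `(block_id, offset, length) → bytes`.

```python
from typing import Dict, NewType

# Assumption: block IDs are represented as ints here; callers must still
# treat them as opaque tokens.
BlockId = NewType("BlockId", int)


class InMemoryASL:
    """Toy stand-in for ASL that holds sealed blocks as immutable bytes."""

    def __init__(self) -> None:
        self._sealed: Dict[BlockId, bytes] = {}

    def add_sealed_block(self, block_id: BlockId, data: bytes) -> None:
        # Hypothetical helper used only to populate the example.
        if block_id in self._sealed:
            raise ValueError("block IDs are never reused")
        self._sealed[block_id] = bytes(data)

    def read(self, block_id: BlockId, offset: int, length: int) -> bytes:
        """(block_id, offset, length) -> bytes, with bounds checking."""
        block = self._sealed[block_id]  # KeyError if unknown or not sealed
        if offset < 0 or length < 0 or offset + length > len(block):
            raise ValueError("requested range falls outside the block")
        return block[offset:offset + length]
```

The caller passes nothing but the triple, so no size class, block kind, or placement detail can leak through this interface.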

---

## 4. Block Allocation Model

### 4.1 Global Block Namespace

ASL maintains a **single global block namespace**.

Block IDs are allocated from a strictly increasing sequence, so the same ID is never handed out twice:

```
next_block_id := next_block_id + 1
```

Properties:

* Allocation is append-only
* Leaked IDs (allocated but never sealed) are permitted; gaps in the sequence are harmless
* No coordination with CAS is required
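
A minimal sketch of such an allocator follows. The class name `BlockIdAllocator`, the starting value of 1, and the use of a lock are assumptions for illustration; how the counter is persisted across restarts is outside the scope of this sketch.

```python
import threading


class BlockIdAllocator:
    """Hands out strictly increasing block IDs; IDs are never reused."""

    def __init__(self, next_block_id: int = 1) -> None:
        # Assumption: the starting point is supplied by recovery logic.
        self._next = next_block_id
        self._lock = threading.Lock()

    def allocate(self) -> int:
        # Append-only: the counter only moves forward. An ID that is
        # allocated but never sealed is simply leaked.
        with self._lock:
            block_id = self._next
            self._next += 1
            return block_id
```

Because the allocator never takes an ID back, a crash between allocation and sealing costs one number in the sequence and nothing else.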

---

### 4.2 Open Blocks

At any time, ASL may maintain one or more **open blocks**.

Open blocks:

* Accept new artifact writes
* Are not visible to readers
* Are not referenced by the index
* May be abandoned on crash

---

### 4.3 Sealed Blocks

A block becomes **sealed** when:

* It reaches an internal fill threshold, or
* ASL decides to finalize it for policy reasons

Once sealed:

* No further writes are permitted
* Offsets and lengths become permanent
* The block becomes visible to CAS
* The block may be referenced by index entries

Sealed blocks are immutable forever.
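
The open and sealed states can be sketched as a small state machine. This is an in-memory illustration under stated assumptions: the `Block` class, its method names, and the byte-count fill threshold are hypothetical and not part of the specification.

```python
from typing import Optional, Tuple


class Block:
    """Toy block: appendable while open, readable only once sealed."""

    def __init__(self, block_id: int, fill_threshold: int) -> None:
        self.block_id = block_id
        self._fill_threshold = fill_threshold  # hypothetical policy knob
        self._buf = bytearray()
        self._sealed_data: Optional[bytes] = None

    @property
    def sealed(self) -> bool:
        return self._sealed_data is not None

    def append(self, artifact: bytes) -> Tuple[int, int]:
        """Write sequentially and return the artifact's (offset, length)."""
        if self.sealed:
            raise RuntimeError("sealed blocks accept no further writes")
        offset = len(self._buf)
        self._buf += artifact
        return offset, len(artifact)

    def should_seal(self) -> bool:
        return len(self._buf) >= self._fill_threshold

    def seal(self) -> None:
        # Freeze the contents; offsets and lengths are permanent from here.
        if not self.sealed:
            self._sealed_data = bytes(self._buf)

    def read(self, offset: int, length: int) -> bytes:
        if self._sealed_data is None:
            raise RuntimeError("open blocks are not visible to readers")
        if offset < 0 or length < 0 or offset + length > len(self._sealed_data):
            raise ValueError("requested range falls outside the block")
        return self._sealed_data[offset:offset + length]
```

A caller would record the `(offset, length)` pairs returned by `append` and publish them to the index only after `seal` has completed.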

---

## 5. Packaging Policy (Non-Semantic)

ASL applies **packaging heuristics** when choosing how to place artifacts into blocks.

Examples:

* Prefer packing many small artifacts together
* Prefer isolating very large artifacts
* Where convenient, avoid mixing artifacts of vastly different sizes in one block

### Important rule

Packaging decisions are:

* Best-effort
* Local
* Replaceable
* **Not part of the ASL contract**

No higher layer may assume anything about block contents based on artifact size.
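
To make the "best-effort and replaceable" nature concrete, here is one possible heuristic as a sketch. The 8 MiB threshold, the `Packing` enum, and the function name are all assumptions; the specification defines no thresholds and permits any other policy.

```python
from enum import Enum

# Hypothetical threshold; the specification does not define one.
LARGE_ARTIFACT_BYTES = 8 * 1024 * 1024


class Packing(Enum):
    SHARED = "shared"      # pack alongside other small artifacts
    ISOLATED = "isolated"  # give the artifact a block of its own


def choose_packing(artifact_size: int) -> Packing:
    """Best-effort, replaceable heuristic; callers must not rely on it."""
    if artifact_size < 0:
        raise ValueError("artifact size cannot be negative")
    if artifact_size >= LARGE_ARTIFACT_BYTES:
        return Packing.ISOLATED
    return Packing.SHARED
```

The result influences only which open block receives the write. It is never returned to CAS and never recorded in the block ID.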

---

## 6. Storage Layout and Locality

### 6.1 Single Dataset, Structured Locality

ASL stores all blocks within a **single ZFS dataset**.

Within that dataset, ASL may organize blocks into subpaths to improve locality, e.g.:

```
asl/blocks/dense/
asl/blocks/sparse/
```

These subpaths:

* Exist purely for storage optimization
* May carry ZFS property overrides
* Are not encoded into block identity

Block resolution does **not** depend on knowing which subpath was used.
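
One way to satisfy this is to probe each subpath for a file named after the block ID. The sketch below assumes a file-per-block layout, a hypothetical `<16 hex digits>.blk` naming scheme, and the two example subpaths shown above; none of these are mandated by the specification.

```python
from pathlib import Path
from typing import Optional

# Subpath names taken from the example layout; the set may change over time.
SUBPATHS = ("dense", "sparse")


def block_filename(block_id: int) -> str:
    # Hypothetical naming scheme: zero-padded hex ID plus a .blk suffix.
    return f"{block_id:016x}.blk"


def resolve_block(root: Path, block_id: int) -> Optional[Path]:
    """Find a block's file without knowing which subpath holds it."""
    name = block_filename(block_id)
    for sub in SUBPATHS:
        candidate = root / sub / name
        if candidate.is_file():
            return candidate
    return None
```

Because the lookup derives the filename from the ID alone, blocks can be moved between subpaths without invalidating any reference held by a higher layer.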

---

### 6.2 Placement Hints

At allocation time, ASL may apply **placement hints**, such as:

* Preferred directory
* Write size
* Compression preference
* Recordsize alignment

These hints:

* Affect only physical layout
* May change over time
* Do not affect block identity or correctness

---

## 7. Snapshot Semantics

ASL depends on snapshot-capable storage but does not track, name, or interpret snapshots itself.

Rules:

* ASL blocks live inside snapshot-capable storage
* Snapshots naturally pin sealed blocks
* ASL does not encode snapshot IDs into block IDs
* CAS determines snapshot visibility

ASL guarantees:

* Deterministic reads for a given snapshot
* No mutation of sealed blocks across snapshots

---

## 8. Crash Safety and Recovery

### 8.1 Crash During Open Block

If a crash occurs:

* Open blocks may be lost or abandoned
* Block IDs allocated but not sealed may be leaked
* Sealed blocks are never corrupted

This is acceptable and expected.

---

### 8.2 Recovery Rules

On startup, ASL:

* Scans for sealed blocks
* Ignores or cleans up abandoned open blocks
* Resumes allocation from the next unused block ID

No global replay or rebuild is required.
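
The startup scan can be sketched as a single directory pass. This sketch assumes one flat directory, a `.blk` suffix for sealed blocks, a `.open` suffix for open blocks, and hex-encoded IDs as filenames; all of these conventions are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def recover(block_dir: Path) -> Tuple[List[int], int]:
    """Return (sealed block IDs, next block ID to allocate)."""
    sealed: List[int] = []
    highest = 0
    # Materialize the listing first, since abandoned files are deleted below.
    for path in list(block_dir.iterdir()):
        if path.suffix not in (".blk", ".open"):
            continue  # not a block file
        try:
            block_id = int(path.stem, 16)
        except ValueError:
            continue  # unrecognized name; leave it alone
        # Count abandoned IDs too, so a leaked ID is not handed out again.
        highest = max(highest, block_id)
        if path.suffix == ".blk":
            sealed.append(block_id)
        else:
            path.unlink()  # clean up the abandoned open block
    return sorted(sealed), highest + 1
```

A scan like this cannot see IDs that were allocated but never reached disk, so an implementation that needs the strict never-reused guarantee would likely also persist the allocation counter.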

---

## 9. Garbage Collection

ASL performs garbage collection at **block granularity**.

Rules:

* A block is eligible for deletion if:

  * It is sealed, and
  * It is unreachable from all retained snapshots
* ASL does not perform partial block mutation
* Compaction (if any) rewrites artifacts into new blocks and never modifies an existing block in place

Block deletion is irreversible.
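
The eligibility rule reduces to a set difference. In this sketch the function name and its inputs are assumptions: the per-snapshot reachable sets are taken as given, since the specification does not say how they are computed or supplied.

```python
from typing import Iterable, Set


def collectable_blocks(
    sealed: Set[int],
    reachable_per_snapshot: Iterable[Set[int]],
) -> Set[int]:
    """Sealed blocks that no retained snapshot can reach."""
    reachable: Set[int] = set()
    for blocks in reachable_per_snapshot:
        reachable |= blocks
    # Only sealed blocks are considered; open blocks are never candidates.
    return sealed - reachable
```

A block referenced by even one retained snapshot stays out of the result, which matches the whole-block granularity: there is no notion of partially reclaiming a block.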

---

## 10. Non-Goals (Explicit)

ASL explicitly does **not** provide:

* Artifact identity management
* Deduplication decisions
* Provenance interpretation
* Size-class semantics
* Execution semantics

Those concerns belong to CAS, PEL, and higher layers.

---

## 11. Design Summary (Executive)

* One block namespace
* One addressing model
* One read path
* Placement is an optimization
* Immutability is absolute
* Snapshots provide safety
* Size-based packing is a courtesy, not a contract