amduat-api/notes/ASL-Block-Architecture-&-Specification.md

# ASL Block Architecture & Specification

## 1. Purpose and Scope

The **Artifact Storage Layer (ASL)** is responsible for the **physical storage, layout, and retrieval of immutable artifact bytes**.
ASL operates beneath CAS and above the storage substrate (ZFS).

ASL concerns itself with:

* Efficient packaging of artifacts into blocks
* Stable block addressing
* Snapshot-safe immutability
* Storage-local optimizations

ASL does **not** define:

* Artifact identity
* Hash semantics
* Provenance
* Interpretation
* Indexing semantics

---

## 2. Core Abstractions

### 2.1 Artifact

An **artifact** is an immutable byte sequence produced or consumed by higher layers.

ASL treats artifacts as opaque bytes.

---

### 2.2 ASL Block

An **ASL block** is the smallest independently addressable, immutable unit of storage managed by ASL.

Properties:

* Identified by an **ASL Block ID**
* Contains one or more artifacts
* Written sequentially
* Immutable once sealed
* Snapshot-safe

ASL blocks are the unit of:

* Storage
* Reachability
* Garbage collection

---

### 2.3 ASL Block ID

An **ASL Block ID** is an opaque, stable identifier.

#### Invariants

* Globally unique within an ASL instance
* Never reused
* Never mutated
* Does **not** encode:

  * Artifact size
  * Placement
  * Snapshot
  * Storage topology
  * Policy decisions

#### Semantics

Block IDs identify *logical blocks*, not physical locations.

Higher layers must treat block IDs as uninterpretable tokens.

---

## 3. Addressing Model

ASL exposes a single addressing primitive:

```
(block_id, offset, length) → bytes
```

This is the **only** contract between CAS and ASL.

Notes:

* `offset` and `length` are stable for the lifetime of the block
* ASL guarantees that reads are deterministic per snapshot
* No size-class or block-kind information is exposed
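
The addressing primitive above can be illustrated with a minimal in-memory sketch. The names `BlockRef`, `SealedBlockStore`, `add_sealed`, and `read` are hypothetical and are not part of ASL; a dictionary stands in for the real storage substrate, and block IDs are plain integers only for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockRef:
    """Hypothetical address triple; callers treat block_id as an opaque token."""
    block_id: int
    offset: int
    length: int


class SealedBlockStore:
    """Toy in-memory stand-in for the sealed-block read path (an assumption)."""

    def __init__(self) -> None:
        self._sealed = {}  # block_id -> immutable bytes

    def add_sealed(self, block_id: int, data: bytes) -> None:
        # A sealed block is written exactly once and never replaced.
        if block_id in self._sealed:
            raise ValueError("sealed blocks are immutable")
        self._sealed[block_id] = bytes(data)

    def read(self, ref: BlockRef) -> bytes:
        # Raises KeyError for unknown (or still open) blocks.
        data = self._sealed[ref.block_id]
        end = ref.offset + ref.length
        if ref.offset < 0 or ref.length < 0 or end > len(data):
            raise ValueError("range lies outside the block")
        return data[ref.offset:end]
```

Because the stored bytes never change after `add_sealed`, repeated reads of the same triple return the same bytes, which is the determinism the notes above ask for.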

---

## 4. Block Allocation Model

### 4.1 Global Block Namespace

ASL maintains a **single global block namespace**.

Block IDs are allocated from a monotonically increasing sequence:

```
next_block_id := next_block_id + 1
```

Properties:

* Allocation is append-only
* Leaked IDs are permitted
* No coordination with CAS is required
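
A minimal sketch of such an allocator, assuming an in-process counter guarded by a lock. The class name `BlockIdAllocator` and the integer representation are illustrative assumptions; how the counter is persisted across restarts is not specified here.

```python
import threading


class BlockIdAllocator:
    """Hands out block IDs from a monotonically increasing sequence."""

    def __init__(self, next_block_id: int = 0) -> None:
        self._next = next_block_id
        self._lock = threading.Lock()

    def allocate(self) -> int:
        # Append-only: an ID is handed out once and never returned to a pool,
        # so an ID whose block is never sealed is simply leaked.
        with self._lock:
            block_id = self._next
            self._next += 1
            return block_id
```

Nothing in this sketch consults CAS, and there is deliberately no `free` or `release` method: leaking an ID is cheaper than risking reuse.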

---

### 4.2 Open Blocks

At any time, ASL may maintain one or more **open blocks**.

Open blocks:

* Accept new artifact writes
* Are not visible to readers
* Are not referenced by the index
* May be abandoned on crash

---

### 4.3 Sealed Blocks

A block becomes **sealed** when:

* It reaches an internal fill threshold, or
* ASL decides to finalize it for policy reasons

Once sealed:

* No further writes are permitted
* Offsets and lengths become permanent
* The block becomes visible to CAS
* The block may be referenced by index entries

Sealed blocks are immutable forever.
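
The open-to-sealed lifecycle can be sketched as a small state machine. This is an illustration under stated assumptions, not the ASL implementation: the `Block` class, its in-memory buffer, and the `fill_threshold` parameter are all hypothetical.

```python
from typing import Tuple


class Block:
    """Toy in-memory block: open while accepting writes, then sealed."""

    def __init__(self, block_id: int, fill_threshold: int) -> None:
        self.block_id = block_id
        self.fill_threshold = fill_threshold  # assumed policy knob, in bytes
        self.sealed = False
        self._buf = bytearray()

    def append(self, artifact: bytes) -> Tuple[int, int]:
        """Write an artifact sequentially and return its (offset, length)."""
        if self.sealed:
            raise RuntimeError("sealed blocks accept no further writes")
        offset = len(self._buf)
        self._buf += artifact
        if len(self._buf) >= self.fill_threshold:
            self.seal()  # fill threshold reached
        return offset, len(artifact)

    def seal(self) -> None:
        """Finalize the block; offsets and lengths are permanent from here on."""
        self._data = bytes(self._buf)
        self.sealed = True

    def read(self, offset: int, length: int) -> bytes:
        if not self.sealed:
            raise RuntimeError("open blocks are not visible to readers")
        return self._data[offset:offset + length]
```

Calling `seal()` directly models the "policy reasons" case; the automatic call inside `append` models the fill threshold.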

---

## 5. Packaging Policy (Non-Semantic)

ASL applies **packaging heuristics** when choosing how to place artifacts into blocks.

Examples:

* Prefer packing many small artifacts together
* Prefer isolating very large artifacts
* Where convenient, avoid mixing artifacts of vastly different sizes in the same block

### Important rule

Packaging decisions are:

* Best-effort
* Local
* Replaceable
* **Not part of the ASL contract**

No higher layer may assume anything about block contents based on artifact size.
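
One possible heuristic, sketched only to make the examples above concrete. The function `plan_blocks` and both thresholds are assumptions chosen for illustration; any other policy would satisfy the contract equally well.

```python
from typing import List, Sequence


def plan_blocks(
    sizes: Sequence[int],
    large_threshold: int = 1 << 20,  # assumed: 1 MiB counts as "very large"
    block_capacity: int = 4 << 20,   # assumed: target fill for shared blocks
) -> List[List[int]]:
    """Group artifact indexes into blocks; each inner list is one block."""
    blocks: List[List[int]] = []
    current: List[int] = []
    current_fill = 0
    for index, size in enumerate(sizes):
        if size >= large_threshold:
            blocks.append([index])  # isolate very large artifacts
            continue
        if current and current_fill + size > block_capacity:
            blocks.append(current)  # shared block is full; start a new one
            current, current_fill = [], 0
        current.append(index)       # pack small artifacts together
        current_fill += size
    if current:
        blocks.append(current)
    return blocks
```

The result is a plan, not a promise: callers above ASL never see it, which is what keeps the policy replaceable.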

---

## 6. Storage Layout and Locality

### 6.1 Single Dataset, Structured Locality

ASL stores all blocks within a **single ZFS dataset**.

Within that dataset, ASL may organize blocks into subpaths to improve locality, e.g.:

```
asl/blocks/dense/
asl/blocks/sparse/
```

These subpaths:

* Exist purely for storage optimization
* May carry ZFS property overrides
* Are not encoded into block identity

Block resolution does **not** depend on knowing which subpath was used.
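
A sketch of subpath-independent resolution, assuming blocks are stored as individual files. The file naming scheme (`block_filename`), the `.blk` suffix, and the probing strategy are hypothetical; only the `dense` and `sparse` subpath names come from the example above.

```python
from pathlib import Path
from typing import Optional

SUBPATHS = ("dense", "sparse")  # subpaths from the example layout


def block_filename(block_id: int) -> str:
    # Hypothetical naming scheme: zero-padded hex ID plus a ".blk" suffix.
    return f"{block_id:016x}.blk"


def resolve_block(root: Path, block_id: int) -> Optional[Path]:
    """Find a block file by ID without being told which subpath holds it."""
    name = block_filename(block_id)
    for subpath in SUBPATHS:
        candidate = root / "blocks" / subpath / name
        if candidate.is_file():
            return candidate
    return None
```

Probing every subpath is the simplest way to keep placement out of the identifier; a real implementation might cache the answer, but the cache would remain an internal detail.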

---

### 6.2 Placement Hints

At allocation time, ASL may apply **placement hints**, such as:

* Preferred directory
* Write size
* Compression preference
* Recordsize alignment

These hints:

* Affect only physical layout
* May change over time
* Do not affect block identity or correctness
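
The hints listed above could be carried in a small value object. The sketch below is illustrative only: `PlacementHints`, its field names, and `aligned_length` are assumptions, and the alignment helper shows just one way a recordsize hint might be used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlacementHints:
    """Hypothetical bundle of physical-layout hints; all fields are optional."""
    preferred_directory: Optional[str] = None  # e.g. "dense" or "sparse"
    write_size: Optional[int] = None           # bytes per write call
    compression: Optional[str] = None          # e.g. "lz4" or "off"
    recordsize: Optional[int] = None           # bytes, e.g. 131072 (128 KiB)


def aligned_length(length: int, hints: PlacementHints) -> int:
    """Round a length up to a multiple of the recordsize hint, if one is set."""
    if not hints.recordsize:
        return length
    records = -(-length // hints.recordsize)  # ceiling division
    return records * hints.recordsize
```

Dropping or changing a hint alters only where and how bytes land on disk; the `(block_id, offset, length)` triple a caller holds is unaffected.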

---

## 7. Snapshot Semantics

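ASL is snapshot-aware but snapshot-agnostic: it relies on snapshot-capable storage, but it neither tracks snapshots nor encodes them in its identifiers.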

Rules:

* ASL blocks live inside snapshot-capable storage
* Snapshots naturally pin sealed blocks
* ASL does not encode snapshot IDs into block IDs
* CAS determines snapshot visibility

ASL guarantees:

* Deterministic reads for a given snapshot
* No mutation of sealed blocks across snapshots

---

## 8. Crash Safety and Recovery

### 8.1 Crash During Open Block

If a crash occurs:

* Open blocks may be lost or abandoned
* Block IDs allocated but not sealed may be leaked
* Sealed blocks must never be corrupted

This is acceptable and expected.

---

### 8.2 Recovery Rules

On startup, ASL:

* Scans for sealed blocks
* Ignores or cleans up abandoned open blocks
* Resumes allocation from the next unused block ID

No global replay or rebuild is required.
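
The startup rules can be sketched as a pure function over a directory listing. The `.blk` and `.open` suffixes, the hex file names, and the function `plan_recovery` are hypothetical. The sketch also assumes that IDs of abandoned open blocks count as used, so they are leaked rather than reissued.

```python
from typing import Iterable, List, Tuple

SEALED_SUFFIX = ".blk"  # hypothetical suffix for sealed block files
OPEN_SUFFIX = ".open"   # hypothetical suffix for open block files


def plan_recovery(filenames: Iterable[str]) -> Tuple[List[int], List[str], int]:
    """Return (sealed block IDs, abandoned open files, next unused block ID)."""
    sealed_ids: List[int] = []
    abandoned: List[str] = []
    highest = -1
    for name in filenames:
        if name.endswith(SEALED_SUFFIX):
            block_id = int(name[: -len(SEALED_SUFFIX)], 16)
            sealed_ids.append(block_id)
        elif name.endswith(OPEN_SUFFIX):
            block_id = int(name[: -len(OPEN_SUFFIX)], 16)
            abandoned.append(name)  # candidate for cleanup; its ID is leaked
        else:
            continue  # unrelated file, ignore
        highest = max(highest, block_id)
    return sorted(sealed_ids), abandoned, highest + 1
```

A single pass over file names is all the sketch needs, which matches the claim that no global replay or rebuild is required.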

---

## 9. Garbage Collection

ASL performs garbage collection at **block granularity**.

Rules:

* A block is eligible for deletion if:

  * It is sealed, and
  * It is unreachable from all retained snapshots
* ASL does not perform partial block mutation
* Compaction (if any) rewrites artifacts into new blocks

Block deletion is irreversible.
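
The eligibility rule reduces to a set difference. In the sketch below, the function `collectable_blocks` and the shape of its inputs are assumptions; in particular, the per-snapshot reachability map is taken as given, since this document does not say how it is computed.

```python
from typing import Dict, Iterable, Set


def collectable_blocks(
    sealed: Set[int],
    retained_snapshots: Dict[str, Iterable[int]],
) -> Set[int]:
    """Return sealed block IDs that no retained snapshot can reach."""
    reachable: Set[int] = set()
    for block_ids in retained_snapshots.values():
        reachable.update(block_ids)
    # Open blocks never appear in `sealed`, so they are never returned here.
    return sealed - reachable
```

For example, with sealed blocks `{1, 2, 3, 4}` and snapshots reaching `[1, 2]` and `[2, 3]`, only block `4` is eligible. Whole blocks are returned; nothing in the sketch trims or edits a block in place.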

---

## 10. Non-Goals (Explicit)

ASL explicitly does **not** provide:

* Artifact identity management
* Deduplication decisions
* Provenance interpretation
* Size-class semantics
* Execution semantics

Those concerns belong to CAS, PEL, and higher layers.

---

## 11. Design Summary (Executive)

* One block namespace
* One addressing model
* One read path
* Placement is an optimization
* Immutability is absolute
* Snapshots provide safety
* Size is a courtesy, not a contract

Added some notes that need to be analyzed. 2026-01-17 00:19:49 +01:00