# ASL Block Architecture & Specification

## 1. Purpose and Scope

The **Artifact Storage Layer (ASL)** is responsible for the **physical storage, layout, and retrieval of immutable artifact bytes**.
ASL operates beneath CAS and above the storage substrate (ZFS).

ASL concerns itself with:

* Efficient packaging of artifacts into blocks
* Stable block addressing
* Snapshot-safe immutability
* Storage-local optimizations

ASL does **not** define:

* Artifact identity
* Hash semantics
* Provenance
* Interpretation
* Indexing semantics

---

## 2. Core Abstractions

### 2.1 Artifact

An **artifact** is an immutable byte sequence produced or consumed by higher layers.

ASL treats artifacts as opaque bytes.

---

### 2.2 ASL Block

An **ASL block** is the smallest independently addressable, immutable unit of storage managed by ASL.

Properties:

* Identified by an **ASL Block ID**
* Contains one or more artifacts
* Written sequentially
* Immutable once sealed
* Snapshot-safe

ASL blocks are the unit of:

* Storage
* Reachability
* Garbage collection

---

### 2.3 ASL Block ID

An **ASL Block ID** is an opaque, stable identifier.

#### Invariants

* Globally unique within an ASL instance
* Never reused
* Never mutated
* Does **not** encode:
  * Artifact size
  * Placement
  * Snapshot
  * Storage topology
  * Policy decisions

#### Semantics

Block IDs identify *logical blocks*, not physical locations.

Higher layers must treat block IDs as uninterpretable tokens.
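
One way a higher layer might enforce the "uninterpretable token" rule is to wrap the raw value in a type that supports only equality and hashing. The sketch below is illustrative: the `BlockId` class and its `_raw` field are hypothetical names, and the specification does not prescribe any particular representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockId:
    """Opaque token for a logical block (hypothetical wrapper)."""

    _raw: int  # private by convention: callers must not parse or compute on it


# Higher layers may only compare, hash, and hand the token back to ASL.
a, b = BlockId(7), BlockId(7)
index = {a: "artifact-location"}
same_block = index[b]  # equal tokens resolve to the same index entry
```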

---

## 3. Addressing Model

ASL exposes a single addressing primitive:

```
(block_id, offset, length) → bytes
```

This is the **only** contract between CAS and ASL.

Notes:

* `offset` and `length` are stable for the lifetime of the block
* ASL guarantees that reads are deterministic per snapshot
* No size-class or block-kind information is exposed
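
The addressing primitive can be sketched as a single read function over sealed blocks. This is a toy in-memory model, not the real storage path: `AslStore`, `put_sealed`, and the choice of exceptions are assumptions made for illustration.

```python
class AslStore:
    """Toy in-memory stand-in for ASL's read path (hypothetical)."""

    def __init__(self) -> None:
        self._sealed: dict[int, bytes] = {}  # block_id -> immutable block bytes

    def put_sealed(self, block_id: int, data: bytes) -> None:
        if block_id in self._sealed:
            raise ValueError("block IDs are never reused")
        self._sealed[block_id] = bytes(data)

    def read(self, block_id: int, offset: int, length: int) -> bytes:
        """(block_id, offset, length) -> bytes, with bounds checking."""
        block = self._sealed[block_id]  # KeyError if unknown or not sealed
        if offset < 0 or length < 0 or offset + length > len(block):
            raise ValueError("range falls outside the block")
        return block[offset:offset + length]


store = AslStore()
store.put_sealed(1, b"hello" + b"world")  # two artifacts packed back to back
first = store.read(1, 0, 5)
second = store.read(1, 5, 5)
```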

---

## 4. Block Allocation Model

### 4.1 Global Block Namespace

ASL maintains a **single global block namespace**.

Block IDs are allocated from a monotonically increasing sequence:

```
next_block_id := next_block_id + 1
```

Properties:

* Allocation is append-only
* Leaked IDs are permitted
* No coordination with CAS is required
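
A minimal sketch of such an allocator follows. The class name, the starting value of 1, and the use of a lock for thread safety are assumptions; the specification only requires that the sequence increases and that IDs are never reissued.

```python
import threading


class BlockIdAllocator:
    """Append-only ID source: an ID, once handed out, is never reissued."""

    def __init__(self, start: int = 1) -> None:
        self._next = start  # assumed starting point
        self._lock = threading.Lock()

    def allocate(self) -> int:
        with self._lock:
            block_id = self._next
            self._next += 1  # next_block_id := next_block_id + 1
            return block_id


alloc = BlockIdAllocator()
first = alloc.allocate()
leaked = alloc.allocate()  # e.g. the block is abandoned before sealing
third = alloc.allocate()   # the leaked ID is skipped, not recycled
```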

---

### 4.2 Open Blocks

At any time, ASL may maintain one or more **open blocks**.

Open blocks:

* Accept new artifact writes
* Are not visible to readers
* Are not referenced by the index
* May be abandoned on crash

---

### 4.3 Sealed Blocks

A block becomes **sealed** when:

* It reaches an internal fill threshold, or
* ASL decides to finalize it for policy reasons

Once sealed:

* No further writes are permitted
* Offsets and lengths become permanent
* The block becomes visible to CAS
* The block may be referenced by index entries

Sealed blocks are immutable forever.
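
The open-to-sealed lifecycle described in Sections 4.2 and 4.3 can be modelled as a small state machine. The sketch below is an in-memory toy under stated assumptions: the `Block` class, the byte-count `fill_threshold`, and the use of `PermissionError` are all hypothetical choices.

```python
class Block:
    """Toy block that is writable while open and frozen once sealed."""

    def __init__(self, block_id: int, fill_threshold: int) -> None:
        self.block_id = block_id
        self.fill_threshold = fill_threshold  # hypothetical limit in bytes
        self._buf = bytearray()
        self.sealed = False

    def append(self, artifact: bytes) -> tuple[int, int]:
        """Write sequentially; return the artifact's (offset, length)."""
        if self.sealed:
            raise PermissionError("sealed blocks accept no further writes")
        offset = len(self._buf)
        self._buf += artifact
        if len(self._buf) >= self.fill_threshold:
            self.seal()
        return offset, len(artifact)

    def seal(self) -> None:
        self.sealed = True

    def read(self, offset: int, length: int) -> bytes:
        if not self.sealed:
            raise PermissionError("open blocks are not visible to readers")
        return bytes(self._buf[offset:offset + length])


block = Block(block_id=1, fill_threshold=8)
loc_a = block.append(b"abc")    # 3 bytes written, block stays open
loc_b = block.append(b"defgh")  # 8 bytes written, threshold reached, sealed
```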

---

## 5. Packaging Policy (Non-Semantic)

ASL applies **packaging heuristics** when choosing how to place artifacts into blocks.

Examples:

* Prefer packing many small artifacts together
* Prefer isolating very large artifacts
* Avoid mixing vastly different sizes when convenient

### Important rule

Packaging decisions are:

* Best-effort
* Local
* Replaceable
* **Not part of the ASL contract**

No higher layer may assume anything about block contents based on artifact size.
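
A packaging heuristic of this kind might look like the sketch below. The two size cutoffs and the returned labels are invented for illustration; because packaging is not part of the ASL contract, any such function could be replaced without callers noticing.

```python
SMALL_MAX = 64 * 1024         # assumed cutoff for "small" artifacts
LARGE_MIN = 16 * 1024 * 1024  # assumed cutoff for "very large" artifacts


def choose_packing(size: int) -> str:
    """Return a best-effort packing preference for an artifact of `size` bytes."""
    if size >= LARGE_MIN:
        return "isolate"      # give the artifact a block of its own
    if size <= SMALL_MAX:
        return "pack-small"   # share a block with other small artifacts
    return "pack-medium"      # share a block with similarly sized artifacts


choices = [choose_packing(n) for n in (512, 1_000_000, 64 * 1024 * 1024)]
```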

---

## 6. Storage Layout and Locality

### 6.1 Single Dataset, Structured Locality

ASL stores all blocks within a **single ZFS dataset**.

Within that dataset, ASL may organize blocks into subpaths to improve locality, e.g.:

```
asl/blocks/dense/
asl/blocks/sparse/
```

These subpaths:

* Exist purely for storage optimization
* May carry ZFS property overrides
* Are not encoded into block identity

Block resolution does **not** depend on knowing which subpath was used.
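
One way to keep resolution independent of the subpath is to probe every known subpath for the block ID. The sketch below assumes one file per block and a hypothetical `<hex id>.blk` naming scheme, neither of which is stated in the specification; a real implementation might instead keep a lookup table.

```python
import tempfile
from pathlib import Path
from typing import Optional

SUBPATHS = ("dense", "sparse")  # the example subpaths from the layout above


def block_filename(block_id: int) -> str:
    return f"{block_id:016x}.blk"  # hypothetical on-disk naming scheme


def resolve(root: Path, block_id: int) -> Optional[Path]:
    """Find a block by ID alone; the caller never names a subpath."""
    for sub in SUBPATHS:
        candidate = root / "asl" / "blocks" / sub / block_filename(block_id)
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for sub in SUBPATHS:
        (root / "asl" / "blocks" / sub).mkdir(parents=True)
    (root / "asl" / "blocks" / "sparse" / block_filename(42)).write_bytes(b"x")
    found = resolve(root, 42)
    found_subpath = found.parent.name if found else None
    missing = resolve(root, 43)
```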

---

### 6.2 Placement Hints

At allocation time, ASL may apply **placement hints**, such as:

* Preferred directory
* Write size
* Compression preference
* Recordsize alignment

These hints:

* Affect only physical layout
* May change over time
* Do not affect block identity or correctness
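
Placement hints could be carried as a small advisory record that influences only the physical path. All field names below, and the fallback to the `dense` subpath, are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlacementHints:
    """Advisory layout preferences; every field name is hypothetical."""

    preferred_dir: Optional[str] = None     # e.g. "dense" or "sparse"
    write_size: Optional[int] = None        # bytes per write
    compression: Optional[str] = None       # e.g. "lz4" or "off"
    recordsize_align: Optional[int] = None  # alignment in bytes


def physical_dir(hints: PlacementHints) -> str:
    """Hints pick a directory; they never feed into the block ID."""
    return f"asl/blocks/{hints.preferred_dir or 'dense'}"  # assumed default


dir_default = physical_dir(PlacementHints())
dir_sparse = physical_dir(PlacementHints(preferred_dir="sparse", compression="lz4"))
```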

---

## 7. Snapshot Semantics

ASL is snapshot-aware but snapshot-agnostic.

Rules:

* ASL blocks live inside snapshot-capable storage
* Snapshots naturally pin sealed blocks
* ASL does not encode snapshot IDs into block IDs
* CAS determines snapshot visibility

ASL guarantees:

* Deterministic reads for a given snapshot
* No mutation of sealed blocks across snapshots

---

## 8. Crash Safety and Recovery

### 8.1 Crash During Open Block

If a crash occurs:

* Open blocks may be lost or abandoned
* Block IDs allocated but not sealed may be leaked
* No sealed block may be corrupted

This is acceptable and expected.

---

### 8.2 Recovery Rules

On startup, ASL:

* Scans for sealed blocks
* Ignores or cleans up abandoned open blocks
* Resumes allocation from the next unused block ID

No global replay or rebuild is required.
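
The startup steps above can be sketched as a single directory scan. The sketch assumes one file per block, named `<id>.sealed` or `<id>.open`; that naming is hypothetical and only serves to make the open/sealed distinction visible on disk.

```python
import tempfile
from pathlib import Path


def recover(block_dir: Path) -> tuple[set[int], int]:
    """Return (sealed block IDs, next block ID to allocate)."""
    sealed: set[int] = set()
    highest = 0
    for path in list(block_dir.iterdir()):  # snapshot the listing before deleting
        block_id = int(path.stem)
        highest = max(highest, block_id)    # abandoned IDs stay consumed
        if path.suffix == ".sealed":
            sealed.add(block_id)
        elif path.suffix == ".open":
            path.unlink()                   # clean up the abandoned open block
    return sealed, highest + 1


with tempfile.TemporaryDirectory() as tmp:
    block_dir = Path(tmp)
    for name in ("1.sealed", "2.sealed", "3.open"):
        (block_dir / name).write_bytes(b"")
    sealed_ids, next_id = recover(block_dir)
    leftover = sorted(p.name for p in block_dir.iterdir())
```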

---

## 9. Garbage Collection

ASL performs garbage collection at **block granularity**.

Rules:

* A block is eligible for deletion if:
  * It is sealed, and
  * It is unreachable from all retained snapshots
* ASL does not perform partial block mutation
* Compaction (if any) rewrites artifacts into new blocks

Block deletion is irreversible.
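
The eligibility rule reduces to a set difference. In the sketch below, the `snapshot_refs` mapping from snapshot name to referenced block IDs is an assumed input; the specification does not say how reachability is computed or who supplies it.

```python
def gc_candidates(
    sealed: set[int],
    snapshot_refs: dict[str, set[int]],
) -> set[int]:
    """Sealed blocks that no retained snapshot references."""
    reachable: set[int] = set()
    for refs in snapshot_refs.values():
        reachable |= refs
    return sealed - reachable  # open blocks are never in `sealed`, so never eligible


sealed_blocks = {1, 2, 3, 4}
retained = {"snap-a": {1, 2}, "snap-b": {2, 4}}  # snapshot -> referenced blocks
deletable = gc_candidates(sealed_blocks, retained)
```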

---

## 10. Non-Goals (Explicit)

ASL explicitly does **not** provide:

* Artifact identity management
* Deduplication decisions
* Provenance interpretation
* Size-class semantics
* Execution semantics

Those concerns belong to CAS, PEL, and higher layers.

---

## 11. Design Summary (Executive)

* One block namespace
* One addressing model
* One read path
* Placement is an optimization
* Immutability is absolute
* Snapshots provide safety
* Size is a courtesy, not a contract