amduat-api/notes/ASL-STORE put-get contract.md
2026-01-17 00:19:49 +01:00


Great — this is the **right moment** to define it, because everything underneath (artifacts, DAGs, indexes, snapshots, logs) is now solid.
What follows is a **clean, minimal, *normative* ASL-STORE put/get contract** that:
* Sits *between* ASL-CORE / ASL-CORE-INDEX and any concrete storage engine
* Is compatible with your snapshot + log semantics
* Does **not** assume classical vs quantum
* Makes caching, deduplication, and replay possible
* Avoids over-specifying performance or layout
Think of this as **the membrane between semantics and mechanics**.
---
# ASL-STORE — Put/Get Contract (Normative)
## 1. Purpose
ASL-STORE defines the **operational contract** by which:
* Artifacts are **materialized and stored**
* Artifact content becomes **visible** via the ASL-CORE-INDEX
* Stored content is **retrieved deterministically**
ASL-STORE answers exactly two questions:
> **PUT**: How does an artifact become stored and indexed?
> **GET**: How are bytes retrieved once indexed?
Nothing more.
---
## 2. Scope
ASL-STORE defines:
* The **PUT lifecycle**
* The **GET lifecycle**
* Required interactions with:
  * Content Index (ASL-CORE-INDEX)
  * Structural DAG
  * Materialization cache
* Visibility and determinism rules
ASL-STORE does **not** define:
* Block allocation strategy
* File layout
* IO APIs
* Concurrency primitives
* Caching policies
* Garbage collection
* Replication mechanics
---
## 3. Actors and Dependencies
ASL-STORE operates in the presence of:
* **Artifact DAG** (SID-addressed)
* **Materialization Cache** (`SID → CID`, optional)
* **Content Index** (`CID → ArtifactLocation`)
* **Block Store** (opaque byte storage)
* **Snapshot + Log** (for index visibility)
ASL-STORE **must not** bypass the Content Index.
---
## 4. PUT Contract
### 4.1 PUT Signature (Semantic)
```
put(artifact) → (CID, IndexState)
```
Where:
* `artifact` is an ASL artifact (possibly lazy, possibly quantum)
* `CID` is the semantic content identity
* `IndexState = (SnapshotID, LogPosition)` after the put
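
As a rough illustration, the semantic signature can be written down as Python types. The names `IndexState` and `Store`, the integer fields, and the use of `str` for CIDs are assumptions made for this sketch; the contract itself does not prescribe any of them.

```python
from dataclasses import dataclass
from typing import Any, Protocol, Tuple

CID = str  # assumption: CIDs are represented as hex strings for readability


@dataclass(frozen=True, order=True)
class IndexState:
    """Index position after a put: (SnapshotID, LogPosition)."""
    snapshot_id: int
    log_position: int


class Store(Protocol):
    """Hypothetical interface mirroring the semantic PUT signature."""

    def put(self, artifact: Any) -> Tuple[CID, IndexState]:
        ...
```
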
---
### 4.2 PUT Semantics (Step-by-step)
The following steps are **logically ordered**.
An implementation may optimize, but may not violate the semantics.
---
#### Step 1 — Structural registration (mandatory)
* The artifact **must** be registered in the Structural Index (SID → DAG).
* If an identical SID already exists, it **must be reused**.
> This guarantees derivation identity independent of storage.
---
#### Step 2 — CID resolution (lazy, cache-aware)
* If `(SID → CID)` exists in the Materialization Cache:
  * Use it.
* Otherwise:
  * Materialize the artifact DAG
  * Compute the CID
  * Cache `(SID → CID)`
> Materialization may recursively invoke child artifacts.
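
A minimal sketch of Step 2, assuming a toy artifact shape (a leaf holds bytes, a `concat` node joins its children) and SHA-256 for both identities. `sid_of`, `materialize`, `resolve_cid`, and the `repr`-based SID encoding are hypothetical stand-ins; real SID and CID derivation belongs to ASL-CORE.

```python
import hashlib
from typing import Dict

# Hypothetical artifact shape for illustration only:
#   ("leaf", b"...")  or  ("concat", (child, child, ...))
materialization_cache: Dict[str, str] = {}  # SID -> CID
stats = {"materialize_calls": 0}            # lets us observe cache hits


def sid_of(artifact) -> str:
    """Structural identity: hash of the DAG shape (assumed encoding)."""
    return hashlib.sha256(repr(artifact).encode()).hexdigest()


def materialize(artifact) -> bytes:
    """Produce bytes, recursing into child artifacts."""
    stats["materialize_calls"] += 1
    kind, payload = artifact
    if kind == "leaf":
        return payload
    return b"".join(materialize(child) for child in payload)


def resolve_cid(artifact) -> str:
    """Use the cached SID -> CID mapping if present, else materialize and cache."""
    sid = sid_of(artifact)
    cached = materialization_cache.get(sid)
    if cached is not None:
        return cached
    cid = hashlib.sha256(materialize(artifact)).hexdigest()
    materialization_cache[sid] = cid
    return cid
```

Two structurally different artifacts (different SIDs) that materialize to the same bytes resolve to the same CID in this sketch, which is what makes the deduplication check in Step 3 possible.
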
---
#### Step 3 — Deduplication check (mandatory)
* Look up `CID` in the Content Index at CURRENT.
* If an entry exists:
  * **No bytes are written**
  * **No new index entry is required**
  * PUT completes successfully
> This is **global deduplication**.
---
#### Step 4 — Physical storage (conditional)
If no entry exists:
* Bytes corresponding to `CID` **must be written** to one or more blocks
* A concrete `ArtifactLocation` is produced:
```
ArtifactLocation = Sequence[BlockSlice]
BlockSlice = (BlockID, offset, length)
```
No assumptions are made about block layout.
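
The two definitions translate directly into immutable Python types. This is a sketch: the integer `BlockID` and the helper `total_length` are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BlockSlice:
    """One contiguous byte range inside a block."""
    block_id: int  # assumption: BlockIDs are integers
    offset: int
    length: int


# An artifact may span several slices, read in order.
ArtifactLocation = Tuple[BlockSlice, ...]


def total_length(location: ArtifactLocation) -> int:
    """Number of bytes the artifact occupies across all of its slices."""
    return sum(s.length for s in location)
```
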
---
#### Step 5 — Index mutation (mandatory)
* Append a **PUT log entry** to the Content Index:
```
CID → ArtifactLocation
```
* The entry is **not visible** until CURRENT advances to or past its log position (entry position ≤ CURRENT).
> This is the *only* moment storage becomes visible.
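
One way to picture Step 5 is an append-only list plus a CURRENT marker. The class `ContentIndexLog` and its method names are hypothetical, and log positions are assumed to be 1-based integers.

```python
from typing import List, Optional, Tuple

BlockSlice = Tuple[int, int, int]          # (BlockID, offset, length)
ArtifactLocation = Tuple[BlockSlice, ...]


class ContentIndexLog:
    """Toy append-only index log; entries at positions 1..current are visible."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, ArtifactLocation]] = []
        self.current = 0  # CURRENT log position

    def append_put(self, cid: str, location: ArtifactLocation) -> int:
        """Append a PUT entry and return its log position (not yet visible)."""
        self._entries.append((cid, location))
        return len(self._entries)

    def advance_current(self, position: int) -> None:
        """Move CURRENT forward; it may never move backwards or past the log end."""
        if not self.current <= position <= len(self._entries):
            raise ValueError("CURRENT must advance monotonically within the log")
        self.current = position

    def lookup(self, cid: str) -> Optional[ArtifactLocation]:
        """Resolve a CID using only entries at or before CURRENT."""
        for entry_cid, location in self._entries[: self.current]:
            if entry_cid == cid:
                return location
        return None
```
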
---
### 4.3 PUT Guarantees
After successful PUT:
* The artifact's CID:
  * Is stable
  * Is retrievable
  * Will resolve to immutable bytes
* The Content Index state:
  * Advances monotonically
  * Is replayable
* Repeating PUT with the same artifact:
  * Is idempotent
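
The five steps and the idempotence guarantee can be exercised with a toy in-memory store. Everything here is an assumption made for illustration: artifacts are plain `bytes` (so materialization is trivial), SHA-256 stands in for both SID and CID derivation, each artifact gets its own block, and a block counts as sealed as soon as it is appended.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

BlockSlice = Tuple[int, int, int]           # (BlockID, offset, length)
ArtifactLocation = Tuple[BlockSlice, ...]
IndexState = Tuple[int, int]                # (SnapshotID, LogPosition)


class InMemoryStore:
    """Toy store following the logical PUT order; not a real storage engine."""

    def __init__(self) -> None:
        self.structural: Dict[str, bytes] = {}   # SID -> artifact (stand-in for the DAG)
        self.cache: Dict[str, str] = {}          # SID -> CID
        self.blocks: List[bytes] = []            # BlockID is the list position
        self.log: List[Tuple[str, ArtifactLocation]] = []
        self.snapshot_id = 0
        self.current = 0                         # CURRENT log position

    def _lookup(self, cid: str) -> Optional[ArtifactLocation]:
        for entry_cid, location in self.log[: self.current]:
            if entry_cid == cid:
                return location
        return None

    def put(self, artifact: bytes) -> Tuple[str, IndexState]:
        # Step 1: structural registration, reusing an existing SID.
        sid = hashlib.sha256(b"sid:" + artifact).hexdigest()
        self.structural.setdefault(sid, artifact)
        # Step 2: cache-aware CID resolution.
        cid = self.cache.get(sid)
        if cid is None:
            cid = hashlib.sha256(artifact).hexdigest()
            self.cache[sid] = cid
        # Step 3: deduplication check against the index at CURRENT.
        if self._lookup(cid) is None:
            # Step 4: physical storage (one block per artifact in this sketch).
            self.blocks.append(artifact)
            location = ((len(self.blocks) - 1, 0, len(artifact)),)
            # Step 5: append the log entry, then make it visible.
            self.log.append((cid, location))
            self.current = len(self.log)
        return cid, (self.snapshot_id, self.current)
```

Calling `put` twice with the same bytes returns the same CID and the same index state, and leaves exactly one block and one log entry behind.
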
---
## 5. GET Contract
### 5.1 GET Signature (Semantic)
```
get(CID, IndexState?) → bytes | NOT_FOUND
```
Where:
* `CID` is the content identity
* `IndexState` is optional:
  * Defaults to CURRENT
  * May specify `(SnapshotID, LogPosition)`
---
### 5.2 GET Semantics
1. Resolve `CID → ArtifactLocation` using:
```
Index(snapshot, log_prefix)
```
2. If no entry exists:
   * Return `NOT_FOUND`
3. Otherwise:
   * For each `BlockSlice` in the `ArtifactLocation`, in order, read exactly `length` bytes from `(BlockID, offset)`
   * Return the concatenated bytes **verbatim**
No interpretation is applied.
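
A sketch of GET as a pure function. It assumes a single snapshot, so the index state reduces to a log prefix, and it models `NOT_FOUND` as `None`. The argument names and the dict-based block store are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

BlockSlice = Tuple[int, int, int]                    # (BlockID, offset, length)
LogEntry = Tuple[str, Tuple[BlockSlice, ...]]        # (CID, ArtifactLocation)


def get(
    cid: str,
    log: Sequence[LogEntry],
    blocks: Dict[int, bytes],
    log_position: Optional[int] = None,
) -> Optional[bytes]:
    """Resolve the CID against a log prefix and return the stored bytes verbatim."""
    # Default to CURRENT, taken here to be the whole log.
    prefix = log if log_position is None else log[:log_position]
    for entry_cid, location in prefix:
        if entry_cid == cid:
            # Read each slice in order and concatenate; no interpretation applied.
            return b"".join(
                blocks[block_id][offset : offset + length]
                for block_id, offset, length in location
            )
    return None  # NOT_FOUND
```

The function reads from `log` and `blocks` without modifying them, which matches the requirement that GET never mutates state.
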
---
### 5.3 GET Guarantees
* Returned bytes are:
  * Immutable
  * Deterministic
  * Content-addressed
* GET never triggers materialization
* GET never mutates state
---
## 6. Visibility Rules
An index entry is visible **if and only if**:
1. The referenced block is sealed
2. The log entry position ≤ CURRENT log position
3. The snapshot + log prefix includes the entry
ASL-STORE must respect these rules strictly.
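
The three conditions can be read as a single predicate. In this sketch, `IndexEntry`, `is_visible`, and `prefix_end` (the last log position covered by the snapshot + log prefix being read) are assumed names, and log positions are assumed to be integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    cid: str
    block_sealed: bool   # condition 1: the referenced block is sealed
    log_position: int    # position of the PUT entry in the log


def is_visible(entry: IndexEntry, current_log_position: int, prefix_end: int) -> bool:
    """True only if all three visibility conditions hold."""
    return (
        entry.block_sealed
        and entry.log_position <= current_log_position  # condition 2
        and entry.log_position <= prefix_end            # condition 3
    )
```
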
---
## 7. Failure Semantics (Minimal)
ASL-STORE must guarantee:
* No visible index entry points to missing or mutable bytes
* Partial writes must not become visible
* Replaying snapshot + log after crash yields a valid index
No stronger guarantees are required at this level.
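
A hedged sketch of crash replay under these rules. It assumes the recovering process knows which blocks were sealed, and it stops at the first log entry that references an unsealed block so the result remains a strict log prefix. That stopping rule is one possible policy chosen for illustration, not something this contract mandates.

```python
from typing import Dict, Iterable, Set, Tuple

BlockSlice = Tuple[int, int, int]                # (BlockID, offset, length)
ArtifactLocation = Tuple[BlockSlice, ...]
LogEntry = Tuple[str, ArtifactLocation]          # (CID, ArtifactLocation)


def replay(
    snapshot: Dict[str, ArtifactLocation],
    log: Iterable[LogEntry],
    sealed_blocks: Set[int],
) -> Dict[str, ArtifactLocation]:
    """Rebuild the index from snapshot + log without exposing partial writes."""
    index = dict(snapshot)  # never mutate the snapshot itself
    for cid, location in log:
        if not all(block_id in sealed_blocks for block_id, _, _ in location):
            break  # assumed policy: truncate replay at the first non-durable entry
        index.setdefault(cid, location)  # the first entry for a CID wins
    return index
```

Replaying the same snapshot and log twice yields the same index, which is the property the Determinism Contract relies on.
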
---
## 8. Determinism Contract
For fixed:
* SnapshotID
* LogPosition
* Artifact DAG
* Hash algorithms
ASL-STORE guarantees:
* Same CID
* Same ArtifactLocation resolution
* Same GET results
This is the foundation for:
* Federation
* Replication
* Provenance
* Certification
---
## 9. Relationship to q-bits (Explicit)
ASL-STORE:
* Is **agnostic** to q-bits
* Sees only:
  * CIDs
  * Bytes
* Does not care how CIDs were derived
Quantum artifacts affect **materialization**, not storage semantics.
---
## 10. Summary (Mental Model)
You can summarize ASL-STORE like this:
> **PUT** = “Make this artifact real, once, forever”
> **GET** = “Given identity and time, give me the bytes”
Everything else is someone else's problem, by design.
---
If you want, next we can:
* define **crash consistency boundaries**
* add **federated PUT/GET**
* define **certificates binding DAG → CID → Snapshot**
* show **how LLM outputs become artifacts**
* or formalize **overlay indexes** (concepts, domains, ontologies)
This layer is now solid enough to build *anything* above it.