Refine ASL indexes taxonomy

Carl Niklas Rydberg 2026-01-18 06:55:00 +01:00
parent 7a3dcc3978
commit 8c5fa71388


@ -76,6 +76,38 @@ Properties:
This is the ASL/1-CORE-INDEX and is the only index that governs visibility.
### 2.2 Structural Identity (SID)
SID is the canonical identity of a derivation, not of bytes.
```
SID = H(ProgramRef || Inputs[] || ParamsRef || ExecProfile)
```
Notes:
* `Inputs[]` order is canonical and stable.
* `ParamsRef` is optional; its absence must be encoded explicitly in the hash input.
* `ExecProfile` captures execution-profile and versioning parameters. It is optional, but
its presence or absence is part of the SID.
#### 2.2.1 SID Canonicalization (Normative)
Implementations MUST canonicalize SID inputs as follows:
1. **ProgramRef** is encoded as `ReferenceBytes` (`ENC/ASL1-CORE`).
2. **Inputs[]** are ordered exactly as declared by the Program DAG inputs.
3. **ParamsRef** is encoded as:
* `0x00` if absent, or
* `0x01 || ReferenceBytes` if present.
4. **ExecProfile** is encoded as:
* `0x00` if absent, or
* `0x01 || ExecProfileBytes` if present.
5. **SID hash input** is the concatenation of the above fields, in the order listed, with no padding.
`ExecProfileBytes` is an opaque, deterministic byte sequence defined by the
execution environment. Any change in its encoding or content MUST change the SID.
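
The canonicalization steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. It assumes `H` is SHA-256 (the spec does not name a hash function) and that every reference is passed in already encoded as `ReferenceBytes`. It also assumes those encodings are self-delimiting, so that plain concatenation is unambiguous. The function names are hypothetical.

```python
import hashlib
from typing import Optional, Sequence

ABSENT = b"\x00"   # presence marker for a missing optional field
PRESENT = b"\x01"  # presence marker that precedes an optional field's bytes


def encode_optional(payload: Optional[bytes]) -> bytes:
    """Encode an optional field with an explicit presence marker."""
    return ABSENT if payload is None else PRESENT + payload


def sid_hash_input(
    program_ref: bytes,
    inputs: Sequence[bytes],
    params_ref: Optional[bytes] = None,
    exec_profile: Optional[bytes] = None,
) -> bytes:
    """Concatenate the canonical fields in order, with no padding."""
    parts = [program_ref]
    parts.extend(inputs)  # declared Program DAG order; never sorted
    parts.append(encode_optional(params_ref))
    parts.append(encode_optional(exec_profile))
    return b"".join(parts)


def compute_sid(
    program_ref: bytes,
    inputs: Sequence[bytes],
    params_ref: Optional[bytes] = None,
    exec_profile: Optional[bytes] = None,
) -> bytes:
    """Hash the canonical input. SHA-256 is an assumption of this sketch."""
    data = sid_hash_input(program_ref, inputs, params_ref, exec_profile)
    return hashlib.sha256(data).digest()
```

Because the markers are explicit, an absent `ExecProfile` (`0x00`) and a present but empty one (`0x01` followed by zero bytes) produce different hash inputs and therefore different SIDs.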
### 2.2 Structural Index
Purpose: map structural identity to a derivation DAG node.
@ -90,7 +122,51 @@ Properties:
* Does not imply materialization
* May be in-memory or persisted
### 2.3 Materialization Cache
### 2.3 Derivation Index
Purpose: map a materialized ArtifactKey to the set of known derivations that
produce it.
```
ArtifactKey -> [DerivationRecord]
```
Where:
```
DerivationRecord = { SID, ProgramRef, InputRefs[], ParamsRef, ExecProfile }
```
Properties:
* Authoritative for known derivations
* Recomputable by replay and execution, so persisting it is optional
* Enables dedup and semantic correlation across multiple derivations
* Multiple SIDs MAY map to the same ArtifactKey
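
One way to picture the mapping is an in-memory multimap from ArtifactKey to records, deduplicated by SID. The sketch below is illustrative only: the class and method names are hypothetical, keys and SIDs are treated as opaque bytes, and records are stored as opaque values.

```python
from typing import Any, Dict, List


class DerivationIndex:
    """Toy ArtifactKey -> [DerivationRecord] multimap (hypothetical API)."""

    def __init__(self) -> None:
        # artifact_key -> {sid -> record}; dicts keep insertion order
        self._records: Dict[bytes, Dict[bytes, Any]] = {}

    def add(self, artifact_key: bytes, sid: bytes, record: Any) -> bool:
        """Register a derivation; return False if this SID is already known."""
        by_sid = self._records.setdefault(artifact_key, {})
        if sid in by_sid:
            return False
        by_sid[sid] = record
        return True

    def derivations(self, artifact_key: bytes) -> List[Any]:
        """Return all known derivations of an artifact, possibly none."""
        return list(self._records.get(artifact_key, {}).values())
```

Two different SIDs registered under the same key both appear in `derivations(key)`, which mirrors the property that multiple SIDs MAY map to one ArtifactKey.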
#### 2.3.1 DerivationRecord Data Model (Normative)
Canonical fields:
```
DerivationRecord {
sid: SID
program_ref: Reference
input_refs: Reference[] // ordered as Program DAG inputs
params_ref: Optional<Reference>
exec_profile: Optional<ExecProfileBytes>
}
```
Rules:
* `input_refs` order MUST be preserved.
* `params_ref` and `exec_profile` MUST use explicit presence markers as defined
in SID canonicalization.
* Additional metadata MAY be stored but MUST NOT affect SID or canonical
equivalence.
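
The rules above can be mirrored in a small Python data class. This is a sketch under assumptions: references and `ExecProfileBytes` are modeled as opaque `bytes`, `None` stands for "absent", and the `metadata` field is a hypothetical stand-in for "additional metadata". It is excluded from comparison so that it cannot affect canonical equivalence.

```python
from dataclasses import dataclass, field
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class DerivationRecord:
    sid: bytes
    program_ref: bytes
    input_refs: Tuple[bytes, ...]         # tuple keeps Program DAG input order
    params_ref: Optional[bytes] = None    # None means absent, b"" means present but empty
    exec_profile: Optional[bytes] = None  # same presence convention as params_ref
    # Extra metadata is ignored by == and hash(), so it never changes
    # canonical equivalence.
    metadata: Mapping[str, str] = field(default_factory=dict, compare=False)
```

Two records that differ only in `metadata` compare equal, while reordering `input_refs` or switching an optional field between absent and empty makes them unequal.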
### 2.4 Materialization Cache
Purpose: record previously materialized content for a structural identity.
@ -112,6 +188,7 @@ Dependencies MUST follow this direction:
```
Structural Index -> Materialization Cache -> Content Index
Derivation Index -> Content Index
```
Rules:
@ -119,6 +196,7 @@ Rules:
* The Content Index MUST NOT depend on the Structural Index.
* The Structural Index MUST NOT depend on stored bytes.
* The Materialization Cache MAY depend on both.
* The Derivation Index MAY depend on the Content Index.
---
@ -126,7 +204,10 @@ Rules:
* PUT registers structure (if used), resolves to an ArtifactKey, and updates the Content Index.
* GET consults only the Content Index and reads bytes from the store.
* The Structural Index and Materialization Cache are optional optimizations for PUT.
* The Structural Index, Derivation Index, and Materialization Cache are optional
optimizations for PUT.
Note: versioning relationships are modeled in TGK, not in these indexes.
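
The PUT and GET behavior described above can be sketched with a toy in-memory store. Everything here is an assumption made for illustration: the class and attribute names are hypothetical, and the ArtifactKey is modeled as the SHA-256 digest of the bytes, which this document does not specify.

```python
import hashlib
from typing import Dict, Optional, Set


class ToyStore:
    """In-memory illustration of PUT/GET (hypothetical, not normative)."""

    def __init__(self) -> None:
        self.blobs: Dict[bytes, bytes] = {}     # stored bytes by ArtifactKey
        self.content_index: Set[bytes] = set()  # the only index GET consults
        self.structural_index: Set[bytes] = set()  # optional, PUT-side only

    def put(self, data: bytes, sid: Optional[bytes] = None) -> bytes:
        if sid is not None:
            self.structural_index.add(sid)   # register structure, if used
        key = hashlib.sha256(data).digest()  # assumed ArtifactKey derivation
        self.blobs[key] = data
        self.content_index.add(key)          # makes the artifact visible
        return key

    def get(self, key: bytes) -> Optional[bytes]:
        # GET never touches the structural side.
        if key not in self.content_index:
            return None
        return self.blobs[key]
```

Clearing `structural_index` after a PUT leaves GET unaffected, which is the sense in which the other indexes are optional optimizations.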
---
@ -137,3 +218,9 @@ ASL/INDEXES/1 does not define:
* Encodings for any index
* Storage layout or sharding
* Query operators or traversal semantics
---
## Changelog
- 2026-01-18: Added SID canonicalization rules and DerivationRecord data model.