diff --git a/tier1/asl-indexes-1.md b/tier1/asl-indexes-1.md
index f0d7c7d..fe53f6d 100644
--- a/tier1/asl-indexes-1.md
+++ b/tier1/asl-indexes-1.md
@@ -76,6 +76,38 @@ Properties:
 
 This is the ASL/1-CORE-INDEX and is the only index that governs visibility.
 
+### 2.2 Structural Identity (SID)
+
+SID is the canonical identity of a derivation, not of bytes.
+
+```
+SID = H(ProgramRef || Inputs[] || ParamsRef || ExecProfile)
+```
+
+Notes:
+
+* `Inputs[]` order is canonical and stable.
+* `ParamsRef` is optional; absence must be encoded explicitly in the hash.
+* `ExecProfile` captures execution profile/versioning parameters (optional, but
+  presence/absence is part of the SID).
+
+### 2.2.1 SID Canonicalization (Normative)
+
+Implementations MUST canonicalize SID inputs as follows:
+
+1. **ProgramRef** is encoded as `ReferenceBytes` (`ENC/ASL1-CORE`).
+2. **Inputs[]** are ordered exactly as declared by the Program DAG inputs.
+3. **ParamsRef** is encoded as:
+   * `0x00` if absent, or
+   * `0x01 || ReferenceBytes` if present.
+4. **ExecProfile** is encoded as:
+   * `0x00` if absent, or
+   * `0x01 || ExecProfileBytes` if present.
+5. **SID hash input** is the concatenation of the above fields with no padding.
+
+`ExecProfileBytes` is an opaque, deterministic byte sequence defined by the
+execution environment. Any change in encoding or content MUST change SID.
+
 ### 2.2 Structural Index
 
 Purpose: map structural identity to a derivation DAG node.
@@ -90,7 +122,51 @@ Properties:
 * Does not imply materialization
 * May be in-memory or persisted
 
-### 2.3 Materialization Cache
+### 2.3 Derivation Index
+
+Purpose: map a materialized ArtifactKey to the set of known derivations that
+produce it.
+
+```
+ArtifactKey -> [DerivationRecord]
+```
+
+Where:
+
+```
+DerivationRecord = { SID, ProgramRef, InputRefs[], ParamsRef, ExecProfile }
+```
+
+Properties:
+
+* Authoritative for known derivations
+* Recomputable from replay + execution, so storage is optional
+* Enables dedup and semantic correlation across multiple derivations
+* Multiple SIDs MAY map to the same ArtifactKey
+
+#### 2.3.1 DerivationRecord Data Model (Normative)
+
+Canonical fields:
+
+```
+DerivationRecord {
+  sid: SID
+  program_ref: Reference
+  input_refs: Reference[] // ordered as Program DAG inputs
+  params_ref: Optional
+  exec_profile: Optional
+}
+```
+
+Rules:
+
+* `input_refs` order MUST be preserved.
+* `params_ref` and `exec_profile` MUST use explicit presence markers as defined
+  in SID canonicalization.
+* Additional metadata MAY be stored but MUST NOT affect SID or canonical
+  equivalence.
+
+### 2.4 Materialization Cache
 
 Purpose: record previously materialized content for a structural identity.
 
@@ -112,6 +188,7 @@ Dependencies MUST follow this direction:
 
 ```
 Structural Index -> Materialization Cache -> Content Index
+Derivation Index -> Content Index
 ```
 
 Rules:
@@ -119,6 +196,7 @@ Rules:
 * The Content Index MUST NOT depend on the Structural Index.
 * The Structural Index MUST NOT depend on stored bytes.
 * The Materialization Cache MAY depend on both.
+* The Derivation Index MAY depend on the Content Index.
 
 ---
 
@@ -126,7 +204,10 @@ Rules:
 
 * PUT registers structure (if used), resolves to an ArtifactKey, and updates the Content Index.
 * GET consults only the Content Index and reads bytes from the store.
-* The Structural Index and Materialization Cache are optional optimizations for PUT.
+* The Structural Index, Derivation Index, and Materialization Cache are optional
+  optimizations for PUT.
+
+Note: versioning relationships are modeled in TGK, not in these indexes.
 
 ---
 
@@ -137,3 +218,9 @@ ASL/INDEXES/1 does not define:
 * Encodings for any index
 * Storage layout or sharding
 * Query operators or traversal semantics
+
+---
+
+## Changelog
+
+- 2026-01-18: Added SID canonicalization rules and DerivationRecord data model.
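
Review note: the SID canonicalization and DerivationRecord rules added in this diff can be sketched roughly as follows. This is a minimal illustration, not the spec's encoding. It assumes SHA-256 stands in for the unspecified `H`, treats `ReferenceBytes` and `ExecProfileBytes` as already-encoded opaque `bytes` (the real layout is defined by `ENC/ASL1-CORE` and the execution environment), and assumes each input is encoded the same way as `ProgramRef`. All function and variable names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple

ABSENT = b"\x00"   # presence marker: field is absent
PRESENT = b"\x01"  # presence marker: field bytes follow


def encode_optional(value: Optional[bytes]) -> bytes:
    """Encode an optional field as 0x00 if absent, or 0x01 || bytes if present."""
    return ABSENT if value is None else PRESENT + value


def sid_hash_input(program_ref: bytes,
                   inputs: Sequence[bytes],
                   params_ref: Optional[bytes] = None,
                   exec_profile: Optional[bytes] = None) -> bytes:
    """Concatenate the canonical fields in order, with no padding."""
    return b"".join([
        program_ref,
        *inputs,  # kept in declared order; never sorted
        encode_optional(params_ref),
        encode_optional(exec_profile),
    ])


def compute_sid(program_ref: bytes,
                inputs: Sequence[bytes],
                params_ref: Optional[bytes] = None,
                exec_profile: Optional[bytes] = None) -> bytes:
    """Hash the canonical input. SHA-256 is an assumption standing in for H."""
    data = sid_hash_input(program_ref, inputs, params_ref, exec_profile)
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class DerivationRecord:
    """Hypothetical in-memory mirror of the canonical DerivationRecord fields."""
    sid: bytes
    program_ref: bytes
    input_refs: Tuple[bytes, ...]         # order preserved
    params_ref: Optional[bytes] = None    # None models "absent"
    exec_profile: Optional[bytes] = None  # None models "absent"


def make_record(program_ref: bytes,
                input_refs: Sequence[bytes],
                params_ref: Optional[bytes] = None,
                exec_profile: Optional[bytes] = None) -> DerivationRecord:
    """Build a record whose SID is derived from its own canonical fields."""
    refs = tuple(input_refs)
    sid = compute_sid(program_ref, refs, params_ref, exec_profile)
    return DerivationRecord(sid, program_ref, refs, params_ref, exec_profile)


# Derivation Index sketch: ArtifactKey -> [DerivationRecord].
derivation_index: Dict[bytes, List[DerivationRecord]] = {}


def register_derivation(artifact_key: bytes, record: DerivationRecord) -> None:
    """Record one more known derivation of an artifact, skipping duplicate SIDs."""
    records = derivation_index.setdefault(artifact_key, [])
    if all(existing.sid != record.sid for existing in records):
        records.append(record)
```

Because the fields are concatenated without length prefixes, the sketch only yields unambiguous hash input if the reference encoding is fixed-width or self-delimiting. That is an assumption about `ENC/ASL1-CORE` that the diff does not state. Note also that an empty-but-present `ExecProfile` (`0x01`) hashes differently from an absent one (`0x00`), which is the point of the explicit presence markers.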