# ASL/INDEXES/1 -- Index Taxonomy and Relationships

Status: Draft
Owner: Architecture
Version: 0.1.0
SoT: No
Last Updated: 2026-01-18
Linked Phase Pack: N/A
Tags: [indexes, content, structural, materialization]

<!-- Source: /amduat-api/tier1/asl-indexes-1.md | Canonical: /amduat/tier1/asl-indexes-1.md -->

**Document ID:** `ASL/INDEXES/1`
**Layer:** L2 -- Index taxonomy (no encoding)

**Depends on (normative):**

* `ASL/1-CORE-INDEX`
* `ASL/STORE-INDEX/1`

**Informative references:**

* `ASL/SYSTEM/1`
* `TGK/1`
* `ENC/ASL-CORE-INDEX/1`

© 2025 Niklas Rydberg.

## License

Except where otherwise noted, this document (text and diagrams) is licensed under
the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
Universal (CC0) to enable unrestricted reuse in implementations and derivative
specifications.

Code examples in this document are provided under the Apache License 2.0 unless
explicitly stated otherwise. Test vectors, where present, are dedicated to the
public domain under CC0 1.0.

---

## 0. Conventions

The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.

ASL/INDEXES/1 defines index roles and relationships. It does not define encodings or storage layouts.

---

## 1. Purpose

This document defines the minimal set of indexes used by ASL systems and their dependency relationships.

---

## 2. Index Taxonomy (Normative)

ASL systems use three distinct indexes (Content, Structural, and Derivation), plus a Materialization Cache:

### 2.1 Content Index

Purpose: map semantic identity to bytes.

```
ArtifactKey -> ArtifactLocation
```

Properties:

* Snapshot-relative and append-only
* Deterministic replay
* Optional tombstone shadowing

This is the ASL/1-CORE-INDEX and is the only index that governs visibility.
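
The three properties above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the behavior defined by `ASL/1-CORE-INDEX`: `Entry`, `ContentIndex`, and their methods are hypothetical names, keys and locations are plain strings, a snapshot is modeled as a log length, and a tombstone is modeled as an entry with no location.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    key: str                 # stands in for ArtifactKey (assumption)
    location: Optional[str]  # stands in for ArtifactLocation; None marks a tombstone


@dataclass
class ContentIndex:
    log: List[Entry] = field(default_factory=list)

    def append(self, key: str, location: Optional[str]) -> int:
        """Append an entry (never rewrite) and return the new snapshot id."""
        self.log.append(Entry(key, location))
        return len(self.log)

    def tombstone(self, key: str) -> int:
        """Shadow earlier entries for key without deleting them."""
        return self.append(key, None)

    def lookup(self, key: str, snapshot: Optional[int] = None) -> Optional[str]:
        """Resolve key as of a snapshot by replaying the log prefix in order."""
        end = len(self.log) if snapshot is None else snapshot
        result = None
        for entry in self.log[:end]:
            if entry.key == key:
                result = entry.location  # later entries shadow earlier ones
        return result


index = ContentIndex()
s1 = index.append("key-a", "segment-0:0")
s2 = index.tombstone("key-a")
```

In this model `index.lookup("key-a", s1)` still resolves to the location, while a lookup at `s2` (or at the latest snapshot) returns `None` because the tombstone shadows the earlier entry. Replaying the same log always yields the same answers, which is the sense of "deterministic replay" assumed here.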

### 2.2 Structural Identity (SID)

SID is the canonical identity of a derivation, not of bytes.

```
SID = H(ProgramRef || Inputs[] || ParamsRef || ExecProfile)
```

Notes:

* `Inputs[]` order is canonical and stable.
* `ParamsRef` is optional; its absence must be encoded explicitly in the hash input.
* `ExecProfile` captures execution profile/versioning parameters. It is optional, but
  its presence or absence is part of the SID.

#### 2.2.1 SID Canonicalization (Normative)

Implementations MUST canonicalize SID inputs as follows:

1. **ProgramRef** is encoded as `ReferenceBytes` (`ENC/ASL1-CORE`).
2. **Inputs[]** are ordered exactly as declared by the Program DAG inputs.
3. **ParamsRef** is encoded as:
   * `0x00` if absent, or
   * `0x01 || ReferenceBytes` if present.
4. **ExecProfile** is encoded as:
   * `0x00` if absent, or
   * `0x01 || ExecProfileBytes` if present.
5. **SID hash input** is the concatenation of the above fields, in the order listed, with no padding.

`ExecProfileBytes` is an opaque, deterministic byte sequence defined by the
execution environment. Any change in its encoding or content MUST change the SID.
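
The canonicalization steps above can be sketched as follows. Several things here are assumptions and not part of this specification: `H` is taken to be SHA-256, every reference is passed in already encoded as `ReferenceBytes`, each entry of `Inputs[]` contributes its `ReferenceBytes` directly, and the function names are hypothetical.

```python
import hashlib
from typing import Optional, Sequence


def encode_optional(value: Optional[bytes]) -> bytes:
    """Explicit presence marker: 0x00 if absent, 0x01 || bytes if present."""
    return b"\x00" if value is None else b"\x01" + value


def sid_hash_input(
    program_ref: bytes,
    inputs: Sequence[bytes],
    params_ref: Optional[bytes] = None,
    exec_profile: Optional[bytes] = None,
) -> bytes:
    """Concatenate the canonical fields in order, with no padding."""
    return b"".join(
        [
            program_ref,                    # step 1: ReferenceBytes
            *inputs,                        # step 2: declared Program DAG order
            encode_optional(params_ref),    # step 3
            encode_optional(exec_profile),  # step 4
        ]
    )


def compute_sid(
    program_ref: bytes,
    inputs: Sequence[bytes],
    params_ref: Optional[bytes] = None,
    exec_profile: Optional[bytes] = None,
) -> bytes:
    # Assumption: SHA-256 stands in for H; the real hash is defined elsewhere.
    return hashlib.sha256(
        sid_hash_input(program_ref, inputs, params_ref, exec_profile)
    ).digest()
```

With these definitions, `sid_hash_input(b"P", [b"A", b"B"])` is `b"PAB\x00\x00"`. An absent `ParamsRef` encodes as `0x00`, while a present but empty one would encode as `0x01`, so the two cases never hash identically.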
#### 2.2.2 Structural Index

Purpose: map structural identity to a derivation DAG node.

```
SID -> DAG node
```

Properties:

* Deterministic and rebuildable
* Does not imply materialization
* May be in-memory or persisted

### 2.3 Derivation Index

Purpose: map a materialized ArtifactKey to the set of known derivations that
produce it.

```
ArtifactKey -> [DerivationRecord]
```

Where:

```
DerivationRecord = { SID, ProgramRef, InputRefs[], ParamsRef, ExecProfile }
```

Properties:

* Authoritative for known derivations
* Recomputable from replay + execution, so storage is optional
* Enables dedup and semantic correlation across multiple derivations
* Multiple SIDs MAY map to the same ArtifactKey

#### 2.3.1 DerivationRecord Data Model (Normative)

Canonical fields:

```
DerivationRecord {
  sid: SID
  program_ref: Reference
  input_refs: Reference[]          // ordered as Program DAG inputs
  params_ref: Optional<Reference>
  exec_profile: Optional<ExecProfileBytes>
}
```

Rules:

* `input_refs` order MUST be preserved.
* `params_ref` and `exec_profile` MUST use explicit presence markers as defined
  in SID canonicalization.
* Additional metadata MAY be stored but MUST NOT affect SID or canonical
  equivalence.
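
As an illustration of the data model and of the `ArtifactKey -> [DerivationRecord]` mapping, the sketch below models the record as an immutable Python dataclass and the index as a dictionary of lists. The class and method names are hypothetical, references are modeled as raw bytes, artifact keys as strings, and de-duplicating records by `sid` is an assumption of this sketch, not a rule stated here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DerivationRecord:
    sid: bytes
    program_ref: bytes
    input_refs: Tuple[bytes, ...]         # tuple keeps the declared input order
    params_ref: Optional[bytes] = None    # None models "absent"
    exec_profile: Optional[bytes] = None  # None models "absent"


class DerivationIndex:
    """Maps an ArtifactKey to every known derivation that produces it."""

    def __init__(self) -> None:
        self._records: Dict[str, List[DerivationRecord]] = {}

    def add(self, artifact_key: str, record: DerivationRecord) -> None:
        records = self._records.setdefault(artifact_key, [])
        # Assumption: a record with an already-known SID is not stored twice.
        if all(existing.sid != record.sid for existing in records):
            records.append(record)

    def derivations(self, artifact_key: str) -> List[DerivationRecord]:
        return list(self._records.get(artifact_key, []))


index = DerivationIndex()
rec_a = DerivationRecord(sid=b"sid-a", program_ref=b"prog-1", input_refs=(b"in-1", b"in-2"))
rec_b = DerivationRecord(sid=b"sid-b", program_ref=b"prog-2", input_refs=(b"in-3",), params_ref=b"params-1")
index.add("artifact-x", rec_a)
index.add("artifact-x", rec_b)  # a second SID producing the same ArtifactKey
index.add("artifact-x", rec_a)  # duplicate SID, ignored in this sketch
```

Here `"artifact-x"` ends up with two records, which mirrors the property that multiple SIDs may map to the same ArtifactKey.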

### 2.4 Materialization Cache

Purpose: record previously materialized content for a structural identity.

```
SID -> ArtifactKey
```

Properties:

* Redundant and safe to drop
* Recomputable from DAG + Content Index
* Pure performance optimization
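
A minimal sketch of why the cache is safe to drop: a lookup that misses (or whose cached key is no longer visible) simply falls back to re-running the derivation. The function `materialize`, the `run_derivation` callback, and the use of plain dictionaries for the cache and the Content Index are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def materialize(
    sid: str,
    cache: Dict[str, str],          # SID -> ArtifactKey
    content_index: Dict[str, str],  # ArtifactKey -> ArtifactLocation
    execute: Callable[[str], str],  # re-runs the derivation, returns an ArtifactKey
) -> str:
    key = cache.get(sid)
    # Trust a cached key only while the Content Index still resolves it,
    # since the Content Index is the index that governs visibility.
    if key is not None and key in content_index:
        return key
    key = execute(sid)  # fallback: recompute from the DAG
    cache[sid] = key
    return key


calls: List[str] = []
content_index: Dict[str, str] = {}
cache: Dict[str, str] = {}


def run_derivation(sid: str) -> str:
    """Hypothetical executor: records the call and registers the result."""
    calls.append(sid)
    key = "artifact-for-" + sid
    content_index[key] = "segment-0:0"
    return key


first = materialize("sid-1", cache, content_index, run_derivation)
second = materialize("sid-1", cache, content_index, run_derivation)  # cache hit
cache.clear()  # dropping the cache loses no information
third = materialize("sid-1", cache, content_index, run_derivation)   # recomputed
```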

---

## 3. Dependency Rules (Normative)

Dependencies MUST follow this direction:

```
Structural Index -> Materialization Cache -> Content Index
Derivation Index -> Content Index
```

Rules:

* The Content Index MUST NOT depend on the Structural Index.
* The Structural Index MUST NOT depend on stored bytes.
* The Materialization Cache MAY depend on both the Structural Index and the Content Index.
* The Derivation Index MAY depend on the Content Index.
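
The four rules above could be checked mechanically, for example in a build-time lint over component dependencies. The sketch below encodes only those four bullets as (dependent, dependency) pairs and reports anything else as unspecified; the constant names, the `classify` and `violations` helpers, and the treatment of "stored bytes" as a pseudo-component are assumptions of this example.

```python
from typing import Iterable, List, Tuple

CONTENT = "Content Index"
STRUCTURAL = "Structural Index"
DERIVATION = "Derivation Index"
CACHE = "Materialization Cache"
STORED_BYTES = "stored bytes"

# (dependent, dependency) pairs taken from the MUST NOT rules.
FORBIDDEN = {(CONTENT, STRUCTURAL), (STRUCTURAL, STORED_BYTES)}
# (dependent, dependency) pairs taken from the MAY rules.
ALLOWED = {(CACHE, STRUCTURAL), (CACHE, CONTENT), (DERIVATION, CONTENT)}


def classify(dependent: str, dependency: str) -> str:
    """Return 'forbidden', 'allowed', or 'unspecified' for one dependency."""
    pair = (dependent, dependency)
    if pair in FORBIDDEN:
        return "forbidden"
    if pair in ALLOWED:
        return "allowed"
    return "unspecified"


def violations(edges: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return every edge that breaks a MUST NOT rule."""
    return [edge for edge in edges if classify(*edge) == "forbidden"]


proposed = [(CACHE, CONTENT), (DERIVATION, CONTENT), (CONTENT, STRUCTURAL)]
bad = violations(proposed)
```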

---

## 4. PUT/GET Interaction (Informative)

* PUT registers structure (if used), resolves to an ArtifactKey, and updates the Content Index.
* GET consults only the Content Index and reads bytes from the store.
* The Structural Index, Derivation Index, and Materialization Cache are optional
  optimizations for PUT.

Note: versioning relationships are modeled in TGK, not in these indexes.
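
The PUT/GET split described in this section can be sketched with a toy in-memory store. Everything concrete here is an assumption made for illustration: the `Store` class and its attributes are hypothetical, the ArtifactKey is modeled as the SHA-256 hex digest of the bytes, and locations are synthetic strings. The point is only that `get` touches the Content Index and the stored bytes, and nothing else.

```python
import hashlib
from typing import Dict, Optional


class Store:
    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}             # ArtifactLocation -> bytes
        self.content_index: Dict[str, str] = {}       # ArtifactKey -> ArtifactLocation
        self.structural_index: Dict[str, str] = {}    # SID -> DAG node (optional)

    def put(self, data: bytes, sid: Optional[str] = None, dag_node: Optional[str] = None) -> str:
        if sid is not None and dag_node is not None:
            self.structural_index[sid] = dag_node     # register structure, if used
        # Assumption: the ArtifactKey is a content hash of the bytes.
        key = hashlib.sha256(data).hexdigest()
        if key not in self.content_index:
            location = f"blob-{len(self.blobs)}"
            self.blobs[location] = data
            self.content_index[key] = location        # update the Content Index
        return key

    def get(self, key: str) -> Optional[bytes]:
        # GET consults only the Content Index, then reads bytes from the store.
        location = self.content_index.get(key)
        return None if location is None else self.blobs[location]


store = Store()
key = store.put(b"hello", sid="sid-1", dag_node="node-1")
same_key = store.put(b"hello")  # identical bytes resolve to the same key
```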

---

## 5. Non-Goals

ASL/INDEXES/1 does not define:

* Encodings for any index
* Storage layout or sharding
* Query operators or traversal semantics

---

## Changelog

- 2026-01-18: Added SID canonicalization rules and DerivationRecord data model.