ASL/INDEXES/1 -- Index Taxonomy and Relationships
Status: Draft
Owner: Architecture
Version: 0.1.0
SoT: No
Last Updated: 2025-01-17
Linked Phase Pack: N/A
Tags: [indexes, content, structural, materialization]
Document ID: ASL/INDEXES/1
Layer: L2 -- Index taxonomy (no encoding)
Depends on (normative):
- ASL/1-CORE-INDEX
- ASL/STORE-INDEX/1
Informative references:
- ASL/SYSTEM/1
- TGK/1
- ENC/ASL-CORE-INDEX/1
© 2025 Niklas Rydberg.
License
Except where otherwise noted, this document (text and diagrams) is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId assignments, EdgeTypeId tables) are additionally made available under CC0 1.0 Universal (CC0) to enable unrestricted reuse in implementations and derivative specifications.
Code examples in this document are provided under the Apache License 2.0 unless explicitly stated otherwise. Test vectors, where present, are dedicated to the public domain under CC0 1.0.
0. Conventions
The key words MUST, MUST NOT, REQUIRED, SHOULD, and MAY are to be interpreted as in RFC 2119.
ASL/INDEXES/1 defines index roles and relationships. It does not define encodings or storage layouts.
1. Purpose
This document defines the minimal set of indexes used by ASL systems and their dependency relationships.
2. Index Taxonomy (Normative)
ASL systems use three distinct indexes (Content, Structural, and Derivation) plus a Materialization Cache:
2.1 Content Index
Purpose: map semantic identity to bytes.
ArtifactKey -> ArtifactLocation
Properties:
- Snapshot-relative and append-only
- Deterministic replay
- Optional tombstone shadowing
This is the ASL/1-CORE-INDEX and is the only index that governs visibility.
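The properties above can be illustrated with a minimal sketch. It assumes an in-memory append-only log in which a snapshot is simply a log length, and in which the latest visible entry for a key wins. The names ContentIndex, put, tombstone, and lookup, and the use of a string for ArtifactLocation, are hypothetical and are not defined by ASL/1-CORE-INDEX.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical marker for a tombstone entry (not defined by the spec).
TOMBSTONE = None


@dataclass
class ContentIndex:
    # Append-only log of (ArtifactKey, ArtifactLocation or TOMBSTONE).
    log: List[Tuple[bytes, Optional[str]]] = field(default_factory=list)

    def put(self, key: bytes, location: str) -> int:
        self.log.append((key, location))
        return len(self.log)  # snapshot id = log length after the append

    def tombstone(self, key: bytes) -> int:
        self.log.append((key, TOMBSTONE))
        return len(self.log)

    def lookup(self, key: bytes, snapshot: int) -> Optional[str]:
        # Deterministic replay: only the log prefix visible at `snapshot`
        # is consulted, and the most recent entry for the key wins.
        for k, location in reversed(self.log[:snapshot]):
            if k == key:
                return location  # None when shadowed by a tombstone
        return None
```

Because entries are never rewritten, a lookup at an older snapshot keeps returning the same answer even after a later tombstone is appended.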
2.2 Structural Identity (SID)
SID is the canonical identity of a derivation, not of bytes.
SID = H(ProgramRef || Inputs[] || ParamsRef || ExecProfile)
Notes:
- Inputs[] order is canonical and stable.
- ParamsRef is optional; absence must be encoded explicitly in the hash.
- ExecProfile captures execution profile/versioning parameters (optional, but presence/absence is part of the SID).
2.2.1 SID Canonicalization (Normative)
Implementations MUST canonicalize SID inputs as follows:
- ProgramRef is encoded as ReferenceBytes (ENC/ASL1-CORE).
- Inputs[] are ordered exactly as declared by the Program DAG inputs.
- ParamsRef is encoded as: 0x00 if absent, or 0x01 || ReferenceBytes if present.
- ExecProfile is encoded as: 0x00 if absent, or 0x01 || ExecProfileBytes if present.
- SID hash input is the concatenation of the above fields with no padding.
ExecProfileBytes is an opaque, deterministic byte sequence defined by the
execution environment. Any change in encoding or content MUST change SID.
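The canonicalization rules above can be sketched as follows. This is an illustration under stated assumptions, not a normative implementation: H is taken to be SHA-256 (this document does not fix the hash function), every reference is assumed to arrive already encoded as ReferenceBytes, and each input is assumed to contribute its ReferenceBytes directly. The function name compute_sid is hypothetical.

```python
import hashlib
from typing import Optional, Sequence


def compute_sid(
    program_ref: bytes,
    inputs: Sequence[bytes],
    params_ref: Optional[bytes] = None,
    exec_profile: Optional[bytes] = None,
) -> bytes:
    """Hash the canonical SID input. SHA-256 is an assumption."""
    parts = [program_ref]
    # Inputs keep the order declared by the Program DAG; never sort them.
    parts.extend(inputs)
    # Explicit presence markers: 0x00 if absent, 0x01 || bytes if present.
    parts.append(b"\x00" if params_ref is None else b"\x01" + params_ref)
    parts.append(b"\x00" if exec_profile is None else b"\x01" + exec_profile)
    # Concatenation with no padding between fields.
    return hashlib.sha256(b"".join(parts)).digest()
```

With this sketch, swapping two inputs or changing an absent ParamsRef to a present but empty one yields a different SID, because input order and the presence markers are both part of the hashed bytes.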
2.2 Structural Index
Purpose: map structural identity to a derivation DAG node.
SID -> DAG node
Properties:
- Deterministic and rebuildable
- Does not imply materialization
- May be in-memory or persisted
2.3 Derivation Index
Purpose: map a materialized ArtifactKey to the set of known derivations that produce it.
ArtifactKey -> [DerivationRecord]
Where:
DerivationRecord = { SID, ProgramRef, InputRefs[], ParamsRef, ExecProfile }
Properties:
- Authoritative for known derivations
- Recomputable from replay + execution, so storage is optional
- Enables dedup and semantic correlation across multiple derivations
- Multiple SIDs MAY map to the same ArtifactKey
2.3.1 DerivationRecord Data Model (Normative)
Canonical fields:
DerivationRecord {
  sid: SID
  program_ref: Reference
  input_refs: Reference[]   // ordered as Program DAG inputs
  params_ref: Optional<Reference>
  exec_profile: Optional<ExecProfileBytes>
}
Rules:
- input_refs order MUST be preserved.
- params_ref and exec_profile MUST use explicit presence markers as defined in SID canonicalization.
- Additional metadata MAY be stored but MUST NOT affect SID or canonical equivalence.
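One possible in-memory shape for the data model and rules above is sketched below. It assumes references and SIDs are plain byte strings and models an absent optional field as None. The class DerivationIndex and its methods are hypothetical names introduced only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DerivationRecord:
    sid: bytes
    program_ref: bytes
    input_refs: Tuple[bytes, ...]        # ordered as Program DAG inputs
    params_ref: Optional[bytes] = None   # None models explicit absence
    exec_profile: Optional[bytes] = None


class DerivationIndex:
    """ArtifactKey -> [DerivationRecord], kept in insertion order."""

    def __init__(self) -> None:
        self._by_key: Dict[bytes, List[DerivationRecord]] = defaultdict(list)

    def record(self, artifact_key: bytes, rec: DerivationRecord) -> None:
        # Equality covers only the canonical fields, so re-recording the
        # same derivation is a no-op.
        if rec not in self._by_key[artifact_key]:
            self._by_key[artifact_key].append(rec)

    def derivations(self, artifact_key: bytes) -> List[DerivationRecord]:
        return list(self._by_key.get(artifact_key, []))
```

Using a tuple for input_refs preserves order and keeps the record hashable. Any extra metadata would live outside the dataclass so that it cannot influence equality.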
2.4 Materialization Cache
Purpose: record previously materialized content for a structural identity.
SID -> ArtifactKey
Properties:
- Redundant and safe to drop
- Recomputable from DAG + content index
- Pure performance optimization
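A minimal sketch of a droppable cache follows. It assumes the caller supplies an is_visible callback that stands in for a Content Index query, since only the Content Index governs visibility. MaterializationCache, remember, lookup, and clear are hypothetical names.

```python
from typing import Callable, Dict, Optional


class MaterializationCache:
    """SID -> ArtifactKey. Holds no authoritative state."""

    def __init__(self) -> None:
        self._entries: Dict[bytes, bytes] = {}

    def remember(self, sid: bytes, artifact_key: bytes) -> None:
        self._entries[sid] = artifact_key

    def lookup(
        self, sid: bytes, is_visible: Callable[[bytes], bool]
    ) -> Optional[bytes]:
        key = self._entries.get(sid)
        if key is not None and not is_visible(key):
            # The Content Index no longer exposes this key: drop the entry.
            del self._entries[sid]
            return None
        return key

    def clear(self) -> None:
        # Safe at any time; entries can be recomputed on demand.
        self._entries.clear()
```

A miss only costs a re-materialization; correctness never depends on an entry being present.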
3. Dependency Rules (Normative)
Dependencies MUST follow this direction:
Structural Index -> Materialization Cache -> Content Index
Derivation Index -> Content Index
Rules:
- The Content Index MUST NOT depend on the Structural Index.
- The Structural Index MUST NOT depend on stored bytes.
- The Materialization Cache MAY depend on both.
- The Derivation Index MAY depend on the Content Index.
4. PUT/GET Interaction (Informative)
- PUT registers structure (if used), resolves to an ArtifactKey, and updates the Content Index.
- GET consults only the Content Index and reads bytes from the store.
- The Structural Index, Derivation Index, and Materialization Cache are optional optimizations for PUT.
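The interaction above can be sketched with a toy in-memory store. Several details are assumptions made only for this illustration: the ArtifactKey is derived as a SHA-256 hash of the bytes, an ArtifactLocation is an integer slot, and the class and method names are hypothetical.

```python
import hashlib
from typing import Dict, Optional


class ToyStore:
    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}         # ArtifactLocation -> bytes
        self.content_index: Dict[bytes, int] = {}  # ArtifactKey -> ArtifactLocation
        self.mat_cache: Dict[bytes, bytes] = {}    # SID -> ArtifactKey (optional)

    def put(self, data: bytes, sid: Optional[bytes] = None) -> bytes:
        # Assumption: the ArtifactKey is a content hash of the bytes.
        key = hashlib.sha256(data).digest()
        if key not in self.content_index:
            location = len(self.blocks)
            self.blocks[location] = data
            self.content_index[key] = location
        if sid is not None:
            # Optional structural bookkeeping; used only on the PUT path.
            self.mat_cache[sid] = key
        return key

    def get(self, key: bytes) -> Optional[bytes]:
        # GET consults only the Content Index, then reads from the store.
        location = self.content_index.get(key)
        return None if location is None else self.blocks[location]
```

Clearing mat_cache in this sketch does not change what get returns, which mirrors the rule that the optional indexes never affect reads.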
Note: versioning relationships are modeled in TGK, not in these indexes.
5. Non-Goals
ASL/INDEXES/1 does not define:
- Encodings for any index
- Storage layout or sharding
- Query operators or traversal semantics
Changelog
- 2026-01-18: Added SID canonicalization rules and DerivationRecord data model.