# ASL/INDEX-ACCEL/1 — Index Acceleration Semantics

Status: Draft
Owner: Niklas Rydberg
Version: 0.1.0
SoT: No
Last Updated: 2025-11-16
Linked Phase Pack: N/A
Tags: [deterministic, index, acceleration]

<!-- Source: /amduat-api/tier1/asl-index-accel-1.md | Canonical: /amduat/tier1/asl-index-accel-1.md -->

**Document ID:** `ASL/INDEX-ACCEL/1`
**Layer:** L1 — Acceleration rules over index semantics (no storage / encoding)

**Depends on (normative):**

* `ASL/1-CORE-INDEX`
* `ASL/LOG/1`

**Informative references:**

* `ASL/STORE-INDEX/1` — store lifecycle and replay contracts
* `ENC/ASL-CORE-INDEX/1` — bytes-on-disk encoding profile
* `TGK/1` — TGK semantics and visibility alignment
* `TGK/1-CORE` — EdgeBody and EdgeTypeId definitions

© 2025 Niklas Rydberg.

## License

Except where otherwise noted, this document (text and diagrams) is licensed under
the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
Universal (CC0) to enable unrestricted reuse in implementations and derivative
specifications.

Code examples in this document are provided under the Apache License 2.0 unless
explicitly stated otherwise. Test vectors, where present, are dedicated to the
public domain under CC0 1.0.

---

## 0. Conventions

The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as described in RFC 2119.

ASL/INDEX-ACCEL/1 defines **acceleration semantics only**. It MUST NOT change the index meaning defined by ASL/1-CORE-INDEX.

---

## 1. Purpose

ASL/INDEX-ACCEL/1 defines **acceleration mechanisms** used by ASL-based indexes, including:

* Routing keys
* Sharding
* Filters (Bloom, XOR, Ribbon, etc.)
* SIMD execution
* Hash recasting

All mechanisms defined herein are **observationally invisible** to ASL/1-CORE-INDEX semantics.

---

## 2. Scope

Applies to:

* Artifact indexes (ASL)
* Projection and graph indexes (e.g., TGK)
* Any index layered on ASL/1-CORE-INDEX semantics

Does **not** define:

* Artifact or edge identity
* Snapshot semantics
* Storage lifecycle
* Encoding details

---

## 3. Canonical Key vs Routing Key

### 3.1 Canonical Key

The **Canonical Key** uniquely identifies an indexable entity.

Examples:

* Artifact: `Reference`
* TGK Edge: canonical key defined by `TGK/1` and `TGK/1-CORE` (opaque here)

Properties:

* Defines semantic identity
* Used for equality, shadowing, and tombstones
* Stable and immutable
* Fully compared on index match

### 3.2 Routing Key

The **Routing Key** is a **derived, advisory key** used exclusively for acceleration.

Properties:

* Derived deterministically from the Canonical Key and optional attributes
* MAY be used for sharding, filters, and SIMD layouts
* MUST NOT affect index semantics
* A Routing Key match MUST be verified by a full Canonical Key comparison

Formal rule:

```
CanonicalKey determines correctness
RoutingKey determines performance
```
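
The split between the two keys can be illustrated with a minimal sketch. Everything named here (`routing_key`, `Index`, the 8-bit hash width) is a hypothetical choice made for illustration and is not defined by this specification. The routing key is kept deliberately tiny so that collisions are frequent, which shows that the result still depends only on the full Canonical Key comparison.

```python
import hashlib


def routing_key(canonical_key: bytes) -> int:
    """Hypothetical routing key: the first byte of SHA-256, so collisions are common."""
    return hashlib.sha256(canonical_key).digest()[0]


class Index:
    """Toy in-memory index bucketed by routing key (illustrative only)."""

    def __init__(self):
        self.buckets = {}  # routing key -> list of (canonical key, value)

    def put(self, canonical_key: bytes, value: str) -> None:
        bucket = self.buckets.setdefault(routing_key(canonical_key), [])
        bucket.append((canonical_key, value))

    def get(self, canonical_key: bytes):
        # The routing key only narrows the search (performance).
        candidates = self.buckets.get(routing_key(canonical_key), [])
        for stored_key, value in candidates:
            # The full Canonical Key comparison decides the match (correctness).
            if stored_key == canonical_key:
                return value
        return None
```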

---

## 4. Filter Semantics

### 4.1 Advisory Nature

All filters are **advisory only**.

Rules:

* False positives are permitted
* False negatives are forbidden
* Filter behavior MUST NOT affect correctness

Invariant:

```
Filter miss => key is definitely absent
Filter hit  => key may be present
```
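
As one possible illustration of the invariant, the sketch below uses a small Bloom filter. The class name, the sizes, and the use of SHA-256 are assumptions made for the example; this specification does not prescribe a filter algorithm. For brevity the key bytes stand in for the Routing Key.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, routing_key: bytes):
        # One bit position per hash function, derived by prefixing a counter byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + routing_key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, routing_key: bytes) -> None:
        for pos in self._positions(routing_key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, routing_key: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(routing_key)
        )


def lookup(segment: dict, bloom: BloomFilter, key: bytes):
    if not bloom.may_contain(key):
        return None          # filter miss: key is definitely absent
    return segment.get(key)  # filter hit: key may be present, so check the segment
```

Because a hit is always followed by the real segment lookup, a false positive costs time but never changes the result.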

### 4.2 Filter Inputs

Filters operate over **Routing Keys**, not Canonical Keys.

A Routing Key MAY incorporate:

* Hash of Canonical Key
* Artifact type tag (if present)
* TGK `EdgeTypeId` (TGK/1-CORE)
* Direction, role, or other immutable classification attributes

Absence of an optional attribute MUST be encoded explicitly, so that an absent attribute is distinguishable from every present value (compare the `has_typetag` component in Section 8.1).
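
A minimal sketch of a Routing Key with an explicit presence flag follows. The byte layout, the 8-byte hash truncation, and the 32-bit width of `type_tag` are assumptions for illustration only; the actual encoding is out of scope for this document.

```python
import hashlib
import struct
from typing import Optional


def routing_key(canonical_key: bytes, type_tag: Optional[int] = None) -> bytes:
    """Hypothetical layout: 8-byte hash | 1-byte presence flag | 4-byte type tag."""
    key_hash = hashlib.sha256(canonical_key).digest()[:8]
    has_type_tag = 0 if type_tag is None else 1
    # The flag is written even when the tag is absent; the tag field is then zeroed.
    return key_hash + struct.pack(">BI", has_type_tag, type_tag or 0)
```

Encoding the flag separately keeps a missing tag distinct from a tag whose value happens to be 0.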

### 4.3 Filter Construction

* Filters are built only over **sealed, immutable segments**
* Filters are immutable once built
* Filter construction MUST be deterministic
* Filter state MUST be covered by segment checksums
* Filters SHOULD be snapshot-scoped or versioned with their segment to avoid
  unbounded false-positive accumulation over time
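
The construction rules above can be sketched as follows. The segment representation, the 256-bit toy filter, and the function names are hypothetical; real segment and checksum formats are defined elsewhere. The point shown is that the filter bytes are a pure function of the sealed entries and are included in the checksummed data.

```python
import hashlib


def build_filter(routing_keys) -> bytes:
    """Toy 256-bit filter. Keys are processed in sorted order so the output
    cannot depend on iteration order, even for order-sensitive filter types."""
    bits = bytearray(32)
    for rk in sorted(set(routing_keys)):
        pos = hashlib.sha256(rk).digest()[0]
        bits[pos // 8] |= 1 << (pos % 8)
    return bytes(bits)


def seal_segment(entries: dict) -> dict:
    """Seal a segment: payload and filter are both covered by one checksum."""
    payload = b"".join(k + v for k, v in sorted(entries.items()))
    filter_bytes = build_filter(entries.keys())
    checksum = hashlib.sha256(payload + filter_bytes).hexdigest()
    return {"payload": payload, "filter": filter_bytes, "checksum": checksum}


def verify_segment(segment: dict) -> bool:
    data = segment["payload"] + segment["filter"]
    return hashlib.sha256(data).hexdigest() == segment["checksum"]
```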

---

## 5. Sharding Semantics

### 5.1 Observational Invisibility

Sharding is a **mechanical partitioning** of the index.

Invariant:

```
LogicalIndex = union(all shards)
```

Rules:

* Shards MUST NOT affect lookup results
* Shard count and boundaries MAY change over time
* Rebalancing MUST preserve lookup semantics

### 5.2 Shard Assignment

Shard assignment MAY be based on:

* Hash of Canonical Key
* Routing Key
* Composite routing strategies

Shard selection MUST be deterministic per snapshot.
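
A minimal sketch of hash-based shard assignment, assuming a shard count that is fixed for the lifetime of a snapshot. The function names and the modulo scheme are illustrative assumptions, not a required strategy. Lookups return the same results for any shard count, and the union of all shards equals the logical index.

```python
import hashlib


def shard_of(canonical_key: bytes, shard_count: int) -> int:
    """Deterministic shard choice: same key and shard count give the same shard."""
    digest = hashlib.sha256(canonical_key).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def build_shards(entries: dict, shard_count: int) -> list:
    shards = [{} for _ in range(shard_count)]
    for key, value in entries.items():
        shards[shard_of(key, shard_count)][key] = value
    return shards


def lookup(shards: list, key: bytes):
    # Only the selected shard is consulted; the result matches an unsharded lookup.
    return shards[shard_of(key, len(shards))].get(key)
```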

---

## 6. Hashing and Hash Recasting

### 6.1 Hashing

Hashes MAY be used for routing, filtering, or SIMD layout.

Hashes MUST NOT be treated as identity.

### 6.2 Hash Recasting

Hash recasting (changing hash functions or seeds) is permitted if all of the following hold:

1. It is deterministic
2. It does not change Canonical Keys
3. It does not affect index semantics

Recasting is equivalent to rebuilding acceleration structures.
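
The equivalence between recasting and rebuilding can be sketched as below. `RoutedIndex`, the bucket count, and the use of keyed BLAKE2b as the seeded hash are assumptions chosen for the example. Changing the seed rebuilds the bucket layout from the unchanged Canonical Keys, and every lookup result stays the same.

```python
import hashlib


class RoutedIndex:
    """Toy index whose bucket layout depends on a seeded hash (illustrative only)."""

    def __init__(self, entries: dict, seed: bytes, bucket_count: int = 16):
        self.entries = dict(entries)  # Canonical Keys and values never change
        self.bucket_count = bucket_count
        self.seed = seed
        self._rebuild()

    def _bucket(self, key: bytes) -> int:
        digest = hashlib.blake2b(key, key=self.seed, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.bucket_count

    def _rebuild(self) -> None:
        self.buckets = [[] for _ in range(self.bucket_count)]
        for key in self.entries:
            self.buckets[self._bucket(key)].append(key)

    def recast(self, new_seed: bytes) -> None:
        # Recasting only swaps the seed and rebuilds the acceleration structure.
        self.seed = new_seed
        self._rebuild()

    def get(self, key: bytes):
        # The bucket narrows the search; list membership compares full keys.
        if key in self.buckets[self._bucket(key)]:
            return self.entries[key]
        return None
```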

---

## 7. SIMD Execution

SIMD operations MAY be used to:

* Evaluate filters
* Compare routing keys
* Accelerate scans

Rules:

* SIMD MUST operate only on immutable data
* SIMD MUST NOT short-circuit semantic checks
* SIMD MUST preserve deterministic behavior

---

## 8. Multi-Dimensional Routing Examples (Normative)

### 8.1 Artifact Index

* Canonical Key: `Reference`
* Routing Key components:

  * `H(Reference)`
  * `type_tag` (if present)
  * `has_typetag`

### 8.2 TGK Edge Index

* Canonical Key: defined by `TGK/1` and `TGK/1-CORE` (opaque here)
* Routing Key components:

  * `H(CanonicalEdgeKey)`
  * `EdgeTypeId` (if present in the TGK profile)
  * Direction or role (optional)

---

## 9. Snapshot Interaction

Acceleration structures:

* MUST respect snapshot visibility rules
* MUST operate over the same sealed segments visible to the snapshot
* MUST NOT bypass tombstones or shadowing

Snapshot cuts apply **after** routing and filtering.
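
The ordering of filtering and snapshot cuts can be sketched as follows. The visibility rule used here (a record is visible when its hypothetical `log_seq` is at or below the snapshot position) is an assumption for illustration; the actual visibility rules are defined by ASL/1-CORE-INDEX. A plain Python set stands in for each segment's filter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    key: bytes
    value: Optional[bytes]  # None marks a tombstone
    log_seq: int            # hypothetical log position of the record


def lookup(segments, key: bytes, snapshot_seq: int):
    """segments: list of (key_filter, records), newest segment and record first."""
    for key_filter, records in segments:
        if key not in key_filter:
            continue  # acceleration: a filter miss skips the whole segment
        for rec in records:
            if rec.key != key:
                continue  # full key comparison, never skipped
            if rec.log_seq > snapshot_seq:
                continue  # snapshot cut, applied after filtering
            # Newest visible record wins (shadowing); a tombstone yields None.
            return rec.value
    return None
```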

---

## 10. Normative Invariants

1. Canonical Keys define identity and correctness
2. Routing Keys are advisory only
3. Filters may never introduce false negatives
4. Sharding is observationally invisible
5. Hashes are not identity
6. SIMD is an execution strategy, not a semantic construct
7. All acceleration is deterministic per snapshot

---

## 11. Non-Goals

ASL/INDEX-ACCEL/1 does not define:

* Specific filter algorithms
* Memory layout
* CPU instruction selection
* Encoding formats
* Federation policies

---

## 12. Summary

ASL/INDEX-ACCEL/1 establishes a strict contract:

> All acceleration exists to make the index faster, never different.

It formalizes the distinction between Canonical Keys and Routing Keys, and constrains filters, sharding, hashing, and SIMD so that correctness is preserved under all optimizations.