# ASL/INDEX-ACCEL/1 — Index Acceleration Semantics

Status: Draft
Owner: Niklas Rydberg
Version: 0.1.0
SoT: No
Last Updated: 2025-11-16
Tags: [deterministic, index, acceleration]

**Document ID:** `ASL/INDEX-ACCEL/1`
**Layer:** L1 — Acceleration rules over index semantics (no storage / encoding)

**Depends on (normative):**

* `ASL/1-CORE-INDEX`

**Informative references:**

* `ASL-STORE-INDEX` — store lifecycle and replay contracts
* `ENC-ASL-CORE-INDEX` — bytes-on-disk encoding profile (`tier1/enc-asl-core-index.md`)

---

## 0. Conventions

The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHOULD**, and **MAY** are to be interpreted as in RFC 2119.

ASL/INDEX-ACCEL/1 defines **acceleration semantics only**. It MUST NOT change index meaning defined by ASL/1-CORE-INDEX.

---

## 1. Purpose

ASL/INDEX-ACCEL/1 defines **acceleration mechanisms** used by ASL-based indexes, including:

* Routing keys
* Sharding
* Filters (Bloom, XOR, Ribbon, etc.)
* SIMD execution
* Hash recasting

All mechanisms defined herein are **observationally invisible** to ASL/1-CORE-INDEX semantics.

---

## 2. Scope

Applies to:

* Artifact indexes (ASL)
* Projection and graph indexes (e.g., TGK)
* Any index layered on ASL/1-CORE-INDEX semantics

Does **not** define:

* Artifact or edge identity
* Snapshot semantics
* Storage lifecycle
* Encoding details

---

## 3. Canonical Key vs Routing Key

### 3.1 Canonical Key

The **Canonical Key** uniquely identifies an indexable entity.

Examples:

* Artifact: `Reference`
* TGK Edge: `CanonicalEdgeKey`

Properties:

* Defines semantic identity
* Used for equality, shadowing, and tombstones
* Stable and immutable
* Fully compared on index match

### 3.2 Routing Key

The **Routing Key** is a **derived, advisory key** used exclusively for acceleration.

Properties:

* Derived deterministically from Canonical Key and optional attributes
* MAY be used for sharding, filters, SIMD layouts
* MUST NOT affect index semantics
* MUST be verified by full Canonical Key comparison on match

Formal rule:

```
CanonicalKey determines correctness
RoutingKey determines performance
```
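
The split between the two keys can be illustrated with a minimal sketch. It assumes a hypothetical `routing_key` function (a truncated SHA-256 reduced to a bucket id) and a hypothetical `RoutedIndex` class; neither name comes from ASL/1-CORE-INDEX. The routing key only narrows the search, and a full Canonical Key comparison decides every match.

```python
import hashlib
from typing import Optional


def routing_key(canonical_key: bytes, buckets: int) -> int:
    """Hypothetical routing function: deterministic and deliberately lossy."""
    digest = hashlib.sha256(canonical_key).digest()
    return int.from_bytes(digest[:8], "big") % buckets


class RoutedIndex:
    """Toy index: bucket id -> list of (canonical_key, value) pairs."""

    def __init__(self, buckets: int) -> None:
        self.buckets = buckets
        self.table = {}

    def put(self, canonical_key: bytes, value: str) -> None:
        rk = routing_key(canonical_key, self.buckets)
        self.table.setdefault(rk, []).append((canonical_key, value))

    def get(self, canonical_key: bytes) -> Optional[str]:
        rk = routing_key(canonical_key, self.buckets)
        for stored_key, value in self.table.get(rk, []):
            # The routing key only selected the bucket.
            # The full Canonical Key comparison decides the match.
            if stored_key == canonical_key:
                return value
        return None
```

With `buckets=1` every key collides on the routing key, yet lookups return the same results as with `buckets=64`. Only the amount of scanning changes.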

---

## 4. Filter Semantics

### 4.1 Advisory Nature

All filters are **advisory only**.

Rules:

* False positives are permitted
* False negatives are forbidden
* Filter behavior MUST NOT affect correctness

Invariant:

```
Filter miss => key is definitely absent
Filter hit  => key may be present
```
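
As an illustration of the invariant, here is a minimal Bloom filter sketch. The class, its parameters, and the `lookup` helper are assumptions made for this example and are not defined by this specification. For simplicity the routing key is taken to be the key bytes themselves.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; sizes and hash choice are illustrative only."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, routing_key: bytes):
        # One bit position per hash function, derived by prefixing an index byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + routing_key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, routing_key: bytes) -> None:
        for pos in self._positions(routing_key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, routing_key: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(routing_key)
        )


def lookup(segment: dict, flt: BloomFilter, key: bytes):
    if not flt.may_contain(key):
        return None  # filter miss: key is definitely absent
    return segment.get(key)  # filter hit: confirm against the real data
```

Every added key sets all of its bits, so a miss can only occur for a key that was never added. A small, saturated filter produces more false positives but still returns the same lookup results as no filter at all.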

### 4.2 Filter Inputs

Filters operate over **Routing Keys**, not Canonical Keys.

A Routing Key MAY incorporate:

* Hash of Canonical Key
* Artifact type tag (if present)
* TGK edge type key
* Direction, role, or other immutable classification attributes

Absence of optional attributes MUST be encoded explicitly.
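
One way to encode absence explicitly is a presence flag alongside the optional value. The sketch below is an assumption for illustration: the function name, the 8-byte hash prefix, and the byte layout are hypothetical, since this document does not define encodings.

```python
import hashlib
import struct
from typing import Optional


def artifact_routing_key(reference: bytes, type_tag: Optional[int] = None) -> bytes:
    """Hypothetical layout: 8-byte hash | 1-byte presence flag | 4-byte tag."""
    key_hash = hashlib.sha256(reference).digest()[:8]
    has_type_tag = 0 if type_tag is None else 1
    tag_value = 0 if type_tag is None else type_tag
    # The flag distinguishes "no tag" from "tag with value 0".
    return key_hash + struct.pack(">BI", has_type_tag, tag_value)
```

Without the flag, an artifact with no type tag and an artifact tagged `0` would produce the same routing key, and a filter could no longer tell the two queries apart.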

### 4.3 Filter Construction

* Filters are built only over **sealed, immutable segments**
* Filters are immutable once built
* Filter construction MUST be deterministic
* Filter state MUST be covered by segment checksums
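
The construction rules above can be sketched as follows. Everything here is hypothetical: the `SealedSegment` type, the single-hash filter, the fixed seed, and the use of CRC32 are stand-ins chosen for brevity, not formats defined by this document or by `ENC-ASL-CORE-INDEX`.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SealedSegment:
    records: tuple      # immutable, sorted (key, value) byte pairs
    filter_bits: bytes  # immutable filter state
    checksum: int       # CRC32 over the records and the filter state


def build_filter(keys, num_bits: int = 256, seed: bytes = b"seed-0") -> bytes:
    """Single-hash filter with a fixed seed, so the same keys give the same bits."""
    bits = bytearray(num_bits // 8)
    for key in sorted(keys):
        digest = hashlib.sha256(seed + key).digest()
        pos = int.from_bytes(digest[:8], "big") % num_bits
        bits[pos // 8] |= 1 << (pos % 8)
    return bytes(bits)


def _payload(records: tuple, filter_bits: bytes) -> bytes:
    return b"".join(k + v for k, v in records) + filter_bits


def seal(entries: dict) -> SealedSegment:
    records = tuple(sorted(entries.items()))
    filter_bits = build_filter(entries.keys())
    return SealedSegment(records, filter_bits, zlib.crc32(_payload(records, filter_bits)))


def verify(segment: SealedSegment) -> bool:
    expected = zlib.crc32(_payload(segment.records, segment.filter_bits))
    return expected == segment.checksum
```

Sealing the same entries in any insertion order yields an identical segment, and corrupting the filter bits makes `verify` fail because the checksum covers them.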

---

## 5. Sharding Semantics

### 5.1 Observational Invisibility

Sharding is a **mechanical partitioning** of the index.

Invariant:

```
LogicalIndex = union(all shards)
```

Rules:

* Shards MUST NOT affect lookup results
* Shard count and boundaries MAY change over time
* Rebalancing MUST preserve lookup semantics

### 5.2 Shard Assignment

Shard assignment MAY be based on:

* Hash of Canonical Key
* Routing Key
* Composite routing strategies

Shard selection MUST be deterministic per snapshot.
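
A minimal sketch of hash-based shard assignment, assuming hypothetical helpers `shard_of`, `build_shards`, and `lookup` and a truncated SHA-256 as the hash. It shows that the shard count changes where an entry lives but not what a lookup returns.

```python
import hashlib


def shard_of(canonical_key: bytes, shard_count: int) -> int:
    """Deterministic shard selection from a hash of the Canonical Key."""
    digest = hashlib.sha256(canonical_key).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def build_shards(entries: dict, shard_count: int) -> list:
    shards = [dict() for _ in range(shard_count)]
    for key, value in entries.items():
        shards[shard_of(key, shard_count)][key] = value
    return shards


def lookup(shards: list, key: bytes):
    # Route to exactly one shard; the shard itself compares full keys.
    return shards[shard_of(key, len(shards))].get(key)
```

Rebuilding the same entries with 1, 4, or 7 shards gives the same union and the same lookup results, which is the rebalancing property stated in 5.1.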

---

## 6. Hashing and Hash Recasting

### 6.1 Hashing

Hashes MAY be used for routing, filtering, or SIMD layout.

Hashes MUST NOT be treated as identity.

### 6.2 Hash Recasting

Hash recasting (changing hash functions or seeds) is permitted if:

1. It is deterministic
2. It does not change Canonical Keys
3. It does not affect index semantics

Recasting is equivalent to rebuilding acceleration structures.
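
The sketch below illustrates recasting as a rebuild. The `SeededIndex` class and its seed handling are assumptions for this example only. Changing the seed moves entries between buckets while Canonical Keys and lookup results stay the same.

```python
import hashlib


class SeededIndex:
    """Toy index whose bucket layout depends on a hash seed."""

    def __init__(self, entries: dict, seed: bytes, buckets: int = 8) -> None:
        self.seed = seed
        self.buckets = buckets
        self.table = {}
        for key, value in entries.items():
            self.table.setdefault(self._bucket(key), {})[key] = value

    def _bucket(self, canonical_key: bytes) -> int:
        digest = hashlib.sha256(self.seed + canonical_key).digest()
        return int.from_bytes(digest[:8], "big") % self.buckets

    def get(self, canonical_key: bytes):
        bucket = self.table.get(self._bucket(canonical_key), {})
        return bucket.get(canonical_key)  # full key comparison inside the bucket

    def recast(self, new_seed: bytes) -> "SeededIndex":
        # Recasting rebuilds the acceleration structure from the same entries.
        entries = {k: v for b in self.table.values() for k, v in b.items()}
        return SeededIndex(entries, new_seed, self.buckets)
```

Because hashes are never treated as identity, the old and recast indexes agree on every lookup even though their internal layouts differ.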

---

## 7. SIMD Execution

SIMD operations MAY be used to:

* Evaluate filters
* Compare routing keys
* Accelerate scans

Rules:

* SIMD MUST operate only on immutable data
* SIMD MUST NOT short-circuit semantic checks
* SIMD MUST preserve deterministic behavior
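
Python has no direct SIMD access, so the following is only a scalar emulation of a lane-wise compare, with hypothetical names throughout. It shows the shape of the rule: a batch comparison over routing-key fingerprints produces a candidate mask, and each candidate still goes through the full key comparison.

```python
def lane_compare(fingerprints: list, probe: int) -> int:
    """Emulates a SIMD equality compare: bit i is set when lane i matches."""
    mask = 0
    for lane, fingerprint in enumerate(fingerprints):
        if fingerprint == probe:
            mask |= 1 << lane
    return mask


def find(keys: list, fingerprints: list, probe_key: bytes, probe_fp: int):
    """Returns the lane holding probe_key, or None."""
    mask = lane_compare(fingerprints, probe_fp)
    lane = 0
    while mask:
        # A fingerprint match is only a candidate; the full key decides.
        if mask & 1 and keys[lane] == probe_key:
            return lane
        mask >>= 1
        lane += 1
    return None
```

Two keys that share a fingerprint both appear in the mask, but only the one whose full key matches is returned.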

---

## 8. Multi-Dimensional Routing Examples (Normative)

### 8.1 Artifact Index

* Canonical Key: `Reference`
* Routing Key components:

  * `H(Reference)`
  * `type_tag` (if present)
  * `has_typetag`

### 8.2 TGK Edge Index

* Canonical Key: `CanonicalEdgeKey`
* Routing Key components:

  * `H(CanonicalEdgeKey)`
  * `edge_type_key`
  * Direction or role (optional)

---

## 9. Snapshot Interaction

Acceleration structures:

* MUST respect snapshot visibility rules
* MUST operate over the same sealed segments visible to the snapshot
* MUST NOT bypass tombstones or shadowing

Snapshot cuts apply **after** routing and filtering.
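
A rough sketch of how these rules might combine in a lookup. The `Segment` type, the `seq` ordering field, the `TOMBSTONE` marker, and the use of a plain set as a stand-in filter are all assumptions for illustration. Actual visibility rules are defined by ASL/1-CORE-INDEX, not here.

```python
from dataclasses import dataclass

TOMBSTONE = None  # hypothetical deletion marker


@dataclass(frozen=True)
class Segment:
    seq: int                # hypothetical seal order; higher is newer
    entries: dict           # canonical key -> value or TOMBSTONE
    filter_keys: frozenset  # stand-in for a filter; real ones may give false positives


def lookup(segments, key: bytes, snapshot_seq: int):
    """Returns the visible value, or None if the key is absent or deleted."""
    for seg in sorted(segments, key=lambda s: s.seq, reverse=True):
        if key not in seg.filter_keys:
            continue  # filtering: segment definitely lacks the key
        if seg.seq > snapshot_seq:
            continue  # snapshot cut: segment is not visible to this snapshot
        if key in seg.entries:
            # The newest visible entry shadows older ones; a tombstone hides them.
            return seg.entries[key]
    return None
```

The filter prunes segments first, the snapshot cut is applied to what remains, and a tombstone in the newest visible segment stops the search instead of falling through to an older value.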

---

## 10. Normative Invariants

1. Canonical Keys define identity and correctness
2. Routing Keys are advisory only
3. Filters MUST NOT introduce false negatives
4. Sharding is observationally invisible
5. Hashes are not identity
6. SIMD is an execution strategy, not a semantic construct
7. All acceleration is deterministic per snapshot

---

## 11. Non-Goals

ASL/INDEX-ACCEL/1 does not define:

* Specific filter algorithms
* Memory layout
* CPU instruction selection
* Encoding formats
* Federation policies

---

## 12. Summary

ASL/INDEX-ACCEL/1 establishes a strict contract:

> All acceleration exists to make the index faster, never different.

It formalizes Canonical vs Routing keys and constrains filters, sharding, hashing, and SIMD so that correctness is preserved under all optimizations.