TGK-INDEX
Trace Graph Kernel Index Semantics and Acceleration
1. Purpose
TGK-INDEX defines the indexing semantics for Trace Graph Kernel (TGK) edges, which represent stored projections derived from:
- PEL execution
- Execution receipts
- Provenance and trace material
This document specifies:
- Canonical identity of TGK edges
- Snapshot-relative visibility
- Index lookup semantics
- Interaction with acceleration mechanisms defined in ASL-INDEX-ACCEL
TGK-INDEX defines what edges exist and how they are observed, not how they are accelerated.
2. Scope
This specification applies to:
- All TGK edge storage
- Edge lookup and traversal
- Stored projections over ASL artifacts and PEL executions
It does not define:
- PEL execution semantics
- Provenance interpretation
- Federation policies
- Storage encoding (see ENC-* documents)
- Acceleration mechanisms (see ASL-INDEX-ACCEL)
3. TGK Edge Model
3.1 TGK Edge
A TGK Edge represents a directed, immutable relationship between two nodes.
Nodes MAY represent:
- Artifacts
- PEL executions
- Receipts
- Abstract graph nodes defined by higher layers
Edges are created only by deterministic projection.
3.2 Canonical Edge Key
Each TGK edge has a Canonical Edge Key, which uniquely identifies the edge.
The Canonical Edge Key MUST include:
- Source node identifier
- Destination node identifier
- Projection context (e.g. PEL execution or receipt identity)
- Edge direction (if not implied)
Properties:
- Defines semantic identity
- Used for equality, shadowing, and tombstones
- Immutable once created
- Fully compared on lookup match
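As an illustration of the properties above, a Canonical Edge Key can be sketched as an immutable value type whose equality covers every identity field. This is a minimal sketch, not a normative layout: the class name, field names, string-typed identifiers, and the `Direction` enum are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    # Hypothetical direction values; the specification does not enumerate them.
    FORWARD = "forward"
    REVERSE = "reverse"


@dataclass(frozen=True)
class CanonicalEdgeKey:
    """Semantic identity of a TGK edge; every field takes part in equality."""
    source: str              # source node identifier
    destination: str         # destination node identifier
    projection_context: str  # e.g. a PEL execution or receipt identity
    direction: Direction = Direction.FORWARD


# Two keys with identical fields denote the same edge; changing any field
# (here the projection context) yields a different edge.
a = CanonicalEdgeKey("artifact:1", "exec:7", "pel:7")
b = CanonicalEdgeKey("artifact:1", "exec:7", "pel:7")
c = CanonicalEdgeKey("artifact:1", "exec:7", "pel:8")
```

Because the dataclass is frozen, instances are hashable and cannot be mutated after construction, which matches the "immutable once created" property.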
4. Edge Type Key
4.1 Definition
Each TGK edge MAY carry an Edge Type Key, which classifies the edge.
Properties:
- Immutable once edge is created
- Optional, but strongly encouraged
- Does NOT participate in canonical identity
- Used for routing, filtering, and query acceleration
Formal rule:
Edge Type Key is a classification attribute, not an identity attribute.
4.2 Absence Encoding
If an edge has no Edge Type Key, this absence MUST be explicitly encoded and observable to the index.
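One way to make absence explicit is a presence flag stored alongside the key field, so that "no Edge Type Key" can never be confused with a type key whose value happens to be zero. The sketch below is illustrative only: the 1-byte flag plus 4-byte key layout and the function names are assumptions, and the actual encoding belongs to the ENC-* documents.

```python
import struct
from typing import Optional

# Hypothetical layout: 1-byte presence flag followed by a 4-byte unsigned key.
_LAYOUT = struct.Struct("<BI")


def encode_edge_type(type_key: Optional[int]) -> bytes:
    """Encode an optional Edge Type Key with an explicit presence flag."""
    if type_key is None:
        return _LAYOUT.pack(0, 0)  # absent: flag 0, key field zeroed
    return _LAYOUT.pack(1, type_key)


def decode_edge_type(data: bytes) -> Optional[int]:
    """Decode the flag and key; None means the edge has no Edge Type Key."""
    has_type, type_key = _LAYOUT.unpack(data)
    return type_key if has_type else None
```

With this layout, an absent key and a present key of value 0 produce different bytes, so the index can observe the difference.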
5. Snapshot Semantics
5.1 Snapshot-Relative Visibility
TGK edges are snapshot-relative.
An edge is visible in snapshot S if and only if:
- The edge creation log entry has LogSeq ≤ S
- The edge is not shadowed by a later tombstone with LogSeq ≤ S
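The visibility rule can be expressed as a small predicate. This is a sketch under simplifying assumptions: a snapshot is represented by a single integer log sequence, and each edge carries at most one tombstone; the `EdgeHistory` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EdgeHistory:
    created_seq: int                     # LogSeq of the edge creation entry
    tombstone_seq: Optional[int] = None  # LogSeq of a later tombstone, if any


def is_visible(edge: EdgeHistory, snapshot: int) -> bool:
    """Return True if the edge is visible in the snapshot with LogSeq bound S."""
    if edge.created_seq > snapshot:
        return False  # created after the snapshot was taken
    tombstoned = edge.tombstone_seq is not None and edge.tombstone_seq <= snapshot
    return not tombstoned
```

Under these assumptions, an edge created at LogSeq 5 and tombstoned at LogSeq 9 is visible in snapshots 5 through 8 and in no others.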
5.2 Determinism
Given the same snapshot and input state:
- The visible TGK edge set MUST be identical
- Lookup and traversal MUST be deterministic
6. TGK Index Semantics
6.1 Logical Index Definition
The TGK logical index maps:
(snapshot, CanonicalEdgeKey) → EdgeRecord | ⊥
Rules:
- Newer entries shadow older ones
- Tombstones shadow edges
- Ordering is defined by log sequence
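The mapping and its shadowing rules can be sketched as a replay over a log. This is a deliberately naive linear scan meant to show the semantics, not an indexing strategy; the `LogEntry` tuple shape, the `TOMBSTONE` marker, and the use of plain strings as stand-ins for Canonical Edge Keys and edge records are assumptions for the example.

```python
from typing import List, Optional, Tuple

TOMBSTONE = object()  # marker payload for a tombstone entry

# (log_seq, canonical_edge_key, payload); payload is an edge record or TOMBSTONE
LogEntry = Tuple[int, str, object]


def lookup(log: List[LogEntry], snapshot: int, key: str) -> Optional[object]:
    """Resolve (snapshot, key) to the newest visible record, or None for ⊥."""
    best: Optional[LogEntry] = None
    for entry in log:
        log_seq, entry_key, _ = entry
        if entry_key != key or log_seq > snapshot:
            continue  # different edge, or written after the snapshot
        if best is None or log_seq > best[0]:
            best = entry  # newer entries shadow older ones
    if best is None or best[2] is TOMBSTONE:
        return None  # never written, or shadowed by a tombstone
    return best[2]


log = [(1, "k1", "rec-v1"), (3, "k1", "rec-v2"), (5, "k1", TOMBSTONE)]
```

With this log, a lookup of `"k1"` resolves to `"rec-v1"` at snapshot 1, to `"rec-v2"` at snapshots 3 and 4, and to ⊥ (here `None`) from snapshot 5 onward.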
6.2 Lookup by Attributes
Lookup MAY constrain:
- Source node
- Destination node
- Edge Type Key
- Projection context
Evaluation of such constraints MAY be accelerated, but any accelerated result is advisory and MUST be validated by full edge record comparison.
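The validation step can be sketched as a final pass over whatever candidate set an accelerated path produced. The `EdgeRecord` shape and the `find_edges` helper are hypothetical; the point is only that every returned edge has been checked against the full record, so a candidate set containing false positives still yields a correct answer.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class EdgeRecord:
    source: str
    destination: str
    projection_context: str
    edge_type: Optional[int] = None  # None models an absent Edge Type Key


def find_edges(candidates: Iterable[EdgeRecord], **constraints) -> List[EdgeRecord]:
    """Keep only candidates whose full record satisfies every constraint."""
    return [
        edge for edge in candidates
        if all(getattr(edge, name) == value for name, value in constraints.items())
    ]
```

A call such as `find_edges(candidates, source="a", edge_type=7)` returns only edges that match both attributes, regardless of how loosely the candidates were selected.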
7. Acceleration and Routing
7.1 Canonical vs Routing Keys
TGK indexing follows ASL-INDEX-ACCEL.
- Canonical identity is defined solely by Canonical Edge Key
- Routing Keys are derived and advisory
Routing Keys MAY incorporate:
- Hash of Canonical Edge Key
- Edge Type Key
- Direction or role
7.2 Filters
Filters:
- Are built over Routing Keys
- May include Edge Type Key
- MUST NOT introduce false negatives
- MUST be verified by full edge comparison
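A Bloom filter is one common structure with the required property: it can report false positives but never false negatives. The sketch below is an assumption about how such a filter might be used, not a mechanism this document or ASL-INDEX-ACCEL prescribes; the class, its parameters, and the `segment_contains` helper are hypothetical, and routing keys are modeled as raw bytes.

```python
import hashlib
from typing import Iterable, List


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, routing_key: bytes) -> Iterable[int]:
        # Derive num_hashes bit positions by salting SHA-256 with an index byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + routing_key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, routing_key: bytes) -> None:
        for pos in self._positions(routing_key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, routing_key: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(routing_key)
        )


def segment_contains(segment: List[bytes], bloom: BloomFilter, key: bytes) -> bool:
    """Use the filter to skip work, then confirm with a full comparison."""
    if not bloom.might_contain(key):
        return False       # safe to skip: the filter has no false negatives
    return key in segment  # full comparison rejects any false positive
```

The filter only decides whether the segment is worth scanning; the final answer always comes from the full comparison.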
7.3 Sharding
Sharding:
- Is observationally invisible
- MAY be based on Routing Keys
- MUST preserve logical index equivalence
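Logical index equivalence can be illustrated by partitioning an index on a derived routing key and checking that lookups return the same result as the unsharded index. This sketch assumes the routing key is simply a hash of the Canonical Edge Key, with strings standing in for keys and records; the helper names are hypothetical.

```python
import hashlib
from typing import Dict, List, Optional


def shard_of(canonical_key: str, num_shards: int) -> int:
    """Derive a shard number from a hash of the canonical key (advisory only)."""
    digest = hashlib.sha256(canonical_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def build_shards(index: Dict[str, str], num_shards: int) -> List[Dict[str, str]]:
    """Partition a logical index into num_shards physical maps."""
    shards: List[Dict[str, str]] = [{} for _ in range(num_shards)]
    for key, record in index.items():
        shards[shard_of(key, num_shards)][key] = record
    return shards


def sharded_lookup(shards: List[Dict[str, str]], key: str) -> Optional[str]:
    """Route to one shard, then match on the full canonical key."""
    return shards[shard_of(key, len(shards))].get(key)
```

For any shard count, `sharded_lookup(build_shards(index, n), key)` is expected to equal `index.get(key)`, which is what "observationally invisible" means in this sketch.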
7.4 SIMD Execution
SIMD MAY be used to accelerate:
- Filter evaluation
- Routing key comparison
- Edge scanning
SIMD MUST NOT affect semantics.
8. Relationship to ASL Index
TGK indexing:
- Reuses ASL snapshot and log ordering semantics
- May share physical storage and segments with ASL artifacts
- Is governed by the same checkpoint and recovery model
TGK edges MAY reference ASL artifacts across snapshots, subject to provenance constraints.
9. Garbage Collection and Retention
- TGK edges MUST NOT be collected while referenced by any retained snapshot
- Tombstoned edges MAY be reclaimed once unreachable
- Provenance requirements MAY pin edges beyond snapshot reachability
GC policies are store-defined but MUST preserve snapshot safety.
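A store-defined policy might test reclaimability roughly as follows. This is one possible reading of "unreachable", stated as an assumption: an edge is treated as unreachable when it has a tombstone and no retained snapshot falls in the range where the edge was visible. The function name, the integer snapshot model, and the `pinned` flag for provenance requirements are hypothetical.

```python
from typing import Iterable, Optional


def may_reclaim(
    created_seq: int,
    tombstone_seq: Optional[int],
    retained_snapshots: Iterable[int],
    pinned: bool = False,
) -> bool:
    """Return True if a tombstoned edge is safe to reclaim under this sketch."""
    if pinned:
        return False  # provenance requirements pin the edge
    if tombstone_seq is None:
        return False  # a live edge stays visible to every later snapshot
    # The edge is visible in snapshot S exactly when created_seq <= S < tombstone_seq.
    return not any(created_seq <= s < tombstone_seq for s in retained_snapshots)
```

For an edge created at LogSeq 5 and tombstoned at LogSeq 9, retaining snapshot 7 blocks reclamation, while retaining only snapshots 3 and 12 does not.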
10. Federation Considerations (Non-Normative)
In federated environments:
- Edge visibility MAY differ by domain
- Edge Type Keys MAY be namespaced
- Canonical Edge Keys MUST remain globally stable
Federation rules are defined outside this document.
11. Normative Invariants
- Canonical Edge Key defines identity
- Edge Type Key is not part of identity
- Snapshot visibility is log-sequence based
- Acceleration is advisory only
- Filters must not produce false negatives
- Sharding is observationally invisible
- Determinism is required per snapshot
12. Summary
TGK-INDEX defines a snapshot-safe, deterministic graph index that:
- Treats edges as immutable, indexed entities
- Separates identity from classification
- Is designed to scale to billions of edges through advisory acceleration (routing keys, filters, sharding)
- Integrates cleanly with ASL indexing and acceleration
By strictly reusing the Canonical vs Routing framework, TGK avoids semantic drift and ensures that graph queries remain correct regardless of acceleration strategy.