ENC-TGK-INDEX
Encoding Specification for TGK Edge Index References
1. Purpose
ENC-TGK-INDEX defines the on-disk encoding for Trace Graph Kernel (TGK) index records, which serve as references to TGK-CORE edges.
- It never encodes edge structure (from[]/to[])
- It supports filters, sharding, and routing per ASL-INDEX-ACCEL
- Snapshot and log-sequence semantics are maintained for deterministic recovery
2. Layering Principle
- TGK-CORE / ENC-TGK-CORE: authoritative edge structure (from[] → to[])
- TGK-INDEX: defines canonical keys, routing keys, acceleration logic
- ENC-TGK-INDEX: stores references to TGK-CORE edges and acceleration metadata
Normative statement:
ENC-TGK-INDEX encodes only references to TGK-CORE edges and MUST NOT re-encode or reinterpret edge structure.
3. Segment Layout
Segments are immutable and snapshot-bound:
+-----------------------------+
| Segment Header |
+-----------------------------+
| Routing Filters |
+-----------------------------+
| TGK Index Records |
+-----------------------------+
| Optional Acceleration Data |
+-----------------------------+
| Segment Footer |
+-----------------------------+
- Segment atomicity is enforced
- Footer checksum guarantees completeness
4. Segment Header
struct tgk_index_segment_header {
uint32_t magic; // 'TGKI'
uint16_t version; // encoding version
uint16_t flags; // segment flags
uint64_t segment_id; // unique per dataset
uint64_t logseq_min; // inclusive
uint64_t logseq_max; // inclusive
uint64_t record_count; // number of index records
uint64_t record_area_offset; // bytes from segment start
uint64_t footer_offset; // bytes from segment start
};
- logseq_min/logseq_max enforce snapshot visibility
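A reader would typically sanity-check these fields before trusting a segment. The following is a minimal sketch under stated assumptions: the numeric magic value (the spec gives 'TGKI' but not its byte order), the helper name `tgk_index_header_is_sane`, and the idea that the on-disk header is exactly as large as the in-memory struct are all illustrative, not part of the specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 'TGKI' packed as big-endian ASCII; the spec does not fix this. */
#define TGK_INDEX_MAGIC_ASSUMED UINT32_C(0x54474B49)

struct tgk_index_segment_header {
    uint32_t magic;
    uint16_t version;
    uint16_t flags;
    uint64_t segment_id;
    uint64_t logseq_min;
    uint64_t logseq_max;
    uint64_t record_count;
    uint64_t record_area_offset;
    uint64_t footer_offset;
};

/* Hypothetical helper: structural checks only, no checksum verification. */
static bool tgk_index_header_is_sane(const struct tgk_index_segment_header *h)
{
    if (h->magic != TGK_INDEX_MAGIC_ASSUMED)
        return false;
    /* Both bounds are inclusive, so min must not exceed max. */
    if (h->logseq_min > h->logseq_max)
        return false;
    /* Assumption: the record area cannot start inside the header itself. */
    if (h->record_area_offset < sizeof *h)
        return false;
    /* The footer comes after the record area in the segment layout. */
    if (h->footer_offset < h->record_area_offset)
        return false;
    return true;
}
```

These checks only reject obviously malformed headers; completeness still depends on the footer checksum.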
5. Routing Filters
Filters are optional but recommended:
struct tgk_index_filter_header {
uint16_t filter_type; // e.g., BLOOM, XOR, RIBBON
uint16_t version;
uint32_t flags;
uint64_t size_bytes; // length of filter payload
};
- Filters operate on routing keys, not canonical edge IDs
- Routing keys may include:
  - Edge type key
  - Projection context
  - Direction or role
- False positives allowed; false negatives forbidden
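To illustrate the "no false negatives" rule, here is a small Bloom filter sketch over routing keys. Everything in it is an assumption made for illustration: the routing key packing in `routing_key`, the splitmix64-style mixing function, the probe count, and the bit layout. The actual filter payload format is not defined in this section.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* splitmix64 finalizer, used here as a stand-in hash (assumption). */
static uint64_t mix64(uint64_t z)
{
    z ^= z >> 30; z *= UINT64_C(0xbf58476d1ce4e5b9);
    z ^= z >> 27; z *= UINT64_C(0x94d049bb133111eb);
    z ^= z >> 31;
    return z;
}

/* Hypothetical routing key: edge type key in the high 32 bits, role in the low 8. */
static uint64_t routing_key(uint32_t edge_type_key, uint8_t role)
{
    return ((uint64_t)edge_type_key << 32) | role;
}

enum { BLOOM_PROBES = 4 };

/* Set BLOOM_PROBES bits derived from the key (double hashing). */
static void bloom_add(uint8_t *bits, size_t nbits, uint64_t key)
{
    uint64_t h1 = mix64(key), h2 = mix64(h1) | 1;
    for (int i = 0; i < BLOOM_PROBES; i++) {
        size_t bit = (size_t)((h1 + (uint64_t)i * h2) % nbits);
        bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
    }
}

/* false: key was never added. true: key may be present, caller must confirm. */
static bool bloom_may_contain(const uint8_t *bits, size_t nbits, uint64_t key)
{
    uint64_t h1 = mix64(key), h2 = mix64(h1) | 1;
    for (int i = 0; i < BLOOM_PROBES; i++) {
        size_t bit = (size_t)((h1 + (uint64_t)i * h2) % nbits);
        if (!(bits[bit / 8] & (1u << (bit % 8))))
            return false;
    }
    return true;
}
```

Because an added key always finds all of its bits set, a negative answer is safe to act on, while a positive answer still requires the canonical key check against tgk_edge_id.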
6. TGK Index Record
Each record references a single TGK-CORE edge:
struct tgk_index_record {
uint64_t logseq; // creation log sequence
uint64_t tgk_edge_id; // reference to ENC-TGK-CORE edge
uint32_t edge_type_key; // optional classification
uint8_t has_edge_type; // 0 or 1
uint8_t role; // optional: from / to / both
uint16_t flags; // tombstone, reserved
};
- tgk_edge_id is the canonical key
- No from[]/to[] fields exist here
- Edge identity is solely TGK-CORE edge ID
Flags:
| Flag | Meaning |
|---|---|
| TGK_INDEX_TOMBSTONE | Shadows previous record |
| TGK_INDEX_RESERVED | Future use |
7. Optional Node-Projection Records (Acceleration Only)
For node-centric queries, optional records may map a node to the edges that reference it:
struct tgk_node_edge_ref {
uint64_t logseq;
uint64_t node_id;
uint64_t tgk_edge_id;
uint8_t position; // from or to
};
- Derivable from TGK-CORE edges
- Optional; purely for acceleration
- Must not affect semantics
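Since these records are derivable, a builder could generate them from an edge's endpoints. The sketch below assumes a TGK-CORE edge is available as two arrays of node IDs, and it invents a position encoding (`TGK_POS_FROM`, `TGK_POS_TO`) and a helper name (`tgk_derive_node_refs`). None of these are defined by this specification.

```c
#include <stddef.h>
#include <stdint.h>

struct tgk_node_edge_ref {
    uint64_t logseq;
    uint64_t node_id;
    uint64_t tgk_edge_id;
    uint8_t  position;
};

/* Assumed position encoding; the spec only says "from or to". */
enum { TGK_POS_FROM = 0, TGK_POS_TO = 1 };

/* Hypothetical helper: emits one ref per endpoint into out, which must hold
 * at least from_n + to_n entries. Returns the number of refs written. */
static size_t tgk_derive_node_refs(uint64_t logseq, uint64_t edge_id,
                                   const uint64_t *from, size_t from_n,
                                   const uint64_t *to, size_t to_n,
                                   struct tgk_node_edge_ref *out)
{
    size_t n = 0;
    for (size_t i = 0; i < from_n; i++)
        out[n++] = (struct tgk_node_edge_ref){ logseq, from[i], edge_id, TGK_POS_FROM };
    for (size_t i = 0; i < to_n; i++)
        out[n++] = (struct tgk_node_edge_ref){ logseq, to[i], edge_id, TGK_POS_TO };
    return n;
}
```

Dropping every such record leaves lookup results unchanged, which is what "purely for acceleration" requires.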
8. Sharding and SIMD
- Shard assignment: via routing keys, not index semantics
- SIMD-optimized arrays may exist in optional acceleration sections
- Must be deterministic and immutable
- Must follow ASL-INDEX-ACCEL invariants
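One way to keep shard assignment deterministic is to derive it purely from the routing key. This is a sketch only: the function name `tgk_shard_for`, the splitmix64-style mixing step, and the modulo reduction are assumptions, and ASL-INDEX-ACCEL may prescribe something different.

```c
#include <stdint.h>

/* Hypothetical: maps a routing key to a shard in [0, shard_count).
 * shard_count must be nonzero. Same key and count always give the same shard. */
static uint32_t tgk_shard_for(uint64_t routing_key, uint32_t shard_count)
{
    /* splitmix64 finalizer as a stand-in mixing function (assumption). */
    uint64_t z = routing_key;
    z ^= z >> 30; z *= UINT64_C(0xbf58476d1ce4e5b9);
    z ^= z >> 27; z *= UINT64_C(0x94d049bb133111eb);
    z ^= z >> 31;
    return (uint32_t)(z % shard_count);
}
```

The function reads nothing but the routing key, so the shard choice cannot leak into index semantics.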
9. Snapshot Interaction
At snapshot S:
- Segment visible if logseq_min ≤ S
- Record visible if logseq ≤ S
- Tombstones shadow earlier records
Lookup Algorithm:
- Filter by snapshot
- Evaluate routing/filter keys (advisory)
- Confirm canonical key match with tgk_edge_id
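The visibility rules and lookup steps above can be sketched as a linear scan over a segment's records. Assumptions in this sketch: the tombstone bit value (the spec names the flag but not its encoding), the helper name `tgk_index_lookup`, and the tie-break that a later record wins when two records share a logseq. The advisory filter step is omitted because skipping it must not change the result.

```c
#include <stddef.h>
#include <stdint.h>

struct tgk_index_record {
    uint64_t logseq;
    uint64_t tgk_edge_id;
    uint32_t edge_type_key;
    uint8_t  has_edge_type;
    uint8_t  role;
    uint16_t flags;
};

/* Assumed bit value; not defined by the spec. */
#define TGK_INDEX_TOMBSTONE 0x0001u

/* Hypothetical helper: returns the record visible for edge_id at snapshot,
 * or NULL if there is none or the newest visible record is a tombstone. */
static const struct tgk_index_record *
tgk_index_lookup(const struct tgk_index_record *recs, size_t n,
                 uint64_t edge_id, uint64_t snapshot)
{
    const struct tgk_index_record *best = NULL;
    for (size_t i = 0; i < n; i++) {
        const struct tgk_index_record *r = &recs[i];
        if (r->logseq > snapshot)
            continue;                    /* not visible at this snapshot */
        if (r->tgk_edge_id != edge_id)
            continue;                    /* canonical key mismatch */
        if (best == NULL || r->logseq >= best->logseq)
            best = r;                    /* keep the newest visible record */
    }
    if (best != NULL && (best->flags & TGK_INDEX_TOMBSTONE))
        return NULL;                     /* tombstone shadows earlier records */
    return best;
}
```

For example, with a live record at logseq 10 and a tombstone at logseq 20 for the same edge, a lookup at snapshot 15 finds the record and a lookup at snapshot 20 finds nothing.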
10. Segment Footer
struct tgk_index_segment_footer {
uint64_t checksum; // covers header + filters + records
uint64_t record_bytes; // size of record area
uint64_t filter_bytes; // size of filter area
};
- Ensures atomicity and completeness
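A completeness check compares a recomputed checksum against the footer. The spec does not name the checksum algorithm, so the sketch below substitutes 64-bit FNV-1a purely as a placeholder. It also assumes the covered bytes (header, filters, records) are contiguous from the start of the segment, and the helper name `tgk_index_footer_matches` is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct tgk_index_segment_footer {
    uint64_t checksum;
    uint64_t record_bytes;
    uint64_t filter_bytes;
};

/* FNV-1a 64-bit: a stand-in only, not the checksum mandated by the spec. */
static uint64_t fnv1a64(const uint8_t *p, size_t n)
{
    uint64_t h = UINT64_C(0xcbf29ce484222325);   /* FNV offset basis */
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= UINT64_C(0x100000001b3);            /* FNV prime */
    }
    return h;
}

/* Hypothetical helper: covered_len is the byte length of header + filters +
 * records, assumed to start at segment[0]. */
static bool tgk_index_footer_matches(const uint8_t *segment, size_t covered_len,
                                     const struct tgk_index_segment_footer *f)
{
    return fnv1a64(segment, covered_len) == f->checksum;
}
```

A segment whose footer is missing or does not match would be treated as incomplete and ignored, which is how the footer supports segment atomicity.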
11. Normative Invariants
- Edge identity = TGK-CORE edge ID
- Edge Type Key is not part of identity
- Filters are advisory only
- Sharding is observationally invisible
- Index records are immutable
- Snapshot visibility strictly follows logseq
- Determinism guaranteed per snapshot
12. Summary
ENC-TGK-INDEX:
- References TGK-CORE edges without re-encoding structure
- Supports snapshot-safe, deterministic lookup
- Enables filter, shard, and SIMD acceleration
- Preserves TGK-CORE semantics strictly
Because index records carry only edge IDs and acceleration metadata, the encoding cannot duplicate TGK-CORE semantics, and filters, shards, and SIMD sections can be added or dropped without changing lookup results.