Absolutely — here’s a draft for ENC-ASL-TGK-INDEX, carefully merging ASL artifact indexes and TGK edge indexes while respecting the separation of concerns and snapshot determinism.
This design keeps ENC-ASL-CORE and ENC-TGK-CORE authoritative, and only merges index references and acceleration structures.
ENC-ASL-TGK-INDEX
Merged On-Disk Index for ASL Artifacts and TGK Edges
1. Purpose
ENC-ASL-TGK-INDEX defines a unified on-disk index that:
- References ASL artifacts (ENC-ASL-CORE)
- References TGK edges (ENC-TGK-CORE)
- Supports routing keys, filters, sharding, SIMD acceleration per ASL-INDEX-ACCEL
- Preserves snapshot safety, log-sequence ordering, and immutability
Semantic data lives in the respective CORE layers; this index layer only stores references.
2. Layering Principle
| Layer | Responsibility |
|---|---|
| ENC-ASL-CORE | Artifact structure and type tags |
| ENC-TGK-CORE | Edge structure (from[] → to[]) |
| TGK-INDEX / ASL-INDEX | Canonical & routing keys, index semantics |
| ENC-ASL-TGK-INDEX | On-disk references and acceleration metadata |
Invariant: This index never re-encodes artifacts or edges.
3. Segment Layout
Segments are append-only and snapshot-bound:
+-----------------------------+
| Segment Header |
+-----------------------------+
| Routing Filters |
+-----------------------------+
| ASL Artifact Index Records |
+-----------------------------+
| TGK Edge Index Records |
+-----------------------------+
| Optional Acceleration Data |
+-----------------------------+
| Segment Footer |
+-----------------------------+
- Segment atomicity enforced
- Footer checksum guarantees integrity
4. Segment Header
struct asl_tgk_index_segment_header {
    uint32_t magic;              // 'ATXI'
    uint16_t version;
    uint16_t flags;
    uint64_t segment_id;
    uint64_t logseq_min;
    uint64_t logseq_max;
    uint64_t asl_record_count;
    uint64_t tgk_record_count;
    uint64_t record_area_offset;
    uint64_t footer_offset;
};
- logseq_* fields enforce snapshot visibility
- Separate counts for ASL and TGK entries
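To make the header fields concrete, here is a minimal sketch of the checks a reader might run before trusting a segment. The numeric value of the magic (the four characters packed high byte first), the function names, and the specific consistency rules are assumptions for illustration; the draft only fixes the field list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 'A' 'T' 'X' 'I' packed high byte first. The packing order is an assumption. */
#define ATXI_MAGIC 0x41545849u

struct asl_tgk_index_segment_header {
    uint32_t magic;
    uint16_t version;
    uint16_t flags;
    uint64_t segment_id;
    uint64_t logseq_min;
    uint64_t logseq_max;
    uint64_t asl_record_count;
    uint64_t tgk_record_count;
    uint64_t record_area_offset;
    uint64_t footer_offset;
};

/* 4 + 2 + 2 + 7 * 8 bytes: the declared field order needs no padding. */
static_assert(sizeof(struct asl_tgk_index_segment_header) == 64,
              "header is expected to be 64 bytes");

/* Hypothetical sanity check: magic, logseq range, and region ordering. */
bool atxi_header_is_sane(const struct asl_tgk_index_segment_header *h)
{
    if (h->magic != ATXI_MAGIC)
        return false;
    if (h->logseq_min > h->logseq_max)
        return false;
    if (h->record_area_offset < sizeof *h)
        return false;
    if (h->footer_offset < h->record_area_offset)
        return false;
    return true;
}

/* A segment can contribute records to snapshot S only if its oldest
   record is not newer than S. */
bool atxi_segment_visible(const struct asl_tgk_index_segment_header *h,
                          uint64_t snapshot)
{
    return h->logseq_min <= snapshot;
}
```

A reader would call atxi_header_is_sane first and skip the segment entirely when atxi_segment_visible returns false for the snapshot being read.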
5. Routing Filters
Filters may be segmented by type:
- ASL filters: artifact hash + type tag
- TGK filters: canonical edge ID + edge type key + optional role
struct asl_tgk_filter_header {
    uint16_t filter_type;  // e.g., BLOOM, XOR
    uint16_t version;
    uint32_t flags;
    uint64_t size_bytes;   // length of filter payload
};
- Filters are advisory; false positives allowed, false negatives forbidden
- Must be deterministic per snapshot
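As an illustration of the "advisory, no false negatives, deterministic" contract, here is a small Bloom filter sketch. The hash (a splitmix64-style mixer), the number of probes, and the struct bloom wrapper are all assumptions; the draft does not specify a filter algorithm beyond naming BLOOM and XOR as possible types. The bit array would correspond to the filter payload, with nbits = size_bytes * 8 under that assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Seedless 64-bit mixer (splitmix64 finalizer). With no runtime seed, the
   same routing key always maps to the same bits, which keeps the filter
   deterministic. */
static uint64_t mix64(uint64_t x)
{
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

enum { BLOOM_PROBES = 4 }; /* hypothetical probe count */

struct bloom {
    uint8_t *bits;   /* filter payload */
    uint64_t nbits;  /* payload length in bits */
};

/* Position of the i-th probe bit, using double hashing. */
static uint64_t bloom_bit(uint64_t key, uint64_t nbits, unsigned i)
{
    uint64_t h1 = mix64(key);
    uint64_t h2 = mix64(key ^ 0x9e3779b97f4a7c15ULL) | 1u;
    return (h1 + i * h2) % nbits;
}

void bloom_add(struct bloom *b, uint64_t key)
{
    assert(b->nbits > 0);
    for (unsigned i = 0; i < BLOOM_PROBES; i++) {
        uint64_t bit = bloom_bit(key, b->nbits, i);
        b->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
    }
}

/* false: the key is definitely absent, the segment can be skipped.
   true: the key may be present, the caller must check the records. */
bool bloom_may_contain(const struct bloom *b, uint64_t key)
{
    assert(b->nbits > 0);
    for (unsigned i = 0; i < BLOOM_PROBES; i++) {
        uint64_t bit = bloom_bit(key, b->nbits, i);
        if ((b->bits[bit / 8] & (1u << (bit % 8))) == 0)
            return false;
    }
    return true;
}
```

Because bits are only ever set, every added key passes the probe, which is the "false negatives forbidden" property; a positive answer still has to be confirmed against the canonical records.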
6. ASL Artifact Index Record
struct asl_index_record {
    uint64_t logseq;
    uint64_t artifact_id;   // ENC-ASL-CORE reference
    uint32_t type_tag;      // optional
    uint8_t  has_type_tag;  // 0 or 1
    uint16_t flags;         // tombstone, reserved
};
- artifact_id = canonical identity
- No artifact payload here
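One practical detail: as declared, most compilers insert a padding byte between has_type_tag and flags, so writing the struct with a raw memory copy would make the on-disk bytes compiler-dependent. A minimal sketch of an explicit encoding follows; the 23-byte packed layout, the little-endian byte order, and the function names are assumptions, not part of the draft.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct asl_index_record {
    uint64_t logseq;
    uint64_t artifact_id;
    uint32_t type_tag;
    uint8_t  has_type_tag;
    uint16_t flags;
};

/* Assumed packed size on disk: 8 + 8 + 4 + 1 + 2 bytes. */
enum { ASL_INDEX_RECORD_WIRE_SIZE = 23 };

/* Store the low nbytes of v, least significant byte first. */
static void put_le(uint8_t *dst, uint64_t v, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        dst[i] = (uint8_t)(v >> (8 * i));
}

/* Load nbytes stored least significant byte first. */
static uint64_t get_le(const uint8_t *src, size_t nbytes)
{
    uint64_t v = 0;
    for (size_t i = 0; i < nbytes; i++)
        v |= (uint64_t)src[i] << (8 * i);
    return v;
}

void asl_index_record_encode(const struct asl_index_record *r,
                             uint8_t out[ASL_INDEX_RECORD_WIRE_SIZE])
{
    assert(r->has_type_tag <= 1);
    put_le(out + 0, r->logseq, 8);
    put_le(out + 8, r->artifact_id, 8);
    put_le(out + 16, r->type_tag, 4);
    out[20] = r->has_type_tag;
    put_le(out + 21, r->flags, 2);
}

void asl_index_record_decode(const uint8_t in[ASL_INDEX_RECORD_WIRE_SIZE],
                             struct asl_index_record *r)
{
    r->logseq       = get_le(in + 0, 8);
    r->artifact_id  = get_le(in + 8, 8);
    r->type_tag     = (uint32_t)get_le(in + 16, 4);
    r->has_type_tag = in[20];
    r->flags        = (uint16_t)get_le(in + 21, 2);
}
```

An alternative with the same effect would be to add an explicit reserved byte to the struct so that its in-memory size and on-disk size agree.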
7. TGK Edge Index Record
struct tgk_index_record {
    uint64_t logseq;
    uint64_t tgk_edge_id;    // ENC-TGK-CORE reference
    uint32_t edge_type_key;  // optional
    uint8_t  has_edge_type;
    uint8_t  role;           // optional from/to/both
    uint16_t flags;          // tombstone, reserved
};
- tgk_edge_id = canonical TGK-CORE edge ID
- No node lists stored in index
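To show how the optional fields might be used during a scan, here is a sketch of a record-level match predicate. The role encoding (a two-bit mask), the struct tgk_query type, and the function name are hypothetical; the draft only says that the edge type key and role are optional.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct tgk_index_record {
    uint64_t logseq;
    uint64_t tgk_edge_id;
    uint32_t edge_type_key;
    uint8_t  has_edge_type;
    uint8_t  role;
    uint16_t flags;
};

/* 8 + 8 + 4 + 1 + 1 + 2 bytes: this field order has no implicit padding. */
static_assert(sizeof(struct tgk_index_record) == 24,
              "tgk_index_record is expected to be 24 bytes");

/* Hypothetical role encoding: a bit mask, so "both" is FROM | TO. */
enum { TGK_ROLE_FROM = 1, TGK_ROLE_TO = 2, TGK_ROLE_BOTH = 3 };

/* Hypothetical query shape for a record scan. */
struct tgk_query {
    bool     match_type;     /* when false, any edge type matches */
    uint32_t edge_type_key;
    uint8_t  role_mask;      /* 0 means any role */
};

bool tgk_record_matches(const struct tgk_index_record *r,
                        const struct tgk_query *q)
{
    /* A record without an edge type can never satisfy a typed query. */
    if (q->match_type &&
        (!r->has_edge_type || r->edge_type_key != q->edge_type_key))
        return false;
    if (q->role_mask != 0 && (r->role & q->role_mask) == 0)
        return false;
    return true;
}
```

A match here only selects index records; the edge itself still has to be resolved through ENC-TGK-CORE by tgk_edge_id.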
8. Optional Node-Projection Records
For acceleration:
struct node_edge_ref {
    uint64_t logseq;
    uint64_t node_id;      // from/to node
    uint64_t tgk_edge_id;
    uint8_t  position;     // from or to
};
- Fully derivable from TGK-CORE edges
- Optional; purely for lookup speed
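"Fully derivable" can be made concrete with a sketch that expands one edge into its projection records. The struct tgk_edge_view type stands in for a decoded ENC-TGK-CORE edge and is hypothetical, as are the position constants and the function name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct node_edge_ref {
    uint64_t logseq;
    uint64_t node_id;
    uint64_t tgk_edge_id;
    uint8_t  position;
};

/* Hypothetical encoding of the position field. */
enum { NODE_POS_FROM = 0, NODE_POS_TO = 1 };

/* Hypothetical in-memory view of a decoded TGK-CORE edge (from[] -> to[]). */
struct tgk_edge_view {
    uint64_t        logseq;
    uint64_t        tgk_edge_id;
    const uint64_t *from;
    size_t          from_count;
    const uint64_t *to;
    size_t          to_count;
};

/* Emits one ref per from-node, then one per to-node.
   Returns the number of refs written, or 0 if out is too small. */
size_t project_edge(const struct tgk_edge_view *e,
                    struct node_edge_ref *out, size_t cap)
{
    size_t need = e->from_count + e->to_count;
    if (need > cap)
        return 0;

    size_t n = 0;
    for (size_t i = 0; i < e->from_count; i++) {
        out[n++] = (struct node_edge_ref){
            .logseq = e->logseq, .node_id = e->from[i],
            .tgk_edge_id = e->tgk_edge_id, .position = NODE_POS_FROM };
    }
    for (size_t i = 0; i < e->to_count; i++) {
        out[n++] = (struct node_edge_ref){
            .logseq = e->logseq, .node_id = e->to[i],
            .tgk_edge_id = e->tgk_edge_id, .position = NODE_POS_TO };
    }
    assert(n == need);
    return n;
}
```

Since the output depends only on the edge, the projection can be dropped and rebuilt at any time without changing query results.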
9. Sharding and SIMD
- Shard assignment is routing key based (ASL artifact or TGK edge)
- SIMD arrays may store precomputed routing keys for fast filter evaluation
- Must follow ASL-INDEX-ACCEL invariants: deterministic, immutable, snapshot-safe
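A minimal sketch of what routing-key-based sharding and a SIMD-friendly key array could look like. The mixer, the modulo mapping, and both function names are assumptions; ASL-INDEX-ACCEL may prescribe something different. The scan is written as a plain branch-free loop over a contiguous array, which compilers can usually vectorize; no intrinsics are used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Seedless 64-bit mixer (splitmix64 finalizer), so the mapping from key to
   shard never changes between runs or snapshots. */
static uint64_t mix64(uint64_t x)
{
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

/* Shard index depends only on the routing key and the shard count. */
uint32_t shard_for_key(uint64_t routing_key, uint32_t shard_count)
{
    assert(shard_count > 0);
    return (uint32_t)(mix64(routing_key) % shard_count);
}

/* Counts entries equal to wanted in a contiguous array of precomputed
   routing keys. */
size_t count_key_matches(const uint64_t *keys, size_t n, uint64_t wanted)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        hits += (keys[i] == wanted);
    return hits;
}
```

Because the shard is a pure function of the key, a reader that merges results from all shards sees the same answer as an unsharded index, which is what "observationally invisible" requires.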
10. Snapshot Interaction
At snapshot S:
- Segment visible if logseq_min ≤ S
- ASL or TGK record visible if logseq ≤ S
- Tombstones shadow earlier records
- Filters may be used as advisory before canonical verification
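The visibility rules above can be sketched as a lookup that picks the newest record at or below S and treats a tombstone as "absent". The tombstone bit position and the function name are hypothetical, and the linear scan is only for clarity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct asl_index_record {
    uint64_t logseq;
    uint64_t artifact_id;
    uint32_t type_tag;
    uint8_t  has_type_tag;
    uint16_t flags;
};

/* Hypothetical flag bit; the draft only says flags carry a tombstone marker. */
#define ASL_FLAG_TOMBSTONE 0x0001u

/* Returns the record that defines artifact_id at the given snapshot, or NULL
   when no record is visible or the newest visible one is a tombstone. */
const struct asl_index_record *
asl_lookup_at(const struct asl_index_record *recs, size_t n,
              uint64_t artifact_id, uint64_t snapshot)
{
    assert(recs != NULL || n == 0);
    const struct asl_index_record *best = NULL;
    for (size_t i = 0; i < n; i++) {
        const struct asl_index_record *r = &recs[i];
        if (r->artifact_id != artifact_id || r->logseq > snapshot)
            continue; /* different artifact, or written after the snapshot */
        if (best == NULL || r->logseq >= best->logseq)
            best = r; /* newest visible record wins */
    }
    if (best != NULL && (best->flags & ASL_FLAG_TOMBSTONE))
        return NULL;  /* the tombstone shadows everything older */
    return best;
}
```

The same shape would apply to TGK records, keyed by tgk_edge_id instead of artifact_id.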
11. Segment Footer
struct asl_tgk_index_segment_footer {
    uint64_t checksum;          // covers header, filters, records
    uint64_t asl_record_bytes;
    uint64_t tgk_record_bytes;
    uint64_t filter_bytes;
};
- Ensures atomicity and completeness
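A sketch of how a reader might use the footer to reject a torn or corrupted segment. The draft does not name a checksum algorithm, so FNV-1a (64-bit) is used here purely as a stand-in; the function names and the rule that the byte counts must fit inside the checksummed region are also assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct asl_tgk_index_segment_footer {
    uint64_t checksum;
    uint64_t asl_record_bytes;
    uint64_t tgk_record_bytes;
    uint64_t filter_bytes;
};

static_assert(sizeof(struct asl_tgk_index_segment_footer) == 32,
              "footer is expected to be 32 bytes");

/* FNV-1a 64-bit, a stand-in for whatever checksum the format adopts. */
uint64_t fnv1a64(const uint8_t *data, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/* body points at the segment bytes that precede the footer
   (header, filters, records). */
bool atxi_footer_matches(const struct asl_tgk_index_segment_footer *f,
                         const uint8_t *body, size_t body_len)
{
    /* The declared region sizes must fit in the body; subtracting step by
       step avoids overflow on hostile values. */
    uint64_t remaining = body_len;
    if (f->filter_bytes > remaining)
        return false;
    remaining -= f->filter_bytes;
    if (f->asl_record_bytes > remaining)
        return false;
    remaining -= f->asl_record_bytes;
    if (f->tgk_record_bytes > remaining)
        return false;

    return f->checksum == fnv1a64(body, body_len);
}
```

A segment whose footer fails this check would be treated as never written, which is one way to get the all-or-nothing behaviour the layout section asks for.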
12. Normative Invariants
- ASL artifact identity = ENC-ASL-CORE artifact ID
- TGK edge identity = ENC-TGK-CORE edge ID
- Edge type tag and artifact type tag do not affect canonical identity
- Filters are advisory only; no false negatives
- Sharding is observationally invisible
- Index records are immutable once written
- Snapshot visibility strictly follows logseq
- Determinism guaranteed per snapshot
13. Summary
ENC-ASL-TGK-INDEX merges ASL artifacts and TGK edges into a single, snapshot-safe, acceleration-friendly index layer:
- Keeps core semantics authoritative
- Enables high-performance lookups using routing, sharding, SIMD, and filters
- Preserves immutability and determinism
- Fully compatible with ASL-INDEX-ACCEL principles
This design supports billions of references while avoiding semantic collisions between ASL and TGK layers.
If you want, the next step could be drafting a unified query execution model over this merged index, connecting artifact lookups and TGK graph traversals in a snapshot-safe, deterministic way.
Do you want me to do that next?