Absolutely, here's a **draft of a unified query execution model** over the merged ENC-ASL-TGK-INDEX. It treats ASL artifacts and TGK edges uniformly for queries, respects snapshots, and preserves determinism.

---

# Unified Query Execution Model (ASL + TGK)

---

## 1. Purpose

This model defines how **queries over ASL artifacts and TGK edges** are executed:

* Snapshot-safe
* Deterministic per log sequence
* Able to leverage acceleration structures (filters, routing, SIMD)
* Able to support DAG program projections and trace graph traversals

It does **not** redefine core semantics:

* ENC-ASL-CORE defines artifacts
* ENC-TGK-CORE defines edges
* ENC-ASL-TGK-INDEX defines references and acceleration

---

## 2. Query Abstraction

A **query** Q is defined as:

```
Q = {
    snapshot: S,
    constraints: C,        // filters on artifacts, edges, or nodes
    projections: P,        // select returned fields
    traversal: optional,   // TGK edge expansion
    aggregation: optional  // count, union, etc.
}
```

* **snapshot**: the log sequence cutoff
* **constraints**: logical predicate over index fields (artifact type, edge type, node ID)
* **projections**: the output columns
* **traversal**: optional TGK graph expansion
* **aggregation**: optional summarization
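
As an illustration, the abstraction above could be carried by a small immutable value type. This is only a sketch: the `Query` class, the `Record` alias, and the choice to model constraints as a Python callable are assumptions made for the example, not part of any ENC specification.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Sequence

# Hypothetical record shape: a flat mapping of index field names to values.
Record = Mapping[str, object]


@dataclass(frozen=True)
class Query:
    snapshot: int                          # log sequence cutoff S
    constraints: Callable[[Record], bool]  # predicate C over index fields
    projections: Sequence[str]             # output columns P
    traversal_depth: Optional[int] = None  # None means no TGK expansion
    aggregation: Optional[str] = None      # e.g. "count"; None means raw rows


# Example: artifacts with type tag 42, as of log sequence 100.
q = Query(
    snapshot=100,
    constraints=lambda r: r.get("type_tag") == 42,
    projections=("artifact_id", "logseq"),
)
```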

---

## 3. Execution Stages

### 3.1 Index Scan

1. Determine **segments visible** for snapshot `S`
2. For each segment:

   * Use **filters** to eliminate segments/records (advisory)
   * Decode **ASL artifact references** and **TGK edge references**
   * Skip tombstoned or shadowed records
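
A minimal sketch of the scan loop, under stated assumptions: `Segment` and `Record` are hypothetical in-memory stand-ins for decoded index segments, and filter pruning and tombstone/shadow resolution are deliberately left out so that only the snapshot cutoff is shown.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Record:
    logseq: int
    key: str  # canonical key (artifact ID or edge ID); hypothetical field name


@dataclass
class Segment:
    logseq_min: int        # smallest log sequence number stored in the segment
    records: List[Record]


def scan(segments: List[Segment], snapshot: int) -> Iterator[Record]:
    """Yield every record at or below the snapshot cutoff."""
    for seg in segments:
        if seg.logseq_min > snapshot:
            continue  # the whole segment lies after the snapshot
        for rec in seg.records:
            if rec.logseq <= snapshot:
                yield rec


segments = [
    Segment(1, [Record(1, "a"), Record(5, "b")]),
    Segment(10, [Record(10, "c")]),
]
visible_at_4 = list(scan(segments, snapshot=4))
```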

### 3.2 Constraint Evaluation

* Evaluate **canonical constraints**:

  * Artifact ID, type tag
  * Edge ID, edge type, role
  * Node ID (from/to)
* Filters are advisory, so every candidate that passes a filter still requires an exact check
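
To show why the exact check cannot be skipped, here is a sketch that uses a tiny Bloom filter as the advisory structure. The Bloom filter is an assumption chosen for illustration (the index may use a different filter type), and `TinyBloom` and `lookup` are hypothetical names. A negative answer from the filter is safe to trust, while a positive answer may be a false positive that only the exact comparison removes.

```python
import hashlib


class TinyBloom:
    """Very small Bloom filter: false positives possible, false negatives not."""

    def __init__(self, bits: int = 64, hashes: int = 3):
        self.bits = bits
        self.hashes = hashes
        self.mask = 0

    def _positions(self, key: str):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.mask |= 1 << p

    def might_contain(self, key: str) -> bool:
        return all((self.mask >> p) & 1 for p in self._positions(key))


def lookup(records, bloom, type_tag):
    if not bloom.might_contain(str(type_tag)):
        return []  # safe to prune: the filter never reports a false negative
    # Exact check: discards anything the filter let through by mistake.
    return [r for r in records if r["type_tag"] == type_tag]


records = [
    {"artifact_id": "a1", "type_tag": 42},
    {"artifact_id": "a2", "type_tag": 7},
]
bloom = TinyBloom()
for r in records:
    bloom.add(str(r["type_tag"]))
```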

### 3.3 Traversal Expansion (Optional)

For TGK edges:

1. Expand edges from a set of nodes
2. Apply **snapshot constraints** to prevent including edges outside S
3. Produce DAG projections or downstream artifact IDs

### 3.4 Projection and Aggregation

* Apply **projection fields** as requested
* Optionally aggregate or reduce results
* Maintain **deterministic order** by logseq ascending, then canonical key
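
The final stage can be sketched as a sort followed by either a projection or a reduction. The row field names `logseq` and `key` and the function name `finalize` are assumptions for this example, and only the `count` aggregation is shown.

```python
def finalize(rows, projections, aggregation=None):
    # Deterministic order: logseq ascending, then canonical key as tie-break.
    ordered = sorted(rows, key=lambda r: (r["logseq"], r["key"]))
    if aggregation == "count":
        return len(ordered)
    # Keep only the requested output columns.
    return [{field: r[field] for field in projections} for r in ordered]


rows = [
    {"logseq": 2, "key": "b", "type_tag": 42},
    {"logseq": 1, "key": "z", "type_tag": 42},
    {"logseq": 2, "key": "a", "type_tag": 42},
]
projected = finalize(rows, ["key"])
```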

---

## 4. Routing and SIMD Acceleration

* SIMD may evaluate **multiple routing keys in parallel**
* Routing keys are precomputed in ENC-ASL-TGK-INDEX optional sections
* Acceleration **cannot change semantics**
* Parallel scans **must be deterministic**: order of records in output = logseq + canonical key
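
One way to keep parallel scans deterministic is to sort each partial result and merge the partials by the same key, so the output no longer depends on how records were sharded or which worker finished first. The sketch below uses a thread pool as a stand-in for SIMD or sharded execution, which is a substitution made purely for illustration; `scan_shard`, `parallel_scan`, and the `(logseq, key, type_tag)` tuple layout are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def scan_shard(shard, type_tag):
    # Each record is a (logseq, key, type_tag) tuple; tuples sort by
    # logseq first, then canonical key.
    return sorted(r for r in shard if r[2] == type_tag)


def parallel_scan(shards, type_tag, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda s: scan_shard(s, type_tag), shards))
    # Merging sorted partials yields one globally ordered stream.
    return list(heapq.merge(*partials))


layout_a = [[(3, "c", 42), (1, "a", 42)], [(2, "b", 42), (2, "a", 7)]]
layout_b = [[(2, "b", 42)], [(2, "a", 7), (3, "c", 42)], [(1, "a", 42)]]
result_a = parallel_scan(layout_a, 42)
result_b = parallel_scan(layout_b, 42)
```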

---

## 5. Snapshot Semantics

* Segment is visible if `segment.logseq_min ≤ S`
* Record is visible if `record.logseq ≤ S`
* Tombstones shadow earlier records
* Visibility filtering must be deterministic: the same `S` always yields the same visible set
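
The rules above can be sketched as a "latest entry per key wins" pass over the log. This reading is an assumption: it treats a tombstone as shadowing earlier records with the same canonical key and lets a later non-tombstone record for that key become visible again, which the draft does not spell out. The `(logseq, key, is_tombstone)` tuple layout and the name `visible_records` are hypothetical.

```python
def visible_records(log, snapshot):
    """Return the live entries at `snapshot`, ordered by (logseq, key)."""
    latest = {}
    for logseq, key, is_tombstone in sorted(log):
        if logseq > snapshot:
            break  # everything after this point is beyond the cutoff
        latest[key] = (logseq, key, is_tombstone)  # later entry shadows earlier
    return sorted(entry for entry in latest.values() if not entry[2])


log = [
    (1, "a", False),
    (2, "b", False),
    (3, "a", True),   # tombstone for key "a"
    (5, "a", False),  # key "a" written again after the tombstone
]
at_2 = visible_records(log, 2)
at_3 = visible_records(log, 3)
at_5 = visible_records(log, 5)
```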

---

## 6. Traversal Semantics (TGK edges)

* Given a set of start nodes `N_start`:

  * Fetch edges with `from[] ∩ N_start ≠ ∅` (or `to[]` depending on direction)
  * Each edge expanded **once per logseq**
  * Expansion obeys snapshot S
  * Edge properties (type, role) used in filtering but not for identity
* Optional recursion depth `d` may be specified for DAG traversal
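
A depth-bounded forward expansion could look like the sketch below. It rests on assumptions: the `Edge` tuple and `traverse` function are hypothetical, only the `from[]` direction is handled, and "once per logseq" is read as "each edge is expanded at most once per query".

```python
from collections import namedtuple

# Hypothetical edge shape: identity plus from/to node ID tuples.
Edge = namedtuple("Edge", "edge_id logseq from_nodes to_nodes")


def traverse(edges, start_nodes, snapshot, depth):
    """Return IDs of edges reached within `depth` hops, in (logseq, id) order."""
    ordered = sorted(edges, key=lambda e: (e.logseq, e.edge_id))
    frontier = set(start_nodes)
    expanded = {}
    for _ in range(depth):
        next_frontier = set()
        for e in ordered:
            if e.logseq > snapshot or e.edge_id in expanded:
                continue  # outside the snapshot, or already expanded
            if frontier & set(e.from_nodes):
                expanded[e.edge_id] = e
                next_frontier.update(e.to_nodes)
        if not next_frontier:
            break
        frontier = next_frontier
    hits = sorted(expanded.values(), key=lambda e: (e.logseq, e.edge_id))
    return [e.edge_id for e in hits]


edges = [
    Edge("e1", 1, ("n1",), ("n2",)),
    Edge("e2", 2, ("n2",), ("n3",)),
    Edge("e3", 9, ("n1",), ("n4",)),
    Edge("e4", 3, ("n3",), ("n5",)),
]
two_hops = traverse(edges, ["n1"], snapshot=5, depth=2)
```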

---

## 7. Unified Query API (Conceptual)

```
result_set = query(
    snapshot=S,
    artifact_constraints={type_tag=42},
    edge_constraints={edge_type=7, role=FROM},
    start_nodes=[node1, node2],
    projections=[artifact_id, tgk_edge_id, node_id],
    traversal_depth=3,
    aggregation='count'
)
```

* Returns combined **artifact + TGK edge references**
* Traversal automatically expands TGK edges
* Aggregation and projection are deterministic

---

## 8. Determinism Guarantees

1. **Same snapshot + same constraints → identical results**
2. **Logseq ascending + canonical key tie-breaks**
3. Filters, shards, SIMD do **not affect result set**
4. Traversal expansion is deterministic per DAG rules

---

## 9. Garbage Collection Safety

* Records and edges **must not be removed** if they appear in snapshot `S` or any **reachable traversal**
* Optional: **tombstone records** prevent accidental resurrection

---

## 10. Federation Considerations (Non-normative)

* Domains may restrict **visibility of artifacts/edges**
* Queries may include **domain filters**
* Canonical IDs remain stable across domains

---

## 11. Summary

The unified query execution model:

* Provides a **single API over ASL + TGK references**
* Fully respects **snapshot determinism**
* Supports **DAG traversal over TGK edges**
* Leverages **ENC-ASL-TGK-INDEX acceleration**
* Guarantees **semantic consistency** without re-encoding core objects

---

Next step could be:

* **Formal execution plan specification**: physical operators, filter pushdown, traversal operators
* **C-struct definitions for query results**, mapping directly to ENC-ASL-TGK-INDEX

Do you want me to draft that next?