amduat-api/notes/Unified Query Execution Model (ASL + TGK).md
2026-01-17 00:19:49 +01:00

This is a **draft of a unified query execution model** over the merged ENC-ASL-TGK-INDEX. It treats ASL artifacts and TGK edges uniformly for queries, respects snapshots, and preserves determinism.

---
# Unified Query Execution Model (ASL + TGK)
---
## 1. Purpose
This model defines how **queries over ASL artifacts and TGK edges** are executed. Execution is:
* Snapshot-safe
* Deterministic per log sequence
* Able to leverage acceleration structures (filters, routing, SIMD)
* Able to support DAG program projections and trace graph traversals

It does **not** redefine core semantics:
* ENC-ASL-CORE defines artifacts
* ENC-TGK-CORE defines edges
* ENC-ASL-TGK-INDEX defines references and acceleration
---
## 2. Query Abstraction
A **query** Q is defined as:
```
Q = {
    snapshot: S,
    constraints: C,         // filters on artifacts, edges, or nodes
    projections: P,         // select returned fields
    traversal: optional,    // TGK edge expansion
    aggregation: optional   // count, union, etc.
}
```
* **snapshot**: the log sequence cutoff
* **constraints**: logical predicate over index fields (artifact type, edge type, node ID)
* **projections**: the output columns
* **traversal**: optional TGK graph expansion
* **aggregation**: optional summarization
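
As an illustration only, the query abstraction above could be carried by a small immutable record. This is a minimal sketch in Python. The class names `Query` and `Traversal`, the field types, and the use of a callable for `constraints` are assumptions made for the example, not part of any ENC specification.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Traversal:
    """Hypothetical holder for the optional TGK edge expansion."""
    start_nodes: Tuple[int, ...]
    max_depth: Optional[int] = None  # None means "no depth limit" (assumption)


@dataclass(frozen=True)
class Query:
    """Hypothetical in-memory form of Q; field names follow the note."""
    snapshot: int                          # log sequence cutoff S
    constraints: Callable[[dict], bool]    # predicate over index fields
    projections: Tuple[str, ...]           # output columns
    traversal: Optional[Traversal] = None  # optional TGK graph expansion
    aggregation: Optional[str] = None      # e.g. "count"


# Example: artifacts with type tag 42, as of log sequence 100.
q = Query(
    snapshot=100,
    constraints=lambda rec: rec.get("type_tag") == 42,
    projections=("artifact_id",),
)
```

Freezing the record keeps a query from being altered after planning has started.
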
---
## 3. Execution Stages
### 3.1 Index Scan
1. Determine **segments visible** for snapshot `S`
2. For each segment:
   * Use **filters** to eliminate segments/records (advisory)
   * Decode **ASL artifact references** and **TGK edge references**
   * Skip tombstoned or shadowed records
### 3.2 Constraint Evaluation
* Evaluate **canonical constraints**:
  * Artifact ID, type tag
  * Edge ID, edge type, role
  * Node ID (from/to)
* Filters are advisory; an exact check is still required for every candidate record
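
The "advisory filter, exact check" rule can be sketched as follows. `AdvisoryFilter` is a hypothetical toy stand-in for whatever filter structure the index provides. It may report false positives but never false negatives, which is the only property the sketch relies on. The record layout (`type_tag` field) is also an assumption.

```python
class AdvisoryFilter:
    """Toy filter: buckets keys modulo a small number (hypothetical)."""

    def __init__(self, keys, buckets=8):
        self.buckets = buckets
        self.bits = {k % buckets for k in keys}

    def might_contain(self, key):
        # True means "maybe present"; False means "definitely absent".
        return key % self.buckets in self.bits


def select_by_type(records, wanted_type_tag, flt):
    if not flt.might_contain(wanted_type_tag):
        return []  # safe to skip: the filter never gives false negatives
    # The filter only said "maybe", so every record is still checked exactly.
    return [r for r in records if r["type_tag"] == wanted_type_tag]


records = [{"id": 1, "type_tag": 42}, {"id": 2, "type_tag": 7}]
flt = AdvisoryFilter(r["type_tag"] for r in records)

hits = select_by_type(records, 42, flt)    # real match
false_positive = select_by_type(records, 50, flt)  # 50 collides with 42 in the toy filter
skipped = select_by_type(records, 3, flt)  # rejected by the filter alone
```

The collision case shows why the exact check cannot be dropped: the filter passes key 50, and only the predicate removes it.
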
### 3.3 Traversal Expansion (Optional)
For TGK edges:
1. Expand edges from a set of nodes
2. Apply **snapshot constraints** to prevent including edges outside S
3. Produce DAG projections or downstream artifact IDs
### 3.4 Projection and Aggregation
* Apply **projection fields** as requested
* Optionally aggregate or reduce results
* Maintain **deterministic order** by logseq ascending, then canonical key
---
## 4. Routing and SIMD Acceleration
* SIMD may evaluate **multiple routing keys in parallel**
* Routing keys are precomputed in ENC-ASL-TGK-INDEX optional sections
* Acceleration **cannot change semantics**
* Parallel scans **must be deterministic**: output records are ordered by logseq ascending, then canonical key
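
One way to keep a parallel scan deterministic is to let workers run in any order and impose the canonical order once, after merging. The sketch below uses a thread pool in place of SIMD or sharded execution. The `logseq` and `key` field names and the shard layout are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_shard(shard, predicate):
    """Exact predicate check over one shard (hypothetical shard = list of dicts)."""
    return [r for r in shard if predicate(r)]


def parallel_scan(shards, predicate):
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(lambda s: scan_shard(s, predicate), shards))
    merged = [r for part in parts for r in part]
    # Canonical output order: logseq ascending, then canonical key.
    merged.sort(key=lambda r: (r["logseq"], r["key"]))
    return merged


shards = [
    [{"logseq": 3, "key": "b"}, {"logseq": 1, "key": "z"}],
    [{"logseq": 3, "key": "a"}],
]
forward = parallel_scan(shards, lambda r: True)
backward = parallel_scan(list(reversed(shards)), lambda r: True)
```

Because the sort key is total over (logseq, key), the result does not depend on how records were split across shards.
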
---
## 5. Snapshot Semantics
* A segment is visible if `segment.logseq_min ≤ S`
* A record is visible if `record.logseq ≤ S`
* Tombstones shadow earlier records
* Filtering must be deterministic
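
The visibility rules can be sketched as a single function. Two assumptions are made here that the note does not state: a tombstone shadows only earlier records with the same `key`, and a record written after the latest visible tombstone is live again. The dictionary layout for segments and records is likewise hypothetical.

```python
def visible_records(segments, S):
    """Return live records as of snapshot S, in (logseq, key) order."""
    candidates = [
        r
        for seg in segments
        if seg["logseq_min"] <= S          # segment visibility
        for r in seg["records"]
        if r["logseq"] <= S                # record visibility
    ]
    # Latest visible tombstone per key (assumption: shadowing is per key).
    last_tomb = {}
    for r in candidates:
        if r["tombstone"]:
            last_tomb[r["key"]] = max(last_tomb.get(r["key"], -1), r["logseq"])
    live = [
        r for r in candidates
        if not r["tombstone"] and r["logseq"] > last_tomb.get(r["key"], -1)
    ]
    return sorted(live, key=lambda r: (r["logseq"], r["key"]))


segments = [
    {"logseq_min": 1, "records": [
        {"key": "a", "logseq": 1, "tombstone": False},
        {"key": "b", "logseq": 2, "tombstone": False},
    ]},
    {"logseq_min": 3, "records": [
        {"key": "a", "logseq": 3, "tombstone": True},
        {"key": "c", "logseq": 5, "tombstone": False},
    ]},
]
keys_at_2 = [r["key"] for r in visible_records(segments, 2)]
keys_at_3 = [r["key"] for r in visible_records(segments, 3)]
keys_at_5 = [r["key"] for r in visible_records(segments, 5)]
```

The same input and the same `S` always yield the same list, since the function reads nothing but its arguments.
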
---
## 6. Traversal Semantics (TGK edges)
* Given a set of start nodes `N_start`:
  * Fetch edges with `from[] ∩ N_start ≠ ∅` (or `to[]` depending on direction)
* Each edge is expanded **once per logseq**
* Expansion obeys snapshot `S`
* Edge properties (type, role) are used in filtering but not for identity
* Optional recursion depth `d` may be specified for DAG traversal
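
A bounded, snapshot-aware expansion might look like the sketch below. It follows edges in the `from` to `to` direction only, emits each edge at most once (one reading of "once per logseq", assumed here), and returns edge IDs in logseq then edge-ID order. The edge layout and the function name `traverse` are hypothetical.

```python
def traverse(edges, start_nodes, S, max_depth):
    """Expand TGK-style edges from start_nodes, up to max_depth hops, as of S."""
    visible = sorted(
        (e for e in edges if e["logseq"] <= S),   # expansion obeys snapshot S
        key=lambda e: (e["logseq"], e["edge_id"]),
    )
    frontier = set(start_nodes)
    seen_nodes = set(start_nodes)
    seen_edges = set()
    found = []
    for _ in range(max_depth):
        next_frontier = set()
        for e in visible:
            if e["edge_id"] in seen_edges or not frontier.intersection(e["from"]):
                continue
            seen_edges.add(e["edge_id"])          # each edge emitted once
            found.append(e)
            next_frontier.update(n for n in e["to"] if n not in seen_nodes)
        seen_nodes |= next_frontier
        frontier = next_frontier
        if not frontier:
            break
    found.sort(key=lambda e: (e["logseq"], e["edge_id"]))
    return [e["edge_id"] for e in found]


edges = [
    {"edge_id": "e1", "logseq": 1, "from": [1], "to": [2]},
    {"edge_id": "e2", "logseq": 2, "from": [2], "to": [3]},
    {"edge_id": "e3", "logseq": 9, "from": [3], "to": [4]},
    {"edge_id": "e4", "logseq": 3, "from": [1], "to": [3]},
]
within_snapshot = traverse(edges, [1], S=5, max_depth=3)
one_hop = traverse(edges, [1], S=5, max_depth=1)
later_snapshot = traverse(edges, [1], S=10, max_depth=3)
```

Edge `e3` has logseq 9, so it is excluded at `S=5` and appears only in the later snapshot.
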
---
## 7. Unified Query API (Conceptual)
```
result_set = query(
    snapshot=S,
    artifact_constraints={type_tag=42},
    edge_constraints={edge_type=7, role=FROM},
    start_nodes=[node1, node2],
    projections=[artifact_id, tgk_edge_id, node_id],
    traversal_depth=3,
    aggregation='count'
)
```
* Returns combined **artifact + TGK edge references**
* Traversal automatically expands TGK edges
* Aggregation and projection are deterministic
---
## 8. Determinism Guarantees
1. **Same snapshot + same constraints → identical results**
2. **Logseq ascending + canonical key tie-breaks**
3. Filters, shards, SIMD do **not affect result set**
4. Traversal expansion deterministic per DAG rules
---
## 9. Garbage Collection Safety
* Records and edges **must not be removed** if they appear in snapshot `S` or any **reachable traversal**
* Optional: **tombstone records** prevent accidental resurrection
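
As a rough sketch of the retention rule, a collector could refuse to drop any record that is still visible in some retained snapshot. The `shadowed_at` field (the logseq of the tombstone that shadows a record, or `None`) and the assumption that `retained_snapshots` always includes the current log head are both invented for this example. The sketch covers direct visibility only. Reachability through traversal would need the same check applied to the edges involved.

```python
def is_visible(record, S):
    """Visible at S if written at or before S and not yet shadowed at S."""
    shadowed_at = record.get("shadowed_at")
    return record["logseq"] <= S and (shadowed_at is None or shadowed_at > S)


def collectible(record, retained_snapshots):
    """A record may be removed only if no retained snapshot can see it."""
    return not any(is_visible(record, S) for S in retained_snapshots)


old = {"logseq": 2, "shadowed_at": 5}   # written at 2, tombstoned at 5
live = {"logseq": 4, "shadowed_at": None}

keep_old = collectible(old, [3, 9])     # snapshot 3 still sees it
drop_old = collectible(old, [6, 9])     # every retained snapshot is past the tombstone
keep_live = collectible(live, [6, 9])   # never shadowed, so never collectible
```
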
---
## 10. Federation Considerations (Non-normative)
* Domains may restrict **visibility of artifacts/edges**
* Queries may include **domain filters**
* Canonical IDs remain stable across domains
---
## 11. Summary
The unified query execution model:
* Provides **single API over ASL + TGK references**
* Fully respects **snapshot determinism**
* Supports **DAG traversal over TGK edges**
* Leverages **ENC-ASL-TGK-INDEX acceleration**
* Guarantees **semantic consistency** without re-encoding core objects
---
Possible next steps:
* **Formal execution plan specification**: physical operators, filter pushdown, traversal operators
* **C-struct definitions for query results**, mapping directly to ENC-ASL-TGK-INDEX