amduat-api/notes/into graph projection lookup.md
2026-01-17 00:19:49 +01:00

Good, this is the *right* next pressure point. You're extending the index/filter model from **artifact lookup** into **graph projection lookup**, and the danger here is accidentally inventing a second semantic index without noticing.
Let's be precise and keep the model clean.
---
## 1. First: what the TGK edge index really is
From what you said:
* TGK stores **edges** as *stored projections*
* These projections are derived from:
  * PEL execution
  * Receipts
  * Provenance / trace material
* Edges have an **edge type key**
* Queries will often be:
> “Find edges of type X related to artifact / execution Y”
This tells us immediately:
> **TGK edges are artifacts, but the edge *type* is not identity.**
That mirrors the artifact type tag discussion almost exactly.
---
## 2. Do NOT repeat the ArtifactKey mistake
The biggest risk here is to treat:
```
EdgeTypeKey ∈ identity
```
That would be a mistake.
### Correct stance
* **Edge identity** = (source, destination, execution context, snapshot)
* **Edge type** = classification / query dimension
So:
❌ EdgeTypeKey ≠ part of canonical edge identity
✅ EdgeTypeKey ∈ routing / filtering space
This symmetry is important.
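
To make the split concrete, here is a minimal sketch, assuming the identity tuple listed above and an integer edge type key. The class names `CanonicalEdgeKey` and `EdgeRecord`, the field types, and the dictionary store are illustrative assumptions, not part of any TGK spec.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CanonicalEdgeKey:
    """Identity fields only, following the tuple given above (hypothetical types)."""
    source: str
    destination: str
    execution_context: str
    snapshot: int


@dataclass(frozen=True)
class EdgeRecord:
    """Full stored edge: canonical identity plus classification."""
    key: CanonicalEdgeKey
    edge_type_key: Optional[int]  # query dimension, deliberately outside the key


# The store is keyed by canonical identity alone.
store: Dict[CanonicalEdgeKey, EdgeRecord] = {}

key = CanonicalEdgeKey("artifact:a", "artifact:b", "exec:7", snapshot=42)
store[key] = EdgeRecord(key, edge_type_key=0x10)

# Rebuilding the same identity finds the record without knowing its type.
same_key = CanonicalEdgeKey("artifact:a", "artifact:b", "exec:7", snapshot=42)
found = store[same_key]
```

Because the type lives on the record and not on the key, changing how edges are classified or filtered never changes which edge a key refers to.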
---
## 3. TGK edges fit the same Canonical vs Routing model
You now have *two* indexable entities:
| Entity | Canonical identity | Routing/filter dimensions |
| -------- | ------------------ | ------------------------------ |
| Artifact | ArtifactKey | hash, type_tag |
| TGK Edge | EdgeID / EdgeKey | edge_type_key, direction, role |
This is good news: **you do not need a new index theory**.
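
As a rough illustration of why one index theory is enough, the sketch below uses a single entry shape for both rows of the table. `IndexEntry`, `make_entry`, `select`, and the example key strings are hypothetical names chosen for this note, and the linear scan stands in for whatever accelerated filter the real index uses.

```python
from dataclasses import dataclass
from typing import Any, Hashable, List, Tuple


@dataclass(frozen=True)
class IndexEntry:
    canonical_key: Hashable               # ArtifactKey or EdgeKey
    routing: Tuple[Tuple[str, Any], ...]  # sorted (dimension, value) pairs


def make_entry(canonical_key: Hashable, **routing: Any) -> IndexEntry:
    """Build an entry; routing dimensions are stored in a deterministic order."""
    return IndexEntry(canonical_key, tuple(sorted(routing.items())))


def select(entries: List[IndexEntry], **wanted: Any) -> List[IndexEntry]:
    """Return entries whose routing dimensions match every requested value."""
    return [
        e for e in entries
        if all(dict(e.routing).get(name) == value for name, value in wanted.items())
    ]


entries = [
    make_entry("artifact:abc", type_tag=3),
    make_entry("edge:1", edge_type_key=7, direction="out"),
    make_entry("edge:2", edge_type_key=9, direction="out"),
]

edges_of_type_7 = select(entries, edge_type_key=7)
outgoing = select(entries, direction="out")
```

The same `select` serves artifacts and edges; only the names of the routing dimensions differ.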
---
## 4. Where the edge type key belongs (layering)
### TGK semantic layer (PEL-TRACE / TGK spec)
* Defines:
  * What an edge *means*
  * What edge types exist
* Edge type is **semantic metadata**, immutable once created
### TGK index layer (new or implicit)
* Edge type is:
  * Filterable
  * Routable
  * SIMD-friendly
  * NOT part of canonical identity
### Store / encoding
* Edge type key is encoded explicitly
* Included in routing/filter keys
* Verified by full edge record comparison
This mirrors ASL exactly.
---
## 5. Filter design for TGK edges
You should **reuse the same filter design philosophy**, not invent a new one.
### Recommended filter key for TGK
```
TGKFilterKey =
    H(CanonicalEdgeKey)
  ⊕ Mix(edge_type_key)
  ⊕ Mix(direction?)
Rules:
* `edge_type_key` is optional but usually present
* Absence must be encoded explicitly
* Mixing is deterministic
* Full edge record comparison required on hit
This allows:
* Fast “find all edges of type X”
* SIMD evaluation
* Shard pruning
* Snapshot-safe determinism
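
The filter key above can be sketched as follows. This is one possible reading, not a specified algorithm: it assumes `⊕` means XOR over 64-bit values, uses BLAKE2b from the standard library as a stand-in for both `H` and `Mix`, and invents the domain tags (`b"edge"`, `b"type"`, `b"dir"`) and the presence byte used to encode absence.

```python
import hashlib
from typing import Optional


def h64(data: bytes) -> int:
    """Deterministic 64-bit hash (BLAKE2b is an assumed stand-in for H)."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")


def mix(tag: bytes, value: Optional[int]) -> int:
    """Mix one routing dimension; absence gets its own explicit encoding."""
    if value is None:
        payload = tag + b"\x00"  # explicit "absent" marker
    else:
        payload = tag + b"\x01" + value.to_bytes(8, "little")
    return h64(payload)


def tgk_filter_key(
    canonical_edge_key: bytes,
    edge_type_key: Optional[int],
    direction: Optional[int] = None,
) -> int:
    """XOR-combine the hashed identity with the mixed routing dimensions."""
    return (
        h64(b"edge" + canonical_edge_key)
        ^ mix(b"type", edge_type_key)
        ^ mix(b"dir", direction)
    )


typed = tgk_filter_key(b"edge-key-bytes", edge_type_key=7)
untyped = tgk_filter_key(b"edge-key-bytes", edge_type_key=None)
type_zero = tgk_filter_key(b"edge-key-bytes", edge_type_key=0)
```

A missing type and a type of `0` hash different payloads, so they produce different filter keys. A filter hit is still only a candidate: the full edge record comparison remains the source of truth.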
---
## 6. Snapshot semantics (important)
Edges are projections, so:
* Edge existence is snapshot-relative
* Edge index entries must carry:
  * snapshot/log sequence
* Edge lookup semantics:
```
lookup(snapshot, edge_type, constraints)
```
This works naturally if:
* TGK index is append-only
* Edge projections are immutable
* The snapshot cut applies to the TGK index exactly as it does to the ASL index
👉 You get temporal graph queries “for free”.
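
A minimal sketch of that lookup, assuming the snapshot is identified by a log sequence number and the index is a simple in-memory append-only list. `EdgeEntry`, `TGKIndex`, and their fields are hypothetical; a real index would prune with filters and shards before scanning.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class EdgeEntry:
    log_seq: int        # position in the append-only log
    edge_type_key: int
    source: str
    destination: str


class TGKIndex:
    def __init__(self) -> None:
        self._log: List[EdgeEntry] = []

    def append(self, entry: EdgeEntry) -> None:
        """Append-only: log sequence numbers must strictly increase."""
        if self._log and entry.log_seq <= self._log[-1].log_seq:
            raise ValueError("log_seq must increase monotonically")
        self._log.append(entry)

    def lookup(
        self,
        snapshot: int,
        edge_type: int,
        constraint: Optional[Callable[[EdgeEntry], bool]] = None,
    ) -> List[EdgeEntry]:
        """Return edges of edge_type visible at the given snapshot cut."""
        visible = []
        for entry in self._log:
            if entry.log_seq > snapshot:
                break  # everything after the cut is invisible
            if entry.edge_type_key != edge_type:
                continue
            if constraint is None or constraint(entry):
                visible.append(entry)
        return visible


index = TGKIndex()
index.append(EdgeEntry(1, 7, "a", "b"))
index.append(EdgeEntry(2, 9, "a", "c"))
index.append(EdgeEntry(3, 7, "b", "c"))

at_snapshot_2 = index.lookup(snapshot=2, edge_type=7)
at_snapshot_3 = index.lookup(snapshot=3, edge_type=7)
from_b = index.lookup(snapshot=3, edge_type=7, constraint=lambda e: e.source == "b")
```

Because entries are immutable and ordered by log sequence, repeating a lookup at the same snapshot always returns the same edges, even after later appends.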
---
## 7. Do we need new documents?
Yes — but again, **one small, well-scoped one**.
---
## 8. Updated document plan
### 🔁 Documents to UPDATE
#### ASL-INDEX-ACCEL
Add:
* Statement that **routing/filter dimensions generalize beyond artifacts**
* Examples:
  * Artifact type tag
  * TGK edge type key
This keeps all acceleration logic centralized.
---
### 🆕 New document (recommended)
#### TGK-INDEX (or PEL-TRACE-INDEX)
**Purpose**
* Define index semantics for Trace Graph Kernel projections
**Contents**
1. Canonical edge identity
2. Edge type key (non-identity)
3. Snapshot-relative edge visibility
4. Filter and routing rules
5. Determinism guarantees
6. Relationship to ASL index
Key sentence it must contain:
> Edge type keys participate in routing and filtering but never in canonical edge identity.
This avoids graph-index drift.
---
## 9. What you should NOT do
* ❌ Do not bake edge type into edge identity
* ❌ Do not invent a separate filtering philosophy
* ❌ Do not allow edge filters to affect semantics
* ❌ Do not allow mutable edge metadata
---
## 10. Big picture (important insight)
You are converging on a **unified indexing philosophy**:
> *Everything has a canonical identity.
> Everything else is routing, filtering, or acceleration.*
* Artifacts
* Edges
* Receipts
* Projections

All follow the same rule.
That's a very strong model, and it scales.
---
## 11. Recommendation for next step
The cleanest next step would be to:
* Draft **ASL-INDEX-ACCEL** formally, *and include both Artifact type tags and TGK edge type keys as first-class examples*
That document becomes the **performance constitution** of the system.
If you want, I can draft that next.