amduat-api/notes/Federation and Cross-Node Deterministic Replay Specification.md
2026-01-17 00:19:49 +01:00

Here's a formal draft for **federation propagation rules and cross-node deterministic replay** for ASL + TGK + PEL. It is intended to ensure deterministic, snapshot-safe execution across multiple nodes while preserving artifact provenance and DAG integrity.
---
# Federation and Cross-Node Deterministic Replay Specification
---
## 1. Purpose
This specification defines:
1. How **artifacts, PERs, and TGK edges** are propagated across federation nodes.
2. How **deterministic replay** is guaranteed on remote nodes.
3. How **snapshots, shards, and logs** are used to synchronize state.
4. Rules for **conflict resolution, tombstone handling, and provenance integrity**.
---
## 2. Key Concepts
* **Node**: Independent system with access to ASL/TGK storage and PEL execution environment.
* **Federation**: Set of nodes that share artifacts, execution receipts, and TGK edges.
* **ArtifactKey**: Canonical identifier for artifacts or PERs.
* **SnapshotID**: Unique identifier of a ZFS snapshot (per pool or globally assigned).
* **Log Sequence (logseq)**: Monotonic sequence ensuring ordering for deterministic replay.
* **Execution Receipt (PER)**: Artifact describing the deterministic output of a PEL program.
---
## 3. Propagation Rules
### 3.1 Artifact & PER Propagation
1. **New artifacts or PERs** are assigned a **global canonical ArtifactKey**.
2. Each node maintains a **local shard mapping**; shard boundaries may differ per node.
3. Artifacts are propagated via **snapshot-delta sync**:
   * Only artifacts with **logseq > last replicated logseq** are transmitted.
   * Each artifact includes:
     * `ArtifactKey`
     * `logseq`
     * `type_tag` (optional)
     * Payload checksum (hash)
4. PER artifacts are treated the same as raw artifacts but may include additional **PEL DAG metadata**.
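
The delta selection in step 3 can be sketched as a simple filter over the local write log. This is a minimal illustration, not the actual wire format: the struct name `artifact_header_t`, the function `select_delta`, and the 64-bit widths chosen for the key and hash are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-artifact header carrying the fields listed above. */
typedef struct {
    uint64_t artifact_key;  /* assumed 64-bit; the real ArtifactKey may be wider */
    uint64_t logseq;
    uint32_t type_tag;      /* optional field; 0 is used here to mean "absent" */
    uint64_t payload_hash;  /* payload checksum, width assumed */
} artifact_header_t;

/* Copy into `out` every header with logseq > last_replicated, preserving
 * the input order. Returns the number of headers copied (at most out_cap). */
size_t select_delta(const artifact_header_t *log, size_t n,
                    uint64_t last_replicated,
                    artifact_header_t *out, size_t out_cap) {
    assert(log != NULL || n == 0);
    size_t k = 0;
    for (size_t i = 0; i < n && k < out_cap; i++) {
        if (log[i].logseq > last_replicated) {
            out[k++] = log[i];
        }
    }
    return k;
}
```

A sender would call `select_delta` with the peer's last replicated logseq and transmit only the returned headers and their payloads.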
---
### 3.2 TGK Edge Propagation
1. TGK edges reference canonical ArtifactKeys and NodeIDs.
2. Each edge includes:
   * From nodes list
   * To nodes list
   * Edge type key
   * Roles (from/to/both)
   * logseq
3. Edges are propagated **incrementally**, respecting snapshot boundaries.
4. Deterministic ordering:
   * Edges sorted by `(logseq, canonical_edge_id)` on transmit
   * Replay nodes consume edges in the same order
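
The `(logseq, canonical_edge_id)` ordering can be expressed as a `qsort` comparator. The sketch below assumes both fields fit in 64 bits and uses a hypothetical `edge_ref_t` that carries only the two sort keys; a real edge record would hold the other fields listed above as well.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reduced edge record: only the fields used for ordering. */
typedef struct {
    uint64_t logseq;
    uint64_t canonical_edge_id;  /* assumed 64-bit */
} edge_ref_t;

/* Compare by logseq first, then by canonical_edge_id as tie-breaker. */
static int edge_cmp(const void *a, const void *b) {
    const edge_ref_t *x = a;
    const edge_ref_t *y = b;
    if (x->logseq != y->logseq)
        return x->logseq < y->logseq ? -1 : 1;
    if (x->canonical_edge_id != y->canonical_edge_id)
        return x->canonical_edge_id < y->canonical_edge_id ? -1 : 1;
    return 0;
}

/* Sort edges into the transmit order described above. */
void sort_edges_for_transmit(edge_ref_t *edges, size_t n) {
    assert(edges != NULL || n == 0);
    if (n > 1) {
        qsort(edges, n, sizeof *edges, edge_cmp);
    }
}
```

`qsort` is not guaranteed to be stable, so the result is fully deterministic only when no two edges share both keys, which is what a canonical edge ID is presumably meant to ensure.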
---
### 3.3 Snapshot and Log Management
* Each node maintains:
  1. **Last applied snapshot** per federation peer
  2. **Sequential write log** for artifacts and edges
* Replay on a remote node:
  1. Apply artifacts and edges sequentially from log
  2. Only apply artifacts **≤ target snapshot**
  3. Merge multiple logs deterministically via `(logseq, canonical_id)` tie-breaker
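
As an illustration of the merge step, the sketch below merges two peer logs that are each already sorted by `(logseq, canonical_id)` and drops entries beyond the target snapshot. The names `log_entry_t` and `merge_logs`, the 64-bit ID width, and the use of a single `logseq_max` bound to represent the snapshot are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced log entry: only the ordering keys. */
typedef struct {
    uint64_t logseq;
    uint64_t canonical_id;  /* assumed 64-bit */
} log_entry_t;

/* Strict ordering on (logseq, canonical_id). */
static int entry_less(const log_entry_t *x, const log_entry_t *y) {
    if (x->logseq != y->logseq) return x->logseq < y->logseq;
    return x->canonical_id < y->canonical_id;
}

/* Merge two sorted logs into `out` (capacity >= na + nb), keeping only
 * entries with logseq <= logseq_max. Returns the number of entries kept. */
size_t merge_logs(const log_entry_t *a, size_t na,
                  const log_entry_t *b, size_t nb,
                  uint64_t logseq_max, log_entry_t *out) {
    assert(out != NULL);
    size_t i = 0, j = 0, k = 0;
    while (i < na || j < nb) {
        const log_entry_t *next;
        /* Take from `a` unless `b` has a strictly smaller head entry. */
        if (j >= nb || (i < na && !entry_less(&b[j], &a[i]))) {
            next = &a[i++];
        } else {
            next = &b[j++];
        }
        if (next->logseq <= logseq_max) {
            out[k++] = *next;
        }
    }
    return k;
}
```

Entries that appear in both logs are emitted twice here; deduplication is left to the conflict-resolution rules.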
---
## 4. Conflict Resolution
1. **ArtifactKey collisions**:
   * If hash matches existing artifact → discard duplicate
   * If hash differs → flag conflict, require manual reconciliation or automated deterministic resolution
2. **TGK edge conflicts**:
   * Multiple edges with same `from/to/type` but different logseq → pick latest ≤ snapshot
   * Shadowed edges handled via **TombstoneShadow operator**
3. **PER replay conflicts**:
   * Identical PEL DAG + identical inputs → skip execution
   * Divergent inputs → log error, optionally recompute
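
The first two rules can be sketched as small pure functions. This is an illustrative sketch only: `HASH_LEN`, `ingest_result_t`, `classify_incoming`, and `pick_visible_edge` are hypothetical names, and the 32-byte digest size is an assumption rather than something this specification fixes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HASH_LEN 32  /* assumed digest size in bytes */

typedef enum {
    INGEST_NEW,        /* key not present locally: store the artifact */
    INGEST_DUPLICATE,  /* same key, same hash: discard */
    INGEST_CONFLICT    /* same key, different hash: flag for reconciliation */
} ingest_result_t;

/* Classify an incoming artifact against the locally stored hash for the
 * same ArtifactKey. Pass existing_hash == NULL if the key is unknown. */
ingest_result_t classify_incoming(const uint8_t *existing_hash,
                                  const uint8_t *incoming_hash) {
    assert(incoming_hash != NULL);
    if (existing_hash == NULL) return INGEST_NEW;
    return memcmp(existing_hash, incoming_hash, HASH_LEN) == 0
               ? INGEST_DUPLICATE
               : INGEST_CONFLICT;
}

/* Among edges sharing from/to/type, return the index of the one with the
 * largest logseq <= snapshot_max, or -1 if none is visible at the snapshot. */
int pick_visible_edge(const uint64_t *logseqs, int n, uint64_t snapshot_max) {
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (logseqs[i] <= snapshot_max &&
            (best < 0 || logseqs[i] > logseqs[best])) {
            best = i;
        }
    }
    return best;
}
```

Both functions depend only on their arguments, so every node evaluating the same inputs reaches the same decision.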
---
## 5. Deterministic Replay Algorithm
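```c
#include <stdint.h>
#include <stdlib.h>

// record_t, log_buffer_t, snapshot_range_t and the Apply* functions are
// assumed to be declared elsewhere.
// qsort comparator: orders records by logseq, then by canonical ID.
int by_logseq_then_canonical_id(const void *a, const void *b);

void FederationReplay(log_buffer_t *incoming_log, snapshot_range_t target_snapshot) {
    // Sort incoming log deterministically
    qsort(incoming_log->records, incoming_log->count, sizeof(record_t),
          by_logseq_then_canonical_id);
    for (uint64_t i = 0; i < incoming_log->count; i++) {
        record_t rec = incoming_log->records[i];
        // Skip records beyond target snapshot
        if (rec.logseq > target_snapshot.logseq_max) continue;
        // Apply artifact or TGK edge
        if (rec.type == ARTIFACT || rec.type == PER) {
            ApplyArtifact(rec);
        } else if (rec.type == TGK_EDGE) {
            ApplyTGKEdge(rec);
        }
        // Shadow tombstones deterministically
        if (rec.is_tombstone) {
            ApplyTombstone(rec.canonical_id, rec.logseq);
        }
    }
}
```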
* Guarantees **deterministic replay** across nodes.
* Uses **logseq + canonical ID ordering** for tie-breaking.
---
## 6. Shard-Local Execution
* After federation sync, **local shards** may differ.
* Execution plan operators (SegmentScan, IndexFilter, TGKTraversal) operate **on local shards**.
* Global determinism maintained by:
  * Deterministic merge of shards
  * Snapshot constraints
  * Canonical ordering of artifacts and edges
---
## 7. Provenance and Audit
* Each node maintains:
  * **Snapshot provenance table**: snapshot ID → list of applied artifacts/PERs
  * **Federation log table**: peer node → last applied logseq
* Deterministic execution allows **replay and auditing**:
  * Verify that `final_output` is identical across nodes
  * Provenance tables ensure **full traceability**
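
One way to picture the federation log table is as a small map from peer to a monotonically advancing watermark. The sketch below is a minimal in-memory version; the fixed capacity `MAX_PEERS`, the 64-bit peer ID, and the function names are assumptions for illustration, and a real table would presumably be persisted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PEERS 16  /* arbitrary capacity for the sketch */

typedef struct {
    uint64_t peer_id;              /* assumed 64-bit node identifier */
    uint64_t last_applied_logseq;
} peer_entry_t;

typedef struct {
    peer_entry_t entries[MAX_PEERS];
    size_t count;
} federation_log_table_t;

/* Record that records up to `logseq` from `peer_id` have been applied.
 * The watermark never moves backward. Returns 0 on success, -1 if full. */
int fed_table_advance(federation_log_table_t *t, uint64_t peer_id,
                      uint64_t logseq) {
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].peer_id == peer_id) {
            if (logseq > t->entries[i].last_applied_logseq) {
                t->entries[i].last_applied_logseq = logseq;
            }
            return 0;
        }
    }
    if (t->count == MAX_PEERS) return -1;
    t->entries[t->count].peer_id = peer_id;
    t->entries[t->count].last_applied_logseq = logseq;
    t->count++;
    return 0;
}

/* Look up the watermark for a peer; 0 means nothing applied yet. */
uint64_t fed_table_last_applied(const federation_log_table_t *t,
                                uint64_t peer_id) {
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].peer_id == peer_id) {
            return t->entries[i].last_applied_logseq;
        }
    }
    return 0;
}
```

The value returned by `fed_table_last_applied` is the natural lower bound to use when requesting the next snapshot delta from that peer.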
---
## 8. Multi-Node DAG Execution
1. PEL programs may span **multiple nodes**:
   * Inputs and intermediate PERs propagated deterministically
   * DAG nodes executed locally when all inputs are available
2. Determinism guaranteed because:
   * Inputs constrained by snapshot + logseq
   * Operators are deterministic
   * Merge, shadowing, and projection preserve canonical ordering
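
The "execute when all inputs are available" rule can be sketched as a readiness check. The names `dag_node_t` and `dag_node_ready` are hypothetical, ArtifactKeys are again assumed to be 64-bit, and the linear scan stands in for whatever index lookup a real node would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical DAG node: just the ArtifactKeys it consumes. */
typedef struct {
    const uint64_t *input_keys;
    size_t input_count;
} dag_node_t;

/* Linear membership test over the locally available keys. */
static bool key_present(const uint64_t *available, size_t n, uint64_t key) {
    for (size_t i = 0; i < n; i++) {
        if (available[i] == key) return true;
    }
    return false;
}

/* A node is ready once every input key is locally available, that is,
 * already replicated and visible at the target snapshot. */
bool dag_node_ready(const dag_node_t *node,
                    const uint64_t *available, size_t n_available) {
    assert(node != NULL);
    for (size_t i = 0; i < node->input_count; i++) {
        if (!key_present(available, n_available, node->input_keys[i])) {
            return false;
        }
    }
    return true;
}
```

Because readiness depends only on which keys are visible at the snapshot, two nodes with the same replicated state agree on which DAG nodes may run.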
---
## 9. Summary
Federation and cross-node deterministic replay:
* Uses **logseq + canonical IDs** for deterministic ordering
* Supports **PER and TGK artifacts** across nodes
* Enforces **snapshot constraints**
* Enables **federated PEL program execution**
* Preserves **provenance, tombstones, and deterministic DAG evaluation**
* Compatible with SIMD/shard acceleration and ENC-ASL-TGK-INDEX memory layout
---
A possible next step is **drafting a formal overall architecture diagram** showing:
* PEL programs
* ASL/TGK storage
* Execution plan operators
* Shard/SIMD execution
* Federation propagation and replay paths