This is a formal draft of the **federation propagation rules and cross-node deterministic replay** for ASL + TGK + PEL. It ensures deterministic, snapshot-safe execution across multiple nodes while preserving artifact provenance and DAG integrity.
---
# Federation and Cross-Node Deterministic Replay Specification
---
## 1. Purpose
This specification defines:
1. How **artifacts, PERs, and TGK edges** are propagated across federation nodes.
2. How **deterministic replay** is guaranteed on remote nodes.
3. How **snapshots, shards, and logs** are used to synchronize state.
4. Rules for **conflict resolution, tombstone handling, and provenance integrity**.
---
## 2. Key Concepts
* **Node**: Independent system with access to ASL/TGK storage and PEL execution environment.
* **Federation**: Set of nodes that share artifacts, execution receipts, and TGK edges.
* **ArtifactKey**: Canonical identifier for artifacts or PERs.
* **SnapshotID**: Unique identifier of a ZFS snapshot (per pool or globally assigned).
* **Log Sequence (logseq)**: Monotonic sequence ensuring ordering for deterministic replay.
* **Execution Receipt (PER)**: Artifact describing the deterministic output of a PEL program.
---
## 3. Propagation Rules
### 3.1 Artifact & PER Propagation
1. **New artifacts or PERs** are assigned a **global canonical ArtifactKey**.
2. Each node maintains a **local shard mapping**; shard boundaries may differ per node.
3. Artifacts are propagated via **snapshot-delta sync**:
* Only artifacts **logseq > last replicated logseq** are transmitted.
* Each artifact includes:
* `ArtifactKey`
* `logseq`
* `type_tag` (optional)
* Payload checksum (hash)
4. PER artifacts are treated the same as raw artifacts but may include additional **PEL DAG metadata**.
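The snapshot-delta sync step can be sketched as a simple logseq filter. The record layout below is illustrative only (`artifact_record_t`, `select_delta`, and the 32-byte checksum width are assumptions, not part of the spec); it shows how a node selects only artifacts newer than the peer's last replicated logseq.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical wire record for artifact/PER propagation; field names
 * are illustrative, not normative. */
typedef struct {
    uint64_t artifact_key;   /* canonical ArtifactKey */
    uint64_t logseq;         /* monotonic log sequence */
    uint32_t type_tag;       /* optional type tag; 0 = untyped */
    uint8_t  checksum[32];   /* payload hash, e.g. SHA-256 */
} artifact_record_t;

/* Snapshot-delta sync: copy into `out` only records with
 * logseq > last_replicated_logseq. Returns the number selected. */
size_t select_delta(const artifact_record_t *records, size_t n,
                    uint64_t last_replicated_logseq,
                    artifact_record_t *out)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        if (records[i].logseq > last_replicated_logseq)
            out[k++] = records[i];
    }
    return k;
}
```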
---
### 3.2 TGK Edge Propagation
1. TGK edges reference canonical ArtifactKeys and NodeIDs.
2. Each edge includes:
* From nodes list
* To nodes list
* Edge type key
* Roles (from/to/both)
* logseq
3. Edges are propagated **incrementally**, respecting snapshot boundaries.
4. Deterministic ordering:
* Edges sorted by `(logseq, canonical_edge_id)` on transmit
* Replay nodes consume edges in the same order
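The deterministic transmit order above reduces to a single comparator. A minimal sketch (the `tgk_edge_t` layout and `cmp_edge_order` name are assumptions): any node that sorts with this comparator consumes edges in an identical sequence.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical TGK edge record; fields illustrative only. */
typedef struct {
    uint64_t canonical_edge_id;
    uint64_t logseq;
} tgk_edge_t;

/* qsort comparator implementing the (logseq, canonical_edge_id)
 * transmit order used for deterministic replay. */
int cmp_edge_order(const void *a, const void *b)
{
    const tgk_edge_t *x = (const tgk_edge_t *)a;
    const tgk_edge_t *y = (const tgk_edge_t *)b;
    if (x->logseq != y->logseq)
        return (x->logseq < y->logseq) ? -1 : 1;
    if (x->canonical_edge_id != y->canonical_edge_id)
        return (x->canonical_edge_id < y->canonical_edge_id) ? -1 : 1;
    return 0;
}
```

Comparing field-by-field (rather than subtracting `uint64_t` values) avoids wraparound and keeps the ordering total and stable across platforms.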
---
### 3.3 Snapshot and Log Management
* Each node maintains:
1. **Last applied snapshot** per federation peer
2. **Sequential write log** for artifacts and edges
* Replay on a remote node:
1. Apply artifacts and edges sequentially from log
2. Only apply artifacts **≤ target snapshot**
3. Merge multiple logs deterministically via `(logseq, canonical_id)` tie-breaker
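The deterministic log merge with the `(logseq, canonical_id)` tie-breaker and the snapshot cap can be sketched as a two-way merge. Names (`log_rec_t`, `merge_logs`) are assumptions; both inputs are assumed already sorted in canonical order.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint64_t canonical_id;
    uint64_t logseq;
} log_rec_t;

/* True if a orders strictly before b under (logseq, canonical_id). */
static int rec_before(const log_rec_t *a, const log_rec_t *b)
{
    if (a->logseq != b->logseq) return a->logseq < b->logseq;
    return a->canonical_id < b->canonical_id;
}

/* Deterministic two-way merge of peer logs, dropping records past the
 * target snapshot's logseq bound. Returns the merged record count. */
size_t merge_logs(const log_rec_t *a, size_t na,
                  const log_rec_t *b, size_t nb,
                  uint64_t logseq_max, log_rec_t *out)
{
    size_t i = 0, j = 0, k = 0;
    while (i < na || j < nb) {
        const log_rec_t *next;
        if (j >= nb || (i < na && rec_before(&a[i], &b[j])))
            next = &a[i++];
        else
            next = &b[j++];
        if (next->logseq <= logseq_max)   /* snapshot constraint */
            out[k++] = *next;
    }
    return k;
}
```

Merging more than two peer logs generalizes to a k-way merge over the same ordering; determinism only requires that every node use the identical tie-breaker.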
---
## 4. Conflict Resolution
1. **ArtifactKey collisions**:
* If hash matches existing artifact → discard duplicate
* If hash differs → flag conflict, require manual reconciliation or automated deterministic resolution
2. **TGK edge conflicts**:
* Multiple edges with same `from/to/type` but different logseq → pick latest ≤ snapshot
* Shadowed edges handled via **TombstoneShadow operator**
3. **PER replay conflicts**:
* Identical PEL DAG + identical inputs → skip execution
* Divergent inputs → log error, optionally recompute
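The ArtifactKey collision rule in 4.1 amounts to a three-way decision on (key, hash). A minimal sketch, with hypothetical names (`resolution_t`, `resolve_artifact`) and an assumed 32-byte hash:

```c
#include <stdint.h>
#include <string.h>

typedef enum {
    RESOLVE_APPLY_NEW,          /* no collision: apply incoming artifact */
    RESOLVE_DISCARD_DUPLICATE,  /* same key, same hash: discard */
    RESOLVE_FLAG_CONFLICT       /* same key, different hash: reconcile */
} resolution_t;

/* Collision check for an incoming artifact against a local entry
 * with the same slot; field widths are illustrative. */
resolution_t resolve_artifact(uint64_t existing_key,
                              const uint8_t existing_hash[32],
                              uint64_t incoming_key,
                              const uint8_t incoming_hash[32])
{
    if (existing_key != incoming_key)
        return RESOLVE_APPLY_NEW;
    if (memcmp(existing_hash, incoming_hash, 32) == 0)
        return RESOLVE_DISCARD_DUPLICATE;
    return RESOLVE_FLAG_CONFLICT;
}
```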
---
## 5. Deterministic Replay Algorithm
```c
void FederationReplay(log_buffer_t *incoming_log, snapshot_range_t target_snapshot) {
    /* Sort the incoming log deterministically by (logseq, canonical_id). */
    qsort(incoming_log->records, incoming_log->count, sizeof(record_t),
          by_logseq_then_canonical_id);

    for (uint64_t i = 0; i < incoming_log->count; i++) {
        record_t rec = incoming_log->records[i];

        /* Skip records beyond the target snapshot. */
        if (rec.logseq > target_snapshot.logseq_max)
            continue;

        /* Apply artifact/PER payloads and TGK edges. */
        if (rec.type == ARTIFACT || rec.type == PER) {
            ApplyArtifact(rec);
        } else if (rec.type == TGK_EDGE) {
            ApplyTGKEdge(rec);
        }

        /* Shadow tombstones deterministically. */
        if (rec.is_tombstone) {
            ApplyTombstone(rec.canonical_id, rec.logseq);
        }
    }
}
```
* Guarantees **deterministic replay** across nodes.
* Uses **logseq + canonical ID ordering** for tie-breaking.
---
## 6. Shard-Local Execution
* After federation sync, **local shards** may differ.
* Execution plan operators (SegmentScan, IndexFilter, TGKTraversal) operate **on local shards**.
* Global determinism maintained by:
* Deterministic merge of shards
* Snapshot constraints
* Canonical ordering of artifacts and edges
---
## 7. Provenance and Audit
* Each node maintains:
* **Snapshot provenance table**: snapshot ID → list of applied artifacts/PERs
* **Federation log table**: peer node → last applied logseq
* Deterministic execution allows **replay and auditing**:
* Verify that `final_output` is identical across nodes
* Provenance tables ensure **full traceability**
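The audit check that `final_output` is identical across nodes can be sketched as a digest comparison over peer-reported hashes. The function name and 32-byte digest width are assumptions:

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Audit sketch: given the final_output digest reported by each
 * federation peer, verify all peers agree. Returns the index of the
 * first divergent peer, or -1 if every digest matches peer 0's. */
ptrdiff_t audit_final_output(const uint8_t digests[][32], size_t n_peers)
{
    for (size_t i = 1; i < n_peers; i++) {
        if (memcmp(digests[0], digests[i], 32) != 0)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

A divergent index would then be cross-referenced against that peer's snapshot provenance table to locate the first artifact or edge where replay diverged.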
---
## 8. Multi-Node DAG Execution
1. PEL programs may span **multiple nodes**:
* Inputs and intermediate PERs propagated deterministically
* DAG nodes executed locally when all inputs are available
2. Determinism guaranteed because:
* Inputs constrained by snapshot + logseq
* Operators are deterministic
* Merge, shadowing, and projection preserve canonical ordering
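The "executed locally when all inputs are available" rule can be sketched as a readiness check against the applied-artifact log. Everything here is illustrative (`dag_node_ready`, and the toy ArtifactKey → applied-logseq lookup table, where 0 means not yet applied):

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* A PEL DAG node may execute locally once every input ArtifactKey has
 * been applied at or below the node's snapshot bound.
 * `applied_logseq[key]` maps ArtifactKey -> logseq of application
 * (0 = not yet applied); a real system would use a proper index. */
bool dag_node_ready(const uint64_t *input_keys, size_t n_inputs,
                    const uint64_t *applied_logseq,
                    uint64_t snapshot_logseq_max)
{
    for (size_t i = 0; i < n_inputs; i++) {
        uint64_t seq = applied_logseq[input_keys[i]];
        if (seq == 0 || seq > snapshot_logseq_max)
            return false;  /* input missing or beyond snapshot bound */
    }
    return true;
}
```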
---
## 9. Summary
Federation and cross-node deterministic replay:
* Uses **logseq + canonical IDs** for deterministic ordering
* Supports **PER and TGK artifacts** across nodes
* Enforces **snapshot constraints**
* Enables **federated PEL program execution**
* Preserves **provenance, tombstones, and deterministic DAG evaluation**
* Compatible with SIMD/shard acceleration and ENC-ASL-TGK-INDEX memory layout
---
A natural next step is **a formal overall architecture diagram** showing:
* PEL programs
* ASL/TGK storage
* Execution plan operators
* Shard/SIMD execution
* Federation propagation and replay paths