NOTE: Superseded by tier1 TGK/1 and vendor/amduat/tier1/tgk-1-core.md; retained for historical context.
# Unified ASL + TGK + PEL System Specification (Master Reference)
---
NOTE: Integrated into `tier1/asl-system-1.md`. This note is retained for
historical context and may drift.
## 1. Introduction
This document specifies a unified system for deterministic, federated, snapshot-safe storage and execution of artifacts, execution receipts (PERs), and TGK edges. The system integrates:
* **ASL (Artifact Storage Layer)**
* **TGK (Trace Graph Kernel)**
* **PEL (Program Execution Layer)**
* **Indexing, Shard/SIMD acceleration**
* **Federation and deterministic replay**
The system is designed to scale to **billions of artifacts and edges** while providing deterministic DAG execution and cross-node provenance.
---
## 2. Core Concepts
| Concept | Description |
| ------------ | ------------------------------------------------------------------------------------------------------------- |
| Artifact | Basic unit stored in ASL; may carry an optional `type_tag`, whose presence is signaled by `has_type_tag`. |
| PER | PEL Execution Receipt; artifact describing deterministic output of a PEL program. |
| TGK Edge | Represents a directed relation between artifacts/PERs. Stores `from_nodes`, `to_nodes`, `edge_type`, `roles`. |
| Snapshot | ZFS snapshot that defines read visibility and the deterministic execution boundary. |
| Logseq | Monotonic sequence number for deterministic ordering. |
| Shard | Subset of artifacts/edges partitioned for SIMD/parallel execution. |
| Canonical ID | Unique identifier per artifact, PER, or TGK edge. |
---
## 3. ASL-CORE & ASL-STORE-INDEX
### 3.1 ASL-CORE
* Defines **artifact semantics**:
  * Optional `type_tag` (32-bit) with `has_type_tag` (8-bit toggle)
  * Artifacts are immutable once written
  * PERs are treated as artifacts
### 3.2 ASL-STORE-INDEX
* Manages **artifact blocks**, including:
  * Small vs. large blocks (packaging)
  * Block sealing, retention, snapshot safety
* Index structure:
  * **Shard-local**, supports **billion-scale lookups**
  * Bloom filters for quick membership queries
  * Sharding and SIMD acceleration for memory-efficient lookups
* Record layout (C struct):
```c
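#include <stdint.h>

typedef struct {
    uint64_t artifact_key;   /* key identifying the artifact */
    uint64_t block_id;       /* block that holds the artifact */
    uint32_t offset;         /* offset of the artifact within the block */
    uint32_t length;         /* length of the artifact data */
    uint32_t type_tag;       /* meaningful only when has_type_tag != 0 */
    uint8_t  has_type_tag;   /* 8-bit toggle: non-zero if type_tag is set */
} artifact_index_entry_t;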
```
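
As an illustration of how a shard-local lookup over these records might work, here is a minimal sketch. It assumes the entries of a shard are kept sorted by `artifact_key`, which this note does not state; `index_lookup` and `entry_has_type` are hypothetical helper names, not part of ASL-STORE-INDEX.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t artifact_key;
    uint64_t block_id;
    uint32_t offset;
    uint32_t length;
    uint32_t type_tag;
    uint8_t  has_type_tag;
} artifact_index_entry_t;

/* Binary search over a shard-local array that is ASSUMED to be sorted by
 * artifact_key. Returns the matching entry, or NULL if the key is absent. */
static const artifact_index_entry_t *
index_lookup(const artifact_index_entry_t *entries, size_t count, uint64_t key)
{
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (entries[mid].artifact_key < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo < count && entries[lo].artifact_key == key)
        return &entries[lo];
    return NULL;
}

/* type_tag is only compared when has_type_tag is non-zero, so an untagged
 * artifact never matches, even if its type_tag field happens to be equal. */
static int entry_has_type(const artifact_index_entry_t *e, uint32_t type_tag)
{
    return e->has_type_tag != 0 && e->type_tag == type_tag;
}
```

A Bloom filter, where present, would sit in front of `index_lookup` to reject most absent keys before the search runs.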
---
## 4. ENC-ASL-TGK-INDEX
* Defines **encoding for artifacts, PERs, and TGK edges** in storage.
* TGK edges stored as:
```c
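#include <stdint.h>

/* MAX_FROM and MAX_TO are capacity constants not defined in this note. */
typedef struct {
    uint64_t canonical_edge_id;     /* canonical ID of this edge */
    uint64_t from_nodes[MAX_FROM];  /* source artifacts/PERs */
    uint64_t to_nodes[MAX_TO];      /* target artifacts/PERs */
    uint32_t edge_type;
    uint8_t  roles;
    uint64_t logseq;                /* monotonic sequence number */
} tgk_edge_record_t;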
```
* Supports deterministic traversal, snapshot bounds, and SIMD filtering.
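
The snapshot bound and type filter can be sketched with a scalar loop; a SIMD version would evaluate the same predicate over several records at once. The values of `MAX_FROM` and `MAX_TO` below are placeholders chosen only so the example compiles, and `tgk_filter_edges` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder capacities: this note does not give values for these. */
#define MAX_FROM 4
#define MAX_TO   4

typedef struct {
    uint64_t canonical_edge_id;
    uint64_t from_nodes[MAX_FROM];
    uint64_t to_nodes[MAX_TO];
    uint32_t edge_type;
    uint8_t  roles;
    uint64_t logseq;
} tgk_edge_record_t;

/* Writes to out[] the indices of edges whose edge_type matches and whose
 * logseq is visible at snapshot_max. Input order is preserved, so the
 * result is deterministic for a given input. Returns the number selected;
 * out[] must have room for `count` indices. */
static size_t tgk_filter_edges(const tgk_edge_record_t *edges, size_t count,
                               uint32_t edge_type, uint64_t snapshot_max,
                               size_t *out)
{
    size_t n = 0;
    for (size_t i = 0; i < count; i++) {
        if (edges[i].edge_type == edge_type && edges[i].logseq <= snapshot_max)
            out[n++] = i;
    }
    return n;
}
```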
---
## 5. PEL Integration
### 5.1 PEL Program DAG
* Deterministic DAG with:
  * Inputs: artifacts or PERs
  * Computation nodes: concat, slice, primitive ops
  * Outputs: artifacts or PERs
* Guarantees snapshot-bound determinism:
  * Inputs: `logseq ≤ snapshot_max`
  * Outputs: `logseq = max(input_logseq) + 1`
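
The two logseq rules can be checked mechanically. The sketch below is illustrative only: `pel_output_logseq` is a hypothetical helper, and rejecting an empty input set is an assumption, since the note does not say how a node with no inputs is sequenced.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 0 and stores max(input_logseqs) + 1 in *out_logseq when every
 * input satisfies logseq <= snapshot_max. Returns -1 if any input lies
 * beyond the snapshot, or if there are no inputs (an assumption here). */
static int pel_output_logseq(const uint64_t *input_logseqs, size_t count,
                             uint64_t snapshot_max, uint64_t *out_logseq)
{
    if (count == 0)
        return -1;

    uint64_t max_seen = 0;
    for (size_t i = 0; i < count; i++) {
        if (input_logseqs[i] > snapshot_max)
            return -1;              /* input not visible in this snapshot */
        if (input_logseqs[i] > max_seen)
            max_seen = input_logseqs[i];
    }
    *out_logseq = max_seen + 1;
    return 0;
}
```

For inputs with logseqs 3, 7, and 5 under `snapshot_max = 10`, the output is assigned logseq 8.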
### 5.2 Execution Plan Mapping
| PEL Node | Execution Plan Operator |
| -------------- | ---------------------------- |
| Input Artifact | SegmentScan |
| Concat/Slice | Projection |
| TGK Projection | TGKTraversal |
| Aggregate | Aggregation |
| PER Output | SegmentScan (fed downstream) |
---
## 6. Execution Plan Operators
* **SegmentScan**: scan artifacts/PERs within snapshot
* **IndexFilter**: SIMD-accelerated filtering by `type_tag`, `edge_type`, and role
* **Merge**: deterministic merge across shards
* **TGKTraversal**: depth-limited deterministic DAG traversal
* **Projection**: select fields
* **Aggregation**: count, sum, union
* **TombstoneShadow**: applies tombstones and ensures snapshot safety
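
To make the TombstoneShadow operator more concrete, here is one possible reading of it. The semantics are an assumption: for each key, the newest record visible in the snapshot wins, and the key is hidden if that record is a tombstone. The `log_record_t` layout and the `tombstone_shadow` function are hypothetical; this note does not define how tombstones are encoded.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record shape used only for this sketch. */
typedef struct {
    uint64_t key;
    uint64_t logseq;
    uint8_t  is_tombstone;
} log_record_t;

/* Input must be sorted ascending by (key, logseq). For each key, keeps the
 * newest record with logseq <= snapshot_max unless that record is a
 * tombstone. Survivors are written to out[] in key order; returns their
 * count. out[] must have room for `count` records. */
static size_t tombstone_shadow(const log_record_t *in, size_t count,
                               uint64_t snapshot_max, log_record_t *out)
{
    size_t n = 0;
    size_t i = 0;
    while (i < count) {
        uint64_t key = in[i].key;
        const log_record_t *latest = NULL;
        /* Walk every record for this key, remembering the newest visible one. */
        for (; i < count && in[i].key == key; i++) {
            if (in[i].logseq <= snapshot_max)
                latest = &in[i];
        }
        if (latest != NULL && !latest->is_tombstone)
            out[n++] = *latest;
    }
    return n;
}
```

Under this reading, a tombstone written after the snapshot bound has no effect on what that snapshot sees, which is one way to interpret "ensures snapshot safety".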
---
## 7. Shard & SIMD Execution
* Artifacts/edges partitioned by shard
* SIMD applied per shard for filters and traversal
* Deterministic merge across shards ensures global ordering
* Buffers structured for memory alignment:
```c
#include <stdint.h>

/* snapshot_range_t is not defined in this note. */
struct shard_buffer {
    uint64_t *artifact_ids;
    uint64_t *tgk_edge_ids;
    uint32_t *type_tags;
    uint8_t  *roles;
    uint64_t  count;
    snapshot_range_t snapshot;
};
```
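
A deterministic cross-shard merge could look like the following sketch. It assumes each shard emits items already sorted by `(logseq, canonical_id)`, the same key the federation replay rules sort by; the note does not specify the merge key, and `merge_item_t`, `shard_stream_t`, and `merge_shards` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t logseq;
    uint64_t canonical_id;
} merge_item_t;

typedef struct {
    const merge_item_t *items;  /* sorted ascending by (logseq, canonical_id) */
    size_t count;
    size_t pos;                 /* merge cursor; start at 0 */
} shard_stream_t;

static int item_less(const merge_item_t *a, const merge_item_t *b)
{
    if (a->logseq != b->logseq)
        return a->logseq < b->logseq;
    return a->canonical_id < b->canonical_id;
}

/* K-way merge by linear scan over the stream heads. When two heads compare
 * equal, the lower shard index wins, so the output depends only on the
 * inputs and never on thread scheduling. Returns the number of items
 * written; out[] must hold the total item count across all shards. */
static size_t merge_shards(shard_stream_t *shards, size_t nshards,
                           merge_item_t *out)
{
    size_t n = 0;
    for (;;) {
        shard_stream_t *best = NULL;
        for (size_t s = 0; s < nshards; s++) {
            shard_stream_t *cur = &shards[s];
            if (cur->pos == cur->count)
                continue;       /* this shard is exhausted */
            if (best == NULL ||
                item_less(&cur->items[cur->pos], &best->items[best->pos]))
                best = cur;
        }
        if (best == NULL)
            break;              /* every shard is exhausted */
        out[n++] = best->items[best->pos++];
    }
    return n;
}
```

The linear scan keeps the sketch short; with many shards a min-heap over the heads would give the same order at lower cost.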
---
## 8. Federation & Cross-Node Deterministic Replay
* **Propagation rules**:
  * Only new artifacts/PERs/edges (`logseq > last_applied`) are transmitted
  * Delta replication per snapshot
* **Replay rules**:
  * Sort by `(logseq, canonical_id)` for deterministic application
  * Apply tombstones/shadowing
  * Preserve snapshot boundaries
* **Conflict resolution**:
  * ArtifactKey collisions: identical hash → ignore as a duplicate; differing hash → flag
  * Edge conflicts: resolve to the edge with the latest `logseq` at or below the snapshot bound
  * PER conflicts: identical inputs → skip execution
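
The propagation filter and replay ordering above can be sketched with the standard `qsort`. The `replay_record_t` type and `prepare_replay` function are hypothetical; a real record would carry a payload alongside the two ordering fields.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t logseq;
    uint64_t canonical_id;
} replay_record_t;

/* Total order on (logseq, canonical_id), as required by the replay rules. */
static int replay_cmp(const void *pa, const void *pb)
{
    const replay_record_t *a = pa;
    const replay_record_t *b = pb;
    if (a->logseq != b->logseq)
        return a->logseq < b->logseq ? -1 : 1;
    if (a->canonical_id != b->canonical_id)
        return a->canonical_id < b->canonical_id ? -1 : 1;
    return 0;
}

/* Compacts recs[] in place to the records with logseq > last_applied, then
 * sorts them by (logseq, canonical_id). Returns how many records remain. */
static size_t prepare_replay(replay_record_t *recs, size_t count,
                             uint64_t last_applied)
{
    size_t n = 0;
    for (size_t i = 0; i < count; i++) {
        if (recs[i].logseq > last_applied)
            recs[n++] = recs[i];
    }
    qsort(recs, n, sizeof recs[0], replay_cmp);
    return n;
}
```

Because `(logseq, canonical_id)` is a total order over distinct records, the instability of `qsort` does not affect the result: every node that receives the same delta applies it in the same sequence.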
---
## 9. Provenance & Audit
* **Provenance table**: snapshot → artifacts/PERs applied
* **Federation log table**: peer node → last applied logseq
* **Deterministic replay** guarantees identical final outputs across nodes
---
## 10. Data Flow Summary
```
PEL DAG Inputs --> Execute PEL Program --> Generate PERs
      |                                         |
      v                                         v
    ASL/TGK Shard Buffers (SIMD-aligned, snapshot-safe)
                          |
                          v
Execution Plan Operators (SegmentScan, IndexFilter, Merge, TGKTraversal, TombstoneShadow)
                          |
                          v
      Final Output (artifacts + PERs + TGK projections)
                          |
                          v
Federation Layer (propagation & deterministic replay across nodes)
```
---
## 11. Snapshot & Log Integration
* All operations are **snapshot-bounded**.
* **ZFS snapshots** + append-only sequential logs provide:
  * Checkpointing
  * Deterministic replay
  * Garbage collection of unreachable artifacts while preserving provenance
---
## 12. Summary
The unified system described in this specification is intended to provide:
* **Deterministic execution** (PEL + index + TGK)
* **Snapshot-safe operations**
* **Shard/SIMD acceleration**
* **Federated, replayable, cross-node consistency**
* **Integration of PER artifacts with TGK edges**
* **Provenance and auditability at scale**