amduat-api/notes/Unified ASL + TGK + PEL System Specification.md
2026-01-17 07:32:14 +01:00

Unified ASL + TGK + PEL System Specification (Master Reference)


NOTE: Integrated into tier1/asl-system-1.md. This note is retained for historical context and may drift.

1. Introduction

This document specifies a unified system for deterministic, federated, snapshot-safe storage and execution of artifacts, execution receipts (PERs), and TGK edges. The system integrates:

  • ASL (Artifact Storage Layer)
  • TGK (Trace Graph Kernel)
  • PEL (Program Execution Layer)
  • Indexing and shard/SIMD acceleration
  • Federation and deterministic replay

The system is designed to support billions of artifacts and edges, deterministic DAG execution, and cross-node provenance.


2. Core Concepts

Concept       Description
------------  ------------------------------------------------------------
Artifact      Basic unit stored in ASL; may carry an optional type_tag, with has_type_tag indicating whether it is set.
PER           PEL Execution Receipt; an artifact describing the deterministic output of a PEL program.
TGK Edge      Directed relation between artifacts/PERs; stores from_nodes, to_nodes, edge_type, and roles.
Snapshot      ZFS snapshot; defines read visibility and the deterministic execution boundary.
Logseq        Monotonic sequence number used for deterministic ordering.
Shard         Subset of artifacts/edges partitioned for SIMD/parallel execution.
Canonical ID  Unique identifier per artifact, PER, or TGK edge.

3. ASL-CORE & ASL-STORE-INDEX

3.1 ASL-CORE

  • Defines artifact semantics:

    • Optional type_tag (32-bit) with has_type_tag (8-bit toggle)
    • Artifacts are immutable once written
    • PERs are treated as artifacts
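
The optional-tag rule above can be sketched as a small C helper. This is only an illustration: the `artifact_meta_t` struct and the `artifact_has_type` function are hypothetical names, and the rule that `type_tag` is ignored when `has_type_tag` is zero is an assumption. Only the 32-bit `type_tag` and the 8-bit `has_type_tag` come from this note.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical metadata view of an artifact; only type_tag and
 * has_type_tag are taken from the specification. */
typedef struct {
    uint32_t type_tag;     /* meaningful only when has_type_tag != 0 */
    uint8_t  has_type_tag; /* 0 = untagged, non-zero = tagged */
} artifact_meta_t;

/* Returns true only if the artifact is tagged and its tag equals `wanted`.
 * Assumption: an untagged artifact never matches, even if type_tag
 * happens to hold the same bits. */
static bool artifact_has_type(const artifact_meta_t *a, uint32_t wanted)
{
    return a->has_type_tag != 0 && a->type_tag == wanted;
}
```

Keeping the presence flag separate from the tag value means every 32-bit value, including zero, stays usable as a real tag.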

3.2 ASL-STORE-INDEX

  • Manages artifact blocks, including:

    • Small vs. large blocks (packaging)
    • Block sealing, retention, snapshot safety
  • Index structure:

    • Shard-local, supports billion-scale lookups
    • Bloom filters for quick membership queries
    • Sharding and SIMD acceleration for memory-efficient lookups
  • Record Layout (C struct):

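#include <stdint.h>

typedef struct {
    uint64_t artifact_key;   /* key identifying the artifact */
    uint64_t block_id;       /* block that holds the artifact */
    uint32_t offset;         /* offset of the artifact within the block */
    uint32_t length;         /* length of the artifact */
    uint32_t type_tag;       /* optional; valid only if has_type_tag is set */
    uint8_t  has_type_tag;   /* 8-bit toggle: non-zero if type_tag is present */
} artifact_index_entry_t;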
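
The Bloom-filter membership check mentioned above might look roughly like the following per-shard sketch. Everything here is an assumption made for illustration: the filter size, the use of two bit positions per key, the splitmix64 mixing function, and the names `shard_bloom_t`, `bloom_add`, and `bloom_may_contain`. The note does not specify how the filters are built.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOOM_BITS 1024u  /* illustrative size; must be a power of two */

/* Hypothetical per-shard filter over artifact_key values. */
typedef struct {
    uint64_t bits[BLOOM_BITS / 64];
} shard_bloom_t;

/* splitmix64 finalizer, used here as a cheap 64-bit mixing function. */
static uint64_t mix64(uint64_t x)
{
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

/* Derive two bit positions from the low and high halves of one hash. */
static void bloom_positions(uint64_t key, uint32_t pos[2])
{
    uint64_t h = mix64(key);
    pos[0] = (uint32_t)(h & (BLOOM_BITS - 1));
    pos[1] = (uint32_t)((h >> 32) & (BLOOM_BITS - 1));
}

static void bloom_add(shard_bloom_t *b, uint64_t artifact_key)
{
    uint32_t pos[2];
    bloom_positions(artifact_key, pos);
    for (int i = 0; i < 2; i++)
        b->bits[pos[i] / 64] |= 1ULL << (pos[i] % 64);
}

/* false => key is definitely absent; true => key may be present. */
static bool bloom_may_contain(const shard_bloom_t *b, uint64_t artifact_key)
{
    uint32_t pos[2];
    bloom_positions(artifact_key, pos);
    for (int i = 0; i < 2; i++)
        if (!(b->bits[pos[i] / 64] & (1ULL << (pos[i] % 64))))
            return false;
    return true;
}
```

A `false` result lets a lookup skip the shard's index entirely; a `true` result still requires checking the index, since Bloom filters admit false positives.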

4. ENC-ASL-TGK-INDEX

  • Defines encoding for artifacts, PERs, and TGK edges in storage.
  • TGK edges stored as:
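/* Requires <stdint.h>. MAX_FROM and MAX_TO are compile-time bounds
 * that are not defined in this note. */
typedef struct {
    uint64_t canonical_edge_id;
    uint64_t from_nodes[MAX_FROM];  /* source artifacts/PERs */
    uint64_t to_nodes[MAX_TO];      /* target artifacts/PERs */
    uint32_t edge_type;
    uint8_t  roles;
    uint64_t logseq;                /* monotonic sequence number for ordering */
} tgk_edge_record_t;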
  • Supports deterministic traversal, snapshot bounds, and SIMD filtering.
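
As a rough illustration of snapshot-bounded filtering over edge records, the scalar loop below selects edges by `edge_type` and `logseq`. It is a sketch under assumptions: `tgk_edge_view_t` is a hypothetical reduced view that omits `from_nodes`/`to_nodes` (since `MAX_FROM` and `MAX_TO` are not given here), and `tgk_filter_edges` is an invented name. A SIMD implementation would evaluate the same predicate across several records at once.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical view of tgk_edge_record_t. */
typedef struct {
    uint64_t canonical_edge_id;
    uint32_t edge_type;
    uint64_t logseq;
} tgk_edge_view_t;

/* Copies the IDs of edges that have the wanted type and are visible at
 * snapshot_max (logseq <= snapshot_max) into out, preserving input order.
 * Returns the number of IDs written; out must have room for n entries. */
static size_t tgk_filter_edges(const tgk_edge_view_t *edges, size_t n,
                               uint32_t wanted_type, uint64_t snapshot_max,
                               uint64_t *out)
{
    size_t written = 0;
    for (size_t i = 0; i < n; i++) {
        if (edges[i].edge_type == wanted_type &&
            edges[i].logseq <= snapshot_max)
            out[written++] = edges[i].canonical_edge_id;
    }
    return written;
}
```

Because the output preserves input order, the same input and snapshot bound always yield the same result.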

5. PEL Integration

5.1 PEL Program DAG

  • Deterministic DAG with:

    • Inputs: artifacts or PERs
    • Computation nodes: concat, slice, primitive ops
    • Outputs: artifacts or PERs
  • Guarantees snapshot-bound determinism:

    • Inputs: logseq ≤ snapshot_max
    • Outputs: logseq = max(input_logseq) + 1
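
The two logseq rules above can be checked with a few lines of C. This is a minimal sketch: the function name `pel_output_logseq` is hypothetical, and rejecting an empty input set and guarding against wrap-around are assumptions not stated in this note.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Computes the logseq for a PEL output as max(input_logseq) + 1.
 * Returns false, leaving *out_logseq untouched, if there are no inputs
 * (assumption), if any input lies beyond snapshot_max, or if the
 * increment would overflow. */
static bool pel_output_logseq(const uint64_t *input_logseqs, size_t n,
                              uint64_t snapshot_max, uint64_t *out_logseq)
{
    uint64_t max_in = 0;
    if (n == 0)
        return false;
    for (size_t i = 0; i < n; i++) {
        if (input_logseqs[i] > snapshot_max)
            return false;            /* input not visible in the snapshot */
        if (input_logseqs[i] > max_in)
            max_in = input_logseqs[i];
    }
    if (max_in == UINT64_MAX)
        return false;                /* max + 1 would wrap around */
    *out_logseq = max_in + 1;
    return true;
}
```

For inputs with logseq values 3, 9, and 4 under `snapshot_max = 10`, the output logseq is 10; under `snapshot_max = 8` the call is rejected because the input at 9 is not visible.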

5.2 Execution Plan Mapping

PEL Node        Execution Plan Operator
--------------  ----------------------------
Input Artifact  SegmentScan
Concat/Slice    Projection
TGK Projection  TGKTraversal
Aggregate       Aggregation
PER Output      SegmentScan (fed downstream)
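
The mapping above could be encoded as a simple lookup function. The enum and function names below (`pel_node_kind_t`, `plan_op_t`, `plan_op_for`) are hypothetical, as is the `PLAN_OP_INVALID` fallback; only the node-to-operator pairs come from the table.

```c
/* Hypothetical enumeration of the PEL node kinds listed in the table. */
typedef enum {
    PEL_NODE_INPUT_ARTIFACT,
    PEL_NODE_CONCAT,
    PEL_NODE_SLICE,
    PEL_NODE_TGK_PROJECTION,
    PEL_NODE_AGGREGATE,
    PEL_NODE_PER_OUTPUT
} pel_node_kind_t;

/* Hypothetical enumeration of the execution plan operators they map to. */
typedef enum {
    PLAN_OP_INVALID,
    PLAN_OP_SEGMENT_SCAN,
    PLAN_OP_PROJECTION,
    PLAN_OP_TGK_TRAVERSAL,
    PLAN_OP_AGGREGATION
} plan_op_t;

/* Maps a PEL node kind to its execution plan operator.
 * Unknown kinds map to PLAN_OP_INVALID (an assumption). */
static plan_op_t plan_op_for(pel_node_kind_t kind)
{
    switch (kind) {
    case PEL_NODE_INPUT_ARTIFACT: return PLAN_OP_SEGMENT_SCAN;
    case PEL_NODE_CONCAT:
    case PEL_NODE_SLICE:          return PLAN_OP_PROJECTION;
    case PEL_NODE_TGK_PROJECTION: return PLAN_OP_TGK_TRAVERSAL;
    case PEL_NODE_AGGREGATE:      return PLAN_OP_AGGREGATION;
    case PEL_NODE_PER_OUTPUT:     return PLAN_OP_SEGMENT_SCAN;
    }
    return PLAN_OP_INVALID;
}
```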

6. Execution Plan Operators

  • SegmentScan: scan artifacts/PERs within snapshot
  • IndexFilter: SIMD-accelerated filtering by type_tag, edge_type, role
  • Merge: deterministic merge across shards
  • TGKTraversal: depth-limited deterministic DAG traversal
  • Projection: select fields
  • Aggregation: count, sum, union
  • TombstoneShadow: applies tombstones and ensures snapshot safety
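
One possible reading of TombstoneShadow is sketched below. The semantics are an assumption: a record is visible at a snapshot if it lies within the snapshot and no tombstone for the same key, also within the snapshot, has a later logseq. The `log_record_t` struct and `record_visible` function are hypothetical, and the quadratic scan is chosen for clarity rather than speed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical log record: either a live entry or a tombstone for a key. */
typedef struct {
    uint64_t key;
    uint64_t logseq;
    bool     is_tombstone;
} log_record_t;

/* Returns true if recs[idx] is a live record visible at snapshot_max:
 * it must lie within the snapshot and must not be shadowed by a later
 * tombstone for the same key that is itself within the snapshot. */
static bool record_visible(const log_record_t *recs, size_t n, size_t idx,
                           uint64_t snapshot_max)
{
    const log_record_t *r = &recs[idx];
    if (r->is_tombstone || r->logseq > snapshot_max)
        return false;
    for (size_t i = 0; i < n; i++) {
        const log_record_t *t = &recs[i];
        if (t->is_tombstone && t->key == r->key &&
            t->logseq > r->logseq && t->logseq <= snapshot_max)
            return false;
    }
    return true;
}
```

Under this reading, a tombstone written after a snapshot does not hide the record from readers of that snapshot, which is what keeps older snapshots stable.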

7. Shard & SIMD Execution

  • Artifacts/edges partitioned by shard
  • SIMD applied per shard for filters and traversal
  • Deterministic merge across shards ensures global ordering
  • Buffers structured for memory alignment:
struct shard_buffer {
    uint64_t *artifact_ids;
    uint64_t *tgk_edge_ids;
    uint32_t *type_tags;
    uint8_t  *roles;
    uint64_t  count;
    snapshot_range_t snapshot;  /* snapshot_range_t is not defined in this note */
};
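
The deterministic merge across shards could be sketched as a k-way merge keyed on (logseq, canonical ID). This is an assumption about the ordering key, borrowed from the replay rule in section 8; the types `merge_item_t` and `shard_run_t`, the function `merge_shards`, and the `MERGE_MAX_SHARDS` limit are all hypothetical. Each shard's run is assumed to be sorted already.

```c
#include <stddef.h>
#include <stdint.h>

#define MERGE_MAX_SHARDS 16  /* illustrative limit for this sketch */

typedef struct { uint64_t logseq; uint64_t canonical_id; } merge_item_t;
typedef struct { const merge_item_t *items; size_t count; } shard_run_t;

/* Total order on (logseq, canonical_id). */
static int item_cmp(const merge_item_t *a, const merge_item_t *b)
{
    if (a->logseq != b->logseq)
        return a->logseq < b->logseq ? -1 : 1;
    if (a->canonical_id != b->canonical_id)
        return a->canonical_id < b->canonical_id ? -1 : 1;
    return 0;
}

/* Merges per-shard runs (each already sorted by item_cmp) into out.
 * Returns the number of items written, or 0 if nshards is too large.
 * out must have room for the sum of all shard counts. */
static size_t merge_shards(const shard_run_t *shards, size_t nshards,
                           merge_item_t *out)
{
    size_t cursor[MERGE_MAX_SHARDS] = {0};
    size_t written = 0;
    if (nshards > MERGE_MAX_SHARDS)
        return 0;
    for (;;) {
        size_t best = nshards;  /* sentinel: no shard has items left */
        for (size_t s = 0; s < nshards; s++) {
            if (cursor[s] >= shards[s].count)
                continue;
            if (best == nshards ||
                item_cmp(&shards[s].items[cursor[s]],
                         &shards[best].items[cursor[best]]) < 0)
                best = s;
        }
        if (best == nshards)
            break;
        out[written++] = shards[best].items[cursor[best]++];
    }
    return written;
}
```

Since ties on logseq are broken by canonical ID, the merged sequence does not depend on the order in which shards are presented.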

8. Federation & Cross-Node Deterministic Replay

  • Propagation rules:

    • Only new artifacts/PERs/edges (logseq > last_applied) transmitted
    • Delta replication per snapshot
  • Replay rules:

    • Sort by (logseq, canonical_id) for deterministic application
    • Apply tombstones/shadowing
    • Preserve snapshot boundaries
  • Conflict resolution:

    • ArtifactKey collisions: duplicate hash → ignore, differing hash → flag
    • Edge conflicts: the edge with the latest logseq ≤ snapshot wins
    • PER conflicts: identical inputs → skip execution
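
The propagation and replay rules above can be sketched with the standard `qsort`. The names `replay_entry_t`, `replay_prepare`, and `resolve_key_collision` are hypothetical, and representing an artifact's content hash as a single `uint64_t` is an assumption made only to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { uint64_t logseq; uint64_t canonical_id; } replay_entry_t;

/* Orders entries by (logseq, canonical_id), as the replay rule requires. */
static int replay_cmp(const void *pa, const void *pb)
{
    const replay_entry_t *a = pa;
    const replay_entry_t *b = pb;
    if (a->logseq != b->logseq)
        return a->logseq < b->logseq ? -1 : 1;
    if (a->canonical_id != b->canonical_id)
        return a->canonical_id < b->canonical_id ? -1 : 1;
    return 0;
}

/* Keeps only entries with logseq > last_applied (the delta to transmit
 * or apply), compacting them in place, then sorts them for replay.
 * Returns the number of entries kept. */
static size_t replay_prepare(replay_entry_t *entries, size_t n,
                             uint64_t last_applied)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (entries[i].logseq > last_applied)
            entries[kept++] = entries[i];
    qsort(entries, kept, sizeof entries[0], replay_cmp);
    return kept;
}

typedef enum { COLLISION_IGNORE, COLLISION_FLAG } collision_action_t;

/* ArtifactKey collision rule: same hash -> ignore the duplicate,
 * different hash -> flag the conflict. */
static collision_action_t resolve_key_collision(uint64_t local_hash,
                                                uint64_t incoming_hash)
{
    return local_hash == incoming_hash ? COLLISION_IGNORE : COLLISION_FLAG;
}
```

`qsort` is not guaranteed to be stable, but that does not matter here: the comparator is a total order on both fields, so two nodes given the same set of entries produce the same sequence.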

9. Provenance & Audit

  • Provenance table: snapshot → artifacts/PERs applied
  • Federation log table: peer node → last applied logseq
  • Deterministic replay guarantees identical final outputs across nodes
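
The federation log table (peer node → last applied logseq) might be kept as a small in-memory map, as sketched here. The fixed capacity, the `fed_log_t` layout, the function names, and the convention that logseq 0 means "nothing applied yet" are all assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FED_MAX_PEERS 8  /* illustrative capacity */

typedef struct { uint64_t peer_id; uint64_t last_applied; } fed_log_row_t;
typedef struct { fed_log_row_t rows[FED_MAX_PEERS]; size_t count; } fed_log_t;

/* Returns the last applied logseq for a peer, or 0 if the peer is unknown. */
static uint64_t fed_log_last_applied(const fed_log_t *log, uint64_t peer_id)
{
    for (size_t i = 0; i < log->count; i++)
        if (log->rows[i].peer_id == peer_id)
            return log->rows[i].last_applied;
    return 0;
}

/* Records that entries up to logseq have been applied for a peer.
 * The watermark only moves forward; returns false if logseq does not
 * advance it or if the table is full. */
static bool fed_log_advance(fed_log_t *log, uint64_t peer_id, uint64_t logseq)
{
    for (size_t i = 0; i < log->count; i++) {
        if (log->rows[i].peer_id == peer_id) {
            if (logseq <= log->rows[i].last_applied)
                return false;
            log->rows[i].last_applied = logseq;
            return true;
        }
    }
    if (logseq == 0 || log->count == FED_MAX_PEERS)
        return false;
    log->rows[log->count].peer_id = peer_id;
    log->rows[log->count].last_applied = logseq;
    log->count++;
    return true;
}
```

The value returned by `fed_log_last_applied` is the `last_applied` bound used by the propagation rule in section 8.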

10. Data Flow Summary

PEL DAG Inputs --> Execute PEL Program --> Generate PERs
            |                                 |
            v                                 v
   ASL/TGK Shard Buffers (SIMD-aligned, snapshot-safe)
            |
            v
  Execution Plan Operators (SegmentScan, IndexFilter, Merge, TGKTraversal, TombstoneShadow)
            |
            v
          Final Output (artifacts + PERs + TGK projections)
            |
            v
   Federation Layer (propagation & deterministic replay across nodes)

11. Snapshot & Log Integration

  • All operations are snapshot-bounded.

  • ZFS snapshots + append-only sequential logs provide:

    • Checkpointing
    • Deterministic replay
    • Garbage collection of unreachable artifacts while preserving provenance

12. Summary

The unified system specified here provides:

  • Deterministic execution (PEL + index + TGK)
  • Snapshot-safe operations
  • Shard/SIMD acceleration
  • Federated, replayable, cross-node consistency
  • Integration of PER artifacts with TGK edges
  • Provenance and auditability at scale