Unified ASL + TGK + PEL System Specification (Master Reference)
1. Introduction
This document specifies a unified system for deterministic, federated, snapshot-safe storage and execution of artifacts, execution receipts (PERs), and TGK edges. The system integrates:
- ASL (Artifact Storage Layer)
- TGK (Trace Graph Kernel)
- PEL (Program Execution Layer)
- Indexing, Shard/SIMD acceleration
- Federation and deterministic replay
The system is designed to scale to billions of artifacts and edges, with deterministic DAG execution and cross-node provenance.
2. Core Concepts
| Concept | Description |
|---|---|
| Artifact | Basic unit stored in ASL; may include optional type_tag and has_type_tag. |
| PER | PEL Execution Receipt; artifact describing deterministic output of a PEL program. |
| TGK Edge | Represents a directed relation between artifacts/PERs. Stores from_nodes, to_nodes, edge_type, roles. |
| Snapshot | ZFS snapshot that defines read visibility and the deterministic execution boundary. |
| Logseq | Monotonic sequence number for deterministic ordering. |
| Shard | Subset of artifacts/edges partitioned for SIMD/parallel execution. |
| Canonical ID | Unique identifier per artifact, PER, or TGK edge. |
3. ASL-CORE & ASL-STORE-INDEX
3.1 ASL-CORE
- Defines artifact semantics:
  - Optional type_tag (32-bit) with has_type_tag (8-bit toggle)
  - Artifacts are immutable once written
  - PERs are treated as artifacts
3.2 ASL-STORE-INDEX
- Manages artifact blocks, including:
  - Small vs. large blocks (packaging)
  - Block sealing, retention, snapshot safety
- Index structure:
  - Shard-local, supports billion-scale lookups
  - Bloom filters for quick membership queries
  - Sharding and SIMD acceleration for memory-efficient lookups
- Record Layout (C struct):
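    typedef struct {
        uint64_t artifact_key;  /* key identifying the artifact */
        uint64_t block_id;      /* block that holds the artifact */
        uint32_t offset;        /* offset of the artifact within the block */
        uint32_t length;        /* length of the artifact */
        uint32_t type_tag;      /* meaningful only when has_type_tag is set */
        uint8_t  has_type_tag;  /* 8-bit toggle: nonzero if type_tag is present */
    } artifact_index_entry_t;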
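The record layout above could back a shard-local lookup along the following lines. This is a minimal sketch, assuming the entries of a shard are kept sorted by artifact_key; the helpers index_lookup and entry_has_tag are hypothetical names, not part of the specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t artifact_key;
    uint64_t block_id;
    uint32_t offset;
    uint32_t length;
    uint32_t type_tag;
    uint8_t  has_type_tag;
} artifact_index_entry_t;

/* Hypothetical helper: binary search over entries sorted ascending by
 * artifact_key (an assumption of this sketch). Returns NULL if absent. */
static const artifact_index_entry_t *
index_lookup(const artifact_index_entry_t *entries, size_t count, uint64_t key)
{
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (entries[mid].artifact_key < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo < count && entries[lo].artifact_key == key)
        return &entries[lo];
    return NULL;
}

/* Hypothetical helper: type_tag is compared only when has_type_tag is set,
 * so an untagged entry never matches, even against tag value 0. */
static bool entry_has_tag(const artifact_index_entry_t *e, uint32_t tag)
{
    return e != NULL && e->has_type_tag != 0 && e->type_tag == tag;
}
```

Checking has_type_tag before type_tag keeps an untagged artifact distinct from one explicitly tagged with 0.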
4. ENC-ASL-TGK-INDEX
- Defines encoding for artifacts, PERs, and TGK edges in storage.
- TGK edges stored as:
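    typedef struct {
        uint64_t canonical_edge_id;     /* Canonical ID of the edge */
        uint64_t from_nodes[MAX_FROM];  /* source artifacts/PERs; MAX_FROM is not defined here */
        uint64_t to_nodes[MAX_TO];      /* target artifacts/PERs; MAX_TO is not defined here */
        uint32_t edge_type;             /* edge type, usable as a filter key */
        uint8_t  roles;                 /* roles attached to the edge */
        uint64_t logseq;                /* monotonic sequence number for ordering */
    } tgk_edge_record_t;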
- Supports deterministic traversal, snapshot bounds, and SIMD filtering.
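A snapshot-bounded edge filter over these records might look like the sketch below. It assumes an edge is visible when its logseq is at most the snapshot's maximum logseq, mirroring the input rule in section 5.1. The values chosen for MAX_FROM and MAX_TO and the helper select_edges are illustrative assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed capacities; the specification leaves MAX_FROM and MAX_TO undefined. */
#define MAX_FROM 4
#define MAX_TO   4

typedef struct {
    uint64_t canonical_edge_id;
    uint64_t from_nodes[MAX_FROM];
    uint64_t to_nodes[MAX_TO];
    uint32_t edge_type;
    uint8_t  roles;
    uint64_t logseq;
} tgk_edge_record_t;

/* Hypothetical helper: copies the canonical IDs of edges that match
 * edge_type and are visible under snapshot_max into out_ids, preserving
 * input order. out_ids must have room for count entries. Returns the
 * number of IDs written. */
static size_t select_edges(const tgk_edge_record_t *edges, size_t count,
                           uint32_t edge_type, uint64_t snapshot_max,
                           uint64_t *out_ids)
{
    size_t n = 0;
    for (size_t i = 0; i < count; i++) {
        if (edges[i].edge_type == edge_type && edges[i].logseq <= snapshot_max)
            out_ids[n++] = edges[i].canonical_edge_id;
    }
    return n;
}
```

Because the loop preserves input order, the result is deterministic whenever the input order is.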
5. PEL Integration
5.1 PEL Program DAG
- Deterministic DAG with:
  - Inputs: artifacts or PERs
  - Computation nodes: concat, slice, primitive ops
  - Outputs: artifacts or PERs
- Guarantees snapshot-bound determinism:
  - Inputs: logseq ≤ snapshot_max
  - Outputs: logseq = max(input_logseq) + 1
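The two logseq rules above can be checked and applied in one pass. The following is a sketch only; the function name pel_output_logseq, its refusal of an empty input set, and its overflow guard are assumptions that the specification does not state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: verifies that every input satisfies
 * logseq <= snapshot_max, then writes max(input_logseq) + 1 to *out_logseq.
 * Returns false (leaving *out_logseq untouched) if there are no inputs,
 * if any input lies beyond the snapshot, or if the increment would overflow. */
static bool pel_output_logseq(const uint64_t *input_logseqs, size_t count,
                              uint64_t snapshot_max, uint64_t *out_logseq)
{
    uint64_t max_seen = 0;

    if (count == 0)
        return false;
    for (size_t i = 0; i < count; i++) {
        if (input_logseqs[i] > snapshot_max)
            return false;
        if (input_logseqs[i] > max_seen)
            max_seen = input_logseqs[i];
    }
    if (max_seen == UINT64_MAX)
        return false;
    *out_logseq = max_seen + 1;
    return true;
}
```

For inputs with logseq 3, 7, and 5 under snapshot_max = 10, the output logseq is 8; adding an input with logseq 12 makes the call fail because that input is not visible in the snapshot.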
5.2 Execution Plan Mapping
| PEL Node | Execution Plan Operator |
|---|---|
| Input Artifact | SegmentScan |
| Concat/Slice | Projection |
| TGK Projection | TGKTraversal |
| Aggregate | Aggregation |
| PER Output | SegmentScan (fed downstream) |
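The mapping in this table can be written as a single switch. The enum and function names below are hypothetical, introduced only to illustrate the table; OP_INVALID is an added sentinel for values outside it.

```c
/* Hypothetical node kinds, one per left-hand entry of the table. */
typedef enum {
    PEL_NODE_INPUT_ARTIFACT,
    PEL_NODE_CONCAT,
    PEL_NODE_SLICE,
    PEL_NODE_TGK_PROJECTION,
    PEL_NODE_AGGREGATE,
    PEL_NODE_PER_OUTPUT
} pel_node_kind_t;

/* Hypothetical operator identifiers, one per right-hand entry. */
typedef enum {
    OP_INVALID,
    OP_SEGMENT_SCAN,
    OP_PROJECTION,
    OP_TGK_TRAVERSAL,
    OP_AGGREGATION
} plan_op_t;

/* Maps a PEL node kind to its execution plan operator per the table. */
static plan_op_t plan_op_for(pel_node_kind_t kind)
{
    switch (kind) {
    case PEL_NODE_INPUT_ARTIFACT: return OP_SEGMENT_SCAN;
    case PEL_NODE_CONCAT:
    case PEL_NODE_SLICE:          return OP_PROJECTION;
    case PEL_NODE_TGK_PROJECTION: return OP_TGK_TRAVERSAL;
    case PEL_NODE_AGGREGATE:      return OP_AGGREGATION;
    case PEL_NODE_PER_OUTPUT:     return OP_SEGMENT_SCAN; /* fed downstream */
    }
    return OP_INVALID; /* value outside the table */
}
```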
6. Execution Plan Operators
- SegmentScan: scan artifacts/PERs within snapshot
- IndexFilter: SIMD-accelerated filtering by type_tag, edge_type, role
- Merge: deterministic merge across shards
- TGKTraversal: depth-limited deterministic DAG traversal
- Projection: select fields
- Aggregation: count, sum, union
- TombstoneShadow: applies tombstones and ensures snapshot safety
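As an illustration of the Merge operator, the sketch below merges two per-shard result streams into one globally ordered stream. It assumes each stream is already sorted and that the merge key is (logseq, canonical_id), the same key the replay rules in section 8 use; the type record_key_t and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ordering key; the merge key is an assumption of this sketch. */
typedef struct {
    uint64_t logseq;
    uint64_t canonical_id;
} record_key_t;

/* Total order on (logseq, canonical_id): negative, zero, or positive. */
static int key_cmp(const record_key_t *a, const record_key_t *b)
{
    if (a->logseq != b->logseq)
        return a->logseq < b->logseq ? -1 : 1;
    if (a->canonical_id != b->canonical_id)
        return a->canonical_id < b->canonical_id ? -1 : 1;
    return 0;
}

/* Merges two sorted streams into out, which must hold na + nb records.
 * On equal keys the record from stream a is emitted first, so the result
 * does not depend on scheduling or thread timing. Returns na + nb. */
static size_t merge_two(const record_key_t *a, size_t na,
                        const record_key_t *b, size_t nb,
                        record_key_t *out)
{
    size_t i = 0, j = 0, n = 0;
    while (i < na && j < nb)
        out[n++] = (key_cmp(&a[i], &b[j]) <= 0) ? a[i++] : b[j++];
    while (i < na)
        out[n++] = a[i++];
    while (j < nb)
        out[n++] = b[j++];
    return n;
}
```

A merge over more than two shards can apply merge_two pairwise or use a heap keyed on the same comparator.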
7. Shard & SIMD Execution
- Artifacts/edges partitioned by shard
- SIMD applied per shard for filters and traversal
- Deterministic merge across shards ensures global ordering
- Buffers structured for memory alignment:
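    struct shard_buffer {
        uint64_t *artifact_ids;     /* one entry per record in the shard */
        uint64_t *tgk_edge_ids;     /* TGK edge IDs held by the shard */
        uint32_t *type_tags;        /* type tags, stored as a separate array */
        uint8_t  *roles;            /* roles, stored as a separate array */
        uint64_t count;             /* number of entries in the arrays */
        snapshot_range_t snapshot;  /* snapshot bounds; type not defined here */
    };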
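The separate per-field arrays in shard_buffer lend themselves to tight filter loops. The sketch below filters one shard by type_tag with a branch-free scalar loop that a compiler may auto-vectorize; it uses no explicit SIMD intrinsics. The definition of snapshot_range_t and the helper filter_type_tag are assumptions, since the specification defines neither.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed shape; the specification does not define snapshot_range_t. */
typedef struct {
    uint64_t min_logseq;
    uint64_t max_logseq;
} snapshot_range_t;

struct shard_buffer {
    uint64_t *artifact_ids;
    uint64_t *tgk_edge_ids;
    uint32_t *type_tags;
    uint8_t  *roles;
    uint64_t count;
    snapshot_range_t snapshot;
};

/* Hypothetical helper: sets mask[i] to 1 where type_tags[i] == wanted and
 * to 0 elsewhere, and returns the number of matches. mask must hold
 * buf->count bytes. Only the type_tags array and count are read. */
static uint64_t filter_type_tag(const struct shard_buffer *buf,
                                uint32_t wanted, uint8_t *mask)
{
    uint64_t hits = 0;
    for (uint64_t i = 0; i < buf->count; i++) {
        uint8_t m = (uint8_t)(buf->type_tags[i] == wanted);
        mask[i] = m;
        hits += m;
    }
    return hits;
}
```

The buffer carries no has_type_tag array, so this sketch cannot tell an untagged record from one tagged 0; a full implementation would need that information from the index.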
8. Federation & Cross-Node Deterministic Replay
- Propagation rules:
  - Only new artifacts/PERs/edges (logseq > last_applied) transmitted
  - Delta replication per snapshot
- Replay rules:
  - Sort by (logseq, canonical_id) for deterministic application
  - Apply tombstones/shadowing
  - Preserve snapshot boundaries
- Conflict resolution:
  - ArtifactKey collisions: duplicate hash → ignore, differing hash → flag
  - Edge conflicts: latest logseq ≤ snapshot
  - PER conflicts: identical inputs → skip execution
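The propagation filter and the replay ordering can be combined into one preparation step on the receiving node. This is a sketch under stated assumptions: replay_record_t and prepare_replay are hypothetical names, and tombstone handling and conflict resolution are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record header carrying only the fields used for ordering. */
typedef struct {
    uint64_t logseq;
    uint64_t canonical_id;
} replay_record_t;

/* qsort comparator: ascending by logseq, then by canonical_id. */
static int replay_cmp(const void *pa, const void *pb)
{
    const replay_record_t *a = pa;
    const replay_record_t *b = pb;
    if (a->logseq != b->logseq)
        return a->logseq < b->logseq ? -1 : 1;
    if (a->canonical_id != b->canonical_id)
        return a->canonical_id < b->canonical_id ? -1 : 1;
    return 0;
}

/* Keeps only records with logseq > last_applied (compacting in place),
 * then sorts the kept records by (logseq, canonical_id).
 * Returns the number of records to apply. */
static size_t prepare_replay(replay_record_t *recs, size_t count,
                             uint64_t last_applied)
{
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (recs[i].logseq > last_applied)
            recs[kept++] = recs[i];
    }
    qsort(recs, kept, sizeof recs[0], replay_cmp);
    return kept;
}
```

qsort is not a stable sort, but that does not matter here as long as canonical IDs are unique within a batch, because the comparator then never reports two distinct records as equal.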
9. Provenance & Audit
- Provenance table: snapshot → artifacts/PERs applied
- Federation log table: peer node → last applied logseq
- Deterministic replay guarantees identical final outputs across nodes
10. Data Flow Summary
    PEL DAG Inputs --> Execute PEL Program --> Generate PERs
          |                                          |
          v                                          v
    ASL/TGK Shard Buffers (SIMD-aligned, snapshot-safe)
          |
          v
    Execution Plan Operators (SegmentScan, IndexFilter, Merge, TGKTraversal, TombstoneShadow)
          |
          v
    Final Output (artifacts + PERs + TGK projections)
          |
          v
    Federation Layer (propagation & deterministic replay across nodes)
11. Snapshot & Log Integration
- All operations are snapshot-bounded.
- ZFS snapshots + append-only sequential logs provide:
  - Checkpointing
  - Deterministic replay
  - Garbage collection of unreachable artifacts while preserving provenance
12. Summary
The unified system specified here provides:
- Deterministic execution (PEL + index + TGK)
- Snapshot-safe operations
- Shard/SIMD acceleration
- Federated, replayable, cross-node consistency
- Integration of PER artifacts with TGK edges
- Provenance and auditability at scale