# ENC/PEL-TRACE-DAG/1 — Canonical Encoding for DAG Execution Traces

Status: Approved
Owner: Niklas Rydberg
Version: 0.1.0
SoT: Yes
Last Updated: 2025-11-16
Linked Phase Pack: N/A
Tags: [binary-minimalism, traceability]

**Document ID:** `ENC/PEL-TRACE-DAG/1`
**Profile ID:** `PEL_ENC_TRACE_DAG_V1 = 0x0102`
**Layer:** Scheme Encoding Profile (Trace)

**Depends on (normative):**

* `ASL/1-CORE v0.3.2` — value model (`Artifact`, `TypeTag`, `Reference`, integers, `OctetString`)
* `ENC/ASL1-CORE v1.0.3` — canonical encodings for `Artifact` and `Reference`
* `PEL/1-CORE v0.1.0` — primitive execution layer (ExecutionStatus, ExecutionErrorSummary)
* `PEL/PROGRAM-DAG/1 v0.2.0` — DAG Program scheme (Program, Node, NodeId, canonical node order)
* `PEL/TRACE-DAG/1 v0.1.0` — DAG execution trace profile (logical data model)

**Integrates with (informative):**

* `PEL/1-SURF v0.1.0` — store-backed execution surface
* `HASH/ASL1 v0.2.3` — ASL1 hash family for trace artifact identity
* TypeTag registry (for `TYPE_TAG_PEL_TRACE_DAG_1`)

> The Profile ID `PEL_ENC_TRACE_DAG_V1` is a configuration label.
> It is **not** embedded into trace payloads. Encoders and decoders select this encoding profile by context (scheme descriptor, engine/store configuration), not per value.

© 2025 Niklas Rydberg.

## License

Except where otherwise noted, this document (text and diagrams) is licensed under
the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
Universal (CC0) to enable unrestricted reuse in implementations and derivative
specifications.

Code examples in this document are provided under the Apache License 2.0 unless
explicitly stated otherwise. Test vectors, where present, are dedicated to the
public domain under CC0 1.0.

---

## 0. Overview

`ENC/PEL-TRACE-DAG/1` defines the **canonical binary encoding** of the `TraceDAGValue` structure specified in `PEL/TRACE-DAG/1`, and all of its nested components:

* `TraceDAGValue`
* `NodeTraceDAG`
* `DiagnosticEntry`

This encoding:

* is **injective** — distinct logical trace values → distinct byte strings;
* is **stable and deterministic** — same value → same bytes across implementations and time;
* is **streaming-friendly** — encoders/decoders can operate in a single forward pass;
* embeds ASL/1 `Reference` values using their canonical `ReferenceBytes` encoding (`ENC/ASL1-CORE v1`) inside length-prefixed frames.

The encoded payload (`TraceDAGBytes`) is used as the `bytes` field of a trace Artifact for the `PEL/TRACE-DAG/1` profile, with:

```text
Artifact.type_tag = TYPE_TAG_PEL_TRACE_DAG_1
Artifact.bytes    = TraceDAGBytes
```

Trace Artifact identity is then derived by hashing canonical `ArtifactBytes` with `"ASL1"` hash algorithms (typically `HASH-ASL1-256`).

---

## 1. Scope & Layering

### 1.1 Purpose

This specification defines:

* The **binary layout** of:

  * `TraceDAGBytes`
  * `NodeTraceDAGBytes`
  * `DiagnosticEntryBytes`
  * An internal “encoded Reference” wrapper for use inside trace payloads
* The canonical **field ordering**, integer widths, and list framing.

It does **not** define:

* The logical semantics of traces — those are in `PEL/TRACE-DAG/1`.
* The ASL/1 `Artifact` / `Reference` encodings — these are in `ENC/ASL1-CORE v1.0.3`.
* How traces are produced or when they are enabled — that is governed by `PEL/1-CORE`, `PEL/1-SURF`, and policy profiles.

### 1.2 Layering constraints

In line with `SUBSTRATE/STACK-OVERVIEW`:

* `ENC/PEL-TRACE-DAG/1` is a **scheme-specific encoding profile**.
* It MUST NOT redefine:

  * `Artifact`, `TypeTag`, `Reference`, `HashId` (`ASL/1-CORE`),
  * the `TraceDAGValue` logical structure (`PEL/TRACE-DAG/1`).
* It is **storage-neutral** and **policy-neutral**.
* It defines exactly one canonical encoding for `TraceDAGValue` values in this scheme.

---

## 2. Conventions

The RFC 2119 terms **MUST**, **SHOULD**, **MAY**, etc. are normative.

### 2.1 Integer encodings

All multi-byte integers are encoded as **big-endian** (network byte order), as in `ENC/ASL1-CORE`:

* `u8` — 1 byte
* `u16` — 2 bytes
* `u32` — 4 bytes
* `u64` — 8 bytes

Only **fixed-width** integers are used in this specification.

### 2.2 Lists

A list of values of some type `T` is encoded as:

```text
List<T> =
  count (u32)
  element_0
  element_1
  ...
  element_{count-1}
```

* `count` is the number of elements (MAY be zero).
* Elements are encoded in order, using the canonical encoding for `T`.

### 2.3 UTF-8 string

`Utf8String` is encoded as:

```text
Utf8String =
  length (u32)
  bytes[0..length-1]
```

* `length` is the number of bytes.
* `bytes` MUST be well-formed UTF-8.
* There is no terminator or padding.

### 2.4 Octet blob (32-bit length)

For diagnostics and other opaque fields, we use a generic blob:

```text
Blob32 =
  length (u32)
  bytes[0..length-1]
```

`bytes` is an arbitrary `OctetString`; interpretation is profile-specific. `length` MAY be zero.

### 2.5 Embedded Reference

Within this encoding, `Reference` values are embedded using a length-prefixed wrapper over canonical `ReferenceBytes` from `ENC/ASL1-CORE v1.0.3`.

We define:

```text
EncodedRef =
  ref_len   (u32)
  ref_bytes (byte[0..ref_len-1])   // canonical ReferenceBytes
```

Where:

* `ref_bytes` MUST be the canonical `ReferenceBytes` encoding for a `Reference` value, as defined in `ENC/ASL1-CORE v1.0.3`:

  ```text
  ReferenceBytes ::
    hash_id (u16)
    digest  (byte[...])   // remaining bytes
  ```

* `ref_len` MUST be the exact length (in bytes) of `ref_bytes` (MUST be ≥ 2).

Decoders MUST:

* Read `ref_len (u32)`, then `ref_bytes[0..ref_len-1]`.
* Decode `ref_bytes` as `ReferenceBytes` per `ENC/ASL1-CORE v1.0.3`.
* Reject encodings where:

  * `ref_len < 2`, or
  * `ref_bytes` is not a valid `ReferenceBytes` sequence (e.g., truncated).
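
The framing and decoder rules above can be illustrated with a minimal Python sketch. It is not a reference implementation: the names `EncodingError`, `encode_ref`, and `decode_ref` are hypothetical, the `hash_id` value used when calling it is made up, and the sketch only splits `ReferenceBytes` into `hash_id` and `digest`. Full validation of `ReferenceBytes` (for example, checking the digest length against the hash family) is assumed to live in an `ENC/ASL1-CORE` implementation.

```python
import struct


class EncodingError(ValueError):
    """Malformed input bytes (hypothetical name, not defined by this spec)."""


def encode_ref(hash_id: int, digest: bytes) -> bytes:
    """Build EncodedRef = ref_len (u32, big-endian) || ReferenceBytes."""
    ref_bytes = struct.pack(">H", hash_id) + digest  # hash_id (u16) || digest
    return struct.pack(">I", len(ref_bytes)) + ref_bytes


def decode_ref(buf: bytes, offset: int = 0):
    """Read one EncodedRef at `offset`.

    Returns ((hash_id, digest), next_offset) or raises EncodingError.
    """
    if len(buf) - offset < 4:
        raise EncodingError("truncated ref_len")
    (ref_len,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    if ref_len < 2:
        raise EncodingError("ref_len < 2")
    if len(buf) - offset < ref_len:
        raise EncodingError("truncated ref_bytes")
    ref_bytes = buf[offset:offset + ref_len]
    (hash_id,) = struct.unpack_from(">H", ref_bytes, 0)
    # Assumption: the digest is everything after hash_id; its length is
    # not checked here.
    return (hash_id, ref_bytes[2:]), offset + ref_len
```

Because the length prefix precedes the content, `decode_ref` never needs to look ahead or backtrack, and it returns the offset of the next field so callers can chain reads.
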

#### 2.5.1 Optional EncodedRef

For optional `Reference` fields, we use:

```text
OptionalEncodedRef =
  has_ref (u8)
  [ EncodedRef ]   // only if has_ref = 0x01
```

* `has_ref = 0x00` → no value present; no `EncodedRef` follows.
* `has_ref = 0x01` → exactly one `EncodedRef` follows.

Other `has_ref` values MUST be treated as encoding errors.

---

## 3. Logical Model Reference

This section restates the logical structures from `PEL/TRACE-DAG/1` (source of truth) in condensed form.

### 3.1 DiagnosticEntry

```text
DiagnosticEntry {
  code:    uint32        // diagnostic or error code
  message: OctetString   // typically UTF-8 text; interpretation is profile-specific
}
```

### 3.2 NodeTraceStatus

```text
NodeTraceStatus = uint8

NodeTraceStatus {
  NODE_OK      = 0
  NODE_FAILED  = 1
  NODE_SKIPPED = 2
}
```

### 3.3 NodeTraceDAG

```text
NodeTraceDAG {
  node_id:     NodeId            // uint32
  op_name:     string
  op_version:  uint32
  status:      NodeTraceStatus
  status_code: uint32            // 0 = success; non-zero = op-specific failure code
  output_refs: list<Reference>
  diagnostics: list<DiagnosticEntry>
}
```

### 3.4 TraceDAGValue

```text
TraceDAGValue {
  pel1_version: uint16              // MUST be 1 for this version
  scheme_ref:   Reference           // SchemeRef_DAG_1
  program_ref:  Reference           // Program Artifact reference
  status:       ExecutionStatus
  summary:      ExecutionErrorSummary {
    kind:        ExecutionErrorKind
    status_code: uint32
  }
  exec_result_ref: optional Reference
  input_refs:      list<Reference>
  params_ref:      optional Reference
  node_traces:     list<NodeTraceDAG>   // one per Node in canonical node order
}
```

All semantics (how these fields are populated for different run outcomes) are defined in `PEL/TRACE-DAG/1`.

---

## 4. Encoding

### 4.1 DiagnosticEntry encoding

Logical:

```text
DiagnosticEntry {
  code:    uint32
  message: OctetString
}
```

Canonical encoding:

```text
DiagnosticEntryBytes ::
  code    (u32)
  message (Blob32)
```

Where `Blob32` is as defined in §2.4.

* `code (u32)` encodes the diagnostic or error code.
* `message` is an opaque byte blob; profile users MAY agree to use UTF-8 here, but this encoding does not enforce it.

Decoders MUST:

* Read `code (u32)`.
* Read `message` as `Blob32`.
* Treat truncated blobs as encoding errors.

---

### 4.2 NodeTraceDAG encoding

Logical:

```text
NodeTraceDAG {
  node_id:     NodeId
  op_name:     string
  op_version:  uint32
  status:      NodeTraceStatus
  status_code: uint32
  output_refs: list<Reference>
  diagnostics: list<DiagnosticEntry>
}
```

Canonical encoding:

```text
NodeTraceDAGBytes ::
  node_id          (u32)
  op_name          (Utf8String)
  op_version       (u32)
  status           (u8)    // NodeTraceStatus
  status_code      (u32)
  output_ref_count (u32)
  output_refs      (EncodedRef[0..output_ref_count-1])
  diag_count       (u32)
  diagnostics      (DiagnosticEntryBytes[0..diag_count-1])
```

Field semantics:

1. `node_id (u32)`

   * Encodes `NodeTraceDAG.node_id`.

2. `op_name (Utf8String)`

   * Encodes `NodeTraceDAG.op_name` as UTF-8 (see §2.3).

3. `op_version (u32)`

   * Encodes `NodeTraceDAG.op_version`.

4. `status (u8)`

   * Encodes `NodeTraceStatus`:

     ```text
     0x00 -> NODE_OK
     0x01 -> NODE_FAILED
     0x02 -> NODE_SKIPPED
     ```

   * Other values MUST be treated as encoding errors.

5. `status_code (u32)`

   * MUST be:

     * `0` if `status = NODE_OK` or `NODE_SKIPPED`.
     * non-zero if `status = NODE_FAILED`.
   * Conformance to these rules is a semantic requirement, not a decoding requirement. Decoders MAY choose to validate and reject inconsistent encodings.

6. `output_ref_count (u32)` and `output_refs`

   * Number of output references for this node.
   * Each entry is an `EncodedRef` (§2.5).

7. `diag_count (u32)` and `diagnostics`

   * Number of diagnostic entries.
   * Each entry is encoded using `DiagnosticEntryBytes` (§4.1).

#### 4.2.1 Canonical ordering of node_traces

In `TraceDAGBytes`, the `node_traces` list (see §4.4) MUST:

* contain exactly one `NodeTraceDAG` for each Program `Node`, for runs where at least one node is attempted (see §5.3 for the structurally invalid Program case),
* appear in the canonical node order defined by `PEL/PROGRAM-DAG/1` (canonical topological order with `NodeId` as tie-breaker).

Encoders MUST enforce this; decoders MAY assume it.
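
As an illustration of the `NodeTraceDAGBytes` layout, the following Python sketch builds the bytes for a single node trace. It makes several assumptions: all function names are hypothetical, references are passed in as already-canonical `ReferenceBytes`, and diagnostics are passed as `(code, message)` pairs. The semantic rule tying `status_code` to `status` is deliberately not checked here, since the text above places it at the semantic layer.

```python
import struct


def u32(n: int) -> bytes:
    """Fixed-width big-endian u32 (§2.1)."""
    return struct.pack(">I", n)


def utf8_string(s: str) -> bytes:
    """Utf8String: byte length (u32) followed by the UTF-8 bytes (§2.3)."""
    data = s.encode("utf-8")
    return u32(len(data)) + data


def blob32(data: bytes) -> bytes:
    """Blob32: length (u32) followed by opaque bytes (§2.4)."""
    return u32(len(data)) + data


def encoded_ref(ref_bytes: bytes) -> bytes:
    """EncodedRef over caller-supplied canonical ReferenceBytes (§2.5)."""
    return u32(len(ref_bytes)) + ref_bytes


def diagnostic_entry(code: int, message: bytes) -> bytes:
    """DiagnosticEntryBytes: code (u32) || message (Blob32)."""
    return u32(code) + blob32(message)


def node_trace(node_id, op_name, op_version, status, status_code,
               output_refs, diagnostics) -> bytes:
    """NodeTraceDAGBytes, fields in the fixed order given in §4.2."""
    if status not in (0, 1, 2):  # NODE_OK, NODE_FAILED, NODE_SKIPPED
        raise ValueError("invalid NodeTraceStatus")
    parts = [
        u32(node_id),
        utf8_string(op_name),
        u32(op_version),
        struct.pack(">B", status),
        u32(status_code),
        u32(len(output_refs)),
        *(encoded_ref(r) for r in output_refs),   # list order preserved
        u32(len(diagnostics)),
        *(diagnostic_entry(c, m) for c, m in diagnostics),
    ]
    return b"".join(parts)
```

Each count is derived from the list it precedes, so the count and the number of encoded elements cannot disagree in the output of this sketch.
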

---

### 4.3 Summary and status encoding

From `TraceDAGValue`:

```text
summary: ExecutionErrorSummary {
  kind:        ExecutionErrorKind
  status_code: uint32
}
status: ExecutionStatus
```

These are encoded in `TraceDAGBytes` as:

```text
status              (u8)    // ExecutionStatus
summary_kind        (u8)    // ExecutionErrorKind
summary_status_code (u32)
```

The exact value sets of `ExecutionStatus` and `ExecutionErrorKind` are defined in `PEL/1-CORE` and `PEL/TRACE-DAG/1`; this spec treats them as small enums in `u8` space.

Decoders MUST:

* Read `status (u8)`, `summary_kind (u8)`, `summary_status_code (u32)` as raw fields.
* Treat `status` and `summary_kind` values outside the agreed ranges as encoding errors or map them to an “unknown” variant at the semantic layer.

---

### 4.4 TraceDAGValue encoding

Logical:

```text
TraceDAGValue {
  pel1_version:    uint16
  scheme_ref:      Reference
  program_ref:     Reference
  status:          ExecutionStatus
  summary:         ExecutionErrorSummary
  exec_result_ref: optional Reference
  input_refs:      list<Reference>
  params_ref:      optional Reference
  node_traces:     list<NodeTraceDAG>
}
```

Canonical encoding:

```text
TraceDAGBytes ::
  pel1_version        (u16)
  scheme_ref          (EncodedRef)
  program_ref         (EncodedRef)
  status              (u8)     // ExecutionStatus
  summary_kind        (u8)     // ExecutionErrorKind
  summary_status_code (u32)    // ExecutionErrorSummary.status_code
  has_exec_result     (u8)
  [ exec_result       (EncodedRef) ]   // if has_exec_result == 0x01
  input_ref_count     (u32)
  input_refs          (EncodedRef[0..input_ref_count-1])
  has_params_ref      (u8)
  [ params_ref        (EncodedRef) ]   // if has_params_ref == 0x01
  node_trace_count    (u32)
  node_traces         (NodeTraceDAGBytes[0..node_trace_count-1])
```

Field semantics:

1. `pel1_version (u16)`

   * MUST be `1` for traces produced under `PEL/TRACE-DAG/1 v0.1.0`.
   * Decoders:

     * MUST accept `pel1_version = 1`.
     * MUST treat other values as encoding errors for this profile revision.

2. `scheme_ref (EncodedRef)`

   * Encodes the `Reference` to the scheme descriptor; for this profile MUST be `SchemeRef_DAG_1`.

3. `program_ref (EncodedRef)`

   * Encodes the `Reference` of the Program Artifact executed in this run.

4. `status`, `summary_kind`, `summary_status_code (u32)`

   * As described in §4.3.

5. `has_exec_result (u8)` and `exec_result (EncodedRef)`

   * Encodes `exec_result_ref : optional Reference`:

     * `has_exec_result = 0x00` → absent, no `exec_result` bytes follow.
     * `has_exec_result = 0x01` → exactly one `EncodedRef` follows, encoding `exec_result_ref`.
   * Other values MUST be treated as encoding errors.

6. `input_ref_count (u32)` and `input_refs (EncodedRef[..])`

   * Number and encoded values of the `input_refs` list.
   * Encodes `TraceDAGValue.input_refs` in order.

7. `has_params_ref (u8)` and `params_ref (EncodedRef)`

   * Encodes `params_ref : optional Reference`:

     * `has_params_ref = 0x00` → absent.
     * `has_params_ref = 0x01` → present, encoded as `EncodedRef`.
   * Other values MUST be treated as encoding errors.

8. `node_trace_count (u32)` and `node_traces (NodeTraceDAGBytes[..])`

   * Number and encoded values of `node_traces`.
   * **Canonical requirement:** encoders MUST set `node_trace_count` equal to the number of `Node`s in the Program (for runs where at least one node is attempted), and MUST encode node traces in canonical node order (§4.2.1).

---

## 5. Canonicality & Injectivity

### 5.1 Injectivity

The mapping:

```text
TraceDAGValue -> TraceDAGBytes
```

defined by this profile MUST be **injective**:

* If `T1 != T2` as logical `TraceDAGValue` instances (per `PEL/TRACE-DAG/1`), then their encodings MUST differ:

  ```text
  T1 != T2  ⇒  TraceDAGBytes(T1) != TraceDAGBytes(T2)
  ```

This is ensured by:

* fixed field ordering and explicit presence flags,
* deterministic list ordering,
* inclusion of all logically relevant fields.

### 5.2 Stability

The same logical trace value MUST always yield the same `TraceDAGBytes` across:

* implementations,
* platforms,
* time.
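
To make the stability and injectivity properties concrete, here is a minimal Python sketch of the top-level `TraceDAGBytes` layout from §4.4. The function names are hypothetical, references are assumed to arrive as canonical `ReferenceBytes`, and `node_traces` is assumed to be a list of already-encoded `NodeTraceDAGBytes` in canonical node order. Nothing in the function depends on configuration, platform, or time, which is what the stability requirement asks for.

```python
import struct

PEL1_VERSION = 1  # the only value accepted by this profile revision


def u32(n: int) -> bytes:
    """Fixed-width big-endian u32."""
    return struct.pack(">I", n)


def encoded_ref(ref_bytes: bytes) -> bytes:
    """EncodedRef: ref_len (u32) || canonical ReferenceBytes."""
    return u32(len(ref_bytes)) + ref_bytes


def optional_encoded_ref(ref_bytes) -> bytes:
    """Presence flag 0x00 (absent) or 0x01 followed by one EncodedRef."""
    if ref_bytes is None:
        return b"\x00"
    return b"\x01" + encoded_ref(ref_bytes)


def encode_trace_dag(scheme_ref, program_ref, status, summary_kind,
                     summary_status_code, exec_result_ref, input_refs,
                     params_ref, node_traces) -> bytes:
    """Concatenate the TraceDAGBytes fields in their fixed order."""
    parts = [
        struct.pack(">H", PEL1_VERSION),
        encoded_ref(scheme_ref),
        encoded_ref(program_ref),
        # status (u8), summary_kind (u8), summary_status_code (u32);
        # ">" disables alignment padding, so this is exactly 6 bytes.
        struct.pack(">BBI", status, summary_kind, summary_status_code),
        optional_encoded_ref(exec_result_ref),
        u32(len(input_refs)),
        *(encoded_ref(r) for r in input_refs),
        optional_encoded_ref(params_ref),
        u32(len(node_traces)),
        *node_traces,
    ]
    return b"".join(parts)
```

Two traces that differ only in which optional reference is present (for example `exec_result_ref` set versus `params_ref` set to the same bytes) encode differently under this sketch, because each presence flag sits at a fixed position relative to the fields before it.
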

Encoders MUST NOT:

* reorder any list elements (e.g., `input_refs`, `node_traces`, `output_refs`, `diagnostics`),
* introduce alternative encodings for integers or strings,
* omit or reorder fields.

### 5.3 Node ordering

For runs where at least one node is attempted and the Program is structurally valid:

* `node_traces` MUST have exactly one entry per Program `Node` (per `PEL/PROGRAM-DAG/1`) in canonical node order.

If the Program cannot be decoded or is structurally invalid:

* `node_traces` MAY be empty; if non-empty, any node entries MUST still follow canonical node order for the subset present.

---

## 6. Trace Artifact Binding

### 6.1 TypeTag

Trace Artifacts for this profile MUST be ASL/1 Artifacts with:

```text
Artifact {
  bytes    = TraceDAGBytes
  type_tag = TYPE_TAG_PEL_TRACE_DAG_1
}
```

Where:

* `TYPE_TAG_PEL_TRACE_DAG_1` is a `TypeTag` with a concrete `tag_id` assigned in the global TypeTag registry for DAG traces.

This encoding profile:

* Refers to `TYPE_TAG_PEL_TRACE_DAG_1` symbolically.
* Does not assign a numeric `tag_id`; that is handled in a separate registry.

### 6.2 Identity via ASL/1-CORE

With `ENC/ASL1-CORE v1` and `"ASL1"` hashes (`HASH/ASL1`):

1. Canonical `ArtifactBytes` for a trace Artifact:

   ```text
   ArtifactBytes = encode_artifact_core_v1(
     Artifact{
       bytes    = TraceDAGBytes,
       type_tag = TYPE_TAG_PEL_TRACE_DAG_1
     }
   )
   ```

2. Canonical `Reference` for the trace Artifact under some `HashId = HID`:

   ```text
   digest    = H(ArtifactBytes)   // H from HASH/ASL1, for HID
   reference = Reference { hash_id = HID, digest = digest }
   ```

All conformant implementations MUST agree on:

* `TraceDAGBytes` for a given logical `TraceDAGValue`,
* `ArtifactBytes` for the resulting trace Artifact,
* the resulting `Reference` for any fixed `HashId` and hash algorithm.

---

## 7. Error Handling (Encoding Layer)

Decoders for this profile MUST treat as **encoding errors**:

1. Truncated values:

   * Any attempt to read a declared integer, length-prefixed blob (`Blob32`, `Utf8String`, `EncodedRef`), or list entry that runs out of bytes.

2. Invalid `pel1_version`:

   * `pel1_version != 1`.

3. Invalid `NodeTraceStatus`:

   * `status` not in `{ 0x00, 0x01, 0x02 }`.

4. Invalid optional flags:

   * `has_exec_result` or `has_params_ref` not in `{ 0x00, 0x01 }`.

5. Invalid `EncodedRef`:

   * `ref_len < 2`, or
   * `ref_bytes` cannot be decoded as `ReferenceBytes` (per `ENC/ASL1-CORE v1.0.3`).

6. Invalid `Utf8String` in `op_name`:

   * `op_name` bytes not valid UTF-8.

7. Inconsistent list counts:

   * Not enough elements following a list count (e.g. `input_ref_count`, `node_trace_count`, `output_ref_count`, `diag_count`).

Mapping from these encoding errors to external error codes (e.g. `ERR_PEL_TRACE_ENC_INVALID`) is implementation-specific.

Semantic inconsistencies (e.g. mismatched `summary` vs `status`) are semantic-layer issues; decoders MAY validate them but are not required to at the encoding layer.

---

## 8. Streaming & Implementation Notes

Implementation requirements:

* **Single-pass encoding**:

  * Encoders MUST be able to generate `TraceDAGBytes` in a single forward pass over the logical `TraceDAGValue`, assuming they have the structure in memory.
  * They MAY need to precompute counts or sizes (e.g., `node_trace_count`), but this is standard.

* **Single-pass decoding**:

  * Decoders MUST be able to decode `TraceDAGBytes` in a single forward pass, with no backtracking.
  * All length prefixes appear before their content.

For large traces:

* Implementations MAY:

  * stream `NodeTraceDAGBytes` entries to consumers as they decode,
  * stream diagnostic message blobs.
* They MUST ensure that any observable behavior (including error reporting and any reconstructed `TraceDAGValue`) is independent of chunking or I/O strategy.

---

## 9. Conformance

An implementation is **ENC/PEL-TRACE-DAG/1–conformant** if it:

1. **Implements the encoding layout**

   * Encodes and decodes `TraceDAGBytes` exactly as described in §4.
   * Treats `pel1_version = 1` as the only supported version for this profile revision.
   * Enforces validity of discriminants and presence flags at the encoding layer.

2. **Preserves canonical ordering**

   * When encoding, preserves:

     * order of `input_refs`,
     * canonical order of `node_traces` (per `PEL/PROGRAM-DAG/1`),
     * order of `output_refs` and `diagnostics` within each `NodeTraceDAG`.

3. **Uses canonical sub-encodings**

   * Uses `Utf8String` and `Blob32` exactly as in §2.
   * Uses `EncodedRef` as defined in §2.5, with `ReferenceBytes` from `ENC/ASL1-CORE v1.0.3`.

4. **Ensures injectivity & stability**

   * Ensures distinct logical `TraceDAGValue`s produce distinct `TraceDAGBytes`.
   * Ensures the same logical value always encodes to the same bytes (no configuration affecting layout).

5. **Binds to trace Artifacts correctly**

   * When forming trace Artifacts for `PEL/TRACE-DAG/1`, sets:

     * `Artifact.bytes = TraceDAGBytes`
     * `Artifact.type_tag = TYPE_TAG_PEL_TRACE_DAG_1`
   * Uses `ENC/ASL1-CORE v1` and `HASH/ASL1` for identity.

Everything else — storage, transport, policy, and graph interpretation — is delegated to other specifications.

---

## 10. Informative Example

> This example illustrates field layout only.
> Hex and values are illustrative, not normative test vectors.

Assume a simple run:

* `pel1_version = 1`
* `scheme_ref = S` (an ASL/1 Reference)
* `program_ref = P`
* `status = OK (0)`
* `summary.kind = NONE (0)`, `summary.status_code = 0`
* `exec_result_ref = R`
* Inputs: three `Reference`s `[I0, I1, I2]`
* No params (`params_ref = absent`)
* Program has two nodes in canonical order, with traces:

  * Node 1: OK, produced one output `[O0]`
  * Node 2: OK, produced one output `[O1]`

Simplified encoding sketch:

```text
pel1_version        = 0001              ; u16
scheme_ref          = EncodedRef(S)     ; 4-byte length + ReferenceBytes(S)
program_ref         = EncodedRef(P)
status              = 00                ; OK
summary_kind        = 00                ; NONE
summary_status_code = 00000000          ; status_code = 0
has_exec_result     = 01
exec_result         = EncodedRef(R)
input_ref_count     = 00000003
input_refs          = EncodedRef(I0)
                      EncodedRef(I1)
                      EncodedRef(I2)
has_params_ref      = 00                ; none
node_trace_count    = 00000002

; NodeTrace #0
  node_id          = 00000001
  op_name          = 00000005 "add64"
  op_version       = 00000001
  status           = 00                 ; NODE_OK
  status_code      = 00000000
  output_ref_count = 00000001
  output_refs      = EncodedRef(O0)
  diag_count       = 00000000

; NodeTrace #1
  node_id          = 00000002
  op_name          = 00000005 "mul64"
  op_version       = 00000001
  status           = 00                 ; NODE_OK
  status_code      = 00000000
  output_ref_count = 00000001
  output_refs      = EncodedRef(O1)
  diag_count       = 00000000
```

Where each `EncodedRef(X)` is:

```text
ref_len(X) (u32) || ReferenceBytes(X)
```

with `ReferenceBytes(X)` = `hash_id (u16)` + `digest` bytes per `ENC/ASL1-CORE v1`.

All conformant encoders MUST produce the same `TraceDAGBytes` for this logical trace value; all conformant decoders MUST reconstruct the same `TraceDAGValue`.

---

**End of `ENC/PEL-TRACE-DAG/1 v0.1.0 — Canonical Encoding for DAG Execution Traces`**

---

## Document History

* **0.1.0 (2025-11-16):** Registered as Tier-1 spec and aligned to the Amduat 2.0 substrate baseline.