# ENC/PEL-PROGRAM-DAG/1 — Canonical Encoding for DAG Programs

Status: Approved
Owner: Niklas Rydberg
Version: 0.2.0
SoT: Yes
Last Updated: 2025-11-16
Linked Phase Pack: N/A
Tags: [binary-minimalism, deterministic]

<!-- Source: /amduat/docs/new/enc-pel-program-dag-1.md | Canonical: /amduat/tier1/enc-pel-program-dag-1.md -->

**Document ID:** `ENC/PEL-PROGRAM-DAG/1`
**Profile ID:** `PEL_ENC_PROGRAM_DAG_V1 = 0x0101`
**Layer:** Scheme Encoding Profile (on top of ASL/1-CORE + PEL/PROGRAM-DAG/1)

**Depends on (normative):**

* `ASL/1-CORE v0.4.x` — value model (`Artifact`, `TypeTag`, `Reference`, integers, `OctetString`)
* `ENC/ASL1-CORE v1.0.3` — canonical encoding conventions (integers, `OctetString`, streaming constraints)
* `PEL/PROGRAM-DAG/1 v0.3.1` — DAG Program scheme (`Program` / `Node` model, semantics, canonical topological order)

**Integrates with (informative):**

* `PEL/1-CORE v0.3.x` — primitive execution layer (`Exec_s` / `Exec_DAG`)
* `PEL/1-SURF v0.2.x` — store-backed execution surface
* `HASH/ASL1 v0.2.4` — reference formation over canonical encodings
* TypeTag registry (for `TYPE_TAG_PEL_PROGRAM_DAG_1`)
* Operation registries (e.g. `OPREG/PEL1-KERNEL` and param profiles)

> **Note:** The Profile ID `PEL_ENC_PROGRAM_DAG_V1` is a configuration label.
> It is **not** embedded in the encoded Program bytes. Selection of this encoding profile is done by context (scheme descriptor, store or engine configuration), not per value.

© 2025 Niklas Rydberg.

## License

Except where otherwise noted, this document (text and diagrams) is licensed under
the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
Universal (CC0) to enable unrestricted reuse in implementations and derivative
specifications.

Code examples in this document are provided under the Apache License 2.0 unless
explicitly stated otherwise. Test vectors, where present, are dedicated to the
public domain under CC0 1.0.


---

## 0. Overview

`ENC/PEL-PROGRAM-DAG/1` defines the **canonical binary encoding** of the `Program` structure defined in `PEL/PROGRAM-DAG/1`:

```text
Program {
  nodes: list<Node>
  roots: list<RootRef>
}
```

and its sub-structures:

* `OperationId`
* `DagInputExternal`, `DagInputNode`, `DagInput`
* `Node`
* `RootRef`

This encoding:

* is **injective** with respect to the logical `Program` model — distinct Programs → distinct byte strings under this profile,
* is **stable and deterministic** across implementations and time,
* is **streaming-friendly** — encoders and decoders can operate in a single forward pass,
* fixes a **canonical topological ordering** for `nodes` (matching the scheme spec).

The result is used as the payload (`Artifact.bytes`) of Program Artifacts for the `PEL/PROGRAM-DAG/1` scheme, with:

```text
Artifact.type_tag = TYPE_TAG_PEL_PROGRAM_DAG_1
Artifact.bytes = ProgramBytes
```

Identity of Program Artifacts is then derived via `HASH/ASL1` over `ArtifactBytes` as usual.

---

## 1. Scope & Layering

### 1.1 Purpose

This specification defines:

* The concrete binary layout of:

  * `ProgramBytes`
  * `NodeBytes`
  * `DagInputBytes`
  * `RootRefBytes`
  * `OperationIdBytes`

* Canonicalization rules:

  * Node ordering (canonical topological order),
  * fixed field ordering,
  * integer widths and encodings.

It does **not** define:

* The logical semantics of Programs (DAG evaluation, error statuses, etc.) — those are in `PEL/PROGRAM-DAG/1` and `PEL/1-CORE`.
* The ASL/1 `Artifact` or `Reference` layouts — those are in `ASL/1-CORE` and `ENC/ASL1-CORE`.
* How Programs are used in store-backed execution — that belongs to `PEL/1-SURF`.

### 1.2 Layering constraints

In line with `SUBSTRATE/STACK-OVERVIEW`:

* `ENC/PEL-PROGRAM-DAG/1` is a **scheme-specific encoding profile**.

* It MUST NOT redefine:

  * `Artifact`, `TypeTag`, `Reference`, or `HashId` (from `ASL/1-CORE`),
  * the logical `Program` / `Node` / `DagInput` / `RootRef` model (from `PEL/PROGRAM-DAG/1`).

* It is **storage-neutral** and **policy-neutral**.

* It defines exactly one canonical encoding for `Program` values for this scheme under the profile ID `PEL_ENC_PROGRAM_DAG_V1`.

---

## 2. Conventions

RFC 2119 terms (**MUST**, **SHOULD**, **MAY**, etc.) are normative.

### 2.1 Integer encodings

All multi-byte integers are encoded as **big-endian** (network byte order), as in `ENC/ASL1-CORE`:

* `u8` — 1 byte
* `u16` — 2 bytes
* `u32` — 4 bytes
* `u64` — 8 bytes

Only **fixed-width** integers are used in this specification.

### 2.2 Utf8String

This specification defines a canonical `Utf8String` encoding:

```text
Utf8String = length (u32) || bytes[0..length-1]
```

* `length` is the number of bytes of UTF-8 data.
* `length` MAY be zero.
* Decoders MUST validate that the byte sequence is well-formed UTF-8.
* There is no padding or terminator.

All strings in this profile (`OperationId.name`) are encoded as `Utf8String`.
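The `Utf8String` layout can be sketched in Python as follows. This is an illustrative, non-normative sketch: the function names `encode_utf8_string` and `decode_utf8_string` and the use of `ValueError` for failures are assumptions made for this example, not part of the profile.

```python
import struct


def encode_utf8_string(text: str) -> bytes:
    """Encode text as Utf8String: u32 big-endian byte length, then the UTF-8 bytes."""
    data = text.encode("utf-8")
    if len(data) > 0xFFFFFFFF:
        raise ValueError("string too long for a u32 length prefix")
    return struct.pack(">I", len(data)) + data


def decode_utf8_string(buf: bytes, offset: int = 0) -> tuple[str, int]:
    """Decode a Utf8String starting at offset; return (text, next_offset)."""
    if len(buf) - offset < 4:
        raise ValueError("truncated Utf8String length")
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("truncated Utf8String bytes")
    try:
        # Strict decoding rejects byte sequences that are not well-formed UTF-8.
        text = buf[start:end].decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("invalid UTF-8 in Utf8String") from exc
    return text, end
```

The length prefix counts bytes, not characters, so a name containing non-ASCII characters has a `length` larger than its character count.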

### 2.3 Parameter bytes

For operation parameters, this profile uses a compact `ParamsBytes` encoding:

```text
ParamsBytes = length (u32) || bytes[0..length-1]
```

* `bytes` is an opaque blob whose interpretation is defined per operation in the operation registry.
* `length` MAY be zero (empty params).

> This differs from the general `OctetString` encoding (which uses `u64` length and is defined in `ENC/ASL1-CORE`).
> Using `u32` length here is acceptable because this structure lives *inside* the Program Artifact payload, not as an ASL/1 top-level value.

#### 2.3.1 Parameter profiles and canonicality

When a `Program` is interpreted under a concrete operation registry + parameter profile set:

* For each operation `(name, version)`:

  * There MUST be a well-defined abstract parameter model (`ParamsValue`).
  * There MUST be exactly one canonical encode/decode pair:

    ```text
    encode_params_op : ParamsValue -> ParamsBytes
    decode_params_op : ParamsBytes -> ParamsValue | error
    ```

* All conformant implementations of that operation MUST:

  * decode `ParamsBytes` into the same `ParamsValue`, or
  * deterministically detect decoding failures.

Kernel parameter profiles (e.g. `OPREG/PEL1-KERNEL-PARAMS/1`) MUST ensure:

* `encode_params_op` followed by `decode_params_op` round-trips exactly.
* Any `ParamsBytes` that fails `decode_params_op` MUST be treated as a **program-level validation error** (`INVALID_PROGRAM`) under `PEL/PROGRAM-DAG/1`, not as a runtime failure.
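To make the round-trip requirement concrete, here is a minimal sketch of a parameter profile for a purely hypothetical operation whose `ParamsValue` is a single unsigned 64-bit integer. Neither the operation nor the names `encode_params_const_u64`, `decode_params_const_u64`, and `InvalidProgram` come from any registry; they are assumptions for illustration. The functions work on the opaque blob only, that is, the bytes that follow the `u32` length prefix.

```python
import struct


class InvalidProgram(Exception):
    """Stand-in for the scheme-level INVALID_PROGRAM status (hypothetical name)."""


def encode_params_const_u64(value: int) -> bytes:
    """Hypothetical profile: the blob is exactly one big-endian u64."""
    if not 0 <= value <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("value out of u64 range")
    return struct.pack(">Q", value)


def decode_params_const_u64(blob: bytes) -> int:
    """Accept only the single canonical form; anything else is a validation error."""
    if len(blob) != 8:
        raise InvalidProgram("params blob must be exactly 8 bytes")
    return struct.unpack(">Q", blob)[0]
```

Because the decoder accepts exactly one byte form per value, `decode(encode(v)) == v` holds for every valid value and `encode(decode(b)) == b` holds for every accepted blob, which is what keeps the overall Program encoding injective.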

### 2.4 Lists

A list of values of some type `T` is encoded as:

```text
List<T> = count (u32) || element_0 || element_1 || ... || element_{count-1}
```

* `count` is the number of elements (MAY be zero).
* Elements are encoded in order, using the canonical encoding of `T`.
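The list layout amounts to a count prefix followed by the concatenated element encodings. A minimal sketch, assuming hypothetical helper names (`encode_list`, `decode_list`, `encode_u32`, `decode_u32`) and a decoder convention of returning `(value, next_offset)`:

```python
import struct
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def encode_u32(value: int) -> bytes:
    return struct.pack(">I", value)


def decode_u32(buf: bytes, offset: int) -> Tuple[int, int]:
    if len(buf) - offset < 4:
        raise ValueError("truncated u32")
    return struct.unpack_from(">I", buf, offset)[0], offset + 4


def encode_list(items: Sequence[T], encode_item: Callable[[T], bytes]) -> bytes:
    """count (u32) followed by each element's encoding, in list order."""
    return encode_u32(len(items)) + b"".join(encode_item(item) for item in items)


def decode_list(
    buf: bytes, offset: int, decode_item: Callable[[bytes, int], Tuple[T, int]]
) -> Tuple[List[T], int]:
    """Read count, then exactly that many elements; return (items, next_offset)."""
    count, offset = decode_u32(buf, offset)
    items: List[T] = []
    for _ in range(count):
        item, offset = decode_item(buf, offset)
        items.append(item)
    return items, offset
```

Elements carry no per-element framing here, so the element decoder must itself know where each element ends.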

### 2.5 Encoding version field

`ProgramBytes` includes a `program_version (u16)` field:

* In this profile, `program_version` MUST be `1`.
* Any future incompatible change to the layout of `ProgramBytes` under the same profile ID MUST be reflected by a new `program_version` value (and corresponding decoder support).
* Adding fields in a backward-compatible, strictly-append-only way SHOULD be done via a new encoding profile rather than overloading `program_version = 1`.

Decoders MUST reject any `program_version` they do not implement.

---

## 3. Logical Model Reference

For convenience, the logical types from `PEL/PROGRAM-DAG/1` are restated informally (normative source remains that document):

```text
OperationId {
  name: string
  version: uint32
}

DagInputExternal {
  input_index: uint32
}

DagInputNode {
  node_id: NodeId // uint32
  output_index: uint32
}

DagInput =
    DagInputExternal
  | DagInputNode

Node {
  id: NodeId // uint32
  op: OperationId
  inputs: list<DagInput>
  params: Params // abstract; serialized as ParamsBytes in this profile
}

RootRef {
  node_id: NodeId // uint32
  output_index: uint32
}

Program {
  nodes: list<Node>
  roots: list<RootRef>
}

NodeId = uint32
```

`PEL/PROGRAM-DAG/1` further defines:

* Structural validity rules (unique NodeIds, acyclicity, etc.).
* Canonical topological order of Nodes.

This encoding profile assumes the logical model and validity rules as given there. It **does not** re-check them at the encoding layer; that is a scheme-level responsibility.

---

## 4. Program Encoding

### 4.1 Program header and overall layout

The canonical encoding of a `Program` value is:

```text
ProgramBytes ::
  program_version (u16)
  node_count (u32)
  nodes (NodeBytes[0..node_count-1])
  root_count (u32)
  roots (RootRefBytes[0..root_count-1])
```

Constraints:

* `program_version` MUST currently be `1`.

  Decoders:

  * MUST accept `program_version = 1`.
  * MUST reject any other value as an unsupported encoding version.

* `node_count` and `root_count` are the number of elements in the corresponding lists.

* `nodes` MUST be encoded in the **canonical topological order** defined in `PEL/PROGRAM-DAG/1 §4`. Encoders MUST perform this ordering; decoders MAY rely on it but are not required to re-check DAG properties.

* `roots` MUST be encoded in the same order as the logical `Program.roots` list.

### 4.2 Node encoding

Each `Node` is encoded as:

```text
NodeBytes ::
  node_id (u32)
  op_name (Utf8String)
  op_version (u32)
  input_count (u32)
  inputs (DagInputBytes[0..input_count-1])
  params_len (u32)
  params_bytes (byte[0..params_len-1])
```

Field meanings:

1. `node_id (u32)`

   * Encodes `Node.id`.
   * MUST be unique across all Nodes in a Program (scheme-level requirement).

2. `op_name (Utf8String)`

   * Encodes `OperationId.name` as UTF-8.

3. `op_version (u32)`

   * Encodes `OperationId.version`.

4. `input_count (u32)`

   * Number of input references consumed by this Node.

5. `inputs`

   * Exactly `input_count` entries, each encoded as `DagInputBytes` (see §4.3) in order.

6. `params_len (u32)` and `params_bytes`

   * `params_len` = length of the operation-specific parameter blob.
   * `params_bytes` is an opaque blob whose interpretation is defined by the operation’s parameter profile.
   * A `params_len` of `0` encodes an empty parameter blob.

> **Injectivity requirement (Node)**
> For a fixed interpretation of `ParamsBytes` and `OperationId`, distinct logical `Node` values (differing in any field) MUST produce distinct `NodeBytes`.
> Given `NodeBytes` and the corresponding operation registry, a canonical decoder MUST reconstruct exactly the same logical `Node`.
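The `NodeBytes` field order can be sketched as a straightforward concatenation. The dataclasses and function names below are an illustrative in-memory model assumed for this example; the profile itself does not prescribe any particular representation, and `params` is taken to be the already-encoded, operation-specific blob.

```python
import struct
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class DagInputExternal:
    input_index: int


@dataclass(frozen=True)
class DagInputNode:
    node_id: int
    output_index: int


@dataclass(frozen=True)
class Node:
    id: int
    op_name: str
    op_version: int
    inputs: Tuple[Union[DagInputExternal, DagInputNode], ...]
    params: bytes = b""  # opaque blob produced by the operation's param profile


def u32(value: int) -> bytes:
    return struct.pack(">I", value)  # raises struct.error outside the u32 range


def encode_dag_input(item: Union[DagInputExternal, DagInputNode]) -> bytes:
    if isinstance(item, DagInputExternal):
        return b"\x00" + u32(item.input_index)
    if isinstance(item, DagInputNode):
        return b"\x01" + u32(item.node_id) + u32(item.output_index)
    raise TypeError("not a DagInput")


def encode_node(node: Node) -> bytes:
    name = node.op_name.encode("utf-8")
    parts = [
        u32(node.id),
        u32(len(name)), name,      # op_name as Utf8String
        u32(node.op_version),
        u32(len(node.inputs)),     # input_count
    ]
    parts.extend(encode_dag_input(item) for item in node.inputs)  # order preserved
    parts.extend([u32(len(node.params)), node.params])            # params_len, params_bytes
    return b"".join(parts)
```

Every variable-length field is preceded by its length or count, which is what lets a decoder recover each field boundary without ambiguity.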

### 4.3 DagInput encoding

Each `DagInput` is encoded as a tagged union:

```text
DagInputBytes ::
  kind (u8)
  payload(...)
```

Where `kind` is:

```text
0x00 => DagInputExternal
0x01 => DagInputNode
```

#### 4.3.1 DagInputExternal

For:

```text
DagInputExternal {
  input_index: uint32
}
```

Encoding:

```text
kind = 0x00
input_index (u32)
```

`input_index` is the 0-based index into the `inputs` list passed to `Exec_DAG`.

#### 4.3.2 DagInputNode

For:

```text
DagInputNode {
  node_id: NodeId // uint32
  output_index: uint32
}
```

Encoding:

```text
kind = 0x01
node_id (u32)
output_index (u32)
```

`node_id` MUST refer to some `Node.id` in the same Program (scheme-level validity).
`output_index` is the 0-based index into that Node’s output list.

#### 4.3.3 Decoder behavior

Decoders MUST:

* Treat any `kind` value other than `0x00` or `0x01` as an encoding error (invalid `DagInput`).
* For `kind = 0x00`, read exactly one `u32` as `input_index`.
* For `kind = 0x01`, read exactly two `u32` values (`node_id`, `output_index`).
* Reject truncated encodings (insufficient bytes for the payload).

Structural validity of indices (e.g., `node_id` existence, output arity) is enforced at the scheme level, not the encoding layer.
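The decoder rules for `DagInputBytes` can be sketched as below. The `EncodingError` class, the function name, and the tuple representation of the decoded value are assumptions made for this example.

```python
import struct
from typing import Tuple


class EncodingError(Exception):
    """Illustrative encoding-layer error (hypothetical name)."""


def decode_dag_input(buf: bytes, offset: int = 0) -> Tuple[tuple, int]:
    """Decode one DagInputBytes at offset; return (value, next_offset).

    value is ("external", input_index) or ("node", node_id, output_index).
    """
    if offset >= len(buf):
        raise EncodingError("truncated DagInput: missing kind byte")
    kind = buf[offset]
    if kind == 0x00:
        tag, fmt, size = "external", ">I", 4
    elif kind == 0x01:
        tag, fmt, size = "node", ">II", 8
    else:
        raise EncodingError(f"invalid DagInput kind 0x{kind:02x}")
    start = offset + 1
    if len(buf) - start < size:
        raise EncodingError("truncated DagInput payload")
    fields = struct.unpack_from(fmt, buf, start)
    # No check that node_id exists or that indices are in range:
    # that is scheme-level validation, not an encoding concern.
    return (tag, *fields), start + size
```

Keeping index validation out of this function mirrors the split between encoding errors and scheme-level validity described above.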

### 4.4 RootRef encoding

For:

```text
RootRef {
  node_id: NodeId
  output_index: uint32
}
```

Encoding:

```text
RootRefBytes ::
  node_id (u32)
  output_index (u32)
```

* `RootRefBytes` is identical to the payload of `DagInputNode`, but without a `kind` byte.
  Roots are always Node outputs, so no variant tag is needed.

* The `roots` list in `ProgramBytes` MUST encode each `RootRef` in the logical order of `Program.roots`.

Decoders MUST reject truncated entries (insufficient bytes for both `u32` values).

---

## 5. Canonicality Requirements

### 5.1 Node ordering in Program

Encoders MUST:

* encode `Program.nodes` in the **canonical topological order** defined by `PEL/PROGRAM-DAG/1 §4`:

  * Dependencies appear before dependents.
  * Ties are broken by smallest `NodeId` (numeric, ascending).

* ensure that the `node_count` written in `ProgramBytes` equals the number of encoded `NodeBytes`.

Decoders:

* MAY assume the encoded order corresponds to the canonical topological order.
* MAY perform additional checks (e.g., verifying acyclicity), but this is not required for basic decoding.
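One way to read the two ordering rules is as Kahn's algorithm with a min-heap of ready NodeIds. The sketch below follows that reading; it is an assumption about how the rules combine, and the normative definition remains `PEL/PROGRAM-DAG/1 §4`. The function name and the `deps` input shape (NodeId mapped to the set of NodeIds it consumes) are hypothetical.

```python
import heapq
from typing import Dict, List, Set


def canonical_order(deps: Dict[int, Set[int]]) -> List[int]:
    """Return NodeIds with dependencies first, ties broken by smallest NodeId."""
    remaining = {node: len(parents) for node, parents in deps.items()}
    dependents: Dict[int, List[int]] = {node: [] for node in deps}
    for node, parents in deps.items():
        for parent in parents:
            if parent not in deps:
                raise ValueError(f"node {node} references unknown node {parent}")
            dependents[parent].append(node)

    # Nodes with no unmet dependencies are "ready"; the heap yields the smallest id.
    ready = [node for node, count in remaining.items() if count == 0]
    heapq.heapify(ready)

    order: List[int] = []
    while ready:
        node = heapq.heappop(ready)
        order.append(node)
        for child in dependents[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                heapq.heappush(ready, child)

    if len(order) != len(deps):
        raise ValueError("cycle detected: not a DAG")
    return order
```

For example, with nodes 1, 2, and 3 where node 1 consumes an output of node 3, this sketch emits 2, then 3, then 1: node 2 and node 3 are both ready at the start, and the smaller id goes first.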

### 5.2 Field ordering

Field ordering in all structures is fixed and MUST NOT vary:

* `ProgramBytes` — `program_version`, `node_count`, `nodes…`, `root_count`, `roots…`
* `NodeBytes` — `node_id`, `op_name`, `op_version`, `input_count`, `inputs…`, `params_len`, `params_bytes…`
* `DagInputBytes` — `kind`, then variant-specific payload.
* `RootRefBytes` — `node_id`, `output_index`.

Any deviation MUST be treated as an encoding error.

### 5.3 Injectivity and stability

The mapping:

```text
Program -> ProgramBytes
```

defined by this profile MUST be:

* **Injective** — if `P1 != P2` as logical `Program` values (per `PEL/PROGRAM-DAG/1`), then `ProgramBytes(P1) != ProgramBytes(P2)`.

* **Stable** — the same logical `Program` MUST encode to the same `ProgramBytes` across:

  * different implementations,
  * platforms,
  * executions,
  * and times,

  given the same version of this encoding profile and the same underlying operation/param profiles.

Encoders MUST NOT:

* reorder Nodes other than by the canonical topological order,
* reorder inputs within a Node,
* reorder roots,
* introduce alternative encodings for integers, strings, or params.

---

## 6. Program Artifact Binding

### 6.1 TypeTag

Program Artifacts for this scheme MUST be encoded as:

```text
Artifact {
  bytes = ProgramBytes
  type_tag = TYPE_TAG_PEL_PROGRAM_DAG_1
}
```

Where:

* `TYPE_TAG_PEL_PROGRAM_DAG_1` is a `TypeTag` with a concrete `tag_id` assigned in the global TypeTag registry.

This encoding profile:

* uses `TYPE_TAG_PEL_PROGRAM_DAG_1` symbolically, and
* does not assign a specific numeric `tag_id`; that is done in a registry document.

### 6.2 Identity via ASL/1-CORE and HASH/ASL1

Given `ENC/ASL1-CORE v1` as the canonical encoding for `Artifact` and some chosen ASL1 hash algorithm `H` (e.g. `HASH-ASL1-256` under `HashId = 0x0001`):

1. The canonical `ArtifactBytes` for a Program Artifact is given by `ENC/ASL1-CORE v1`:

   ```text
   ArtifactBytes =
     encode_artifact_core_v1(
       Artifact{ bytes = ProgramBytes, type_tag = TYPE_TAG_PEL_PROGRAM_DAG_1 }
     )
   ```

2. The canonical `Reference` for that Artifact under `HashId = HID` is:

   ```text
   digest = H(ArtifactBytes)
   reference = Reference { hash_id = HID, digest = digest }
   ```

All conformant implementations MUST agree on:

* `ProgramBytes` for a given logical `Program`,
* `ArtifactBytes` for the Program Artifact,
* the resulting `Reference` for any fixed `(HashId, H)`.

---

## 7. Error Handling (Encoding Level)

This encoding profile defines only **structural encoding errors**. Handling of scheme-level validity errors (`INVALID_PROGRAM`, etc.) is done by `PEL/PROGRAM-DAG/1` and `PEL/1-CORE`.

Decoders MUST treat as encoding errors:

1. **Truncated fields**

   * Not enough bytes to read any declared integer, string, list, or params blob.

2. **Unsupported `program_version`**

   * `program_version != 1`.

3. **Invalid `DagInput.kind`**

   * `kind` is not `0x00` or `0x01`.

4. **Invalid `Utf8String`**

   * `op_name` bytes are not valid UTF-8.

5. **Inconsistent list lengths**

   * Fewer or more `NodeBytes` than indicated by `node_count`.
   * Fewer or more `RootRefBytes` than indicated by `root_count`.

These are **encoding-layer** issues. The exact error codes surfaced to callers (e.g., `ERR_PEL_ENC_INVALID`) are implementation-specific but MUST result in rejection of the Program bytes as malformed under this encoding profile.

---

## 8. Streaming and Implementation Notes

Implementations MUST be able to:

* Encode any `Program` using a single forward pass over the canonical node order:

  * compute canonical topological order first (requires holding `Program` structure),
  * then write fields in the order defined above.

* Decode any `ProgramBytes` sequentially:

  * no backtracking or multi-pass parsing is required,
  * all length prefixes appear before their content.

For very large Programs:

* Implementations MAY:

  * stream Nodes one by one into an internal representation,
  * stream `params_bytes` to a buffer or directly into an operation-registry decoder.

* They MUST ensure that any observable behavior (including error reporting) is independent of chunking or I/O strategy: two conformant decoders seeing the same `ProgramBytes` MUST reconstruct the same logical `Program` or the same encoding error.
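A single-pass decoder over a byte stream might look like the sketch below. All names (`EncodingError`, `read_exact`, `decode_program`, the dict-based result) are assumptions for illustration. Treating leftover bytes after the last root as an encoding error is this sketch's reading of the "inconsistent list lengths" rule in §7, not something the profile spells out in those words.

```python
import io
import struct
from typing import BinaryIO


class EncodingError(Exception):
    """Illustrative encoding-layer error (hypothetical name)."""


def read_exact(stream: BinaryIO, n: int) -> bytes:
    """Read exactly n bytes, tolerating short reads from the underlying stream."""
    chunks = []
    while n > 0:
        chunk = stream.read(n)
        if not chunk:
            raise EncodingError("truncated input")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_u32(stream: BinaryIO) -> int:
    return struct.unpack(">I", read_exact(stream, 4))[0]


def decode_node(stream: BinaryIO) -> dict:
    node_id = read_u32(stream)
    raw_name = read_exact(stream, read_u32(stream))
    try:
        op_name = raw_name.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise EncodingError("op_name is not valid UTF-8") from exc
    op_version = read_u32(stream)
    inputs = []
    for _ in range(read_u32(stream)):
        kind = read_exact(stream, 1)[0]
        if kind == 0x00:
            inputs.append(("external", read_u32(stream)))
        elif kind == 0x01:
            ref_node = read_u32(stream)
            inputs.append(("node", ref_node, read_u32(stream)))
        else:
            raise EncodingError(f"invalid DagInput kind 0x{kind:02x}")
    params = read_exact(stream, read_u32(stream))
    return {"id": node_id, "op": (op_name, op_version), "inputs": inputs, "params": params}


def decode_program(stream: BinaryIO) -> dict:
    version = struct.unpack(">H", read_exact(stream, 2))[0]
    if version != 1:
        raise EncodingError(f"unsupported program_version {version}")
    nodes = [decode_node(stream) for _ in range(read_u32(stream))]
    roots = []
    for _ in range(read_u32(stream)):
        root_node = read_u32(stream)
        roots.append((root_node, read_u32(stream)))
    if stream.read(1):
        raise EncodingError("trailing bytes after roots")  # assumption, see lead-in
    return {"nodes": nodes, "roots": roots}


# Usage: one node (id 7, op "id" v1, one external input, empty params), one root.
example = bytes.fromhex(
    "0001" "00000001"
    "00000007" "00000002" "6964" "00000001" "00000001" "0000000000" "00000000"
    "00000001" "00000007" "00000000"
)
program = decode_program(io.BytesIO(example))
```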

---

## 9. Conformance

An implementation is **ENC/PEL-PROGRAM-DAG/1–conformant** if it:

1. **Implements `ProgramBytes` encoding/decoding**

   * Encodes and decodes `ProgramBytes` exactly as defined in §4.
   * Treats `program_version = 1` as the only supported version.
   * Treats deviations (unknown version, malformed fields) as encoding errors.

2. **Respects canonical ordering**

   * When encoding `Program` values, orders `nodes` in the canonical topological order defined in `PEL/PROGRAM-DAG/1`.
   * Preserves the logical order of `roots`.

3. **Uses canonical field encodings**

   * Uses `u32` lengths for lists and params as specified.
   * Uses `Utf8String` for operation names.
   * Uses `u8` discriminants and the specified payload layout for `DagInput`.

4. **Preserves injectivity and stability**

   * Ensures distinct logical Programs (per `PEL/PROGRAM-DAG/1`) produce distinct `ProgramBytes`.
   * Ensures the same logical Program consistently produces the same `ProgramBytes` under this profile.

5. **Binds to Program Artifacts correctly**

   * When forming Program Artifacts for `PEL/PROGRAM-DAG/1`, sets:

     * `Artifact.bytes = ProgramBytes`
     * `Artifact.type_tag = TYPE_TAG_PEL_PROGRAM_DAG_1`

   * Uses `ENC/ASL1-CORE v1` and `HASH/ASL1` for ASL/1 identity.

Everything else — storage, transport, operation registries, traces — is outside the scope of this encoding profile, provided it does not contradict the requirements above.

---

## 10. Informative Example

> This example is non-normative and uses abbreviated hex.
> It illustrates only the field layout, not exact ASCII/UTF-8 bytes.

Consider a tiny Program:

* Nodes:

  1. `N0` — `id = 1`

     * `op = (name = "add64", version = 1)`
     * `inputs = [ DagInputExternal{input_index = 0}, DagInputExternal{input_index = 1} ]`
     * `params =` empty

  2. `N1` — `id = 2`

     * `op = (name = "mul64", version = 1)`
     * `inputs = [ DagInputNode{node_id = 1, output_index = 0}, DagInputExternal{input_index = 2} ]`
     * `params =` empty

* Roots:

  * `RootRef{ node_id = 2, output_index = 0 }`

Canonical topological order:

* Node 1 has only external inputs → first.
* Node 2 depends on Node 1 → second.

ProgramBytes (pseudo-annotated):

```text
program_version = 0001 ; u16 = 1

node_count = 00000002 ; 2 nodes

; Node 0 (id = 1)
  node_id = 00000001
  op_name = 00000005 "add64" ; length=5, bytes 'a','d','d','6','4'
  op_version = 00000001
  input_count = 00000002
    ; input 0: external(0)
    kind = 00
    input_index = 00000000
    ; input 1: external(1)
    kind = 00
    input_index = 00000001
  params_len = 00000000 ; empty params

; Node 1 (id = 2)
  node_id = 00000002
  op_name = 00000005 "mul64"
  op_version = 00000001
  input_count = 00000002
    ; input 0: node(1,0)
    kind = 01
    node_id = 00000001
    output_index = 00000000
    ; input 1: external(2)
    kind = 00
    input_index = 00000002
  params_len = 00000000 ; empty params

root_count = 00000001
  ; root 0: (node 2, output 0)
  node_id = 00000002
  output_index = 00000000
```

These bytes become `Artifact.bytes` for a Program Artifact with `type_tag = TYPE_TAG_PEL_PROGRAM_DAG_1`. All conformant encoders under `PEL_ENC_PROGRAM_DAG_V1` MUST produce the same byte sequence for this logical Program.
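For readers who want the exact bytes rather than the abbreviated listing, the example Program can be assembled with a few helper functions. Like the listing itself, this sketch is non-normative, and the helper names (`u32`, `utf8_string`, `external`, `from_node`, `node`) are made up for this illustration.

```python
import struct


def u32(value: int) -> bytes:
    return struct.pack(">I", value)


def utf8_string(text: str) -> bytes:
    data = text.encode("utf-8")
    return u32(len(data)) + data


def external(input_index: int) -> bytes:
    return b"\x00" + u32(input_index)


def from_node(node_id: int, output_index: int) -> bytes:
    return b"\x01" + u32(node_id) + u32(output_index)


def node(node_id: int, op_name: str, op_version: int, inputs: list, params: bytes = b"") -> bytes:
    return (
        u32(node_id)
        + utf8_string(op_name)
        + u32(op_version)
        + u32(len(inputs)) + b"".join(inputs)
        + u32(len(params)) + params
    )


program_bytes = (
    struct.pack(">H", 1)                                    # program_version
    + u32(2)                                                # node_count
    + node(1, "add64", 1, [external(0), external(1)])       # 35 bytes
    + node(2, "mul64", 1, [from_node(1, 0), external(2)])   # 39 bytes
    + u32(1)                                                # root_count
    + u32(2) + u32(0)                                       # root 0: (node 2, output 0)
)

print(len(program_bytes))   # 92 = 2 + 4 + 35 + 39 + 4 + 8
print(program_bytes.hex())
```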

---

**End of `ENC/PEL-PROGRAM-DAG/1 v0.2.0 — Canonical Encoding for DAG Programs`**

---

## Document History

* **0.2.0 (2025-11-16):** Registered as Tier-1 spec and aligned to the Amduat 2.0 substrate baseline.