# OPREG/PEL1-KERNEL — Kernel Operation Registry for PEL/1
Status: Approved
Owner: Niklas Rydberg
Version: 0.1.1
SoT: Yes
Last Updated: 2025-11-16
Linked Phase Pack: N/A
Tags: [registry, execution]
<!-- Source: /amduat/docs/new/opreg-pel1-kernel.md | Canonical: /amduat/tier1/opreg-pel1-kernel.md -->
**Document ID:** `OPREG/PEL1-KERNEL`
**Layer:** L1 Profile (Operation Registry for `PEL/1-CORE` + `PEL/PROGRAM-DAG/1`)
**Depends on (normative):**
* `ASL/1-CORE v0.3.x` (`Artifact`, `TypeTag`, `Reference`, `HashId`)
* `PEL/1-CORE v0.1.x` — primitive execution layer core
* `PEL/PROGRAM-DAG/1 v0.2.x` — DAG scheme for PEL
* `HASH/ASL1 v0.2.x` — ASL1 hash family (for `HASH-ASL1-256`)
**Integrates with (informative):**
* `SUBSTRATE/STACK-OVERVIEW v0.1.x`
* `ENC/PEL-PROGRAM-DAG/1` (canonical encoding of Program)
* `PEL/TRACE-DAG/1` (optional trace profile)
* Higher-level operation registries (domain-specific ops)
© 2025 Niklas Rydberg.
## License
Except where otherwise noted, this document (text and diagrams) is licensed under
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
Universal (CC0) to enable unrestricted reuse in implementations and derivative
specifications.
Code examples in this document are provided under the Apache License 2.0 unless
explicitly stated otherwise. Test vectors, where present, are dedicated to the
public domain under CC0 1.0.
---
## 0. Purpose and Non-Goals
### 0.1 Purpose
`OPREG/PEL1-KERNEL` defines a **minimal, globally stable set of PEL operations** that every “kernel-capable” PEL engine is expected to implement:
* They operate on **ASL/1 Artifacts** (bytes + optional type tag).
* They are used in **`PEL/PROGRAM-DAG/1`** programs as `OperationId { name, version }`.
* They are **pure and deterministic**: same inputs → same outputs, independent of engine or environment.
* They explicitly define:
* **Arity** (number of inputs),
* **Parameter model** (logical Params value),
* **Output shape** (number and form of outputs),
* **Runtime error conditions** and associated `status_code` values for `PEL/PROGRAM-DAG/1`.
These operations are intentionally **low-level** and **byte-centric**; richer semantics (JSON, typed records, domain-specific logic) belong in separate registries.
### 0.2 Non-goals
This registry does **not** define:
* Any storage or transport API (`ASL/1-STORE`).
* Any encoding of Programs or Params into bytes (`ENC/PEL-PROGRAM-DAG/1`, param-encoding profiles).
* Any certification or fact semantics (`CIL/1`, `FER/1`, `FCT/1`).
* Provenance graph edges (`TGK/1`).
* Human-readable diagnostics payloads (see §2.4).
---
## 1. Conventions and Context
### 1.1 Base types
From `ASL/1-CORE`:
```text
Artifact {
bytes: OctetString
type_tag: optional TypeTag
}
TypeTag {
tag_id: uint32
}
Reference {
hash_id: HashId
digest: OctetString
}
HashId = uint16
```
From `PEL/1-CORE` and `PEL/PROGRAM-DAG/1` (simplified):
```text
OperationId {
name: string
version: uint32
}
ExecutionStatus = uint8 // e.g. OK, INVALID_PROGRAM, INVALID_INPUTS, RUNTIME_FAILED
ExecutionErrorKind = uint8 // e.g. NONE, PROGRAM, INPUTS, RUNTIME
ExecutionErrorSummary {
kind: ExecutionErrorKind
status_code: uint32
}
DiagnosticEntry {
code: uint32
message: OctetString
}
ExecutionResultValue {
pel1_version : uint16
status : ExecutionStatus
scheme_ref : SchemeRef
summary : ExecutionErrorSummary
diagnostics : list<DiagnosticEntry>
}
```
`PEL/PROGRAM-DAG/1` defines `Exec_DAG` as:
```text
Exec_DAG(
program: Artifact,
inputs: list<Artifact>,
params: optional Artifact
) -> (outputs: list<Artifact>, result: ExecutionResultValue)
```
and defines that each Node evaluates an `OperationId` with a logical interface:
```text
Op(name, version)(
inputs: list<Artifact>,
params: ParamsValue
) -> Ok(list<Artifact>) | Err(status_code: uint32)
```
The overall `ExecutionResultValue.summary.status_code` for `RUNTIME_FAILED` is taken from the `status_code` returned by the failing operation.
### 1.2 Status and error mapping
This registry only defines **runtime error codes** (used when `Exec_DAG` sets `status = RUNTIME_FAILED`).
Global outcome statuses:
```text
ExecutionStatus {
OK = 0
INVALID_PROGRAM = 2
INVALID_INPUTS = 3
RUNTIME_FAILED = 4
}
```
Error summary kind:
```text
ExecutionErrorKind {
NONE = 0
PROGRAM = 1
INPUTS = 2
RUNTIME = 3
}
```
Mapping (from `PEL/PROGRAM-DAG/1`):
* `status = OK` → `kind = NONE`, `status_code = 0`
* `status = INVALID_PROGRAM` → `kind = PROGRAM`, `status_code = 2`
* `status = INVALID_INPUTS` → `kind = INPUTS`, `status_code = 3`
* `status = RUNTIME_FAILED` → `kind = RUNTIME`, `status_code = op-specific (> 0)`
This registry **only** defines the operation-specific `status_code` values that may appear when `status = RUNTIME_FAILED`.
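The mapping above can be sketched in Python as follows. This is a minimal illustration, not part of the registry: the function name `summarize` and the use of `IntEnum` are assumptions made for the example, while the numeric values are copied from the listings above.

```python
from enum import IntEnum


class ExecutionStatus(IntEnum):
    OK = 0
    INVALID_PROGRAM = 2
    INVALID_INPUTS = 3
    RUNTIME_FAILED = 4


class ExecutionErrorKind(IntEnum):
    NONE = 0
    PROGRAM = 1
    INPUTS = 2
    RUNTIME = 3


def summarize(status: ExecutionStatus, op_status_code: int = 0):
    """Return (kind, status_code) for a global status (hypothetical helper)."""
    if status == ExecutionStatus.OK:
        return ExecutionErrorKind.NONE, 0
    if status == ExecutionStatus.INVALID_PROGRAM:
        return ExecutionErrorKind.PROGRAM, 2
    if status == ExecutionStatus.INVALID_INPUTS:
        return ExecutionErrorKind.INPUTS, 3
    # RUNTIME_FAILED: the code is op-specific and must be greater than zero.
    if op_status_code <= 0:
        raise ValueError("RUNTIME_FAILED requires an op-specific status_code > 0")
    return ExecutionErrorKind.RUNTIME, op_status_code
```

Only the last branch carries a value defined by this registry; the other three codes are fixed by `PEL/PROGRAM-DAG/1`.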
### 1.3 Kernel status_code layout
For kernel ops we reserve a simple scheme for `status_code` on runtime failure:
```text
status_code = (kernel_op_code << 16) | error_index
```
Where:
* `kernel_op_code` is a 16-bit numeric code assigned per operation in this registry.
* `error_index` is a small (non-zero) 16-bit integer enumerating distinct error causes per op.
This ensures:
* No collision between error codes of different operations.
* Easy offline decoding of `status_code` into `(op, reason)`.
Concrete `kernel_op_code` assignments are given in §3.
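As an illustration of the layout, the following Python sketch packs and unpacks a `status_code`. The helper names `encode_status_code` and `decode_status_code` are hypothetical; only the bit layout comes from this section.

```python
def encode_status_code(kernel_op_code: int, error_index: int) -> int:
    """Pack (kernel_op_code, error_index) into a 32-bit status_code."""
    if not 0 <= kernel_op_code <= 0xFFFF:
        raise ValueError("kernel_op_code must fit in 16 bits")
    if not 1 <= error_index <= 0xFFFF:
        raise ValueError("error_index must be a non-zero 16-bit integer")
    return (kernel_op_code << 16) | error_index


def decode_status_code(status_code: int) -> tuple:
    """Split a status_code back into (kernel_op_code, error_index)."""
    return (status_code >> 16) & 0xFFFF, status_code & 0xFFFF


# Example: operation code 0x0002 with error_index 1.
code = encode_status_code(0x0002, 1)
print(f"0x{code:08X}")           # prints 0x00020001
print(decode_status_code(code))  # prints (2, 1)
```

Because the two fields occupy disjoint bit ranges, decoding needs no lookup table to recover the operation code.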
### 1.4 Params and encodings
Each operation defines a **logical Params type** (e.g. `SliceParams { offset: u64; length: u64 }`).
This registry does **not** define byte-level encodings of Params; those are defined in a companion profile (e.g. `OPREG/PEL1-KERNEL-PARAMS/1`). This document is the **semantic** registry.
Conformance requirements:
* For each operation, there MUST exist exactly one canonical encoding and decoding for its Params type.
* All engines claiming to implement the operation MUST use that same encoding and decoding.
* If Params decoding fails, the operation MUST treat the Node as either:
* `INVALID_PROGRAM` (preferred for static malformations), or
* `RUNTIME_FAILED` with a specific `status_code` (if the registry so specifies).
For this initial kernel set, we treat **Param decoding errors as INVALID_PROGRAM**, not as runtime failures.
### 1.5 Diagnostics
To keep `ExecutionResultValue` stable and simple, kernel operation semantics MUST always return an empty diagnostics list, and the scheme's `Exec_DAG` implementation MUST NOT add diagnostics when the failing Node is a kernel op.
Human-readable error information is expected to be carried in:
* separate trace artifacts (`PEL/TRACE-DAG/1`), or
* external logs and observability systems, not in `ExecutionResultValue.diagnostics`.
---
## 2. Common Kernel Operation Conventions
All kernel operations in this registry share these properties:
1. **Purity and determinism**
* They operate only on:
* the input `Artifact.bytes` and `Artifact.type_tag`,
* their decoded Params,
* standard pure functions (e.g. integer arithmetic, hashing as per `HASH/ASL1`).
* They MUST NOT:
* read clocks or random sources,
* perform network or filesystem I/O,
* depend on global mutable state.
2. **Type tags**
* Unless otherwise stated, operations **preserve the input type tag** when transforming a single input.
* For operations with multiple inputs, if they require consistent type tags, this is checked at runtime and may yield a runtime error.
* Operations MAY produce Artifacts with `type_tag = None` for “raw bytes” outputs.
3. **Arity and static vs dynamic errors**
* Each operation specifies `min_inputs` and `max_inputs`.
* Violations of these arity constraints are **static** (depend only on the Program) and MUST be treated as `INVALID_PROGRAM`, not `RUNTIME_FAILED`.
* Runtime errors are reserved for **data-dependent** conditions (e.g. out-of-bounds slice based on actual input length).
4. **Success vs failure**
* On success: operation returns `Ok(list<Artifact>)`, and `Exec_DAG` keeps `status = OK` (unless a different Node fails later).
* On failure: operation returns `Err(status_code)`, and `Exec_DAG` stops evaluation and sets:
```text
status = RUNTIME_FAILED
summary.kind = RUNTIME
summary.status_code = status_code
diagnostics = []
```
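The success and failure handling in item 4 can be sketched as a small evaluation loop. This is a simplified illustration under several assumptions: nodes are modeled as zero-argument callables evaluated in list order rather than as a real DAG, `OpError` stands in for `Err(status_code)`, and returning an empty output list on failure is a choice made for the sketch, since this registry does not say which outputs accompany a failure.

```python
from dataclasses import dataclass, field

OK, RUNTIME_FAILED = 0, 4        # ExecutionStatus values from §1.2
KIND_NONE, KIND_RUNTIME = 0, 3   # ExecutionErrorKind values from §1.2


class OpError(Exception):
    """Hypothetical stand-in for an operation returning Err(status_code)."""

    def __init__(self, status_code: int):
        super().__init__(f"status_code=0x{status_code:08X}")
        self.status_code = status_code


@dataclass
class Result:
    status: int = OK
    kind: int = KIND_NONE
    status_code: int = 0
    diagnostics: list = field(default_factory=list)


def run_nodes(nodes):
    """Evaluate nodes in order and stop at the first runtime failure."""
    outputs = []
    for node in nodes:
        try:
            outputs.extend(node())
        except OpError as err:
            # Evaluation stops here; diagnostics stay empty for kernel ops.
            return [], Result(RUNTIME_FAILED, KIND_RUNTIME, err.status_code, [])
    return outputs, Result()
```

Nodes after the failing one are never evaluated, which matches the rule that `Exec_DAG` stops on the first `Err`.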
---
## 3. Kernel Operation Index
We define four kernel operations:
| Kernel Op Code | OperationId.name | version | Summary |
| -------------: | ----------------------- | :------ | --------------------------------------- |
| `0x0001` | `"pel.bytes.concat"` | `1` | Concatenate N artifacts |
| `0x0002` | `"pel.bytes.slice"` | `1` | Take a byte slice of one artifact |
| `0x0003` | `"pel.bytes.const"` | `1` | Produce a constant artifact from params |
| `0x0004` | `"pel.bytes.hash.asl1"` | `1` | Hash an artifact's bytes with ASL1 |
All operation names are case-sensitive UTF-8 strings.
Each operation's `OperationId` is:
```text
OperationId {
name: <as in table>
version: 1
}
```
`kernel_op_code` in the `status_code` formula (§1.3) is the hex code in the first column.
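For offline decoding, the index can be held as a lookup table. The sketch below is illustrative only: the names `KERNEL_OPS`, `ERROR_NAMES`, and `describe` are hypothetical, the operation entries mirror the table above, and the two error names are the ones listed in §4.1.3 and §4.2.4.

```python
# kernel_op_code -> (OperationId.name, OperationId.version), from the index table.
KERNEL_OPS = {
    0x0001: ("pel.bytes.concat", 1),
    0x0002: ("pel.bytes.slice", 1),
    0x0003: ("pel.bytes.const", 1),
    0x0004: ("pel.bytes.hash.asl1", 1),
}

# Runtime error names defined for the kernel ops (§4.1.3 and §4.2.4).
ERROR_NAMES = {
    0x0001_0001: "TYPE_TAG_MISMATCH",
    0x0002_0001: "RANGE_OUT_OF_BOUNDS",
}


def describe(status_code: int) -> str:
    """Render a runtime status_code as 'name/version: reason'."""
    op_code, error_index = status_code >> 16, status_code & 0xFFFF
    op = KERNEL_OPS.get(op_code)
    if op is None:
        return f"unknown op 0x{op_code:04X}, error_index {error_index}"
    name, version = op
    reason = ERROR_NAMES.get(status_code, f"error_index {error_index}")
    return f"{name}/{version}: {reason}"


print(describe(0x0002_0001))  # prints pel.bytes.slice/1: RANGE_OUT_OF_BOUNDS
```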
---
## 4. Operation Specifications
### 4.1 `pel.bytes.concat` v1 (code 0x0001)
**OperationId**
```text
name = "pel.bytes.concat"
version = 1
kernel_op_code = 0x0001
```
**Intent**
Concatenate the byte payloads of N input Artifacts (N ≥ 1) into a single output Artifact. All input type tags MUST be identical (including “no type tag”).
#### 4.1.1 Arity and Params
* `min_inputs = 1`
* `max_inputs = unbounded` (any positive number)
* Params: **none** (`Unit`)
Static errors (handled as `INVALID_PROGRAM`):
* `inputs.length == 0`.
* Params not decodable as `Unit` (i.e. any non-empty Params according to the canonical encoding).
#### 4.1.2 Semantics
Given:
```text
inputs = [A0, A1, ..., A_{n-1}], n >= 1
params = ()
```
Let:
```text
Ti = Ai.type_tag
Bi = Ai.bytes
```
1. **Type tag consistency check (runtime)**
* If there exist `i, j` such that `Ti` and `Tj` are not equal in the `ASL/1-CORE` sense (i.e. one is absent and the other present, or both present but with different `tag_id`):
* Operation returns `Err(status_code = 0x0001_0001)`.
2. **Concatenation**
* Define:
```text
B_out = B0 || B1 || ... || B_{n-1} // byte-wise concatenation
T_out = T0 // they are all equal by step 1
```
* Output list is a single Artifact `C`:
```text
C.bytes = B_out
C.type_tag = T_out
```
* Operation returns `Ok([C])`.
This operation does not impose any explicit limit on the concatenated length; overflow or resource exhaustion is outside the PEL semantic layer.
#### 4.1.3 Runtime error codes
For `pel.bytes.concat` v1, runtime errors (producing `RUNTIME_FAILED`) are:
| Name | Condition | `status_code` |
| ------------------- | ---------------------------------- | ------------- |
| `TYPE_TAG_MISMATCH` | Any pair of input type tags differ | `0x0001_0001` |
On any such error:
* Operation returns `Err(status_code)` as above.
* `Exec_DAG` sets `status = RUNTIME_FAILED`, `summary.status_code = status_code`, `diagnostics = []`.
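A minimal Python sketch of these semantics follows. It makes several simplifying assumptions: `Artifact` is modeled as a dataclass whose `data` field stands for `Artifact.bytes`, a type tag is represented directly by its `tag_id` (or `None` when absent), and the `InvalidProgram` and `OpError` exceptions are hypothetical stand-ins for the `INVALID_PROGRAM` outcome and `Err(status_code)`.

```python
from dataclasses import dataclass
from typing import List, Optional

TYPE_TAG_MISMATCH = 0x0001_0001


@dataclass(frozen=True)
class Artifact:
    data: bytes                     # Artifact.bytes in the spec
    type_tag: Optional[int] = None  # tag_id, or None for "no type tag"


class InvalidProgram(Exception):
    """Static error: the Program itself is malformed (hypothetical)."""


class OpError(Exception):
    """Runtime error carrying the op-specific status_code (hypothetical)."""

    def __init__(self, status_code: int):
        super().__init__(f"status_code=0x{status_code:08X}")
        self.status_code = status_code


def bytes_concat(inputs: List[Artifact]) -> List[Artifact]:
    """Sketch of pel.bytes.concat v1."""
    if not inputs:
        raise InvalidProgram("concat requires at least one input")
    first_tag = inputs[0].type_tag
    # Step 1: all tags must be equal, including the "absent" case.
    if any(a.type_tag != first_tag for a in inputs):
        raise OpError(TYPE_TAG_MISMATCH)
    # Step 2: byte-wise concatenation, preserving the shared tag.
    return [Artifact(b"".join(a.data for a in inputs), first_tag)]
```

Comparing every tag against the first is equivalent to the pairwise check in step 1, because tag equality is transitive.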
---
### 4.2 `pel.bytes.slice` v1 (code 0x0002)
**OperationId**
```text
name = "pel.bytes.slice"
version = 1
kernel_op_code = 0x0002
```
**Intent**
Take a contiguous slice from a single input Artifact's bytes.
#### 4.2.1 Params: `SliceParams`
Logical Params:
```text
SliceParams {
offset: uint64 // byte offset, 0-based
length: uint64 // number of bytes to include
}
```
* `offset` and `length` are non-negative.
* Their canonical encoding/decoding is defined in a param-encoding profile; invalid encodings MUST result in `INVALID_PROGRAM`.
#### 4.2.2 Arity
* `min_inputs = 1`
* `max_inputs = 1`
Arity violations → `INVALID_PROGRAM`.
#### 4.2.3 Semantics
Given:
```text
inputs = [A]
params = SliceParams { offset, length }
```
Let:
```text
B = A.bytes // length = L
T = A.type_tag
L = |B|
o = offset
ℓ = length
```
1. **Range check (runtime)**
* If `o > L` or `o + ℓ > L` (with arithmetic in unbounded integers):
* Operation returns `Err(status_code = 0x0002_0001)`.
2. **Slicing**
* Define:
```text
B_out = B[o .. o+ℓ] // ℓ bytes starting at index o
```
* Output Artifact `C`:
```text
C.bytes = B_out
C.type_tag = T
```
* Operation returns `Ok([C])`.
Note: `o == L` and `ℓ == 0` is allowed and yields an empty-byte output.
#### 4.2.4 Runtime error codes
For `pel.bytes.slice` v1:
| Name | Condition | `status_code` |
| --------------------- | ------------------------------------------------------- | ------------- |
| `RANGE_OUT_OF_BOUNDS` | `offset > len(bytes)` or `offset + length > len(bytes)` | `0x0002_0001` |
On such error, `Exec_DAG` sets `status = RUNTIME_FAILED`, `summary.status_code = 0x0002_0001`, `diagnostics = []`.
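The range check and slicing step can be sketched in Python as below. The `Artifact` model (a `data` field for `Artifact.bytes`, a type tag represented by its `tag_id`) and the `InvalidProgram` and `OpError` exceptions are assumptions made for the example. Python integers are unbounded, so `offset + length` cannot overflow, which matches the arithmetic requirement of the range check.

```python
from dataclasses import dataclass
from typing import List, Optional

RANGE_OUT_OF_BOUNDS = 0x0002_0001


@dataclass(frozen=True)
class Artifact:
    data: bytes                     # Artifact.bytes in the spec
    type_tag: Optional[int] = None  # tag_id, or None for "no type tag"


class InvalidProgram(Exception):
    """Static error: the Program itself is malformed (hypothetical)."""


class OpError(Exception):
    """Runtime error carrying the op-specific status_code (hypothetical)."""

    def __init__(self, status_code: int):
        super().__init__(f"status_code=0x{status_code:08X}")
        self.status_code = status_code


def bytes_slice(inputs: List[Artifact], offset: int, length: int) -> List[Artifact]:
    """Sketch of pel.bytes.slice v1."""
    if len(inputs) != 1:
        raise InvalidProgram("slice takes exactly one input")
    (a,) = inputs
    total = len(a.data)
    # Explicit range check: Python slicing would silently clamp instead.
    if offset > total or offset + length > total:
        raise OpError(RANGE_OUT_OF_BOUNDS)
    return [Artifact(a.data[offset:offset + length], a.type_tag)]
```

The explicit check matters in Python because `b"abc"[2:10]` returns `b"c"` rather than failing, which would hide the out-of-bounds condition the registry requires engines to report.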
---
### 4.3 `pel.bytes.const` v1 (code 0x0003)
**OperationId**
```text
name = "pel.bytes.const"
version = 1
kernel_op_code = 0x0003
```
**Intent**
Produce a constant Artifact specified entirely by Params, with no data dependencies. This is a way to embed small literal values directly in a Program.
#### 4.3.1 Params: `ConstParams`
Logical Params:
```text
ConstParams {
bytes: OctetString // payload bytes
has_tag: bool // whether a type tag is present
tag_id: uint32 optional // only meaningful if has_tag = true
}
```
Semantics:
* If `has_tag == false`:
* Output Artifact has `type_tag = None`.
* If `has_tag == true`:
* Output Artifact has `type_tag = Some(TypeTag{ tag_id })`.
Param encoding/decoding is defined in a param-encoding profile; malformed encodings ⇒ `INVALID_PROGRAM`.
#### 4.3.2 Arity
* `min_inputs = 0`
* `max_inputs = 0`
Any non-empty `inputs` list is a static error (`INVALID_PROGRAM`).
#### 4.3.3 Semantics
Given:
```text
inputs = []
params = ConstParams { bytes = B, has_tag, tag_id? }
```
Then:
* If `has_tag` is false:
```text
C.bytes = B
C.type_tag = None
```
* If `has_tag` is true:
```text
C.bytes = B
C.type_tag = Some(TypeTag{ tag_id })
```
* Output list is `[C]`.
* Operation returns `Ok([C])`.
There are no data-dependent runtime errors: this operation **always succeeds** given valid Params.
#### 4.3.4 Runtime error codes
*None.*
All failures are static (bad Params encoding, wrong arity) and must be treated as `INVALID_PROGRAM`.
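A short Python sketch of this operation follows. The `Artifact` model and the `InvalidProgram` exception are assumptions made for the example. Treating `has_tag = true` with a missing `tag_id` as malformed Params is also an assumption, since the byte-level Params encoding is defined in a separate profile.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Artifact:
    data: bytes                     # Artifact.bytes in the spec
    type_tag: Optional[int] = None  # tag_id, or None for "no type tag"


class InvalidProgram(Exception):
    """Static error: the Program itself is malformed (hypothetical)."""


def bytes_const(
    inputs: List[Artifact],
    payload: bytes,
    has_tag: bool,
    tag_id: Optional[int] = None,
) -> List[Artifact]:
    """Sketch of pel.bytes.const v1; payload stands for ConstParams.bytes."""
    if inputs:
        raise InvalidProgram("const takes no inputs")
    if has_tag and tag_id is None:
        # Assumption: a tagged constant without a tag_id is malformed Params.
        raise InvalidProgram("has_tag is true but tag_id is missing")
    # tag_id is ignored when has_tag is false.
    return [Artifact(payload, tag_id if has_tag else None)]
```

The function has no branch that raises a runtime error, which reflects the empty error table above.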
---
### 4.4 `pel.bytes.hash.asl1` v1 (code 0x0004)
**OperationId**
```text
name = "pel.bytes.hash.asl1"
version = 1
kernel_op_code = 0x0004
```
**Intent**
Compute an ASL1-family hash (`HASH/ASL1`) over the raw bytes of a single input Artifact.
This operation is **not** about ASL/1 identity (which uses `ArtifactBytes` via `ENC/ASL1-CORE`), but about hashing arbitrary byte payloads for protocol or application use.
#### 4.4.1 Params: `HashParams`
Logical Params:
```text
HashParams {
hash_id: HashId // must be a valid ASL1 HashId
}
```
For this version:
* `hash_id` MUST be `0x0001` (i.e. `HASH-ASL1-256`).
* Any other `hash_id` MUST be treated as a **static error** → `INVALID_PROGRAM`.
Rationale: this ensures all conformant engines agree on the algorithm set for this op. Future versions (e.g. `pel.bytes.hash.asl1` v2) MAY support additional `HashId`s.
Param encoding/decoding is defined elsewhere; malformed encodings ⇒ `INVALID_PROGRAM`.
#### 4.4.2 Arity
* `min_inputs = 1`
* `max_inputs = 1`
Arity violations ⇒ `INVALID_PROGRAM`.
#### 4.4.3 Semantics
Given:
```text
inputs = [A]
params = HashParams { hash_id = 0x0001 }
```
Let:
```text
B = A.bytes
H = HASH-ASL1-256 // SHA-256 as defined in HASH/ASL1 for HashId 0x0001
```
Compute:
```text
digest = H(B) // 32-byte digest
```
Then:
* Output Artifact `C`:
```text
C.bytes = digest // exactly 32 bytes
C.type_tag = None // raw bytes digest, no type tag
```
* Output list is `[C]`.
* Operation returns `Ok([C])`.
There are no data-dependent runtime errors; hashing is assumed total. Any internal errors (e.g. memory failure) are outside PEL semantics.
#### 4.4.4 Runtime error codes
*None.*
All failures (unsupported `hash_id`, bad Params, wrong arity) are static and must be treated as `INVALID_PROGRAM`.
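The hashing semantics can be sketched with Python's standard `hashlib`, relying on the statement in §4.4.3 that `HASH-ASL1-256` is SHA-256. The `Artifact` model and the `InvalidProgram` exception are assumptions made for the example, and the normative algorithm definition remains `HASH/ASL1`.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional

HASH_ASL1_256 = 0x0001


@dataclass(frozen=True)
class Artifact:
    data: bytes                     # Artifact.bytes in the spec
    type_tag: Optional[int] = None  # tag_id, or None for "no type tag"


class InvalidProgram(Exception):
    """Static error: the Program itself is malformed (hypothetical)."""


def bytes_hash_asl1(inputs: List[Artifact], hash_id: int) -> List[Artifact]:
    """Sketch of pel.bytes.hash.asl1 v1."""
    if len(inputs) != 1:
        raise InvalidProgram("hash takes exactly one input")
    if hash_id != HASH_ASL1_256:
        raise InvalidProgram("v1 supports only HashId 0x0001")
    # Only the raw bytes are hashed; the input type tag is not part of the digest.
    digest = hashlib.sha256(inputs[0].data).digest()
    return [Artifact(digest, None)]
```

Two inputs with the same bytes but different type tags therefore produce the same output, which is one way this operation differs from ASL/1 identity.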
---
## 5. Conformance
An engine is **OPREG/PEL1-KERNEL-conformant** if and only if:
1. **Operation availability**
* It exposes the four operations defined in §3 with exactly the specified `OperationId { name, version }`.
2. **Arity and Params**
* For each operation, it enforces `min_inputs`/`max_inputs` as specified.
* It implements the defined logical Params types and uses the canonical param encoding/decoding for each.
* It treats:
* arity violations, and
* invalid or undecodable Params
as `INVALID_PROGRAM` per `PEL/PROGRAM-DAG/1` (i.e. `Exec_DAG` produces `status = INVALID_PROGRAM`, `summary.status_code = 2`).
3. **Runtime semantics**
* For all supported operations:
* Given the same input Artifacts and Params, all conformant engines produce identical output Artifacts (same `bytes`, same `type_tag`) and identical `status_code` on failure.
* Runtime failure conditions (e.g. slice out-of-bounds, type tag mismatch) are detected exactly as specified and mapped to the correct `status_code` using the `kernel_op_code`/`error_index` scheme.
4. **Status and diagnostics mapping**
* When a kernel operation returns `Ok`, `Exec_DAG` MUST NOT change `status` or `summary` (beyond normal success semantics).
* When a kernel operation returns `Err(status_code)`:
* `Exec_DAG` MUST set:
```text
status = RUNTIME_FAILED
summary.kind = RUNTIME
summary.status_code = status_code
diagnostics = []
```
* `Exec_DAG` MUST NOT mutate other fields of `ExecutionResultValue` except as defined in `PEL/PROGRAM-DAG/1` (e.g. to capture which Node failed, via trace profiles).
5. **Purity**
* Kernel operations MUST NOT perform external I/O or observe environment state; they MUST behave as pure functions of their inputs and Params.
* Any caching or performance optimizations MUST NOT change observable behavior at the level of `Artifact` values and `status_code`.
6. **Layering**
* The engine does not depend on `ASL/1-STORE`, `TGK/1`, `CIL/1`, `FER/1`, `FCT/1`, or `OI/1` on the PEL core hot path. It may use those layers around PEL, but not as part of the operation semantics.
---
## 6. Change Log (informative)
**v0.1.1 (2025-11-15)**
* Removed `pel.bytes.clone/1` from the kernel operation index.
* Reassigned `kernel_op_code` values to remove the `0x01` gap:
* `pel.bytes.concat/1`: `0x0002` → `0x0001` (and `TYPE_TAG_MISMATCH` from `0x0002_0001` → `0x0001_0001`).
* `pel.bytes.slice/1`: `0x0003` → `0x0002` (and `RANGE_OUT_OF_BOUNDS` from `0x0003_0001` → `0x0002_0001`).
* `pel.bytes.const/1`: `0x0004` → `0x0003`.
* `pel.bytes.hash.asl1/1`: `0x0005` → `0x0004`.
* Updated §3, §4, and §5 to reflect the new kernel op set and codes.
**v0.1.0 (2025-11-15)**
* Initial definition of `OPREG/PEL1-KERNEL` with five kernel operations:
* `pel.bytes.clone/1`
* `pel.bytes.concat/1`
* `pel.bytes.slice/1`
* `pel.bytes.const/1`
* `pel.bytes.hash.asl1/1`
* Established `status_code = (kernel_op_code << 16) | error_index` convention.
* Restricted `pel.bytes.hash.asl1/1` to `HashId = 0x0001` (HASH-ASL1-256) for cross-implementation determinism.
* Required kernel operations to leave `ExecutionResultValue.diagnostics` empty; richer diagnostics to be handled by trace/overlay profiles.
---
## Document History
* **0.1.1 (2025-11-16):** Registered as Tier-1 spec and aligned to the Amduat 2.0 substrate baseline.