Adding TGK specifications.
parent 1e88925ece
commit 47e3ccc382
738
tier1/enc-tgk1-edge-1.md
Normal file
|
|
@ -0,0 +1,738 @@
|
|||
# ENC/TGK1-EDGE/1 — Canonical Encoding for TGK EdgeArtifacts
|
||||
|
||||
Status: Approved
|
||||
Owner: Niklas Rydberg
|
||||
Version: 0.1.0
|
||||
SoT: Yes
|
||||
Last Updated: 2025-11-16
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [binary-minimalism, traceability]
|
||||
|
||||
<!-- Source: /amduat/docs/new/enc-tgk1-edge1.md | Canonical: /amduat/tier1/enc-tgk1-edge-1.md -->
|
||||
|
||||
**Document ID:** `ENC/TGK1-EDGE/1`
|
||||
**Profile ID:** `TGK1_EDGE_ENC_V1 = 0x0201` (symbolic; the concrete assignment lives in the encoding-profile registry)
|
||||
**Layer:** Edge Encoding Profile (on top of ASL/1-CORE + TGK/1-CORE)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/1-CORE v0.4.x` — value model (`Artifact`, `TypeTag`, `Reference`, `HashId`, identity model)
|
||||
* `ENC/ASL1-CORE v1.x` — canonical encodings for `Artifact` and `Reference`
|
||||
* `TGK/1-CORE v0.7.x` — trace graph kernel (`Node`, `EdgeBody`, `EdgeTypeId`, edgehood invariants)
|
||||
|
||||
**Integrates with (informative):**
|
||||
|
||||
* `HASH/ASL1 v0.2.x` — ASL1 hash family for `EdgeRef` identity
|
||||
* `ASL/1-STORE v0.4.x` — content-addressable store holding EdgeArtifacts
|
||||
* `SUBSTRATE/STACK-OVERVIEW v0.2.x` — stack layering discipline
|
||||
* TGK type catalogs (e.g. `TGK/TYPES-CORE`) — `EdgeTypeId` semantics
|
||||
* Future TGK profiles (`TGK/STORE/1`, `TGK/PROV/1`) that interpret edges
|
||||
|
||||
> The Profile ID `TGK1_EDGE_ENC_V1` is a configuration label.
|
||||
> It is **not** embedded into edge payloads. Encoders and decoders select this encoding by context (type tag + profile configuration), not per value.
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
|
||||
---
|
||||
|
||||
## 0. Overview
|
||||
|
||||
`ENC/TGK1-EDGE/1` defines the **canonical, streaming-friendly, injective binary encoding** of the `EdgeBody` structure from `TGK/1-CORE`:
|
||||
|
||||
```text
|
||||
EdgeBody {
|
||||
type: EdgeTypeId // uint32
|
||||
from: Node[] // Node = Reference
|
||||
to: Node[]
|
||||
payload: Reference
|
||||
}
|
||||
```
|
||||
|
||||
and its embedding as TGK **EdgeArtifacts**:
|
||||
|
||||
```text
|
||||
Artifact {
|
||||
bytes = EdgeBytes // this profile
|
||||
type_tag = TYPE_TAG_TGK1_EDGE_V1
|
||||
}
|
||||
```
|
||||
|
||||
where `EdgeBytes` is a single `OctetString` (sequence of bytes) used as `Artifact.bytes`.
|
||||
|
||||
Under this profile:
|
||||
|
||||
* `EdgeBytes` is the canonical representation of an `EdgeBody`.
|
||||
* Edge identity is the ASL/1 `Reference` over the EdgeArtifact (`EdgeRef`), derived via `ENC/ASL1-CORE` + `HASH/ASL1`.
|
||||
* The encoding is:
|
||||
|
||||
* **Injective** — distinct `EdgeBody` values → distinct `EdgeBytes`.
|
||||
* **Deterministic & stable** — same `EdgeBody` → same `EdgeBytes` across implementations and time.
|
||||
* **Streaming-friendly** — encoders, decoders, and hashers can operate in a single forward-only pass.
|
||||
|
||||
In line with `TGK/1-CORE`:
|
||||
|
||||
* Each EdgeArtifact encodes **exactly one** logical edge (one `EdgeBody`).
|
||||
* All TGK edges are represented as ordinary ASL/1 Artifacts plus their ASL `Reference` identities; this profile introduces no additional identity or node/edge ID layer.
|
||||
|
||||
> **Non-goal:** This profile does **not** define what any particular `EdgeTypeId` “means”, nor how graphs are stored, indexed, or traversed. Those behaviors are defined by `TGK/1-CORE`, TGK type catalogs, and higher-layer profiles.
|
||||
|
||||
---
|
||||
|
||||
## 1. Scope & Layering
|
||||
|
||||
### 1.1 Purpose
|
||||
|
||||
This specification defines:
|
||||
|
||||
* The **binary layout** of:
|
||||
|
||||
* `EdgeBytes` — canonical encoding for `EdgeBody`.
|
||||
* `EncodedRef` — an internal wrapper for embedding ASL `Reference`s.
|
||||
|
||||
* Canonical field ordering and integer widths.
|
||||
|
||||
* How `EdgeBytes` are bound into EdgeArtifacts and converted into `EdgeRef` identity.
|
||||
|
||||
It does **not** define:
|
||||
|
||||
* TGK graph semantics or provenance algorithms (`TGK/1-CORE`, `TGK/PROV/1`).
|
||||
* Store or transport APIs (`ASL/1-STORE`, deployment profiles).
|
||||
* Edge-type catalogs (`TGK/TYPES-*`) or policy.
|
||||
|
||||
### 1.2 Layering constraints
|
||||
|
||||
In line with `SUBSTRATE/STACK-OVERVIEW` and `TGK/1-CORE`:
|
||||
|
||||
* `ENC/TGK1-EDGE/1` is a **TGK edge-encoding profile**, not a kernel primitive.
|
||||
|
||||
* It MUST NOT:
|
||||
|
||||
* redefine `Artifact`, `Reference`, `HashId`, or `TypeTag` (from `ASL/1-CORE`);
|
||||
* redefine `Node`, `EdgeBody`, or `EdgeTypeId` (from `TGK/1-CORE`);
|
||||
* embed store, provenance, or policy semantics into its layout.
|
||||
|
||||
* It defines exactly one canonical encoding for `EdgeBody` values under the profile ID `TGK1_EDGE_ENC_V1`.
|
||||
|
||||
TGK/1-CORE sees this profile as providing a partial function:
|
||||
|
||||
```text
|
||||
decode_edge_payload_TGK1_EDGE :
|
||||
OctetString -> EdgeBody | error
|
||||
```
|
||||
|
||||
that is:
|
||||
|
||||
* **partial** — may fail with an error for some inputs;
|
||||
* **deterministic** — a pure function of its input bytes, with no dependence on environment or mutable state;
|
||||
* **side-effect free** — decoding does not consult stores, catalogs, or policy.
|
||||
|
||||
Artifacts whose `type_tag` selects this profile use `decode_edge_payload_TGK1_EDGE` as their TGK edge decoder in the sense of `TGK/1-CORE §3.2`.
|
||||
|
||||
---
|
||||
|
||||
## 2. Conventions
|
||||
|
||||
### 2.1 RFC 2119 terms
|
||||
|
||||
The key words **MUST**, **MUST NOT**, **SHOULD**, **MAY**, etc. are to be interpreted as described in RFC 2119.
|
||||
|
||||
### 2.2 Integer encodings
|
||||
|
||||
All multi-byte integers are encoded as **big-endian** (network byte order), as in `ENC/ASL1-CORE`:
|
||||
|
||||
* `u8` — 1 byte
|
||||
* `u16` — 2 bytes
|
||||
* `u32` — 4 bytes
|
||||
* `u64` — 8 bytes
|
||||
|
||||
Only fixed-width integers are used.
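
The integer rules above can be sketched with Python's standard `struct` module, where the `>` prefix selects big-endian byte order without padding. The helper names `put_u16`, `put_u32`, and `get_u32` are illustrative assumptions, not names defined by this profile.

```python
import struct
from typing import Tuple


def put_u16(value: int) -> bytes:
    """Encode an unsigned 16-bit integer as 2 big-endian bytes."""
    return struct.pack(">H", value)


def put_u32(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">I", value)


def get_u32(buf: bytes, offset: int) -> Tuple[int, int]:
    """Read a big-endian u32 at `offset`; return (value, next_offset)."""
    if offset + 4 > len(buf):
        raise ValueError("truncated u32")
    (value,) = struct.unpack_from(">I", buf, offset)
    return value, offset + 4
```

`struct.pack` raises `struct.error` for values outside the field's range, so an out-of-range integer cannot silently wrap.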
|
||||
|
||||
### 2.3 Lists
|
||||
|
||||
A list of values of some type `T` is encoded as:
|
||||
|
||||
```text
List<T> ::
    count (u32)
    element_0
    element_1
    ...
    element_{count-1}
```
|
||||
|
||||
* `count` is the number of elements (MAY be zero).
|
||||
* Elements are encoded in order using the canonical encoding of `T`.
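
A minimal sketch of the list rule, assuming the caller supplies the canonical element encoder. The name `encode_list` and its `encode_item` parameter are hypothetical and exist only for illustration.

```python
import struct
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def encode_list(items: Sequence[T], encode_item: Callable[[T], bytes]) -> bytes:
    """Emit count (u32, big-endian), then each element in list order."""
    out = bytearray(struct.pack(">I", len(items)))
    for item in items:
        out += encode_item(item)
    return bytes(out)
```

An empty list encodes as the four zero bytes of `count` alone.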
|
||||
|
||||
### 2.4 Embedded Reference (`EncodedRef`)
|
||||
|
||||
Within `EdgeBytes`, ASL/1 `Reference` values are embedded using a length-prefixed wrapper over canonical `ReferenceBytes` from `ENC/ASL1-CORE`:
|
||||
|
||||
```text
EncodedRef ::
    ref_len   (u32)
    ref_bytes (byte[0..ref_len-1])   // canonical ReferenceBytes
```
|
||||
|
||||
Where:
|
||||
|
||||
* `ref_bytes` MUST be the canonical `ReferenceBytes` encoding of some `Reference` value under `ENC/ASL1-CORE v1.x`:
|
||||
|
||||
```text
ReferenceBytes ::
    hash_id (u16)
    digest  (byte[...])   // remaining bytes in the frame
```
|
||||
|
||||
* `ref_len` MUST be the exact byte length of `ref_bytes` and MUST be ≥ 2.
|
||||
|
||||
Decoders MUST:
|
||||
|
||||
1. Read `ref_len (u32)`.
|
||||
2. Read exactly `ref_len` bytes as `ref_bytes`.
|
||||
3. Decode `ref_bytes` as `ReferenceBytes` per `ENC/ASL1-CORE v1.x`.
|
||||
4. Reject encodings where:
|
||||
|
||||
* `ref_len < 2`, or
|
||||
* `ref_bytes` is not a valid `ReferenceBytes` sequence (e.g. truncated or improperly framed in its context).
|
||||
|
||||
If the implementation also implements `HASH/ASL1` and recognizes the decoded `hash_id`, it MUST apply any length checks required by `ENC/ASL1-CORE` / `HASH/ASL1` for that `HashId` (e.g. fixed digest length). Failures MUST be treated as encoding/integrity errors.
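
The `EncodedRef` wrapper and the decoder checks above might look like the following sketch. The function names and the optional `digest_lengths` table (a mapping from known `HashId` to its fixed digest length) are assumptions made for illustration; real digest lengths come from `HASH/ASL1`, which this document does not reproduce.

```python
import struct
from typing import Dict, Optional, Tuple


def encode_ref(ref_bytes: bytes) -> bytes:
    """Wrap canonical ReferenceBytes as EncodedRef: ref_len (u32) || ref_bytes."""
    if len(ref_bytes) < 2:
        raise ValueError("ReferenceBytes must contain at least the u16 hash_id")
    return struct.pack(">I", len(ref_bytes)) + ref_bytes


def decode_ref(
    buf: bytes,
    offset: int = 0,
    digest_lengths: Optional[Dict[int, int]] = None,  # hypothetical HashId table
) -> Tuple[int, bytes, int]:
    """Decode one EncodedRef; return (hash_id, digest, next_offset)."""
    if offset + 4 > len(buf):
        raise ValueError("truncated ref_len")
    (ref_len,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    if ref_len < 2:
        raise ValueError("ref_len < 2")
    if offset + ref_len > len(buf):
        raise ValueError("truncated ref_bytes")
    ref_bytes = buf[offset:offset + ref_len]
    (hash_id,) = struct.unpack_from(">H", ref_bytes, 0)
    digest = ref_bytes[2:]
    # Length check applies only when the HashId is known to the caller.
    expected = (digest_lengths or {}).get(hash_id)
    if expected is not None and len(digest) != expected:
        raise ValueError("digest length does not match HashId")
    return hash_id, digest, offset + ref_len
```

Returning the next offset lets a caller decode consecutive `EncodedRef` entries in one forward pass.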
|
||||
|
||||
`EncodedRef` is purely an internal framing wrapper for this profile; it introduces no additional semantics beyond “a `Reference` encoded canonically and length-prefixed so it can be embedded in larger structures”.
|
||||
|
||||
This pattern mirrors `EncodedRef` from `ENC/PEL-TRACE-DAG/1` for cross-profile consistency.
|
||||
|
||||
### 2.5 Encoding version field (`edge_version`)
|
||||
|
||||
`EdgeBytes` includes an `edge_version (u16)` field:
|
||||
|
||||
* For `TGK1_EDGE_ENC_V1`, encoders **MUST** always write `edge_version = 1`.
|
||||
* Decoders for this profile:
|
||||
|
||||
* **MUST** accept `edge_version = 1`; and
|
||||
* **MUST** treat any other value as “**not this encoding**” and fail decoding.
|
||||
|
||||
Within this profile, `edge_version` is a **guard word**, not an evolution mechanism:
|
||||
|
||||
* This document will never assign any other meaning than “constant value 1” to `edge_version` for `TGK1_EDGE_ENC_V1`.
|
||||
* Values other than `1` simply indicate that the bytes are not an `EdgeBytes` value for this profile.
|
||||
|
||||
Any incompatible change to the `EdgeBytes` layout MUST be expressed as a **new encoding profile** (e.g. `TGK1_EDGE_ENC_V2` with its own Profile ID, and almost certainly a new `TypeTag`), not by reusing this profile with `edge_version = 2`.
|
||||
|
||||
Append-only extensions that would change the canonical mapping from `EdgeBody` to bytes are also out of scope for this profile; they belong in new profiles. Canonical `EdgeBody → EdgeBytes` mapping for `TGK1_EDGE_ENC_V1` is fixed and permanently tied to `edge_version = 1`.
|
||||
|
||||
---
|
||||
|
||||
## 3. Logical Model Reference (from TGK/1-CORE)
|
||||
|
||||
> **Source of truth:** `TGK/1-CORE`.
|
||||
> This section is an informative restatement; in any conflict, `TGK/1-CORE` governs.
|
||||
|
||||
### 3.1 Node
|
||||
|
||||
```text
|
||||
Node := Reference // ASL/1 Reference
|
||||
```
|
||||
|
||||
Nodes are graph vertices identified solely by their `Reference` value.
|
||||
|
||||
### 3.2 EdgeTypeId
|
||||
|
||||
```text
|
||||
EdgeTypeId = uint32
|
||||
```
|
||||
|
||||
Semantics of particular `EdgeTypeId` values are defined by TGK type catalogs and profiles, not by this document.
|
||||
|
||||
### 3.3 EdgeBody
|
||||
|
||||
```text
|
||||
EdgeBody {
|
||||
type: EdgeTypeId
|
||||
from: Node[] // ordered, MAY be empty
|
||||
to: Node[] // ordered, MAY be empty
|
||||
payload: Reference // always present
|
||||
}
|
||||
```
|
||||
|
||||
Relevant invariant from `TGK/1-CORE`:
|
||||
|
||||
> **TGK/EDGE-NONEMPTY-ENDPOINT/CORE/1**
|
||||
> For a well-formed `EdgeBody`, at least one of `from` or `to` **MUST** be non-empty.
|
||||
> An `EdgeBody` with both `from = []` and `to = []` is invalid and MUST NOT be produced or accepted as a TGK edge.
|
||||
|
||||
Other notes from `TGK/1-CORE`:
|
||||
|
||||
* Duplicates within `from` or `to` are allowed.
|
||||
* `payload` may also appear in `from` or `to`.
|
||||
* Semantics of such patterns, if any, are profile-specific.
|
||||
|
||||
`ENC/TGK1-EDGE/1` encodes exactly these fields and MUST NOT introduce additional logical data at the `EdgeBody` level.
|
||||
|
||||
---
|
||||
|
||||
## 4. EdgeBody Encoding
|
||||
|
||||
### 4.1 Overall layout: `EdgeBytes`
|
||||
|
||||
The canonical encoding of an `EdgeBody` under `TGK1_EDGE_ENC_V1` is a single self-contained byte sequence:
|
||||
|
||||
```text
EdgeBytes ::
    edge_version (u16)
    type_id      (u32)                          // EdgeTypeId
    from_count   (u32)
    from_nodes   (EncodedRef[0..from_count-1])
    to_count     (u32)
    to_nodes     (EncodedRef[0..to_count-1])
    payload_ref  (EncodedRef)
```
|
||||
|
||||
`EdgeBytes` is treated as an indivisible frame. When embedded in larger structures or protocols, the enclosing layer is responsible for providing the frame boundaries (e.g. via a length-prefix or message framing).
|
||||
|
||||
Field roles:
|
||||
|
||||
1. **edge_version (u16)**
|
||||
|
||||
* Guard word for this encoding profile.
|
||||
* For `TGK1_EDGE_ENC_V1`, encoders **MUST** set `edge_version = 1` for all values.
|
||||
* Decoders for this profile:
|
||||
|
||||
* **MUST** accept `edge_version = 1`; and
|
||||
* **MUST** treat any other value as “not a `TGK1_EDGE_ENC_V1` edge payload” and fail decoding.
|
||||
|
||||
`edge_version` is not a version knob for evolving `TGK1_EDGE_ENC_V1`; it is a constant sanity check to quickly reject mismatched bytes.
|
||||
|
||||
2. **type_id (u32)**
|
||||
|
||||
* Encodes `EdgeBody.type : EdgeTypeId`.
|
||||
* The meaning of each `EdgeTypeId` value is external to this spec.
|
||||
|
||||
3. **from_count (u32)** and **from_nodes**
|
||||
|
||||
* `from_count` is the length of `EdgeBody.from`.
|
||||
* `from_nodes` is a list of `from_count` `EncodedRef` entries, each encoding a `Node` (i.e. a `Reference`).
|
||||
* Order MUST match the logical `from` list; duplicates are allowed; the list MAY be empty.
|
||||
|
||||
4. **to_count (u32)** and **to_nodes**
|
||||
|
||||
* `to_count` is the length of `EdgeBody.to`.
|
||||
* `to_nodes` is a list of `to_count` `EncodedRef` entries.
|
||||
* Order MUST match the logical `to` list; duplicates are allowed; the list MAY be empty.
|
||||
|
||||
5. **payload_ref (EncodedRef)**
|
||||
|
||||
* Encodes `EdgeBody.payload : Reference`.
|
||||
* Always present and encoded as a single `EncodedRef`.
|
||||
|
||||
### 4.2 Encoding procedure (normative)
|
||||
|
||||
Let `E` be a logical `EdgeBody` value. The canonical encoding function:
|
||||
|
||||
```text
|
||||
encode_edgebody_tgk1_v1 : EdgeBody -> EdgeBytes
|
||||
```
|
||||
|
||||
is defined as:
|
||||
|
||||
1. Set `edge_version = 1`.
|
||||
|
||||
2. Emit `edge_version` as `u16`.
|
||||
|
||||
3. Emit `E.type` as `type_id (u32)`.
|
||||
|
||||
4. Let `from_count = len(E.from)`; emit `from_count (u32)`.
|
||||
|
||||
5. For each `Node` in `E.from` in order:
|
||||
|
||||
* Let `R` be that `Node` (an ASL `Reference` value).
|
||||
* Encode `R` as canonical `ReferenceBytes` using `ENC/ASL1-CORE v1.x`.
|
||||
* Wrap as `EncodedRef` (see §2.4) and append.
|
||||
|
||||
6. Let `to_count = len(E.to)`; emit `to_count (u32)`.
|
||||
|
||||
7. For each `Node` in `E.to` in order:
|
||||
|
||||
* Encode as `EncodedRef` as above and append.
|
||||
|
||||
8. Encode `E.payload` as canonical `ReferenceBytes`, wrap as `EncodedRef`, and append as `payload_ref`.
|
||||
|
||||
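9. Enforce the TGK non-empty endpoint invariant at encoding time. The check depends only on `from_count` and `to_count`, so an encoder can perform it before emitting any bytes: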
|
||||
|
||||
* If `from_count == 0` **and** `to_count == 0`, the encoder MUST fail and MUST NOT produce `EdgeBytes` for this `EdgeBody` under this profile.
|
||||
|
||||
> **TGK1-EDGE-NONEMPTY/ENC/1**
|
||||
> Encoders for `TGK1_EDGE_ENC_V1` **MUST** reject any attempt to encode an `EdgeBody` with `from = []` and `to = []`.
|
||||
> Such a value is not a well-formed TGK edge per `TGK/1-CORE` and MUST NOT be emitted as an EdgeArtifact payload.
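
The encoding steps above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the `EdgeBody` dataclass here holds each `Reference` as its already-canonical `ReferenceBytes`, and the field names `from_nodes` and `to_nodes` are chosen only because `from` is a Python keyword.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class EdgeBody:
    type: int                # EdgeTypeId (u32)
    from_nodes: List[bytes]  # canonical ReferenceBytes per node (assumed pre-encoded)
    to_nodes: List[bytes]
    payload: bytes           # canonical ReferenceBytes of the payload Reference


def _encoded_ref(ref_bytes: bytes) -> bytes:
    """EncodedRef: ref_len (u32) || ref_bytes."""
    return struct.pack(">I", len(ref_bytes)) + ref_bytes


def encode_edgebody_tgk1_v1(edge: EdgeBody) -> bytes:
    # Check the non-empty endpoint invariant first so no partial output exists.
    if not edge.from_nodes and not edge.to_nodes:
        raise ValueError("TGK1-EDGE-NONEMPTY/ENC/1: from and to are both empty")
    out = bytearray()
    out += struct.pack(">H", 1)                     # edge_version guard word
    out += struct.pack(">I", edge.type)             # type_id
    out += struct.pack(">I", len(edge.from_nodes))  # from_count
    for ref in edge.from_nodes:
        out += _encoded_ref(ref)
    out += struct.pack(">I", len(edge.to_nodes))    # to_count
    for ref in edge.to_nodes:
        out += _encoded_ref(ref)
    out += _encoded_ref(edge.payload)               # payload_ref
    return bytes(out)
```

The sketch performs the invariant check before writing anything, which keeps the function usable with streaming sinks that cannot retract bytes.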
|
||||
|
||||
### 4.3 Decoding procedure (normative)
|
||||
|
||||
Given a byte slice known to contain exactly one `EdgeBytes` frame under this profile, the canonical decoding function:
|
||||
|
||||
```text
|
||||
decode_edgebody_tgk1_v1 : EdgeBytes -> EdgeBody | error
|
||||
```
|
||||
|
||||
is defined as:
|
||||
|
||||
1. Read `edge_version (u16)`.
|
||||
|
||||
* If `edge_version != 1`, fail with an encoding error (e.g. “not `TGK1_EDGE_ENC_V1`”).
|
||||
|
||||
2. Read `type_id (u32)`.
|
||||
|
||||
3. Read `from_count (u32)`.
|
||||
|
||||
* For `i = 0 .. from_count-1`, read and decode one `EncodedRef` as a `Reference` and append to `from_nodes`.
|
||||
|
||||
4. Read `to_count (u32)`.
|
||||
|
||||
* For `j = 0 .. to_count-1`, read and decode one `EncodedRef` and append to `to_nodes`.
|
||||
|
||||
5. Read `payload_ref` as a single `EncodedRef` and decode to `payload : Reference`.
|
||||
|
||||
6. If `from_count == 0` **and** `to_count == 0`, fail with an encoding error:
|
||||
|
||||
* This violates `TGK/EDGE-NONEMPTY-ENDPOINT/CORE/1` and `TGK1-EDGE-NONEMPTY/ENC/1`.
|
||||
|
||||
7. If the decoding context expects an isolated `EdgeBytes` value:
|
||||
|
||||
* Once `payload_ref` has been read in step 5, any unread bytes remaining in the slice MUST be treated as an encoding error (trailing data).
|
||||
|
||||
8. Construct and return:
|
||||
|
||||
```text
|
||||
EdgeBody {
|
||||
type = EdgeTypeId(type_id)
|
||||
from = from_nodes
|
||||
to = to_nodes
|
||||
payload = payload
|
||||
}
|
||||
```
|
||||
|
||||
Decoders MUST additionally treat as encoding errors:
|
||||
|
||||
* truncated sequences (insufficient bytes for any declared field or `EncodedRef`);
|
||||
* invalid `EncodedRef` encodings (see §2.4);
|
||||
* any integer reads that cannot be completed because the input ends early.
|
||||
|
||||
`decode_edgebody_tgk1_v1` MUST be deterministic and MUST NOT depend on any external configuration beyond:
|
||||
|
||||
* the bytes in the `EdgeBytes` frame; and
|
||||
* the static definition of `ENC/ASL1-CORE v1.x` used to decode embedded `ReferenceBytes`.
|
||||
|
||||
Recognition of `type_id` values (as supported or not in a given ExecutionEnvironment) is handled by `TGK/1-CORE` and the local catalog. This profile always decodes the raw `EdgeBody` structure, regardless of whether the environment later chooses to treat it as an EdgeArtifact.
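
A compact sketch of the decoding procedure, assuming the input slice holds exactly one isolated `EdgeBytes` frame. The `Reference` and `EdgeBody` tuples and the `EdgeDecodeError` class are hypothetical stand-ins for whatever types an implementation uses; `HashId`-specific digest length checks are omitted.

```python
import struct
from typing import List, NamedTuple, Tuple


class Reference(NamedTuple):
    hash_id: int
    digest: bytes


class EdgeBody(NamedTuple):
    type: int
    from_nodes: List[Reference]
    to_nodes: List[Reference]
    payload: Reference


class EdgeDecodeError(ValueError):
    """Any encoding-layer failure for this profile."""


def _take(buf: bytes, offset: int, n: int) -> Tuple[bytes, int]:
    end = offset + n
    if end > len(buf):
        raise EdgeDecodeError("truncated input")
    return buf[offset:end], end


def _read_ref(buf: bytes, offset: int) -> Tuple[Reference, int]:
    raw, offset = _take(buf, offset, 4)
    (ref_len,) = struct.unpack(">I", raw)
    if ref_len < 2:
        raise EdgeDecodeError("ref_len < 2")
    ref_bytes, offset = _take(buf, offset, ref_len)
    (hash_id,) = struct.unpack(">H", ref_bytes[:2])
    return Reference(hash_id, ref_bytes[2:]), offset


def _read_refs(buf: bytes, offset: int) -> Tuple[List[Reference], int]:
    raw, offset = _take(buf, offset, 4)
    (count,) = struct.unpack(">I", raw)
    refs = []
    for _ in range(count):
        ref, offset = _read_ref(buf, offset)
        refs.append(ref)
    return refs, offset


def decode_edgebody_tgk1_v1(buf: bytes) -> EdgeBody:
    raw, offset = _take(buf, 0, 2)
    if struct.unpack(">H", raw)[0] != 1:
        raise EdgeDecodeError("not TGK1_EDGE_ENC_V1 (edge_version != 1)")
    raw, offset = _take(buf, offset, 4)
    (type_id,) = struct.unpack(">I", raw)
    from_nodes, offset = _read_refs(buf, offset)
    to_nodes, offset = _read_refs(buf, offset)
    payload, offset = _read_ref(buf, offset)
    if not from_nodes and not to_nodes:
        raise EdgeDecodeError("from and to are both empty")
    if offset != len(buf):
        raise EdgeDecodeError("trailing data")
    return EdgeBody(type_id, from_nodes, to_nodes, payload)
```

The decoder reads nothing but its input bytes, which matches the determinism requirement stated above.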
|
||||
|
||||
---
|
||||
|
||||
## 5. EdgeArtifact Binding & Profile Selection
|
||||
|
||||
### 5.1 EdgeArtifact shape
|
||||
|
||||
Under this profile, EdgeArtifacts MUST be ASL/1 Artifacts of the form:
|
||||
|
||||
```text
|
||||
Artifact {
|
||||
bytes = EdgeBytes
|
||||
type_tag = TYPE_TAG_TGK1_EDGE_V1
|
||||
}
|
||||
```
|
||||
|
||||
Where:
|
||||
|
||||
* `TYPE_TAG_TGK1_EDGE_V1` is a `TypeTag` whose concrete `tag_id`:
|
||||
|
||||
* is assigned in the global TypeTag registry, and
|
||||
* is included in the environment’s `EDGE_TAG_SET` when this profile is active.
|
||||
|
||||
ExecutionEnvironments that wish to treat such Artifacts as TGK edges MUST:
|
||||
|
||||
* include `TYPE_TAG_TGK1_EDGE_V1.tag_id` in their configured `EDGE_TAG_SET`; and
|
||||
* register `TGK1_EDGE_ENC_V1` as the edge-encoding profile for that tag, so that `decode_edge_payload_TGK1_EDGE` is used for those Artifacts’ `bytes`.
|
||||
|
||||
This document treats `TYPE_TAG_TGK1_EDGE_V1` symbolically and does not assign a numeric `tag_id`.
|
||||
|
||||
### 5.2 Integration with TGK/1-CORE’s `decode_edge_payload_P`
|
||||
|
||||
For ExecutionEnvironments that activate `TGK1_EDGE_ENC_V1` for `TYPE_TAG_TGK1_EDGE_V1`, the corresponding `decode_edge_payload_P` function from `TGK/1-CORE §3.2` is:
|
||||
|
||||
```text
|
||||
decode_edge_payload_TGK1_EDGE(bytes: OctetString) -> EdgeBody | error
|
||||
```
|
||||
|
||||
defined by:
|
||||
|
||||
```text
|
||||
decode_edgebody_tgk1_v1(bytes)
|
||||
```
|
||||
|
||||
from §4.3.
|
||||
|
||||
Conformant implementations MUST:
|
||||
|
||||
* apply `decode_edge_payload_TGK1_EDGE` only to Artifacts whose `type_tag.tag_id` is configured to use this profile; and
|
||||
* treat any decoding failure as “not a valid edge payload for this profile”.
|
||||
|
||||
Multi-profile behavior (e.g., co-existence with other edge encodings) is governed by `TGK/1-CORE §3.2`. In particular:
|
||||
|
||||
* If more than one active profile successfully decodes the same `Artifact.bytes`, all such profiles MUST decode to the same logical `EdgeBody` value.
|
||||
* If two active profiles decode the same bytes to different `EdgeBody` values, the ExecutionEnvironment MUST NOT treat that Artifact as an EdgeArtifact until the conflict is resolved.
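
One way to picture profile selection and the multi-profile rule is the sketch below. The registry shape (a dict from `tag_id` to a list of decoder callables), the function name `decode_as_edge`, and `ProfileConflictError` are all assumptions for illustration; `TGK/1-CORE §3.2` remains the governing text.

```python
from typing import Callable, Dict, List, Optional


class ProfileConflictError(Exception):
    """Active profiles decoded the same bytes to different EdgeBody values."""


def decode_as_edge(
    tag_id: int,
    data: bytes,
    decoders_by_tag: Dict[int, List[Callable[[bytes], object]]],
) -> Optional[object]:
    """Return the decoded EdgeBody, or None if the Artifact is not an edge."""
    decoded = []
    for decode in decoders_by_tag.get(tag_id, []):
        try:
            decoded.append(decode(data))
        except ValueError:
            continue  # not a valid edge payload for that profile
    if not decoded:
        return None
    first = decoded[0]
    if any(other != first for other in decoded[1:]):
        raise ProfileConflictError("profiles disagree; do not treat as an edge")
    return first
```

Raising on conflict, rather than picking a winner, is a design choice of this sketch: it keeps the Artifact out of the edge set until an operator resolves the disagreement.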
|
||||
|
||||
---
|
||||
|
||||
## 6. EdgeRef Identity via ASL/1-CORE
|
||||
|
||||
Given:
|
||||
|
||||
* `EdgeBytes` from §4;
|
||||
|
||||
* an `EdgeArtifact`:
|
||||
|
||||
```text
|
||||
A_edge = Artifact {
|
||||
bytes = EdgeBytes
|
||||
type_tag = TYPE_TAG_TGK1_EDGE_V1
|
||||
}
|
||||
```
|
||||
|
||||
* `ENC/ASL1-CORE v1.x` for canonical `ArtifactBytes`;
|
||||
|
||||
* a hash algorithm `H` with `HashId = HID` from `HASH/ASL1`,
|
||||
|
||||
the canonical `EdgeRef : Reference` (the edge identity) is:
|
||||
|
||||
```text
|
||||
ArtifactBytes = encode_artifact_core_v1(A_edge)
|
||||
digest = H(ArtifactBytes)
|
||||
EdgeRef = Reference { hash_id = HID, digest = digest }
|
||||
```
|
||||
|
||||
This profile does not introduce any new identity scheme. Edge identity is entirely determined by:
|
||||
|
||||
* the ASL/1 Artifact identity model,
|
||||
* the selected encoding profile (typically `ASL_ENC_CORE_V1`), and
|
||||
* the selected hash algorithm (`HASH/ASL1`).
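
The derivation above can be sketched once canonical `ArtifactBytes` are available. Everything concrete here is an assumption: the layout of `ArtifactBytes` belongs to `ENC/ASL1-CORE` and is taken as an input, SHA-256 is only a stand-in for whichever algorithm `HASH/ASL1` assigns, and the `HashId` value is supplied by the caller rather than taken from any registry.

```python
import hashlib
import struct
from typing import Callable


def reference_bytes(hash_id: int, digest: bytes) -> bytes:
    """ReferenceBytes: hash_id (u16, big-endian) || digest."""
    return struct.pack(">H", hash_id) + digest


def edge_ref(
    artifact_bytes: bytes,               # canonical ArtifactBytes, produced elsewhere
    hash_id: int,                        # HashId of the chosen algorithm
    hash_fn: Callable[[bytes], bytes],   # H from the text above
) -> bytes:
    """Return the canonical ReferenceBytes of the EdgeRef."""
    return reference_bytes(hash_id, hash_fn(artifact_bytes))


def sha256_digest(data: bytes) -> bytes:
    """Stand-in hash for illustration only."""
    return hashlib.sha256(data).digest()
```

Because the function is pure, two parties holding the same `ArtifactBytes` and the same hash configuration derive the same `EdgeRef`.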
|
||||
|
||||
---
|
||||
|
||||
## 7. Canonicality & Injectivity
|
||||
|
||||
### 7.1 Injectivity
|
||||
|
||||
> **TGK1-EDGE-INJECTIVE/ENC/1**
|
||||
> Under `TGK1_EDGE_ENC_V1`, the mapping:
|
||||
>
|
||||
> ```text
|
||||
> EdgeBody -> EdgeBytes
|
||||
> ```
|
||||
>
|
||||
> MUST be injective. That is, for any two `EdgeBody` values `E1` and `E2`:
|
||||
>
|
||||
> ```text
|
||||
> E1 != E2 ⇒ encode_edgebody_tgk1_v1(E1) != encode_edgebody_tgk1_v1(E2)
|
||||
> ```
|
||||
|
||||
This is ensured by:
|
||||
|
||||
* encoding all logical fields (`type`, `from`, `to`, `payload`);
|
||||
* preserving list order exactly;
|
||||
* using a fixed, explicit binary layout.
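
A small check can make the injectivity argument concrete: moving a node between `from` and `to`, swapping list order, or changing `type` or `payload` each yields different bytes, because counts and lengths precede their content. The compact `encode_edge` helper and the three-byte sample references are hypothetical, and the sketch skips the non-empty endpoint check for brevity.

```python
import struct
from typing import Sequence


def encode_edge(
    type_id: int,
    from_refs: Sequence[bytes],
    to_refs: Sequence[bytes],
    payload: bytes,
) -> bytes:
    """Compact EdgeBytes encoder used only to compare encodings."""

    def encoded_ref(ref_bytes: bytes) -> bytes:
        return struct.pack(">I", len(ref_bytes)) + ref_bytes

    def ref_list(refs: Sequence[bytes]) -> bytes:
        return struct.pack(">I", len(refs)) + b"".join(encoded_ref(r) for r in refs)

    header = struct.pack(">HI", 1, type_id)  # edge_version, type_id (no padding with ">")
    return header + ref_list(from_refs) + ref_list(to_refs) + encoded_ref(payload)


A = b"\x00\x01\xaa"  # sample ReferenceBytes (illustrative)
B = b"\x00\x01\xbb"
P = b"\x00\x01\xcc"

variants = [
    (1, [A], [B], P),     # baseline
    (1, [A, B], [], P),   # B moved from `to` into `from`
    (1, [B, A], [], P),   # same members, different order
    (2, [A], [B], P),     # different type
    (1, [A], [B], A),     # different payload
]
encodings = {encode_edge(*v) for v in variants}
assert len(encodings) == len(variants)  # every distinct EdgeBody got distinct bytes
```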
|
||||
|
||||
### 7.2 Stability
|
||||
|
||||
For the fixed profile `TGK1_EDGE_ENC_V1` (with the guard word `edge_version = 1`):
|
||||
|
||||
* The same logical `EdgeBody` MUST always encode to the same `EdgeBytes` across:
|
||||
|
||||
* implementations,
|
||||
* platforms,
|
||||
* executions,
|
||||
* and time.
|
||||
|
||||
Encoders MUST NOT:
|
||||
|
||||
* reorder elements of `from` or `to`;
|
||||
* alter integer widths or endianness;
|
||||
* introduce alternative layouts for any field;
|
||||
* use any `edge_version` other than `1`.
|
||||
|
||||
---
|
||||
|
||||
## 8. Error Handling (Encoding Layer)
|
||||
|
||||
Decoders for this profile MUST treat each of the following as an **encoding error**, surfaced through an implementation-defined error category at the API boundary:
|
||||
|
||||
1. **Guard word mismatch**
|
||||
|
||||
* `edge_version != 1`.
|
||||
|
||||
2. **Truncated fields**
|
||||
|
||||
* Not enough bytes to read any declared field (`u16`, `u32`, `EncodedRef`, list elements).
|
||||
|
||||
3. **Invalid `EncodedRef`**
|
||||
|
||||
* `ref_len < 2`; or
|
||||
* `ref_bytes` is not a valid `ReferenceBytes` sequence per `ENC/ASL1-CORE v1.x`; or
|
||||
* (when `HASH/ASL1` is implemented and `hash_id` is known) the digest length implied by `ref_bytes` does not match the canonical length for that `HashId`.
|
||||
|
||||
4. **Empty endpoints**
|
||||
|
||||
* `from_count == 0` **and** `to_count == 0` (violation of `TGK/EDGE-NONEMPTY-ENDPOINT/CORE/1`).
|
||||
|
||||
5. **Inconsistent list lengths**
|
||||
|
||||
* Fewer actual `EncodedRef` entries than indicated by `from_count` or `to_count`.
|
||||
|
||||
6. **Trailing data in isolated contexts**
|
||||
|
||||
* Additional bytes remaining after a full `EdgeBytes` value has been decoded, when the decoding context expects exactly one `EdgeBytes` frame.
|
||||
|
||||
Translating these into concrete error codes (e.g. `ERR_TGK1_EDGE_ENC_INVALID`) is implementation-specific, but MUST result in rejection of the payload as an `EdgeBytes` value under this profile.
|
||||
|
||||
Semantic errors about `EdgeTypeId` recognition or edge-type-specific constraints are handled by TGK catalogs and higher profiles, not at the encoding layer.
|
||||
|
||||
---
|
||||
|
||||
## 9. Streaming & Implementation Notes
|
||||
|
||||
Implementations MUST be able to encode and decode `EdgeBytes` in a **single forward-only pass**:
|
||||
|
||||
* All length prefixes (`from_count`, `to_count`, `ref_len`) precede their content.
|
||||
* Decoders MUST NOT require backtracking to interpret the structure.
|
||||
|
||||
For large edges (many endpoints):
|
||||
|
||||
* Encoders MAY stream `EncodedRef` entries as they are generated.
|
||||
* Decoders MAY stream `EncodedRef` entries to consumers or hashers as they are read.
|
||||
|
||||
Any such streaming strategy MUST be observationally equivalent to decoding the entire `EdgeBytes` into an `EdgeBody` in memory and MUST respect the canonical layout.
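
A forward-only decoder might be sketched as a generator over a binary stream. The event shape `(field_name, value)` and the function name `iter_edge_fields` are assumptions; the sketch also assumes a buffered stream whose `read(n)` returns fewer than `n` bytes only at end of input.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple


def _read_exact(stream: BinaryIO, n: int) -> bytes:
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("truncated input")
    return data


def _read_ref(stream: BinaryIO) -> bytes:
    """Read one EncodedRef and return its raw ReferenceBytes."""
    (ref_len,) = struct.unpack(">I", _read_exact(stream, 4))
    if ref_len < 2:
        raise ValueError("ref_len < 2")
    return _read_exact(stream, ref_len)


def iter_edge_fields(stream: BinaryIO) -> Iterator[Tuple[str, object]]:
    """Yield (field_name, value) events in wire order, never seeking backwards."""
    (version,) = struct.unpack(">H", _read_exact(stream, 2))
    if version != 1:
        raise ValueError("not TGK1_EDGE_ENC_V1")
    (type_id,) = struct.unpack(">I", _read_exact(stream, 4))
    yield "type", type_id
    counts = []
    for name in ("from", "to"):
        (count,) = struct.unpack(">I", _read_exact(stream, 4))
        counts.append(count)
        for _ in range(count):
            yield name, _read_ref(stream)
    payload = _read_ref(stream)
    if counts == [0, 0]:
        raise ValueError("from and to are both empty")
    if stream.read(1):
        raise ValueError("trailing data")
    yield "payload", payload
```

A consumer has to discard any events already received if the generator later raises; only then is the result equivalent to decoding the whole frame in memory.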
|
||||
|
||||
---
|
||||
|
||||
## 10. Conformance
|
||||
|
||||
An implementation is **ENC/TGK1-EDGE/1–conformant** if, for `TGK1_EDGE_ENC_V1`, it:
|
||||
|
||||
1. **Implements canonical EdgeBody encoding/decoding**
|
||||
|
||||
* Implements `encode_edgebody_tgk1_v1` and `decode_edgebody_tgk1_v1` exactly as specified in §4.
|
||||
* Always writes `edge_version = 1` when encoding.
|
||||
* Accepts only `edge_version = 1` and treats any other value as “not this encoding”.
|
||||
|
||||
2. **Uses `EncodedRef` correctly**
|
||||
|
||||
* Embeds `Reference` values via `EncodedRef` as in §2.4.
|
||||
* Uses canonical `ReferenceBytes` from `ENC/ASL1-CORE v1.x` when forming `ref_bytes`.
|
||||
* Applies `HASH/ASL1` length checks for known `HashId`s when available.
|
||||
|
||||
3. **Enforces TGK invariants at the encoding layer**
|
||||
|
||||
* Rejects encodings with both `from` and `to` empty (`TGK1-EDGE-NONEMPTY/ENC/1`).
|
||||
* Treats malformed payloads as encoding errors as per §8.
|
||||
|
||||
4. **Binds EdgeBytes into EdgeArtifacts correctly**
|
||||
|
||||
* When forming EdgeArtifacts, sets:
|
||||
|
||||
```text
|
||||
Artifact.bytes = EdgeBytes
|
||||
Artifact.type_tag = TYPE_TAG_TGK1_EDGE_V1
|
||||
```
|
||||
|
||||
* Does not embed additional logical data into the Artifact beyond `EdgeBody` and `type_tag`.
|
||||
|
||||
5. **Derives EdgeRef identity via ASL/1-CORE**
|
||||
|
||||
* Uses `ENC/ASL1-CORE v1.x` and `HASH/ASL1` for identity, as in §6.
|
||||
* Does not introduce alternative edge identity mechanisms at this layer.
|
||||
|
||||
6. **Integrates with TGK/1-CORE profile selection**
|
||||
|
||||
* Applies `decode_edge_payload_TGK1_EDGE` only to Artifacts whose `type_tag.tag_id` is configured for this profile.
|
||||
* Respects multi-profile behavior rules from `TGK/1-CORE §3.2` when other edge encodings are also active.
|
||||
|
||||
7. **Preserves injectivity and stability**
|
||||
|
||||
* Distinct `EdgeBody` values always produce distinct `EdgeBytes`.
|
||||
* The same `EdgeBody` always produces the same `EdgeBytes` under this profile.
|
||||
|
||||
Everything else — storage layout, access protocols, graph indexes, provenance algorithms, and edge-type semantics — is defined by `ASL/1-STORE`, `TGK/1-CORE`, TGK catalogs, and higher-layer profiles.
|
||||
|
||||
---
|
||||
|
||||
## 11. Informative Example (Sketch)
|
||||
|
||||
> Non-normative; values and hex are illustrative only.
|
||||
|
||||
Consider an edge:
|
||||
|
||||
```text
|
||||
EdgeBody {
|
||||
type = 0x00000010 // EDGE_EXECUTION (for example)
|
||||
from = [N_prog, N_input]
|
||||
to = [N_output]
|
||||
payload = R_receipt
|
||||
}
|
||||
```
|
||||
|
||||
Where `N_prog`, `N_input`, `N_output`, and `R_receipt` are `Reference` values with canonical `ReferenceBytes`:
|
||||
|
||||
```text
|
||||
Ref(N_prog) = ReferenceBytes(N_prog) // length = len_pg, bytes = bytes_pg
|
||||
Ref(N_input) = ReferenceBytes(N_input) // length = len_in, bytes = bytes_in
|
||||
Ref(N_output) = ReferenceBytes(N_output) // length = len_out, bytes = bytes_out
|
||||
Ref(R_receipt) = ReferenceBytes(R_receipt) // length = len_rc, bytes = bytes_rc
|
||||
```
|
||||
|
||||
Then `EdgeBytes` under this profile are:
|
||||
|
||||
```text
edge_version = 0001                    ; u16 (guard word)

type_id      = 00000010                ; u32

from_count   = 00000002                ; 2 sources
from_nodes   =
  000000?? bytes_pg ...                ; EncodedRef(N_prog)
  000000?? bytes_in ...                ; EncodedRef(N_input)

to_count     = 00000001                ; 1 target
to_nodes     =
  000000?? bytes_out ...               ; EncodedRef(N_output)

payload_ref  =
  000000?? bytes_rc ...                ; EncodedRef(R_receipt)
```
|
||||
|
||||
Where each `EncodedRef(X)` is:
|
||||
|
||||
```text
|
||||
ref_len(X) (u32) || ReferenceBytes(X)
|
||||
```
|
||||
|
||||
These `EdgeBytes` become `Artifact.bytes` for an EdgeArtifact with `type_tag = TYPE_TAG_TGK1_EDGE_V1`. All conformant encoders MUST produce the same bytes for the same logical `EdgeBody`; all conformant decoders MUST reconstruct the same `EdgeBody` from those bytes.
|
||||
|
||||
---
|
||||
|
||||
**End of `ENC/TGK1-EDGE/1 v0.1.0 — Canonical Encoding for TGK EdgeArtifacts`.**
|
||||
|
||||
---
|
||||
|
||||
## Document History
|
||||
|
||||
* **0.1.0 (2025-11-16):** Registered as Tier-1 spec and aligned to the Amduat 2.0 substrate baseline.
|
||||
240
tier1/opreg-tgk-docgraph-1.md
Normal file
|
|
@ -0,0 +1,240 @@
|
|||
# OPREG/TGK-DOCGRAPH/1 — Document Graph Registry
|
||||
|
||||
Status: Draft
|
||||
Owner: Architecture
|
||||
Version: 0.1.0
|
||||
SoT: Plan
|
||||
Last Updated: 2025-12-01
|
||||
Linked Phase Pack: PH12
|
||||
Tags: [registry, tgk, docgraph]
|
||||
|
||||
<!-- Source: /amduat/logs/ph12/evidence/import/PH12-EV-IMPORT-001/opreg-tgk-docgraph-design-20251201.md | Canonical: /amduat/tier1/opreg-tgk-docgraph-1.md -->
|
||||
|
||||
**Document ID:** `OPREG/TGK-DOCGRAPH/1`
|
||||
**Layer:** L1 Profile (TGK Doc Graph Registry over `TGK/1-CORE` + `ENC/TGK1-EDGE/1`)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/1-CORE v0.4.x` — `Artifact`, `Reference`, `TypeTag`, `HashId`
|
||||
* `ENC/ASL1-CORE v1.x` — canonical encodings for Artifacts and References
|
||||
* `HASH/ASL1 v0.2.x` — ASL1 hash family (`HASH-ASL1-256`)
|
||||
* `TGK/1-CORE v0.7.x` — trace graph kernel: `Node`, `EdgeBody`, `EdgeTypeId`
|
||||
* `ENC/TGK1-EDGE/1 v0.1.x` — canonical encoding for `EdgeBody` / EdgeArtifacts
|
||||
* `AMDUAT-DOCID` (Tier-0) — document identity and SoT/surface model
|
||||
|
||||
**Integrates with (informative):**
|
||||
|
||||
* `TGK/STORE/1` — graph store/query profile over ASL/1-STORE + TGK
|
||||
* ADR-032 and PH10/PH12 import designs (RΩ / export)
|
||||
* Future doc graph consumers (assistant overlays, IDX, provenance views)
|
||||
|
||||
© 2025 Amduat Programme.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
---
|
||||
|
||||
## 0. Purpose and Non-Goals
|
||||
|
||||
### 0.1 Purpose
|
||||
|
||||
`OPREG/TGK-DOCGRAPH/1` defines a **doc/import/navigation graph registry** for Amduat:
|
||||
|
||||
* It names **node concepts** (as ASL/1 Artifacts) for:
  * conceptual documents (DOCID lineages),
  * document versions at a given snapshot (e.g. RΩ),
  * Git commits and blobs,
  * Amduat SoT instances.
* It names **edge types** (`EdgeTypeId`s) that connect those concepts:
  * document ↔ version, surface, SoT state,
  * version ↔ Git blob/commit,
  * document ↔ Amduat instance.
* It constrains how those edges are represented as EdgeArtifacts under
  `ENC/TGK1-EDGE/1` and consumed via `TGK/STORE/1`.
|
||||
|
||||
This registry is intentionally **doc/import scoped**. Execution, fact, and
|
||||
certificate edges live in their own TGK/OPREG registries and MUST NOT reuse
|
||||
`EdgeTypeId` assignments from this doc graph registry.
|
||||
|
||||
This Tier-1 stub is the **canonical registry companion** to the PH12 design
note `PH12-EV-IMPORT-001` (Doc Graph OPREG Profile Design,
`/logs/ph12/evidence/import/PH12-EV-IMPORT-001/opreg-tgk-docgraph-design-20251201.md`).
That note records design intent and sandbox experience; this document is the SoT
for the node and edge vocabulary.
|
||||
|
||||
### 0.2 Non-goals
|
||||
|
||||
This registry does **not** define:
|
||||
|
||||
* any storage API (`ASL/1-STORE`, `TGK/STORE/1` already cover that),
|
||||
* any provenance algorithms or queries (`TGK/PROV/1` and higher layers),
|
||||
* any assistant or overlay behavior (those consume this registry),
|
||||
* concrete import/export profiles (ADR-032 handles those).
|
||||
|
||||
It only defines **concepts and edge types**; encoding and storage use existing
|
||||
Tier-1 profiles.
|
||||
|
||||
---
|
||||
|
||||
## 1. Node Concepts (Informative overview)
|
||||
|
||||
This section summarizes node concepts; canonical encodings and type_tags are
|
||||
defined in companion encoding profiles (TBD).
|
||||
|
||||
### 1.1 DOC_CONCEPT
|
||||
|
||||
Conceptual governed document identity per `AMDUAT-DOCID`:
|
||||
|
||||
* `identity_authority` (string),
|
||||
* `lineage_id` (string),
|
||||
* optional `doc_code` (string),
|
||||
* optional `code_status` (e.g. `tentative`, `stable`).
|
||||
|
||||
There is exactly one `DOC_CONCEPT` node per `(identity_authority, lineage_id)`.
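
The uniqueness rule can be illustrated with a small in-memory index. This is a minimal sketch, not a canonical encoding: the `DocConcept` and `DocConceptIndex` names are hypothetical, and raising on conflicting optional fields is an assumption rather than something this registry specifies.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DocConcept:
    """Hypothetical in-memory shape of a DOC_CONCEPT node."""
    identity_authority: str
    lineage_id: str
    doc_code: Optional[str] = None
    code_status: Optional[str] = None  # e.g. "tentative" or "stable"

    @property
    def key(self) -> Tuple[str, str]:
        # The pair that identifies the conceptual document.
        return (self.identity_authority, self.lineage_id)


class DocConceptIndex:
    """Keeps at most one DOC_CONCEPT per (identity_authority, lineage_id)."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], DocConcept] = {}

    def add(self, concept: DocConcept) -> DocConcept:
        existing = self._by_key.get(concept.key)
        if existing is None:
            self._by_key[concept.key] = concept
            return concept
        if existing != concept:
            # Assumption: a second, different description of the same
            # lineage is treated as an ingest error.
            raise ValueError(f"conflicting DOC_CONCEPT for {concept.key}")
        return existing

    def __len__(self) -> int:
        return len(self._by_key)
```

Re-adding an identical concept returns the node that is already indexed, so ingest can be repeated without creating duplicates.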
|
||||
|
||||
### 1.2 DOC_VERSION
|
||||
|
||||
Versioned SoT slice of a governed document at a snapshot commit:
|
||||
|
||||
* `identity_authority`, `lineage_id`, `doc_code`, `code_status`,
|
||||
* `g_commit` (Git commit id),
|
||||
* `sha256` (content hash of the doc bytes at `g_commit`),
|
||||
* `path` (repository path at `g_commit`, e.g. `/amduat/tier0/docid.md`),
|
||||
* `surface`, `sot` (SoT state) per DOCID header.
|
||||
|
||||
Multiple `DOC_VERSION` nodes may exist for a `DOC_CONCEPT` across commits.
|
||||
|
||||
### 1.3 GIT_COMMIT
|
||||
|
||||
Git commit metadata:
|
||||
|
||||
* `commit` (sha1),
|
||||
* `parents` (list of parent commit ids),
|
||||
* `tree` (tree id),
|
||||
* `author_name`, `author_email`, `authored_at`,
|
||||
* `committer_name`, `committer_email`, `committed_at`,
|
||||
* summary or truncated message.
|
||||
|
||||
### 1.4 GIT_BLOB
|
||||
|
||||
Content snapshot for a single blob at `g_commit`:
|
||||
|
||||
* `blob_sha` (sha1),
|
||||
* `sha256` (content hash),
|
||||
* `size_bytes`,
|
||||
* `mode` (tree mode, including exec/symlink bits),
|
||||
* `path` at `g_commit`.
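
As an illustration of how the two hashes differ, the sketch below computes both for a blob's content. It assumes `blob_sha` is the standard Git object id of a SHA-1 repository (the hash of a `blob <size>\0` header plus the content) and that both hashes are carried as lowercase hex; the `GitBlobNode` shape and `describe_blob` helper are hypothetical and not a canonical node encoding.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class GitBlobNode:
    """Hypothetical in-memory shape of a GIT_BLOB node."""
    blob_sha: str    # Git object id (SHA-1 repositories), lowercase hex
    sha256: str      # SHA-256 of the raw content, lowercase hex
    size_bytes: int
    mode: str        # tree mode as Git prints it, e.g. "100644"
    path: str


def describe_blob(content: bytes, mode: str, path: str) -> GitBlobNode:
    # Git hashes the header "blob <size>\0" followed by the content,
    # so blob_sha is not the SHA-1 of the content alone.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return GitBlobNode(
        blob_sha=hashlib.sha1(header + content).hexdigest(),
        sha256=hashlib.sha256(content).hexdigest(),
        size_bytes=len(content),
        mode=mode,
        path=path,
    )
```

Carrying both values lets a consumer match the node against Git (via `blob_sha`) and against content-addressed stores that hash raw bytes (via `sha256`).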
|
||||
|
||||
### 1.5 AMDUAT_INSTANCE
|
||||
|
||||
Descriptor for an Amduat SoT instance:
|
||||
|
||||
* `g_commit` (RΩ commit),
|
||||
* `store_root` (SoT store root),
|
||||
* `store_backend_id`,
|
||||
* references to RΩ FER/1 receipts and manifests,
|
||||
* optional labels (environment, hostname, etc.).
|
||||
|
||||
### 1.6 Helper nodes
|
||||
|
||||
* `SURFACE` — surface classification nodes (e.g. `tier0`, `tier1`, `phase`, `evidence`).
|
||||
* `SOT_STATE` — SoT state nodes (`Yes`, `Plan`, `Ref`).
|
||||
|
||||
---
|
||||
|
||||
## 2. Edge Types (Doc Graph Domain)
|
||||
|
||||
`EdgeTypeId` values in this registry are reserved for doc/import/navigation
|
||||
edges. Concrete numeric assignments live in the encoding/catalogue layer.
|
||||
|
||||
Implementations and other OPREG registries MUST treat these `EdgeTypeId`s as
|
||||
belonging exclusively to the **Amduat doc graph domain**:
|
||||
|
||||
* the eventual allocation for this registry is expected to reserve a contiguous
|
||||
`EdgeTypeId` band (informally: an `AMDUAT-DOCGRAPH` band),
|
||||
* only doc/import/navigation semantics (edges in §§2.1–2.4) may occupy that
|
||||
band,
|
||||
* PEL execution, FER/1, CIL, FCT, and other TGK domains MUST use their own
|
||||
registries and bands.
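
A band reservation of this kind can be checked mechanically. The sketch below is illustrative only: the numeric ranges are placeholders (this registry assigns no concrete values), and `EdgeTypeBand` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeTypeBand:
    """A contiguous, inclusive range of uint32 EdgeTypeId values."""
    name: str
    first: int
    last: int

    def __post_init__(self) -> None:
        if not (0 <= self.first <= self.last <= 0xFFFFFFFF):
            raise ValueError("band must be a non-empty uint32 range")

    def contains(self, edge_type_id: int) -> bool:
        return self.first <= edge_type_id <= self.last

    def overlaps(self, other: "EdgeTypeBand") -> bool:
        return self.first <= other.last and other.first <= self.last


# Placeholder numbers, chosen only for the example.
DOCGRAPH_BAND = EdgeTypeBand("AMDUAT-DOCGRAPH", 0x1000, 0x10FF)
OTHER_BAND = EdgeTypeBand("EXAMPLE-OTHER-DOMAIN", 0x2000, 0x20FF)
```

A catalogue tool could run `overlaps` across all registered bands to detect a registry that strays into the doc graph range.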
|
||||
|
||||
### 2.1 Identity & version edges
|
||||
|
||||
* `EDGE_DOC_HAS_VERSION`
|
||||
`DOC_CONCEPT → DOC_VERSION` — this version belongs to this conceptual document.
|
||||
|
||||
* `EDGE_VERSION_OF`
|
||||
`DOC_VERSION → DOC_CONCEPT` — reverse link; derivable from `EDGE_DOC_HAS_VERSION`.
|
||||
|
||||
* `EDGE_DOC_HAS_IDENTITY`
|
||||
`DOC_VERSION → DOC_CONCEPT` — DOCID identity is attached to this version.
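
Because `EDGE_VERSION_OF` is derivable, an ingest pipeline may compute it instead of recording it independently. A minimal sketch, assuming edges are held as plain `(edge type name, source, target)` tuples with string node ids standing in for `Reference`s:

```python
from typing import Iterable, List, Tuple

# (edge type name, source node, target node); strings stand in for References.
Edge = Tuple[str, str, str]


def derive_version_of(edges: Iterable[Edge]) -> List[Edge]:
    """Return the EDGE_VERSION_OF edges implied by EDGE_DOC_HAS_VERSION."""
    derived = {
        ("EDGE_VERSION_OF", target, source)
        for (kind, source, target) in edges
        if kind == "EDGE_DOC_HAS_VERSION"
    }
    # Sorting gives a deterministic order regardless of input order.
    return sorted(derived)
```

Whether derived reverse edges are also materialised as EdgeArtifacts is a deployment choice that this sketch does not settle.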
|
||||
|
||||
### 2.2 Surface & SoT edges
|
||||
|
||||
* `EDGE_DOC_ON_SURFACE`
|
||||
`DOC_VERSION → SURFACE` — surface classification (governance/spec/phase/evidence).
|
||||
|
||||
* `EDGE_DOC_SOT`
|
||||
`DOC_VERSION → SOT_STATE` — SoT status (`Yes`, `Plan`, `Ref`) for this version.
|
||||
|
||||
### 2.3 Git provenance edges
|
||||
|
||||
* `EDGE_VERSION_HAS_BLOB`
|
||||
`DOC_VERSION → GIT_BLOB` — ties a document version to the blob at `g_commit`.
|
||||
|
||||
* `EDGE_VERSION_FROM_COMMIT`
|
||||
`DOC_VERSION → GIT_COMMIT` — last commit that touched this path at/before the snapshot.
|
||||
|
||||
### 2.4 SoT instance edges
|
||||
|
||||
* `EDGE_DOC_MEMBER_OF_AMDUAT`
|
||||
`DOC_CONCEPT → AMDUAT_INSTANCE` — this document is part of a particular Amduat instance.
|
||||
|
||||
---
|
||||
|
||||
## 3. Encoding & Store Integration (Summary)
|
||||
|
||||
All doc-graph edges:
|
||||
|
||||
* are represented as TGK `EdgeBody` values with `EdgeTypeId` from this registry,
|
||||
* are encoded as EdgeArtifacts via `ENC/TGK1-EDGE/1` using `TYPE_TAG_TGK1_EDGE_V1`,
|
||||
* derive `EdgeRef` identities via `HASH/ASL1` over `EdgeBytes`,
|
||||
* live in ASL/1-STORE instances alongside other Artifacts.
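
The content-addressed flow can be sketched as follows. This is a simplification under loud assumptions: SHA-256 stands in for the `HASH/ASL1` function, `ASSUMED_HASH_ID` is a placeholder rather than a registered `HashId`, and the exact hash preimage is defined by `ENC/ASL1-CORE` and `HASH/ASL1`, not by this example. `ToyStore` is hypothetical and is not the `ASL/1-STORE` API.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict

ASSUMED_HASH_ID = 0x0001  # placeholder; real assignments live in HASH/ASL1


@dataclass(frozen=True)
class Reference:
    hash_id: int
    digest: bytes


def derive_edge_ref(edge_bytes: bytes) -> Reference:
    # Assumption: SHA-256 over the given bytes stands in for HASH/ASL1.
    return Reference(ASSUMED_HASH_ID, hashlib.sha256(edge_bytes).digest())


class ToyStore:
    """In-memory stand-in for a content-addressable store."""

    def __init__(self) -> None:
        self._blobs: Dict[Reference, bytes] = {}

    def put(self, edge_bytes: bytes) -> Reference:
        ref = derive_edge_ref(edge_bytes)
        # Storing the same bytes twice is a no-op: identity is the hash.
        self._blobs.setdefault(ref, edge_bytes)
        return ref

    def get(self, ref: Reference) -> bytes:
        return self._blobs[ref]

    def __len__(self) -> int:
        return len(self._blobs)
```

The property the sketch shows is that identical edge bytes always yield the same `EdgeRef`, so re-encoding an edge never creates a second identity.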
|
||||
|
||||
Nodes (`DOC_CONCEPT`, `DOC_VERSION`, `GIT_COMMIT`, `GIT_BLOB`, `AMDUAT_INSTANCE`, etc.) are ordinary
|
||||
ASL/1 Artifacts; their `Reference`s are the TGK nodes.
|
||||
|
||||
`TGK/STORE/1` provides query semantics over the resulting graph.
|
||||
|
||||
JSON overlays or other projected views (for example, PH12 doc graph sandboxes)
|
||||
MAY be emitted for human navigation and experiments, but they are always
|
||||
derived from the underlying node Artifacts and EdgeArtifacts governed by this
|
||||
registry and `ENC/TGK1-EDGE/1`; overlays are never the source of truth for
|
||||
doc graph semantics.
|
||||
|
||||
---
|
||||
|
||||
## 4. Ingest & Encoder Interaction (Informative)
|
||||
|
||||
Implementations are expected to:
|
||||
|
||||
* materialise node Artifacts per this registry (and companion encoding profiles),
|
||||
* emit FER/1 receipts for ingest pipelines,
|
||||
* emit an idempotent edge worklist (doc-edge queue) that references `EdgeTypeId`s
|
||||
from this registry and node `Reference`s,
|
||||
* use a separate encoder to turn worklist items into EdgeArtifacts using `ENC/TGK1-EDGE/1`,
|
||||
writing them into ASL/1-STORE for consumption via `TGK/STORE/1`.
|
||||
|
||||
Details of worklist format and encoder scheduling are left to PH12/PHB01
|
||||
implementation notes; this registry only fixes the conceptual node/edge space.
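
As a purely illustrative sketch of what "idempotent" can mean for the worklist, the example below deduplicates items by value. The item shape and the `EdgeWorklist` class are hypothetical; the actual worklist format is left to the implementation notes mentioned above.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class WorkItem:
    edge_type: str                 # symbolic EdgeTypeId name from this registry
    from_refs: Tuple[str, ...]     # strings stand in for node References
    to_refs: Tuple[str, ...]


class EdgeWorklist:
    """Queue in which enqueueing the same item twice has no further effect."""

    def __init__(self) -> None:
        self._seen: Set[WorkItem] = set()
        self._pending: List[WorkItem] = []

    def enqueue(self, item: WorkItem) -> bool:
        """Return True if the item was new, False if it was already known."""
        if item in self._seen:
            return False
        self._seen.add(item)
        self._pending.append(item)
        return True

    def drain(self) -> List[WorkItem]:
        """Hand pending items to the encoder, in first-seen order."""
        items, self._pending = self._pending, []
        return items
```

Because EdgeArtifacts are content-addressed, an encoder that receives a duplicate anyway would still produce the same `EdgeRef`; deduplicating the queue only avoids redundant work.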
|
||||
671
tier1/tgk-1-core.md
Normal file
|
|
@ -0,0 +1,671 @@
|
|||
|
||||
# TGK/1-CORE — Trace Graph Kernel (Core)
|
||||
|
||||
Status: Approved
|
||||
Owner: Niklas Rydberg
|
||||
Version: 0.7.0
|
||||
SoT: Yes
|
||||
Last Updated: 2025-11-16
|
||||
Linked Phase Pack: N/A
|
||||
Tags: [traceability, execution]
|
||||
|
||||
<!-- Source: /amduat/docs/new/tgk.md | Canonical: /amduat/tier1/tgk-1-core.md -->
|
||||
|
||||
**Document ID:** `TGK/1-CORE`
|
||||
**Layer:** L1.5 — Logical graph kernel over ASL/1 (above ASL/1, orthogonal to PEL/1)
|
||||
|
||||
**Depends on (normative):**
|
||||
|
||||
* `ASL/1-CORE v0.3.x` — value substrate: `Artifact`, `Reference`, `TypeTag`, identity model
|
||||
|
||||
**Informative references:**
|
||||
|
||||
* `ENC/ASL1-CORE v1.0.x` — canonical encodings for ASL/1 values (`ArtifactBytes`, `ReferenceBytes`)
|
||||
* `HASH/ASL1 v0.2.x` — ASL1 hash family
|
||||
* `ASL/1-STORE v0.3.x` — content-addressable store semantics
|
||||
* `PEL/1` — execution substrate
|
||||
* `CIL/1`, `FCT/1`, `FER/1`, `OI/1` — higher-layer profiles built on top of TGK/1
|
||||
* (future) `ENC/TGK1-EDGE` — canonical edge-encoding profile
|
||||
* (future) `TGK/STORE/1` — graph store and query semantics
|
||||
* (future) `TGK/PROV/1` — provenance and trace semantics
|
||||
|
||||
> **Versioning note**
|
||||
> TGK/1-CORE is agnostic to minor revisions of these informative documents, provided they preserve:
|
||||
>
|
||||
> * the ASL/1-CORE definitions of `Artifact`, `Reference`, and `TypeTag`, and
|
||||
> * the existence of canonical encodings and hash families consistent with that model.
|
||||
|
||||
© 2025 Niklas Rydberg.
|
||||
|
||||
## License
|
||||
|
||||
Except where otherwise noted, this document (text and diagrams) is licensed under
|
||||
the Creative Commons Attribution 4.0 International License (CC BY 4.0).
|
||||
|
||||
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
|
||||
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
|
||||
Universal (CC0) to enable unrestricted reuse in implementations and derivative
|
||||
specifications.
|
||||
|
||||
Code examples in this document are provided under the Apache License 2.0 unless
|
||||
explicitly stated otherwise. Test vectors, where present, are dedicated to the
|
||||
public domain under CC0 1.0.
|
||||
|
||||
|
||||
---
|
||||
|
||||
## 0. Conventions
|
||||
|
||||
### 0.1 RFC 2119 terminology
|
||||
|
||||
The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**,
|
||||
**SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** are to be
|
||||
interpreted as described in RFC 2119.
|
||||
|
||||
### 0.2 Terms from ASL/1
|
||||
|
||||
This specification reuses the following terms from `ASL/1-CORE`:
|
||||
|
||||
* **Artifact** — immutable logical value:
|
||||
|
||||
```text
Artifact {
  bytes:    OctetString
  type_tag: optional TypeTag
}
```
|
||||
|
||||
* **Reference** — content address (logical identity handle) for an Artifact:
|
||||
|
||||
```text
Reference {
  hash_id: HashId
  digest:  OctetString
}
```
|
||||
|
||||
* **TypeTag** — opaque `uint32` identifying intended interpretation of an Artifact.
|
||||
|
||||
* **HashId** — `uint16` identifying a hash algorithm (e.g. from `HASH/ASL1`).
|
||||
|
||||
Where this document says **ArtifactRef**, it means an ASL/1 `Reference` that (logically) points to an `Artifact`. TGK/1-CORE does **not** assume the corresponding Artifact is present or retrievable in any particular store.
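
For readers who prefer code to schemas, the reused terms can be mirrored as immutable value types. This is a sketch only: the Python names, the `data` field (standing in for `Artifact.bytes`), and the range checks are assumptions made for illustration, and `ASL/1-CORE` remains the authority on these definitions.

```python
from dataclasses import dataclass
from typing import Optional

UINT16_MAX = 0xFFFF
UINT32_MAX = 0xFFFFFFFF


@dataclass(frozen=True)
class TypeTag:
    tag_id: int  # opaque uint32

    def __post_init__(self) -> None:
        if not 0 <= self.tag_id <= UINT32_MAX:
            raise ValueError("tag_id must fit in uint32")


@dataclass(frozen=True)
class Artifact:
    data: bytes                         # corresponds to Artifact.bytes
    type_tag: Optional[TypeTag] = None  # optional per ASL/1-CORE


@dataclass(frozen=True)
class Reference:
    hash_id: int   # uint16 identifying the hash algorithm
    digest: bytes

    def __post_init__(self) -> None:
        if not 0 <= self.hash_id <= UINT16_MAX:
            raise ValueError("hash_id must fit in uint16")


# An ArtifactRef is just a Reference; no separate type is introduced.
ArtifactRef = Reference
```

Frozen dataclasses compare by value, which matches the way this specification treats two `Reference`s with the same `hash_id` and `digest` as the same Node.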
|
||||
|
||||
### 0.3 Additional terminology
|
||||
|
||||
* **Node** — synonym for an ASL/1 `Reference` when used as a graph vertex.
|
||||
* **EdgeBody** — the logical structure of a TGK edge (see §2.2).
|
||||
* **EdgeArtifact** — an ASL/1 `Artifact` whose payload logically encodes an `EdgeBody` (see §3).
|
||||
* **EdgeRef** — the ASL/1 `Reference` to an `EdgeArtifact`.
|
||||
* **EdgeTypeId** — `uint32` identifying the semantic type of an edge (see §2.3).
|
||||
* **ProvenanceGraph** — the logical graph derived from a set of Artifacts and TGK/1 edge semantics (see §4).
|
||||
* **ExecutionEnvironment** — a concrete deployment context characterized by:
|
||||
|
||||
  * a **logical snapshot**: a finite set of Artifacts visible at that point in time; and
  * a fixed configuration of TGK-related profiles (edge encodings, type catalogs, provenance policies, etc.) “in effect” at that snapshot.
|
||||
|
||||
All invariants and uniqueness claims are evaluated with respect to such a finite snapshot.
|
||||
|
||||
> **Source-agnostic note (informative)**
|
||||
> The `Artifacts` set for a snapshot may be aggregated from any combination of ASL/1-STORE instances, archives, exports, or other sources. TGK/1-CORE is indifferent to where Artifacts come from or how they are stored; it operates purely on their logical values and `Reference`s.
|
||||
|
||||
TGK/1-CORE defines only **logical structures** and their equality / identity semantics. Physical storage, indexes, query APIs, and provenance algorithms are defined by separate profiles.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose, Scope & Non-Goals
|
||||
|
||||
### 1.1 Purpose
|
||||
|
||||
`TGK/1-CORE` defines the **minimal logical graph kernel over ASL/1 Artifacts**.
|
||||
|
||||
It provides:
|
||||
|
||||
* A definition of:
  * **Nodes** as ASL/1 `Reference` values (ArtifactRefs); and
  * **Edges** as EdgeArtifacts whose payloads decode to `EdgeBody` values.
* A way to view any snapshot of an ExecutionEnvironment (finite set of Artifacts + configured profiles) as a **ProvenanceGraph** that is a **pure projection** over:
  * immutable Artifacts (including edge Artifacts), and
  * published edge-type specifications and encoding profiles.
* A base vocabulary that higher profiles (PEL/1 integration, certification, facts, overlays, provenance) can use to declare:
  * how they encode their relationships into edge Artifacts; and
  * how provenance traces are computed as projections over the resulting graph.
|
||||
|
||||
In other words:
|
||||
|
||||
> TGK/1-CORE makes “graph over artifacts” a first-class, **purely logical** notion, with all evidence residing in ASL/1 Artifacts.
|
||||
|
||||
> **TGK/EDGE-AS-ARTIFACT/CORE/1**
|
||||
> All TGK edges **MUST** be represented as ASL/1 Artifacts (“EdgeArtifacts”), and all references to edges **MUST** be ordinary ASL `Reference`s (“EdgeRef”). TGK/1-CORE **MUST NOT** introduce any separate identity scheme for edges.
|
||||
|
||||
### 1.2 Provenance kernel invariant & determinism
|
||||
|
||||
> **TGK/PROV-KERNEL/CORE/1**
|
||||
> For any ExecutionEnvironment considered at a particular **logical snapshot** (a finite set of Artifacts and the profile set in effect at that point):
|
||||
>
|
||||
> * the corresponding `ProvenanceGraph` (as defined in §4) is a **pure function** of:
|
||||
>
|
||||
> * that Artifact set, and
|
||||
> * the profiles’ decoding / edge-derivation rules; and
|
||||
> * any persisted graph indexes or materialized views are **optimizations only** and **MUST** be consistent with this projection.
|
||||
|
||||
> **TGK/DET/CORE/1**
|
||||
> For a fixed snapshot and fixed profile set, any two TGK/1-CORE-conformant implementations **MUST** derive the same `ProvenanceGraph` (identical node and edge sets, compared by set equality). No aspect of the graph may depend on wall-clock time, process identity, storage layout, or other non-declared environment state.
|
||||
|
||||
> **TGK/NO-OFF-GRAPH-PROV/CORE/1**
|
||||
> Any relationship that is intended to participate in TGK-level provenance **MUST** be representable as:
|
||||
>
|
||||
> * an EdgeArtifact whose payload decodes to an `EdgeBody`, and
|
||||
> * Nodes (ASL `Reference`s) in its `from` / `to` / `payload` fields.
|
||||
>
|
||||
> TGK/1-CORE and its profiles **MUST NOT** rely on hidden, mutable, non-Artifactual state to represent provenance-relevant relationships.
|
||||
|
||||
TGK/1-CORE itself does **not** define a particular provenance algorithm; that is the role of `TGK/PROV/1` and higher-layer profiles.
|
||||
|
||||
### 1.3 Non-goals
|
||||
|
||||
TGK/1-CORE explicitly does **not** define:
|
||||
|
||||
* Canonical binary encodings or hashing rules for edges (delegated to edge-encoding profiles such as `ENC/TGK1-EDGE` and the ASL substrate stack).
|
||||
* Store APIs, physical graph storage, or indexing strategies (delegated to `TGK/STORE/1` and implementation design).
|
||||
* Error codes, authorization, or transport protocols.
|
||||
* A query or provenance language (delegated to `TGK/PROV/1`, overlays, or higher-level APIs).
|
||||
* Global registration or semantics of particular `EdgeTypeId` values (delegated to catalogs and profiles).
|
||||
|
||||
TGK/1-CORE is a **logical kernel** only.
|
||||
|
||||
### 1.4 Layering and dependencies
|
||||
|
||||
TGK/1-CORE sits:
|
||||
|
||||
* **Above ASL/1-CORE**:
  * Reuses `Artifact`, `Reference`, `TypeTag`, and identity semantics.
  * Treats edge data as Artifacts; edge identities are ordinary `Reference`s.
* **Orthogonal to PEL/1**:
  * MAY model PEL/1 executions as edges (via profiles).
  * Does not impose runtime behavior on PEL/1 engines.
|
||||
|
||||
#### 1.4.1 Layering invariant with PEL/1
|
||||
|
||||
**TGK/PEL-LAYERING-INV/CORE/1**
|
||||
|
||||
* TGK/1-CORE **MUST NOT** impose additional runtime behavior or API obligations on conformant PEL/1 engines beyond those defined in `PEL/1`.
|
||||
* Any TGK edges that describe PEL/1 executions **MUST** be derivable solely from stored ASL/1 Artifacts (programs, inputs, execution results, receipts) and published specifications.
|
||||
* Whether a PEL/1 implementation emits edge Artifacts directly is an implementation detail and is **not** part of PEL/1 conformance.
|
||||
|
||||
---
|
||||
|
||||
## 2. Core Graph Model
|
||||
|
||||
### 2.1 Node
|
||||
|
||||
A **Node** in TGK/1-CORE is any ASL/1 `Reference`:
|
||||
|
||||
```text
|
||||
Node := Reference // i.e., an ArtifactRef
|
||||
```
|
||||
|
||||
Properties:
|
||||
|
||||
* Nodes are identified **only** by their `Reference` value.
|
||||
* TGK/1-CORE does not distinguish “edge nodes” vs “data nodes”; that is a profile-level notion.
|
||||
* There is no separate node ID layer; there are no node identifiers beyond `Reference`.
|
||||
* The presence of a Node in the ProvenanceGraph is implied by its appearance in any TGK edge’s `from`, `to`, or `payload` fields (see §4.1).
|
||||
|
||||
> **Edges-over-edges note (informative)**
|
||||
> Because Nodes are plain `Reference`s, they can point to any Artifact, including EdgeArtifacts. TGK/1-CORE therefore allows edges-over-edges (meta-edges that describe or govern other edges). The semantics of such patterns are determined by the profiles that define the relevant `EdgeTypeId` values.
|
||||
|
||||
### 2.2 EdgeBody
|
||||
|
||||
An **EdgeBody** is the logical content of a TGK edge:
|
||||
|
||||
```text
EdgeBody {
  type:    EdgeTypeId
  from:    Node[]      // ordered, MAY be empty
  to:      Node[]      // ordered, MAY be empty
  payload: Reference   // ArtifactRef, always present
}
```
|
||||
|
||||
Semantics and invariants:
|
||||
|
||||
* `type : EdgeTypeId`
|
||||
Identifies the **kind** of relationship (e.g., execution, attestation, overlay mapping). Semantics of each `EdgeTypeId` are defined in separate specifications, not by TGK/1-CORE.
|
||||
|
||||
* `from : Node[]`
|
||||
Ordered list of source nodes. MAY be empty. Order is semantically significant and part of the logical value.
|
||||
|
||||
* `to : Node[]`
|
||||
Ordered list of target nodes. MAY be empty. Order is semantically significant.
|
||||
|
||||
* `payload : Reference`
|
||||
A syntactically valid ASL/1 `Reference`, always present. TGK/1-CORE does **not** require that `payload` be resolvable in any particular store; existence is a deployment concern.
|
||||
|
||||
* **Non-emptiness constraint**
|
||||
|
||||
> **TGK/EDGE-NONEMPTY-ENDPOINT/CORE/1**
|
||||
> For a well-formed `EdgeBody`, at least one of `from` or `to` **MUST** be non-empty. An `EdgeBody` with `from = []` **and** `to = []` is invalid and MUST NOT be produced or accepted as a TGK edge.
|
||||
|
||||
> **TGK/PROV-EVIDENCE/CORE/1 (RECOMMENDED)**
|
||||
> To support provenance, edge types that describe “how we got here” **SHOULD** ensure that:
|
||||
>
|
||||
> * `payload` references an Artifact whose content is a stable, replayable description of the relationship; and
|
||||
> * the `from` and `to` node sets can, in principle, be recomputed from that payload and other Artifacts in the environment, according to the edge type’s profile.
|
||||
>
|
||||
> In edge types that use minimal descriptors as payload, those descriptors **SHOULD** themselves be defined such that their content is a deterministic function of the other Artifacts and parameters that define the relationship, so that edge Artifacts can always be re-derived.
|
||||
|
||||
**Duplicates and self-reference**
|
||||
|
||||
TGK/1-CORE does not forbid:
|
||||
|
||||
* duplicate entries within `from`,
|
||||
* duplicate entries within `to`, or
|
||||
* `payload` also appearing in `from` or `to`.
|
||||
|
||||
The semantics (if any) of such patterns are defined by the profiles that own the relevant `EdgeTypeId`. The kernel only requires that:
|
||||
|
||||
* `from` and `to` are ordered lists of syntactically valid ASL/1 `Reference`s; and
|
||||
* they obey TGK/EDGE-NONEMPTY-ENDPOINT/CORE/1.
|
||||
|
||||
TGK/1-CORE does **not** constrain how `EdgeBody` values are encoded into `Artifact.bytes`; this is the role of encoding profiles like `ENC/TGK1-EDGE`.
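
The kernel-level rules for `EdgeBody` can be summarised in a short sketch. The Python names (`from_` in particular, since `from` is a reserved word) and the `is_well_formed` helper are illustrative assumptions; the check covers only what this section requires and says nothing about encoding.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Reference:
    hash_id: int
    digest: bytes


@dataclass(frozen=True)
class EdgeBody:
    type: int                       # EdgeTypeId (uint32)
    from_: Tuple[Reference, ...]    # ordered, may be empty
    to: Tuple[Reference, ...]       # ordered, may be empty
    payload: Reference              # always present


def is_well_formed(body: EdgeBody) -> bool:
    """Kernel-level check: uint32 type and at least one endpoint."""
    if not 0 <= body.type <= 0xFFFFFFFF:
        return False
    # TGK/EDGE-NONEMPTY-ENDPOINT/CORE/1: from and to cannot both be empty.
    return len(body.from_) > 0 or len(body.to) > 0
```

Tuples keep endpoint order as part of the value, so two bodies that list the same nodes in a different order compare as different edges, and duplicates or a `payload` that also appears as an endpoint are accepted without special handling.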
|
||||
|
||||
### 2.3 EdgeTypeId
|
||||
|
||||
`EdgeTypeId` identifies the semantic type of an edge:
|
||||
|
||||
```text
|
||||
EdgeTypeId = uint32
|
||||
```
|
||||
|
||||
Constraints:
|
||||
|
||||
* For any given ExecutionEnvironment snapshot, each `EdgeTypeId` that appears in TGK edges **MUST** have a single, well-defined and immutable semantics within that environment.
|
||||
* TGK/1-CORE does not prescribe a global registration mechanism or reserved ranges.
|
||||
* Catalogs such as `TGK/TYPES-CORE` typically bind `EdgeTypeId` values to human-readable names, owning profiles, and structural constraints (e.g. allowed cardinalities of `from` / `to`), but TGK/1-CORE does not standardize that surface.
|
||||
|
||||
**Unknown types**
|
||||
|
||||
* If an ExecutionEnvironment encounters an Artifact whose payload decodes to an `EdgeBody` whose `type` is not recognized in its configured catalogs/profile set, it **MUST** treat that Artifact as **not** forming a TGK edge for that environment:
|
||||
|
||||
* that Artifact does **not** qualify as an `EdgeArtifact` under §3.1; and
|
||||
* it therefore contributes no edges or nodes to the ProvenanceGraph.
|
||||
|
||||
> **Environment-relative semantics (informative)**
|
||||
> Recognition of `EdgeTypeId` values depends on the ExecutionEnvironment’s configured catalogs and profiles. As a result, the exact set of TGK edges derived from a fixed set of Artifacts may differ between environments. TGK/1-CORE considers this expected: the kernel guarantees determinism only *relative* to a given snapshot + profile set, not across all possible environments.
|
||||
|
||||
---
|
||||
|
||||
## 3. Edge Artifacts and Decoding
|
||||
|
||||
TGK/1-CORE uses **EdgeArtifacts** as the only concrete representation of edges.
|
||||
|
||||
### 3.1 EdgeArtifact definition
|
||||
|
||||
An **EdgeArtifact** is any ASL/1 `Artifact` that, relative to a given ExecutionEnvironment snapshot:
|
||||
|
||||
1. Has a `type_tag` whose `tag_id` is recognized (by the local profile set) as an edge tag; and
|
||||
2. Has `bytes` that, under at least one applicable edge encoding profile, decode to a single well-formed `EdgeBody` value as defined in §2.2; and
|
||||
3. Has an `EdgeBody.type` that is recognized (by the local profile set) as a supported `EdgeTypeId` for this environment (see §2.3).
|
||||
|
||||
Formally, for a given snapshot:
|
||||
|
||||
* Let `EDGE_TAG_SET` be the set of `TypeTag.tag_id` values configured as TGK edge tags.
|
||||
* For each active edge encoding profile `P` in the environment:
|
||||
|
||||
* `P` provides a **partial** decoding function:
|
||||
|
||||
```text
|
||||
decode_edge_payload_P : OctetString -> EdgeBody | error
|
||||
```
|
||||
|
||||
which is a pure function of its input bytes.
|
||||
|
||||
> **Configuration origin note (informative)**
|
||||
> `EDGE_TAG_SET` is derived from the ExecutionEnvironment’s configured TGK-related profiles and catalogs (e.g., `TGK/TYPES-CORE`, `ENC/TGK1-EDGE`), and/or from explicit deployment configuration. TGK/1-CORE does not prescribe how this configuration is stored, distributed, or governed; it only assumes that, for any snapshot, there is a well-defined set of `TypeTag.tag_id` values considered edge tags. In many deployments, one or more `TypeTag` values (e.g., a `TGK_EDGE_V1` tag) will be reserved specifically for edge Artifacts, but this is a convention, not a kernel requirement.
|
||||
|
||||
Then, an Artifact `A` is an EdgeArtifact iff:
|
||||
|
||||
* `A.type_tag` is present and `A.type_tag.tag_id ∈ EDGE_TAG_SET`; and
|
||||
|
||||
* there exists at least one active profile `P` such that:
|
||||
|
||||
```text
|
||||
decode_edge_payload_P(A.bytes) = EdgeBody E // succeeds, no error
|
||||
```
|
||||
|
||||
where `E` is a well-formed `EdgeBody` per §2.2; and
|
||||
|
||||
* `E.type` is recognized in the environment as a supported `EdgeTypeId` for TGK purposes (see §2.3).
|
||||
|
||||
Artifacts that satisfy the edge-tag and decoding constraints but whose decoded `EdgeBody.type` is not recognized as a supported `EdgeTypeId` for this environment MUST NOT be treated as EdgeArtifacts (see §2.3).
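
The three conditions can be read as a single classification function. The sketch below assumes one active profile whose decoder is passed in as a callable returning `None` for the error case; all names are hypothetical, and the type tag is reduced to its `tag_id` for brevity.

```python
from dataclasses import dataclass
from typing import AbstractSet, Callable, Optional, Tuple


@dataclass(frozen=True)
class Reference:
    hash_id: int
    digest: bytes


@dataclass(frozen=True)
class EdgeBody:
    type: int
    from_: Tuple[Reference, ...]
    to: Tuple[Reference, ...]
    payload: Reference


@dataclass(frozen=True)
class Artifact:
    data: bytes                      # corresponds to Artifact.bytes
    type_tag: Optional[int] = None   # tag_id only, for brevity


# A profile's decoder: pure function of the bytes, None means "error".
Decoder = Callable[[bytes], Optional[EdgeBody]]


def as_edge(
    artifact: Artifact,
    edge_tag_set: AbstractSet[int],
    decode: Decoder,
    known_edge_types: AbstractSet[int],
) -> Optional[EdgeBody]:
    """Return the EdgeBody if the Artifact is an EdgeArtifact, else None."""
    # 1. The type tag must be present and configured as an edge tag.
    if artifact.type_tag is None or artifact.type_tag not in edge_tag_set:
        return None
    # 2. The bytes must decode to a well-formed EdgeBody.
    body = decode(artifact.data)
    if body is None or (not body.from_ and not body.to):
        return None
    # 3. The edge type must be recognized in this environment.
    if body.type not in known_edge_types:
        return None
    return body
```

Since `edge_tag_set` and `known_edge_types` are inputs, the same Artifact can classify differently in two environments, which is the environment-relative behaviour this section describes.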
|
||||
|
||||
TGK/1-CORE does not prescribe:
|
||||
|
||||
* a particular `tag_id` for EdgeArtifacts; or
|
||||
* a particular encoding for `EdgeBody` into `Artifact.bytes`.
|
||||
|
||||
Those are the responsibility of edge encoding profiles and catalogs.
|
||||
|
||||
> **Single-edge-per-artifact invariant (informative)**
|
||||
> TGK/1-CORE assumes each EdgeArtifact encodes exactly one `EdgeBody` and thus one logical edge. Bundling multiple logical edges into a single Artifact is outside the TGK/1-CORE model and, if needed, **SHOULD** be expressed as multiple EdgeArtifacts (e.g., via an index or bundle Artifact that refers to other EdgeArtifacts).
|
||||
|
||||
> **Environment-relative edgehood (informative)**
|
||||
> An Artifact can be an EdgeArtifact in one ExecutionEnvironment (given its profile set) and not in another. TGK/1-CORE defines edgehood relative to the configured profiles, not as an intrinsic property of the Artifact alone.
|
||||
|
||||
### 3.2 Edge decoding and multi-profile behavior
|
||||
|
||||
For each active edge encoding profile `P`:
|
||||
|
||||
* The function `decode_edge_payload_P` **MUST** be:
|
||||
|
||||
* **partial** — returns either:
|
||||
|
||||
* a successfully decoded `EdgeBody`, or
|
||||
* an error signaling “not a valid edge payload for this profile”;
|
||||
* **deterministic** — no hidden state, randomness, or external configuration affects its output.
|
||||
|
||||
Additional constraints:
|
||||
|
||||
* For Artifacts whose `type_tag.tag_id ∉ EDGE_TAG_SET`, all edge encoding profiles **MUST** treat `decode_edge_payload_P` as not applicable (always error) and **MUST NOT** attempt to reinterpret arbitrary non-edge-tag Artifacts as TGK edges.
|
||||
|
||||
* For Artifacts whose `type_tag.tag_id ∈ EDGE_TAG_SET`:
|
||||
|
||||
* It is **permitted** that some active profiles do not apply (they simply return an error).
|
||||
* If more than one active profile successfully decodes `A.bytes`, then all those profiles **MUST** decode to the **same** logical `EdgeBody` value. If two active profiles decode the same Artifact to different `EdgeBody` values, the ExecutionEnvironment is misconfigured and **MUST NOT** treat that Artifact as an EdgeArtifact until the conflict is resolved.
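
One way an environment might enforce the agreement rule is sketched below. It is an assumption-laden illustration: decoders are callables returning `None` on error, string ids stand in for `Reference`s, and raising `ProfileConflict` is just one possible way to surface a misconfiguration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class EdgeBody:
    type: int
    from_: Tuple[str, ...]   # strings stand in for References
    to: Tuple[str, ...]
    payload: str


Decoder = Callable[[bytes], Optional[EdgeBody]]


class ProfileConflict(Exception):
    """Two active profiles decoded the same bytes to different EdgeBody values."""


def decode_edge(data: bytes, decoders: Sequence[Decoder]) -> Optional[EdgeBody]:
    """Apply every active profile and require all successes to agree."""
    successes = [body for body in (d(data) for d in decoders) if body is not None]
    if not successes:
        return None  # no profile applies; not an edge payload here
    first = successes[0]
    if any(other != first for other in successes[1:]):
        # Misconfigured environment: the Artifact must not be treated as an edge.
        raise ProfileConflict("active profiles disagree on the decoded EdgeBody")
    return first
```

With a single active profile per edge tag, the conflict branch can never trigger.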
|
||||
|
||||
> **TGK/EDGE-PROFILE-RECOMMEND/CORE/1 (RECOMMENDED)**
|
||||
> For operational simplicity, ExecutionEnvironments **SHOULD** configure at most one active edge-encoding profile for any given edge `TypeTag.tag_id` at a time. When multiple profiles may apply to the same EdgeArtifacts (e.g., during a migration), they **MUST** be governed so that any Artifact accepted by more than one profile decodes to the same `EdgeBody`.
|
||||
|
||||
### 3.3 EdgeRef
|
||||
|
||||
An **EdgeRef** is simply the ASL/1 `Reference` to an EdgeArtifact:
|
||||
|
||||
```text
|
||||
EdgeRef := Reference // reference to an EdgeArtifact
|
||||
```
|
||||
|
||||
Properties:
|
||||
|
||||
* No new identity scheme is introduced for edges.
|
||||
* The identity and equality of EdgeArtifacts and EdgeRefs are fully governed by ASL/1-CORE (canonical encoding + hashing via `ENC/ASL1-CORE` and `HASH/ASL1`).
|
||||
* For a fixed canonical Artifact encoding and hash profile:
|
||||
|
||||
  * equality of EdgeRefs is equivalent to equality of the underlying EdgeArtifacts; and
  * because edge encodings in the applicable encoding profile are injective, it is also equivalent (modulo cryptographic collision assumptions) to equality of their logical `EdgeBody` values.
|
||||
|
||||
> **Duplicate logical edges (informative)**
|
||||
> In most deployments, a given logical edge type and encoding will produce a unique EdgeArtifact for a given `EdgeBody`, because canonical encoding + ASL hashing make that Artifact and its `Reference` unique. Distinct `EdgeRef` values that encode semantically equivalent relationships can still arise if different `TypeTag` / encoding / profile combinations are used to express the same relationship. TGK/1-CORE does not attempt to normalize such cases; higher-layer profiles MAY choose to detect or coalesce them.
|
||||
|
||||
> **Store interaction note (informative)**
|
||||
> Any ASL/1-STORE that holds EdgeArtifacts can be used to resolve `EdgeRef` via normal `get(Reference)` semantics. TGK/1-CORE does not define a separate persistence layer for edges; they are ordinary Artifacts as far as ASL/1-STORE is concerned.
|
||||
|
||||
### 3.4 Relationship between EdgeArtifact and EdgeBody
|
||||
|
||||
For each EdgeArtifact:
|
||||
|
||||
```text
|
||||
A_edge : Artifact
|
||||
Ref_edge : Reference // derived per ASL/1-CORE
|
||||
Body : EdgeBody // Body = EdgeBody(A_edge) via the unique decoding result
|
||||
```
|
||||
|
||||
The mapping `EdgeBody(A_edge)` is determined by the environment’s active profiles and MUST obey the determinism and well-formedness constraints above.
|
||||
|
||||
Encoding profiles such as `ENC/TGK1-EDGE` define:
|
||||
|
||||
* the concrete layout of `EdgeBody` into `Artifact.bytes`; and
|
||||
* how `TypeTag` values map to particular edge schemas.
|
||||
|
||||
---
|
||||
|
||||
## 4. ProvenanceGraph as Projection
|
||||
|
||||
### 4.1 Graph derived from Artifacts
|
||||
|
||||
Given:
|
||||
|
||||
* a finite snapshot set of Artifacts `Artifacts`; and
|
||||
* a fixed set of active edge-encoding profiles and type catalogs (the **profile set**) in an ExecutionEnvironment,
|
||||
|
||||
the **ProvenanceGraph** induced by `Artifacts` and the profile set is the pair:
|
||||
|
||||
```text
ProvenanceGraph {
  Nodes: set<Node>
  Edges: set<(EdgeRef, EdgeBody)>
}
```
|
||||
|
||||
defined as follows:
|
||||
|
||||
1. **Edges**
|
||||
|
||||
* Let `EdgeArtifacts ⊆ Artifacts` be the subset of Artifacts that qualify as EdgeArtifacts under §3.1 and §3.2.
|
||||
* For each `A_edge ∈ EdgeArtifacts`:
|
||||
|
||||
* Let `Ref_edge` be its ASL `Reference`.
|
||||
* Let `Body = EdgeBody(A_edge)` be the decoded `EdgeBody`.
|
||||
|
||||
Then:
|
||||
|
||||
```text
|
||||
Edges = { (Ref_edge, Body) | A_edge ∈ EdgeArtifacts }
|
||||
```
|
||||
|
||||
2. **Nodes**
|
||||
|
||||
Nodes are all ArtifactRefs that appear anywhere in edges:
|
||||
|
||||
```text
Nodes = {
  n : Reference |
    ∃ (Ref_edge, Body) ∈ Edges such that
      n ∈ Body.from ∪ Body.to ∪ { Body.payload }
}
```
|
||||
|
||||
Clarifications:
|
||||
|
||||
* The `Nodes` set includes only `Reference`s that participate in at least one edge as source, target, or payload.
|
||||
* Artifacts (and their References) that do not appear in the `from`, `to`, or `payload` field of any edge are **not** included in the ProvenanceGraph by TGK/1-CORE. Profiles MAY define derived views that treat all Artifacts as degree-zero nodes, but this is outside the TGK/1-CORE kernel.
|
||||
* TGK/1-CORE does **not** require that every `Node` in `Nodes` correspond to an Artifact present in the `Artifacts` set. The ProvenanceGraph is a graph over the **Reference space**. Whether a given `Reference` is resolvable to an Artifact in a particular store or federation is outside this kernel and is governed by `ASL/1-STORE` and deployment policy.
|
||||
|
||||
> **TGK/GRAPH-PROJECTION/CORE/1**
|
||||
> For a fixed snapshot set of Artifacts and a fixed profile set, the ProvenanceGraph, as defined above, is unique. Implementations MAY cache or index edges and nodes, but **MUST NOT** introduce logical edges that cannot be derived from EdgeArtifacts and the profiles in effect at that snapshot.
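
As a non-normative illustration, the projection in §4.1 can be sketched in Python. Every name here is an assumption made for the example, not part of TGK/1-CORE: `Reference` is modelled as a plain string, `Artifact.ref` is assumed to hold a precomputed Reference, and `toy_decode` is a made-up edge profile.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

Reference = str  # hypothetical stand-in for an ASL/1 Reference


@dataclass(frozen=True)
class EdgeBody:
    type: int
    from_: tuple[Reference, ...]  # `from` is a Python keyword
    to: tuple[Reference, ...]
    payload: Reference


@dataclass(frozen=True)
class Artifact:
    ref: Reference  # assumed: the Artifact's precomputed Reference
    tag_id: int
    data: bytes


def project_graph(
    artifacts: Iterable[Artifact],
    decode_edge: Callable[[Artifact], Optional[EdgeBody]],
) -> tuple[set[Reference], set[tuple[Reference, EdgeBody]]]:
    """Return (Nodes, Edges) for one snapshot and one fixed profile set."""
    edges = set()
    for artifact in artifacts:
        body = decode_edge(artifact)  # None means "not an EdgeArtifact"
        if body is not None:
            edges.add((artifact.ref, body))
    nodes = set()
    for _ref, body in edges:
        # Nodes are exactly the References named by some edge.
        nodes.update(body.from_, body.to, (body.payload,))
    return nodes, edges


def toy_decode(artifact: Artifact) -> Optional[EdgeBody]:
    """Toy profile: tag 1 carries b'type|from,..|to,..|payload'."""
    if artifact.tag_id != 1:
        return None
    etype, src, dst, payload = artifact.data.decode().split("|")
    return EdgeBody(int(etype), tuple(src.split(",")), tuple(dst.split(",")), payload)


snapshot = [
    Artifact("ref:e1", 1, b"7|ref:a,ref:b|ref:c|ref:p"),
    Artifact("ref:x", 2, b"not an edge"),
]
nodes, edges = project_graph(snapshot, toy_decode)
```

In this toy snapshot none of `ref:a`, `ref:b`, `ref:c`, or `ref:p` is backed by an Artifact, yet all four become nodes, while the edge-free Artifact `ref:x` does not. That matches the clarifications above: the graph lives in the Reference space.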
|
||||
|
||||
### 4.2 Informative: provenance traces
|
||||
|
||||
TGK/1-CORE does **not** define provenance or trace operations normatively. However, it is intended to be the substrate for:
|
||||
|
||||
* `TGK/PROV/1`, which defines:
|
||||
|
||||
* provenance policies (e.g., “which edge types participate”), and
|
||||
* trace operators (e.g., backwards reachability) over `ProvenanceGraph`.
|
||||
|
||||
As an informative sketch, a backwards provenance operator would:
|
||||
|
||||
* start from a set of target `Node`s (ArtifactRefs); and
|
||||
* walk backwards along edges whose `EdgeTypeId` is selected by some policy,
|
||||
* until reaching nodes that are considered roots by that policy.
|
||||
|
||||
Any such operator **MUST**, when specified in `TGK/PROV/1`, be defined purely as a projection over `ProvenanceGraph`, consistent with `TGK/PROV-KERNEL/CORE/1`, `TGK/DET/CORE/1`, and `TGK/NO-OFF-GRAPH-PROV/CORE/1`.
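
The informative backwards walk described above might look like the following sketch. It is not the `TGK/PROV/1` operator: the tuple layout of an edge, the `selected_types` set, and the `is_root` predicate are all assumptions chosen to keep the example small.

```python
from collections import deque


def trace_backwards(edges, targets, selected_types, is_root):
    """Collect every node reachable backwards from `targets`.

    edges: iterable of (edge_ref, edge_type, from_refs, to_refs) tuples,
           a hypothetical flattened view of the ProvenanceGraph.
    selected_types: edge types the (assumed) policy lets the walk follow.
    is_root: policy predicate; the walk does not continue past a root.
    """
    # Index: node -> nodes that appear in `from` of a selected edge targeting it.
    sources_of = {}
    for _edge_ref, edge_type, from_refs, to_refs in edges:
        if edge_type in selected_types:
            for node in to_refs:
                sources_of.setdefault(node, []).extend(from_refs)

    seen = set(targets)
    queue = deque(targets)
    while queue:
        node = queue.popleft()
        if is_root(node):
            continue  # roots are kept in the result but not expanded
        for source in sources_of.get(node, ()):
            if source not in seen:
                seen.add(source)
                queue.append(source)
    return seen


example_edges = [
    ("e1", 1, ["prog", "in"], ["out"]),
    ("e2", 2, ["cert"], ["out"]),
    ("e3", 1, ["raw"], ["in"]),
]
full_trace = trace_backwards(example_edges, ["out"], {1}, lambda n: False)
```

Because the function reads only the edge set and the policy inputs, it stays a pure projection over the graph; nothing off-graph can influence the result.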
|
||||
|
||||
---
|
||||
|
||||
## 5. Interaction with Other Layers (Informative)
|
||||
|
||||
### 5.1 Interaction with PEL/1
|
||||
|
||||
A PEL/1 execution typically involves:
|
||||
|
||||
* a `Program` ArtifactRef,
|
||||
* zero or more input ArtifactRefs, and
|
||||
* an `ExecutionResult` ArtifactRef that references output ArtifactRefs.
|
||||
|
||||
A profile such as `TGK/PEL/1` can define:
|
||||
|
||||
* a specific `EdgeTypeId` (e.g., `EDGE_EXECUTION`); and
|
||||
* an edge encoding that maps PEL/1 execution payloads to an `EdgeBody`:
|
||||
|
||||
```text
|
||||
EdgeBody.type = EDGE_EXECUTION
|
||||
EdgeBody.from = [program_ref] ∪ input_refs[]
|
||||
EdgeBody.to = output_refs[] ∪ [execution_result_ref]
|
||||
EdgeBody.payload = execution_result_ref
|
||||
```
|
||||
|
||||
Then, for each execution, an EdgeArtifact is produced (by the runtime or an ingestion tool) with:
|
||||
|
||||
* a TGK edge `TypeTag`, and
|
||||
* a payload encoding that an edge profile (e.g., `ENC/TGK1-EDGE`) decodes to such an `EdgeBody`.
|
||||
|
||||
The resulting ProvenanceGraph expresses execution relationships as edges over ArtifactRefs.
|
||||
|
||||
TGK/1-CORE does not require PEL/1 engines to emit such edge Artifacts; they MAY be derived post hoc from stored Artifacts.
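
A hedged sketch of the mapping above, as an ingestion tool might apply it post hoc. The numeric value of `EDGE_EXECUTION`, the `EdgeBody` tuple type, and the function name are assumptions; the `∪` in the mapping is read here as ordered concatenation, since `from` and `to` are ordered lists.

```python
from typing import NamedTuple, Sequence

EDGE_EXECUTION = 0x10  # hypothetical value; real IDs come from a type catalog


class EdgeBody(NamedTuple):
    type: int
    from_: tuple  # `from` is a Python keyword
    to: tuple
    payload: str


def execution_edge(
    program_ref: str,
    input_refs: Sequence[str],
    output_refs: Sequence[str],
    execution_result_ref: str,
) -> EdgeBody:
    """Build the logical EdgeBody for one PEL/1 execution."""
    return EdgeBody(
        type=EDGE_EXECUTION,
        from_=(program_ref, *input_refs),         # program first, then inputs
        to=(*output_refs, execution_result_ref),  # outputs, then the result
        payload=execution_result_ref,
    )


body = execution_edge("ref:prog", ["ref:in1", "ref:in2"], ["ref:out"], "ref:result")
```

Encoding this `EdgeBody` into an EdgeArtifact payload is the job of an edge encoding profile and is deliberately left out of the sketch.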
|
||||
|
||||
### 5.2 Interaction with CIL/1
|
||||
|
||||
CIL/1 defines certificate Artifacts. A profile (e.g., `TGK/CIL/1`) can specify:
|
||||
|
||||
* `EdgeTypeId = EDGE_ATTESTS`.
|
||||
|
||||
For each certificate Artifact `cert_ref` whose subject is `subject_ref`:
|
||||
|
||||
```text
|
||||
EdgeBody.type = EDGE_ATTESTS
|
||||
EdgeBody.from = [cert_ref]
|
||||
EdgeBody.to = [subject_ref]
|
||||
EdgeBody.payload = cert_ref
|
||||
```
|
||||
|
||||
EdgeArtifacts that encode these `EdgeBody` values make certificate relationships explicit in the ProvenanceGraph.
|
||||
|
||||
TGK/1-CORE itself does not verify signatures or policies; CIL/1 and governance profiles do.
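
To illustrate what "explicit in the ProvenanceGraph" buys, the sketch below builds attestation edges and then answers "which certificates attest this subject?" from edges alone. The value of `EDGE_ATTESTS`, the `EdgeBody` tuple type, and both function names are assumptions for the example. No signature checking happens here, in line with the layering described above.

```python
from typing import Iterable, NamedTuple

EDGE_ATTESTS = 0x20  # hypothetical value; real IDs come from a type catalog


class EdgeBody(NamedTuple):
    type: int
    from_: tuple  # `from` is a Python keyword
    to: tuple
    payload: str


def attestation_edges(certificates: Iterable[tuple]) -> list:
    """certificates: (cert_ref, subject_ref) pairs -> one EdgeBody each."""
    return [
        EdgeBody(EDGE_ATTESTS, (cert_ref,), (subject_ref,), cert_ref)
        for cert_ref, subject_ref in certificates
    ]


def attesting_certs(bodies: Iterable[EdgeBody], subject_ref: str) -> list:
    """Certificates that attest `subject_ref`, in edge order."""
    return [
        body.from_[0]
        for body in bodies
        if body.type == EDGE_ATTESTS and subject_ref in body.to
    ]


bodies = attestation_edges([("ref:cert1", "ref:doc"), ("ref:cert2", "ref:doc"),
                            ("ref:cert3", "ref:other")])
certs_for_doc = attesting_certs(bodies, "ref:doc")
```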
|
||||
|
||||
### 5.3 Interaction with FCT/1, FER/1, OI/1
|
||||
|
||||
Profiles can similarly define:
|
||||
|
||||
* evidence-to-fact edges (e.g., `EDGE_FACT_SUPPORTS`),
|
||||
* overlay mapping edges (e.g., `EDGE_OVERLAY_MAPS`), and
|
||||
* other domain relationships.
|
||||
|
||||
The common pattern is:
|
||||
|
||||
* define an `EdgeTypeId`;
|
||||
* define how to encode a logical `EdgeBody` into an EdgeArtifact payload;
|
||||
* derive the graph as in §4.1.
|
||||
|
||||
TGK/1-CORE itself is agnostic to those semantics.
|
||||
|
||||
---
|
||||
|
||||
## 6. Conformance
|
||||
|
||||
An implementation is **TGK/1-CORE–conformant** if and only if it satisfies all of the following:
|
||||
|
||||
1. **Node model**
|
||||
|
||||
* Treats any ASL/1 `Reference` as a potential Node (`Node := Reference`).
|
||||
* Does not introduce a separate node identity layer for TGK purposes.
|
||||
|
||||
2. **Edge artifacts and decoding**
|
||||
|
||||
* Defines (via configuration or companion specs) which `TypeTag.tag_id` values represent TGK edge Artifacts (`EDGE_TAG_SET`).
|
||||
|
||||
* For each active edge encoding profile `P`, provides a partial, deterministic decoder:
|
||||
|
||||
```text
|
||||
decode_edge_payload_P : OctetString -> EdgeBody | error
|
||||
```
|
||||
|
||||
that:
|
||||
|
||||
* succeeds (returns `EdgeBody`) exactly for payloads considered valid edges under profile `P`; and
|
||||
* returns an error otherwise.
|
||||
|
||||
* For any Artifact `A` with `A.type_tag.tag_id ∉ EDGE_TAG_SET`, all edge profiles **MUST** treat `decode_edge_payload_P` as not applicable (always error) and **MUST NOT** attempt to interpret `A.bytes` as a TGK edge payload.
|
||||
|
||||
* For any Artifact `A` with `A.type_tag.tag_id ∈ EDGE_TAG_SET`:
|
||||
|
||||
* `A` is an EdgeArtifact only if at least one active profile successfully decodes `A.bytes` to a well-formed `EdgeBody` whose `type` is recognized as a supported `EdgeTypeId` in the environment.
|
||||
* If more than one active profile decodes `A.bytes` successfully, they **MUST** all decode it to the same logical `EdgeBody`. If they do not, the environment **MUST NOT** treat `A` as an EdgeArtifact until the inconsistency is resolved.
|
||||
|
||||
3. **EdgeBody invariants**
|
||||
|
||||
* Treats as well-formed only those `EdgeBody` values that satisfy §2.2:
|
||||
|
||||
* `from` and `to` are ordered lists of syntactically valid ASL/1 `Reference`s;
|
||||
* they satisfy TGK/EDGE-NONEMPTY-ENDPOINT/CORE/1; and
|
||||
* `payload` is always a syntactically valid ASL/1 `Reference` and always present.
|
||||
* Edge encoding profiles **MUST** reject payloads that would decode to an `EdgeBody` violating these invariants.
|
||||
|
||||
4. **Graph projection**
|
||||
|
||||
* Given:
|
||||
|
||||
* a finite snapshot set of Artifacts; and
|
||||
* the configured edge tags + decoding rules (profile set),
|
||||
* it can construct the ProvenanceGraph as in §4.1:
|
||||
|
||||
* Edge set derived from EdgeArtifacts;
|
||||
* Node set derived from `from`, `to`, and `payload` fields of `EdgeBody` values.
|
||||
* Any graph indexes or caches it exposes **MUST** be consistent with this projection (`TGK/GRAPH-PROJECTION/CORE/1`, `TGK/DET/CORE/1`).
|
||||
|
||||
5. **Immutability**
|
||||
|
||||
* Treats EdgeArtifacts as immutable, as required by ASL/1-CORE.
|
||||
* Does not attempt to “edit” an edge in place; logical changes **MUST** be represented by new Artifacts (edge Artifacts and/or other Artifacts) rather than mutating existing ones.
|
||||
|
||||
6. **Layering invariant with PEL/1**
|
||||
|
||||
* Respects `TGK/PEL-LAYERING-INV/CORE/1`:
|
||||
|
||||
* Does not impose additional requirements on PEL/1 engines beyond those in `PEL/1`.
|
||||
* Allows PEL/1-related edge profiles to be implemented either by the runtime or by ingestion tools, without affecting PEL/1 conformance.
|
||||
|
||||
7. **Profile compatibility**
|
||||
|
||||
* If it claims to implement specific TGK-related profiles (e.g., `TGK/PEL/1`, `TGK/CIL/1`, `TGK/PROV/1`), it **MUST**:
|
||||
|
||||
* interpret `EdgeTypeId` and edge payloads according to those profiles; and
|
||||
* ensure that all edges defined by those profiles can be represented as EdgeArtifacts consistent with TGK/1-CORE.
|
||||
|
||||
Everything else — canonical encodings for `EdgeBody`, edge hashing, graph store APIs, provenance algorithms, error models — belongs to:
|
||||
|
||||
* edge encoding profiles (`ENC/TGK1-EDGE`),
|
||||
* storage/query profiles (`TGK/STORE/1`), and
|
||||
* provenance profiles (`TGK/PROV/1`) and higher semantic layers (`FCT/1`, `FER/1`, `OI/1`, etc.).
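
Conformance items 2 and 3 can be sketched together as one classification routine. This is an illustration under stated assumptions, not a reference implementation: decoders are assumed to signal failure by raising `ValueError`, References are plain strings checked by a caller-supplied `is_valid_ref`, and `TGK/EDGE-NONEMPTY-ENDPOINT/CORE/1` is assumed to mean that `from` and `to` are not both empty (its text is not reproduced in this section).

```python
from typing import Callable, Collection, NamedTuple, Optional, Sequence


class EdgeBody(NamedTuple):
    type: int
    from_: tuple  # `from` is a Python keyword
    to: tuple
    payload: Optional[str]


def is_well_formed(body: EdgeBody, is_valid_ref: Callable[[str], bool]) -> bool:
    endpoints = (*body.from_, *body.to)
    if not endpoints:  # ASSUMED reading of TGK/EDGE-NONEMPTY-ENDPOINT/CORE/1
        return False
    if not all(is_valid_ref(ref) for ref in endpoints):
        return False
    return body.payload is not None and is_valid_ref(body.payload)


def classify(
    tag_id: int,
    payload: bytes,
    *,
    edge_tag_set: Collection[int],
    decoders: Sequence[Callable[[bytes], EdgeBody]],
    supported_types: Collection[int],
    is_valid_ref: Callable[[str], bool],
) -> Optional[EdgeBody]:
    """Return the EdgeBody if the Artifact is an EdgeArtifact, else None."""
    if tag_id not in edge_tag_set:
        return None  # never interpret these bytes as an edge payload
    decoded = []
    for decode in decoders:
        try:
            body = decode(payload)
        except ValueError:  # assumed error signal for "not valid under P"
            continue
        if is_well_formed(body, is_valid_ref):
            decoded.append(body)
    if not decoded:
        return None
    if any(body != decoded[0] for body in decoded[1:]):
        return None  # profiles disagree: not an EdgeArtifact until resolved
    if decoded[0].type not in supported_types:
        return None
    return decoded[0]


def toy_decoder(payload: bytes) -> EdgeBody:
    """Toy profile: b'type|from,..|to,..|payload'."""
    parts = payload.decode().split("|")
    if len(parts) != 4:
        raise ValueError("expected four fields")
    refs = lambda field: tuple(r for r in field.split(",") if r)
    return EdgeBody(int(parts[0]), refs(parts[1]), refs(parts[2]), parts[3])


def toy_valid(ref: str) -> bool:
    return ref.startswith("ref:")
```

Returning `None` on disagreement rather than picking a winner mirrors the rule that the environment must not treat the Artifact as an edge until the inconsistency is resolved.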
|
||||
|
||||
---
|
||||
|
||||
## 7. Evolution (Informative)
|
||||
|
||||
TGK/1-CORE is intended to evolve **additively**:
|
||||
|
||||
* New edge types are introduced by assigning new `EdgeTypeId` values in catalogs and profiles.
|
||||
* New edge tags are introduced by assigning new `TypeTag.tag_id` values to EdgeArtifacts.
|
||||
* New encodings are introduced by adding new edge encoding profiles and decoders.
|
||||
|
||||
Existing EdgeArtifacts and their decoded `EdgeBody` values:
|
||||
|
||||
* **MUST NOT** be retroactively reinterpreted to have different logical meaning under TGK/1-CORE; and
|
||||
* **MUST** remain valid inputs to any future profile sets that claim to support their `TypeTag` and `EdgeTypeId`, subject to the multi-profile behavior rules in §3.2.
|
||||
|
||||
Introducing a new edge-encoding profile that begins to treat previously non-edge Artifacts (e.g., with a new `TypeTag` or a previously unused `EdgeTypeId`) as EdgeArtifacts is allowed and considered an additive extension.
|
||||
|
||||
It is **not** permitted to change an existing profile or catalog in a way that causes an Artifact that previously decoded to a given `EdgeBody` (under a given `(TypeTag, EdgeTypeId)` and profile set) to be decoded to a different `EdgeBody` in the same environment. Such changes **SHOULD** instead be modeled via new `TypeTag` values and/or new `EdgeTypeId` assignments.
|
||||
|
||||
This aligns TGK/1-CORE with the broader Amduat design principle of **“never rewrite history; evolve by addition and projection.”**
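
One way to guard this rule in practice is a regression check that compares decoding before and after a profile-set change. The sketch below is an assumption-laden illustration: `old_decode` and `new_decode` are hypothetical callables that return a decoded body or `None`, and artifacts are modelled as `(tag_id, bytes)` pairs.

```python
def find_reinterpretations(artifacts, old_decode, new_decode):
    """List artifacts whose previously decoded EdgeBody would change.

    Artifacts that were not edges before are skipped: recognising them
    now is an additive extension. An artifact that used to decode and
    now decodes differently, or no longer decodes, is a violation.
    """
    violations = []
    for artifact in artifacts:
        before = old_decode(artifact)
        if before is None:
            continue  # newly recognised edges are allowed
        after = new_decode(artifact)
        if after != before:
            violations.append((artifact, before, after))
    return violations


# Toy profile sets over (tag_id, bytes) artifacts.
def old_profiles(artifact):
    tag_id, data = artifact
    return ("EDGE", data) if tag_id == 1 else None


def additive_profiles(artifact):
    tag_id, data = artifact
    return ("EDGE", data) if tag_id in (1, 2) else None


def breaking_profiles(artifact):
    tag_id, data = artifact
    return ("EDGE", data[::-1]) if tag_id == 1 else None


corpus = [(1, b"ab"), (2, b"cd"), (3, b"ef")]
additive_result = find_reinterpretations(corpus, old_profiles, additive_profiles)
breaking_result = find_reinterpretations(corpus, old_profiles, breaking_profiles)
```

Here the additive profile set yields no violations, while the breaking one flags the tag-1 artifact whose body changed.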
|
||||
|
||||
---
|
||||
|
||||
## Document History
|
||||
|
||||
* **0.7.0 (2025-11-16):** Registered as Tier-1 spec and aligned to the Amduat 2.0 substrate baseline.
|
||||
1839
tier1/tgk-prov-1.md
Normal file
File diff suppressed because it is too large
1156
tier1/tgk-store-1.md
Normal file
File diff suppressed because it is too large