amduat/tier1/tgk-prov-1.md

# TGK/PROV/1 — Provenance & Trace Semantics over TGK/1-CORE

Status: Approved
Owner: Niklas Rydberg
Version: 0.1.20
SoT: Yes
Last Updated: 2025-11-16
Linked Phase Pack: N/A
Tags: [traceability, evidence]

<!-- Source: /amduat/docs/new/tgk-prov-1.md | Canonical: /amduat/tier1/tgk-prov-1.md -->

Here’s **iteration 22** of `TGK/PROV/1`, bumped to **0.1.20**. I’m happy to treat this as **ready for approval** (no further iterations needed from my side). After that you’ll find an updated **SUBSTRATE/STACK-OVERVIEW** and a suggested **commit message**.

© 2025 Niklas Rydberg.

## License

Except where otherwise noted, this document (text and diagrams) is licensed under
the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
Universal (CC0) to enable unrestricted reuse in implementations and derivative
specifications.

Code examples in this document are provided under the Apache License 2.0 unless
explicitly stated otherwise. Test vectors, where present, are dedicated to the
public domain under CC0 1.0.


---

**Document ID:** `TGK/PROV/1`
**Layer:** L1.8 — Provenance & trace semantics over `TGK/1-CORE`

**Depends on (normative):**

* `TGK/1-CORE v0.7.x` — trace graph kernel: `Node`, `EdgeBody`, `EdgeTypeId`, `ProvenanceGraph`
* (via `TGK/1-CORE`) `ASL/1-CORE v0.4.x` — value substrate (`Artifact`, `Reference`, `TypeTag`)

**Informative references:**

* `SUBSTRATE/STACK-OVERVIEW v0.3.x` — layering and dependency discipline
* `TGK/STORE/1 v0.2.x` — graph store & query semantics (adjacency queries over `ProvenanceGraph`)
* `ENC/TGK1-EDGE/1 v0.1.x` — canonical encoding for TGK `EdgeBody` / EdgeArtifacts
* `PEL/TRACE-DAG/1 v0.1.x` — DAG execution trace profile (example provenance source)
* `CIL/1`, `FER/1`, `FCT/1`, `OI/1` — profiles that define provenance-relevant edge types

> **Normativity note**
> `TGK/PROV/1` defines **pure functions over `ProvenanceGraph`** only. It MUST NOT assume any particular store, graph index, or query API. Implementations MAY realize these functions using `TGK/STORE/1` or other substrates, but conformance is judged against the abstract semantics in this document.

---

## 0. Conventions

### 0.1 RFC 2119 terminology

The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**,
**SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** are to be
interpreted as described in RFC 2119.

### 0.2 Imported types and structures

From `ASL/1-CORE` and `TGK/1-CORE`:

```text
Reference {
  hash_id: HashId
  digest:  OctetString
}

Node := Reference

EdgeTypeId = uint32

EdgeBody {
  type:    EdgeTypeId
  from:    Node[]
  to:      Node[]
  payload: Reference
}

ProvenanceGraph {
  Nodes: set<Node>
  Edges: set<(EdgeRef, EdgeBody)>
}

EdgeRef := Reference
```

From `TGK/1-CORE` (informative, restated):

* `ProvenanceGraph` is a pure projection over:

  * a finite set of Artifacts, and
  * a configured set of TGK edge profiles and catalogs.

* All edges are represented as EdgeArtifacts; `EdgeRef` is an ordinary ASL `Reference`.

Unless otherwise stated, **sets are mathematical sets** (no duplicates, no ordering). Lists are ordered and MAY contain duplicates.

### 0.3 Additional terminology (this document)

`TGK/PROV/1` introduces the following logical types:

```text
ProvDirection (u8) {
  BACKWARD = 1  // walk from 'to' nodes to 'from' nodes
  FORWARD  = 2  // walk from 'from' nodes to 'to' nodes
  BOTH     = 3  // union of BACKWARD and FORWARD
}

EdgeTypeFilter {
  // If empty, match all EdgeTypeId values present in the graph.
  types: list<EdgeTypeId>
}

DepthLimit = optional uint32   // hop-count limit; absent = unbounded

TraceGraph {
  seeds: set<Node>
  nodes: set<Node>
  edges: set<(EdgeRef, EdgeBody)>
}
```

Additional derived types used in this version:

```text
DepthMap = map<Node, uint32>

ProvLayer {
  depth: uint32
  nodes: set<Node>
}

ProvLayering {
  layers: list<ProvLayer>  // sorted by increasing depth, no empty layers
}
```

We also use a shorthand for finite seed sets:

```text
SeedSet = set<Node>   // always finite under TGK/PROV/1
```

A `TraceGraph` is a **subgraph view** over some `ProvenanceGraph`, annotated with the seed set from which it was computed. `DepthMap` and `ProvLayering` are depth-oriented views over the same closure. `TGK/PROV/1` does not assign additional semantics to `TraceGraph.seeds`, depths, or layers by itself; higher-layer profiles (e.g. FER/1) may do so when encoding traces into Artifacts or presenting them to users.

Seeds are always treated as **finite sets of nodes**:

* Semantically, a seed input is a `SeedSet`.
* Even if a concrete API accepts seeds as a list or array, duplicates are ignored and order is irrelevant.

In all operators in this document, the seed set `S` is assumed to be a **finite `SeedSet`**. Concrete APIs MAY accept seeds in any finite representation, but semantically they are finite sets.

### 0.4 Snapshots & finiteness assumptions

`TGK/PROV/1` always works **relative to a single snapshot** of an ExecutionEnvironment in the sense of `TGK/1-CORE`:

* Each snapshot has a **finite** Artifact set and a fixed TGK profile set in effect.
* The corresponding `ProvenanceGraph G` for that snapshot has:

  ```text
  G.Nodes  -- finite
  G.Edges  -- finite
  ```

All semantics in this specification are evaluated with respect to such a finite `ProvenanceGraph G` and a finite `SeedSet S`:

* All closures, traces, depth maps, and layerings are therefore finite by construction.
* Any mention of “for a fixed `ProvenanceGraph G`” or “for a snapshot” implicitly assumes this finiteness, as guaranteed by `TGK/1-CORE`.

`TGK/PROV/1` does not describe how snapshots evolve over time; it only speaks about the results of provenance operators **within** a given snapshot.

---

## 1. Purpose, Scope & Non-Goals

### 1.1 Purpose

`TGK/PROV/1` defines a **kernel provenance model** over `ProvenanceGraph`:

* A small set of **provenance query parameters**:

  * direction (`ProvDirection`),
  * edge-type selection (`EdgeTypeFilter`),
  * depth bound (`DepthLimit`).

* Canonical **closure operators** over `ProvenanceGraph`:

  * `prov_closure_nodes` — node set reachable from seeds under a given policy;
  * `prov_trace` — a `TraceGraph` subgraph induced by that closure.

* Canonical **depth-oriented views**:

  * `prov_depths` — a `DepthMap` assigning each reachable node a minimum hop-count depth;
  * `prov_layers` — a `ProvLayering` partitioning reachable nodes by depth.

These operators are specified as **pure functions** over a fixed `ProvenanceGraph` snapshot (§0.4) and a fixed parameter set, in line with TGK invariants:

* `TGK/PROV-KERNEL/CORE/1` — provenance as a projection over edges;
* `TGK/DET/CORE/1` — determinism for a fixed snapshot + profile set;
* `TGK/NO-OFF-GRAPH-PROV/CORE/1` — no hidden, non-Artifactual provenance.

For any `ProvenanceGraph G`, finite seed set `S : SeedSet`, and provenance parameters `Q`:

```text
prov_closure_nodes(G, S, Q) -> set<Node>
prov_depths       (G, S, Q) -> DepthMap
prov_layers       (G, S, Q) -> ProvLayering
prov_trace        (G, S, Q) -> TraceGraph
```

are pure, deterministic functions of `(G, S, Q)`.

### 1.2 Scope

`TGK/PROV/1` covers:

* The **provenance parameter model** (`ProvDirection`, `EdgeTypeFilter`, `DepthLimit`).

* The **semantics** of:

  * directional neighbors and edge incidence,
  * backward / forward / bidirectional cones,
  * node-depth assignment and per-layer grouping,
  * the induced `TraceGraph` subgraph.

* An explicit requirement that all provenance semantics:

  * operate solely over `ProvenanceGraph`, and
  * never introduce edges or nodes not already present in that graph.

This document is intentionally conservative. It specifies “kernel” provenance operators suitable as a foundation for:

* execution provenance (e.g. PEL DAG traces),
* certificate and evidence provenance (CIL/1, FER/1),
* fact and overlay provenance (FCT/1, OI/1),

without baking those domains into the kernel.

### 1.3 Non-goals

`TGK/PROV/1` explicitly does **not** define:

* Store or graph store surfaces (`ASL/1-STORE`, `TGK/STORE/1`); it is purely semantic.
* Provenance **policies** (e.g. “which edge types count as causal provenance vs annotations”).
* A full **query language** (filters on payload content, fact schemas, certificate semantics).
* Minimality or optimality criteria for traces (e.g. “smallest subgraph that explains X”).
* How provenance results are encoded into Artifacts (e.g. receipts or reports) — those are `FER/1` and related specifications.

Implementations and profiles MAY define richer operators (slicing, k-bounded paths, temporal constraints, etc.) on top of this kernel, but MUST describe them as projections over `ProvenanceGraph`.

---

## 2. Provenance Parameters

### 2.1 Edge type selection (`EdgeTypeFilter`)

`EdgeTypeFilter` constrains which edges participate in a provenance traversal.

```text
EdgeTypeFilter {
  types: list<EdgeTypeId>
}
```

For a fixed `ProvenanceGraph G`, define the **effective edge-type set**:

```text
EffectiveTypeSet(filter, G) =
  if filter.types is empty:
    { body.type | (ref, body) ∈ G.Edges }
  else:
    { t | t ∈ filter.types ∧ ∃ (ref, body) ∈ G.Edges, body.type = t }
```

Only edges whose `EdgeBody.type` belongs to `EffectiveTypeSet(filter, G)` are considered during traversal.

Notes:

* Repeated `EdgeTypeId` values in `filter.types` have no effect.
* If `EffectiveTypeSet(filter, G)` is empty, then no edges participate in traversal for that graph and filter; the resulting edge set is empty and the node closure consists solely of the seeds (see §3.3).
* For any two filters `filter₁`, `filter₂`, if:

  ```text
  EffectiveTypeSet(filter₁, G) ⊆ EffectiveTypeSet(filter₂, G)
  ```

  then all neighbor sets, closures, and traces computed with `filter₁` are subsets of those computed with `filter₂` for the same `G`, seeds, direction, and depth limit (see §2.2, §3.4.5, and §4.6).

`TGK/PROV/1` does not prescribe which edge types “should” be included; higher-layer provenance profiles SHOULD specify concrete filters (e.g. “execution edges only”, “facts + receipts + certificates”).

When provenance is implemented over a `TGK/STORE/1` graph store, this `EdgeTypeFilter` has the same logical shape and semantics as `TGK/STORE/1`’s `EdgeTypeFilter`, with the understanding that `G.Edges` already reflects the environment’s supported edge types.

### 2.2 Direction semantics (`ProvDirection`)

`ProvDirection` determines how selected edges connect nodes during traversal.

```text
ProvDirection (u8) {
  BACKWARD = 1
  FORWARD  = 2
  BOTH     = 3
}
```

For a given `ProvenanceGraph G`, direction `dir`, and edge-type filter `filter`, define:

1. The **selected edge set**:

   ```text
   SelectedEdges(G, filter) =
     { (ref, body) ∈ G.Edges | body.type ∈ EffectiveTypeSet(filter, G) }
   ```

2. For a node `n`, the set of **incident selected edges** (via `from`/`to` only):

   ```text
   IncEdges(G, n, filter) =
     { (ref, body) ∈ SelectedEdges(G, filter) |
         n ∈ body.from ∪ body.to }
   ```

3. The **directional neighbor functions** `Neighbors_dir`:

   * For `dir = BACKWARD`:

     ```text
     Neighbors_BACKWARD(G, n, filter) =
       { m ∈ Node |
           ∃ (ref, body) ∈ SelectedEdges(G, filter),
             n ∈ body.to,
             m ∈ body.from
       }
     ```

   * For `dir = FORWARD`:

     ```text
     Neighbors_FORWARD(G, n, filter) =
       { m ∈ Node |
           ∃ (ref, body) ∈ SelectedEdges(G, filter),
             n ∈ body.from,
             m ∈ body.to
       }
     ```

   * For `dir = BOTH`:

     ```text
     Neighbors_BOTH(G, n, filter) =
       Neighbors_BACKWARD(G, n, filter)
       ∪ Neighbors_FORWARD(G, n, filter)
     ```

All neighbor sets are mathematical sets: multiple edges connecting the same pair of nodes do not create duplicate neighbor entries.

By the definition of `ProvenanceGraph.Nodes` in `TGK/1-CORE`, all endpoints of any edge in `G.Edges` lie in `G.Nodes`. Therefore, for all `dir`, `n`, and `filter`:

```text
Neighbors_dir(G, n, filter) ⊆ G.Nodes
```

Self-loops and multi-endpoint edges behave naturally under these definitions:

* If a selected edge has the same node in both `from` and `to`, then:

  * under `FORWARD` and `BACKWARD` that node is its own neighbor, and
  * under `BOTH` it still appears only once as a neighbor.

* If a node appears multiple times in `from` and/or `to` for a given edge, this does not change neighbor sets: neighbors are deduplicated at the set level.

For fixed `G`, `n`, and `dir`, neighbor sets are **monotone in the effective edge-type set**: if `filter₁` and `filter₂` satisfy:

```text
EffectiveTypeSet(filter₁, G) ⊆ EffectiveTypeSet(filter₂, G)
```

then:

```text
Neighbors_dir(G, n, filter₁) ⊆ Neighbors_dir(G, n, filter₂)
```

These neighbor definitions are intentionally aligned with `TGK/STORE/1`’s `GraphDirection` semantics when provenance traversals are implemented via graph-store adjacency:

* `BACKWARD` corresponds to walking along edges where the current node appears in `body.to`, towards nodes in `body.from` (similar to `GraphDirection.IN`).
* `FORWARD` corresponds to walking along edges where the current node appears in `body.from`, towards nodes in `body.to` (similar to `GraphDirection.OUT`).
* `BOTH` is the union of the two.

#### Payload nodes

In this version:

* `payload` is **not** used to advance the traversal frontier.
* Payload nodes may appear in the resulting `TraceGraph.nodes` set (see §4.2–§4.3), but `Neighbors_*` never step through `payload`.

> **TGK/PROV-PAYLOAD-NONTRAV/1**
> Kernel provenance traversal under `TGK/PROV/1` **MUST NOT** use `EdgeBody.payload` as a source of neighbors. Only `from` and `to` participate in adjacency, depth assignment, and closure. Profiles that wish to treat payload values as traversable endpoints MUST either:
>
> * express those relationships via additional edges whose endpoints lie in `from` / `to`; or
> * define profile-specific traversal rules on top of this kernel, without redefining the core neighbor functions.

### 2.3 Depth limit (`DepthLimit`)

`DepthLimit` is an optional non-negative integer that bounds traversal length:

```text
DepthLimit = optional uint32
```

Semantics:

* Seeds are at depth `0`.
* Traversal proceeds in **edge hops** (as defined by `Neighbors_dir`).
* For a limit `D`:

  * nodes at depth `d` with `d < D` MAY generate new neighbors at depth `d+1`;
  * nodes at depth `d = D` DO NOT generate new neighbors;
  * no node has depth `> D`.

If `DepthLimit` is absent, traversal is unbounded in depth (but still finite in its effect on `G.Nodes` because `G` is finite, see §0.4).

Special case:

* If `DepthLimit` is present and `D = 0`, then no seed generates neighbors at all; the closure consists exactly of the seed set `S` (subject to the empty-seed rule in §3.3).

`DepthLimit` applies to the number of edges stepped along admissible directions, **not** to the number of nodes. If there are multiple paths from seeds to a node, its depth is the length of the shortest such path under the chosen direction and type filter.

---

## 3. Closure Semantics over `ProvenanceGraph`

This section defines `prov_closure_nodes`, the core node-set closure operator, together with the canonical depth and layering views built on the same construction.

### 3.1 Provenance query descriptor (logical)

For convenience, this document packages parameters into a logical `ProvQuery` type:

```text
ProvQuery {
  direction:   ProvDirection
  type_filter: EdgeTypeFilter
  depth_limit: DepthLimit
}
```

Concrete APIs MAY represent queries differently (e.g. as separate arguments), but their observable behavior MUST be equivalent to using a `ProvQuery` as above.

Unless otherwise stated, in what follows:

* `G` is a fixed `ProvenanceGraph` for some snapshot (finite, per §0.4),
* `S : SeedSet` is a finite seed set `S ⊆ Node` (seeds need not be in `G.Nodes`), and
* `Q` is a fixed `ProvQuery`.

### 3.2 Node-depth assignment

Fix:

* a `ProvenanceGraph G`,
* a finite seed set `S : SeedSet`, and
* a `ProvQuery Q`.

Let:

* `dir    = Q.direction`,
* `filter = Q.type_filter`,
* `D      = Q.depth_limit` (which may be absent).

Define a partial function:

```text
depth_Q(G, S, n) : Node -> uint32 or undefined
```

as the **minimum traversal depth** of node `n` from seeds `S` under `G` and `Q`, via the following inductive construction.

1. **Base depth-0 assignment**

   ```text
   S_0 = S
   depth_Q(G, S, n) = 0   for all n ∈ S_0
   ```

   Seeds may or may not be in `G.Nodes`. Seeds that are not in `G.Nodes`:

   * still have `depth_Q(G, S, n) = 0` defined, but
   * never appear in any `Frontier_d` set below (because `Frontier_d` ranges over `G.Nodes` only),
   * and therefore never generate neighbors.

2. **Iterative step**

   For each integer `d ≥ 0`, define:

   ```text
   Frontier_d   = { n ∈ G.Nodes | depth_Q(G, S, n) = d }

   NewNodes_d+1 =
     { m ∈ G.Nodes |
         ∃ n ∈ Frontier_d,
             m ∈ Neighbors_dir(G, n, filter),
             depth_Q(G, S, m) is undefined
     }
   ```

   * If `D` is present and `d ≥ D`, then `NewNodes_d+1 = ∅` (no further expansion).
   * For each `m ∈ NewNodes_d+1`, set:

     ```text
     depth_Q(G, S, m) = d + 1
     ```

3. **Termination**

   Since `G.Nodes` is finite (§0.4) and each node in `G.Nodes` receives at most one depth assignment, this iterative construction over `G.Nodes` terminates in finitely many steps. Nodes for which `depth_Q(G, S, n)` remains undefined and which are not seeds are not reachable from `S` under `(G, Q)`.

Depths for seeds in `S` are defined directly in step 1 and do not affect the termination of the frontier expansion over `G.Nodes`.

### 3.3 Node closure: `prov_closure_nodes`

Define the **provenance node closure**:

```text
prov_closure_nodes(G, S, Q) -> set<Node>
```

as the set `Reachable_Q(G, S)` given by:

```text
Reachable_Q(G, S) =
  S ∪ { n ∈ G.Nodes | depth_Q(G, S, n) is defined }
```

By definition:

```text
prov_closure_nodes(G, S, Q) = Reachable_Q(G, S)
```

This definition induces the following special cases and properties:

* **Seeds are always included**

  For any `G`, finite `S : SeedSet`, and `Q`:

  ```text
  S ⊆ Reachable_Q(G, S)
  ```

  regardless of whether seeds belong to `G.Nodes`.

* **Seeds not in `G.Nodes`**

  If `n ∈ S` but `n ∉ G.Nodes`, then:

  * `depth_Q(G, S, n) = 0`,
  * `n ∉ Frontier_d` for all `d`, and therefore `n` never generates neighbors,
  * but `n ∈ Reachable_Q(G, S)` by definition.

  By the definition of `G.Nodes` in `TGK/1-CORE`, such seeds do **not** appear as endpoints of any edges in `G.Edges`. They therefore affect only the seed/closure node sets, not the traversed portion of the graph.

* **Empty seed set**

  If `S = ∅`, then:

  * `depth_Q(G, S, n)` is undefined for all `n ∈ G.Nodes`, and
  * `Reachable_Q(G, ∅) = ∅`.

  In particular, with an empty seed set, node closure is empty and the trace has no nodes or edges (see §4.2).

* **Edge-type filter with no matching edges**

  If `EffectiveTypeSet(Q.type_filter, G)` is empty, no edges of `G` are traversed, and:

  * `depth_Q(G, S, n)` is undefined for all `n ∈ G.Nodes \ S`,
  * `Reachable_Q(G, S) = S`.

  Combined with the empty-seed rule above, when `S = ∅` and `EffectiveTypeSet(Q.type_filter, G)` is empty, the closure is empty.

* **Depth-limit special case**

  If `Q.depth_limit` is present with `D = 0`, then no `Frontier_d` with `d ≥ 0` ever expands, so:

  ```text
  Reachable_Q(G, S) = S
  ```

  for all `G` and finite `S`. This is consistent with the empty-seed and empty-filter behaviors above.

Because both `G.Nodes` and `S` are finite (§0.4, §0.3), `Reachable_Q(G, S)` (and thus `prov_closure_nodes(G, S, Q)`) is always a finite set.

Properties:

* If `DepthLimit` is absent, `Reachable_Q(G, S)` is the set of all nodes reachable from `S` by paths along edges in `SelectedEdges(G, filter)`, respecting `dir`.
* If `DepthLimit = D` is present, `Reachable_Q(G, S)` consists of:

  * all seeds; and
  * all nodes in `G.Nodes` reachable from `S` by paths of length ≤ `D` under `dir` and `filter`.

Collecting the seed-related aspects:

> **TGK/PROV-SEED-SEM/1**
> For any `ProvenanceGraph G`, finite seed set `S : SeedSet`, and query `Q`:
>
> * Seeds are always included in the closure:
>
>   ```text
>   S ⊆ Reachable_Q(G, S) = prov_closure_nodes(G, S, Q)
>   ```
>
> * If `S = ∅`, then `Reachable_Q(G, S) = ∅` (no seeds ⇒ empty closure).
>
> * For any seed `n ∈ S` with `n ∉ G.Nodes`, `n` is in `Reachable_Q(G, S)` but never appears in any `Frontier_d` and never generates neighbors; such seeds do not affect which nodes in `G.Nodes` are reachable.
>
> * Changing `Q.type_filter` or `Q.depth_limit` may add or remove **non-seed** reachable nodes but MUST NOT remove seeds from the closure: for all `Q` and all `Q'` differing only in `type_filter` or `depth_limit`, `S ⊆ Reachable_Q(G, S)` and `S ⊆ Reachable_Q'(G, S)`.

### 3.4 Closure properties and closure-operator invariant

For this section, define an **admissible `Q`-path in `G`** from `s ∈ S` to a node `n` to be a finite sequence:

```text
<s = n₀, n₁, ..., n_k = n>    with k ≥ 0
```

such that for each `i` with `0 ≤ i < k` there exists an edge `(ref, body) ∈ SelectedEdges(G, filter)` that connects `n_i` and `n_{i+1}` according to `dir`:

* if `dir = BACKWARD`, then `n_i ∈ body.to` and `n_{i+1} ∈ body.from`;
* if `dir = FORWARD`, then `n_i ∈ body.from` and `n_{i+1} ∈ body.to`;
* if `dir = BOTH`, then either of the above holds.

The **length** of such a path is `k` (the number of edge hops). By the definition of `Neighbors_dir`, all intermediate nodes after the first step lie in `G.Nodes`.

For any fixed `G`:

1. **Seed monotonicity**

   If `S₁ ⊆ S₂` (both finite `SeedSet`s), then for any `Q`:

   ```text
   Reachable_Q(G, S₁) ⊆ Reachable_Q(G, S₂)
   ```

2. **Depth-limit monotonicity**

   Let `Q` and `Q'` be identical except for `DepthLimit`:

   * If `Q.depth_limit` is absent and `Q'.depth_limit = D` is present, then:

     ```text
     Reachable_Q'(G, S) ⊆ Reachable_Q(G, S)
     ```

   * If both are present and `Q.depth_limit = D₁`, `Q'.depth_limit = D₂` with `D₁ ≤ D₂`, then:

     ```text
     Reachable_Q(G, S) ⊆ Reachable_Q'(G, S)
     ```

3. **Path characterization**

   For any node `n`:

   * If `depth_Q(G, S, n) = d` is defined, then there exists at least one admissible `Q`-path of length `d` from some `s ∈ S` to `n`.
   * If there exists an admissible `Q`-path of length `k` from some `s ∈ S` to `n`, then:

     ```text
     depth_Q(G, S, n) is defined ∧ depth_Q(G, S, n) ≤ k
     ```

4. **Seed-union and idempotence**

   * For any two finite seed sets `S₁`, `S₂` and fixed `Q`:

     ```text
     Reachable_Q(G, S₁ ∪ S₂)
       = Reachable_Q(G, S₁) ∪ Reachable_Q(G, S₂)
     ```

     and, by induction, for any finite family `{S_i}`:

     ```text
     Reachable_Q(G, ⋃_i S_i) = ⋃_i Reachable_Q(G, S_i)
     ```

   * For any finite `S` and fixed `Q`, closure is idempotent as a seed operator:

     ```text
     S' = Reachable_Q(G, S)
     ⇒ Reachable_Q(G, S') = S'
     ```

5. **Edge-type filter monotonicity**

   Let `Q₁` and `Q₂` be identical except for `type_filter`. If:

   ```text
   EffectiveTypeSet(Q₁.type_filter, G) ⊆ EffectiveTypeSet(Q₂.type_filter, G)
   ```

   then for any finite seed set `S : SeedSet`:

   ```text
   Reachable_Q₁(G, S) ⊆ Reachable_Q₂(G, S)
   ```

   That is, enlarging the set of admissible edge types (while keeping `direction` and `depth_limit` fixed) can only increase, never decrease, the closure.

Taken together, these properties mean:

> **TGK/PROV-CLOSURE-OP/1**
> For any fixed `ProvenanceGraph G` and `ProvQuery Q`, the mapping:
>
> ```text
> Cl_Q : P_f(Node) -> P_f(Node)
> Cl_Q(S) = Reachable_Q(G, S) = prov_closure_nodes(G, S, Q)
> ```
>
> where `P_f(Node)` denotes finite subsets of `Node`, is a **closure operator** on finite seed sets (extensive, monotone, and idempotent). Higher-layer provenance profiles MAY rely on these algebraic properties when reasoning about seed sets and composed provenance queries.

### 3.5 Alternative fixpoint characterization (informative)

For **unbounded** queries (i.e. `Q.depth_limit` is absent), `prov_closure_nodes` can be described equivalently as the **least fixpoint** of a monotone operator on node sets.

Fix `G`, finite `S : SeedSet`, `Q` with `Q.depth_limit = absent` and let `dir = Q.direction`, `filter = Q.type_filter`. Define an operator:

```text
F_Q(X) =
  S ∪ { m ∈ G.Nodes |
          ∃ n ∈ (X ∩ G.Nodes),
            m ∈ Neighbors_dir(G, n, filter)
      }
```

Then:

* `F_Q` is monotone on the powerset of `Node` (ordered by ⊆).
* The sequence `X₀ = S`, `X₁ = F_Q(X₀)`, `X₂ = F_Q(X₁)`, … stabilizes after finitely many steps on `G.Nodes` because `G.Nodes` is finite (§0.4).
* Its least fixpoint `lfp(F_Q)` satisfies:

  ```text
  lfp(F_Q) = Reachable_Q(G, S) = prov_closure_nodes(G, S, Q)
  ```

for unbounded queries.

For bounded queries (with `DepthLimit` present), §3.2’s depth-based construction is normative. The fixpoint view can still be recovered by intersecting the unbounded closure with the nodes whose minimum depth is ≤ `D`, but `TGK/PROV/1` does not rely on that reformulation.

This characterization is provided to make properties like monotonicity and idempotence easy to reason about when designing higher-level provenance policies.

### 3.6 Depth maps and layering

The `depth_Q` construction can be surfaced to callers as a canonical **depth map** and **layering** over the closure.

#### 3.6.1 Operation: `prov_depths`

Define:

```text
prov_depths(
  G: ProvenanceGraph,
  S: SeedSet,     // finite
  Q: ProvQuery
) -> DepthMap
```

semantically as:

```text
prov_depths(G, S, Q) =
  { (n, d) |
      depth_Q(G, S, n) = d   // d is defined by §3.2
  }
```

Viewed as a map, its **domain** is exactly:

```text
dom(prov_depths(G, S, Q)) = Reachable_Q(G, S) = prov_closure_nodes(G, S, Q)
```

because:

* every seed `n ∈ S` has `depth_Q(G, S, n) = 0` by definition (even if `n ∉ G.Nodes`), and
* all nodes in `G.Nodes` with a defined depth also belong to `Reachable_Q(G, S)`.

Properties:

* For any `n` with `n ∈ Reachable_Q(G, S)`, `prov_depths(G, S, Q)[n]` is the minimum hop-count depth from some `s ∈ S` to `n` under `G` and `Q`.
* For any `n` with `n ∉ Reachable_Q(G, S)`, `prov_depths(G, S, Q)` has no entry for `n`.
* For every seed `n ∈ S`, `prov_depths(G, S, Q)[n] = 0`. Seeds therefore always appear in the depth map at depth 0 whenever they are present in the seed set.

Because `Reachable_Q(G, S)` is finite for finite `S` (§3.3), `prov_depths(G, S, Q)` is always a finite map.

In particular, for any node `n` with `n ∈ G.Nodes` and `prov_depths(G, S, Q)[n] > 0`, there exists at least one selected edge `(ref, body) ∈ SelectedEdges(G, filter)` and some node `m` with:

```text
prov_depths(G, S, Q)[m] = prov_depths(G, S, Q)[n] - 1
```

such that `m` and `n` are connected by that edge according to `Q.direction`. This is the “parent along some shortest path” witness implied by §3.4.

Implementations MAY choose any concrete representation for `DepthMap` (e.g. associative map, list of `(Node, depth)` pairs) as long as it is observable-equivalent to the mapping above.

`prov_depths` is a **derived view**: it does not change closure semantics; it simply exposes the depth information already used to define `Reachable_Q(G, S)`.

Collecting these facts:

> **TGK/PROV-DEPTH-MAP/1**
> For any `ProvenanceGraph G`, finite seed set `S : SeedSet`, and query `Q`:
>
> * `dom(prov_depths(G, S, Q)) = Reachable_Q(G, S) = prov_closure_nodes(G, S, Q)`;
> * every seed `n ∈ S` has `prov_depths(G, S, Q)[n] = 0`; and
> * for every `n` in the domain, `prov_depths(G, S, Q)[n]` is the minimum hop-count length of any admissible `Q`-path from some seed in `S` to `n`.
>
> Higher-layer profiles MAY rely on this depth map as a canonical notion of “distance from seeds” under `(G, Q)`.

#### 3.6.2 Operation: `prov_layers`

Define:

```text
prov_layers(
  G: ProvenanceGraph,
  S: SeedSet,     // finite
  Q: ProvQuery
) -> ProvLayering
```

semantically in terms of `prov_depths`:

1. Let `DMap = prov_depths(G, S, Q)`.

2. Let:

   ```text
   Depths_Q(G, S) = { d | ∃ n, (n, d) ∈ DMap }
   ```

   i.e., the set of depths actually realized by reachable nodes.

3. For each `d ∈ Depths_Q(G, S)`, define a layer:

   ```text
   Layer(d) = ProvLayer {
     depth = d
     nodes = { n | (n, d) ∈ DMap }
   }
   ```

4. Define `ProvLayering.layers` to be the list:

   ```text
   layers = [ Layer(d₀), Layer(d₁), ..., Layer(d_k) ]
   ```

   where `{d₀, ..., d_k} = Depths_Q(G, S)` and `d₀ < d₁ < ... < d_k` (strictly increasing). If `Depths_Q(G, S)` is empty (i.e., the closure is empty), then `layers` is the empty list.

Requirements:

* Each `Layer(d).nodes` MUST be non-empty by construction.

* The union over all `Layer(d).nodes` MUST equal `Reachable_Q(G, S)`:

  ```text
  ⋃_{d ∈ Depths_Q(G, S)} Layer(d).nodes
    = Reachable_Q(G, S)
  ```

* Distinct layers have disjoint node sets:

  ```text
  d₁ ≠ d₂ ⇒ Layer(d₁).nodes ∩ Layer(d₂).nodes = ∅
  ```

Seeds therefore always appear in the unique layer with `depth = 0` whenever `S` is non-empty.

Thus, `prov_layers` provides a canonical **partition** of the reachable node set by depth, with explicit depth labels and a deterministic ascending ordering of layers. In particular, if `S` is non-empty, seeds always appear in the unique layer with `depth = 0`.

`ProvLayering` is often convenient for:

* rendering trace frontiers stepwise (“distance from seed”), and
* defining layer-based transformations (e.g., trimming nodes beyond a given depth, or summarizing all nodes at a depth into a single fact).

As with `prov_depths`, `prov_layers` is a pure derived view that does not change any closure or trace semantics elsewhere in this specification.

> **TGK/PROV-LAYERING-PARTITION/1**
> For any `G`, finite `S : SeedSet`, and `Q`, the result of `prov_layers(G, S, Q)` partitions `Reachable_Q(G, S)` into non-empty, pairwise-disjoint depth-labeled layers whose union is exactly `Reachable_Q(G, S)`, with the layer depths strictly increasing.

---

## 4. TraceGraph Semantics

`prov_trace` lifts node closure to a concrete subgraph view.

### 4.1 TraceGraph structure

Recall:

```text
TraceGraph {
  seeds: set<Node>
  nodes: set<Node>
  edges: set<(EdgeRef, EdgeBody)>
}
```

A `TraceGraph T` is always interpreted **relative to some base `ProvenanceGraph G`**. This spec does not encode the association `T ↔ G`; profiles that persist traces MUST make that association explicit (e.g. via receipts in `FER/1`).

For convenience, define for any edge set `E`:

```text
EdgeNodes(E) =
  { n ∈ Node |
      ∃ (ref, body) ∈ E,
        n ∈ body.from ∪ body.to ∪ { body.payload }
  }
```

### 4.2 Operation: `prov_trace`

Logical signature:

```text
prov_trace(
  G: ProvenanceGraph,
  S: SeedSet,     // finite
  Q: ProvQuery
) -> TraceGraph
```

Semantics:

1. **Compute node closure**

   ```text
   N = prov_closure_nodes(G, S, Q)   // i.e., N = Reachable_Q(G, S)
   ```

2. **Define the trace edge set**

   Let `filter = Q.type_filter`. Define:

   ```text
   E =
     ⋃ { IncEdges(G, n, filter) | n ∈ N }
   ```

   That is, `E` is the union, over all closure nodes, of the selected-type edges incident (via `from` or `to`) to those nodes.

   Equivalently (and as a direct set comprehension):

   ```text
   E =
     { (ref, body) ∈ SelectedEdges(G, filter) |
         (body.from ∩ N ≠ ∅) ∨ (body.to ∩ N ≠ ∅)
     }
   ```

   Notes:

   * `payload` does **not** control inclusion; edges are included based on `from` / `to`.
   * Edges whose type is not in `EffectiveTypeSet(filter, G)` are excluded, even if incident to nodes in `N`.

3. **Define the trace node set**

   The trace node set is:

   ```text
   Nodes_T = S ∪ EdgeNodes(E)
   ```

   i.e., seeds plus all nodes that appear in `from`, `to`, or `payload` for edges in `E`.

   By construction and by the properties of `depth_Q`:

   * `Reachable_Q(G, S) ⊆ Nodes_T` (see §4.3); and
   * for any `n ∈ Reachable_Q(G, S) ∩ G.Nodes` with `n ∉ S`, there exists at least one edge in `E` with `n` in `body.from` or `body.to`, so `n ∈ EdgeNodes(E)`.

   Thus this definition is equivalent to the alternative view:

   ```text
   Nodes_T = Reachable_Q(G, S) ∪ EdgeNodes(E)
   ```

   but is stated purely in terms of seeds and edges.

4. **Construct the TraceGraph**

   ```text
   TraceGraph T {
     seeds = S
     nodes = Nodes_T
     edges = E
   }
   ```

### 4.3 Invariants

For any `G`, finite `S : SeedSet`, `Q`, let `T = prov_trace(G, S, Q)`. Then:

1. **Seed inclusion**

   ```text
   T.seeds = S
   S ⊆ T.nodes
   ```

2. **Edge subset**

   ```text
   T.edges ⊆ G.Edges
   ```

   No edges are introduced that are not in the base `ProvenanceGraph`.

3. **Node closure over edges**

   All nodes incident to edges in `T.edges` are in `T.nodes` by construction:

   ```text
   { n ∈ Node |
       ∃ (ref, body) ∈ T.edges,
           n ∈ body.from ∪ body.to ∪ { body.payload }
   } ⊆ T.nodes
   ```

4. **Trace node characterization**

   Trace nodes are exactly **seeds plus edge nodes**:

   ```text
   T.nodes = S ∪ EdgeNodes(T.edges)
   ```

   In particular, every node in `T.nodes` that is not a seed appears at least once as an endpoint or payload of some edge in `T.edges`.

5. **Reachable-node containment**

   ```text
   Reachable_Q(G, S) = prov_closure_nodes(G, S, Q) ⊆ T.nodes
   ```

   Seeds and all nodes reachable under `(G, Q)` are present in `T.nodes`.

   For any `n ∈ G.Nodes` with `n ∈ Reachable_Q(G, S)` and `n ∉ S`, there exists at least one edge `(ref, body) ∈ T.edges` with `n ∈ body.from ∪ body.to`. Nodes that are only seeds (possibly not in `G.Nodes`) MAY have degree zero in `T`.

6. **No off-graph provenance**

   * All edges in `T.edges` are drawn from `G.Edges`.
   * All nodes in `T.nodes` are seeds or nodes that appear somewhere in the `EdgeBody` of some edge in `G.Edges`.

   The operator does not invent nodes or edges that are not present in the underlying `ProvenanceGraph`.

7. **Finiteness**

   Because `G` is finite (§0.4) and `S` is a finite `SeedSet` (§0.3), `T` is always finite:

   ```text
   |T.edges| ≤ |G.Edges|
   |T.nodes| ≤ |G.Nodes| + |S|
   ```

   In particular, for any finite seed set `S`, `prov_trace` cannot diverge or yield an infinite trace graph for a finite base `ProvenanceGraph`.

Collecting 2–6:

> **TGK/PROV-TRACE-SUBGRAPH/1**
> For any `ProvenanceGraph G`, finite seed set `S : SeedSet`, and query `Q`, `prov_trace(G, S, Q)` yields a finite `TraceGraph T` such that:
>
> * `T.edges ⊆ G.Edges`, and
> * every node in `T.nodes` is either a seed or incident (as `from`, `to`, or `payload`) to some edge in `T.edges`.
>
> No nodes or edges outside `G` appear in `T`.

And more specifically:

> **TGK/PROV-TRACE-NODES/1**
> For any `G`, finite `S : SeedSet`, `Q` with `T = prov_trace(G, S, Q)`:
>
> * `T.seeds = S`;
> * `T.nodes = S ∪ EdgeNodes(T.edges)`; and
> * `Reachable_Q(G, S) = prov_closure_nodes(G, S, Q) ⊆ T.nodes`.
>
> In particular, all reachable nodes from `S` under `(G, Q)` are contained in `T.nodes`, but the trace may additionally contain nodes introduced as endpoints or payload nodes of incident edges.

### 4.4 Direction and depth impact

* `direction` and `depth_limit` affect `T.nodes` and `T.edges` **indirectly via** `N = Reachable_Q(G, S)`:

  * Looser direction or larger depth typically produce larger `N`, and thus larger `T.edges` and `T.nodes`.
  * Stricter direction or smaller depth produce smaller `N`, and thus smaller `T.edges` and `T.nodes`.

* Edge inclusion is **incident-based**, not path-based:

  * An edge is in `T.edges` if and only if it is:

    * of a selected type; and
    * incident (via `from` or `to`) to at least one node in `N`.

  * There is no additional requirement that the edge itself lie along a shortest path. Nodes adjacent to closure nodes via such edges MAY lie at a graph-theoretic distance greater than the depth limit; they are still included in `T.nodes` because they are endpoints or payload nodes of included edges.

Profiles that need path-structured traces (e.g., minimal explanation trees, trees restricted strictly to nodes within a depth bound) MUST define those as further projections of `TraceGraph` (e.g., pruning edges and nodes that do not belong to any admissible path), without modifying the kernel semantics here.

### 4.5 Derived edge-closure helper (informative)

For completeness, it is sometimes convenient to speak about the **edge closure** associated with `(G, S, Q)`:

```text
prov_closure_edges(G, S, Q) =
  { (ref, body) ∈ SelectedEdges(G, Q.type_filter) |
      (body.from ∩ Reachable_Q(G, S) ≠ ∅) ∨
      (body.to   ∩ Reachable_Q(G, S) ≠ ∅)
  }
```

By inspection, for any `G`, finite `S : SeedSet`, `Q`:

```text
prov_closure_edges(G, S, Q) = prov_trace(G, S, Q).edges
```

This helper is purely definitional and does not add semantics beyond §4.2; it is provided for profiles that wish to talk separately about edge-closure vs node-closure.

### 4.6 TraceGraph closure and monotonicity (informative)

Using the properties of `Reachable_Q` in §3.4, `prov_trace` enjoys corresponding algebraic properties. For any fixed `G`:

1. **Seed monotonicity**

   If `S₁ ⊆ S₂` (finite `SeedSet`s), then for any `Q`:

   ```text
   prov_trace(G, S₁, Q).nodes ⊆ prov_trace(G, S₂, Q).nodes
   prov_trace(G, S₁, Q).edges ⊆ prov_trace(G, S₂, Q).edges
   ```

2. **Depth-limit monotonicity**

   For `Q` and `Q'` identical except for `DepthLimit`, with `Q.depth_limit` absent and `Q'.depth_limit` present, or with `Q.depth_limit ≤ Q'.depth_limit` when both are present:

   ```text
   prov_trace(G, S, Q).nodes ⊆ prov_trace(G, S, Q').nodes
   prov_trace(G, S, Q).edges ⊆ prov_trace(G, S, Q').edges
   ```

3. **Edge-type filter monotonicity**

   For `Q₁` and `Q₂` identical except for `type_filter`, if:

   ```text
   EffectiveTypeSet(Q₁.type_filter, G) ⊆ EffectiveTypeSet(Q₂.type_filter, G)
   ```

   then for any finite seed set `S : SeedSet`:

   ```text
   prov_trace(G, S, Q₁).nodes ⊆ prov_trace(G, S, Q₂).nodes
   prov_trace(G, S, Q₁).edges ⊆ prov_trace(G, S, Q₂).edges
   ```

   That is, allowing more edge types (while keeping `direction` and `depth_limit` fixed) produces a super-trace that contains all nodes and edges of the more restricted trace.

4. **Seed-union and idempotence**

   * For any finite seed sets `S₁`, `S₂` and fixed `Q`:

     ```text
     prov_trace(G, S₁ ∪ S₂, Q).nodes
       = prov_trace(G, S₁, Q).nodes ∪ prov_trace(G, S₂, Q).nodes

     prov_trace(G, S₁ ∪ S₂, Q).edges
       = prov_trace(G, S₁, Q).edges ∪ prov_trace(G, S₂, Q).edges
     ```

   * For any finite `S` and fixed `Q`, using `S' = Reachable_Q(G, S)`:

     ```text
     prov_trace(G, S,  Q).nodes = prov_trace(G, S', Q).nodes
     prov_trace(G, S,  Q).edges = prov_trace(G, S', Q).edges
     ```

     I.e., using the closure as the new seed set does not change the trace.

These properties are consequences of the definitions in §3 and §4 and are intended to support reasoning about compositional provenance operators in higher-level profiles.

> **TGK/PROV-TRACE-MONO/1**
> For any fixed `ProvenanceGraph G` and query `Q`, the operator:
>
> ```text
> T_Q(G, S) = prov_trace(G, S, Q)
> ```
>
> is:
>
> * **monotone in finite seed sets** (`S₁ ⊆ S₂ ⇒ T_Q(G, S₁) ⊆ T_Q(G, S₂)` on both nodes and edges),
> * **monotone in depth limits** (larger or absent `DepthLimit` yields a super-trace), and
> * **monotone in effective edge types** (larger `EffectiveTypeSet` yields a super-trace),
>
> and satisfies **seed-union** and **idempotence** with respect to node closure as in the equations above. Higher-layer profiles MAY rely on these properties when comparing or composing traces.

### 4.7 Depth-aware trace views (informative)

While `TraceGraph` itself does not carry depth annotations, implementations often need to present or persist depth information alongside traces (e.g., for layered UIs or explanation trees).

For any `G`, finite `S : SeedSet`, `Q`, it is always valid to pair:

```text
T      = prov_trace (G, S, Q)
DMap   = prov_depths(G, S, Q)
Layers = prov_layers(G, S, Q)
```

with the following relationships:

* `dom(DMap) = Reachable_Q(G, S) = prov_closure_nodes(G, S, Q) ⊆ T.nodes`.

* For any `(n, d) ∈ DMap`, node `n` is in the `ProvLayer` with `depth = d`.

* Nodes in `T.nodes \ dom(DMap)` (if any) are exactly the additional nodes introduced as endpoints or payload nodes of edges in `T.edges` that lie beyond the depth bound or are adjacent via incident-only inclusion (§4.4). Equivalently:

  ```text
  T.nodes \ dom(DMap) = EdgeNodes(T.edges) \ Reachable_Q(G, S)
  ```

Higher-layer profiles MAY define depth-annotated trace encodings that:

* embed `DMap` or `Layers` into Artifacts; or
* derive narrative or visual structures from `ProvLayering`.

Such profiles MUST treat `prov_depths` and `prov_layers` as **derived views** of the same kernel closure — they MUST NOT redefine depths in ways that contradict `depth_Q` (§3.2).

---

## 5. Relationship to TGK/1-CORE Invariants

`TGK/PROV/1` is designed to be consistent with, and to operationalize, the core TGK invariants.

### 5.1 Provenance kernel invariant

Recall from `TGK/1-CORE`:

> **TGK/PROV-KERNEL/CORE/1**
> For any snapshot (finite Artifact set + profile set), the corresponding `ProvenanceGraph` is a pure function of that input, and any graph indexes or materialized views are optimizations only.

`TGK/PROV/1` extends this to traces and depth views:

> **TGK/PROV-TRACE-KERNEL/1**
> For any fixed `ProvenanceGraph G`, finite seed set `S : SeedSet`, and query `Q`, all of:
>
> * `prov_closure_nodes(G, S, Q)`,
> * `prov_depths       (G, S, Q)`,
> * `prov_layers       (G, S, Q)`, and
> * `prov_trace        (G, S, Q)`
>
> are pure functions of `(G, S, Q)`. Any stored or cached trace or depth views are optimizations only and MUST be consistent with these definitions.

Together with **TGK/PROV-CLOSURE-OP/1**, **TGK/PROV-SEED-SEM/1**, and **TGK/PROV-DEPTH-MAP/1**, this guarantees that higher-level provenance policies can treat `Reachable_Q` and the associated depth map as a well-behaved closure-and-distance structure over finite seed sets.

### 5.2 Determinism

From `TGK/1-CORE`:

> **TGK/DET/CORE/1**
> For a fixed snapshot + profile set, any two implementations derive isomorphic `ProvenanceGraph`s.

`TGK/PROV/1` inherits and refines this:

> **TGK/PROV-DET/1**
> For any two `TGK/1-CORE`–conformant implementations that derive the same `ProvenanceGraph G` for a snapshot, and for any fixed finite `S : SeedSet` and `Q`, the results of:
>
> * `prov_closure_nodes(G, S, Q)`,
> * `prov_depths       (G, S, Q)`,
> * `prov_layers       (G, S, Q)`, and
> * `prov_trace        (G, S, Q)`
>
> MUST be identical (same node set, same edge set, identical depth assignments, and identical per-depth layer partition) across implementations.

### 5.3 No off-graph provenance

From `TGK/1-CORE`:

> **TGK/NO-OFF-GRAPH-PROV/CORE/1**
> Provenance-relevant relationships must be expressible as edges over `Reference`s; no hidden mutable state.

`TGK/PROV/1` enforces:

* All input is `ProvenanceGraph G`.
* All output edges are a subset of `G.Edges` (by **TGK/PROV-TRACE-SUBGRAPH/1**).
* All output nodes are seeds or nodes that appear somewhere in the `EdgeBody` of some edge in `G.Edges` (by **TGK/PROV-TRACE-NODES/1**).
* Depths and layers are derived solely from the adjacency structure of `G.Edges` and the finite seed set `S`, consistent with **TGK/PROV-DEPTH-MAP/1** and **TGK/PROV-LAYERING-PARTITION/1**.
* Seed behavior in closure is governed by **TGK/PROV-SEED-SEM/1**.
* Traversal uses only `from` and `to` endpoints for adjacency, not `payload`, as required by **TGK/PROV-PAYLOAD-NONTRAV/1**.

Traversal and depth assignment never consult stores, clocks, external indexes, or non-graph metadata.

---

## 6. Interaction with TGK/STORE/1 (Informative)

Although `TGK/PROV/1` is in terms of `ProvenanceGraph` only, most implementations will obtain graph structure via a `TGK/STORE/1` `GraphStoreInstance`.

For a fixed snapshot:

* `G.Edges` corresponds to the set of all `GraphEdgeView { edge_ref, body }` reachable via:

  * `scan_edges` (if implemented), or
  * adjacency queries plus `resolve_edge`.

* `EdgeTypeFilter` here corresponds directly to `TGK/STORE/1`’s `EdgeTypeFilter`:

  * an empty `types` list means “all supported edge types actually present in the graph”;
  * a non-empty `types` list is intersected with the instance’s supported edge types, and then with the set of types that actually appear in edges.

* `Neighbors_BACKWARD`, `Neighbors_FORWARD`, and `Neighbors_BOTH` can be implemented via combinations of:

  * `edges_from(node, type_filter)` and
  * `edges_to(node, type_filter)`,

  with `type_filter` derived from `Q.type_filter` and direction mapped to OUT/IN/BOTH as appropriate:

  * BACKWARD ≈ `edges_to` (IN) followed by neighbors in `from`,
  * FORWARD ≈ `edges_from` (OUT) followed by neighbors in `to`,
  * BOTH ≈ union of the two.

Depth-based operators (`prov_depths`, `prov_layers`) can be implemented atop the same adjacency operations:

* Traversal order and depth assignments must follow the semantics of `depth_Q` (§3.2), independent of the order in which the graph store returns edges.
* Any paging or batching at the `TGK/STORE/1` layer MUST NOT alter the resulting depth map or closure.

An implementation that uses `TGK/STORE/1` MUST ensure that its traversals:

* respect the identity and projection semantics of `TGK/STORE/1`; and
* produce node and edge sets, along with depth and layer assignments, that match `prov_closure_nodes`, `prov_depths`, `prov_layers`, and `prov_trace` exactly for some snapshot.

Nothing in `TGK/PROV/1` requires or forbids specific paging, caching, or distribution strategies in the graph store; it only constrains the observable results.

---

## 7. Examples (Informative Sketches)

These examples are illustrative only and non-normative.

### 7.1 Backward execution provenance

Assume a profile `TGK/PEL/1` defines:

* `EdgeTypeId = EDGE_EXECUTION`.
* For each execution:

  ```text
  EdgeBody.type    = EDGE_EXECUTION
  EdgeBody.from    = [program_ref] ∪ input_refs
  EdgeBody.to      = output_refs ∪ [execution_result_ref]
  EdgeBody.payload = execution_result_ref
  ```

Let:

* `G` be a `ProvenanceGraph` that includes such edges.
* `S = { o }` where `o` is a particular output ArtifactRef.
* `Q` with:

  * `direction   = BACKWARD`
  * `type_filter = { EDGE_EXECUTION }`
  * `depth_limit = absent` (unbounded)

Then:

* `prov_closure_nodes(G, {o}, Q)` contains:

  * `o` at depth `0`,
  * the execution result and input/program nodes at depth `1`,
  * and recursively all ancestors reachable via `EDGE_EXECUTION` edges.

* `prov_depths(G, {o}, Q)` yields a map where, for example:

  * `prov_depths(...)[o]                = 0`,
  * `prov_depths(...)[execution_result] = 1`,
  * `prov_depths(...)[input_i]          = 1`,
  * etc., with deeper ancestors at depth `2`, `3`, and so on.

* `prov_layers(G, {o}, Q)` partitions these nodes into depth-based layers.

* `prov_trace(G, {o}, Q)` includes:

  * all `EDGE_EXECUTION` edges incident (via `from` or `to`) to any node in this closure, and
  * all endpoint and payload nodes of those edges.

Higher layers may then further project `TraceGraph` and associated layers into an explanation tree, or annotate it with timings, certs, or UI overlays.

### 7.2 Mixed execution + evidence provenance

Suppose a profile `TGK/FER/1` defines:

* `EDGE_RECEIPT_SUPPORTS` edges from receipt Artifacts to Artifacts supported by the receipts.

A domain profile might choose:

* a `type_filter` that includes both `EDGE_EXECUTION` and `EDGE_RECEIPT_SUPPORTS`;
* `direction = BACKWARD`, to follow from outputs back to inputs and receipts.

`TGK/PROV/1` does not need to know what “execution” or “receipt” means; it only knows that these are edge types that define how to walk the graph, how to compute reachable nodes, and how to assign depths and layers.

---

## 8. Conformance

An implementation is **TGK/PROV/1–conformant** if, for each provenance operator it claims to implement, it satisfies all of the following:

1. **Graph-based semantics**

   * Treats `ProvenanceGraph` as the sole graph abstraction.
   * Does not introduce any off-graph nodes or edges into `TraceGraph` results or into any depth or layering views, consistent with **TGK/NO-OFF-GRAPH-PROV/CORE/1**, **TGK/PROV-TRACE-SUBGRAPH/1**, and **TGK/PROV-TRACE-NODES/1**.

2. **Parameter handling**

   * Interprets `EdgeTypeFilter` as in §2.1 (including the empty-list “all types present in `G`” rule, and the monotonicity rule when effective edge-type sets are nested).
   * Implements `ProvDirection` semantics exactly as in §2.2, including the distinction between BACKWARD, FORWARD, and BOTH, and the restriction that traversal uses `from` / `to` only (not `payload`), as required by **TGK/PROV-PAYLOAD-NONTRAV/1**.
   * Interprets `DepthLimit` as a hop-count limit as in §2.3 (no expansions from nodes at depth `D`, with the `D = 0` seeds-only behavior).
   * Treats seeds as **finite sets** of nodes:

     * All semantics in this document are defined on a finite `SeedSet S`, and conformance is judged relative to that finite set.
     * If an implementation exposes APIs that accept seeds as lists or sequences, it MUST interpret them semantically as finite sets when computing provenance (ignoring duplicates and not depending on order).
     * Implementations MUST NOT attempt to interpret or emulate infinite seed families when invoking `TGK/PROV/1` operators.

3. **Node closure**

   * For any `G`, finite `S : SeedSet`, `Q`, computes `prov_closure_nodes(G, S, Q)` as in §3.3, i.e.:

     ```text
     prov_closure_nodes(G, S, Q) = Reachable_Q(G, S)
     ```

   * May use any traversal algorithm (e.g., BFS, DFS with memoization) but MUST produce exactly the `Reachable_Q(G, S)` set.

   * Ensures that, for fixed `G` and `Q`, the mapping `S ↦ prov_closure_nodes(G, S, Q)` over finite seed sets behaves as a closure operator (extensive, monotone, idempotent), as stated in **TGK/PROV-CLOSURE-OP/1**.

   * Preserves the seed semantics in **TGK/PROV-SEED-SEM/1**:

     * seeds are always included in the closure,
     * empty seed sets produce empty closures,
     * seeds not in `G.Nodes` are in the closure but never expand the frontier.

   * Respects the **edge-type filter monotonicity** of §3.4.5: when two queries differ only in `type_filter` with nested effective type sets, the closure for the narrower edge set is a subset of the closure for the broader set.

4. **Depth maps and layering**

   * If it exposes `prov_depths`, it MUST:

     * compute `depth_Q(G, S, n)` as in §3.2, and
     * return a `DepthMap` exactly equal to the mapping in §3.6.1 (domain = `Reachable_Q(G, S) = prov_closure_nodes(G, S, Q)`, seeds at depth 0, values = minimum hop-count depth), consistent with **TGK/PROV-DEPTH-MAP/1**.

   * If it exposes `prov_layers`, it MUST:

     * derive `ProvLayering` from `prov_depths` as in §3.6.2,
     * ensure that layers are ordered by strictly increasing `depth` and that nodes are partitioned by depth without overlap, consistent with **TGK/PROV-LAYERING-PARTITION/1**.

5. **TraceGraph construction**

   * For any `G`, finite `S : SeedSet`, `Q`, computes `prov_trace(G, S, Q)` as in §4.2.

   * Ensures all invariants in §4.3 hold, including:

     * `T.edges ⊆ G.Edges`,
     * `T.seeds = S`,
     * `T.nodes = S ∪ EdgeNodes(T.edges)`,
     * `Reachable_Q(G, S) = prov_closure_nodes(G, S, Q) ⊆ T.nodes`.

   * For any node `n` with `n ∈ G.Nodes`, `n ∈ Reachable_Q(G, S)`, and `prov_depths(G, S, Q)[n] > 0`, ensures that there exists at least one edge `(ref, body) ∈ T.edges` with `n ∈ body.from ∪ body.to`, so that every non-seed reachable node appears incident to at least one trace edge.

   * Ensures that its outputs respect the monotonicity and idempotence properties of §4.6, consistent with **TGK/PROV-TRACE-MONO/1** (monotone in seeds, depth limits, and effective edge types; seed-union and idempotence with respect to closure).

6. **Determinism**

   * For fixed `G`, finite `S : SeedSet`, and `Q`, produces the same outputs across runs and across equivalent implementations, consistent with §5.2:

     * same closure node set,
     * same depth assignments (if provided),
     * same per-depth layering (if provided),
     * same trace edge and node sets.

7. **Layering**

   * Does not assume any particular physical storage or graph index model.
   * If implemented on top of `TGK/STORE/1`, respects the semantics of that profile and does not alter its identity or projection guarantees.

8. **Neutrality**

   * Does not bake in PEL-, CIL-, FER-, FCT-, or OI-specific semantics.
   * Leaves provenance policies (which edge types to include and how to present them) to higher-level profiles and applications.

Everything else — user-facing query languages, result serialization, UI representations, domain-specific policies, optimization strategies — is outside the scope of `TGK/PROV/1` and MAY be specified by additional profiles.

---

## 9. Version History (Informative)

**0.1.20 — 2025-11-16**

* Marked the draft as stable for use as an Amduat 2.0 substrate baseline candidate.
* Performed only micro editorial cleanups (punctuation and phrasing) without changing any definitions, equations, or invariants.
* For any finite snapshot `G` and finite `SeedSet S`, the behavior of `prov_closure_nodes`, `prov_depths`, `prov_layers`, and `prov_trace` is **identical** to v0.1.19; this version simply records that the kernel provenance semantics are considered ready for review and adoption.

**0.1.19 — 2025-11-16**

* Added §0.4 (**Snapshots & finiteness assumptions**) to collect in one place the finiteness guarantees inherited from `TGK/1-CORE`:

  * each `ProvenanceGraph G` considered by this spec is the projection of a finite snapshot, with finite `G.Nodes` and `G.Edges`, and
  * all semantics in `TGK/PROV/1` are defined relative to such a finite `G` and a finite `SeedSet`.

* Cross-referenced that finiteness assumption where it is used:

  * in §3.2 and §3.3 when arguing that depth assignment and closure construction always terminate and yield finite results, and
  * in §3.5 and §4.3 when stating that fixpoint iterations stabilize and `TraceGraph` is always finite.

* These changes are **purely clarificatory**: for any finite snapshot `G` and finite seed set `S`, the behavior of `prov_closure_nodes`, `prov_depths`, `prov_layers`, and `prov_trace` is unchanged from v0.1.18. The new subsection and cross-references simply make the reliance on `TGK/1-CORE`’s finiteness assumptions explicit and easy to cite from other documents.

**0.1.18 — 2025-11-16**

* Introduced an explicit `SeedSet` alias in §0.3:

  ```text
  SeedSet = set<Node>   // always finite under TGK/PROV/1
  ```

  and used this terminology in the core sections (§3, §4, §5, §8) to make “finite seed sets” a first-class, named concept rather than repeating “finite set of nodes” in prose.

* Clarified in §1.1, §3.1, and the conformance section that all operators are defined over **finite seed sets** and that the closure operator in **TGK/PROV-CLOSURE-OP/1** is formally a map on `P_f(Node)` (finite subsets of `Node`), aligning notation with the already-stated finiteness assumptions from v0.1.17.

* Slightly tightened wording around `SeedSet` in parameter handling (§8) to emphasize that:

  * implementations MUST interpret any seed input as a finite set (duplicates ignored, order irrelevant), and
  * MUST NOT attempt to treat infinite seed families as valid inputs to `TGK/PROV/1` operators.

These changes are purely **terminological and clarificatory**. For any finite seed set `S`, the behavior of `prov_closure_nodes`, `prov_depths`, `prov_layers`, and `prov_trace` is identical to v0.1.17; the new `SeedSet` alias simply makes the finiteness domain and closure-operator laws more explicit.

**0.1.17 — 2025-11-16**

* Made the **finiteness assumptions** for seeds explicit and wired them into the invariants:

  * In §0.3, stated that all seed sets `S` are assumed to be **finite**; seeds are now formally “finite sets of nodes” rather than arbitrary subsets of `Node`.
  * In §3.1 and §3.3, clarified that the closure operator is defined for finite `S`, and noted explicitly that `Reachable_Q(G, S)` is always finite because both `G.Nodes` and `S` are finite.
  * In §3.4 and §4.6, phrased the closure and trace monotonicity/idempotence laws in terms of **finite seed sets**, while preserving their algebraic form.

* Tightened the **finiteness invariant for traces and depth maps**:

  * In §4.3.7, updated the `TraceGraph` finiteness argument to rely on both `G` and `S` being finite, and qualified the “cannot diverge” statement as applying to finite seed sets.
  * In §3.6.1, noted that `prov_depths(G, S, Q)` is always a finite map because its domain is the finite closure `Reachable_Q(G, S)`.

* Strengthened **conformance** around seed sets:

  * In §8 (parameter handling), elevated finiteness to a normative requirement: implementations MUST interpret seeds as finite sets and MUST NOT attempt to treat infinite seed families as inputs to `TGK/PROV/1` operators.
  * Kept the existing “seeds are sets” rule (duplicates ignored, order irrelevant) and made it clear that this refers to finite sets.

These changes are **clarificatory** for all realistic deployments, where seeds are always finite. For any finite seed set `S`, the behavior of `prov_closure_nodes`, `prov_depths`, `prov_layers`, and `prov_trace` is unchanged compared to v0.1.16.

**0.1.16 — 2025-11-16**

* Made the equivalence between the **named closure operator** and the internal `Reachable_Q` set completely explicit:

  * In §3.3, defined `prov_closure_nodes(G, S, Q)` directly as `Reachable_Q(G, S)` and stated the equality normatively, and in §3.4 and §3.5 referred to `Cl_Q(S)` as both `Reachable_Q(G, S)` and `prov_closure_nodes(G, S, Q)`.
  * In §3.6.1 and §4.7, updated the depth and trace relationships to reference both `Reachable_Q(G, S)` and `prov_closure_nodes(G, S, Q)`, making it clear that depth maps and layers are always defined over the same closure set used by `prov_closure_nodes`.

* Elevated the **“seeds are sets”** behavior from an informal remark to an explicit conformance requirement:

  * Kept the explanatory note in §0.3 that seeds are always treated as sets (duplicates ignored).
  * In §8 (conformance), added a normative bullet under parameter handling: implementations whose APIs accept seeds as lists or sequences MUST interpret them semantically as sets when computing provenance, so results depend only on the set of seeds, not their multiplicity or order.

* Minor editorial consistency cleanups:

  * Clarified in §3.6.2 that when the closure is empty, `ProvLayering.layers` is an empty list (while still forbidding empty individual layers).
  * Slightly rephrased invariant statements in §3.6.1, §4.3, and §4.7 to consistently mention both `Reachable_Q(G, S)` and `prov_closure_nodes(G, S, Q)`, reducing the chance of misreading them as distinct concepts.

These changes are **purely clarificatory** relative to v0.1.15: they do not alter the behavior of `prov_closure_nodes`, `prov_depths`, `prov_layers`, or `prov_trace` for any `(G, S, Q)`, but make the correspondence between the closure operator, its depth views, and the conformance obligations easier to see and reason about.

**0.1.15 — 2025-11-16**

* Clarified the behavior of **self-loops and repeated endpoints** in neighbor semantics:

  * In §2.2, made explicit that if a selected edge has the same node in both `from` and `to`, that node is its own neighbor under `FORWARD` and `BACKWARD`, and appears only once as a neighbor under `BOTH`.
  * Stated that multiple occurrences of the same node within `from` and/or `to` do not change neighbor sets, since neighbors are mathematical sets.

* Strengthened the **depth and layering explanations** around seeds:

  * In §3.6.1, highlighted that `dom(prov_depths)` is exactly the closure `Reachable_Q(G, S)` and that seeds always appear in the depth map at depth 0, even when they are not in `G.Nodes`.
  * In §3.6.2, reiterated that when `S` is non-empty, seeds always appear in the unique layer with `depth = 0`, making the layering behavior around seeds precise and easy to rely on.

* Minor editorial and consistency improvements:

  * Clarified in §0.3 that seeds are treated as **sets** of nodes (deduplicated), even if supplied as lists in concrete APIs.
  * Slightly tightened the explanation in §4.2–§4.3 that trace nodes are **exactly** seeds plus nodes appearing in the included edges, while reachable nodes are guaranteed to be a subset of that node set.
  * Removed a stray editorial fragment at the end of the 0.1.0 history entry, keeping the version history clean and self-contained.

These changes are **pure clarifications** over v0.1.14. They do not alter the semantics of `prov_closure_nodes`, `prov_depths`, `prov_layers`, or `prov_trace` for any `(G, S, Q)`; they simply make certain edge cases (self-loops, duplicated endpoints, and seed behavior in depth/layer views) more explicit for implementers and profile authors.

**0.1.14 — 2025-11-16**

* Made the **non-traversal role of payload** fully explicit and normative:

  * In §2.2 (“Payload nodes”), introduced **TGK/PROV-PAYLOAD-NONTRAV/1**, stating that kernel provenance traversal MUST NOT use `EdgeBody.payload` as a source of neighbors and that adjacency, depth, and closure are defined purely in terms of `from` and `to`.
  * Clarified that profiles wishing to traverse payload values MUST either represent those relationships via additional edges or define profile-specific traversal on top of this kernel, without altering the core neighbor functions.

* Gave the **trace monotonicity and idempotence** properties an explicit rule name:

  * In §4.6, collected the existing seed, depth-limit, and edge-type monotonicity properties, plus the seed-union and idempotence equalities, into **TGK/PROV-TRACE-MONO/1** for easier reference from higher-layer profiles.
  * Emphasized that `prov_trace` is monotone in seeds, depth limits, and effective edge types, and idempotent under reseeding with the node closure, mirroring the closure-operator law **TGK/PROV-CLOSURE-OP/1**.

* Tightened **conformance** around these invariants:

  * In §8, required implementations to enforce the payload non-traversal rule (**TGK/PROV-PAYLOAD-NONTRAV/1**) as part of parameter handling, and to ensure that trace outputs satisfy the monotonicity and idempotence properties summarized in **TGK/PROV-TRACE-MONO/1**.
  * These obligations are purely **expository**: they make explicit requirements that were already implied by the earlier definitions of `Neighbors_dir`, `Reachable_Q`, and `prov_trace`. No operational semantics changed compared to v0.1.13; the same graphs, closures, depths, and traces are computed for any given `(G, S, Q)`.

**0.1.13 — 2025-11-16**

* Made the **dependence on edge-type selection** explicitly monotone throughout the kernel:

  * In §2.1, clarified that `EffectiveTypeSet(filter, G)` not only determines which edges participate in traversal, but also induces a natural monotonicity: enlarging the effective edge-type set can only add neighbors, reachable nodes, and trace edges, never remove them.
  * In §2.2, noted that, for fixed `G`, `n`, and `direction`, `Neighbors_dir(G, n, filter)` is monotone in `EffectiveTypeSet(filter, G)` and always returns nodes in `G.Nodes`, making the endpoint set of neighbors explicit.
  * In §3.4, added an **edge-type filter monotonicity** property as item 5: when two queries differ only in `type_filter` and one effective type set is a subset of the other, the corresponding closures are nested (`Reachable_Q₁ ⊆ Reachable_Q₂`).
  * In §4.6, extended the trace monotonicity discussion with an explicit **edge-type filter monotonicity** rule for `prov_trace`: allowing more edge types produces a super-trace that contains all nodes and edges of the more restricted trace.

* Tightened **conformance** around filter behavior:

  * In §8, updated the parameter-handling and node-closure bullets to require that implementations respect the new edge-type monotonicity property (derived directly from the semantics), so that higher layers can safely depend on “more edge types ⇒ bigger or equal trace/closure” when composing provenance policies.

* All changes are **strictly definitional**: they give names and explicit statements to monotonicity properties already implied by the original `SelectedEdges`, `Neighbors_dir`, and `Reachable_Q` definitions. The operational behavior of `prov_closure_nodes`, `prov_depths`, `prov_layers`, and `prov_trace` for any concrete `G`, `S`, `Q` is unchanged compared to v0.1.12.

**0.1.12 — 2025-11-16**

* Captured seed behavior as an explicit, named invariant:

  * In §3.3, introduced **TGK/PROV-SEED-SEM/1**, summarizing the seed-related rules already implied by the closure definition:

    * seeds are always included in the closure, regardless of filters or depth limits;
    * an empty seed set yields an empty closure;
    * seeds not in `G.Nodes` are included in the closure but never expand the frontier or affect reachability inside `G.Nodes`.

  * Clarified that changes to `EdgeTypeFilter` or `DepthLimit` may add or remove non-seed reachable nodes, but MUST NOT remove seeds from the closure.

* Wired the new seed invariant into:

  * §5.1, so that the provenance kernel description explicitly references both closure-operator behavior and seed semantics; and
  * §8 (conformance), by requiring implementations to preserve the seed semantics in **TGK/PROV-SEED-SEM/1** alongside the existing closure, depth, layering, and trace invariants.

* Removed minor duplicated trailing text in the v0.1.0 history entry for editorial cleanliness, without changing any normative semantics.

**0.1.11 — 2025-11-16**

* Made the depth semantics more explicit and named the key invariants:

  * In §3.6.1, clarified that `prov_depths` always includes **all seeds**, even seeds not in `G.Nodes`, with depth 0, and that its domain is exactly `Reachable_Q(G, S)`.
  * Introduced **TGK/PROV-DEPTH-MAP/1** to capture the invariant that `prov_depths`:

    * has domain equal to the closure `Reachable_Q(G, S)`,
    * assigns seeds depth 0, and
    * assigns each node its minimal admissible-path hop-count depth.

* Tightened the layering definition and gave it a named invariant:

  * In §3.6.2, emphasized that `prov_layers` is a **strict partition** of `Reachable_Q(G, S)` into non-empty, disjoint depth-labeled layers whose union is exactly the reachable node set.
  * Introduced **TGK/PROV-LAYERING-PARTITION/1** to make this partition property easy to cite from higher-layer profiles.

* Updated §5.1–§5.3 and §8 to reference the new invariants, so conformance explicitly guarantees:

  * closure-operator behavior for `Reachable_Q` (**TGK/PROV-CLOSURE-OP/1**),
  * canonical minimal-depth semantics for `prov_depths` (**TGK/PROV-DEPTH-MAP/1**), and
  * partition semantics for `prov_layers` (**TGK/PROV-LAYERING-PARTITION/1**),
    without changing any substantive behavior compared to v0.1.10.

**0.1.10 — 2025-11-16**

* Introduced an explicit **closure-operator invariant** for node closure:

  * In §3.4, added **TGK/PROV-CLOSURE-OP/1**, stating that for fixed `G` and `Q` the map `S ↦ Reachable_Q(G, S)` is a Kuratowski-style closure operator on subsets of `Node` (extensive, monotone, idempotent).
  * This makes it easy for higher-level profiles to rely on algebraic properties of provenance closure when composing queries or policies.

* Gave the trace subgraph and node characterization explicit **normative rule names**:

  * In §4.3, collected the existing invariants into:

    * **TGK/PROV-TRACE-SUBGRAPH/1** (trace is a finite subgraph of `G` with no invented nodes or edges), and
    * **TGK/PROV-TRACE-NODES/1** (node characterization as `seeds ∪ EdgeNodes(edges)` plus reachable-node containment).

  * Wired these invariants into §5.3 and §8 so conformance is explicitly tied to them.

* Slightly tightened the prose around the relationships between `Reachable_Q`, `TraceGraph.nodes`, and the depth/layer views:

  * Clarified in §4.7 that `T.nodes \ dom(DMap)` is exactly `EdgeNodes(T.edges) \ Reachable_Q(G, S)`, i.e. the “extra” nodes introduced purely as endpoints or payload nodes of incident edges.
  * Emphasized in §5.1 that **TGK/PROV-TRACE-KERNEL/1** and **TGK/PROV-CLOSURE-OP/1** jointly guarantee a clean, graph-only kernel for higher-layer provenance semantics.

* All changes are **strictly definitional** and do not alter the behavior of `prov_closure_nodes`, `prov_depths`, `prov_layers`, or `prov_trace` compared to v0.1.9; they only assign names to already-implied properties and link them into conformance.

**0.1.9 — 2025-11-16**

* Tightened the **TraceGraph node-set characterization**:

  * Defined `TraceGraph.nodes` canonically as `seeds ∪ EdgeNodes(edges)` in §4.2, and made this equality explicit as an invariant in §4.3.
  * Noted that this is equivalent to the earlier `Reachable_Q(G, S) ∪ EdgeNodes(E)` view but expressed purely in terms of seeds and edges, which simplifies validation of persisted traces.

* Strengthened the link between **depths and edges**:

  * Clarified in §3.6.1 that any node in `G.Nodes` with depth > 0 must have at least one “parent along some shortest path” edge consistent with `Q.direction`.
  * Reflected this in conformance (§8) by requiring that any such node appear incident (via `from` / `to`) to at least one edge in the corresponding `TraceGraph`.

* Refined **depth-aware trace relationships**:

  * In §4.7, made explicit that `T.nodes \ dom(prov_depths)` is exactly `EdgeNodes(T.edges) \ Reachable_Q(G, S)`, i.e. the additional nodes introduced as endpoints or payload nodes of edges that lie beyond the depth bound or appear only via incident-based inclusion.

* Confirmed that all changes are **semantically compatible** with v0.1.8: the closure, depth, layering, and trace operators still compute the same results; the new material only sharpens equalities, invariants, and conformance obligations.

**0.1.8 — 2025-11-16**

* Made the **depth map** and **layering** views first-class:

  * Introduced `DepthMap`, `ProvLayer`, and `ProvLayering` types in §0.3.
  * Defined a normative `prov_depths(G, S, Q)` operator (§3.6.1) whose domain is exactly `Reachable_Q(G, S)` and whose values are the minimum hop-count depths already implicit in `depth_Q`.
  * Defined a normative `prov_layers(G, S, Q)` operator (§3.6.2) that partitions the reachable node set into non-empty depth-labeled layers, ordered by increasing depth.

* Extended the **kernel invariants** to cover depth views:

  * Updated `TGK/PROV-TRACE-KERNEL/1` (§5.1) and `TGK/PROV-DET/1` (§5.2) to include `prov_depths` and `prov_layers` alongside `prov_closure_nodes` and `prov_trace`.

* Clarified depth-aware trace usage:

  * Added §4.7 to describe how `TraceGraph`, `DepthMap`, and `ProvLayering` relate, and to note that nodes outside `dom(prov_depths)` are precisely those introduced as endpoints of incident edges beyond the depth bound or via incident-only inclusion.

* Updated the **conformance** section (§8) so that, when implementations choose to expose depth or layering, those views MUST be derived exactly from `depth_Q` as defined in §3.2–§3.3.

All changes were **purely definitional** and did not alter the semantics of `prov_closure_nodes` or `prov_trace` as defined in v0.1.7.

**0.1.7 — 2025-11-16**

* Extended the closure properties in §3.4 with an explicit **seed-union** law and **idempotence** of `Reachable_Q` when used as a seed set, making the algebraic behavior of the closure operator easier to reference from higher layers.
* Introduced a new §4.6 to collect the corresponding **TraceGraph monotonicity and idempotence** properties for `prov_trace`, showing how changes in seeds and depth limits affect both node and edge sets and how re-seeding with the closure leaves the trace unchanged.
* Kept all changes purely definitional and algebraic; no operational semantics were altered, and all behaviors remain compatible with v0.1.6.

**0.1.6 — 2025-11-16**

* Corrected the treatment of seeds that are not in `G.Nodes` (§3.3), making it explicit that such seeds cannot appear as endpoints of edges in `G.Edges` by the definition of `ProvenanceGraph.Nodes` in `TGK/1-CORE`, and therefore influence only the seed/closure node sets and not the traversed portion of the graph.
* Introduced an explicit definition of **admissible `Q`-paths** in §3.4 and rewrote the path characterization there to refer to these paths, tightening the connection between the inductive depth construction and the intuitive “shortest path length” interpretation.
* Performed minor editorial cleanups around the closure properties to keep the relationship between `depth_Q`, admissible paths, and BFS-style semantics precise and easy to reference from higher-level profiles.

**0.1.5 — 2025-11-16**

* Clarified `DepthLimit` semantics, making the `D = 0` “seeds-only” special case explicit and aligning the prose more clearly with the hop-count view already implied by the inductive depth definition.
* Tightened the description of how empty seed sets and empty `EffectiveTypeSet` interact with depth limits, ensuring the resulting closure behavior is fully spelled out and obviously consistent across cases.
* Added an explicit **finiteness** invariant for `TraceGraph`, noting that `prov_trace` always returns a finite result when `ProvenanceGraph` is finite and seed sets are finite in practice.
* Introduced an informative **fixpoint characterization** (§3.5) of unbounded closures as the least fixpoint of a monotone operator, to make reasoning about monotonicity and idempotence easier for higher-level profiles.
* Strengthened the `TGK/STORE/1` interaction discussion (§6) to state explicitly that `EdgeTypeFilter` here shares semantics with `TGK/STORE/1`’s filter, and to map `ProvDirection` to `GraphDirection`-style OUT/IN/BOTH neighbor semantics more concretely.

**0.1.4 — 2025-11-16**

* Made the behavior of `prov_closure_nodes` under special cases explicit and normative:

  * Seeds not in `G.Nodes` are always depth-0 but never generate neighbors; they appear only as seeds and closure nodes.
  * An empty seed set yields an empty closure for any `Q`.
  * If `EffectiveTypeSet(filter, G)` is empty, node closure is exactly the seed set.

* Re-expressed the trace edge set `E` in §4.2 directly as:

  ```text
  E = ⋃ { IncEdges(G, n, filter) | n ∈ N }
  ```

  and noted its equivalence to the previous comprehension, so that `SelectedEdges`, `IncEdges`, and the neighbor definitions form an obviously composable family of helpers.

* Performed minor editorial cleanups to keep the distinction between sets, lists, and ordering explicit, and to underline that any ordering concerns live in `TGK/STORE/1`, not in this provenance kernel.

**0.1.3 — 2025-11-16**

* Made the role of `SelectedEdges(G, filter)` and `IncEdges(G, n, filter)` explicit and reused them consistently between closure and trace definitions, to make the composition of definitions clearer and reduce repetition.
* Clarified the inductive definition of `depth_Q` so that seeds that are not in `G.Nodes` are explicitly acknowledged as depth-0 but never appear in any `Frontier_d`, while still being included in `Reachable_Q(G, S)` (§3.2–§3.3).
* Tightened language around set semantics (no ordering, no multiplicity) and made it explicit that any ordering concerns live in `TGK/STORE/1`, not this profile.
* Slightly sharpened the mapping between `prov_trace` and the derived helper `prov_closure_edges`, and made the use of `SelectedEdges` in §4.2 structurally parallel to the node-closure definitions.

**0.1.2 — 2025-11-16**

* Added an explicit **closure properties** subsection (§3.4) to capture:

  * seed monotonicity (more seeds → superset of reachable nodes),
  * depth-limit monotonicity (larger depth bound → superset of reachable nodes),
  * and the path characterization of `depth_Q` (BFS-style minimum path length).

* Introduced a small, purely definitional helper `prov_closure_edges(G, S, Q)` (§4.5) to make it easier for profiles to talk separately about edge-closure vs node-closure, while keeping it normatively equal to `prov_trace(...).edges`.

* Clarified in §2.2 / §3.2 that neighbor and frontier sets are always taken over `G.Nodes`, and that seeds outside `G.Nodes` never expand the frontier but remain part of the closure and in `TraceGraph.nodes`.

**0.1.1 — 2025-11-16**

* Clarified `TraceGraph` construction so that:

  * `TraceGraph.edges` is the set of selected-type edges incident (via `from`/`to`) to at least one node in the closure `Reachable_Q(G, S)`.
  * `TraceGraph.nodes` is the union of the closure nodes and all endpoint/payload nodes of those edges (`EdgeNodes(E)`), ensuring that all nodes incident to edges in the trace are present in `TraceGraph.nodes`.

* Fixed the node-set definition to guarantee the “node closure over edges” invariant:

  ```text
  { incident nodes of T.edges } ⊆ T.nodes
  ```

* Made the incident-based nature of edge inclusion explicit in §4.4, and noted that nodes adjacent to closure nodes via included edges may lie beyond the depth bound but are still part of `TraceGraph` as edge endpoints.

* Introduced helper notions `SelectedEdges(G, filter)`, `IncEdges(G, n, filter)`, and `EdgeNodes(E)` for clearer, reusable definitions.

* Tightened the description of seeds not in `G.Nodes`: they receive depth `0`, do not expand the traversal, but remain in the closure and in `TraceGraph.nodes`.

**0.1.0 — 2025-11-16**

* Initial draft of `TGK/PROV/1` as a kernel provenance semantics layer over `TGK/1-CORE`.

* Defined core parameter model:

  * `ProvDirection` (BACKWARD, FORWARD, BOTH),
  * `EdgeTypeFilter` (edge-type selection),
  * `DepthLimit` (hop-count bound).

* Introduced `prov_closure_nodes` as a deterministic node-set closure operator over a fixed `ProvenanceGraph` and query.

* Introduced `TraceGraph` and `prov_trace` as a subgraph projection induced by node closure, with:

  * incident-based edge inclusion for selected edge types, and
  * payload nodes included but not traversed.

* Aligned semantics explicitly with TGK kernel invariants:

  * `TGK/PROV-KERNEL/CORE/1`,
  * `TGK/DET/CORE/1`,
  * `TGK/NO-OFF-GRAPH-PROV/CORE/1`.

* Documented the relationship with `TGK/STORE/1` (graph store) as informative only, keeping `TGK/PROV/1` store-neutral and purely semantic.

---

## Document History

* **0.1.20 (2025-11-16):** Registered as Tier-1 spec and aligned to the Amduat 2.0 substrate baseline.