amduat/tier1/tgk-prov-1.md
2025-12-20 11:32:17 +01:00

78 KiB
Raw Permalink Blame History

TGK/PROV/1 — Provenance & Trace Semantics over TGK/1-CORE

Status: Approved Owner: Niklas Rydberg Version: 0.1.20 SoT: Yes Last Updated: 2025-11-16 Linked Phase Pack: N/A Tags: [traceability, evidence]

Heres iteration 22 of TGK/PROV/1, bumped to 0.1.20. Im happy to treat this as ready for approval (no further iterations needed from my side). After that youll find an updated SUBSTRATE/STACK-OVERVIEW and a suggested commit message.

© 2025 Niklas Rydberg.

License

Except where otherwise noted, this document (text and diagrams) is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId assignments, EdgeTypeId tables) are additionally made available under CC0 1.0 Universal (CC0) to enable unrestricted reuse in implementations and derivative specifications.

Code examples in this document are provided under the Apache License 2.0 unless explicitly stated otherwise. Test vectors, where present, are dedicated to the public domain under CC0 1.0.


Document ID: TGK/PROV/1 Layer: L1.8 — Provenance & trace semantics over TGK/1-CORE

Depends on (normative):

  • TGK/1-CORE v0.7.x — trace graph kernel: Node, EdgeBody, EdgeTypeId, ProvenanceGraph
  • (via TGK/1-CORE) ASL/1-CORE v0.4.x — value substrate (Artifact, Reference, TypeTag)

Informative references:

  • SUBSTRATE/STACK-OVERVIEW v0.3.x — layering and dependency discipline
  • TGK/STORE/1 v0.2.x — graph store & query semantics (adjacency queries over ProvenanceGraph)
  • ENC/TGK1-EDGE/1 v0.1.x — canonical encoding for TGK EdgeBody / EdgeArtifacts
  • PEL/TRACE-DAG/1 v0.1.x — DAG execution trace profile (example provenance source)
  • CIL/1, FER/1, FCT/1, OI/1 — profiles that define provenance-relevant edge types

Normativity note TGK/PROV/1 defines pure functions over ProvenanceGraph only. It MUST NOT assume any particular store, graph index, or query API. Implementations MAY realize these functions using TGK/STORE/1 or other substrates, but conformance is judged against the abstract semantics in this document.


0. Conventions

0.1 RFC 2119 terminology

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL are to be interpreted as described in RFC 2119.

0.2 Imported types and structures

From ASL/1-CORE and TGK/1-CORE:

Reference {
  hash_id: HashId
  digest:  OctetString
}

Node := Reference

EdgeTypeId = uint32

EdgeBody {
  type:    EdgeTypeId
  from:    Node[]
  to:      Node[]
  payload: Reference
}

ProvenanceGraph {
  Nodes: set<Node>
  Edges: set<(EdgeRef, EdgeBody)>
}

EdgeRef := Reference

From TGK/1-CORE (informative, restated):

  • ProvenanceGraph is a pure projection over:

    • a finite set of Artifacts, and
    • a configured set of TGK edge profiles and catalogs.
  • All edges are represented as EdgeArtifacts; EdgeRef is an ordinary ASL Reference.

Unless otherwise stated, sets are mathematical sets (no duplicates, no ordering). Lists are ordered and MAY contain duplicates.

0.3 Additional terminology (this document)

TGK/PROV/1 introduces the following logical types:

ProvDirection (u8) {
  BACKWARD = 1  // walk from 'to' nodes to 'from' nodes
  FORWARD  = 2  // walk from 'from' nodes to 'to' nodes
  BOTH     = 3  // union of BACKWARD and FORWARD
}

EdgeTypeFilter {
  // If empty, match all EdgeTypeId values present in the graph.
  types: list<EdgeTypeId>
}

DepthLimit = optional uint32   // hop-count limit; absent = unbounded

TraceGraph {
  seeds: set<Node>
  nodes: set<Node>
  edges: set<(EdgeRef, EdgeBody)>
}

Additional derived types used in this version:

DepthMap = map<Node, uint32>

ProvLayer {
  depth: uint32
  nodes: set<Node>
}

ProvLayering {
  layers: list<ProvLayer>  // sorted by increasing depth, no empty layers
}

We also use a shorthand for finite seed sets:

SeedSet = set<Node>   // always finite under TGK/PROV/1

A TraceGraph is a subgraph view over some ProvenanceGraph, annotated with the seed set from which it was computed. DepthMap and ProvLayering are depth-oriented views over the same closure. TGK/PROV/1 does not assign additional semantics to TraceGraph.seeds, depths, or layers by itself; higher-layer profiles (e.g. FER/1) may do so when encoding traces into Artifacts or presenting them to users.

Seeds are always treated as finite sets of nodes:

  • Semantically, a seed input is a SeedSet.
  • Even if a concrete API accepts seeds as a list or array, duplicates are ignored and order is irrelevant.

In all operators in this document, the seed set S is assumed to be a finite SeedSet. Concrete APIs MAY accept seeds in any finite representation, but semantically they are finite sets.

0.4 Snapshots & finiteness assumptions

TGK/PROV/1 always works relative to a single snapshot of an ExecutionEnvironment in the sense of TGK/1-CORE:

  • Each snapshot has a finite Artifact set and a fixed TGK profile set in effect.

  • The corresponding ProvenanceGraph G for that snapshot has:

    G.Nodes  -- finite
    G.Edges  -- finite
    

All semantics in this specification are evaluated with respect to such a finite ProvenanceGraph G and a finite SeedSet S:

  • All closures, traces, depth maps, and layerings are therefore finite by construction.
  • Any mention of “for a fixed ProvenanceGraph G” or “for a snapshot” implicitly assumes this finiteness, as guaranteed by TGK/1-CORE.

TGK/PROV/1 does not describe how snapshots evolve over time; it only speaks about the results of provenance operators within a given snapshot.


1. Purpose, Scope & Non-Goals

1.1 Purpose

TGK/PROV/1 defines a kernel provenance model over ProvenanceGraph:

  • A small set of provenance query parameters:

    • direction (ProvDirection),
    • edge-type selection (EdgeTypeFilter),
    • depth bound (DepthLimit).
  • Canonical closure operators over ProvenanceGraph:

    • prov_closure_nodes — node set reachable from seeds under a given policy;
    • prov_trace — a TraceGraph subgraph induced by that closure.
  • Canonical depth-oriented views:

    • prov_depths — a DepthMap assigning each reachable node a minimum hop-count depth;
    • prov_layers — a ProvLayering partitioning reachable nodes by depth.

These operators are specified as pure functions over a fixed ProvenanceGraph snapshot (§0.4) and a fixed parameter set, in line with TGK invariants:

  • TGK/PROV-KERNEL/CORE/1 — provenance as a projection over edges;
  • TGK/DET/CORE/1 — determinism for a fixed snapshot + profile set;
  • TGK/NO-OFF-GRAPH-PROV/CORE/1 — no hidden, non-Artifactual provenance.

For any ProvenanceGraph G, finite seed set S : SeedSet, and provenance parameters Q:

prov_closure_nodes(G, S, Q) -> set<Node>
prov_depths       (G, S, Q) -> DepthMap
prov_layers       (G, S, Q) -> ProvLayering
prov_trace        (G, S, Q) -> TraceGraph

are pure, deterministic functions of (G, S, Q).

1.2 Scope

TGK/PROV/1 covers:

  • The provenance parameter model (ProvDirection, EdgeTypeFilter, DepthLimit).

  • The semantics of:

    • directional neighbors and edge incidence,
    • backward / forward / bidirectional cones,
    • node-depth assignment and per-layer grouping,
    • the induced TraceGraph subgraph.
  • An explicit requirement that all provenance semantics:

    • operate solely over ProvenanceGraph, and
    • never introduce edges or nodes not already present in that graph.

This document is intentionally conservative. It specifies “kernel” provenance operators suitable as a foundation for:

  • execution provenance (e.g. PEL DAG traces),
  • certificate and evidence provenance (CIL/1, FER/1),
  • fact and overlay provenance (FCT/1, OI/1),

without baking those domains into the kernel.

1.3 Non-goals

TGK/PROV/1 explicitly does not define:

  • Store or graph store surfaces (ASL/1-STORE, TGK/STORE/1); it is purely semantic.
  • Provenance policies (e.g. “which edge types count as causal provenance vs annotations”).
  • A full query language (filters on payload content, fact schemas, certificate semantics).
  • Minimality or optimality criteria for traces (e.g. “smallest subgraph that explains X”).
  • How provenance results are encoded into Artifacts (e.g. receipts or reports) — those are FER/1 and related specifications.

Implementations and profiles MAY define richer operators (slicing, k-bounded paths, temporal constraints, etc.) on top of this kernel, but MUST describe them as projections over ProvenanceGraph.


2. Provenance Parameters

2.1 Edge type selection (EdgeTypeFilter)

EdgeTypeFilter constrains which edges participate in a provenance traversal.

EdgeTypeFilter {
  types: list<EdgeTypeId>
}

For a fixed ProvenanceGraph G, define the effective edge-type set:

EffectiveTypeSet(filter, G) =
  if filter.types is empty:
    { body.type | (ref, body) ∈ G.Edges }
  else:
    { t | t ∈ filter.types ∧ ∃ (ref, body) ∈ G.Edges, body.type = t }

Only edges whose EdgeBody.type belongs to EffectiveTypeSet(filter, G) are considered during traversal.

Notes:

  • Repeated EdgeTypeId values in filter.types have no effect.

  • If EffectiveTypeSet(filter, G) is empty, then no edges participate in traversal for that graph and filter; the resulting edge set is empty and the node closure consists solely of the seeds (see §3.3).

  • For any two filters filter₁, filter₂, if:

    EffectiveTypeSet(filter₁, G) ⊆ EffectiveTypeSet(filter₂, G)
    

    then all neighbor sets, closures, and traces computed with filter₁ are subsets of those computed with filter₂ for the same G, seeds, direction, and depth limit (see §2.2, §3.4.5, and §4.6).

TGK/PROV/1 does not prescribe which edge types “should” be included; higher-layer provenance profiles SHOULD specify concrete filters (e.g. “execution edges only”, “facts + receipts + certificates”).

When provenance is implemented over a TGK/STORE/1 graph store, this EdgeTypeFilter has the same logical shape and semantics as TGK/STORE/1s EdgeTypeFilter, with the understanding that G.Edges already reflects the environments supported edge types.

2.2 Direction semantics (ProvDirection)

ProvDirection determines how selected edges connect nodes during traversal.

ProvDirection (u8) {
  BACKWARD = 1
  FORWARD  = 2
  BOTH     = 3
}

For a given ProvenanceGraph G, direction dir, and edge-type filter filter, define:

  1. The selected edge set:

    SelectedEdges(G, filter) =
      { (ref, body) ∈ G.Edges | body.type ∈ EffectiveTypeSet(filter, G) }
    
  2. For a node n, the set of incident selected edges (via from/to only):

    IncEdges(G, n, filter) =
      { (ref, body) ∈ SelectedEdges(G, filter) |
          n ∈ body.from  body.to }
    
  3. The directional neighbor functions Neighbors_dir:

    • For dir = BACKWARD:

      Neighbors_BACKWARD(G, n, filter) =
        { m ∈ Node |
            ∃ (ref, body) ∈ SelectedEdges(G, filter),
              n ∈ body.to,
              m ∈ body.from
        }
      
    • For dir = FORWARD:

      Neighbors_FORWARD(G, n, filter) =
        { m ∈ Node |
            ∃ (ref, body) ∈ SelectedEdges(G, filter),
              n ∈ body.from,
              m ∈ body.to
        }
      
    • For dir = BOTH:

      Neighbors_BOTH(G, n, filter) =
        Neighbors_BACKWARD(G, n, filter)
         Neighbors_FORWARD(G, n, filter)
      

All neighbor sets are mathematical sets: multiple edges connecting the same pair of nodes do not create duplicate neighbor entries.

By the definition of ProvenanceGraph.Nodes in TGK/1-CORE, all endpoints of any edge in G.Edges lie in G.Nodes. Therefore, for all dir, n, and filter:

Neighbors_dir(G, n, filter) ⊆ G.Nodes

Self-loops and multi-endpoint edges behave naturally under these definitions:

  • If a selected edge has the same node in both from and to, then:

    • under FORWARD and BACKWARD that node is its own neighbor, and
    • under BOTH it still appears only once as a neighbor.
  • If a node appears multiple times in from and/or to for a given edge, this does not change neighbor sets: neighbors are deduplicated at the set level.

For fixed G, n, and dir, neighbor sets are monotone in the effective edge-type set: if filter₁ and filter₂ satisfy:

EffectiveTypeSet(filter₁, G) ⊆ EffectiveTypeSet(filter₂, G)

then:

Neighbors_dir(G, n, filter₁) ⊆ Neighbors_dir(G, n, filter₂)

These neighbor definitions are intentionally aligned with TGK/STORE/1s GraphDirection semantics when provenance traversals are implemented via graph-store adjacency:

  • BACKWARD corresponds to walking along edges where the current node appears in body.to, towards nodes in body.from (similar to GraphDirection.IN).
  • FORWARD corresponds to walking along edges where the current node appears in body.from, towards nodes in body.to (similar to GraphDirection.OUT).
  • BOTH is the union of the two.

Payload nodes

In this version:

  • payload is not used to advance the traversal frontier.
  • Payload nodes may appear in the resulting TraceGraph.nodes set (see §4.2§4.3), but Neighbors_* never step through payload.

TGK/PROV-PAYLOAD-NONTRAV/1 Kernel provenance traversal under TGK/PROV/1 MUST NOT use EdgeBody.payload as a source of neighbors. Only from and to participate in adjacency, depth assignment, and closure. Profiles that wish to treat payload values as traversable endpoints MUST either:

  • express those relationships via additional edges whose endpoints lie in from / to; or
  • define profile-specific traversal rules on top of this kernel, without redefining the core neighbor functions.

2.3 Depth limit (DepthLimit)

DepthLimit is an optional non-negative integer that bounds traversal length:

DepthLimit = optional uint32

Semantics:

  • Seeds are at depth 0.

  • Traversal proceeds in edge hops (as defined by Neighbors_dir).

  • For a limit D:

    • nodes at depth d with d < D MAY generate new neighbors at depth d+1;
    • nodes at depth d = D DO NOT generate new neighbors;
    • no node has depth > D.

If DepthLimit is absent, traversal is unbounded in depth (but still finite in its effect on G.Nodes because G is finite, see §0.4).

Special case:

  • If DepthLimit is present and D = 0, then no seed generates neighbors at all; the closure consists exactly of the seed set S (subject to the empty-seed rule in §3.3).

DepthLimit applies to the number of edges stepped along admissible directions, not to the number of nodes. If there are multiple paths from seeds to a node, its depth is the length of the shortest such path under the chosen direction and type filter.


3. Closure Semantics over ProvenanceGraph

This section defines prov_closure_nodes, the core node-set closure operator, together with the canonical depth and layering views built on the same construction.

3.1 Provenance query descriptor (logical)

For convenience, this document packages parameters into a logical ProvQuery type:

ProvQuery {
  direction:   ProvDirection
  type_filter: EdgeTypeFilter
  depth_limit: DepthLimit
}

Concrete APIs MAY represent queries differently (e.g. as separate arguments), but their observable behavior MUST be equivalent to using a ProvQuery as above.

Unless otherwise stated, in what follows:

  • G is a fixed ProvenanceGraph for some snapshot (finite, per §0.4),
  • S : SeedSet is a finite seed set S ⊆ Node (seeds need not be in G.Nodes), and
  • Q is a fixed ProvQuery.

3.2 Node-depth assignment

Fix:

  • a ProvenanceGraph G,
  • a finite seed set S : SeedSet, and
  • a ProvQuery Q.

Let:

  • dir = Q.direction,
  • filter = Q.type_filter,
  • D = Q.depth_limit (which may be absent).

Define a partial function:

depth_Q(G, S, n) : Node -> uint32 or undefined

as the minimum traversal depth of node n from seeds S under G and Q, via the following inductive construction.

  1. Base depth-0 assignment

    S_0 = S
    depth_Q(G, S, n) = 0   for all n ∈ S_0
    

    Seeds may or may not be in G.Nodes. Seeds that are not in G.Nodes:

    • still have depth_Q(G, S, n) = 0 defined, but
    • never appear in any Frontier_d set below (because Frontier_d ranges over G.Nodes only),
    • and therefore never generate neighbors.
  2. Iterative step

    For each integer d ≥ 0, define:

    Frontier_d   = { n ∈ G.Nodes | depth_Q(G, S, n) = d }
    
    NewNodes_d+1 =
      { m ∈ G.Nodes |
          ∃ n ∈ Frontier_d,
              m ∈ Neighbors_dir(G, n, filter),
              depth_Q(G, S, m) is undefined
      }
    
    • If D is present and d ≥ D, then NewNodes_d+1 = ∅ (no further expansion).

    • For each m ∈ NewNodes_d+1, set:

      depth_Q(G, S, m) = d + 1
      
  3. Termination

    Since G.Nodes is finite (§0.4) and each node in G.Nodes receives at most one depth assignment, this iterative construction over G.Nodes terminates in finitely many steps. Nodes for which depth_Q(G, S, n) remains undefined and which are not seeds are not reachable from S under (G, Q).

Depths for seeds in S are defined directly in step 1 and do not affect the termination of the frontier expansion over G.Nodes.

3.3 Node closure: prov_closure_nodes

Define the provenance node closure:

prov_closure_nodes(G, S, Q) -> set<Node>

as the set Reachable_Q(G, S) given by:

Reachable_Q(G, S) =
  S  { n ∈ G.Nodes | depth_Q(G, S, n) is defined }

By definition:

prov_closure_nodes(G, S, Q) = Reachable_Q(G, S)

This definition induces the following special cases and properties:

  • Seeds are always included

    For any G, finite S : SeedSet, and Q:

    S ⊆ Reachable_Q(G, S)
    

    regardless of whether seeds belong to G.Nodes.

  • Seeds not in G.Nodes

    If n ∈ S but n ∉ G.Nodes, then:

    • depth_Q(G, S, n) = 0,
    • n ∉ Frontier_d for all d, and therefore n never generates neighbors,
    • but n ∈ Reachable_Q(G, S) by definition.

    By the definition of G.Nodes in TGK/1-CORE, such seeds do not appear as endpoints of any edges in G.Edges. They therefore affect only the seed/closure node sets, not the traversed portion of the graph.

  • Empty seed set

    If S = ∅, then:

    • depth_Q(G, S, n) is undefined for all n ∈ G.Nodes, and
    • Reachable_Q(G, ∅) = ∅.

    In particular, with an empty seed set, node closure is empty and the trace has no nodes or edges (see §4.2).

  • Edge-type filter with no matching edges

    If EffectiveTypeSet(Q.type_filter, G) is empty, no edges of G are traversed, and:

    • depth_Q(G, S, n) is undefined for all n ∈ G.Nodes \ S,
    • Reachable_Q(G, S) = S.

    Combined with the empty-seed rule above, when S = ∅ and EffectiveTypeSet(Q.type_filter, G) is empty, the closure is empty.

  • Depth-limit special case

    If Q.depth_limit is present with D = 0, then no Frontier_d with d ≥ 0 ever expands, so:

    Reachable_Q(G, S) = S
    

    for all G and finite S. This is consistent with the empty-seed and empty-filter behaviors above.

Because both G.Nodes and S are finite (§0.4, §0.3), Reachable_Q(G, S) (and thus prov_closure_nodes(G, S, Q)) is always a finite set.

Properties:

  • If DepthLimit is absent, Reachable_Q(G, S) is the set of all nodes reachable from S by paths along edges in SelectedEdges(G, filter), respecting dir.

  • If DepthLimit = D is present, Reachable_Q(G, S) consists of:

    • all seeds; and
    • all nodes in G.Nodes reachable from S by paths of length ≤ D under dir and filter.

Collecting the seed-related aspects:

TGK/PROV-SEED-SEM/1 For any ProvenanceGraph G, finite seed set S : SeedSet, and query Q:

  • Seeds are always included in the closure:

    S ⊆ Reachable_Q(G, S) = prov_closure_nodes(G, S, Q)
    
  • If S = ∅, then Reachable_Q(G, S) = ∅ (no seeds ⇒ empty closure).

  • For any seed n ∈ S with n ∉ G.Nodes, n is in Reachable_Q(G, S) but never appears in any Frontier_d and never generates neighbors; such seeds do not affect which nodes in G.Nodes are reachable.

  • Changing Q.type_filter or Q.depth_limit may add or remove non-seed reachable nodes but MUST NOT remove seeds from the closure: for all Q and all Q' differing only in type_filter or depth_limit, S ⊆ Reachable_Q(G, S) and S ⊆ Reachable_Q'(G, S).

3.4 Closure properties and closure-operator invariant

For this section, define an admissible Q-path in G from s ∈ S to a node n to be a finite sequence:

<s = n₀, n₁, ..., n_k = n>    with k ≥ 0

such that for each i with 0 ≤ i < k there exists an edge (ref, body) ∈ SelectedEdges(G, filter) that connects n_i and n_{i+1} according to dir:

  • if dir = BACKWARD, then n_i ∈ body.to and n_{i+1} ∈ body.from;
  • if dir = FORWARD, then n_i ∈ body.from and n_{i+1} ∈ body.to;
  • if dir = BOTH, then either of the above holds.

The length of such a path is k (the number of edge hops). By the definition of Neighbors_dir, all intermediate nodes after the first step lie in G.Nodes.

For any fixed G:

  1. Seed monotonicity

    If S₁ ⊆ S₂ (both finite SeedSets), then for any Q:

    Reachable_Q(G, S₁) ⊆ Reachable_Q(G, S₂)
    
  2. Depth-limit monotonicity

    Let Q and Q' be identical except for DepthLimit:

    • If Q.depth_limit is absent and Q'.depth_limit = D is present, then:

      Reachable_Q'(G, S) ⊆ Reachable_Q(G, S)
      
    • If both are present and Q.depth_limit = D₁, Q'.depth_limit = D₂ with D₁ ≤ D₂, then:

      Reachable_Q(G, S) ⊆ Reachable_Q'(G, S)
      
  3. Path characterization

    For any node n:

    • If depth_Q(G, S, n) = d is defined, then there exists at least one admissible Q-path of length d from some s ∈ S to n.

    • If there exists an admissible Q-path of length k from some s ∈ S to n, then:

      depth_Q(G, S, n) is defined ∧ depth_Q(G, S, n) ≤ k
      
  4. Seed-union and idempotence

    • For any two finite seed sets S₁, S₂ and fixed Q:

      Reachable_Q(G, S₁  S₂)
        = Reachable_Q(G, S₁)  Reachable_Q(G, S₂)
      

      and, by induction, for any finite family {S_i}:

      Reachable_Q(G, _i S_i) = _i Reachable_Q(G, S_i)
      
    • For any finite S and fixed Q, closure is idempotent as a seed operator:

      S' = Reachable_Q(G, S)
      ⇒ Reachable_Q(G, S') = S'
      
  5. Edge-type filter monotonicity

    Let Q₁ and Q₂ be identical except for type_filter. If:

    EffectiveTypeSet(Q₁.type_filter, G) ⊆ EffectiveTypeSet(Q₂.type_filter, G)
    

    then for any finite seed set S : SeedSet:

    Reachable_Q₁(G, S) ⊆ Reachable_Q₂(G, S)
    

    That is, enlarging the set of admissible edge types (while keeping direction and depth_limit fixed) can only increase, never decrease, the closure.

Taken together, these properties mean:

TGK/PROV-CLOSURE-OP/1 For any fixed ProvenanceGraph G and ProvQuery Q, the mapping:

Cl_Q : P_f(Node) -> P_f(Node)
Cl_Q(S) = Reachable_Q(G, S) = prov_closure_nodes(G, S, Q)

where P_f(Node) denotes finite subsets of Node, is a closure operator on finite seed sets (extensive, monotone, and idempotent). Higher-layer provenance profiles MAY rely on these algebraic properties when reasoning about seed sets and composed provenance queries.

3.5 Alternative fixpoint characterization (informative)

For unbounded queries (i.e. Q.depth_limit is absent), prov_closure_nodes can be described equivalently as the least fixpoint of a monotone operator on node sets.

Fix G, finite S : SeedSet, Q with Q.depth_limit = absent and let dir = Q.direction, filter = Q.type_filter. Define an operator:

F_Q(X) =
  S  { m ∈ G.Nodes |
          ∃ n ∈ (X ∩ G.Nodes),
            m ∈ Neighbors_dir(G, n, filter)
      }

Then:

  • F_Q is monotone on the powerset of Node (ordered by ⊆).

  • The sequence X₀ = S, X₁ = F_Q(X₀), X₂ = F_Q(X₁), … stabilizes after finitely many steps on G.Nodes because G.Nodes is finite (§0.4).

  • Its least fixpoint lfp(F_Q) satisfies:

    lfp(F_Q) = Reachable_Q(G, S) = prov_closure_nodes(G, S, Q)
    

for unbounded queries.

For bounded queries (with DepthLimit present), §3.2s depth-based construction is normative. The fixpoint view can still be recovered by intersecting the unbounded closure with the nodes whose minimum depth is ≤ D, but TGK/PROV/1 does not rely on that reformulation.

This characterization is provided to make properties like monotonicity and idempotence easy to reason about when designing higher-level provenance policies.

3.6 Depth maps and layering

The depth_Q construction can be surfaced to callers as a canonical depth map and layering over the closure.

3.6.1 Operation: prov_depths

Define:

prov_depths(
  G: ProvenanceGraph,
  S: SeedSet,     // finite
  Q: ProvQuery
) -> DepthMap

semantically as:

prov_depths(G, S, Q) =
  { (n, d) |
      depth_Q(G, S, n) = d   // d is defined by §3.2
  }

Viewed as a map, its domain is exactly:

dom(prov_depths(G, S, Q)) = Reachable_Q(G, S) = prov_closure_nodes(G, S, Q)

because:

  • every seed n ∈ S has depth_Q(G, S, n) = 0 by definition (even if n ∉ G.Nodes), and
  • all nodes in G.Nodes with a defined depth also belong to Reachable_Q(G, S).

Properties:

  • For any n with n ∈ Reachable_Q(G, S), prov_depths(G, S, Q)[n] is the minimum hop-count depth from some s ∈ S to n under G and Q.
  • For any n with n ∉ Reachable_Q(G, S), prov_depths(G, S, Q) has no entry for n.
  • For every seed n ∈ S, prov_depths(G, S, Q)[n] = 0. Seeds therefore always appear in the depth map at depth 0 whenever they are present in the seed set.

Because Reachable_Q(G, S) is finite for finite S (§3.3), prov_depths(G, S, Q) is always a finite map.

In particular, for any node n with n ∈ G.Nodes and prov_depths(G, S, Q)[n] > 0, there exists at least one selected edge (ref, body) ∈ SelectedEdges(G, filter) and some node m with:

prov_depths(G, S, Q)[m] = prov_depths(G, S, Q)[n] - 1

such that m and n are connected by that edge according to Q.direction. This is the “parent along some shortest path” witness implied by §3.4.

Implementations MAY choose any concrete representation for DepthMap (e.g. associative map, list of (Node, depth) pairs) as long as it is observable-equivalent to the mapping above.

prov_depths is a derived view: it does not change closure semantics; it simply exposes the depth information already used to define Reachable_Q(G, S).

Collecting these facts:

TGK/PROV-DEPTH-MAP/1 For any ProvenanceGraph G, finite seed set S : SeedSet, and query Q:

  • dom(prov_depths(G, S, Q)) = Reachable_Q(G, S) = prov_closure_nodes(G, S, Q);
  • every seed n ∈ S has prov_depths(G, S, Q)[n] = 0; and
  • for every n in the domain, prov_depths(G, S, Q)[n] is the minimum hop-count length of any admissible Q-path from some seed in S to n.

Higher-layer profiles MAY rely on this depth map as a canonical notion of “distance from seeds” under (G, Q).

3.6.2 Operation: prov_layers

Define:

prov_layers(
  G: ProvenanceGraph,
  S: SeedSet,     // finite
  Q: ProvQuery
) -> ProvLayering

semantically in terms of prov_depths:

  1. Let DMap = prov_depths(G, S, Q).

  2. Let:

    Depths_Q(G, S) = { d | ∃ n, (n, d) ∈ DMap }
    

    i.e., the set of depths actually realized by reachable nodes.

  3. For each d ∈ Depths_Q(G, S), define a layer:

    Layer(d) = ProvLayer {
      depth = d
      nodes = { n | (n, d) ∈ DMap }
    }
    
  4. Define ProvLayering.layers to be the list:

    layers = [ Layer(d₀), Layer(d₁), ..., Layer(d_k) ]
    

    where {d₀, ..., d_k} = Depths_Q(G, S) and d₀ < d₁ < ... < d_k (strictly increasing). If Depths_Q(G, S) is empty (i.e., the closure is empty), then layers is the empty list.

Requirements:

  • Each Layer(d).nodes MUST be non-empty by construction.

  • The union over all Layer(d).nodes MUST equal Reachable_Q(G, S):

    _{d ∈ Depths_Q(G, S)} Layer(d).nodes
      = Reachable_Q(G, S)
    
  • Distinct layers have disjoint node sets:

    d₁ ≠ d₂ ⇒ Layer(d₁).nodes ∩ Layer(d₂).nodes = ∅
    

Seeds therefore always appear in the unique layer with depth = 0 whenever S is non-empty.

Thus, prov_layers provides a canonical partition of the reachable node set by depth, with explicit depth labels and a deterministic ascending ordering of layers. In particular, if S is non-empty, seeds always appear in the unique layer with depth = 0.

ProvLayering is often convenient for:

  • rendering trace frontiers stepwise (“distance from seed”), and
  • defining layer-based transformations (e.g., trimming nodes beyond a given depth, or summarizing all nodes at a depth into a single fact).

As with prov_depths, prov_layers is a pure derived view that does not change any closure or trace semantics elsewhere in this specification.

TGK/PROV-LAYERING-PARTITION/1 For any G, finite S : SeedSet, and Q, the result of prov_layers(G, S, Q) partitions Reachable_Q(G, S) into non-empty, pairwise-disjoint depth-labeled layers whose union is exactly Reachable_Q(G, S), with the layer depths strictly increasing.


4. TraceGraph Semantics

prov_trace lifts node closure to a concrete subgraph view.

4.1 TraceGraph structure

Recall:

TraceGraph {
  seeds: set<Node>
  nodes: set<Node>
  edges: set<(EdgeRef, EdgeBody)>
}

A TraceGraph T is always interpreted relative to some base ProvenanceGraph G. This spec does not encode the association T ↔ G; profiles that persist traces MUST make that association explicit (e.g. via receipts in FER/1).

For convenience, define for any edge set E:

EdgeNodes(E) =
  { n ∈ Node |
      ∃ (ref, body) ∈ E,
        n ∈ body.from  body.to  { body.payload }
  }

4.2 Operation: prov_trace

Logical signature:

prov_trace(
  G: ProvenanceGraph,
  S: SeedSet,     // finite
  Q: ProvQuery
) -> TraceGraph

Semantics:

  1. Compute node closure

    N = prov_closure_nodes(G, S, Q)   // i.e., N = Reachable_Q(G, S)
    
  2. Define the trace edge set

    Let filter = Q.type_filter. Define:

    E =
       { IncEdges(G, n, filter) | n ∈ N }
    

    That is, E is the union, over all closure nodes, of the selected-type edges incident (via from or to) to those nodes.

    Equivalently (and as a direct set comprehension):

    E =
      { (ref, body) ∈ SelectedEdges(G, filter) |
          (body.from ∩ N ≠ ∅)  (body.to ∩ N ≠ ∅)
      }
    

    Notes:

    • payload does not control inclusion; edges are included based on from / to.
    • Edges whose type is not in EffectiveTypeSet(filter, G) are excluded, even if incident to nodes in N.
  3. Define the trace node set

    The trace node set is:

    Nodes_T = S  EdgeNodes(E)
    

    i.e., seeds plus all nodes that appear in from, to, or payload for edges in E.

    By construction and by the properties of depth_Q:

    • Reachable_Q(G, S) ⊆ Nodes_T (see §4.3); and
    • for any n ∈ Reachable_Q(G, S) ∩ G.Nodes with n ∉ S, there exists at least one edge in E with n in body.from or body.to, so n ∈ EdgeNodes(E).

    Thus this definition is equivalent to the alternative view:

    Nodes_T = Reachable_Q(G, S)  EdgeNodes(E)
    

    but is stated purely in terms of seeds and edges.

  4. Construct the TraceGraph

    TraceGraph T {
      seeds = S
      nodes = Nodes_T
      edges = E
    }
    

4.3 Invariants

For any G, finite S : SeedSet, Q, let T = prov_trace(G, S, Q). Then:

  1. Seed inclusion

    T.seeds = S
    S ⊆ T.nodes
    
  2. Edge subset

    T.edges ⊆ G.Edges
    

    No edges are introduced that are not in the base ProvenanceGraph.

  3. Node closure over edges

    All nodes incident to edges in T.edges are in T.nodes by construction:

    { n ∈ Node |
        ∃ (ref, body) ∈ T.edges,
            n ∈ body.from  body.to  { body.payload }
    } ⊆ T.nodes
    
  4. Trace node characterization

    Trace nodes are exactly seeds plus edge nodes:

    T.nodes = S  EdgeNodes(T.edges)
    

    In particular, every node in T.nodes that is not a seed appears at least once as an endpoint or payload of some edge in T.edges.

  5. Reachable-node containment

    Reachable_Q(G, S) = prov_closure_nodes(G, S, Q) ⊆ T.nodes
    

    Seeds and all nodes reachable under (G, Q) are present in T.nodes.

    For any n ∈ G.Nodes with n ∈ Reachable_Q(G, S) and n ∉ S, there exists at least one edge (ref, body) ∈ T.edges with n ∈ body.from body.to. Nodes that are only seeds (possibly not in G.Nodes) MAY have degree zero in T.

  6. No off-graph provenance

    • All edges in T.edges are drawn from G.Edges.
    • All nodes in T.nodes are seeds or nodes that appear somewhere in the EdgeBody of some edge in G.Edges.

    The operator does not invent nodes or edges that are not present in the underlying ProvenanceGraph.

  7. Finiteness

    Because G is finite (§0.4) and S is a finite SeedSet (§0.3), T is always finite:

    |T.edges| ≤ |G.Edges|
    |T.nodes| ≤ |G.Nodes| + |S|
    

    In particular, for any finite seed set S, prov_trace cannot diverge or yield an infinite trace graph for a finite base ProvenanceGraph.

Collecting 26:

TGK/PROV-TRACE-SUBGRAPH/1 For any ProvenanceGraph G, finite seed set S : SeedSet, and query Q, prov_trace(G, S, Q) yields a finite TraceGraph T such that:

  • T.edges ⊆ G.Edges, and
  • every node in T.nodes is either a seed or incident (as from, to, or payload) to some edge in T.edges.

No nodes or edges outside G appear in T.

And more specifically:

TGK/PROV-TRACE-NODES/1 For any G, finite S : SeedSet, Q with T = prov_trace(G, S, Q):

  • T.seeds = S;
  • T.nodes = S EdgeNodes(T.edges); and
  • Reachable_Q(G, S) = prov_closure_nodes(G, S, Q) ⊆ T.nodes.

In particular, all reachable nodes from S under (G, Q) are contained in T.nodes, but the trace may additionally contain nodes introduced as endpoints or payload nodes of incident edges.

4.4 Direction and depth impact

  • direction and depth_limit affect T.nodes and T.edges indirectly via N = Reachable_Q(G, S):

    • Looser direction or larger depth typically produce larger N, and thus larger T.edges and T.nodes.
    • Stricter direction or smaller depth produce smaller N, and thus smaller T.edges and T.nodes.
  • Edge inclusion is incident-based, not path-based:

    • An edge is in T.edges if and only if it is:

      • of a selected type; and
      • incident (via from or to) to at least one node in N.
    • There is no additional requirement that the edge itself lie along a shortest path. Nodes adjacent to closure nodes via such edges MAY lie at a graph-theoretic distance greater than the depth limit; they are still included in T.nodes because they are endpoints or payload nodes of included edges.

Profiles that need path-structured traces (e.g., minimal explanation trees, trees restricted strictly to nodes within a depth bound) MUST define those as further projections of TraceGraph (e.g., pruning edges and nodes that do not belong to any admissible path), without modifying the kernel semantics here.

4.5 Derived edge-closure helper (informative)

For completeness, it is sometimes convenient to speak about the edge closure associated with (G, S, Q):

prov_closure_edges(G, S, Q) =
  { (ref, body) ∈ SelectedEdges(G, Q.type_filter) |
      (body.from ∩ Reachable_Q(G, S) ≠ ∅) 
      (body.to   ∩ Reachable_Q(G, S) ≠ ∅)
  }

By inspection, for any G, finite S : SeedSet, Q:

prov_closure_edges(G, S, Q) = prov_trace(G, S, Q).edges

This helper is purely definitional and does not add semantics beyond §4.2; it is provided for profiles that wish to talk separately about edge-closure vs node-closure.

4.6 TraceGraph closure and monotonicity (informative)

Using the properties of Reachable_Q in §3.4, prov_trace enjoys corresponding algebraic properties. For any fixed G:

  1. Seed monotonicity

    If S₁ ⊆ S₂ (finite SeedSets), then for any Q:

    prov_trace(G, S₁, Q).nodes ⊆ prov_trace(G, S₂, Q).nodes
    prov_trace(G, S₁, Q).edges ⊆ prov_trace(G, S₂, Q).edges
    
  2. Depth-limit monotonicity

    For Q and Q' identical except for DepthLimit, with Q.depth_limit absent and Q'.depth_limit present, or with Q.depth_limit ≤ Q'.depth_limit when both are present:

    prov_trace(G, S, Q).nodes ⊆ prov_trace(G, S, Q').nodes
    prov_trace(G, S, Q).edges ⊆ prov_trace(G, S, Q').edges
    
  3. Edge-type filter monotonicity

    For Q₁ and Q₂ identical except for type_filter, if:

    EffectiveTypeSet(Q₁.type_filter, G) ⊆ EffectiveTypeSet(Q₂.type_filter, G)
    

    then for any finite seed set S : SeedSet:

    prov_trace(G, S, Q₁).nodes ⊆ prov_trace(G, S, Q₂).nodes
    prov_trace(G, S, Q₁).edges ⊆ prov_trace(G, S, Q₂).edges
    

    That is, allowing more edge types (while keeping direction and depth_limit fixed) produces a super-trace that contains all nodes and edges of the more restricted trace.

  4. Seed-union and idempotence

    • For any finite seed sets S₁, S₂ and fixed Q:

      prov_trace(G, S₁  S₂, Q).nodes
        = prov_trace(G, S₁, Q).nodes  prov_trace(G, S₂, Q).nodes
      
      prov_trace(G, S₁  S₂, Q).edges
        = prov_trace(G, S₁, Q).edges  prov_trace(G, S₂, Q).edges
      
    • For any finite S and fixed Q, using S' = Reachable_Q(G, S):

      prov_trace(G, S,  Q).nodes = prov_trace(G, S', Q).nodes
      prov_trace(G, S,  Q).edges = prov_trace(G, S', Q).edges
      

      I.e., using the closure as the new seed set does not change the trace.

These properties are consequences of the definitions in §3 and §4 and are intended to support reasoning about compositional provenance operators in higher-level profiles.

TGK/PROV-TRACE-MONO/1 For any fixed ProvenanceGraph G and query Q, the operator:

T_Q(G, S) = prov_trace(G, S, Q)

is:

  • monotone in finite seed sets (S₁ ⊆ S₂ ⇒ T_Q(G, S₁) ⊆ T_Q(G, S₂) on both nodes and edges),
  • monotone in depth limits (larger or absent DepthLimit yields a super-trace), and
  • monotone in effective edge types (larger EffectiveTypeSet yields a super-trace),

and satisfies seed-union and idempotence with respect to node closure as in the equations above. Higher-layer profiles MAY rely on these properties when comparing or composing traces.

4.7 Depth-aware trace views (informative)

While TraceGraph itself does not carry depth annotations, implementations often need to present or persist depth information alongside traces (e.g., for layered UIs or explanation trees).

For any G, finite S : SeedSet, Q, it is always valid to pair:

T      = prov_trace (G, S, Q)
DMap   = prov_depths(G, S, Q)
Layers = prov_layers(G, S, Q)

with the following relationships:

  • dom(DMap) = Reachable_Q(G, S) = prov_closure_nodes(G, S, Q) ⊆ T.nodes.

  • For any (n, d) ∈ DMap, node n is in the ProvLayer with depth = d.

  • Nodes in T.nodes \ dom(DMap) (if any) are exactly the additional nodes introduced as endpoints or payload nodes of edges in T.edges that lie beyond the depth bound or are adjacent via incident-only inclusion (§4.4). Equivalently:

    T.nodes \ dom(DMap) = EdgeNodes(T.edges) \ Reachable_Q(G, S)
    

Higher-layer profiles MAY define depth-annotated trace encodings that:

  • embed DMap or Layers into Artifacts; or
  • derive narrative or visual structures from ProvLayering.

Such profiles MUST treat prov_depths and prov_layers as derived views of the same kernel closure — they MUST NOT redefine depths in ways that contradict depth_Q (§3.2).


5. Relationship to TGK/1-CORE Invariants

TGK/PROV/1 is designed to be consistent with, and to operationalize, the core TGK invariants.

5.1 Provenance kernel invariant

Recall from TGK/1-CORE:

TGK/PROV-KERNEL/CORE/1 For any snapshot (finite Artifact set + profile set), the corresponding ProvenanceGraph is a pure function of that input, and any graph indexes or materialized views are optimizations only.

TGK/PROV/1 extends this to traces and depth views:

TGK/PROV-TRACE-KERNEL/1 For any fixed ProvenanceGraph G, finite seed set S : SeedSet, and query Q, all of:

  • prov_closure_nodes(G, S, Q),
  • prov_depths (G, S, Q),
  • prov_layers (G, S, Q), and
  • prov_trace (G, S, Q)

are pure functions of (G, S, Q). Any stored or cached trace or depth views are optimizations only and MUST be consistent with these definitions.

Together with TGK/PROV-CLOSURE-OP/1, TGK/PROV-SEED-SEM/1, and TGK/PROV-DEPTH-MAP/1, this guarantees that higher-level provenance policies can treat Reachable_Q and the associated depth map as a well-behaved closure-and-distance structure over finite seed sets.

5.2 Determinism

From TGK/1-CORE:

TGK/DET/CORE/1 For a fixed snapshot + profile set, any two implementations derive isomorphic ProvenanceGraphs.

TGK/PROV/1 inherits and refines this:

TGK/PROV-DET/1 For any two TGK/1-COREconformant implementations that derive the same ProvenanceGraph G for a snapshot, and for any fixed finite S : SeedSet and Q, the results of:

  • prov_closure_nodes(G, S, Q),
  • prov_depths (G, S, Q),
  • prov_layers (G, S, Q), and
  • prov_trace (G, S, Q)

MUST be identical (same node set, same edge set, identical depth assignments, and identical per-depth layer partition) across implementations.

5.3 No off-graph provenance

From TGK/1-CORE:

TGK/NO-OFF-GRAPH-PROV/CORE/1 Provenance-relevant relationships must be expressible as edges over References; no hidden mutable state.

TGK/PROV/1 enforces:

  • All input is ProvenanceGraph G.
  • All output edges are a subset of G.Edges (by TGK/PROV-TRACE-SUBGRAPH/1).
  • All output nodes are seeds or nodes that appear somewhere in the EdgeBody of some edge in G.Edges (by TGK/PROV-TRACE-NODES/1).
  • Depths and layers are derived solely from the adjacency structure of G.Edges and the finite seed set S, consistent with TGK/PROV-DEPTH-MAP/1 and TGK/PROV-LAYERING-PARTITION/1.
  • Seed behavior in closure is governed by TGK/PROV-SEED-SEM/1.
  • Traversal uses only from and to endpoints for adjacency, not payload, as required by TGK/PROV-PAYLOAD-NONTRAV/1.

Traversal and depth assignment never consult stores, clocks, external indexes, or non-graph metadata.


6. Interaction with TGK/STORE/1 (Informative)

Although TGK/PROV/1 is in terms of ProvenanceGraph only, most implementations will obtain graph structure via a TGK/STORE/1 GraphStoreInstance.

For a fixed snapshot:

  • G.Edges corresponds to the set of all GraphEdgeView { edge_ref, body } reachable via:

    • scan_edges (if implemented), or
    • adjacency queries plus resolve_edge.
  • EdgeTypeFilter here corresponds directly to TGK/STORE/1s EdgeTypeFilter:

    • an empty types list means “all supported edge types actually present in the graph”;
    • a non-empty types list is intersected with the instances supported edge types, and then with the set of types that actually appear in edges.
  • Neighbors_BACKWARD, Neighbors_FORWARD, and Neighbors_BOTH can be implemented via combinations of:

    • edges_from(node, type_filter) and
    • edges_to(node, type_filter),

    with type_filter derived from Q.type_filter and direction mapped to OUT/IN/BOTH as appropriate:

    • BACKWARD ≈ edges_to (IN) followed by neighbors in from,
    • FORWARD ≈ edges_from (OUT) followed by neighbors in to,
    • BOTH ≈ union of the two.

Depth-based operators (prov_depths, prov_layers) can be implemented atop the same adjacency operations:

  • Traversal order and depth assignments must follow the semantics of depth_Q (§3.2), independent of the order in which the graph store returns edges.
  • Any paging or batching at the TGK/STORE/1 layer MUST NOT alter the resulting depth map or closure.

An implementation that uses TGK/STORE/1 MUST ensure that its traversals:

  • respect the identity and projection semantics of TGK/STORE/1; and
  • produce node and edge sets, along with depth and layer assignments, that match prov_closure_nodes, prov_depths, prov_layers, and prov_trace exactly for some snapshot.

Nothing in TGK/PROV/1 requires or forbids specific paging, caching, or distribution strategies in the graph store; it only constrains the observable results.


7. Examples (Informative Sketches)

These examples are illustrative only and non-normative.

7.1 Backward execution provenance

Assume a profile TGK/PEL/1 defines:

  • EdgeTypeId = EDGE_EXECUTION.

  • For each execution:

    EdgeBody.type    = EDGE_EXECUTION
    EdgeBody.from    = [program_ref]  input_refs
    EdgeBody.to      = output_refs  [execution_result_ref]
    EdgeBody.payload = execution_result_ref
    

Let:

  • G be a ProvenanceGraph that includes such edges.

  • S = { o } where o is a particular output ArtifactRef.

  • Q with:

    • direction = BACKWARD
    • type_filter = { EDGE_EXECUTION }
    • depth_limit = absent (unbounded)

Then:

  • prov_closure_nodes(G, {o}, Q) contains:

    • o at depth 0,
    • the execution result and input/program nodes at depth 1,
    • and recursively all ancestors reachable via EDGE_EXECUTION edges.
  • prov_depths(G, {o}, Q) yields a map where, for example:

    • prov_depths(...)[o] = 0,
    • prov_depths(...)[execution_result] = 1,
    • prov_depths(...)[input_i] = 1,
    • etc., with deeper ancestors at depth 2, 3, and so on.
  • prov_layers(G, {o}, Q) partitions these nodes into depth-based layers.

  • prov_trace(G, {o}, Q) includes:

    • all EDGE_EXECUTION edges incident (via from or to) to any node in this closure, and
    • all endpoint and payload nodes of those edges.

Higher layers may then further project TraceGraph and associated layers into an explanation tree, or annotate it with timings, certs, or UI overlays.

7.2 Mixed execution + evidence provenance

Suppose a profile TGK/FER/1 defines:

  • EDGE_RECEIPT_SUPPORTS edges from receipt Artifacts to Artifacts supported by the receipts.

A domain profile might choose:

  • a type_filter that includes both EDGE_EXECUTION and EDGE_RECEIPT_SUPPORTS;
  • direction = BACKWARD, to follow from outputs back to inputs and receipts.

TGK/PROV/1 does not need to know what “execution” or “receipt” means; it only knows that these are edge types that define how to walk the graph, how to compute reachable nodes, and how to assign depths and layers.


8. Conformance

An implementation is TGK/PROV/1conformant if, for each provenance operator it claims to implement, it satisfies all of the following:

  1. Graph-based semantics

    • Treats ProvenanceGraph as the sole graph abstraction.
    • Does not introduce any off-graph nodes or edges into TraceGraph results or into any depth or layering views, consistent with TGK/NO-OFF-GRAPH-PROV/CORE/1, TGK/PROV-TRACE-SUBGRAPH/1, and TGK/PROV-TRACE-NODES/1.
  2. Parameter handling

    • Interprets EdgeTypeFilter as in §2.1 (including the empty-list “all types present in G” rule, and the monotonicity rule when effective edge-type sets are nested).

    • Implements ProvDirection semantics exactly as in §2.2, including the distinction between BACKWARD, FORWARD, and BOTH, and the restriction that traversal uses from / to only (not payload), as required by TGK/PROV-PAYLOAD-NONTRAV/1.

    • Interprets DepthLimit as a hop-count limit as in §2.3 (no expansions from nodes at depth D, with the D = 0 seeds-only behavior).

    • Treats seeds as finite sets of nodes:

      • All semantics in this document are defined on a finite SeedSet S, and conformance is judged relative to that finite set.
      • If an implementation exposes APIs that accept seeds as lists or sequences, it MUST interpret them semantically as finite sets when computing provenance (ignoring duplicates and not depending on order).
      • Implementations MUST NOT attempt to interpret or emulate infinite seed families when invoking TGK/PROV/1 operators.
  3. Node closure

    • For any G, finite S : SeedSet, Q, computes prov_closure_nodes(G, S, Q) as in §3.3, i.e.:

      prov_closure_nodes(G, S, Q) = Reachable_Q(G, S)
      
    • May use any traversal algorithm (e.g., BFS, DFS with memoization) but MUST produce exactly the Reachable_Q(G, S) set.

    • Ensures that, for fixed G and Q, the mapping S ↦ prov_closure_nodes(G, S, Q) over finite seed sets behaves as a closure operator (extensive, monotone, idempotent), as stated in TGK/PROV-CLOSURE-OP/1.

    • Preserves the seed semantics in TGK/PROV-SEED-SEM/1:

      • seeds are always included in the closure,
      • empty seed sets produce empty closures,
      • seeds not in G.Nodes are in the closure but never expand the frontier.
    • Respects the edge-type filter monotonicity of §3.4.5: when two queries differ only in type_filter with nested effective type sets, the closure for the narrower edge set is a subset of the closure for the broader set.

  4. Depth maps and layering

    • If it exposes prov_depths, it MUST:

      • compute depth_Q(G, S, n) as in §3.2, and
      • return a DepthMap exactly equal to the mapping in §3.6.1 (domain = Reachable_Q(G, S) = prov_closure_nodes(G, S, Q), seeds at depth 0, values = minimum hop-count depth), consistent with TGK/PROV-DEPTH-MAP/1.
    • If it exposes prov_layers, it MUST:

      • derive ProvLayering from prov_depths as in §3.6.2,
      • ensure that layers are ordered by strictly increasing depth and that nodes are partitioned by depth without overlap, consistent with TGK/PROV-LAYERING-PARTITION/1.
  5. TraceGraph construction

    • For any G, finite S : SeedSet, Q, computes prov_trace(G, S, Q) as in §4.2.

    • Ensures all invariants in §4.3 hold, including:

      • T.edges ⊆ G.Edges,
      • T.seeds = S,
      • T.nodes = S EdgeNodes(T.edges),
      • Reachable_Q(G, S) = prov_closure_nodes(G, S, Q) ⊆ T.nodes.
    • For any node n with n ∈ G.Nodes, n ∈ Reachable_Q(G, S), and prov_depths(G, S, Q)[n] > 0, ensures that there exists at least one edge (ref, body) ∈ T.edges with n ∈ body.from body.to, so that every non-seed reachable node appears incident to at least one trace edge.

    • Ensures that its outputs respect the monotonicity and idempotence properties of §4.6, consistent with TGK/PROV-TRACE-MONO/1 (monotone in seeds, depth limits, and effective edge types; seed-union and idempotence with respect to closure).

  6. Determinism

    • For fixed G, finite S : SeedSet, and Q, produces the same outputs across runs and across equivalent implementations, consistent with §5.2:

      • same closure node set,
      • same depth assignments (if provided),
      • same per-depth layering (if provided),
      • same trace edge and node sets.
  7. Layering

    • Does not assume any particular physical storage or graph index model.
    • If implemented on top of TGK/STORE/1, respects the semantics of that profile and does not alter its identity or projection guarantees.
  8. Neutrality

    • Does not bake in PEL-, CIL-, FER-, FCT-, or OI-specific semantics.
    • Leaves provenance policies (which edge types to include and how to present them) to higher-level profiles and applications.

Everything else — user-facing query languages, result serialization, UI representations, domain-specific policies, optimization strategies — is outside the scope of TGK/PROV/1 and MAY be specified by additional profiles.


9. Version History (Informative)

0.1.20 — 2025-11-16

  • Marked the draft as stable for use as an Amduat 2.0 substrate baseline candidate.
  • Performed only micro editorial cleanups (punctuation and phrasing) without changing any definitions, equations, or invariants.
  • For any finite snapshot G and finite SeedSet S, the behavior of prov_closure_nodes, prov_depths, prov_layers, and prov_trace is identical to v0.1.19; this version simply records that the kernel provenance semantics are considered ready for review and adoption.

0.1.19 — 2025-11-16

  • Added §0.4 (Snapshots & finiteness assumptions) to collect in one place the finiteness guarantees inherited from TGK/1-CORE:

    • each ProvenanceGraph G considered by this spec is the projection of a finite snapshot, with finite G.Nodes and G.Edges, and
    • all semantics in TGK/PROV/1 are defined relative to such a finite G and a finite SeedSet.
  • Cross-referenced that finiteness assumption where it is used:

    • in §3.2 and §3.3 when arguing that depth assignment and closure construction always terminate and yield finite results, and
    • in §3.5 and §4.3 when stating that fixpoint iterations stabilize and TraceGraph is always finite.
  • These changes are purely clarificatory: for any finite snapshot G and finite seed set S, the behavior of prov_closure_nodes, prov_depths, prov_layers, and prov_trace is unchanged from v0.1.18. The new subsection and cross-references simply make the reliance on TGK/1-COREs finiteness assumptions explicit and easy to cite from other documents.

0.1.18 — 2025-11-16

  • Introduced an explicit SeedSet alias in §0.3:

    SeedSet = set<Node>   // always finite under TGK/PROV/1
    

    and used this terminology in the core sections (§3, §4, §5, §8) to make “finite seed sets” a first-class, named concept rather than repeating “finite set of nodes” in prose.

  • Clarified in §1.1, §3.1, and the conformance section that all operators are defined over finite seed sets and that the closure operator in TGK/PROV-CLOSURE-OP/1 is formally a map on P_f(Node) (finite subsets of Node), aligning notation with the already-stated finiteness assumptions from v0.1.17.

  • Slightly tightened wording around SeedSet in parameter handling (§8) to emphasize that:

    • implementations MUST interpret any seed input as a finite set (duplicates ignored, order irrelevant), and
    • MUST NOT attempt to treat infinite seed families as valid inputs to TGK/PROV/1 operators.

These changes are purely terminological and clarificatory. For any finite seed set S, the behavior of prov_closure_nodes, prov_depths, prov_layers, and prov_trace is identical to v0.1.17; the new SeedSet alias simply makes the finiteness domain and closure-operator laws more explicit.

0.1.17 — 2025-11-16

  • Made the finiteness assumptions for seeds explicit and wired them into the invariants:

    • In §0.3, stated that all seed sets S are assumed to be finite; seeds are now formally “finite sets of nodes” rather than arbitrary subsets of Node.
    • In §3.1 and §3.3, clarified that the closure operator is defined for finite S, and noted explicitly that Reachable_Q(G, S) is always finite because both G.Nodes and S are finite.
    • In §3.4 and §4.6, phrased the closure and trace monotonicity/idempotence laws in terms of finite seed sets, while preserving their algebraic form.
  • Tightened the finiteness invariant for traces and depth maps:

    • In §4.3.7, updated the TraceGraph finiteness argument to rely on both G and S being finite, and qualified the “cannot diverge” statement as applying to finite seed sets.
    • In §3.6.1, noted that prov_depths(G, S, Q) is always a finite map because its domain is the finite closure Reachable_Q(G, S).
  • Strengthened conformance around seed sets:

    • In §8 (parameter handling), elevated finiteness to a normative requirement: implementations MUST interpret seeds as finite sets and MUST NOT attempt to treat infinite seed families as inputs to TGK/PROV/1 operators.
    • Kept the existing “seeds are sets” rule (duplicates ignored, order irrelevant) and made it clear that this refers to finite sets.

These changes are clarificatory for all realistic deployments, where seeds are always finite. For any finite seed set S, the behavior of prov_closure_nodes, prov_depths, prov_layers, and prov_trace is unchanged compared to v0.1.16.

0.1.16 — 2025-11-16

  • Made the equivalence between the named closure operator and the internal Reachable_Q set completely explicit:

    • In §3.3, defined prov_closure_nodes(G, S, Q) directly as Reachable_Q(G, S) and stated the equality normatively, and in §3.4 and §3.5 referred to Cl_Q(S) as both Reachable_Q(G, S) and prov_closure_nodes(G, S, Q).
    • In §3.6.1 and §4.7, updated the depth and trace relationships to reference both Reachable_Q(G, S) and prov_closure_nodes(G, S, Q), making it clear that depth maps and layers are always defined over the same closure set used by prov_closure_nodes.
  • Elevated the “seeds are sets” behavior from an informal remark to an explicit conformance requirement:

    • Kept the explanatory note in §0.3 that seeds are always treated as sets (duplicates ignored).
    • In §8 (conformance), added a normative bullet under parameter handling: implementations whose APIs accept seeds as lists or sequences MUST interpret them semantically as sets when computing provenance, so results depend only on the set of seeds, not their multiplicity or order.
  • Minor editorial consistency cleanups:

    • Clarified in §3.6.2 that when the closure is empty, ProvLayering.layers is an empty list (while still forbidding empty individual layers).
    • Slightly rephrased invariant statements in §3.6.1, §4.3, and §4.7 to consistently mention both Reachable_Q(G, S) and prov_closure_nodes(G, S, Q), reducing the chance of misreading them as distinct concepts.

These changes are purely clarificatory relative to v0.1.15: they do not alter the behavior of prov_closure_nodes, prov_depths, prov_layers, or prov_trace for any (G, S, Q), but make the correspondence between the closure operator, its depth views, and the conformance obligations easier to see and reason about.

0.1.15 — 2025-11-16

  • Clarified the behavior of self-loops and repeated endpoints in neighbor semantics:

    • In §2.2, made explicit that if a selected edge has the same node in both from and to, that node is its own neighbor under FORWARD and BACKWARD, and appears only once as a neighbor under BOTH.
    • Stated that multiple occurrences of the same node within from and/or to do not change neighbor sets, since neighbors are mathematical sets.
  • Strengthened the depth and layering explanations around seeds:

    • In §3.6.1, highlighted that dom(prov_depths) is exactly the closure Reachable_Q(G, S) and that seeds always appear in the depth map at depth 0, even when they are not in G.Nodes.
    • In §3.6.2, reiterated that when S is non-empty, seeds always appear in the unique layer with depth = 0, making the layering behavior around seeds precise and easy to rely on.
  • Minor editorial and consistency improvements:

    • Clarified in §0.3 that seeds are treated as sets of nodes (deduplicated), even if supplied as lists in concrete APIs.
    • Slightly tightened the explanation in §4.2§4.3 that trace nodes are exactly seeds plus nodes appearing in the included edges, while reachable nodes are guaranteed to be a subset of that node set.
    • Removed a stray editorial fragment at the end of the 0.1.0 history entry, keeping the version history clean and self-contained.

These changes are pure clarifications over v0.1.14. They do not alter the semantics of prov_closure_nodes, prov_depths, prov_layers, or prov_trace for any (G, S, Q); they simply make certain edge cases (self-loops, duplicated endpoints, and seed behavior in depth/layer views) more explicit for implementers and profile authors.

0.1.14 — 2025-11-16

  • Made the non-traversal role of payload fully explicit and normative:

    • In §2.2 (“Payload nodes”), introduced TGK/PROV-PAYLOAD-NONTRAV/1, stating that kernel provenance traversal MUST NOT use EdgeBody.payload as a source of neighbors and that adjacency, depth, and closure are defined purely in terms of from and to.
    • Clarified that profiles wishing to traverse payload values MUST either represent those relationships via additional edges or define profile-specific traversal on top of this kernel, without altering the core neighbor functions.
  • Gave the trace monotonicity and idempotence properties an explicit rule name:

    • In §4.6, collected the existing seed, depth-limit, and edge-type monotonicity properties, plus the seed-union and idempotence equalities, into TGK/PROV-TRACE-MONO/1 for easier reference from higher-layer profiles.
    • Emphasized that prov_trace is monotone in seeds, depth limits, and effective edge types, and idempotent under reseeding with the node closure, mirroring the closure-operator law TGK/PROV-CLOSURE-OP/1.
  • Tightened conformance around these invariants:

    • In §8, required implementations to enforce the payload non-traversal rule (TGK/PROV-PAYLOAD-NONTRAV/1) as part of parameter handling, and to ensure that trace outputs satisfy the monotonicity and idempotence properties summarized in TGK/PROV-TRACE-MONO/1.
    • These obligations are purely expository: they make explicit requirements that were already implied by the earlier definitions of Neighbors_dir, Reachable_Q, and prov_trace. No operational semantics changed compared to v0.1.13; the same graphs, closures, depths, and traces are computed for any given (G, S, Q).

0.1.13 — 2025-11-16

  • Made the dependence on edge-type selection explicitly monotone throughout the kernel:

    • In §2.1, clarified that EffectiveTypeSet(filter, G) not only determines which edges participate in traversal, but also induces a natural monotonicity: enlarging the effective edge-type set can only add neighbors, reachable nodes, and trace edges, never remove them.
    • In §2.2, noted that, for fixed G, n, and direction, Neighbors_dir(G, n, filter) is monotone in EffectiveTypeSet(filter, G) and always returns nodes in G.Nodes, making the endpoint set of neighbors explicit.
    • In §3.4, added an edge-type filter monotonicity property as item 5: when two queries differ only in type_filter and one effective type set is a subset of the other, the corresponding closures are nested (Reachable_Q₁ ⊆ Reachable_Q₂).
    • In §4.6, extended the trace monotonicity discussion with an explicit edge-type filter monotonicity rule for prov_trace: allowing more edge types produces a super-trace that contains all nodes and edges of the more restricted trace.
  • Tightened conformance around filter behavior:

    • In §8, updated the parameter-handling and node-closure bullets to require that implementations respect the new edge-type monotonicity property (derived directly from the semantics), so that higher layers can safely depend on “more edge types ⇒ bigger or equal trace/closure” when composing provenance policies.
  • All changes are strictly definitional: they give names and explicit statements to monotonicity properties already implied by the original SelectedEdges, Neighbors_dir, and Reachable_Q definitions. The operational behavior of prov_closure_nodes, prov_depths, prov_layers, and prov_trace for any concrete G, S, Q is unchanged compared to v0.1.12.

0.1.12 — 2025-11-16

  • Captured seed behavior as an explicit, named invariant:

    • In §3.3, introduced TGK/PROV-SEED-SEM/1, summarizing the seed-related rules already implied by the closure definition:

      • seeds are always included in the closure, regardless of filters or depth limits;
      • an empty seed set yields an empty closure;
      • seeds not in G.Nodes are included in the closure but never expand the frontier or affect reachability inside G.Nodes.
    • Clarified that changes to EdgeTypeFilter or DepthLimit may add or remove non-seed reachable nodes, but MUST NOT remove seeds from the closure.

  • Wired the new seed invariant into:

    • §5.1, so that the provenance kernel description explicitly references both closure-operator behavior and seed semantics; and
    • §8 (conformance), by requiring implementations to preserve the seed semantics in TGK/PROV-SEED-SEM/1 alongside the existing closure, depth, layering, and trace invariants.
  • Removed minor duplicated trailing text in the v0.1.0 history entry for editorial cleanliness, without changing any normative semantics.

0.1.11 — 2025-11-16

  • Made the depth semantics more explicit and named the key invariants:

    • In §3.6.1, clarified that prov_depths always includes all seeds, even seeds not in G.Nodes, with depth 0, and that its domain is exactly Reachable_Q(G, S).

    • Introduced TGK/PROV-DEPTH-MAP/1 to capture the invariant that prov_depths:

      • has domain equal to the closure Reachable_Q(G, S),
      • assigns seeds depth 0, and
      • assigns each node its minimal admissible-path hop-count depth.
  • Tightened the layering definition and gave it a named invariant:

    • In §3.6.2, emphasized that prov_layers is a strict partition of Reachable_Q(G, S) into non-empty, disjoint depth-labeled layers whose union is exactly the reachable node set.
    • Introduced TGK/PROV-LAYERING-PARTITION/1 to make this partition property easy to cite from higher-layer profiles.
  • Updated §5.1§5.3 and §8 to reference the new invariants, so conformance explicitly guarantees:

    • closure-operator behavior for Reachable_Q (TGK/PROV-CLOSURE-OP/1),
    • canonical minimal-depth semantics for prov_depths (TGK/PROV-DEPTH-MAP/1), and
    • partition semantics for prov_layers (TGK/PROV-LAYERING-PARTITION/1), without changing any substantive behavior compared to v0.1.10.

0.1.10 — 2025-11-16

  • Introduced an explicit closure-operator invariant for node closure:

    • In §3.4, added TGK/PROV-CLOSURE-OP/1, stating that for fixed G and Q the map S ↦ Reachable_Q(G, S) is a Kuratowski-style closure operator on subsets of Node (extensive, monotone, idempotent).
    • This makes it easy for higher-level profiles to rely on algebraic properties of provenance closure when composing queries or policies.
  • Gave the trace subgraph and node characterization explicit normative rule names:

    • In §4.3, collected the existing invariants into:

      • TGK/PROV-TRACE-SUBGRAPH/1 (trace is a finite subgraph of G with no invented nodes or edges), and
      • TGK/PROV-TRACE-NODES/1 (node characterization as seeds EdgeNodes(edges) plus reachable-node containment).
    • Wired these invariants into §5.3 and §8 so conformance is explicitly tied to them.

  • Slightly tightened the prose around the relationships between Reachable_Q, TraceGraph.nodes, and the depth/layer views:

    • Clarified in §4.7 that T.nodes \ dom(DMap) is exactly EdgeNodes(T.edges) \ Reachable_Q(G, S), i.e. the “extra” nodes introduced purely as endpoints or payload nodes of incident edges.
    • Emphasized in §5.1 that TGK/PROV-TRACE-KERNEL/1 and TGK/PROV-CLOSURE-OP/1 jointly guarantee a clean, graph-only kernel for higher-layer provenance semantics.
  • All changes are strictly definitional and do not alter the behavior of prov_closure_nodes, prov_depths, prov_layers, or prov_trace compared to v0.1.9; they only assign names to already-implied properties and link them into conformance.

0.1.9 — 2025-11-16

  • Tightened the TraceGraph node-set characterization:

    • Defined TraceGraph.nodes canonically as seeds EdgeNodes(edges) in §4.2, and made this equality explicit as an invariant in §4.3.
    • Noted that this is equivalent to the earlier Reachable_Q(G, S) EdgeNodes(E) view but expressed purely in terms of seeds and edges, which simplifies validation of persisted traces.
  • Strengthened the link between depths and edges:

    • Clarified in §3.6.1 that any node in G.Nodes with depth > 0 must have at least one “parent along some shortest path” edge consistent with Q.direction.
    • Reflected this in conformance (§8) by requiring that any such node appear incident (via from / to) to at least one edge in the corresponding TraceGraph.
  • Refined depth-aware trace relationships:

    • In §4.7, made explicit that T.nodes \ dom(prov_depths) is exactly EdgeNodes(T.edges) \ Reachable_Q(G, S), i.e. the additional nodes introduced as endpoints or payload nodes of edges that lie beyond the depth bound or appear only via incident-based inclusion.
  • Confirmed that all changes are semantically compatible with v0.1.8: the closure, depth, layering, and trace operators still compute the same results; the new material only sharpens equalities, invariants, and conformance obligations.

0.1.8 — 2025-11-16

  • Made the depth map and layering views first-class:

    • Introduced DepthMap, ProvLayer, and ProvLayering types in §0.3.
    • Defined a normative prov_depths(G, S, Q) operator (§3.6.1) whose domain is exactly Reachable_Q(G, S) and whose values are the minimum hop-count depths already implicit in depth_Q.
    • Defined a normative prov_layers(G, S, Q) operator (§3.6.2) that partitions the reachable node set into non-empty depth-labeled layers, ordered by increasing depth.
  • Extended the kernel invariants to cover depth views:

    • Updated TGK/PROV-TRACE-KERNEL/1 (§5.1) and TGK/PROV-DET/1 (§5.2) to include prov_depths and prov_layers alongside prov_closure_nodes and prov_trace.
  • Clarified depth-aware trace usage:

    • Added §4.7 to describe how TraceGraph, DepthMap, and ProvLayering relate, and to note that nodes outside dom(prov_depths) are precisely those introduced as endpoints of incident edges beyond the depth bound or via incident-only inclusion.
  • Updated the conformance section (§8) so that, when implementations choose to expose depth or layering, those views MUST be derived exactly from depth_Q as defined in §3.2§3.3.

All changes were purely definitional and did not alter the semantics of prov_closure_nodes or prov_trace as defined in v0.1.7.

0.1.7 — 2025-11-16

  • Extended the closure properties in §3.4 with an explicit seed-union law and idempotence of Reachable_Q when used as a seed set, making the algebraic behavior of the closure operator easier to reference from higher layers.
  • Introduced a new §4.6 to collect the corresponding TraceGraph monotonicity and idempotence properties for prov_trace, showing how changes in seeds and depth limits affect both node and edge sets and how re-seeding with the closure leaves the trace unchanged.
  • Kept all changes purely definitional and algebraic; no operational semantics were altered, and all behaviors remain compatible with v0.1.6.

0.1.6 — 2025-11-16

  • Corrected the treatment of seeds that are not in G.Nodes (§3.3), making it explicit that such seeds cannot appear as endpoints of edges in G.Edges by the definition of ProvenanceGraph.Nodes in TGK/1-CORE, and therefore influence only the seed/closure node sets and not the traversed portion of the graph.
  • Introduced an explicit definition of admissible Q-paths in §3.4 and rewrote the path characterization there to refer to these paths, tightening the connection between the inductive depth construction and the intuitive “shortest path length” interpretation.
  • Performed minor editorial cleanups around the closure properties to keep the relationship between depth_Q, admissible paths, and BFS-style semantics precise and easy to reference from higher-level profiles.

0.1.5 — 2025-11-16

  • Clarified DepthLimit semantics, making the D = 0 “seeds-only” special case explicit and aligning the prose more clearly with the hop-count view already implied by the inductive depth definition.
  • Tightened the description of how empty seed sets and empty EffectiveTypeSet interact with depth limits, ensuring the resulting closure behavior is fully spelled out and obviously consistent across cases.
  • Added an explicit finiteness invariant for TraceGraph, noting that prov_trace always returns a finite result when ProvenanceGraph is finite and seed sets are finite in practice.
  • Introduced an informative fixpoint characterization (§3.5) of unbounded closures as the least fixpoint of a monotone operator, to make reasoning about monotonicity and idempotence easier for higher-level profiles.
  • Strengthened the TGK/STORE/1 interaction discussion (§6) to state explicitly that EdgeTypeFilter here shares semantics with TGK/STORE/1s filter, and to map ProvDirection to GraphDirection-style OUT/IN/BOTH neighbor semantics more concretely.

0.1.4 — 2025-11-16

  • Made the behavior of prov_closure_nodes under special cases explicit and normative:

    • Seeds not in G.Nodes are always depth-0 but never generate neighbors; they appear only as seeds and closure nodes.
    • An empty seed set yields an empty closure for any Q.
    • If EffectiveTypeSet(filter, G) is empty, node closure is exactly the seed set.
  • Re-expressed the trace edge set E in §4.2 directly as:

    E =  { IncEdges(G, n, filter) | n ∈ N }
    

    and noted its equivalence to the previous comprehension, so that SelectedEdges, IncEdges, and the neighbor definitions form an obviously composable family of helpers.

  • Performed minor editorial cleanups to keep the distinction between sets, lists, and ordering explicit, and to underline that any ordering concerns live in TGK/STORE/1, not in this provenance kernel.

0.1.3 — 2025-11-16

  • Made the role of SelectedEdges(G, filter) and IncEdges(G, n, filter) explicit and reused them consistently between closure and trace definitions, to make the composition of definitions clearer and reduce repetition.
  • Clarified the inductive definition of depth_Q so that seeds that are not in G.Nodes are explicitly acknowledged as depth-0 but never appear in any Frontier_d, while still being included in Reachable_Q(G, S) (§3.2§3.3).
  • Tightened language around set semantics (no ordering, no multiplicity) and made it explicit that any ordering concerns live in TGK/STORE/1, not this profile.
  • Slightly sharpened the mapping between prov_trace and the derived helper prov_closure_edges, and made the use of SelectedEdges in §4.2 structurally parallel to the node-closure definitions.

0.1.2 — 2025-11-16

  • Added an explicit closure properties subsection (§3.4) to capture:

    • seed monotonicity (more seeds → superset of reachable nodes),
    • depth-limit monotonicity (larger depth bound → superset of reachable nodes),
    • and the path characterization of depth_Q (BFS-style minimum path length).
  • Introduced a small, purely definitional helper prov_closure_edges(G, S, Q) (§4.5) to make it easier for profiles to talk separately about edge-closure vs node-closure, while keeping it normatively equal to prov_trace(...).edges.

  • Clarified in §2.2 / §3.2 that neighbor and frontier sets are always taken over G.Nodes, and that seeds outside G.Nodes never expand the frontier but remain part of the closure and in TraceGraph.nodes.

0.1.1 — 2025-11-16

  • Clarified TraceGraph construction so that:

    • TraceGraph.edges is the set of selected-type edges incident (via from/to) to at least one node in the closure Reachable_Q(G, S).
    • TraceGraph.nodes is the union of the closure nodes and all endpoint/payload nodes of those edges (EdgeNodes(E)), ensuring that all nodes incident to edges in the trace are present in TraceGraph.nodes.
  • Fixed the node-set definition to guarantee the “node closure over edges” invariant:

    { incident nodes of T.edges } ⊆ T.nodes
    
  • Made the incident-based nature of edge inclusion explicit in §4.4, and noted that nodes adjacent to closure nodes via included edges may lie beyond the depth bound but are still part of TraceGraph as edge endpoints.

  • Introduced helper notions SelectedEdges(G, filter), IncEdges(G, n, filter), and EdgeNodes(E) for clearer, reusable definitions.

  • Tightened the description of seeds not in G.Nodes: they receive depth 0, do not expand the traversal, but remain in the closure and in TraceGraph.nodes.

0.1.0 — 2025-11-16

  • Initial draft of TGK/PROV/1 as a kernel provenance semantics layer over TGK/1-CORE.

  • Defined core parameter model:

    • ProvDirection (BACKWARD, FORWARD, BOTH),
    • EdgeTypeFilter (edge-type selection),
    • DepthLimit (hop-count bound).
  • Introduced prov_closure_nodes as a deterministic node-set closure operator over a fixed ProvenanceGraph and query.

  • Introduced TraceGraph and prov_trace as a subgraph projection induced by node closure, with:

    • incident-based edge inclusion for selected edge types, and
    • payload nodes included but not traversed.
  • Aligned semantics explicitly with TGK kernel invariants:

    • TGK/PROV-KERNEL/CORE/1,
    • TGK/DET/CORE/1,
    • TGK/NO-OFF-GRAPH-PROV/CORE/1.
  • Documented the relationship with TGK/STORE/1 (graph store) as informative only, keeping TGK/PROV/1 store-neutral and purely semantic.


Document History

  • 0.1.20 (2025-11-16): Registered as Tier-1 spec and aligned to the Amduat 2.0 substrate baseline.