amduat/tier1/opreg-tgk-docgraph-1.md
2025-12-20 11:32:17 +01:00

OPREG/TGK-DOCGRAPH/1 — Document Graph Registry

Status: Draft
Owner: Architecture
Version: 0.1.0
SoT: Plan
Last Updated: 2025-12-01
Linked Phase Pack: PH12
Tags: [registry, tgk, docgraph]

Document ID: OPREG/TGK-DOCGRAPH/1
Layer: L1 Profile (TGK Doc Graph Registry over TGK/1-CORE + ENC/TGK1-EDGE/1)

Depends on (normative):

  • ASL/1-CORE v0.4.x — Artifact, Reference, TypeTag, HashId
  • ENC/ASL1-CORE v1.x — canonical encodings for Artifacts and References
  • HASH/ASL1 v0.2.x — ASL1 hash family (HASH-ASL1-256)
  • TGK/1-CORE v0.7.x — trace graph kernel: Node, EdgeBody, EdgeTypeId
  • ENC/TGK1-EDGE/1 v0.1.x — canonical encoding for EdgeBody / EdgeArtifacts
  • AMDUAT-DOCID (Tier-0) — document identity and SoT/surface model

Integrates with (informative):

  • TGK/STORE/1 — graph store/query profile over ASL/1-STORE + TGK
  • ADR-032 and PH10/PH12 import designs (RΩ / export)
  • Future doc graph consumers (assistant overlays, IDX, provenance views)

© 2025 Amduat Programme.

License

Except where otherwise noted, this document (text and diagrams) is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId assignments, EdgeTypeId tables) are additionally made available under CC0 1.0 Universal (CC0) to enable unrestricted reuse in implementations and derivative specifications.

Code examples in this document are provided under the Apache License 2.0 unless explicitly stated otherwise. Test vectors, where present, are dedicated to the public domain under CC0 1.0.


0. Purpose and Non-Goals

0.1 Purpose

OPREG/TGK-DOCGRAPH/1 defines a doc/import/navigation graph registry for Amduat:

  • It names node concepts (as ASL/1 Artifacts) for:
    • conceptual documents (DOCID lineages),
    • document versions at a given snapshot (e.g. RΩ),
    • Git commits and blobs,
    • Amduat SoT instances.
  • It names edge types (EdgeTypeIds) that connect those concepts:
    • document ↔ version, surface, SoT state,
    • version ↔ Git blob/commit,
    • document ↔ Amduat instance.
  • It constrains how those edges are represented as EdgeArtifacts under ENC/TGK1-EDGE/1 and consumed via TGK/STORE/1.

This registry is intentionally doc/import scoped. Execution, fact, and certificate edges live in their own TGK/OPREG registries and MUST NOT reuse EdgeTypeId assignments from this doc graph registry.

This Tier-1 stub is the canonical registry companion to the PH12 design note PH12-EV-IMPORT-001 — Doc Graph OPREG Profile Design (/logs/ph12/evidence/import/PH12-EV-IMPORT-001/opreg-tgk-docgraph-design-20251201.md). The design note records design intent and sandbox experience; this document is the SoT for the node and edge vocabulary.

0.2 Non-goals

This registry does not define:

  • any storage API (ASL/1-STORE, TGK/STORE/1 already cover that),
  • any provenance algorithms or queries (TGK/PROV/1 and higher layers),
  • any assistant or overlay behavior (those consume this registry),
  • concrete import/export profiles (ADR-032 handles those).

It defines only node concepts and edge types; encoding and storage reuse existing Tier-1 profiles.


1. Node Concepts (Informative overview)

This section summarizes node concepts; canonical encodings and type_tags will be defined in companion encoding profiles (TBD).

1.1 DOC_CONCEPT

Conceptual governed document identity per AMDUAT-DOCID:

  • identity_authority (string),
  • lineage_id (string),
  • optional doc_code (string),
  • optional code_status (e.g. tentative, stable).

There is exactly one DOC_CONCEPT node per (identity_authority, lineage_id).
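
The uniqueness rule can be sketched as a small in-memory index. This is an illustration only: the class names `DocConcept` and `DocConceptIndex` are hypothetical, and the canonical Artifact encoding for DOC_CONCEPT is not defined here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DocConcept:
    """Fields follow the DOC_CONCEPT overview; the class itself is hypothetical."""
    identity_authority: str
    lineage_id: str
    doc_code: Optional[str] = None
    code_status: Optional[str] = None  # e.g. "tentative", "stable"

    @property
    def key(self) -> Tuple[str, str]:
        # The pair that must be unique across all DOC_CONCEPT nodes.
        return (self.identity_authority, self.lineage_id)


class DocConceptIndex:
    """Hypothetical index that keeps one node per (identity_authority, lineage_id)."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], DocConcept] = {}

    def get_or_create(self, concept: DocConcept) -> DocConcept:
        # First writer wins: a later call with the same key returns the
        # existing node instead of creating a second one.
        return self._by_key.setdefault(concept.key, concept)

    def __len__(self) -> int:
        return len(self._by_key)
```

How conflicting optional fields (for example a changed doc_code) should be reconciled is not specified by this registry; the sketch simply keeps the first node seen.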

1.2 DOC_VERSION

Versioned SoT slice of a governed document at a snapshot commit:

  • identity_authority, lineage_id, doc_code, code_status,
  • g_commit (Git commit id),
  • sha256 (content hash of the doc bytes at g_commit),
  • path (repository path at g_commit, e.g. /amduat/tier0/docid.md),
  • surface, sot (SoT state) per DOCID header.

Multiple DOC_VERSION nodes may exist for a DOC_CONCEPT across commits.

1.3 GIT_COMMIT

Git commit metadata:

  • commit (sha1),
  • parents (list of parent commit ids),
  • tree (tree id),
  • author_name, author_email, authored_at,
  • committer_name, committer_email, committed_at,
  • summary or truncated message.

1.4 GIT_BLOB

Content snapshot for a single blob at g_commit:

  • blob_sha (sha1),
  • sha256 (content hash),
  • size_bytes,
  • mode (tree mode, including exec/symlink bits),
  • path at g_commit.
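
As a minimal sketch of how the two hashes on a GIT_BLOB node relate to the blob bytes, the following assumes a repository using Git's SHA-1 object format, where a blob id is the SHA-1 of the header `blob <size>\0` followed by the content. The `GitBlob` class and `describe_blob` helper are hypothetical names, not part of this registry.

```python
import hashlib
from dataclasses import dataclass


def git_blob_sha1(content: bytes) -> str:
    # Git hashes "blob <decimal size>\0" + content, not the raw bytes alone.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def content_sha256(content: bytes) -> str:
    # Plain SHA-256 over the raw bytes, with no Git header.
    return hashlib.sha256(content).hexdigest()


@dataclass(frozen=True)
class GitBlob:
    """Fields follow the GIT_BLOB overview; the class itself is hypothetical."""
    blob_sha: str
    sha256: str
    size_bytes: int
    mode: str  # tree mode as an octal string, e.g. "100644"
    path: str


def describe_blob(path: str, mode: str, content: bytes) -> GitBlob:
    return GitBlob(
        blob_sha=git_blob_sha1(content),
        sha256=content_sha256(content),
        size_bytes=len(content),
        mode=mode,
        path=path,
    )
```

Because blob_sha includes the Git header and sha256 does not, the two values cannot be derived from each other; both have to be computed from the content.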

1.5 AMDUAT_INSTANCE

Descriptor for an Amduat SoT instance:

  • g_commit (RΩ commit),
  • store_root (SoT store root),
  • store_backend_id,
  • references to RΩ FER/1 receipts and manifests,
  • optional labels (environment, hostname, etc.).

1.6 Helper nodes

  • SURFACE — surface classification nodes (e.g. tier0, tier1, phase, evidence).
  • SOT_STATE — SoT state nodes (Yes, Plan, Ref).

2. Edge Types (Doc Graph Domain)

EdgeTypeId values in this registry are reserved for doc/import/navigation edges. Concrete numeric assignments live in the encoding/catalogue layer.

Implementations and other OPREG registries MUST treat these EdgeTypeIds as belonging exclusively to the Amduat doc graph domain:

  • the eventual allocation for this registry is expected to reserve a contiguous EdgeTypeId band (informally: an AMDUAT-DOCGRAPH band),
  • only doc/import/navigation semantics (edges in §§2.1–2.4) may occupy that band,
  • PEL execution, FER/1, CIL, FCT, and other TGK domains MUST use their own registries and bands.
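
The band rule could be checked mechanically along the following lines. The numeric ranges are placeholders chosen for illustration, since this registry does not yet allocate EdgeTypeId values, and `EdgeTypeBand`, `EXEC_BAND`, and `check_disjoint` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class EdgeTypeBand:
    name: str
    first: int  # inclusive
    last: int   # inclusive

    def contains(self, edge_type_id: int) -> bool:
        return self.first <= edge_type_id <= self.last

    def overlaps(self, other: "EdgeTypeBand") -> bool:
        return self.first <= other.last and other.first <= self.last


# Placeholder ranges only: no numeric allocation exists in this registry.
DOCGRAPH_BAND = EdgeTypeBand("AMDUAT-DOCGRAPH", 0x1000, 0x10FF)
EXEC_BAND = EdgeTypeBand("HYPOTHETICAL-EXEC", 0x2000, 0x20FF)


def check_disjoint(bands: Sequence[EdgeTypeBand]) -> None:
    """Raise ValueError if any two bands share an EdgeTypeId."""
    for i, a in enumerate(bands):
        for b in bands[i + 1:]:
            if a.overlaps(b):
                raise ValueError(f"bands overlap: {a.name} / {b.name}")
```

A catalogue build step might call `check_disjoint` over all registered bands and reject any doc-graph edge whose id falls outside `DOCGRAPH_BAND`.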

2.1 Identity & version edges

  • EDGE_DOC_HAS_VERSION
    DOC_CONCEPT → DOC_VERSION — this version belongs to this conceptual document.

  • EDGE_VERSION_OF
    DOC_VERSION → DOC_CONCEPT — reverse link; derivable from EDGE_DOC_HAS_VERSION.

  • EDGE_DOC_HAS_IDENTITY
    DOC_VERSION → DOC_CONCEPT — DOCID identity is attached to this version.
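
Since EDGE_VERSION_OF is derivable from EDGE_DOC_HAS_VERSION, a consumer could compute it rather than store it. The sketch below assumes edges are held as simple `(edge_type, source_ref, target_ref)` tuples with string placeholders for References; that representation is illustrative, not the EdgeBody layout.

```python
from typing import Iterable, List, Tuple

# (edge_type, source_ref, target_ref); refs are opaque strings in this sketch.
Edge = Tuple[str, str, str]


def derive_version_of(edges: Iterable[Edge]) -> List[Edge]:
    """Return the EDGE_VERSION_OF edges implied by EDGE_DOC_HAS_VERSION edges."""
    return [
        ("EDGE_VERSION_OF", target, source)  # endpoints swapped
        for (edge_type, source, target) in edges
        if edge_type == "EDGE_DOC_HAS_VERSION"
    ]
```

Whether reverse edges are materialised as EdgeArtifacts or derived at query time is left open here; the function only shows that no extra information is needed to produce them.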

2.2 Surface & SoT edges

  • EDGE_DOC_ON_SURFACE
    DOC_VERSION → SURFACE — surface classification (governance/spec/phase/evidence).

  • EDGE_DOC_SOT
    DOC_VERSION → SOT_STATE — SoT status (Yes, Plan, Ref) for this version.

2.3 Git provenance edges

  • EDGE_VERSION_HAS_BLOB
    DOC_VERSION → GIT_BLOB — ties a document version to the blob at g_commit.

  • EDGE_VERSION_FROM_COMMIT
    DOC_VERSION → GIT_COMMIT — last commit that touched this path at/before the snapshot.

2.4 SoT instance edges

  • EDGE_DOC_MEMBER_OF_AMDUAT
    DOC_CONCEPT → AMDUAT_INSTANCE — this document is part of a particular Amduat instance.
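
The source and target node kinds listed in §§2.1 to 2.4 can be collected into a lookup table and used to reject malformed edges before encoding. The table contents come from the edge definitions above; the `check_edge` helper and the use of plain strings for node kinds are assumptions of this sketch.

```python
from typing import Dict, Tuple

# edge type -> (source node kind, target node kind)
EDGE_SIGNATURES: Dict[str, Tuple[str, str]] = {
    "EDGE_DOC_HAS_VERSION": ("DOC_CONCEPT", "DOC_VERSION"),
    "EDGE_VERSION_OF": ("DOC_VERSION", "DOC_CONCEPT"),
    "EDGE_DOC_HAS_IDENTITY": ("DOC_VERSION", "DOC_CONCEPT"),
    "EDGE_DOC_ON_SURFACE": ("DOC_VERSION", "SURFACE"),
    "EDGE_DOC_SOT": ("DOC_VERSION", "SOT_STATE"),
    "EDGE_VERSION_HAS_BLOB": ("DOC_VERSION", "GIT_BLOB"),
    "EDGE_VERSION_FROM_COMMIT": ("DOC_VERSION", "GIT_COMMIT"),
    "EDGE_DOC_MEMBER_OF_AMDUAT": ("DOC_CONCEPT", "AMDUAT_INSTANCE"),
}


def check_edge(edge_type: str, source_kind: str, target_kind: str) -> None:
    """Raise ValueError if the edge type is unknown or its endpoints mismatch."""
    try:
        expected = EDGE_SIGNATURES[edge_type]
    except KeyError:
        raise ValueError(f"unknown doc-graph edge type: {edge_type}") from None
    if (source_kind, target_kind) != expected:
        raise ValueError(
            f"{edge_type} expects {expected[0]} -> {expected[1]}, "
            f"got {source_kind} -> {target_kind}"
        )
```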

3. Encoding & Store Integration (Summary)

All doc-graph edges:

  • are represented as TGK EdgeBody values with EdgeTypeId from this registry,
  • are encoded as EdgeArtifacts via ENC/TGK1-EDGE/1 using TYPE_TAG_TGK1_EDGE_V1,
  • derive EdgeRef identities via HASH/ASL1 over EdgeBytes,
  • live in ASL/1-STORE instances alongside other Artifacts.
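
To illustrate the content-addressed nature of EdgeRef identities, the sketch below hashes a toy byte layout. It is not the ENC/TGK1-EDGE/1 encoding, which is defined in its own profile and not reproduced here, and plain SHA-256 is used only as a stand-in for HASH-ASL1-256. The function names `toy_edge_bytes` and `toy_edge_ref` are hypothetical.

```python
import hashlib
import struct


def toy_edge_bytes(edge_type_id: int, source_ref: bytes, target_ref: bytes) -> bytes:
    # Placeholder layout (NOT ENC/TGK1-EDGE/1): a 4-byte big-endian type id,
    # then each reference as a 4-byte big-endian length followed by its bytes.
    parts = [struct.pack(">I", edge_type_id)]
    for ref in (source_ref, target_ref):
        parts.append(struct.pack(">I", len(ref)))
        parts.append(ref)
    return b"".join(parts)


def toy_edge_ref(edge_bytes: bytes) -> str:
    # Stand-in for HASH/ASL1 over EdgeBytes: identical bytes, identical ref.
    return hashlib.sha256(edge_bytes).hexdigest()
```

The property that matters is determinism: encoding the same edge twice yields the same bytes and therefore the same reference, so re-ingesting an edge does not create a second identity.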

Nodes (DOC_CONCEPT, DOC_VERSION, GIT_COMMIT, GIT_BLOB, AMDUAT_INSTANCE, etc.) are ordinary ASL/1 Artifacts; their References are the TGK nodes.

TGK/STORE/1 provides query semantics over the resulting graph.

JSON overlays or other projected views (for example, PH12 doc graph sandboxes) MAY be emitted for human navigation and experiments, but they are always derived from the underlying node Artifacts and EdgeArtifacts governed by this registry and ENC/TGK1-EDGE/1; overlays are never the source of truth for doc graph semantics.
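
A derived overlay might be produced as follows. The overlay shape (an adjacency map keyed by source reference) and the `build_overlay` name are assumptions made for this sketch; no overlay format is fixed by this registry.

```python
import json
from collections import defaultdict
from typing import Iterable, Tuple

# (edge_type, source_ref, target_ref); refs are opaque strings in this sketch.
Edge = Tuple[str, str, str]


def build_overlay(edges: Iterable[Edge]) -> str:
    """Project edges into a JSON adjacency map for human navigation."""
    adjacency = defaultdict(list)
    for edge_type, source, target in edges:
        adjacency[source].append({"type": edge_type, "target": target})
    # Sort so the output depends only on the edge set, not on input order.
    for outgoing in adjacency.values():
        outgoing.sort(key=lambda e: (e["type"], e["target"]))
    return json.dumps(adjacency, sort_keys=True, indent=2)
```

Because the overlay is a pure function of the edge set, it can be discarded and rebuilt at any time, which is consistent with overlays never being the source of truth.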


4. Ingest & Encoder Interaction (Informative)

Implementations are expected to:

  • materialise node Artifacts per this registry (and companion encoding profiles),
  • emit FER/1 receipts for ingest pipelines,
  • emit an idempotent edge worklist (doc-edge queue) that references EdgeTypeIds from this registry and node References,
  • use a separate encoder to turn worklist items into EdgeArtifacts using ENC/TGK1-EDGE/1, writing them into ASL/1-STORE for consumption via TGK/STORE/1.
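
One way to make the edge worklist idempotent is to key each item by its full content and ignore repeats. The sketch below is an assumption about how such a queue could behave; `WorkItem` and `EdgeWorklist` are hypothetical names, and the actual worklist format is left to implementation notes.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class WorkItem:
    edge_type: str   # an edge type name from this registry
    source_ref: str  # node Reference, shown as an opaque string
    target_ref: str


class EdgeWorklist:
    """Hypothetical doc-edge queue: enqueueing the same item twice is a no-op."""

    def __init__(self) -> None:
        self._seen: Set[WorkItem] = set()
        self._pending: List[WorkItem] = []

    def enqueue(self, item: WorkItem) -> bool:
        # Returns True only the first time an item is offered.
        if item in self._seen:
            return False
        self._seen.add(item)
        self._pending.append(item)
        return True

    def drain(self) -> List[WorkItem]:
        # Hand pending items to the encoder; seen items stay remembered so
        # a re-run of the ingest pipeline does not enqueue them again.
        items, self._pending = self._pending, []
        return items
```

A separate encoder would consume `drain()` and write one EdgeArtifact per item. Since EdgeArtifacts are content-addressed, an accidental repeat would map to the same reference anyway; the queue-level check only avoids redundant work.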

Details of worklist format and encoder scheduling are left to PH12/PHB01 implementation notes; this registry only fixes the conceptual node/edge space.