# OPREG/TGK-DOCGRAPH/1 — Document Graph Registry

Status: Draft
Owner: Architecture
Version: 0.1.0
SoT: Plan
Last Updated: 2025-12-01
Linked Phase Pack: PH12
Tags: [registry, tgk, docgraph]

<!-- Source: /amduat/logs/ph12/evidence/import/PH12-EV-IMPORT-001/opreg-tgk-docgraph-design-20251201.md | Canonical: /amduat/tier1/opreg-tgk-docgraph-1.md -->

**Document ID:** `OPREG/TGK-DOCGRAPH/1`
**Layer:** L1 Profile (TGK Doc Graph Registry over `TGK/1-CORE` + `ENC/TGK1-EDGE/1`)

**Depends on (normative):**

* `ASL/1-CORE v0.4.x`: `Artifact`, `Reference`, `TypeTag`, `HashId`
* `ENC/ASL1-CORE v1.x`: canonical encodings for Artifacts and References
* `HASH/ASL1 v0.2.x`: ASL1 hash family (`HASH-ASL1-256`)
* `TGK/1-CORE v0.7.x`: trace graph kernel (`Node`, `EdgeBody`, `EdgeTypeId`)
* `ENC/TGK1-EDGE/1 v0.1.x`: canonical encoding for `EdgeBody` / EdgeArtifacts
* `AMDUAT-DOCID` (Tier-0): document identity and SoT/surface model

**Integrates with (informative):**

* `TGK/STORE/1`: graph store/query profile over ASL/1-STORE + TGK
* ADR-032 and PH10/PH12 import designs (RΩ / export)
* Future doc graph consumers (assistant overlays, IDX, provenance views)

© 2025 Amduat Programme.

## License

Except where otherwise noted, this document (text and diagrams) is licensed under
the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
Universal (CC0) to enable unrestricted reuse in implementations and derivative
specifications.

Code examples in this document are provided under the Apache License 2.0 unless
explicitly stated otherwise. Test vectors, where present, are dedicated to the
public domain under CC0 1.0.

---

## 0. Purpose and Non-Goals

### 0.1 Purpose

`OPREG/TGK-DOCGRAPH/1` defines a **doc/import/navigation graph registry** for Amduat:

* It names **node concepts** (as ASL/1 Artifacts) for:
  * conceptual documents (DOCID lineages),
  * document versions at a given snapshot (e.g. RΩ),
  * Git commits and blobs,
  * Amduat SoT instances.
* It names **edge types** (`EdgeTypeId`s) that connect those concepts:
  * document ↔ version, surface, SoT state,
  * version ↔ Git blob/commit,
  * document ↔ Amduat instance.
* It constrains how those edges are represented as EdgeArtifacts under
  `ENC/TGK1-EDGE/1` and consumed via `TGK/STORE/1`.

This registry is intentionally **doc/import scoped**. Execution, fact, and
certificate edges live in their own TGK/OPREG registries and MUST NOT reuse
`EdgeTypeId` assignments from this doc graph registry.

This Tier-1 stub is the **canonical registry companion** to the PH12 design
note `PH12-EV-IMPORT-001` ("Doc Graph OPREG Profile Design",
`/logs/ph12/evidence/import/PH12-EV-IMPORT-001/opreg-tgk-docgraph-design-20251201.md`),
which records design intent and sandbox experience; this document is the SoT
for the node and edge vocabulary.

### 0.2 Non-goals

This registry does **not** define:

* any storage API (`ASL/1-STORE`, `TGK/STORE/1` already cover that),
* any provenance algorithms or queries (`TGK/PROV/1` and higher layers),
* any assistant or overlay behavior (those consume this registry),
* concrete import/export profiles (ADR-032 handles those).

It only defines **concepts and edge types**; encoding and storage use existing
Tier-1 profiles.

---

## 1. Node Concepts (Informative overview)

This section summarizes node concepts; canonical encodings and type_tags are
defined in companion encoding profiles (TBD).

### 1.1 DOC_CONCEPT

Conceptual governed document identity per `AMDUAT-DOCID`:

* `identity_authority` (string),
* `lineage_id` (string),
* optional `doc_code` (string),
* optional `code_status` (e.g. `tentative`, `stable`).

There is exactly one `DOC_CONCEPT` node per `(identity_authority, lineage_id)`.

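
The uniqueness rule in §1.1 can be sketched as a small in-memory index keyed by `(identity_authority, lineage_id)`. This is only an illustration, not the canonical encoding (which the companion profiles still leave TBD); the `DocConcept` dataclass, the `ConceptIndex` helper, and the sample authority and lineage values are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DocConcept:
    """Illustrative DOC_CONCEPT record; field names follow section 1.1."""
    identity_authority: str
    lineage_id: str
    doc_code: Optional[str] = None
    code_status: Optional[str] = None  # e.g. "tentative" or "stable"

    @property
    def key(self) -> Tuple[str, str]:
        # The pair that identifies exactly one DOC_CONCEPT node.
        return (self.identity_authority, self.lineage_id)


class ConceptIndex:
    """Hypothetical helper that keeps one node per (authority, lineage)."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], DocConcept] = {}

    def upsert(self, concept: DocConcept) -> DocConcept:
        # Re-registering an existing key returns the stored node
        # instead of creating a second one.
        return self._by_key.setdefault(concept.key, concept)

    def __len__(self) -> int:
        return len(self._by_key)


index = ConceptIndex()
first = index.upsert(DocConcept("example-authority", "lineage-0001", "DOCID"))
again = index.upsert(DocConcept("example-authority", "lineage-0001"))
```

Here `again` is the node stored by the first call. Whether a later observation may update `doc_code` or `code_status` on an existing node is a policy question this sketch does not answer.
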
### 1.2 DOC_VERSION

Versioned SoT slice of a governed document at a snapshot commit:

* `identity_authority`, `lineage_id`, `doc_code`, `code_status`,
* `g_commit` (Git commit id),
* `sha256` (content hash of the doc bytes at `g_commit`),
* `path` (repository path at `g_commit`, e.g. `/amduat/tier0/docid.md`),
* `surface`, `sot` (SoT state) per DOCID header.

Multiple `DOC_VERSION` nodes may exist for a `DOC_CONCEPT` across commits.

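
As a rough illustration of §1.2, the sketch below builds two `DOC_VERSION` records for the same conceptual document at two snapshots. It assumes `sha256` is a plain SHA-256 hex digest of the document bytes; the `DocVersion` dataclass, the `make_version` helper, the header values, and the placeholder commit ids are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DocVersion:
    """Illustrative DOC_VERSION record; field names follow section 1.2."""
    identity_authority: str
    lineage_id: str
    doc_code: str
    code_status: str
    g_commit: str   # Git commit id of the snapshot
    sha256: str     # hex digest of the document bytes at g_commit
    path: str       # repository path at g_commit
    surface: str
    sot: str        # SoT state from the DOCID header


def make_version(doc_bytes: bytes, *, g_commit: str, path: str, **header: str) -> DocVersion:
    # The content hash is taken over the raw document bytes.
    digest = hashlib.sha256(doc_bytes).hexdigest()
    return DocVersion(g_commit=g_commit, sha256=digest, path=path, **header)


header = dict(identity_authority="example-authority", lineage_id="lineage-0001",
              doc_code="DOCID", code_status="stable", surface="tier0", sot="Yes")
# Placeholder commit ids; real ones come from the repository.
v1 = make_version(b"draft one\n", g_commit="a" * 40, path="/amduat/tier0/docid.md", **header)
v2 = make_version(b"draft two\n", g_commit="b" * 40, path="/amduat/tier0/docid.md", **header)
```

Both records share the `(identity_authority, lineage_id)` pair of one `DOC_CONCEPT` but differ in `g_commit` and `sha256`, so they are distinct version nodes.
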
### 1.3 GIT_COMMIT

Git commit metadata:

* `commit` (sha1),
* `parents` (list of parent commit ids),
* `tree` (tree id),
* `author_name`, `author_email`, `authored_at`,
* `committer_name`, `committer_email`, `committed_at`,
* summary or truncated message.

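
One possible way to populate the §1.3 fields is to parse the text of a raw commit object, as printed by `git cat-file -p <commit>`. The sketch below works on a hand-written sample rather than a real repository; `GitCommit`, `parse_commit`, and the sample identities and ids are hypothetical, and timestamps are kept as Unix seconds with the timezone offset dropped for brevity.

```python
import re
from dataclasses import dataclass, field
from typing import List

# "Name <email> <unix-seconds> <+hhmm>" as used on author/committer lines.
_IDENT = re.compile(r"^(?P<name>.*) <(?P<email>[^>]*)> (?P<ts>\d+) (?P<tz>[+-]\d{4})$")


@dataclass
class GitCommit:
    commit: str
    tree: str = ""
    parents: List[str] = field(default_factory=list)
    author_name: str = ""
    author_email: str = ""
    authored_at: int = 0      # Unix seconds
    committer_name: str = ""
    committer_email: str = ""
    committed_at: int = 0     # Unix seconds
    summary: str = ""


def parse_commit(commit_id: str, raw: str) -> GitCommit:
    # Headers and message are separated by the first blank line.
    headers, _, message = raw.partition("\n\n")
    node = GitCommit(commit=commit_id)
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "tree":
            node.tree = value
        elif key == "parent":
            node.parents.append(value)
        elif key in ("author", "committer"):
            m = _IDENT.match(value)
            if m is None:
                continue  # tolerate malformed identity lines
            setattr(node, f"{key}_name", m["name"])
            setattr(node, f"{key}_email", m["email"])
            stamp = "authored_at" if key == "author" else "committed_at"
            setattr(node, stamp, int(m["ts"]))
    # Keep only the first message line as the summary.
    lines = message.strip().splitlines()
    node.summary = lines[0] if lines else ""
    return node


SAMPLE = (
    "tree " + "1" * 40 + "\n"
    "parent " + "2" * 40 + "\n"
    "author A U Thor <author@example.com> 1700000000 +0100\n"
    "committer C O Mitter <committer@example.com> 1700000500 +0000\n"
    "\n"
    "Import tier1 registry stub\n\nLonger body text.\n"
)
node = parse_commit("3" * 40, SAMPLE)
```

The commit id is passed in separately because it is the hash of the object, not a field inside it.
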
### 1.4 GIT_BLOB

Content snapshot for a single blob at `g_commit`:

* `blob_sha` (sha1),
* `sha256` (content hash),
* `size_bytes`,
* `mode` (tree mode, including exec/symlink bits),
* `path` at `g_commit`.

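
The two digests on a `GIT_BLOB` node are different functions of the same bytes. Git's SHA-1 blob id hashes a `blob <size>\0` header followed by the content, whereas `sha256` is assumed here to be a plain SHA-256 of the content alone (the registry says only "content hash"). The sketch below computes both; `blob_fields` is a hypothetical helper and the example path is made up.

```python
import hashlib
from typing import Dict, Union


def blob_fields(content: bytes, path: str, mode: str = "100644") -> Dict[str, Union[str, int]]:
    # Git object id for a blob: SHA-1 over "blob <size>\0" + content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return {
        "blob_sha": hashlib.sha1(header + content).hexdigest(),
        # Assumption: sha256 is computed over the raw content only.
        "sha256": hashlib.sha256(content).hexdigest(),
        "size_bytes": len(content),
        "mode": mode,   # e.g. 100644 file, 100755 executable, 120000 symlink
        "path": path,
    }


empty = blob_fields(b"", "/example/empty.txt")
```

For empty content the function returns Git's well-known empty-blob id `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`, which gives a quick sanity check against `git hash-object`.
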
### 1.5 AMDUAT_INSTANCE

Descriptor for an Amduat SoT instance:

* `g_commit` (RΩ commit),
* `store_root` (SoT store root),
* `store_backend_id`,
* references to RΩ FER/1 receipts and manifests,
* optional labels (environment, hostname, etc.).

### 1.6 Helper nodes

* `SURFACE`: surface classification nodes (e.g. `tier0`, `tier1`, `phase`, `evidence`).
* `SOT_STATE`: SoT state nodes (`Yes`, `Plan`, `Ref`).

---

## 2. Edge Types (Doc Graph Domain)

`EdgeTypeId` values in this registry are reserved for doc/import/navigation
edges. Concrete numeric assignments live in the encoding/catalogue layer.

Implementations and other OPREG registries MUST treat these `EdgeTypeId`s as
belonging exclusively to the **Amduat doc graph domain**:

* the eventual allocation for this registry is expected to reserve a contiguous
  `EdgeTypeId` band (informally: an `AMDUAT-DOCGRAPH` band),
* only doc/import/navigation semantics (edges in §§2.1–2.4) may occupy that
  band,
* PEL execution, FER/1, CIL, FCT, and other TGK domains MUST use their own
  registries and bands.

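
Because concrete numeric assignments are deferred to the encoding/catalogue layer, the band rule can only be illustrated with made-up numbers. The sketch below assumes a hypothetical `AMDUAT-DOCGRAPH` band of `0x0100` to `0x01FF` and hypothetical ids for the edge types named in §§2.1–2.4. It shows the two checks the rule implies: every doc-graph id lies inside the band, and no other domain's id does.

```python
from typing import Dict

# Hypothetical band limits; the real allocation is not defined in this registry.
DOCGRAPH_BAND = range(0x0100, 0x0200)

# Hypothetical numeric ids for the edge types named in sections 2.1 to 2.4.
DOCGRAPH_EDGE_TYPES: Dict[str, int] = {
    "EDGE_DOC_HAS_VERSION": 0x0100,
    "EDGE_VERSION_OF": 0x0101,
    "EDGE_DOC_HAS_IDENTITY": 0x0102,
    "EDGE_DOC_ON_SURFACE": 0x0110,
    "EDGE_DOC_SOT": 0x0111,
    "EDGE_VERSION_HAS_BLOB": 0x0120,
    "EDGE_VERSION_FROM_COMMIT": 0x0121,
    "EDGE_DOC_MEMBER_OF_AMDUAT": 0x0130,
}


def check_band(own: Dict[str, int], foreign: Dict[str, int]) -> None:
    """Raise ValueError if the band rule is violated."""
    if len(set(own.values())) != len(own):
        raise ValueError("duplicate EdgeTypeId inside the doc graph registry")
    for name, type_id in own.items():
        if type_id not in DOCGRAPH_BAND:
            raise ValueError(f"{name} lies outside the doc graph band")
    for name, type_id in foreign.items():
        if type_id in DOCGRAPH_BAND:
            raise ValueError(f"{name} intrudes into the doc graph band")


# A made-up edge type from another domain, allocated outside the band.
check_band(DOCGRAPH_EDGE_TYPES, {"EDGE_EXAMPLE_OTHER_DOMAIN": 0x0200})
```

A check of this kind could run wherever the catalogue is assembled, so that a collision between registries fails early rather than surfacing as a misread edge.
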
### 2.1 Identity & version edges

* `EDGE_DOC_HAS_VERSION`
  `DOC_CONCEPT → DOC_VERSION`: this version belongs to this conceptual document.

* `EDGE_VERSION_OF`
  `DOC_VERSION → DOC_CONCEPT`: reverse link; derivable from `EDGE_DOC_HAS_VERSION`.

* `EDGE_DOC_HAS_IDENTITY`
  `DOC_VERSION → DOC_CONCEPT`: DOCID identity is attached to this version.

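
Since `EDGE_VERSION_OF` is derivable from `EDGE_DOC_HAS_VERSION`, a consumer can produce it by swapping endpoints. A minimal sketch, assuming edges are modelled as `(edge_type, source, target)` triples of strings; the `Edge` tuple and the reference strings are hypothetical stand-ins, not the `EdgeBody` layout from `TGK/1-CORE`.

```python
from typing import Iterable, List, NamedTuple


class Edge(NamedTuple):
    edge_type: str
    source: str   # stand-in for a node Reference
    target: str


def derive_version_of(edges: Iterable[Edge]) -> List[Edge]:
    # Each DOC_CONCEPT -> DOC_VERSION edge yields one DOC_VERSION -> DOC_CONCEPT edge.
    return [
        Edge("EDGE_VERSION_OF", e.target, e.source)
        for e in edges
        if e.edge_type == "EDGE_DOC_HAS_VERSION"
    ]


forward = [
    Edge("EDGE_DOC_HAS_VERSION", "concept:docid", "version:docid@aaaa"),
    Edge("EDGE_DOC_HAS_VERSION", "concept:docid", "version:docid@bbbb"),
    Edge("EDGE_DOC_SOT", "version:docid@aaaa", "sot:Yes"),
]
reverse = derive_version_of(forward)
```

Edges of other types, such as the `EDGE_DOC_SOT` item above, are ignored, so `reverse` holds exactly one entry per forward version edge.
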
### 2.2 Surface & SoT edges

* `EDGE_DOC_ON_SURFACE`
  `DOC_VERSION → SURFACE`: surface classification (governance/spec/phase/evidence).

* `EDGE_DOC_SOT`
  `DOC_VERSION → SOT_STATE`: SoT status (`Yes`, `Plan`, `Ref`) for this version.

### 2.3 Git provenance edges

* `EDGE_VERSION_HAS_BLOB`
  `DOC_VERSION → GIT_BLOB`: ties a document version to the blob at `g_commit`.

* `EDGE_VERSION_FROM_COMMIT`
  `DOC_VERSION → GIT_COMMIT`: last commit that touched this path at or before the snapshot.

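
The target of `EDGE_VERSION_FROM_COMMIT` can be looked up with `git log`, limited to one result and restricted to the path. The sketch below builds and runs that command; `build_log_argv` and `last_touching_commit` are hypothetical helpers, and it assumes the leading `/` of a registry path denotes the repository root. Rename tracking is not attempted.

```python
import subprocess
from typing import List, Optional


def build_log_argv(g_commit: str, path: str) -> List[str]:
    # Registry paths look like "/amduat/tier0/docid.md"; Git wants repo-relative paths.
    rel = path.lstrip("/")
    # -n 1: newest matching commit only; --format=%H: print the full commit hash.
    return ["git", "log", "-n", "1", "--format=%H", g_commit, "--", rel]


def last_touching_commit(repo_dir: str, g_commit: str, path: str) -> Optional[str]:
    result = subprocess.run(
        build_log_argv(g_commit, path),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    # Empty output means no commit reachable from g_commit touched the path.
    return result.stdout.strip() or None


argv = build_log_argv("a" * 40, "/amduat/tier0/docid.md")
```

Starting the walk at `g_commit` is what restricts the answer to commits at or before the snapshot.
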
### 2.4 SoT instance edges

* `EDGE_DOC_MEMBER_OF_AMDUAT`
  `DOC_CONCEPT → AMDUAT_INSTANCE`: this document is part of a particular Amduat instance.

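
Taken together, §§2.1–2.4 give each edge type a fixed source and target node concept. A consumer could check candidate edges against that table before encoding them. The sketch below assumes node kinds are available as plain strings; `EDGE_SIGNATURES` and `check_endpoints` are hypothetical names.

```python
from typing import Dict, Tuple

# (source kind, target kind) for every edge type in sections 2.1 to 2.4.
EDGE_SIGNATURES: Dict[str, Tuple[str, str]] = {
    "EDGE_DOC_HAS_VERSION": ("DOC_CONCEPT", "DOC_VERSION"),
    "EDGE_VERSION_OF": ("DOC_VERSION", "DOC_CONCEPT"),
    "EDGE_DOC_HAS_IDENTITY": ("DOC_VERSION", "DOC_CONCEPT"),
    "EDGE_DOC_ON_SURFACE": ("DOC_VERSION", "SURFACE"),
    "EDGE_DOC_SOT": ("DOC_VERSION", "SOT_STATE"),
    "EDGE_VERSION_HAS_BLOB": ("DOC_VERSION", "GIT_BLOB"),
    "EDGE_VERSION_FROM_COMMIT": ("DOC_VERSION", "GIT_COMMIT"),
    "EDGE_DOC_MEMBER_OF_AMDUAT": ("DOC_CONCEPT", "AMDUAT_INSTANCE"),
}


def check_endpoints(edge_type: str, source_kind: str, target_kind: str) -> bool:
    """Return True when the endpoints match the registry's signature."""
    expected = EDGE_SIGNATURES.get(edge_type)
    if expected is None:
        raise KeyError(f"{edge_type} is not a doc graph edge type")
    return expected == (source_kind, target_kind)


# A correctly oriented edge passes; a reversed one is rejected.
ok = check_endpoints("EDGE_VERSION_HAS_BLOB", "DOC_VERSION", "GIT_BLOB")
wrong_way = check_endpoints("EDGE_DOC_HAS_VERSION", "DOC_VERSION", "DOC_CONCEPT")
```
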
---

## 3. Encoding & Store Integration (Summary)

All doc-graph edges:

* are represented as TGK `EdgeBody` values with an `EdgeTypeId` from this registry,
* are encoded as EdgeArtifacts via `ENC/TGK1-EDGE/1` using `TYPE_TAG_TGK1_EDGE_V1`,
* derive `EdgeRef` identities via `HASH/ASL1` over `EdgeBytes`,
* live in ASL/1-STORE instances alongside other Artifacts.

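
The identity rule above makes an `EdgeRef` a pure function of the encoded edge bytes, so encoding the same edge twice yields the same reference. The sketch below illustrates only that property. The byte layout is an invented placeholder and not `ENC/TGK1-EDGE/1`, SHA-256 stands in for `HASH-ASL1-256` (which is defined in `HASH/ASL1`), and `encode_edge_placeholder`, `edge_ref`, the numeric type id, and the reference bytes are all hypothetical.

```python
import hashlib
import struct


def encode_edge_placeholder(edge_type_id: int, source: bytes, target: bytes) -> bytes:
    """Deterministic stand-in encoding; NOT the ENC/TGK1-EDGE/1 wire format."""
    parts = [struct.pack(">I", edge_type_id)]
    for ref in (source, target):
        # Length-prefix each reference so the concatenation is unambiguous.
        parts.append(struct.pack(">I", len(ref)) + ref)
    return b"".join(parts)


def edge_ref(edge_bytes: bytes) -> str:
    # Assumption: SHA-256 used as a stand-in for the ASL1 hash family.
    return hashlib.sha256(edge_bytes).hexdigest()


version_ref = b"ref:version"   # made-up node References
blob_ref = b"ref:blob"
a = edge_ref(encode_edge_placeholder(0x0120, version_ref, blob_ref))
b = edge_ref(encode_edge_placeholder(0x0120, version_ref, blob_ref))
c = edge_ref(encode_edge_placeholder(0x0120, blob_ref, version_ref))
```

Identical inputs give identical references (`a == b`), while swapping the endpoints changes the bytes and therefore the reference.
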
Nodes (`DOC_CONCEPT`, `DOC_VERSION`, `GIT_COMMIT`, `GIT_BLOB`, `AMDUAT_INSTANCE`, etc.) are ordinary
ASL/1 Artifacts; their `Reference`s are the TGK nodes.

`TGK/STORE/1` provides query semantics over the resulting graph.

JSON overlays or other projected views (for example, PH12 doc graph sandboxes)
MAY be emitted for human navigation and experiments, but they are always
derived from the underlying node Artifacts and EdgeArtifacts governed by this
registry and `ENC/TGK1-EDGE/1`; overlays are never the source of truth for
doc graph semantics.

---

## 4. Ingest & Encoder Interaction (Informative)

Implementations are expected to:

* materialize node Artifacts per this registry (and companion encoding profiles),
* emit FER/1 receipts for ingest pipelines,
* emit an idempotent edge worklist (doc-edge queue) that references `EdgeTypeId`s
  from this registry and node `Reference`s,
* use a separate encoder to turn worklist items into EdgeArtifacts using `ENC/TGK1-EDGE/1`,
  writing them into ASL/1-STORE for consumption via `TGK/STORE/1`.

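
Idempotence of the edge worklist can be obtained by keying each item on its full content, so that replaying an ingest step enqueues nothing new. A minimal in-memory sketch follows; `WorkItem`, `EdgeWorklist`, and the reference strings are hypothetical, and the actual worklist format is not defined by this registry.

```python
from typing import Dict, List, NamedTuple


class WorkItem(NamedTuple):
    edge_type: str     # name of an EdgeTypeId from this registry
    source_ref: str    # stand-in for a node Reference
    target_ref: str


class EdgeWorklist:
    """Insertion-ordered queue that ignores items it has already seen."""

    def __init__(self) -> None:
        self._items: Dict[WorkItem, None] = {}

    def enqueue(self, item: WorkItem) -> bool:
        # Returns True only when the item is new; repeats are no-ops.
        if item in self._items:
            return False
        self._items[item] = None
        return True

    def pending(self) -> List[WorkItem]:
        return list(self._items)


queue = EdgeWorklist()
item = WorkItem("EDGE_VERSION_FROM_COMMIT", "ref:version", "ref:commit")
first_run = queue.enqueue(item)
second_run = queue.enqueue(item)   # same ingest replayed
```

A separate encoder would then drain `pending()` and write one EdgeArtifact per item. Because EdgeArtifacts are content-addressed, a duplicate that slipped through would still resolve to the same `EdgeRef`.
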
Details of worklist format and encoder scheduling are left to PH12/PHB01
implementation notes; this registry only fixes the conceptual node/edge space.