# OPREG/TGK-DOCGRAPH/1 — Document Graph Registry

Status: Draft
Owner: Architecture
Version: 0.1.0
SoT: Plan
Last Updated: 2025-12-01
Linked Phase Pack: PH12
Tags: [registry, tgk, docgraph]

<!-- Source: /amduat/logs/ph12/evidence/import/PH12-EV-IMPORT-001/opreg-tgk-docgraph-design-20251201.md | Canonical: /amduat/tier1/opreg-tgk-docgraph-1.md -->

**Document ID:** `OPREG/TGK-DOCGRAPH/1`
**Layer:** L1 Profile (TGK Doc Graph Registry over `TGK/1-CORE` + `ENC/TGK1-EDGE/1`)

**Depends on (normative):**

* `ASL/1-CORE v0.4.x` — `Artifact`, `Reference`, `TypeTag`, `HashId`
* `ENC/ASL1-CORE v1.x` — canonical encodings for Artifacts and References
* `HASH/ASL1 v0.2.x` — ASL1 hash family (`HASH-ASL1-256`)
* `TGK/1-CORE v0.7.x` — trace graph kernel: `Node`, `EdgeBody`, `EdgeTypeId`
* `ENC/TGK1-EDGE/1 v0.1.x` — canonical encoding for `EdgeBody` / EdgeArtifacts
* `AMDUAT-DOCID` (Tier-0) — document identity and SoT/surface model

**Integrates with (informative):**

* `TGK/STORE/1` — graph store/query profile over ASL/1-STORE + TGK
* ADR-032 and PH10/PH12 import designs (RΩ / export)
* Future doc graph consumers (assistant overlays, IDX, provenance views)

© 2025 Amduat Programme.

## License

Except where otherwise noted, this document (text and diagrams) is licensed under
the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
Universal (CC0) to enable unrestricted reuse in implementations and derivative
specifications.

Code examples in this document are provided under the Apache License 2.0 unless
explicitly stated otherwise. Test vectors, where present, are dedicated to the
public domain under CC0 1.0.

---

## 0. Purpose and Non-Goals

### 0.1 Purpose

`OPREG/TGK-DOCGRAPH/1` defines a **doc/import/navigation graph registry** for Amduat:

* It names **node concepts** (as ASL/1 Artifacts) for:
  * conceptual documents (DOCID lineages),
  * document versions at a given snapshot (e.g. RΩ),
  * Git commits and blobs,
  * Amduat SoT instances.
* It names **edge types** (`EdgeTypeId`s) that connect those concepts:
  * document ↔ version, surface, SoT state,
  * version ↔ Git blob/commit,
  * document ↔ Amduat instance.
* It constrains how those edges are represented as EdgeArtifacts under
  `ENC/TGK1-EDGE/1` and consumed via `TGK/STORE/1`.

This registry is intentionally **doc/import scoped**. Execution, fact, and
certificate edges live in their own TGK/OPREG registries and MUST NOT reuse
`EdgeTypeId` assignments from this doc graph registry.

This Tier-1 stub is the **canonical registry companion** to the PH12 design
note `PH12-EV-IMPORT-001 — Doc Graph OPREG Profile Design
(/logs/ph12/evidence/import/PH12-EV-IMPORT-001/opreg-tgk-docgraph-design-20251201.md)`,
which records design intent and sandbox experience; this document is the SoT
for the node and edge vocabulary.

### 0.2 Non-goals

This registry does **not** define:

* any storage API (`ASL/1-STORE`, `TGK/STORE/1` already cover that),
* any provenance algorithms or queries (`TGK/PROV/1` and higher layers),
* any assistant or overlay behavior (those consume this registry),
* concrete import/export profiles (ADR-032 handles those).

It only defines **concepts and edge types**; encoding and storage use existing
Tier-1 profiles.

---

## 1. Node Concepts (Informative overview)

This section summarizes node concepts; canonical encodings and type_tags are
defined in companion encoding profiles (TBD).

### 1.1 DOC_CONCEPT

Conceptual governed document identity per `AMDUAT-DOCID`:

* `identity_authority` (string),
* `lineage_id` (string),
* optional `doc_code` (string),
* optional `code_status` (e.g. `tentative`, `stable`).

There is exactly one `DOC_CONCEPT` node per `(identity_authority, lineage_id)`.
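
The uniqueness rule above can be illustrated with a small in-memory index. This is a minimal sketch, not a canonical encoding: the class names `DocConcept` and `DocConceptIndex`, the "first registration wins" policy, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

ConceptKey = Tuple[str, str]  # (identity_authority, lineage_id)


@dataclass(frozen=True)
class DocConcept:
    """Illustrative in-memory shape only; not a canonical encoding."""
    identity_authority: str
    lineage_id: str
    doc_code: Optional[str] = None
    code_status: Optional[str] = None  # e.g. "tentative" or "stable"

    @property
    def key(self) -> ConceptKey:
        return (self.identity_authority, self.lineage_id)


class DocConceptIndex:
    """Holds at most one DocConcept per (identity_authority, lineage_id)."""

    def __init__(self) -> None:
        self._by_key: Dict[ConceptKey, DocConcept] = {}

    def get_or_add(self, concept: DocConcept) -> DocConcept:
        # setdefault keeps the first node registered under a key and
        # returns that same node for every later request with the key.
        return self._by_key.setdefault(concept.key, concept)

    def __len__(self) -> int:
        return len(self._by_key)


# Hypothetical sample values, chosen only for the demonstration.
index = DocConceptIndex()
first = index.get_or_add(
    DocConcept("example-authority", "lineage-0001", code_status="tentative")
)
again = index.get_or_add(DocConcept("example-authority", "lineage-0001"))
```

Here `again` is the same object as `first`, because both share one key. A real ingest pipeline might instead merge a newer `doc_code` or `code_status` into the existing node; this registry does not prescribe either policy.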

### 1.2 DOC_VERSION

Versioned SoT slice of a governed document at a snapshot commit:

* `identity_authority`, `lineage_id`, `doc_code`, `code_status`,
* `g_commit` (Git commit id),
* `sha256` (content hash of the doc bytes at `g_commit`),
* `path` (repository path at `g_commit`, e.g. `/amduat/tier0/docid.md`),
* `surface`, `sot` (SoT state) per DOCID header.

Multiple `DOC_VERSION` nodes may exist for a `DOC_CONCEPT` across commits.

### 1.3 GIT_COMMIT

Git commit metadata:

* `commit` (sha1),
* `parents` (list of parent commit ids),
* `tree` (tree id),
* `author_name`, `author_email`, `authored_at`,
* `committer_name`, `committer_email`, `committed_at`,
* summary or truncated message.

### 1.4 GIT_BLOB

Content snapshot for a single blob at `g_commit`:

* `blob_sha` (sha1),
* `sha256` (content hash),
* `size_bytes`,
* `mode` (tree mode, including exec/symlink bits),
* `path` at `g_commit`.
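
The two hashes in a `GIT_BLOB` descriptor differ in what they cover. The sketch below assumes that `blob_sha` is Git's own SHA-1 object id (computed over a `blob <size>\0` header plus the content) and that `sha256` is taken over the raw content bytes alone; the second point is an assumption, since the field is only described as a "content hash". The helper names and the dict layout are illustrative.

```python
import hashlib
from typing import Dict, Union


def git_blob_sha1(content: bytes) -> str:
    """Git object id of a blob in a SHA-1 repository."""
    # Git hashes the header "blob <decimal size>\0" followed by the content.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def content_sha256(content: bytes) -> str:
    """SHA-256 over the raw bytes only (assumed meaning of `sha256`)."""
    return hashlib.sha256(content).hexdigest()


def describe_blob(path: str, mode: str, content: bytes) -> Dict[str, Union[str, int]]:
    """Collect the GIT_BLOB fields listed above into a plain dict."""
    return {
        "blob_sha": git_blob_sha1(content),
        "sha256": content_sha256(content),
        "size_bytes": len(content),
        "mode": mode,  # tree mode as Git prints it, e.g. "100644"
        "path": path,
    }


blob = describe_blob("/amduat/tier0/docid.md", "100644", b"hello world\n")
```

For an empty blob, `git_blob_sha1(b"")` yields `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`, the id Git itself reports for empty files, which makes a convenient sanity check.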

### 1.5 AMDUAT_INSTANCE

Descriptor for an Amduat SoT instance:

* `g_commit` (RΩ commit),
* `store_root` (SoT store root),
* `store_backend_id`,
* references to RΩ FER/1 receipts and manifests,
* optional labels (environment, hostname, etc.).

### 1.6 Helper nodes

* `SURFACE` — surface classification nodes (e.g. `tier0`, `tier1`, `phase`, `evidence`).
* `SOT_STATE` — SoT state nodes (`Yes`, `Plan`, `Ref`).

---

## 2. Edge Types (Doc Graph Domain)

`EdgeTypeId` values in this registry are reserved for doc/import/navigation
edges. Concrete numeric assignments live in the encoding/catalogue layer.

Implementations and other OPREG registries MUST treat these `EdgeTypeId`s as
belonging exclusively to the **Amduat doc graph domain**:

* the eventual allocation for this registry is expected to reserve a contiguous
  `EdgeTypeId` band (informally: an `AMDUAT-DOCGRAPH` band),
* only doc/import/navigation semantics (edges in §§2.1–2.4) may occupy that
  band,
* PEL execution, FER/1, CIL, FCT, and other TGK domains MUST use their own
  registries and bands.
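
A catalogue layer could enforce the band rule with a check along the following lines. This is only a sketch: no numeric band has been allocated, so `DOCGRAPH_BAND` and every id in the usage example are placeholders, and `allocation_problems` is a hypothetical helper name.

```python
from typing import Dict, List

# Placeholder band. The real allocation belongs to the encoding/catalogue
# layer and is NOT fixed by this registry.
DOCGRAPH_BAND = range(0x0100, 0x0200)


def allocation_problems(table: Dict[str, int]) -> List[str]:
    """Report ids that leave the band or collide with an earlier entry."""
    problems: List[str] = []
    seen: Dict[int, str] = {}
    for name, edge_type_id in table.items():
        if edge_type_id not in DOCGRAPH_BAND:
            problems.append(f"{name}: 0x{edge_type_id:04x} is outside the band")
        elif edge_type_id in seen:
            problems.append(
                f"{name}: 0x{edge_type_id:04x} already used by {seen[edge_type_id]}"
            )
        else:
            seen[edge_type_id] = name
    return problems


# Made-up numbers, for illustration only.
ok_table = {"EDGE_DOC_HAS_VERSION": 0x0100, "EDGE_VERSION_OF": 0x0101}
bad_table = {"EDGE_DOC_HAS_VERSION": 0x0100, "EDGE_DOC_SOT": 0x0300}
```

With these inputs `allocation_problems(ok_table)` returns an empty list, while `bad_table` produces one entry for `EDGE_DOC_SOT`. Other TGK domains would run the same check against their own bands.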

### 2.1 Identity & version edges

* `EDGE_DOC_HAS_VERSION`
  `DOC_CONCEPT → DOC_VERSION` — this version belongs to this conceptual document.

* `EDGE_VERSION_OF`
  `DOC_VERSION → DOC_CONCEPT` — reverse link; derivable from `EDGE_DOC_HAS_VERSION`.

* `EDGE_DOC_HAS_IDENTITY`
  `DOC_VERSION → DOC_CONCEPT` — DOCID identity is attached to this version.
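
Because `EDGE_VERSION_OF` is derivable from `EDGE_DOC_HAS_VERSION`, an implementation can generate it by swapping endpoints. The sketch below assumes a simplified edge shape of one source and one destination, with node `Reference`s stood in for by plain strings; the `Edge` tuple and the sample references are illustrative, not part of `TGK/1-CORE`.

```python
from typing import Iterable, List, NamedTuple


class Edge(NamedTuple):
    edge_type: str
    src: str  # stand-in for the source node Reference
    dst: str  # stand-in for the destination node Reference


def derive_version_of(edges: Iterable[Edge]) -> List[Edge]:
    """Build one EDGE_VERSION_OF per EDGE_DOC_HAS_VERSION, endpoints swapped."""
    return [
        Edge("EDGE_VERSION_OF", e.dst, e.src)
        for e in edges
        if e.edge_type == "EDGE_DOC_HAS_VERSION"
    ]


# Hypothetical references, for illustration only.
forward = [
    Edge("EDGE_DOC_HAS_VERSION", "concept:A", "version:A@c1"),
    Edge("EDGE_DOC_HAS_VERSION", "concept:A", "version:A@c2"),
    Edge("EDGE_DOC_SOT", "version:A@c1", "sot:Plan"),
]
reverse = derive_version_of(forward)
```

Edges of other types pass through untouched, so `reverse` holds exactly two `EDGE_VERSION_OF` entries here.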

### 2.2 Surface & SoT edges

* `EDGE_DOC_ON_SURFACE`
  `DOC_VERSION → SURFACE` — surface classification (governance/spec/phase/evidence).

* `EDGE_DOC_SOT`
  `DOC_VERSION → SOT_STATE` — SoT status (`Yes`, `Plan`, `Ref`) for this version.

### 2.3 Git provenance edges

* `EDGE_VERSION_HAS_BLOB`
  `DOC_VERSION → GIT_BLOB` — ties a document version to the blob at `g_commit`.

* `EDGE_VERSION_FROM_COMMIT`
  `DOC_VERSION → GIT_COMMIT` — last commit that touched this path at/before the snapshot.

### 2.4 SoT instance edges

* `EDGE_DOC_MEMBER_OF_AMDUAT`
  `DOC_CONCEPT → AMDUAT_INSTANCE` — this document is part of a particular Amduat instance.
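
The endpoint pairs in §§2.1–2.4 can be collected into a lookup table so that an ingest step rejects malformed edges early. The table contents below are taken from this section; the `check_endpoints` helper and the use of plain strings for node kinds are assumptions for the sake of a runnable sketch.

```python
from typing import Dict, Tuple

# (source node concept, destination node concept) per edge type, as listed
# in sections 2.1 to 2.4.
EDGE_ENDPOINTS: Dict[str, Tuple[str, str]] = {
    "EDGE_DOC_HAS_VERSION": ("DOC_CONCEPT", "DOC_VERSION"),
    "EDGE_VERSION_OF": ("DOC_VERSION", "DOC_CONCEPT"),
    "EDGE_DOC_HAS_IDENTITY": ("DOC_VERSION", "DOC_CONCEPT"),
    "EDGE_DOC_ON_SURFACE": ("DOC_VERSION", "SURFACE"),
    "EDGE_DOC_SOT": ("DOC_VERSION", "SOT_STATE"),
    "EDGE_VERSION_HAS_BLOB": ("DOC_VERSION", "GIT_BLOB"),
    "EDGE_VERSION_FROM_COMMIT": ("DOC_VERSION", "GIT_COMMIT"),
    "EDGE_DOC_MEMBER_OF_AMDUAT": ("DOC_CONCEPT", "AMDUAT_INSTANCE"),
}


def check_endpoints(edge_type: str, src_kind: str, dst_kind: str) -> bool:
    """True if the node kinds match the registry entry for edge_type."""
    try:
        expected = EDGE_ENDPOINTS[edge_type]
    except KeyError:
        # Edge types from other TGK domains are not doc graph edges.
        raise ValueError(f"{edge_type} is not a doc graph edge type") from None
    return expected == (src_kind, dst_kind)
```

For example, `check_endpoints("EDGE_DOC_SOT", "DOC_VERSION", "SOT_STATE")` is true, while passing `"DOC_CONCEPT"` as the source is false. Unknown edge types raise instead of returning false, which keeps foreign-domain edges from being silently dropped.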

---

## 3. Encoding & Store Integration (Summary)

All doc-graph edges:

* are represented as TGK `EdgeBody` values with `EdgeTypeId` from this registry,
* are encoded as EdgeArtifacts via `ENC/TGK1-EDGE/1` using `TYPE_TAG_TGK1_EDGE_V1`,
* derive `EdgeRef` identities via `HASH/ASL1` over `EdgeBytes`,
* live in ASL/1-STORE instances alongside other Artifacts.
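
The practical consequence of hashing `EdgeBytes` is that identical edges always receive the same `EdgeRef`. The sketch below shows only that property. It does not implement `ENC/TGK1-EDGE/1` or `HASH/ASL1`: the length-prefixed byte layout is a made-up stand-in, SHA-256 is assumed in place of `HASH-ASL1-256`, and the single-source, single-destination edge shape is a simplification.

```python
import hashlib


def placeholder_edge_bytes(edge_type_id: int, src_ref: bytes, dst_ref: bytes) -> bytes:
    """Deterministic stand-in for EdgeBytes; NOT the ENC/TGK1-EDGE/1 layout."""
    parts = [edge_type_id.to_bytes(4, "big")]
    for ref in (src_ref, dst_ref):
        # Length prefixes keep (b"ab", b"c") distinct from (b"a", b"bc").
        parts.append(len(ref).to_bytes(4, "big"))
        parts.append(ref)
    return b"".join(parts)


def edge_ref(edge_bytes: bytes) -> str:
    """Content-derived identity; SHA-256 is an assumed stand-in hash."""
    return hashlib.sha256(edge_bytes).hexdigest()


# Placeholder type ids and references, for illustration only.
ref_a = edge_ref(placeholder_edge_bytes(1, b"concept-ref", b"version-ref"))
ref_b = edge_ref(placeholder_edge_bytes(1, b"concept-ref", b"version-ref"))
ref_c = edge_ref(placeholder_edge_bytes(2, b"concept-ref", b"version-ref"))
```

`ref_a` and `ref_b` are equal, so writing the same edge twice resolves to the same stored Artifact, while changing the edge type (`ref_c`) produces a different identity.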

Nodes (`DOC_CONCEPT`, `DOC_VERSION`, `GIT_COMMIT`, `GIT_BLOB`, `AMDUAT_INSTANCE`, etc.) are ordinary
ASL/1 Artifacts; their `Reference`s are the TGK nodes.

`TGK/STORE/1` provides query semantics over the resulting graph.

JSON overlays or other projected views (for example, PH12 doc graph sandboxes)
MAY be emitted for human navigation and experiments, but they are always
derived from the underlying node Artifacts and EdgeArtifacts governed by this
registry and `ENC/TGK1-EDGE/1`; overlays are never the source of truth for
doc graph semantics.

---

## 4. Ingest & Encoder Interaction (Informative)

Implementations are expected to:

* materialise node Artifacts per this registry (and companion encoding profiles),
* emit FER/1 receipts for ingest pipelines,
* emit an idempotent edge worklist (doc-edge queue) that references `EdgeTypeId`s
  from this registry and node `Reference`s,
* use a separate encoder to turn worklist items into EdgeArtifacts using `ENC/TGK1-EDGE/1`,
  writing them into ASL/1-STORE for consumption via `TGK/STORE/1`.

Details of worklist format and encoder scheduling are left to PH12/PHB01
implementation notes; this registry only fixes the conceptual node/edge space.
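
As one possible reading of the "idempotent edge worklist" in this section, the sketch below deduplicates items by their full content, so re-running an ingest never queues the same edge twice. The actual worklist format is deferred to the PH12/PHB01 notes; `WorkItem`, `EdgeWorklist`, and the string references are hypothetical.

```python
from typing import Dict, List, NamedTuple


class WorkItem(NamedTuple):
    edge_type: str  # an EdgeTypeId name from this registry
    src_ref: str    # stand-in for the source node Reference
    dst_ref: str    # stand-in for the destination node Reference


class EdgeWorklist:
    """Queue in which adding an already-known item is a no-op."""

    def __init__(self) -> None:
        # A dict keeps insertion order and gives constant-time membership.
        self._items: Dict[WorkItem, None] = {}

    def add(self, item: WorkItem) -> bool:
        """Return True if the item was new, False if it was a repeat."""
        if item in self._items:
            return False
        self._items[item] = None
        return True

    def pending(self) -> List[WorkItem]:
        return list(self._items)


# Hypothetical references, for illustration only.
worklist = EdgeWorklist()
item = WorkItem("EDGE_VERSION_HAS_BLOB", "version:A@c1", "blob:1234")
first_add = worklist.add(item)
second_add = worklist.add(item)  # a repeated ingest run
```

`first_add` is true and `second_add` is false, leaving one pending item for the encoder. Because `EdgeRef`s are content-derived, an encoder that did receive a duplicate would still produce the same EdgeArtifact, so the deduplication here mainly saves work.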