# AMDUAT-MS/1 — Mapping Surface Specification

Status: Draft
Owner: Architecture
Version: 0.2.2
SoT: Yes
Last Updated: 2025-11-30
Linked Phase Pack: PH07
Tags: [composition, execution, deterministic]

identity_authority: amduat.programme
lineage_id: L-PENDING-MS1
doc_code: AMDUAT-MS/1
code_status: tentative
doc_code_aliases: []
location: /amduat/tier1/ms.md
surface: developer
internal_revision: 3
provenance_mode: full

---

## Overview

**AMDUAT-MS/1** standardises the executable mapping surface that turns a **Concept** plus a **Context Frame** into deterministic **Data** bytes. The surface orchestrates **FPS/1 primitives** through **FCS/1 recipes** and records runs through **FER/1**; certification, provenance, and policy decisions remain with **FCT/1** and phase evidence packs. MS/1 governs observable mapping behaviour and context binding rules without authorising new governance or relation taxonomies.

---

## Core Model

MS/1 retains Amduat's two primitive node kinds and treats all other structure as concept-typed relations.

* **Concept node (C):** Abstract identifier that can describe, type, or govern any graph element. A concept can own multiple materialisations without losing identity.
* **Data node (D):** Immutable byte sequence addressed by CID (SHA-256). All executions produce Data nodes.
* **Relation instances:** Every edge is annotated with a `relation_concept` that identifies its semantics. Implementations MUST NOT embed a fixed relation enumeration; relation concepts are first-class concepts.

Non-normative relation concept examples include `represents` (C→D), `materializesAs` (C→D), `requiresKey` (C→C), `withinDomain` (C→C), `computedBy` (D→C), and `hasProvenance` (D→D). Ecosystems MAY register additional concepts and MUST treat each registered concept as a normal graph node.

---

## Context Frames

A **Context Frame (CF)** is a deterministic multimap of `{ key_concept → value }` that constrains execution.

* **Keys** are concept identifiers; strings are optional.
* **Values** are canonical scalars (integers, enums keyed by concept ID) or CIDs to Data nodes for larger payloads.
* **Context CID:** `CID_context = sha256(canonical_encode(CF))`. The canonical encoding MUST order keys by concept ID and normalise values so identical frames generate identical CIDs.

### Scoping and Refinement

* Frames form a tree during pipeline execution. Each mapping step receives one frame.
* Child frames inherit bindings from the parent. A child MAY add new bindings or narrow an existing binding but MUST NOT contradict established bindings.
* Branch communication is explicit. Publishing a refinement to siblings requires constructing a new **published frame** and referencing it; no sideways state is implicit.

### Gaps and Ambiguity

* Missing required bindings yield a **Gap Artifact** (Data) that records the missing keys and the decision point. No payload bytes are emitted.
* When multiple admissible bindings remain, the step returns an **Ambiguity Artifact** (Data) enumerating admissible alternatives and the rule needed to disambiguate. Resolution demands a refined frame.

Gap and Ambiguity artifacts are ordinary Data nodes with CIDs and MAY be audited like any other output.

---

## Mapping Surface Semantics

The surface defines the total function:

```
MS_map : (Concept C, ContextFrame CF) -> Result
Result = Produced(Data bytes) | Gap(Data) | Ambiguity(Data)
```

### Executable Form

* Every mapping MUST be realised as an **FCS/1 recipe** parameterised by **PCB1** blocks.
* Recipes MUST invoke only **FPS/1 primitives** (`put`, `get`, `slice`, `concatenate`, `reverse`, `splice`) and compositions thereof.

### Determinism and Replayability

For fixed `(C, CF)` inputs and fixed referenced Data CIDs, implementations MUST produce identical bytes.

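
The determinism requirement above rests on the Context CID rule from §Context Frames: two frames with the same bindings must hash to the same `CID_context`. The following is a minimal sketch of that rule, not the MS/1 wire format. It assumes concept IDs are plain strings, values are integers or strings, and compact JSON stands in for `canonical_encode`; the helper names and the sample frames are hypothetical.

```python
import hashlib
import json
from typing import Dict, List, Union

Scalar = Union[int, str]  # canonical scalar, or a Data CID written as a string
Frame = Dict[str, List[Scalar]]  # multimap: key concept ID -> values


def canonical_encode(frame: Frame) -> bytes:
    """Encode a frame with keys ordered by concept ID and values ordered.

    JSON is a stand-in chosen for this sketch (assumption).
    """
    entries = []
    for key in sorted(frame):
        # Order values by (type name, value) so ints and strings are never
        # compared with each other.
        values = sorted(frame[key], key=lambda v: (type(v).__name__, v))
        entries.append([key, values])
    text = json.dumps(entries, separators=(",", ":"), ensure_ascii=True)
    return text.encode("ascii")


def context_cid(frame: Frame) -> str:
    """CID_context = sha256(canonical_encode(CF)), as lowercase hex."""
    return hashlib.sha256(canonical_encode(frame)).hexdigest()


# Same bindings, different insertion order.
frame_a = {"withinDomain": ["C_Unicode15"], "C_TextEncoding": ["C_UTF8"]}
frame_b = {"C_TextEncoding": ["C_UTF8"], "withinDomain": ["C_Unicode15"]}
# One binding differs.
frame_c = {"withinDomain": ["C_Unicode15"], "C_TextEncoding": ["C_UTF16"]}

same = context_cid(frame_a) == context_cid(frame_b)
different = context_cid(frame_a) != context_cid(frame_c)
```

Because the encoder sorts keys and values before hashing, `same` holds regardless of how the caller built the frame, while any changed binding yields a different CID.
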
Each run MUST emit a **FER/1 run record** (Data) that declares:

* the recipe concept ID (FCS/1),
* the PCB1 parameter block (as Data CID),
* all input CIDs (including `CID_context`),
* the output CID, and
* relevant environment hashes (e.g., FPS/1 library surfaces) that affect determinism.

Replaying the FCS/1 recipe with the same inputs MUST yield the same CID.

### Fidelity Predicates

Each mapping declares (via relation concepts) an **equivalence predicate** that states when the produced bytes faithfully represent the target concept. Fidelity predicates are themselves concepts and MAY be certified through **FCT/1**.

### Domain Keys

Disambiguation relies on **Domain concepts** expressed as context keys (e.g., `withinDomain`). Recipes MAY require domain bindings such as classical vs. quantum state spaces to guarantee deterministic interpretation.

### Function Interface Patterns

Implementations SHOULD expose a narrow callable surface so concept inputs and determinism guarantees remain auditable.

1. **Primary signature:**
   * `Result ms_map(concept_id, context_frame)` is the canonical entry point.
   * `concept_id` MUST be a stable concept node reference (e.g., CID or registry handle). Implementations MUST NOT infer the target concept from global mutable state.
   * `context_frame` MUST be the complete set of bindings used for the run. Any binding derived from other inputs MUST be reflected back into the frame before execution.
2. **Auxiliary parameters:**
   * Additional parameters are only permitted when they can be canonicalised to deterministic Data CIDs and recorded in the FER/1 run record.
   * Preferred pattern: `attachments: Sequence[DataCID] = ()`. Each attachment MUST already exist as a Data node, and the mapping MUST reference the CIDs explicitly in the FER/1 log.
   * Alternative pattern: `concept_inputs: Sequence[ConceptID] = ()` for secondary concept handles. Each element MUST also appear in the effective context frame (e.g., under a relation-specific key).
     Passing the concept ID without updating the frame is a violation.
3. **Rejected pattern:** Opaque keyword arguments or process-level environment variables MUST NOT influence execution. Implementations discovering such usage MUST emit an Ambiguity artifact describing the missing context binding instead.

The following pseudocode illustrates a compliant wrapper:

```
def ms_map(concept_id: ConceptID,
           context_frame: ContextFrame,
           *,
           attachments: Sequence[DataCID] = (),
           concept_inputs: Sequence[ConceptID] = ()) -> Result:
    # Reflect secondary concept handles into the frame before execution.
    frame = context_frame.with_bindings({
        relation_for(ci): ci for ci in concept_inputs
    })
    # Bind each attachment CID under its relation-specific key as well.
    frame = frame.with_bindings({
        relation_for(cid): cid for cid in attachments
    })
    # Only the updated frame and the explicit attachments reach the recipe.
    record = run_fcs_recipe(concept_id, frame, attachments)
    return normalise_result(record)
```

`relation_for` denotes a deterministic lookup from an input (a concept ID or Data CID) to the relation-specific key under which its binding is stored. Implementations MAY inline more efficient mechanisms, but the observable effect MUST match first updating the frame and then invoking the recipe.

### Media (MIME) Types vs. MS Context

**MIME/media types** label already-produced Data so downstream systems can interpret byte strings (e.g., `text/plain; charset=utf-8`, `image/png`). An **MS context frame** captures the *pre-execution* bindings that shape which bytes will be emitted. They relate but are not interchangeable:

* **Where they live:** Context bindings exist before execution and are hashed into `CID_context`. MIME labels are attached after Data exists (e.g., CIL payload metadata, HTTP headers, or relation edges such as `ms.produces_media_type`).
* **What they encode:** Context keys describe decision levers (encoding, domain, fidelity policy) that an FCS recipe uses to deterministically produce bytes. MIME types describe how consumers should parse already-materialised bytes.
  A context MAY carry a desired media-type concept ("emit this as `application/pdf`") but it is still a binding that constrains execution, not a replacement for Content-Type headers.
* **Interoperability:** When MS outputs feed MIME-governed ecosystems, recipes SHOULD register a concept such as `ms.media_type` and declare it via `ms.requires_key` if the choice affects determinism (e.g., choosing between `text/csv` vs. `application/json`). FER/1 receipts then bind the produced Data both to the governing context frame and to a media-type relation so downstream MIME routers, storage overlays, and catalogues reconcile the two perspectives.

MIME types are one example of **interpretation contracts** that MS contexts can align with. The same pattern applies to CAD kernels with STEP schemas, audio encoders with bitstream levels, or ML artefacts with `model.format` descriptors. MS/1 keeps them deterministic by requiring the selection to be a context binding (recorded before execution) while also allowing publication surfaces to mirror the choice via their native metadata channels.

---

## Pipelines and Composition

A **Pipeline** is a concept that composes ordered mapping steps.

* Each step executes `MS_map(C_i, CF_i)`.
* Pipelines return either Produced(Data) or an Artifact (Gap/Ambiguity).
* Given identical inputs, pipelines MUST be byte-stable.
* **Branch isolation:** each branch operates on a forked frame. Publishing updates requires emitting a new frame with a fresh `CID_context`, preserving immutability.

---

## Conformance Criteria

Implementations conform to MS/1 when they satisfy all of the following:

1. Accept `(C, CF)` and emit either Produced(Data) or a Gap/Ambiguity artifact.
2. Produce outputs reproducibly via FCS/1 + FPS/1, and record each run via FER/1.
3. Encode mutable decisions through concept-typed context keys only; no hidden flags.
4. Address all bytes by SHA-256 CIDs and guarantee identical inputs replay to identical CIDs.
5.
   Treat relation types as concepts without hard-coded enumerations.

## Dependencies (pinned drafts)

MS/1 relies on upstream drafts; consumers MUST treat these versions as pinned for this specification and handle later changes as upstream drift:

* FCS/1 v0.2.1 (Draft)
* FPS/1 v0.4.3 (Draft)
* FLS/1 v0.1.0 (Draft)

Any adoption in PH10+ MUST re-pin consciously if these upstream specs change.

## Evidence (PH08 runtime)

PH08 reference executions demonstrate MS/1 behaviour; cite these when asserting readiness or approval:

* `logs/ph08/evidence/ms/PH08-EV-MS-RUNTIME-001/`
* `logs/ph08/evidence/ms/PH08-EV-MS-ACCEPT-001/`
* `logs/ph08/evidence/ms/PH08-EV-MS-BIND-001/`
* `logs/ph08/evidence/ms/PH08-EV-MS-CORE-001/`
* `logs/ph08/evidence/ms/PH08-EV-MS-GATES-001/`
* `logs/ph08/evidence/ms/PH08-EV-MS-LADDER-001/`
* `logs/ph08/evidence/ms/PH08-EV-MS-PIPE-001/`
* `logs/ph08/evidence/ms/PH08-EV-MS-ML-EVAL-001/`

---

## Worked Ladder Example (Zero → Byte)

The following non-normative ladder illustrates how MS/1 composes mappings from abstract numbers to textual bytes.

1. Concepts: `C_Number65`, `C_CodePoint(U+0041)`, `C_LetterA`, `C_WordA`.
2. Context frame bindings: `withinDomain → C_Unicode15`, `C_TextEncoding → C_UTF8`.
3. Steps:
   * Interpret `C_Number65` as `C_CodePoint(U+0041)` by applying a mapping rule concept.
   * Map `C_CodePoint(U+0041)` within the Unicode/UTF-8 frame to Data `0x41`.
   * Map `C_WordA` under the same frame to Data `0x41`.

Each run records its FER/1 linkage so the byte `0x41` can be replayed.

---

## Context Evolution and Replay Discipline

### Context Revisions During Pipelines

When a branch detects missing bindings (e.g., absent `C_TextEncoding`), it MUST emit an Ambiguity artifact detailing admissible encodings. Progress resumes by creating a refined frame `CF' = CF ∪ { C_TextEncoding → C_UTF8 }` and re-running only the affected branch. Publishing the refinement to siblings is explicit and produces a new parent frame with a distinct `CID_context`.

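
The refine-and-rerun cycle described above can be sketched as follows. This is an illustrative toy under stated assumptions, not the reference runtime: frames are single-valued dictionaries of string concept IDs, compact JSON stands in for the canonical encoding, and `Result`, `CODECS`, `refine`, and `map_letter_a` are hypothetical names introduced for the example. `C_UTF16BE` is an invented second admissible encoding.

```python
import hashlib
import json
from typing import NamedTuple


class Result(NamedTuple):
    kind: str    # "Produced" or "Ambiguity"
    data: bytes  # payload bytes, or the artifact body


# Hypothetical encoding concepts mapped to Python codec names.
CODECS = {"C_UTF16BE": "utf-16-be", "C_UTF8": "utf-8"}


def context_cid(frame: dict) -> str:
    # Stand-in canonical encoding (assumption): compact JSON with sorted keys.
    encoded = json.dumps(frame, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def refine(frame: dict, key: str, value: str) -> dict:
    """Return CF' = CF ∪ {key → value} as a new frame; CF is left untouched."""
    if key in frame and frame[key] != value:
        raise ValueError(f"refinement contradicts existing binding for {key}")
    return {**frame, key: value}


def map_letter_a(frame: dict) -> Result:
    """Toy mapping step for the letter A that requires C_TextEncoding."""
    encoding = frame.get("C_TextEncoding")
    if encoding is None:
        # Enumerate admissible alternatives in a stable (sorted) order.
        artifact = {"missing": "C_TextEncoding", "admissible": sorted(CODECS)}
        body = json.dumps(artifact, sort_keys=True).encode("utf-8")
        return Result("Ambiguity", body)
    return Result("Produced", "A".encode(CODECS[encoding]))


cf = {"withinDomain": "C_Unicode15"}
paused = map_letter_a(cf)                             # no encoding bound yet
cf_refined = refine(cf, "C_TextEncoding", "C_UTF8")   # new frame, new CID
produced = map_letter_a(cf_refined)                   # rerun the affected step
```

The first call returns an Ambiguity result without emitting payload bytes. The refined frame hashes to a different `CID_context`, and the rerun produces the single byte `0x41`, matching the ladder example.
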
### Replay-First Evolution

MS/1 guards against semantic drift when the knowledge base or registry grows:

* **Frame immutability:** Every FER/1 record captures the `CID_context` that was in force at execution time. New bindings (e.g., alternative encodings) produce new frame hashes and therefore new provenance edges. Historical runs replay byte-for-byte because their frames never mutate.
* **Required bindings:** Recipes MUST register mandatory keys via `ms.requires_key`. When governance tightens (for example, by requiring a `string.encoding` key), existing frames lacking the key yield deterministic Gap/Ambiguity artifacts instead of silently adopting defaults.
* **Replay-first migration:** If an ecosystem needs the new semantics, it reruns the recorded FCS/1 recipe with a refined frame. The new FER/1 record cites the refined `CID_context`, making comparisons between legacy and refreshed runs explicit and auditable.
* **Artifact parity:** Gap and Ambiguity outputs are stored as first-class Data nodes. Audits can therefore demonstrate exactly where a richer knowledge base demanded additional context, keeping provenance complete even when execution paused.

Collectively, these rules let operators expand the registry or tighten policies without jeopardising determinism or traceability.

---

## Hashing Bits and Abstract Values

MS/1 hashes bytes, not abstract concepts. A bit becomes hashable only once it is materialised (e.g., via `C_BitAsOctet`). When domain or packing decisions remain unbound, the mapping MUST emit an Ambiguity or Gap artifact rather than guess.

---

## Risk Controls

MS/1 aligns with existing Amduat controls:

* **Semantic drift:** Context keys are concepts with versioned materialisations; frame hashes reveal mismatches.
* **Provenance loss:** Every Produced(Data) links to a FER/1 record via `hasProvenance`.
* **Undocumented mutation:** Frames are immutable; refinements create new contexts.
* **Rights obligations:** Licences and attribution obligations appear as context keys referencing policy Data. Recipes MAY decline to execute when obligations are unmet, provided the decision is deterministic.

---

## Acceptance Checks

An MS/1 implementation SHOULD ship the following self-tests:

1. **Replay test:** identical `(C, CF)` inputs produce the same CID and FER/1 log.
2. **Gap test:** missing required keys yield deterministic Gap artifacts.
3. **Ambiguity test:** admissible alternatives are enumerated stably.
4. **Scope test:** sibling branches remain unchanged unless a published frame is adopted.
5. **Fidelity test:** the declared equivalence predicate is machine-verifiable.

---

## Registry Alignment

MS/1 is production-registered in the CRS/1 concept registry. Implementations MUST treat the following handles and digests as canonical when emitting or validating graphs:

| Symbol | Registry Handle | Kind | SHA-256 Digest | Notes |
| --- | --- | --- | --- | --- |
| `MS/1` | `crs:concept/amduat.ms.surface@1` | Concept | `d140ac54367a88fa2459e3fedf0b2fde934f9ac73568f8a159e2b0c1c1828c70` | Primary mapping surface concept anchoring `MS_map` executions. |
| `ms.produces` | `crs:concept/amduat.ms.relation.produces@1` | Relation concept | `447a9f454d78f5b2ee300fe416138a864789e133b2eb9a84e32592aa9dd47965` | Annotates `Concept → Data` edges that capture Produced(Data) bytes. |
| `ms.requires_key` | `crs:concept/amduat.ms.relation.requires_key@1` | Relation concept | `a90295a8ca3006e062a5a1d5a6220330e53ba00677736b6f4a18efcec1169f6a` | Declares mandatory context key bindings for deterministic execution. |
| `ms.within_domain` | `crs:concept/amduat.ms.relation.within_domain@1` | Relation concept | `0993dff2531dd32ea32b98925bf8a5cbc88c88ed28cd3c3575e8affc84d7fa2d` | Expresses domain refinements used to disambiguate mappings. |
| `ms.fidelity_predicate` | `crs:concept/amduat.ms.relation.fidelity_predicate@1` | Relation concept | `7e159182789d89269b743c19da58d34acc3279d86650cfef83efd4f2c210c66a` | Binds outputs to the declared fidelity predicate concept. |
| `ms.byteValue` | `crs:concept/amduat.ms.relation.byte_value@1` | Relation concept | `4c43dd3a37ae695bac476e4dc62d8d8c2abda6c555668d4d863810d0053056c3` | Records byte concepts and their literal value bindings within the ladder corpus. |
| `ms.codePoint` | `crs:concept/amduat.ms.relation.code_point@1` | Relation concept | `bb7301d2fa0c5058cbd53019625bfd38ecb59e42010b64a9f0b0ada7dc494117` | Links textual concepts to Unicode code points under registered contexts. |
| `ms.symbolSequence` | `crs:concept/amduat.ms.relation.symbol_sequence@1` | Relation concept | `115035e5dc2db7e9ab2e4255f50aee56d222b04f7b7434dbbec76620cf36aa6d` | Declares ordered relations between code points/bytes when emitting strings. |
| `ms.upperCasePolicy` | `crs:concept/amduat.ms.relation.upper_case_policy@1` | Relation concept | `6f5165648c1069178c2e8a615bd24bd02a3a367df6f8e6ac21050f42eeea484c` | Encodes casing policies that constrain textual ladders. |
| `ms.titleCasePolicy` | `crs:concept/amduat.ms.relation.title_case_policy@1` | Relation concept | `7f10f109b578b69b079fc9f284ed73c69b82434b31db0a999a872d2b47958ae5` | Encodes title-casing policies enforced during deterministic mapping. |

The registry sidecar at `/amduat/registry/predicates.jsonl` mirrors these entries, allowing auditors to verify digests independent of this specification. Implementations MUST fail closed when encountering unregistered aliases for these handles.

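
The fail-closed rule can be illustrated with a small lookup over sidecar entries. This is a sketch under assumptions: the schema of `predicates.jsonl` is not given in this specification, so the `handle` and `digest` field names are hypothetical, as are `RegistryError`, `load_registry`, and `require_registered`. The two sample rows reuse handles and digests from the table above.

```python
import json

# Two rows copied from the registry table; the JSONL field names are assumed.
SIDECAR_LINES = [
    '{"handle": "crs:concept/amduat.ms.surface@1", '
    '"digest": "d140ac54367a88fa2459e3fedf0b2fde934f9ac73568f8a159e2b0c1c1828c70"}',
    '{"handle": "crs:concept/amduat.ms.relation.produces@1", '
    '"digest": "447a9f454d78f5b2ee300fe416138a864789e133b2eb9a84e32592aa9dd47965"}',
]


class RegistryError(LookupError):
    """Raised instead of guessing when a handle or digest cannot be confirmed."""


def load_registry(lines):
    """Parse JSONL lines into a handle -> digest mapping, skipping blank lines."""
    registry = {}
    for line in lines:
        if line.strip():
            entry = json.loads(line)
            registry[entry["handle"]] = entry["digest"]
    return registry


def require_registered(registry, handle, digest):
    """Fail closed: return the handle only if it is registered with this digest."""
    expected = registry.get(handle)
    if expected is None:
        raise RegistryError(f"unregistered handle: {handle}")
    if expected != digest:
        raise RegistryError(f"digest mismatch for {handle}")
    return handle


registry = load_registry(SIDECAR_LINES)
surface = require_registered(
    registry,
    "crs:concept/amduat.ms.surface@1",
    "d140ac54367a88fa2459e3fedf0b2fde934f9ac73568f8a159e2b0c1c1828c70",
)
```

An alias such as `crs:concept/amduat.ms.surface` (without the `@1` suffix) is not in the mapping, so the lookup raises rather than resolving to the nearest registered handle.
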

---

## Phase 07 Cross-Stream Integration

Phase 07 workstreams bind their certificates, receipts, facts, overlays, and domain dossiers to MS/1 contexts through a shared TypeTag grid:

* `/amduat/phases/ph07/notes/PH07-CIL-XMAP-001.md` (`CIL-X1`) and `vectors/ph07/cil/PH07-CIL-XMAP-001.json` declare the authoritative cross-stream mapping entries and TypeTag ranges that all semantic surfaces must cite before reaching Draft Ready gates. The JSON registry is logged under `PH07-EV-CIL-ATTEST-001` so downstream profiles can dereference the same identifiers without ambiguity.
* `/amduat/phases/ph07/notes/PH07-CROSS-CHECK-001.md` records the harness that enforces these bindings. The checklist stored in `logs/ph07/evidence/cil/PH07-EV-CIL-ATTEST-001/PH07-CROSS-TV-001.md` validates every FER/FCT/OI manifest against the refreshed XMAP IDs before the `*-5` ledger exits (`FER-5`, `FCT-5`, `OI-5`) can advance.
* `/amduat/phases/ph07/notes/PH07-FER-SCHEMA-001.md` (ledger `FER-1`, evidence `PH07-EV-FER-RUN-001`) pins `xmap_refs[]`, certificate anchors, and replay bundles to the XMAP rows so every receipt explicitly declares which MS/1 context governed execution.
* `/amduat/phases/ph07/notes/PH07-FCT-SCHEMA-001.md` (ledger `FCT-1`, evidence `PH07-EV-FCT-FACTS-001`) introduces the `trust_spine` block that carries the required `xmap_ref`, `receipt_refs[]`, and `anchor_certs[]`, keeping fact acceptance policies tied to the same mapping IDs.
* `/amduat/phases/ph07/notes/PH07-OI-HARNESS-001.md` (ledger `OI-5`, evidence `PH07-EV-OI-VIEWS-001`) proves overlay descriptors and workspace views publish `mapping_profile.xmap_refs[]` plus TGK edge expectations aligned with `XMAP-CIL-CUSTOM-V1`.
* `/amduat/phases/ph07/notes/PH07-DOM-HARNESS-001.md` (ledger `DOM-5`, evidence `PH07-EV-DOM-APPS-001`) extends the same guarantees to domain pilot dossiers, confirming their overlays, facts, and receipts cite approved TypeTags and TGK edge sets.

MS/1 implementations participating in PH07 MUST therefore emit the same mapping handles and evidence references recorded in these notes so provenance can be validated across CIL/FER/FCT/OI boundaries.

---

## Phase 05 Textual Ladder Scope and Evidence

Phase 05 extends MS/1 from abstract examples to a production ladder that binds textual concepts to deterministic bytes. The ladder introduces:

* A `byte` concept family with 256 child concepts (`byte/0x00` … `byte/0xFF`) whose CRS/1 relations emit single-octet Data nodes and optionally record the radix context via `ms.byteValue` predicates.
* UTF-8 code point concepts that sequence byte concepts through `ms.symbolSequence` relations while asserting `ms.within_domain` bindings to Unicode 15 and UTF-8 domain concepts.
* Casing policy concepts (e.g., `allCaps`, `titleCase`) that require the `string.casingPolicy` context key and advertise fidelity predicates so downstream tooling can reject ambiguous casing decisions via `ms.upperCasePolicy` and `ms.titleCasePolicy` handles.
* Dictionary word concepts implemented as FCS/1 recipes that concatenate code points into byte strings, emitting FER/1 receipts that cite the governing casing policy and context frame.

Predicate registries gain canonical handles (`ms.byteValue`, `ms.codePoint`, `ms.symbolSequence`, `ms.produces`, `ms.requires_key`, `ms.upperCasePolicy`, `ms.titleCasePolicy`) so tooling can resolve ladder edges without bespoke enumerations. Missing casing bindings MUST yield Ambiguity artifacts (`ERR_MS_POLICY_MISSING`/`ERR_MS_AMBIGUITY`); absent code points produce Gap artifacts (`ERR_MS_GAP`); undeclared predicates raise `ERR_MS_UNDECLARED_PREDICATE`.

Evidence for the ladder is captured under `/amduat/vectors/ph05/ms1-text/manifest.json` and the reserved `/amduat/logs/ph05/evidence/ms1/` surfaces:

* `PH05-EV-MS-CTX-001/` — CTX/1 context frames with predicate registry vectors (domain separator `AMDUAT:CTX\0`, reject `ERR_CTX_UNKNOWN_KEY`).
* `PH05-EV-MS-LADDER-001/` — Dual-run FER/1-backed positive ladders for bytes, code points, and dictionary outputs with SA/PA guardrails.
* `PH05-EV-MS-ERRORS-001/` — Gap/ambiguity/missing policy/undeclared predicate receipts mapped to ADR-006.

---

## Phase Alignment and Readiness

* **PH07 (Semantic Surfaces):** MS/1 is authoritative for every PH07 semantic surface (CIL/FER/FCT/OI) and must be cited wherever mapping semantics are referenced. PH07 workstreams MAY extend context vocabularies or registry rows so long as they remain MS/1-conformant; no runtime commits are expected in this phase beyond harness stubs and governance evidence.
* **PH08 (Reference Implementation):** The reference `ms_map` runtime, parity harnesses, and subsystem wiring reside in the `KRN-2` campaign slotted for Phase 08. PH08 SHALL use this specification verbatim, emitting FER/1 records and exercising the acceptance checks in §Acceptance Checks.
* **Downstream pilots (PH09+):** Reproducible ML, CI/CD, data mesh, and notebook pilots inherit the MS/1 contract by invoking the PH08 reference surface. Those phases MAY add domain-specific context keys or ladders, but they MUST register them through CRS/1 and capture evidence via the surfaces reserved in §Phase 05 Textual Ladder Scope and Evidence.

Declaring these boundaries keeps the approval pathway clear: PH07 completes the specification, PH08 proves the executable substrate, and later phases consume it without reopening MS/1 fundamentals.

---

## Document History

* **0.1.0 (2025-11-10):** Initial draft capturing deterministic concept-to-data mapping surface.
* **0.1.1 (2025-11-11):** Added interface patterns for mapping functions and constrained auxiliary parameters.
* **0.1.2 (2025-11-12):** Formalized registry concept handles and digests for MS/1.
* **0.1.3 (2025-11-14):** Documented PH05 textual ladder scope, predicate handles, and evidence surfaces.
* **0.1.4 (2025-11-15):** Aligned MS/1 evidence references with CTX/1, ladder, and error reservations.
* **0.1.5 (2025-11-18):** Added ms.* predicate handles, CTX/1 domain separator, ADR-006 error mapping, and dual-run evidence guardrails.
* **0.1.6 (2025-11-18):** Added context-evolution guardrails and PH07→PH08 readiness boundaries.
* **0.1.7 (2025-11-19):** Clarified MIME/media-type relationship to MS context frames and execution bindings.
* **0.2.0 (2025-11-19):** Standardized metadata, headings, and evidence alignment per DOCSTD.
* **0.2.1 (2025-11-19):** Synced registry handles and documented PH07 XMAP/harness integration.
* **0.2.2 (2025-11-30):** Added DOCID header, pinned upstream draft dependencies, and referenced PH08 MS/1 evidence bundles.