amduat/docs/spec-clarifications.md
2026-01-17 21:34:24 +01:00

Spec Clarifications

This document records implementation-level clarifications for draft Tier-1 specs. These notes do not change the specs; they document concrete choices for the implementation in this repository.

Glossary and Abbreviations

| Term | Meaning |
|---|---|
| CURRENT | Effective index state obtained by replaying the log up to a LogPosition on top of a snapshot. |
| LogPosition | Inclusive logseq upper bound for replay (not a byte offset). |
| SnapshotID | Opaque uint64_t identifier persisted in SNAPSHOT_ANCHOR. |
| Segment seal | Log record admitting a segment via (segment_id, segment_hash). |
| Segment hash | SHA-256 over exact on-disk segment bytes, including footer. |
| Tombstone | Visibility policy record applied during replay. |
| Tombstone lift | Cancels a specific tombstone record for the same artifact. |
| Exec plan | Serialized plan format; executor out of scope for core library. |

Snapshot and Log Identity (ASL/STORE-INDEX + ASL/LOG)

Decision:

  • LogPosition is the log sequence number (logseq), not a byte offset.
  • SnapshotID is an opaque store-assigned uint64_t, persisted in the SNAPSHOT_ANCHOR payload.

Implications:

  • IndexState = (SnapshotID, LogPosition) uses an inclusive logseq upper bound: replaying log[0:LogPosition] applies every record with logseq <= LogPosition.
  • The log's record envelope already carries logseq, so snapshot anchors use the anchor record's logseq as the snapshot log position.
  • If no snapshot exists, treat SnapshotID as 0 and LogPosition as 0.
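
The inclusive bound can be illustrated with a minimal Python sketch. It is not the repository's implementation: LogRecord, records_for_replay, and NO_SNAPSHOT_STATE are hypothetical names, and the record is reduced to its logseq plus an opaque payload.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical stand-in for a log record; only logseq matters here."""
    logseq: int      # sequence number carried by the record envelope
    payload: bytes   # opaque in this sketch


# IndexState to assume when no snapshot exists: (SnapshotID, LogPosition).
NO_SNAPSHOT_STATE: Tuple[int, int] = (0, 0)


def records_for_replay(log: Iterable[LogRecord], log_position: int) -> List[LogRecord]:
    """Select the records covered by an inclusive logseq upper bound."""
    # "<=" rather than "<": LogPosition itself is part of the replayed prefix.
    return [rec for rec in log if rec.logseq <= log_position]
```

With sample records numbered 1 through 5, a LogPosition of 3 selects the records with logseq 1, 2, and 3. The starting value of logseq in the sample is arbitrary.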

Rationale:

  • ASL/LOG/1 defines replay and visibility in terms of logseq ordering.
  • ASL/TGK-EXEC-PLAN/1 orders results by logseq and uses log_prefix bounds.
  • ASL/STORE-INDEX/1 defines LogPosition as a monotonic integer position and replay as log[0:LogPosition], which maps directly to logseq.

References:

  • tier1/asl-log-1.md
  • tier1/enc-asl-log-1.md
  • tier1/asl-store-index-1.md
  • tier1/asl-tgk-execution-plan-1.md
  • tier1/enc-asl-tgk-exec-plan-1.md

Index Segment Identity and Seals (ASL/STORE-INDEX + ASL/LOG)

Decision:

  • segment_id is a store-local, monotonic uint64_t assigned when a segment is created (before writing records), and persisted by naming/metadata outside the segment file.
  • segment_hash is SHA-256 over the exact segment file bytes as stored on disk, including header, records, digest bytes, extents, and footer.

Implications:

  • The seal record (SEGMENT_SEAL) binds a specific persisted segment file to the log via (segment_id, segment_hash). Hashing occurs after the footer is written so the hash commits to seal metadata (CRC, seal snapshot, timestamp).
  • Replay uses segment_id to locate the segment file and verifies segment_hash before admitting it as visible.
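
As a rough illustration of the verification step, the following Python sketch hashes a whole file with SHA-256 and compares it against the value a seal record would carry. The function names and the demo file are hypothetical, and mapping segment_id to a path is store-specific and left out.

```python
import hashlib
import tempfile
from pathlib import Path


def segment_hash(path: Path) -> bytes:
    """SHA-256 over the exact bytes of the file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.digest()


def admit_segment(path: Path, sealed_hash: bytes) -> bool:
    """Admit the segment only if its on-disk bytes match the sealed hash."""
    return segment_hash(path) == sealed_hash


# Demo with a throwaway file standing in for a sealed segment.
with tempfile.TemporaryDirectory() as tmp:
    seg = Path(tmp) / "segment-demo.bin"        # hypothetical file name
    seg.write_bytes(b"header|records|footer")   # placeholder content, not the real layout
    sealed = segment_hash(seg)                  # value the seal record would carry
    ok_before = admit_segment(seg, sealed)
    seg.write_bytes(b"header|records|footeX")   # simulate on-disk corruption
    ok_after = admit_segment(seg, sealed)
```

Because the hash covers the footer as well, any change to the file after sealing, including to seal metadata, causes the check to fail.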

Rationale:

  • ENC/ASL-LOG/1 defines the seal payload as a segment ID plus a hash of the segment bytes; the log is the visibility gate, so the hash must cover the complete on-disk segment.
  • ENC/ASL-CORE-INDEX/1 does not embed a segment ID, so the ID must be an external, store-managed handle (filename or catalog entry).

References:

  • tier1/asl-log-1.md
  • tier1/enc-asl-log-1.md
  • tier1/asl-store-index-1.md
  • tier1/enc-asl-core-index-1.md

Tombstone Semantics (ASL/LOG + ASL/STORE-INDEX)

Decision:

  • scope and reason_code are opaque metadata and do not affect shadowing.
  • A TOMBSTONE_LIFT cancels only the referenced tombstone record for the same artifact; other tombstones for that artifact remain effective.

Across snapshots:

  • Snapshots capture the effective tombstone state as of the snapshot's logseq.
  • Lifts recorded after a snapshot become effective only when replay reaches their logseq.
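
The lift rule can be sketched in Python as follows. This is an illustrative model only: the Record shape is hypothetical, and it assumes a lift identifies its target tombstone by that tombstone's logseq, which this document does not specify.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified tombstone-related log record."""
    logseq: int
    kind: str                            # "TOMBSTONE" or "TOMBSTONE_LIFT"
    artifact: str
    target_logseq: Optional[int] = None  # lifts only: tombstone being cancelled (assumed)


def effective_tombstones(log: Iterable[Record], log_position: int) -> Dict[str, Set[int]]:
    """Replay up to log_position (inclusive); return active tombstones per artifact."""
    active: Dict[str, Set[int]] = {}
    for rec in sorted(log, key=lambda r: r.logseq):
        if rec.logseq > log_position:
            break  # later records are not yet effective
        if rec.kind == "TOMBSTONE":
            active.setdefault(rec.artifact, set()).add(rec.logseq)
        elif rec.kind == "TOMBSTONE_LIFT":
            # Cancels only the referenced tombstone, and only for this artifact.
            active.get(rec.artifact, set()).discard(rec.target_logseq)
    return {artifact: seqs for artifact, seqs in active.items() if seqs}


def is_shadowed(log: Iterable[Record], artifact: str, log_position: int) -> bool:
    """An artifact stays shadowed while any of its tombstones remains active."""
    return artifact in effective_tombstones(log, log_position)
```

In this model an artifact with two tombstones remains shadowed after one lift, and a lift at a logseq beyond the replay position has no effect yet.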

References:

  • tier1/asl-log-1.md
  • tier1/asl-store-index-1.md

Federation Fields (ENC/ASL-CORE-INDEX)

Decision:

  • Version 3 encoders must always emit federation fields in both headers and records. They are required, not optional, in v3.
  • Decoders accept legacy versions that omit federation fields and apply default local/internal values as defined in the encoding spec.

References:

  • tier1/enc-asl-core-index-1.md

Execution Plan Scope (ASL/TGK-EXEC-PLAN + ENC/ASL-TGK-EXEC-PLAN)

Decision:

  • The implementation treats execution plans as a serialized/transport artifact and semantic contract only. A plan executor is out of scope for the core library.

References:

  • tier1/asl-tgk-execution-plan-1.md
  • tier1/enc-asl-tgk-exec-plan-1.md

Publish/Unpublish Scope (ASL/LOG + ASL/SYSTEM)

Decision:

  • ARTIFACT_PUBLISH and ARTIFACT_UNPUBLISH are treated as reserved record types in the core replay path and do not alter ASL index state.
  • Publishing is modeled as moving artifacts and index segments between stores, advancing the destination store's snapshot/log.

Implications:

  • Core replay ignores publish/unpublish records.
  • Any visibility policy tied to publishing is handled by higher-level tooling or system-layer orchestration, not ASL/1 core semantics.
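
A minimal Python sketch of a replay loop that skips the reserved record types is shown below. The handler table and the record representation as (type, payload) pairs are assumptions made for illustration, not the repository's API.

```python
from typing import Callable, Dict, Iterable, Tuple

# Record types that core replay reserves but does not act on.
RESERVED_RECORD_TYPES = frozenset({"ARTIFACT_PUBLISH", "ARTIFACT_UNPUBLISH"})

Handler = Callable[[dict, bytes], None]


def replay(records: Iterable[Tuple[str, bytes]],
           handlers: Dict[str, Handler],
           state: dict) -> dict:
    """Apply each record to state, ignoring publish/unpublish records."""
    for record_type, payload in records:
        if record_type in RESERVED_RECORD_TYPES:
            continue  # reserved: no effect on index state
        # Unknown types raise KeyError in this sketch; real handling may differ.
        handlers[record_type](state, payload)
    return state
```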

References:

  • tier1/asl-log-1.md
  • tier1/asl-system-1.md

Receipt Output Reference Fallback (FER/1 + PEL/1)

Decision:

  • When a PEL run produces no output artifacts (e.g., a failed execution), the receipt's output_ref falls back to the stored PEL result artifact reference.

Implications:

  • Receipts can be emitted for both successful and failed runs using a single canonical output reference.
  • Callers using amduat_fer1_receipt_from_pel_run should expect output_ref to match result_ref when output_refs_len == 0.
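
The fallback can be sketched in Python as below. The helper name is hypothetical and does not mirror the signature of amduat_fer1_receipt_from_pel_run. How output_ref is chosen when outputs do exist is not covered in this document, so the sketch takes the first output purely as a placeholder.

```python
from typing import Sequence


def choose_output_ref(output_refs: Sequence[bytes], result_ref: bytes) -> bytes:
    """Pick the receipt's output_ref, falling back to the PEL result reference."""
    if len(output_refs) == 0:
        # No output artifacts (for example, a failed run): use the result artifact.
        return result_ref
    # Placeholder assumption: the selection rule for non-empty outputs is not
    # stated in this document.
    return output_refs[0]
```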

References:

  • tier1/enc-fer1-receipt-1.md
  • tier1/srs.md

FER/1 v1.1 Determinism and Validation (FER/1 + SRS)

Decision:

  • run_id is a deterministic hash over stable inputs only and MUST exclude timestamps, logs, or mutable metadata.
  • Typed logs are optional; if present they MUST be ordered and size-bounded.
  • Limits are a single required record when the limits TLV is present.
  • Executor set verification is strict when a policy-provided set exists.

Concrete rules:

  • run_id = H("AMDUAT:RUN\0" || EncRef(function) || EncRef(input_manifest) || EncRef(environment) || EncRef(executor_fingerprint)), where EncRef is ENC/ASL1-CORE canonical bytes and executor_fingerprint is the canonical digest reference. No other fields are included.
  • logs (if present): order by (kind, cid) byte-lexicographically; cap at 64 entries; cap the aggregate size of the referenced log payloads at 1 MiB of capsule bytes. Reject out-of-order or oversized sets.
  • limits (if present): exactly one TLV containing all numeric fields (cpu_ms, wall_ms, max_rss_kib, io_reads, io_writes) with fixed units. Reject missing or duplicate fields.
  • Executor set validation:
    • If an expected executor set is supplied by policy, receipt executor_refs MUST match it exactly (same members, same byte order, no extras).
    • Otherwise, validate strict ordering and uniqueness, and require parity_len == executor_refs_len with aligned ordering and output_ref equality for every parity entry.
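
The run_id and logs rules above can be sketched in Python under several assumptions. SHA-256 stands in for H, which is not named here. The four EncRef values are passed in as already-encoded bytes, because the ENC/ASL1-CORE encoding is out of scope for the sketch. Each log entry is modelled as a hypothetical (kind, cid, size) tuple, and ordering is interpreted as comparing kind first and then cid.

```python
import hashlib
from typing import Sequence, Tuple

RUN_ID_DOMAIN = b"AMDUAT:RUN\0"   # domain-separation prefix from the rule above
MAX_LOG_ENTRIES = 64
MAX_LOG_BYTES = 1024 * 1024       # 1 MiB aggregate of capsule bytes

# Hypothetical entry shape: (kind bytes, cid bytes, capsule size in bytes).
LogEntry = Tuple[bytes, bytes, int]


def compute_run_id(function_ref: bytes, input_manifest_ref: bytes,
                   environment_ref: bytes, executor_fingerprint_ref: bytes) -> bytes:
    """Hash the domain prefix and the four canonical references, nothing else."""
    h = hashlib.sha256()  # assumption: stand-in for H
    h.update(RUN_ID_DOMAIN)
    for enc in (function_ref, input_manifest_ref,
                environment_ref, executor_fingerprint_ref):
        h.update(enc)
    return h.digest()


def validate_logs(logs: Sequence[LogEntry]) -> None:
    """Raise ValueError for out-of-order or oversized log sets."""
    if len(logs) > MAX_LOG_ENTRIES:
        raise ValueError("too many log entries")
    total = 0
    previous = None
    for kind, cid, size in logs:
        key = (kind, cid)  # tuples of bytes compare byte-lexicographically
        # Whether duplicate keys are allowed is not stated; only descending
        # pairs are rejected in this sketch.
        if previous is not None and key < previous:
            raise ValueError("log entries out of order")
        previous = key
        total += size
    if total > MAX_LOG_BYTES:
        raise ValueError("log payloads exceed 1 MiB")
```

Because no timestamp or other mutable field enters the hash, repeated calls with the same four references yield the same run_id.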

References:

  • tier1/srs.md
  • tier1/enc-fer1-receipt-1.md