# Spec Clarifications

This document records implementation-level clarifications for draft Tier-1 specs. These notes do not change the specs; they document concrete choices for the implementation in this repository.

## Glossary and Abbreviations

| Term | Meaning |
| --- | --- |
| CURRENT | Effective index state after replaying a log position on a snapshot. |
| LogPosition | Inclusive `logseq` upper bound for replay (not a byte offset). |
| SnapshotID | Opaque `uint64_t` identifier persisted in `SNAPSHOT_ANCHOR`. |
| Segment seal | Log record admitting a segment via `(segment_id, segment_hash)`. |
| Segment hash | SHA-256 over exact on-disk segment bytes, including footer. |
| Tombstone | Visibility policy record applied during replay. |
| Tombstone lift | Cancels a specific tombstone record for the same artifact. |
| Exec plan | Serialized plan format; executor out of scope for core library. |

## Snapshot and Log Identity (ASL/STORE-INDEX + ASL/LOG)

Decision:
- LogPosition is the log sequence number (`logseq`), not a byte offset.
- SnapshotID is an opaque store-assigned `uint64_t`, persisted in the `SNAPSHOT_ANCHOR` payload.

Implications:
- `IndexState = (SnapshotID, LogPosition)` uses an inclusive logseq upper bound when replaying `log[0:LogPosition]`.
- The log's record envelope already carries `logseq`, so snapshot anchors use the anchor record's `logseq` as the snapshot log position.
- If no snapshot exists, treat SnapshotID as `0` and LogPosition as `0`.

Rationale:
- `ASL/LOG/1` defines replay and visibility in terms of `logseq` ordering.
- `ASL/TGK-EXEC-PLAN/1` orders results by `logseq` and uses `log_prefix` bounds.
- `ASL/STORE-INDEX/1` defines LogPosition as a monotonic integer position and replay as `log[0:LogPosition]`, which maps directly to logseq.
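
The inclusive-bound rule above can be illustrated with a minimal C sketch. The type `index_state_t` and the helpers `index_state_empty`, `in_replay_prefix`, and `replay_prefix_len` are hypothetical names introduced for illustration only; they are not part of the repository API, and the sketch assumes the caller supplies `logseq` values in ascending order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type mirroring IndexState = (SnapshotID, LogPosition). */
typedef struct {
    uint64_t snapshot_id;  /* opaque store-assigned ID; 0 when no snapshot exists */
    uint64_t log_position; /* inclusive logseq upper bound for replay */
} index_state_t;

/* State to use when the store has no snapshot yet. */
static index_state_t index_state_empty(void) {
    index_state_t s = {0, 0};
    return s;
}

/* A record belongs to the replay prefix iff its logseq is at or below the bound. */
static bool in_replay_prefix(const index_state_t *state, uint64_t logseq) {
    assert(state != NULL);
    return logseq <= state->log_position;
}

/* Counts the leading records of an ascending logseq array that fall inside the prefix. */
static size_t replay_prefix_len(const index_state_t *state,
                                const uint64_t *logseqs, size_t n) {
    size_t count = 0;
    while (count < n && in_replay_prefix(state, logseqs[count])) {
        count++;
    }
    return count;
}
```

With `log_position == 3`, a record carrying `logseq == 3` is still replayed, while `logseq == 4` is not; this is the practical difference between an inclusive bound and a half-open one.
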

References:
- `tier1/asl-log-1.md`
- `tier1/enc-asl-log-1.md`
- `tier1/asl-store-index-1.md`
- `tier1/asl-tgk-execution-plan-1.md`
- `tier1/enc-asl-tgk-exec-plan-1.md`

## Index Segment Identity and Seals (ASL/STORE-INDEX + ASL/LOG)

Decision:
- `segment_id` is a store-local, monotonic `uint64_t` assigned when a segment is created (before writing records), and persisted by naming/metadata outside the segment file.
- `segment_hash` is SHA-256 over the exact segment file bytes as stored on disk, including header, records, digest bytes, extents, and footer.

Implications:
- The seal record (`SEGMENT_SEAL`) binds a specific persisted segment file to the log via `(segment_id, segment_hash)`. Hashing occurs after the footer is written so the hash commits to seal metadata (CRC, seal snapshot, timestamp).
- Replay uses `segment_id` to locate the segment file and verifies `segment_hash` before admitting it as visible.

Rationale:
- `ENC/ASL-LOG/1` defines the seal payload as a segment ID plus a hash of the segment bytes; the log is the visibility gate, so the hash must cover the complete on-disk segment.
- `ENC/ASL-CORE-INDEX/1` does not embed a segment ID, so the ID must be an external, store-managed handle (filename or catalog entry).

References:
- `tier1/asl-log-1.md`
- `tier1/enc-asl-log-1.md`
- `tier1/asl-store-index-1.md`
- `tier1/enc-asl-core-index-1.md`

## Tombstone Semantics (ASL/LOG + ASL/STORE-INDEX)

Decision:
- `scope` and `reason_code` are opaque metadata and do not affect shadowing.
- A `TOMBSTONE_LIFT` cancels only the referenced tombstone record for the same artifact; other tombstones for that artifact remain effective.

Across snapshots:
- Snapshots capture the effective tombstone state as of the snapshot's `logseq`.
- Lifts recorded after a snapshot become effective only when replay reaches their `logseq`.
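
The lift rule above can be sketched in C as follows. This is an illustrative model, not the `ENC/ASL-LOG/1` record layout: the `rec_t` structure, the use of a plain `uint64_t` as the artifact key, and the assumption that a lift names its target tombstone by that tombstone's `logseq` are all hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record model; not the on-disk encoding. */
typedef enum { REC_TOMBSTONE, REC_TOMBSTONE_LIFT } rec_kind_t;

typedef struct {
    rec_kind_t kind;
    uint64_t logseq;        /* position of this record in the log */
    uint64_t artifact;      /* stand-in for the artifact key */
    uint64_t target_logseq; /* lifts only: logseq of the tombstone being cancelled */
} rec_t;

/* True if some lift at or before `upto` cancels tombstone `t`. */
static bool tombstone_lifted(const rec_t *recs, size_t n,
                             const rec_t *t, uint64_t upto) {
    for (size_t i = 0; i < n; i++) {
        const rec_t *r = &recs[i];
        if (r->kind == REC_TOMBSTONE_LIFT && r->logseq <= upto &&
            r->artifact == t->artifact && r->target_logseq == t->logseq) {
            return true;
        }
    }
    return false;
}

/* True if at least one un-lifted tombstone shadows `artifact` at replay bound `upto`. */
static bool artifact_shadowed(const rec_t *recs, size_t n,
                              uint64_t artifact, uint64_t upto) {
    assert(recs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const rec_t *r = &recs[i];
        if (r->kind == REC_TOMBSTONE && r->logseq <= upto &&
            r->artifact == artifact && !tombstone_lifted(recs, n, r, upto)) {
            return true;
        }
    }
    return false;
}
```

If an artifact carries two tombstones and only one has been lifted at the replay bound, `artifact_shadowed` still returns `true`, matching the rule that other tombstones remain effective. The quadratic scan keeps the sketch short; a real replay would more likely track open tombstones incrementally.
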

References:
- `tier1/asl-log-1.md`
- `tier1/asl-store-index-1.md`

## Federation Fields (ENC/ASL-CORE-INDEX)

Decision:
- Version 3 encoders must always emit federation fields in both headers and records. They are required, not optional, in v3.
- Decoders accept legacy versions that omit federation fields and apply default local/internal values as defined in the encoding spec.

References:
- `tier1/enc-asl-core-index-1.md`

## Execution Plan Scope (ASL/TGK-EXEC-PLAN + ENC/ASL-TGK-EXEC-PLAN)

Decision:
- The implementation treats execution plans as a serialized/transport artifact and semantic contract only. A plan executor is out of scope for the core library.

References:
- `tier1/asl-tgk-execution-plan-1.md`
- `tier1/enc-asl-tgk-exec-plan-1.md`

## Publish/Unpublish Scope (ASL/LOG + ASL/SYSTEM)

Decision:
- `ARTIFACT_PUBLISH` and `ARTIFACT_UNPUBLISH` are treated as reserved record types in the core replay path and do not alter ASL index state.
- Publishing is modeled as moving artifacts and index segments between stores, advancing the destination store's snapshot/log.

Implications:
- Core replay ignores publish/unpublish records.
- Any visibility policy tied to publishing is handled by higher-level tooling or system-layer orchestration, not ASL/1 core semantics.

References:
- `tier1/asl-log-1.md`
- `tier1/asl-system-1.md`

## Receipt Output Reference Fallback (FER/1 + PEL/1)

Decision:
- When a PEL run produces no output artifacts (e.g. failed execution), the receipt's `output_ref` falls back to the stored PEL result artifact reference.

Implications:
- Receipts can be emitted for both successful and failed runs using a single canonical output reference.
- Callers using `amduat_fer1_receipt_from_pel_run` should expect `output_ref` to match `result_ref` when `output_refs_len == 0`.
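
A caller-side check of the fallback expectation might look like the sketch below. The types `ref_t` and `receipt_view_t` and the helper `receipt_fallback_ok` are hypothetical stand-ins, not the real receipt structures; only the field names `output_ref`, `result_ref`, and `output_refs_len` come from the text above. The non-empty case is deliberately left unconstrained, since these notes specify only the empty case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an encoded artifact reference. */
typedef struct {
    const unsigned char *bytes;
    size_t len;
} ref_t;

/* Hypothetical view of the receipt fields discussed above. */
typedef struct {
    ref_t output_ref;
    ref_t result_ref;
    size_t output_refs_len;
} receipt_view_t;

/* Byte-wise equality of two references. */
static bool ref_equal(ref_t a, ref_t b) {
    return a.len == b.len && (a.len == 0 || memcmp(a.bytes, b.bytes, a.len) == 0);
}

/* Checks the fallback expectation; receipts with outputs are not constrained here. */
static bool receipt_fallback_ok(const receipt_view_t *r) {
    assert(r != NULL);
    if (r->output_refs_len != 0) {
        return true;
    }
    return ref_equal(r->output_ref, r->result_ref);
}
```

A check of this shape lets callers treat failed and successful runs uniformly while still detecting a receipt whose `output_ref` does not follow the fallback.
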

References:
- `tier1/enc-fer1-receipt-1.md`
- `tier1/srs.md`

## FER/1 v1.1 Determinism and Validation (FER/1 + SRS)

Decision:
- `run_id` is a deterministic hash over stable inputs only and MUST exclude timestamps, logs, or mutable metadata.
- Typed logs are optional; if present they MUST be ordered and size-bounded.
- Limits are a single required record when the `limits` TLV is present.
- Executor set verification is strict when a policy-provided set exists.

Concrete rules:
- `run_id = H("AMDUAT:RUN\0" || EncRef(function) || EncRef(input_manifest) || EncRef(environment) || EncRef(executor_fingerprint))`, where `EncRef` is `ENC/ASL1-CORE` canonical bytes and `executor_fingerprint` is the canonical digest reference. No other fields are included.
- `logs` (if present): order by `(kind, cid)` byte-lexicographically; cap to 64 entries; cap total log payload references to 1 MiB aggregate of capsule bytes. Reject out-of-order or oversized sets.
- `limits` (if present): exactly one TLV containing all numeric fields (`cpu_ms`, `wall_ms`, `max_rss_kib`, `io_reads`, `io_writes`) with fixed units. Reject missing or duplicate fields.
- Executor set validation:
  - If an expected executor set is supplied by policy, receipt `executor_refs` MUST match it exactly (same members, byte-order, no extras).
  - Otherwise, validate strict ordering and uniqueness, and require `parity_len == executor_refs_len` with aligned ordering and `output_ref` equality for every parity entry.

References:
- `tier1/srs.md`
- `tier1/enc-fer1-receipt-1.md`

## FER/1 v1.1 Encoding Notes (Implementation)

Decision:
- The v1.1 encoder appends a TLV extension block after the v1 base layout.
- Unknown or duplicate TLV tags are rejected during decode.

TLV tags (implementation):
- `0x0001` executor fingerprint reference (encoded reference bytes).
- `0x0002` run id (`U32` length + bytes).
- `0x0003` logs (`U32` count; per entry: `U32 kind`, encoded ref, `U32` sha256 length + bytes).
  Entries must be ordered by `(kind, ref)` byte order.
- `0x0004` limits (`U64` cpu_ms, `U64` wall_ms, `U64` max_rss_kib, `U64` io_reads, `U64` io_writes).
- `0x0005` determinism (`U8` level, `U32` seed_len + seed bytes).
- `0x0006` signature (opaque bytes).

Helper usage:
- `amduat_fer1_receipt_from_pel_run_v1_1` emits v1.1 receipts and uses the same `output_ref` fallback as v1: when no outputs exist, `output_ref` is the stored PEL result reference.

References:
- `include/amduat/enc/fer1_receipt.h`
- `src/near_core/enc/fer1_receipt.c`
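
The decode rule from the encoding notes (unknown or duplicate TLV tags are rejected) can be sketched in C as below. The helper `tlv_tags_valid` is a hypothetical name, and the sketch assumes tags are 16-bit values that have already been extracted from the extension block; the actual TLV framing (tag and length widths, byte order) is not specified here and is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Tag range listed in the notes above; the 16-bit tag width is an assumption. */
enum { TLV_TAG_MIN = 0x0001, TLV_TAG_MAX = 0x0006 };

/* Returns true only if every tag is known and appears at most once. */
static bool tlv_tags_valid(const uint16_t *tags, size_t n) {
    uint32_t seen = 0; /* bit i is set once tag i has been observed */
    assert(tags != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tags[i] < TLV_TAG_MIN || tags[i] > TLV_TAG_MAX) {
            return false; /* unknown tag */
        }
        uint32_t bit = UINT32_C(1) << tags[i];
        if (seen & bit) {
            return false; /* duplicate tag */
        }
        seen |= bit;
    }
    return true;
}
```

A bitmask suffices because the known tag space is small and contiguous; a larger or sparse tag space would call for a different set representation.
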