
# Spec Clarifications

This document records implementation-level clarifications for draft Tier-1
specs. These notes do not change the specs; they document concrete choices for
the implementation in this repository.

## Glossary and Abbreviations

| Term | Meaning |
| --- | --- |
| CURRENT | Effective index state after replaying the log up to a LogPosition on top of a snapshot. |
| LogPosition | Inclusive `logseq` upper bound for replay (not a byte offset). |
| SnapshotID | Opaque `uint64_t` identifier persisted in `SNAPSHOT_ANCHOR`. |
| Segment seal | Log record admitting a segment via `(segment_id, segment_hash)`. |
| Segment hash | SHA-256 over exact on-disk segment bytes, including footer. |
| Tombstone | Visibility policy record applied during replay. |
| Tombstone lift | Cancels a specific tombstone record for the same artifact. |
| Exec plan | Serialized plan format; executor out of scope for core library. |

## Snapshot and Log Identity (ASL/STORE-INDEX + ASL/LOG)

Decision:
- LogPosition is the log sequence number (`logseq`), not a byte offset.
- SnapshotID is an opaque store-assigned `uint64_t`, persisted in the
  `SNAPSHOT_ANCHOR` payload.

Implications:
- `IndexState = (SnapshotID, LogPosition)` uses an inclusive logseq upper bound
  when replaying `log[0:LogPosition]`.
- The log's record envelope already carries `logseq`, so snapshot anchors use
  the anchor record's `logseq` as the snapshot log position.
- If no snapshot exists, treat SnapshotID as `0` and LogPosition as `0`.
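
The identity rules above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the `LogRecord` and `IndexState` types, the `snapshot_id` field on the record, and both function names are assumptions made for the example.

```python
from typing import Iterable, List, NamedTuple, Optional


class LogRecord(NamedTuple):
    logseq: int            # carried by the record envelope
    record_type: str       # hypothetical label, e.g. "SNAPSHOT_ANCHOR"
    snapshot_id: int = 0   # hypothetical field; only meaningful for anchors


class IndexState(NamedTuple):
    snapshot_id: int       # opaque, store-assigned uint64
    log_position: int      # inclusive logseq upper bound, not a byte offset


# With no snapshot, both components are treated as 0.
NO_SNAPSHOT = IndexState(snapshot_id=0, log_position=0)


def state_from_anchor(anchor: Optional[LogRecord]) -> IndexState:
    """Derive the snapshot's IndexState from its anchor record."""
    if anchor is None:
        return NO_SNAPSHOT
    # The anchor record's own logseq is the snapshot log position.
    return IndexState(anchor.snapshot_id, anchor.logseq)


def log_prefix(log: Iterable[LogRecord], log_position: int) -> List[LogRecord]:
    """Select the records covered by a LogPosition (upper bound inclusive)."""
    return [rec for rec in log if rec.logseq <= log_position]
```

A record whose `logseq` equals the LogPosition is part of the replayed prefix; a record one position later is not.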

Rationale:
- `ASL/LOG/1` defines replay and visibility in terms of `logseq` ordering.
- `ASL/TGK-EXEC-PLAN/1` orders results by `logseq` and uses `log_prefix` bounds.
- `ASL/STORE-INDEX/1` defines LogPosition as a monotonic integer position and
  replay as `log[0:LogPosition]`, which maps directly to logseq.

References:
- `tier1/asl-log-1.md`
- `tier1/enc-asl-log-1.md`
- `tier1/asl-store-index-1.md`
- `tier1/asl-tgk-execution-plan-1.md`
- `tier1/enc-asl-tgk-exec-plan-1.md`

## Index Segment Identity and Seals (ASL/STORE-INDEX + ASL/LOG)

Decision:
- `segment_id` is a store-local, monotonic `uint64_t` assigned when a segment is
  created (before writing records), and persisted by naming/metadata outside the
  segment file.
- `segment_hash` is SHA-256 over the exact segment file bytes as stored on disk,
  including header, records, digest bytes, extents, and footer.

Implications:
- The seal record (`SEGMENT_SEAL`) binds a specific persisted segment file to the
  log via `(segment_id, segment_hash)`. Hashing occurs after the footer is
  written so the hash commits to seal metadata (CRC, seal snapshot, timestamp).
- Replay uses `segment_id` to locate the segment file and verifies
  `segment_hash` before admitting it as visible.
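
A minimal sketch of the seal check, assuming the caller supplies a `read_segment` callback that maps a `segment_id` to the stored file bytes. The callback, the `SegmentSeal` type, and the function names are hypothetical; only the use of SHA-256 over the complete file bytes comes from the decision above.

```python
import hashlib
from typing import Callable, NamedTuple


class SegmentSeal(NamedTuple):
    segment_id: int      # store-local uint64 handle, kept outside the file
    segment_hash: bytes  # SHA-256 digest of the sealed file bytes


def compute_segment_hash(segment_bytes: bytes) -> bytes:
    """Hash the exact on-disk bytes: header, records, digests, extents, footer."""
    return hashlib.sha256(segment_bytes).digest()


def admit_segment(seal: SegmentSeal,
                  read_segment: Callable[[int], bytes]) -> bytes:
    """Locate the segment by ID and verify its hash before admitting it."""
    data = read_segment(seal.segment_id)
    if compute_segment_hash(data) != seal.segment_hash:
        raise ValueError(f"segment {seal.segment_id}: hash mismatch")
    return data
```

Because the hash covers the footer, any byte appended or changed after sealing causes `admit_segment` to reject the file.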

Rationale:
- `ENC/ASL-LOG/1` defines the seal payload as a segment ID plus a hash of the
  segment bytes; the log is the visibility gate, so the hash must cover the
  complete on-disk segment.
- `ENC/ASL-CORE-INDEX/1` does not embed a segment ID, so the ID must be an
  external, store-managed handle (filename or catalog entry).

References:
- `tier1/asl-log-1.md`
- `tier1/enc-asl-log-1.md`
- `tier1/asl-store-index-1.md`
- `tier1/enc-asl-core-index-1.md`

## Tombstone Semantics (ASL/LOG + ASL/STORE-INDEX)

Decision:
- `scope` and `reason_code` are opaque metadata and do not affect shadowing.
- A `TOMBSTONE_LIFT` cancels only the referenced tombstone record for the same
  artifact; other tombstones for that artifact remain effective.

Across snapshots:
- Snapshots capture the effective tombstone state as of the snapshot's `logseq`.
- Lifts recorded after a snapshot become effective only when replay reaches
  their `logseq`.
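
The lift rule can be illustrated with a small replay sketch. It assumes, purely for the example, that a lift names the tombstone it cancels by that tombstone's `logseq`; the actual reference form is defined by the encoding spec, and all type and function names here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set


class TombstoneRecord(NamedTuple):
    logseq: int
    kind: str               # "TOMBSTONE" or "TOMBSTONE_LIFT"
    artifact: str           # hypothetical artifact key
    target_logseq: int = 0  # lifts only: assumed reference to one tombstone


def effective_tombstones(records: Iterable[TombstoneRecord],
                         log_position: int) -> Dict[str, Set[int]]:
    """Return the tombstones still in effect per artifact at log_position."""
    active: Dict[str, Set[int]] = defaultdict(set)
    for rec in sorted(records, key=lambda r: r.logseq):
        if rec.logseq > log_position:
            break  # later records are not effective yet
        if rec.kind == "TOMBSTONE":
            active[rec.artifact].add(rec.logseq)
        elif rec.kind == "TOMBSTONE_LIFT":
            # Cancels only the referenced tombstone of the same artifact.
            active[rec.artifact].discard(rec.target_logseq)
    return {artifact: live for artifact, live in active.items() if live}


def is_shadowed(active: Dict[str, Set[int]], artifact: str) -> bool:
    """An artifact stays shadowed while any of its tombstones remains."""
    return artifact in active
```

With two tombstones on one artifact and a lift for only the first, the artifact remains shadowed until replay reaches a lift for the second.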

References:
- `tier1/asl-log-1.md`
- `tier1/asl-store-index-1.md`

## Federation Fields (ENC/ASL-CORE-INDEX)

Decision:
- Version 3 encoders must always emit federation fields in both headers and
  records. They are required, not optional, in v3.
- Decoders accept legacy versions that omit federation fields and apply default
  local/internal values as defined in the encoding spec.

References:
- `tier1/enc-asl-core-index-1.md`

## Execution Plan Scope (ASL/TGK-EXEC-PLAN + ENC/ASL-TGK-EXEC-PLAN)

Decision:
- The implementation treats execution plans as a serialized/transport artifact
  and semantic contract only. A plan executor is out of scope for the core
  library.

References:
- `tier1/asl-tgk-execution-plan-1.md`
- `tier1/enc-asl-tgk-exec-plan-1.md`

## Publish/Unpublish Scope (ASL/LOG + ASL/SYSTEM)

Decision:
- `ARTIFACT_PUBLISH` and `ARTIFACT_UNPUBLISH` are treated as reserved record
  types in the core replay path and do not alter ASL index state.
- Publishing is modeled as moving artifacts and index segments between stores,
  advancing the destination store's snapshot/log.

Implications:
- Core replay ignores publish/unpublish records.
- Any visibility policy tied to publishing is handled by higher-level tooling
  or system-layer orchestration, not ASL/1 core semantics.
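
As a rough sketch of what "reserved" means in the replay path, the loop below skips the two record types without touching index state. The `(record_type, payload)` tuple shape and the handler table are assumptions for illustration, as is the choice to fail on record types that have no handler.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Reserved in core replay: recognized, but never applied to index state.
RESERVED_RECORD_TYPES = frozenset({"ARTIFACT_PUBLISH", "ARTIFACT_UNPUBLISH"})


def replay_core(records: Iterable[Tuple[str, object]],
                handlers: Dict[str, Callable[[object], None]]) -> List[str]:
    """Apply records through handlers; return the reserved types skipped."""
    skipped: List[str] = []
    for record_type, payload in records:
        if record_type in RESERVED_RECORD_TYPES:
            skipped.append(record_type)  # left for higher-level tooling
            continue
        # In this sketch an unknown record type raises KeyError.
        handlers[record_type](payload)
    return skipped
```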

References:
- `tier1/asl-log-1.md`
- `tier1/asl-system-1.md`

## Receipt Output Reference Fallback (FER/1 + PEL/1)

Decision:
- When a PEL run produces no output artifacts (e.g., a failed execution), the
  receipt's `output_ref` falls back to the stored PEL result artifact reference.

Implications:
- Receipts can be emitted for both successful and failed runs using a single
  canonical output reference.
- Callers using `amduat_fer1_receipt_from_pel_run` should expect `output_ref`
  to match `result_ref` when `output_refs_len == 0`.
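
The fallback can be sketched as below. This does not reproduce `amduat_fer1_receipt_from_pel_run`; the function name is hypothetical, and the choice of the first entry when outputs do exist is an assumption, since this note only specifies the empty case.

```python
from typing import Sequence


def select_output_ref(output_refs: Sequence[bytes], result_ref: bytes) -> bytes:
    """Pick the receipt's output_ref, falling back to the PEL result ref."""
    if len(output_refs) == 0:
        # No output artifacts (e.g. a failed run): use the stored result.
        return result_ref
    # Assumption for this sketch only: the first output is canonical.
    return output_refs[0]
```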

References:
- `tier1/enc-fer1-receipt-1.md`
- `tier1/srs.md`

## FER/1 v1.1 Determinism and Validation (FER/1 + SRS)

Decision:
- `run_id` is a deterministic hash over stable inputs only and MUST exclude
  timestamps, logs, and mutable metadata.
- Typed logs are optional; if present they MUST be ordered and size-bounded.
- Limits are optional; when the `limits` TLV is present, it is a single record
  and all of its fields are required.
- Executor set verification is strict when a policy-provided set exists.

Concrete rules:
- `run_id = H("AMDUAT:RUN\0" || EncRef(function) || EncRef(input_manifest) ||
  EncRef(environment) || EncRef(executor_fingerprint))`, where `EncRef` is
  `ENC/ASL1-CORE` canonical bytes and `executor_fingerprint` is the canonical
  digest reference. No other fields are included.
- `logs` (if present): order by `(kind, cid)` byte-lexicographically; cap to
  64 entries; cap the referenced log payloads at 1 MiB of capsule bytes in
  aggregate. Reject out-of-order or oversized sets.
- `limits` (if present): exactly one TLV containing all numeric fields
  (`cpu_ms`, `wall_ms`, `max_rss_kib`, `io_reads`, `io_writes`) with fixed
  units. Reject missing or duplicate fields.
- Executor set validation:
  - If an expected executor set is supplied by policy, the receipt's
    `executor_refs` MUST match it exactly (same members, same byte order, no
    extras).
  - Otherwise, validate strict ordering and uniqueness, and require
    `parity_len == executor_refs_len` with aligned ordering and `output_ref`
    equality for every parity entry.
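
The first three rules above can be sketched as follows. Several things here are assumptions for illustration: `H` is taken to be SHA-256, the arguments to `compute_run_id` are taken to be already-encoded `EncRef` bytes, log entries are modeled as `(kind, cid, capsule_size)` tuples, limits are modeled as `(name, value)` pairs rather than a decoded TLV, and duplicate log keys are tolerated because the rule above does not address them.

```python
import hashlib
from typing import Dict, Sequence, Tuple

RUN_ID_DOMAIN = b"AMDUAT:RUN\0"
MAX_LOG_ENTRIES = 64
MAX_LOG_BYTES = 1024 * 1024  # 1 MiB aggregate of capsule bytes
LIMIT_FIELDS = ("cpu_ms", "wall_ms", "max_rss_kib", "io_reads", "io_writes")


def compute_run_id(function: bytes, input_manifest: bytes,
                   environment: bytes, executor_fingerprint: bytes) -> bytes:
    """Hash the domain tag and the four canonical references, nothing else."""
    h = hashlib.sha256()  # assumption: H is SHA-256
    for part in (RUN_ID_DOMAIN, function, input_manifest,
                 environment, executor_fingerprint):
        h.update(part)
    return h.digest()


def validate_logs(logs: Sequence[Tuple[bytes, bytes, int]]) -> None:
    """Reject oversized or out-of-order log sets."""
    if len(logs) > MAX_LOG_ENTRIES:
        raise ValueError("more than 64 log entries")
    if sum(size for _, _, size in logs) > MAX_LOG_BYTES:
        raise ValueError("log payloads exceed 1 MiB in aggregate")
    keys = [(kind, cid) for kind, cid, _ in logs]
    # Python compares bytes lexicographically by unsigned byte value.
    if any(a > b for a, b in zip(keys, keys[1:])):
        raise ValueError("logs are not ordered by (kind, cid)")


def validate_limits(fields: Sequence[Tuple[str, int]]) -> Dict[str, int]:
    """Require each limit field exactly once."""
    names = [name for name, _ in fields]
    if len(set(names)) != len(names):
        raise ValueError("duplicate limit field")
    if set(names) != set(LIMIT_FIELDS):
        raise ValueError("missing or unknown limit field")
    return dict(fields)
```

Since only the domain tag and the four references feed the hash, two runs with identical inputs produce the same `run_id` regardless of timestamps or log contents.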

References:
- `tier1/srs.md`
- `tier1/enc-fer1-receipt-1.md`