Polish ASL index/log specs
parent
c595e2370a
commit
20f092606d
54
AUDITS.md
|
|
@@ -22,6 +22,60 @@ Verification notes:
|
||||||
- Prefer explicit commands and paths (e.g., `ctest --test-dir build`).
|
- Prefer explicit commands and paths (e.g., `ctest --test-dir build`).
|
||||||
- If results are user-reported, note that explicitly.
|
- If results are user-reported, note that explicitly.
|
||||||
|
|
||||||
|
Note: the filesystem ASL store (`asl_store_fs`) is a legacy convenience backend
|
||||||
|
and will be considered non-conformant to ASL index/log specs once the index/log
|
||||||
|
store is introduced. Audits for ASL index/log specs target the new backend only.
|
||||||
|
|
||||||
|
## Test Expectations (Planned)
|
||||||
|
|
||||||
|
These tests are planned to validate index/log behavior once implemented:
|
||||||
|
|
||||||
|
| Area | Example tests |
|
||||||
|
| --- | --- |
|
||||||
|
| Segment encoding | Round-trip encode/decode; CRC mismatch rejection; offset bounds checks |
|
||||||
|
| Log encoding | Hash-chain validation; unknown record type skip; truncated record rejection |
|
||||||
|
| Replay | Snapshot anchor + log replay determinism; segment seal visibility |
|
||||||
|
| Tombstones | Shadowing and lift across snapshots; domain-local shadowing rules |
|
||||||
|
| Visibility | CURRENT computed by `(SnapshotID, LogPosition)`; reverse seal-log order |
|
||||||
|
| Recovery | Crash with open segment; replay yields deterministic CURRENT |
|
||||||
|
|
||||||
|
## Spec Coverage (Implementation Status)
|
||||||
|
|
||||||
|
Status legend: ✅ implemented, 🟡 planned/in-progress, ⬜ not started.
|
||||||
|
|
||||||
|
| Spec | Status | Notes |
|
||||||
|
| --- | --- | --- |
|
||||||
|
| `ASL/1-CORE` | ✅ | Core artifact semantics implemented. |
|
||||||
|
| `ASL/1-STORE` | ✅ | Store semantics + fs backend. |
|
||||||
|
| `ENC/ASL1-CORE` | ✅ | Artifact/Reference encoding. |
|
||||||
|
| `HASH/ASL1` | ✅ | Hash registry + streaming API. |
|
||||||
|
| `PEL/1-CORE` | ✅ | Core execution semantics. |
|
||||||
|
| `PEL/1-SURF` | ✅ | Store-backed surface execution. |
|
||||||
|
| `PEL/PROGRAM-DAG/1` | ✅ | DAG scheme execution. |
|
||||||
|
| `PEL/PROGRAM-DAG-DESC/1` | ✅ | Scheme descriptor codec + wiring. |
|
||||||
|
| `ENC/PEL-PROGRAM-DAG/1` | ✅ | Program encoding. |
|
||||||
|
| `ENC/PEL1-RESULT/1` | ✅ | Result encoding. |
|
||||||
|
| `PEL/TRACE-DAG/1` | ✅ | Trace semantics + wiring. |
|
||||||
|
| `ENC/PEL-TRACE-DAG/1` | ✅ | Trace encoding. |
|
||||||
|
| `TGK/1-CORE` | ✅ | Edge semantics + validation. |
|
||||||
|
| `ENC/TGK1-EDGE/1` | ✅ | Edge encoding. |
|
||||||
|
| `TGK/STORE/1` | ✅ | Store semantics. |
|
||||||
|
| `TGK/PROV/1` | ✅ | Provenance operators. |
|
||||||
|
| `OPREG/PEL1-KERNEL` | ✅ | Kernel op registry. |
|
||||||
|
| `OPREG/PEL1-KERNEL-PARAMS/1` | ✅ | Kernel params encoding. |
|
||||||
|
| `AMDUAT20-STACK-OVERVIEW` | ✅ | Orientation surface aligned. |
|
||||||
|
| `ASL/1-CORE-INDEX` | 🟡 | Spec clarified; implementation pending. |
|
||||||
|
| `ASL/STORE-INDEX/1` | 🟡 | Spec clarified; implementation pending. |
|
||||||
|
| `ENC/ASL-CORE-INDEX/1` | 🟡 | Encoding planned. |
|
||||||
|
| `ASL/LOG/1` | 🟡 | Log semantics planned. |
|
||||||
|
| `ENC/ASL-LOG/1` | 🟡 | Encoding planned. |
|
||||||
|
| `ASL/INDEX-ACCEL/1` | 🟡 | Semantics planned. |
|
||||||
|
| `ASL/INDEXES/1` | 🟡 | Taxonomy planned. |
|
||||||
|
| `ASL/TGK-EXEC-PLAN/1` | 🟡 | Encoding-only plan; executor out of scope. |
|
||||||
|
| `ENC/ASL-TGK-EXEC-PLAN/1` | 🟡 | Encoding planned. |
|
||||||
|
| `ASL/SYSTEM/1` | 🟡 | Cross-cutting view planned. |
|
||||||
|
| `TGK/1` | 🟡 | Semantic layer planned. |
|
||||||
|
|
||||||
## Audit Plan
|
## Audit Plan
|
||||||
|
|
||||||
Status legend: ✅ completed, ⬜ pending.
|
Status legend: ✅ completed, ⬜ pending.
|
||||||
|
|
|
||||||
13
README.md
|
|
@@ -65,6 +65,19 @@ status and refs are printed to stderr.
|
||||||
when not using `--output-raw`.
|
when not using `--output-raw`.
|
||||||
- The filesystem ASL store layout expects digests at least 2 bytes long
|
- The filesystem ASL store layout expects digests at least 2 bytes long
|
||||||
(two directory levels). Experimental shorter digests need a different store.
|
(two directory levels). Experimental shorter digests need a different store.
|
||||||
|
- The filesystem ASL store (`amduat-asl ... --root`) is a legacy convenience
|
||||||
|
backend; once the index/log store is introduced it will be considered
|
||||||
|
non-conformant to ASL index/log specs and should be used only for quickstart
|
||||||
|
demos.
|
||||||
|
- Compatibility & migration: existing `asl_store_fs` stores will not be
|
||||||
|
automatically upgraded. Plan to re-ingest artifacts into the index/log store
|
||||||
|
when it lands.
|
||||||
|
|
||||||
|
## Documentation
|
||||||
|
|
||||||
|
- Implementation clarifications: `docs/spec-clarifications.md`
|
||||||
|
- Spec coverage matrix: `AUDITS.md` (Spec Coverage section)
|
||||||
|
- Index/log API sketch: `docs/index-log-api-sketch.md`
|
||||||
|
|
||||||
## PEL reference
|
## PEL reference
|
||||||
|
|
||||||
|
|
|
||||||
58
docs/index-log-api-sketch.md
Normal file
|
|
@@ -0,0 +1,58 @@
|
||||||
|
# Index/Log API Surface (Sketch)
|
||||||
|
|
||||||
|
This document is a one-page sketch of the planned public API for ASL index/log
|
||||||
|
support. It is non-normative and intended to guide header design.
|
||||||
|
|
||||||
|
## ASL Index/Log Types (Draft)
|
||||||
|
|
||||||
|
```
|
||||||
|
typedef uint64_t amduat_asl_snapshot_id_t;
|
||||||
|
typedef uint64_t amduat_asl_log_position_t; // inclusive logseq upper bound
|
||||||
|
|
||||||
|
typedef struct {
|
||||||
|
amduat_asl_snapshot_id_t snapshot_id;
|
||||||
|
amduat_asl_log_position_t log_position;
|
||||||
|
} amduat_asl_index_state_t;
|
||||||
|
```
|
||||||
|
|
||||||
|
## Core Store API (Draft)
|
||||||
|
|
||||||
|
```
|
||||||
|
// Initialization and config.
|
||||||
|
bool amduat_asl_store_index_init(...);
|
||||||
|
|
||||||
|
// PUT/GET with index state reporting.
|
||||||
|
amduat_asl_store_error_t amduat_asl_store_put_indexed(
|
||||||
|
amduat_asl_store_t *store,
|
||||||
|
amduat_artifact_t artifact,
|
||||||
|
amduat_reference_t *out_ref,
|
||||||
|
amduat_asl_index_state_t *out_state);
|
||||||
|
|
||||||
|
amduat_asl_store_error_t amduat_asl_store_get_indexed(
|
||||||
|
amduat_asl_store_t *store,
|
||||||
|
amduat_reference_t ref,
|
||||||
|
amduat_asl_index_state_t state,
|
||||||
|
amduat_artifact_t *out_artifact);
|
||||||
|
```
|
||||||
|
|
||||||
|
## Index/Log Introspection (Draft)
|
||||||
|
|
||||||
|
```
|
||||||
|
// Snapshot/log position queries.
|
||||||
|
bool amduat_asl_index_current_state(amduat_asl_store_t *store,
|
||||||
|
amduat_asl_index_state_t *out_state);
|
||||||
|
|
||||||
|
// Segment and log inspection (read-only).
|
||||||
|
bool amduat_asl_log_scan(amduat_asl_store_t *store, ...);
|
||||||
|
bool amduat_asl_segment_scan(amduat_asl_store_t *store, ...);
|
||||||
|
```
|
||||||
|
|
||||||
|
## Expected Error Surfaces
|
||||||
|
|
||||||
|
* `AMDUAT_ASL_STORE_ERR_INTEGRITY` for malformed index segments or log records.
|
||||||
|
* `AMDUAT_ASL_STORE_ERR_IO` for underlying I/O faults.
|
||||||
|
* `AMDUAT_ASL_STORE_ERR_NOT_FOUND` for absent artifacts or missing segments.
|
||||||
|
* `AMDUAT_ASL_STORE_ERR_UNSUPPORTED` for unsupported encoding versions.
|
||||||
|
|
||||||
|
These are illustrative; exact error codes and mapping will be finalized when
|
||||||
|
headers are introduced.
|
||||||
|
|
@@ -4,6 +4,19 @@ This document records implementation-level clarifications for draft Tier-1
|
||||||
specs. These notes do not change the specs; they document concrete choices for
|
specs. These notes do not change the specs; they document concrete choices for
|
||||||
the implementation in this repository.
|
the implementation in this repository.
|
||||||
|
|
||||||
|
## Glossary and Abbreviations
|
||||||
|
|
||||||
|
| Term | Meaning |
|
||||||
|
| --- | --- |
|
||||||
|
| CURRENT | Effective index state after replaying a log position on a snapshot. |
|
||||||
|
| LogPosition | Inclusive `logseq` upper bound for replay (not a byte offset). |
|
||||||
|
| SnapshotID | Opaque `uint64_t` identifier persisted in `SNAPSHOT_ANCHOR`. |
|
||||||
|
| Segment seal | Log record admitting a segment via `(segment_id, segment_hash)`. |
|
||||||
|
| Segment hash | SHA-256 over exact on-disk segment bytes, including footer. |
|
||||||
|
| Tombstone | Visibility policy record applied during replay. |
|
||||||
|
| Tombstone lift | Cancels a specific tombstone record for the same artifact. |
|
||||||
|
| Exec plan | Serialized plan format; executor out of scope for core library. |
|
||||||
|
|
||||||
## Snapshot and Log Identity (ASL/STORE-INDEX + ASL/LOG)
|
## Snapshot and Log Identity (ASL/STORE-INDEX + ASL/LOG)
|
||||||
|
|
||||||
Decision:
|
Decision:
|
||||||
|
|
|
||||||
|
|
@@ -61,7 +61,7 @@ ASL/1-CORE-INDEX defines the **semantic model** for indexing artifacts:
|
||||||
|
|
||||||
* It specifies what it means to map an artifact identity to a byte location.
|
* It specifies what it means to map an artifact identity to a byte location.
|
||||||
* It defines visibility, immutability, and shadowing semantics.
|
* It defines visibility, immutability, and shadowing semantics.
|
||||||
* It ensures deterministic lookup for a fixed snapshot and log prefix.
|
* It ensures deterministic lookup for a fixed snapshot and log position.
|
||||||
|
|
||||||
### 1.2 Non-goals
|
### 1.2 Non-goals
|
||||||
|
|
||||||
|
|
@@ -84,9 +84,11 @@ ASL/1-CORE-INDEX explicitly does **not** define:
|
||||||
* **BlockID** — opaque identifier for a block.
|
* **BlockID** — opaque identifier for a block.
|
||||||
* **ArtifactExtent** — `(BlockID, offset, length)` identifying a byte slice within a block.
|
* **ArtifactExtent** — `(BlockID, offset, length)` identifying a byte slice within a block.
|
||||||
* **ArtifactLocation** — ordered list of `ArtifactExtent` values that, when concatenated, produce the artifact bytes.
|
* **ArtifactLocation** — ordered list of `ArtifactExtent` values that, when concatenated, produce the artifact bytes.
|
||||||
|
* **Degenerate store** — a store that treats each artifact as its own block,
|
||||||
|
with a single extent covering the entire blob.
|
||||||
* **Snapshot** — a checkpointed StoreSnapshot (ASL/1-STORE) used as a base state.
|
* **Snapshot** — a checkpointed StoreSnapshot (ASL/1-STORE) used as a base state.
|
||||||
* **Append-Only Log** — ordered sequence of index-visible mutations after a snapshot.
|
* **Append-Only Log** — ordered sequence of index-visible mutations after a snapshot.
|
||||||
* **CURRENT** — effective state after replaying a log prefix on a snapshot.
|
* **CURRENT** — effective state after replaying a log position on a snapshot.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
@@ -104,7 +106,7 @@ For any visible `Reference`, there is exactly one `ArtifactLocation` at a given
|
||||||
|
|
||||||
### 3.2 Determinism
|
### 3.2 Determinism
|
||||||
|
|
||||||
For a fixed `{StoreConfig, Snapshot, LogPrefix}`, lookup results MUST be deterministic. No nondeterministic input may affect index semantics.
|
For a fixed `{StoreConfig, Snapshot, LogPosition}`, lookup results MUST be deterministic. No nondeterministic input may affect index semantics.
|
||||||
|
|
||||||
### 3.3 StoreConfig Consistency
|
### 3.3 StoreConfig Consistency
|
||||||
|
|
||||||
|
|
@@ -123,6 +125,8 @@ All references in an index view are interpreted under a fixed StoreConfig. Imple
|
||||||
* Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact.
|
* Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact.
|
||||||
* An ArtifactLocation is valid only while all referenced blocks are retained.
|
* An ArtifactLocation is valid only while all referenced blocks are retained.
|
||||||
* ASL/1-CORE-INDEX does not define how blocks are allocated or sealed; it only requires that referenced bytes are immutable for the lifetime of the mapping.
|
* ASL/1-CORE-INDEX does not define how blocks are allocated or sealed; it only requires that referenced bytes are immutable for the lifetime of the mapping.
|
||||||
|
* In a degenerate store, an ArtifactLocation consists of a single extent that
|
||||||
|
spans the full blob in its dedicated block.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
@@ -130,11 +134,19 @@ All references in an index view are interpreted under a fixed StoreConfig. Imple
|
||||||
|
|
||||||
An index entry is **visible** at CURRENT if and only if:
|
An index entry is **visible** at CURRENT if and only if:
|
||||||
|
|
||||||
1. The entry is contained in a sealed segment whose seal record is admitted in the ordered log prefix for CURRENT (or anchored in the snapshot).
|
1. The entry is admitted by the store's visibility mechanism as defined in
|
||||||
2. The referenced bytes are immutable (e.g., the underlying block is sealed by store rules).
|
`ASL/STORE-INDEX/1` (e.g., via sealed segments and an append-only log), for
|
||||||
|
the given snapshot/log position.
|
||||||
|
2. The referenced bytes are immutable (e.g., the underlying block is sealed by
|
||||||
|
store rules).
|
||||||
|
|
||||||
Visibility is binary; entries are either visible or not visible.
|
Visibility is binary; entries are either visible or not visible.
|
||||||
|
|
||||||
|
**Implementation note:** A store MAY implement a degenerate visibility
|
||||||
|
mechanism (e.g., a single implicit segment that is always sealed and a trivial
|
||||||
|
log position), which is sufficient for simple filesystem-backed stores such as
|
||||||
|
`asl_store_fs`.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 6. Snapshot and Log Semantics
|
## 6. Snapshot and Log Semantics
|
||||||
|
|
@@ -144,7 +156,7 @@ Snapshots provide a base mapping of sealed segments; the append-only log admits
|
||||||
The index state for a given CURRENT is defined as:
|
The index state for a given CURRENT is defined as:
|
||||||
|
|
||||||
```
|
```
|
||||||
Index(CURRENT) = Index(snapshot) + replay(log_prefix)
|
Index(CURRENT) = Index(snapshot) + replay(log_position)
|
||||||
```
|
```
|
||||||
|
|
||||||
Replay is strictly ordered, deterministic, and idempotent. Snapshot and log entries are semantically equivalent once replayed.
|
Replay is strictly ordered, deterministic, and idempotent. Snapshot and log entries are semantically equivalent once replayed.
|
||||||
|
|
@@ -169,11 +181,14 @@ Replay is strictly ordered, deterministic, and idempotent. Snapshot and log entr
|
||||||
|
|
||||||
## 8. Tombstones (Optional)
|
## 8. Tombstones (Optional)
|
||||||
|
|
||||||
Tombstone entries MAY be used to invalidate prior mappings.
|
Tombstones MAY be used to invalidate prior mappings.
|
||||||
|
|
||||||
* A tombstone shadows earlier entries for the same Reference.
|
* A tombstone shadows earlier entries for the same Reference.
|
||||||
* Visibility rules are identical to regular entries.
|
* Tombstones are visibility policy records (see `ASL/LOG/1`) and are applied
|
||||||
* Encoding is optional and defined by ENC-ASL-CORE-INDEX if used.
|
during replay; they are not required to appear as index entries.
|
||||||
|
* If an encoding chooses to materialize tombstones in index segments, they MUST
|
||||||
|
have no `ArtifactLocation` and MUST follow the same visibility rules as other
|
||||||
|
entries.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@@ -261,6 +261,21 @@ To reconstruct CURRENT:
|
||||||
|
|
||||||
Replay MUST be deterministic.
|
Replay MUST be deterministic.
|
||||||
|
|
||||||
|
### 5.1 Example: Tombstone + Lift Across Snapshots (Informative)
|
||||||
|
|
||||||
|
Let `R` be an artifact reference. Consider the following log sequence:
|
||||||
|
|
||||||
|
1. `logseq = 10`: `SEGMENT_SEAL` admits a segment containing `R`.
|
||||||
|
2. `logseq = 20`: `TOMBSTONE(R)` shadows `R`.
|
||||||
|
3. `logseq = 30`: `SNAPSHOT_ANCHOR(snapshot_id = 7)` is recorded.
|
||||||
|
4. `logseq = 40`: `TOMBSTONE_LIFT(R, tombstone_logseq = 20)` is recorded.
|
||||||
|
|
||||||
|
Replay rules:
|
||||||
|
|
||||||
|
* CURRENT at `(snapshot_id = 7, log_position = 30)` includes the tombstone,
|
||||||
|
because the lift occurs after the snapshot.
|
||||||
|
* CURRENT at `(snapshot_id = 7, log_position = 40)` lifts the tombstone and `R`
|
||||||
|
becomes visible again (assuming no later tombstones).
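
The replay rules in this example can be sketched in C. This is a minimal, non-normative illustration: `rec_kind_t`, `rec_t`, and `ref_visible_at` are hypothetical names (not part of any planned header), the log is modeled for a single reference `R`, records are assumed to be stored in ascending `logseq` order, and a tombstone is assumed to shadow only entries admitted before it. For simplicity the sketch replays from the start of the log rather than from a materialized snapshot.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record kinds, for illustration only. */
typedef enum {
  REC_SEGMENT_SEAL,    /* admits a segment containing R */
  REC_TOMBSTONE,       /* shadows R */
  REC_SNAPSHOT_ANCHOR, /* does not change visibility in this sketch */
  REC_TOMBSTONE_LIFT   /* cancels the tombstone at target_logseq */
} rec_kind_t;

typedef struct {
  uint64_t logseq;
  rec_kind_t kind;
  uint64_t target_logseq; /* meaningful for REC_TOMBSTONE_LIFT only */
} rec_t;

/* True if a lift targeting tomb_logseq exists at or below log_position. */
static bool tombstone_lifted(const rec_t *log, size_t n,
                             uint64_t tomb_logseq, uint64_t log_position) {
  for (size_t i = 0; i < n; i++) {
    if (log[i].logseq <= log_position &&
        log[i].kind == REC_TOMBSTONE_LIFT &&
        log[i].target_logseq == tomb_logseq) {
      return true;
    }
  }
  return false;
}

/* Replays records with logseq <= log_position (inclusive upper bound)
 * and reports whether R is visible afterwards. */
bool ref_visible_at(const rec_t *log, size_t n, uint64_t log_position) {
  bool visible = false;
  for (size_t i = 0; i < n; i++) {
    if (log[i].logseq > log_position) {
      continue; /* beyond the requested log position */
    }
    if (log[i].kind == REC_SEGMENT_SEAL) {
      visible = true;
    } else if (log[i].kind == REC_TOMBSTONE &&
               !tombstone_lifted(log, n, log[i].logseq, log_position)) {
      visible = false; /* an unlifted tombstone shadows the earlier entry */
    }
  }
  return visible;
}
```

Fed the four records listed above, `ref_visible_at` reports `R` as not visible at `log_position = 30` and visible again at `log_position = 40`, matching the two replay rules.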
|
||||||
---
|
---
|
||||||
|
|
||||||
## 6. Index Interaction
|
## 6. Index Interaction
|
||||||
|
|
|
||||||
|
|
@@ -58,6 +58,10 @@ It specifies:
|
||||||
|
|
||||||
It **does not define encoding** (see `ENC/ASL-CORE-INDEX/1`) or semantic mapping (see `ASL/1-CORE-INDEX`).
|
It **does not define encoding** (see `ENC/ASL-CORE-INDEX/1`) or semantic mapping (see `ASL/1-CORE-INDEX`).
|
||||||
|
|
||||||
|
**Implementation note:** A degenerate store that skips segments/log replay (for
|
||||||
|
example, simple filesystem backends) is non-conformant to ASL/STORE-INDEX/1 and
|
||||||
|
is intended only for quickstart or legacy use.
|
||||||
|
|
||||||
**Informative references:**
|
**Informative references:**
|
||||||
|
|
||||||
* `ASL/SYSTEM/1` — unified system view (PEL/TGK/federation alignment)
|
* `ASL/SYSTEM/1` — unified system view (PEL/TGK/federation alignment)
|
||||||
|
|
@@ -183,7 +187,7 @@ get(ArtifactKey, IndexState?) -> bytes | NOT_FOUND
|
||||||
|
|
||||||
### 4.5 GET Semantics
|
### 4.5 GET Semantics
|
||||||
|
|
||||||
1. Resolve `ArtifactKey -> ArtifactLocation` using `Index(snapshot, log_prefix)`.
|
1. Resolve `ArtifactKey -> ArtifactLocation` using `Index(snapshot, log_position)`.
|
||||||
2. If no entry exists, return `NOT_FOUND`.
|
2. If no entry exists, return `NOT_FOUND`.
|
||||||
3. Otherwise, read exactly the referenced `(BlockID, offset, length)` bytes and return them verbatim.
|
3. Otherwise, read exactly the referenced `(BlockID, offset, length)` bytes and return them verbatim.
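
Step 3 of the GET semantics (reading the referenced extents verbatim and concatenating them in order) might look like the following C sketch. It assumes blocks are already loaded in memory; `extent_t`, `block_t`, and `read_location` are hypothetical names for illustration, and the index lookup of steps 1 and 2 is not shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory views of an ArtifactExtent and a block. */
typedef struct {
  uint64_t block_id;
  size_t offset;
  size_t length;
} extent_t;

typedef struct {
  uint64_t block_id;
  const uint8_t *bytes;
  size_t size;
} block_t;

static const block_t *find_block(const block_t *blocks, size_t nblocks,
                                 uint64_t block_id) {
  for (size_t i = 0; i < nblocks; i++) {
    if (blocks[i].block_id == block_id) {
      return &blocks[i];
    }
  }
  return NULL;
}

/* Copies each extent, in order, into out. Returns false if a block is
 * missing, an extent falls outside its block, or out is too small. */
bool read_location(const block_t *blocks, size_t nblocks,
                   const extent_t *extents, size_t nextents,
                   uint8_t *out, size_t out_cap, size_t *out_len) {
  size_t written = 0;
  for (size_t i = 0; i < nextents; i++) {
    const block_t *b = find_block(blocks, nblocks, extents[i].block_id);
    if (b == NULL) {
      return false;
    }
    /* Subtraction form keeps the range check free of overflow. */
    if (extents[i].offset > b->size ||
        extents[i].length > b->size - extents[i].offset) {
      return false;
    }
    if (extents[i].length > out_cap - written) {
      return false;
    }
    memcpy(out + written, b->bytes + extents[i].offset, extents[i].length);
    written += extents[i].length;
  }
  *out_len = written;
  return true;
}
```

The same `BlockID` may appear in several extents; the result depends only on the ordered extent list, which is what keeps the concatenation deterministic.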
|
||||||
|
|
||||||
|
|
@@ -235,6 +239,30 @@ Notes:
|
||||||
* Open segments need not survive snapshot.
|
* Open segments need not survive snapshot.
|
||||||
* Segments below snapshot are replay anchors.
|
* Segments below snapshot are replay anchors.
|
||||||
|
|
||||||
|
### 5.3.1 Segment State Machine (Informative)
|
||||||
|
|
||||||
|
```
|
||||||
|
OPEN -> SEALED -> VISIBLE -> GC_ELIGIBLE
|
||||||
|
```
|
||||||
|
|
||||||
|
* **OPEN:** accepting new index records; not visible.
|
||||||
|
* **SEALED:** immutable on disk; not visible until log-admitted.
|
||||||
|
* **VISIBLE:** seal record admitted by log replay; visible for lookup.
|
||||||
|
* **GC_ELIGIBLE:** no snapshots/log positions reference the segment.
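
One way to encode this state machine is an enum plus a transition check, sketched below in C. The names `seg_state_t` and `seg_transition_allowed` are hypothetical, and the sketch assumes the linear chain shown above is the only permitted path (for example, it does not model a sealed segment that is discarded without ever being admitted).

```c
#include <stdbool.h>

/* Hypothetical segment states mirroring the informative diagram. */
typedef enum {
  SEG_OPEN,
  SEG_SEALED,
  SEG_VISIBLE,
  SEG_GC_ELIGIBLE
} seg_state_t;

/* Returns true only for the single forward step allowed from each state. */
bool seg_transition_allowed(seg_state_t from, seg_state_t to) {
  switch (from) {
    case SEG_OPEN:
      return to == SEG_SEALED;
    case SEG_SEALED:
      return to == SEG_VISIBLE; /* requires an admitted seal record */
    case SEG_VISIBLE:
      return to == SEG_GC_ELIGIBLE;
    case SEG_GC_ELIGIBLE:
      return false; /* terminal in this sketch */
  }
  return false;
}
```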
|
||||||
|
|
||||||
|
### 5.4 Index/Log Bootstrap Flow (Informative)
|
||||||
|
|
||||||
|
1. **Initialize store**: load latest snapshot anchor (if any); otherwise start
|
||||||
|
with an empty index.
|
||||||
|
2. **Load sealed segments**: from snapshot metadata, locate segment files and
|
||||||
|
verify their hashes before admitting them.
|
||||||
|
3. **Replay log**: scan records with `logseq > snapshot.logseq` in order and
|
||||||
|
apply `SEGMENT_SEAL`, tombstones, and lifts.
|
||||||
|
4. **Compute CURRENT**: resolve visibility and shadowing to produce the
|
||||||
|
effective index view for queries.
|
||||||
|
|
||||||
|
This flow is deterministic and idempotent; re-running it yields the same
|
||||||
|
CURRENT state for a fixed `(SnapshotID, LogPosition)`.
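
The replay window in step 3 has an exclusive lower bound and an inclusive upper bound. A small C sketch of that selection follows; `count_replayed` is a hypothetical helper, and `snapshot_logseq` is assumed to be the `logseq` recorded with the snapshot anchor.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts the records step 3 would replay: those with
 * snapshot_logseq < logseq <= log_position. */
size_t count_replayed(const uint64_t *logseqs, size_t n,
                      uint64_t snapshot_logseq, uint64_t log_position) {
  size_t count = 0;
  for (size_t i = 0; i < n; i++) {
    if (logseqs[i] > snapshot_logseq && logseqs[i] <= log_position) {
      count++;
    }
  }
  return count;
}
```

Because both bounds are fixed by `(SnapshotID, LogPosition)`, re-running the selection over the same log yields the same set of records, which is what makes the flow repeatable.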
|
||||||
---
|
---
|
||||||
|
|
||||||
## 7. Visibility and Lookup Semantics
|
## 7. Visibility and Lookup Semantics
|
||||||
|
|
@@ -260,7 +288,7 @@ To resolve an `ArtifactKey`:
|
||||||
|
|
||||||
Determinism:
|
Determinism:
|
||||||
|
|
||||||
* Lookup results are identical across platforms given the same snapshot and log prefix.
|
* Lookup results are identical across platforms given the same snapshot and log position.
|
||||||
* Accelerations (bloom filters, sharding, SIMD) **do not alter correctness**.
|
* Accelerations (bloom filters, sharding, SIMD) **do not alter correctness**.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
@@ -394,6 +422,16 @@ Invariant: GC must never remove bytes still referenced by CURRENT or snapshots.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## 13.1 Conformance Checklist (Informative)
|
||||||
|
|
||||||
|
* Reject visibility for any entry not admitted by replay.
|
||||||
|
* Enforce immutability of sealed blocks and visible segments.
|
||||||
|
* Ensure replay is deterministic and idempotent for a fixed index state.
|
||||||
|
* Verify tombstone + lift behavior across snapshots.
|
||||||
|
* Prevent GC of segments/blocks referenced by CURRENT or snapshots.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## 14. Non-Goals
|
## 14. Non-Goals
|
||||||
|
|
||||||
* Disk-level encoding (ENC-ASL-CORE-INDEX).
|
* Disk-level encoding (ENC-ASL-CORE-INDEX).
|
||||||
|
|
|
||||||
|
|
@@ -94,12 +94,12 @@ All of these objects are addressed and stored via the same index semantics.
|
||||||
|
|
||||||
## 3. Determinism & Snapshot Boundaries
|
## 3. Determinism & Snapshot Boundaries
|
||||||
|
|
||||||
For a fixed `(SnapshotID, LogPrefix)`:
|
For a fixed `(SnapshotID, LogPosition)`:
|
||||||
|
|
||||||
* Index lookup is deterministic (ASL/1-CORE-INDEX).
|
* Index lookup is deterministic (ASL/1-CORE-INDEX).
|
||||||
* TGK traversal is deterministic when bounded by the same snapshot/log prefix.
|
* TGK traversal is deterministic when bounded by the same snapshot/log position.
|
||||||
* PEL execution is deterministic when its inputs are bounded by the same
|
* PEL execution is deterministic when its inputs are bounded by the same
|
||||||
snapshot/log prefix.
|
snapshot/log position.
|
||||||
|
|
||||||
PEL MUST read only snapshot-scoped artifacts and receipts. It MUST NOT depend
|
PEL MUST read only snapshot-scoped artifacts and receipts. It MUST NOT depend
|
||||||
on storage layout, block packing, or non-snapshot metadata.
|
on storage layout, block packing, or non-snapshot metadata.
|
||||||
|
|
@@ -144,11 +144,11 @@ receipt annotations, not by changing the execution language.
|
||||||
## 5.1 PERs and Snapshot State (Clarification)
|
## 5.1 PERs and Snapshot State (Clarification)
|
||||||
|
|
||||||
PERs are artifacts that bind deterministic execution to a specific snapshot
|
PERs are artifacts that bind deterministic execution to a specific snapshot
|
||||||
and log prefix. They do not introduce a separate storage layer:
|
and log position. They do not introduce a separate storage layer:
|
||||||
|
|
||||||
* The sequential log and snapshot define CURRENT.
|
* The sequential log and snapshot define CURRENT.
|
||||||
* A PER records that execution observed CURRENT at a specific log prefix.
|
* A PER records that execution observed CURRENT at a specific log position.
|
||||||
* Replay uses the same snapshot + log prefix to reconstruct inputs.
|
* Replay uses the same snapshot + log position to reconstruct inputs.
|
||||||
* PERs are artifacts and MAY be used as inputs, but programs embedded in
|
* PERs are artifacts and MAY be used as inputs, but programs embedded in
|
||||||
receipts MUST NOT be executed implicitly.
|
receipts MUST NOT be executed implicitly.
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@@ -76,7 +76,7 @@ Each operator includes:
|
||||||
* `op_id`: unique identifier
|
* `op_id`: unique identifier
|
||||||
* `op_type`: operator type
|
* `op_type`: operator type
|
||||||
* `inputs`: upstream operator outputs
|
* `inputs`: upstream operator outputs
|
||||||
* `snapshot`: `(SnapshotID, LogPrefix)`
|
* `snapshot`: `(SnapshotID, LogPosition)` (inclusive logseq upper bound)
|
||||||
* `constraints`: canonical filters
|
* `constraints`: canonical filters
|
||||||
* `projections`: output fields
|
* `projections`: output fields
|
||||||
* `traversal`: optional traversal parameters
|
* `traversal`: optional traversal parameters
|
||||||
|
|
@@ -122,7 +122,7 @@ Parallel execution MUST preserve this order.
|
||||||
|
|
||||||
Records are visible if and only if:
|
Records are visible if and only if:
|
||||||
|
|
||||||
* `record.logseq <= snapshot.log_prefix`
|
* `record.logseq <= snapshot.log_position`
|
||||||
* The record is not shadowed by a later tombstone
|
* The record is not shadowed by a later tombstone
|
||||||
|
|
||||||
Unknown record types MUST be skipped without breaking determinism.
|
Unknown record types MUST be skipped without breaking determinism.
|
||||||
|
|
@@ -136,7 +136,7 @@ Unknown record types MUST be skipped without breaking determinism.
|
||||||
* Inputs: sealed segments
|
* Inputs: sealed segments
|
||||||
* Outputs: raw record references
|
* Outputs: raw record references
|
||||||
* Rules:
|
* Rules:
|
||||||
* Only segments with `segment.logseq_min <= snapshot.log_prefix` are scanned.
|
* Only segments with `segment.logseq_min <= snapshot.log_position` are scanned.
|
||||||
* Advisory filters MAY be applied but MUST NOT introduce false negatives.
|
* Advisory filters MAY be applied but MUST NOT introduce false negatives.
|
||||||
* Shard routing MAY be applied prior to scan if deterministic.
|
* Shard routing MAY be applied prior to scan if deterministic.
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@@ -99,6 +99,24 @@ Each index segment file is laid out as follows:
|
||||||
+------------------+
|
+------------------+
|
||||||
```
|
```
|
||||||
|
|
||||||
|
Boxed sketch:
|
||||||
|
|
||||||
|
```
|
||||||
|
┌───────────────────────┐
|
||||||
|
│ SegmentHeader │
|
||||||
|
├───────────────────────┤
|
||||||
|
│ BloomFilter[] (opt) │
|
||||||
|
├───────────────────────┤
|
||||||
|
│ IndexRecord[] │
|
||||||
|
├───────────────────────┤
|
||||||
|
│ DigestBytes[] │
|
||||||
|
├───────────────────────┤
|
||||||
|
│ ExtentRecord[] │
|
||||||
|
├───────────────────────┤
|
||||||
|
│ SegmentFooter │
|
||||||
|
└───────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
* **SegmentHeader**: fixed-size, mandatory
|
* **SegmentHeader**: fixed-size, mandatory
|
||||||
* **BloomFilter**: optional, opaque, segment-local
|
* **BloomFilter**: optional, opaque, segment-local
|
||||||
* **IndexRecord[]**: array of index entries
|
* **IndexRecord[]**: array of index entries
|
||||||
|
|
@@ -336,6 +354,20 @@ must occur after the footer is finalized.
|
||||||
* Legacy segments without federation fields are treated as local/internal (see 3.2).
|
* Legacy segments without federation fields are treated as local/internal (see 3.2).
|
||||||
* Tombstones MUST NOT shadow artifacts from other domains; domain matching is required.
|
* Tombstones MUST NOT shadow artifacts from other domains; domain matching is required.
|
||||||
|
|
||||||
|
### 10.2 Error Handling (Normative)
|
||||||
|
|
||||||
|
Readers MUST treat malformed segment files as invalid and MUST reject them.
|
||||||
|
Examples include (non-exhaustive):
|
||||||
|
|
||||||
|
* Incorrect magic/version/header size
|
||||||
|
* Offsets not aligned or not pointing to the expected arrays
|
||||||
|
* Out-of-range lengths or overflows in size calculations
|
||||||
|
* CRC mismatch for the segment payload
|
||||||
|
* Invalid federation fields or flag bits
|
||||||
|
|
||||||
|
Rejected segments MUST NOT be admitted for lookup or replay. Implementations MAY
|
||||||
|
surface diagnostic errors, but MUST NOT attempt partial salvage.
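
A reader-side check covering the first three examples (magic/version, alignment, and overflow-safe size calculations) could be sketched in C as follows. The struct and constants here are hypothetical placeholders: the real `SegmentHeader` layout, magic value, record size, and alignment are defined elsewhere in this specification and are not reproduced. CRC and federation-field checks are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only; not the real encoding. */
#define SKETCH_MAGIC 0x58444953u
#define SKETCH_VERSION 1u
#define SKETCH_RECORD_SIZE 32u
#define SKETCH_ALIGN 8u

/* Hypothetical, simplified view of a decoded segment header. */
typedef struct {
  uint32_t magic;
  uint32_t version;
  uint64_t records_offset;
  uint64_t record_count;
} seg_header_view_t;

/* Returns true only if every check passes; any failure rejects the whole
 * segment, with no partial salvage. */
bool seg_header_ok(const seg_header_view_t *h, uint64_t file_size) {
  if (h->magic != SKETCH_MAGIC || h->version != SKETCH_VERSION) {
    return false;
  }
  if (h->records_offset % SKETCH_ALIGN != 0) {
    return false; /* misaligned array offset */
  }
  if (h->records_offset > file_size) {
    return false; /* offset points past end of file */
  }
  /* Division form avoids overflow in record_count * record size. */
  if (h->record_count >
      (file_size - h->records_offset) / SKETCH_RECORD_SIZE) {
    return false;
  }
  return true;
}
```

Comparing `record_count` against a quotient, rather than multiplying it by the record size, means a hostile count such as `UINT64_MAX` is rejected instead of wrapping around.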
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 11. Alignment and Packing
|
## 11. Alignment and Packing
|
||||||
|
|
@@ -359,6 +391,15 @@ The ENC-ASL-CORE-INDEX specification ensures:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## 12.1 Error Mapping (Informative)
|
||||||
|
|
||||||
|
Decoding failures (invalid magic/version, malformed offsets, CRC mismatch,
|
||||||
|
invalid federation fields) MUST be surfaced to callers as decode errors. The
|
||||||
|
exact error codes are implementation-specific; examples include
|
||||||
|
`ERR_ASL_INDEX_ENC_INVALID`, `ERR_ASL_INDEX_CRC_MISMATCH`, or a generic
|
||||||
|
`ERR_INTEGRITY`. Encoders/decoders MUST NOT treat malformed segments as valid
|
||||||
|
or partially recoverable.
|
||||||
|
|
||||||
## 13. Relationship to Other Layers
|
## 13. Relationship to Other Layers
|
||||||
|
|
||||||
| Layer | Responsibility |
|
| Layer | Responsibility |
|
||||||
|
|
|
||||||
|
|
@@ -69,6 +69,16 @@ It does **not** define log semantics (see `ASL/LOG/1`).
|
||||||
+----------------+
|
+----------------+
|
||||||
```
|
```
|
||||||
|
|
||||||
|
Boxed sketch:
|
||||||
|
|
||||||
|
```
|
||||||
|
┌───────────────────────┐
|
||||||
|
│ LogHeader │
|
||||||
|
├───────────────────────┤
|
||||||
|
│ LogRecord[] │
|
||||||
|
└───────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
* **LogHeader**: fixed-size, mandatory, begins file
|
* **LogHeader**: fixed-size, mandatory, begins file
|
||||||
* **LogRecord[]**: append-only entries, variable number
|
* **LogRecord[]**: append-only entries, variable number
|
||||||
|
|
||||||
|
|
@@ -123,6 +133,14 @@ record_hash = H(prev_record_hash || logseq || record_type || payload_len || payl
|
||||||
Readers MUST skip unknown `record_type` values using `payload_len` and MUST
|
Readers MUST skip unknown `record_type` values using `payload_len` and MUST
|
||||||
continue replay without failure.
|
continue replay without failure.
|
||||||
|
|
||||||
|
**Error handling (normative):**
|
||||||
|
|
||||||
|
* Malformed log headers or records (bad magic/version, truncated payload,
|
||||||
|
invalid `payload_len`, hash-chain mismatch) MUST cause the log to be rejected
|
||||||
|
for replay.
|
||||||
|
* Unknown `record_type` values are the only exception: they MUST be skipped
|
||||||
|
using `payload_len` and MUST NOT break replay determinism.
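
The difference between skipping an unknown record and rejecting a truncated one can be sketched in C. The framing used here is a deliberately simplified assumption (a little-endian `uint32_t` type followed by a little-endian `uint32_t` payload length, with types 1 to 4 treated as known); it is not the record layout defined by this specification, and hash-chain validation is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical framing for illustration: [type u32 LE][payload_len u32 LE]
 * followed by payload_len bytes. */
#define SKETCH_HDR_SIZE 8u
#define SKETCH_MAX_KNOWN_TYPE 4u

static uint32_t read_u32_le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Scans all records in buf. Returns false if any record is truncated;
 * unknown types are skipped via payload_len and only counted. */
bool scan_records(const uint8_t *buf, size_t len,
                  size_t *known, size_t *skipped) {
  size_t pos = 0;
  *known = 0;
  *skipped = 0;
  while (pos < len) {
    if (len - pos < SKETCH_HDR_SIZE) {
      return false; /* truncated record header */
    }
    uint32_t type = read_u32_le(buf + pos);
    uint32_t payload_len = read_u32_le(buf + pos + 4);
    pos += SKETCH_HDR_SIZE;
    if (payload_len > len - pos) {
      return false; /* truncated payload or invalid payload_len */
    }
    if (type >= 1 && type <= SKETCH_MAX_KNOWN_TYPE) {
      (*known)++; /* a real reader would apply the record here */
    } else {
      (*skipped)++; /* unknown type: skip, do not fail */
    }
    pos += payload_len;
  }
  return true;
}
```

Skipping depends only on `payload_len`, so two readers that know different sets of record types still advance through the same byte positions.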
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 6. Record Type IDs (v1)
|
## 6. Record Type IDs (v1)
|
||||||
|
|
@@ -245,6 +263,21 @@ typedef struct {
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## 7.1 Error Mapping (Informative)
|
||||||
|
|
||||||
|
Decoding failures (invalid magic/version, truncated records, invalid payload
|
||||||
|
lengths, hash-chain mismatches) MUST be surfaced to callers as decode errors.
|
||||||
|
The exact error codes are implementation-specific; examples include
|
||||||
|
`ERR_ASL_LOG_ENC_INVALID`, `ERR_ASL_LOG_HASH_MISMATCH`, or a generic
|
||||||
|
`ERR_INTEGRITY`. Unknown record types are not errors and MUST be skipped.
|
||||||
|
|
||||||
|
## 7.2 Conformance Checklist (Informative)
|
||||||
|
|
||||||
|
* Reject logs with invalid magic/version or truncated records.
|
||||||
|
* Enforce hash-chain validation across all records.
|
||||||
|
* Skip unknown record types using `payload_len` without breaking replay.
|
||||||
|
* Treat malformed payload lengths as fatal decode errors.
|
||||||
|
|
||||||
## 8. Relationship to Other Layers
|
## 8. Relationship to Other Layers
|
||||||
|
|
||||||
| Layer | Responsibility |
|
| Layer | Responsibility |
|
||||||
|
|
|
||||||
|
|
@@ -90,6 +90,9 @@ typedef struct {
|
||||||
} snapshot_range_t;
|
} snapshot_range_t;
|
||||||
```
|
```
|
||||||
|
|
||||||
|
**Note:** `logseq_max` corresponds to the `LogPosition` upper bound referenced
|
||||||
|
by `ASL/TGK-EXEC-PLAN/1`.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 4. Operator Parameter Union
|
## 4. Operator Parameter Union
|
||||||
|
|
|
||||||