Refine index specs for variable digests and visibility
parent f2225f7a73
commit c2000cb6d7
|
|
@ -87,7 +87,7 @@ For a fixed `{StoreConfig, Snapshot, LogPrefix}`, lookup results MUST be determi
|
||||||
|
|
||||||
### 3.3 StoreConfig Consistency
|
### 3.3 StoreConfig Consistency
|
||||||
|
|
||||||
All references in an index view are interpreted under a fixed StoreConfig. Implementations MAY store only the digest portion in the index when `hash_id` is fixed by StoreConfig, but the semantic key is always a full `Reference`.
|
All references in an index view are interpreted under a fixed StoreConfig. Implementations MAY store only the digest portion in the index when `hash_id` is fixed by StoreConfig, but the semantic key is always a full `Reference`. Encoding profiles MUST allow variable-length digests; the digest length MUST be either explicit in the encoding or derivable from `hash_id` and StoreConfig.
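The "derivable from `hash_id` and StoreConfig" rule can be illustrated with a small lookup table. This is only a sketch: the `HashProfile` table, the `hash_id` values `1` and `2`, and the helper names are hypothetical and do not come from the spec. A real store would take the mapping from its StoreConfig.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hash_id registry; real values would come from StoreConfig. */
typedef struct {
    uint32_t hash_id;
    uint16_t digest_len; /* digest length in bytes */
} HashProfile;

static const HashProfile kProfiles[] = {
    { 1u, 32u }, /* assumed id for a 256-bit hash */
    { 2u, 64u }, /* assumed id for a 512-bit hash */
};

/* Returns the digest length implied by hash_id, or 0 if the id is unknown. */
static uint16_t digest_len_for(uint32_t hash_id)
{
    for (size_t i = 0; i < sizeof kProfiles / sizeof kProfiles[0]; i++) {
        if (kProfiles[i].hash_id == hash_id)
            return kProfiles[i].digest_len;
    }
    return 0;
}

/* When the encoding carries an explicit length, it has to agree with the
 * length derived from hash_id; an unknown hash_id is rejected. */
static int digest_len_is_valid(uint32_t hash_id, uint16_t explicit_len)
{
    uint16_t implied = digest_len_for(hash_id);
    return implied != 0 && implied == explicit_len;
}
```

Rejecting unknown `hash_id` values keeps a reader from guessing a length it cannot verify.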
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
@ -97,6 +97,7 @@ All references in an index view are interpreted under a fixed StoreConfig. Imple
|
||||||
* Each extent references immutable bytes within a block.
|
* Each extent references immutable bytes within a block.
|
||||||
* The artifact bytes are defined by **concatenating extents in order**.
|
* The artifact bytes are defined by **concatenating extents in order**.
|
||||||
* A visible ArtifactLocation MUST be **non-empty** and MUST fully cover the artifact byte sequence with no gaps or extra bytes.
|
* A visible ArtifactLocation MUST be **non-empty** and MUST fully cover the artifact byte sequence with no gaps or extra bytes.
|
||||||
|
* Tombstone entries are visible but MUST have no ArtifactLocation; they only shadow prior entries.
|
||||||
* Extents MUST have `length > 0` and MUST reference valid byte ranges within their blocks.
|
* Extents MUST have `length > 0` and MUST reference valid byte ranges within their blocks.
|
||||||
* Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact.
|
* Extents MAY refer to the same BlockID multiple times, but the ordered concatenation MUST be deterministic and exact.
|
||||||
* An ArtifactLocation is valid only while all referenced blocks are retained.
|
* An ArtifactLocation is valid only while all referenced blocks are retained.
|
||||||
|
|
@ -108,7 +109,7 @@ All references in an index view are interpreted under a fixed StoreConfig. Imple
|
||||||
|
|
||||||
An index entry is **visible** at CURRENT if and only if:
|
An index entry is **visible** at CURRENT if and only if:
|
||||||
|
|
||||||
1. The entry is admitted in the ordered log prefix for CURRENT.
|
1. The entry is contained in a sealed segment whose seal record is admitted in the ordered log prefix for CURRENT (or anchored in the snapshot).
|
||||||
2. The referenced bytes are immutable (e.g., the underlying block is sealed by store rules).
|
2. The referenced bytes are immutable (e.g., the underlying block is sealed by store rules).
|
||||||
|
|
||||||
Visibility is binary; entries are either visible or not visible.
|
Visibility is binary; entries are either visible or not visible.
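The revised visibility rule can be read as a simple predicate. The sketch below assumes CURRENT is modeled as the highest admitted log position and that a seal log position of `0` means "no seal record in the log". `SegmentSealState` and `entry_is_visible` are illustrative names, not part of the spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory view of the seal state of one segment. */
typedef struct {
    bool     sealed;            /* segment has been sealed */
    bool     snapshot_anchored; /* seal is anchored in the snapshot */
    uint64_t seal_log_pos;      /* log position of the seal record; 0 = none (assumed) */
} SegmentSealState;

/* Condition 1: the containing segment is sealed and its seal record is
 * admitted in the log prefix for CURRENT, or anchored in the snapshot.
 * Condition 2: the referenced bytes are immutable.
 * The result is binary, matching the text. */
static bool entry_is_visible(const SegmentSealState *seg,
                             uint64_t current_log_pos,
                             bool referenced_bytes_immutable)
{
    bool seal_admitted =
        seg->snapshot_anchored ||
        (seg->seal_log_pos != 0 && seg->seal_log_pos <= current_log_pos);
    return seg->sealed && seal_admitted && referenced_bytes_immutable;
}
```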
|
||||||
|
|
@ -117,7 +118,7 @@ Visibility is binary; entries are either visible or not visible.
|
||||||
|
|
||||||
## 6. Snapshot and Log Semantics
|
## 6. Snapshot and Log Semantics
|
||||||
|
|
||||||
Snapshots provide a base mapping; the append-only log defines subsequent changes.
|
Snapshots provide a base mapping of sealed segments; the append-only log admits later segment seals and policy records that define subsequent changes.
|
||||||
|
|
||||||
The index state for a given CURRENT is defined as:
|
The index state for a given CURRENT is defined as:
|
||||||
|
|
||||||
|
|
@ -175,12 +176,12 @@ ASL/1-CORE-INDEX guarantees:
|
||||||
|
|
||||||
Conforming implementations MUST enforce:
|
Conforming implementations MUST enforce:
|
||||||
|
|
||||||
1. No visibility without a log-admitted entry.
|
1. No visibility without a sealed segment whose seal record is log-admitted (or snapshot-anchored).
|
||||||
2. No mutation of visible index entries.
|
2. No mutation of visible index entries.
|
||||||
3. Referenced bytes remain immutable for the entry’s lifetime.
|
3. Referenced bytes remain immutable for the entry’s lifetime.
|
||||||
4. Shadowing follows strict log order.
|
4. Shadowing follows strict log order.
|
||||||
5. Snapshot + log replay uniquely defines CURRENT.
|
5. Snapshot + log replay uniquely defines CURRENT.
|
||||||
6. Visible ArtifactLocations are non-empty and byte-exact (no gaps, no overrun).
|
6. Visible ArtifactLocations are non-empty and byte-exact (no gaps, no overrun), except for tombstones which have no ArtifactLocation.
|
||||||
|
|
||||||
Violation of any invariant constitutes index corruption.
|
Violation of any invariant constitutes index corruption.
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -154,7 +154,7 @@ Notes:
|
||||||
To resolve an `ArtifactKey`:
|
To resolve an `ArtifactKey`:
|
||||||
|
|
||||||
1. Identify all visible segments ≤ CURRENT.
|
1. Identify all visible segments ≤ CURRENT.
|
||||||
2. Search segments in **reverse creation order** (newest first).
|
2. Search segments in **reverse seal-log order** (highest seal log position first).
|
||||||
3. Return first matching entry.
|
3. Return first matching entry.
|
||||||
4. Respect tombstones to shadow prior entries.
|
4. Respect tombstones to shadow prior entries.
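The resolution steps above can be sketched as a reverse scan over segments ordered by seal log position. This is a minimal illustration under stated assumptions: `Entry`, `Segment`, `LookupResult`, and `resolve` are hypothetical names, segments are assumed to be pre-sorted ascending by seal log position, and a linear scan stands in for whatever per-segment search a real store uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded index entry: the key is hash_id plus digest bytes. */
typedef struct {
    uint32_t       hash_id;
    const uint8_t *digest;
    uint16_t       digest_len;
    bool           tombstone;
} Entry;

typedef struct {
    uint64_t     seal_log_pos; /* log position of the segment's seal record */
    const Entry *entries;
    size_t       entry_count;
} Segment;

typedef enum { LOOKUP_FOUND, LOOKUP_TOMBSTONED, LOOKUP_MISSING } LookupResult;

/* segs must be sorted ascending by seal_log_pos. Segments sealed after
 * `current` are skipped; the rest are searched highest position first. */
static LookupResult resolve(const Segment *segs, size_t n, uint64_t current,
                            uint32_t hash_id, const uint8_t *digest,
                            uint16_t len, const Entry **out)
{
    for (size_t i = n; i-- > 0;) {
        if (segs[i].seal_log_pos > current)
            continue; /* not visible at CURRENT */
        for (size_t j = 0; j < segs[i].entry_count; j++) {
            const Entry *e = &segs[i].entries[j];
            if (e->hash_id == hash_id && e->digest_len == len &&
                memcmp(e->digest, digest, len) == 0) {
                if (e->tombstone)
                    return LOOKUP_TOMBSTONED; /* shadows all older entries */
                *out = e;
                return LOOKUP_FOUND;
            }
        }
    }
    return LOOKUP_MISSING;
}
```

Returning as soon as a tombstone matches is what makes it shadow older entries: the scan never reaches the segments sealed before it.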
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -9,6 +9,7 @@
|
||||||
This document defines the **exact encoding of ASL index segments** and records for storage and interoperability.
|
This document defines the **exact encoding of ASL index segments** and records for storage and interoperability.
|
||||||
|
|
||||||
It translates the **semantic model of ASL/1-CORE-INDEX** and **store contracts of ASL-STORE-INDEX** into a deterministic **bytes-on-disk layout**.
|
It translates the **semantic model of ASL/1-CORE-INDEX** and **store contracts of ASL-STORE-INDEX** into a deterministic **bytes-on-disk layout**.
|
||||||
|
Variable-length digest requirements are defined in ASL/1-CORE-INDEX (`tier1/asl-core-index.md`).
|
||||||
|
|
||||||
It is intended for:
|
It is intended for:
|
||||||
|
|
||||||
|
|
@ -50,6 +51,8 @@ Each index segment file is laid out as follows:
|
||||||
+------------------+
|
+------------------+
|
||||||
| IndexRecord[] |
|
| IndexRecord[] |
|
||||||
+------------------+
|
+------------------+
|
||||||
|
| DigestBytes[] |
|
||||||
|
+------------------+
|
||||||
| ExtentRecord[] |
|
| ExtentRecord[] |
|
||||||
+------------------+
|
+------------------+
|
||||||
| SegmentFooter |
|
| SegmentFooter |
|
||||||
|
|
@ -59,6 +62,7 @@ Each index segment file is laid out as follows:
|
||||||
* **SegmentHeader**: fixed-size, mandatory
|
* **SegmentHeader**: fixed-size, mandatory
|
||||||
* **BloomFilter**: optional, opaque, segment-local
|
* **BloomFilter**: optional, opaque, segment-local
|
||||||
* **IndexRecord[]**: array of index entries
|
* **IndexRecord[]**: array of index entries
|
||||||
|
* **DigestBytes[]**: concatenated digest bytes referenced by IndexRecord
|
||||||
* **ExtentRecord[]**: concatenated extent lists referenced by IndexRecord
|
* **ExtentRecord[]**: concatenated extent lists referenced by IndexRecord
|
||||||
* **SegmentFooter**: fixed-size, mandatory
|
* **SegmentFooter**: fixed-size, mandatory
|
||||||
|
|
||||||
|
|
@ -85,6 +89,9 @@ typedef struct {
|
||||||
uint64_t bloom_offset; // File offset of bloom filter (0 if none)
|
uint64_t bloom_offset; // File offset of bloom filter (0 if none)
|
||||||
uint64_t bloom_size; // Size of bloom filter (0 if none)
|
uint64_t bloom_size; // Size of bloom filter (0 if none)
|
||||||
|
|
||||||
|
uint64_t digests_offset; // File offset of DigestBytes array
|
||||||
|
uint64_t digests_size; // Total size in bytes of DigestBytes
|
||||||
|
|
||||||
uint64_t extents_offset; // File offset of ExtentRecord array
|
uint64_t extents_offset; // File offset of ExtentRecord array
|
||||||
uint64_t extent_count; // Total number of ExtentRecord entries
|
uint64_t extent_count; // Total number of ExtentRecord entries
|
||||||
|
|
||||||
|
|
@ -97,7 +104,7 @@ typedef struct {
|
||||||
|
|
||||||
* `magic` ensures the reader validates the segment type.
|
* `magic` ensures the reader validates the segment type.
|
||||||
* `version` allows forward-compatible extension.
|
* `version` allows forward-compatible extension.
|
||||||
* `snapshot_min` / `snapshot_max` define visibility semantics.
|
* `snapshot_min` / `snapshot_max` are reserved for future use and carry no visibility semantics in version 3.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
@ -106,10 +113,10 @@ typedef struct {
|
||||||
```c
|
```c
|
||||||
#pragma pack(push,1)
|
#pragma pack(push,1)
|
||||||
typedef struct {
|
typedef struct {
|
||||||
uint64_t hash_hi; // High 64 bits of artifact hash
|
uint32_t hash_id; // Hash algorithm identifier
|
||||||
uint64_t hash_mid; // Middle 64 bits
|
uint16_t digest_len; // Digest length in bytes
|
||||||
uint64_t hash_lo; // Low 64 bits
|
uint16_t reserved0; // Reserved for alignment/future use
|
||||||
uint32_t hash_tail; // Optional tail for full hash if larger than 192 bits
|
uint64_t digest_offset; // File offset of digest bytes for this entry
|
||||||
|
|
||||||
uint64_t extents_offset; // File offset of first ExtentRecord for this entry
|
uint64_t extents_offset; // File offset of first ExtentRecord for this entry
|
||||||
uint32_t extent_count; // Number of ExtentRecord entries for this artifact
|
uint32_t extent_count; // Number of ExtentRecord entries for this artifact
|
||||||
|
|
@ -123,9 +130,10 @@ typedef struct {
|
||||||
|
|
||||||
**Notes:**
|
**Notes:**
|
||||||
|
|
||||||
* `hash_*` fields store the artifact key deterministically.
|
* `hash_id` + `digest_len` + `digest_offset` store the artifact key deterministically.
|
||||||
|
* `digest_len` MUST be explicit in the encoding and MUST match the length implied by `hash_id` and StoreConfig.
|
||||||
* `extents_offset` references the first ExtentRecord for this entry.
|
* `extents_offset` references the first ExtentRecord for this entry.
|
||||||
* `extent_count` defines how many extents to read (may be 0 for tombstones).
|
* `extent_count` defines how many extents to read (may be 0 for tombstones; see ASL/1-CORE-INDEX in `tier1/asl-core-index.md`).
|
||||||
* `total_length` is the exact artifact size in bytes.
|
* `total_length` is the exact artifact size in bytes.
|
||||||
* Flags may indicate tombstone or other special status.
|
* Flags may indicate tombstone or other special status.
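Because `digest_offset` is a file offset, a reader will usually want to confirm that the referenced range lies inside the DigestBytes region announced by the header before dereferencing it. The helper below is a sketch of that check. `digest_bytes_at` is a hypothetical name, and the segment is assumed to be fully mapped or loaded into memory at `file`.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns a pointer to the digest bytes of one IndexRecord, or NULL if the
 * reference does not fall entirely inside the DigestBytes region.
 * digests_offset / digests_size come from the SegmentHeader;
 * digest_offset / digest_len come from the IndexRecord. */
static const uint8_t *digest_bytes_at(const uint8_t *file, uint64_t file_size,
                                      uint64_t digests_offset,
                                      uint64_t digests_size,
                                      uint64_t digest_offset,
                                      uint16_t digest_len)
{
    /* The region itself must lie within the file (overflow-safe form). */
    if (digests_offset > file_size ||
        digests_size > file_size - digests_offset)
        return NULL;
    if (digest_len == 0 || digest_offset < digests_offset)
        return NULL;

    /* The record's range must lie within the region. */
    uint64_t rel = digest_offset - digests_offset;
    if (rel > digests_size || digest_len > digests_size - rel)
        return NULL;

    return file + (size_t)digest_offset;
}
```

The subtraction-based comparisons avoid computing `offset + length`, which could wrap around for hostile 64-bit inputs.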
|
||||||
|
|
||||||
|
|
@ -156,7 +164,7 @@ typedef struct {
|
||||||
```c
|
```c
|
||||||
#pragma pack(push,1)
|
#pragma pack(push,1)
|
||||||
typedef struct {
|
typedef struct {
|
||||||
uint64_t crc64; // CRC over header + records + bloom filter
|
uint64_t crc64; // CRC over header + bloom filter + index records + digest bytes + extents
|
||||||
uint64_t seal_snapshot; // Snapshot ID when segment was sealed
|
uint64_t seal_snapshot; // Snapshot ID when segment was sealed
|
||||||
uint64_t seal_time_ns; // High-resolution seal timestamp
|
uint64_t seal_time_ns; // High-resolution seal timestamp
|
||||||
} SegmentFooter;
|
} SegmentFooter;
|
||||||
|
|
@ -165,12 +173,20 @@ typedef struct {
|
||||||
|
|
||||||
**Notes:**
|
**Notes:**
|
||||||
|
|
||||||
* CRC ensures corruption detection during reads.
|
* CRC ensures corruption detection during reads, covering all segment contents except the footer.
|
||||||
* Seal information allows deterministic reconstruction of CURRENT state.
|
* Seal information allows deterministic reconstruction of CURRENT state.
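A reader-side check of the footer CRC might look like the sketch below. It assumes the `SegmentFooter` occupies the last 24 bytes of the file (three packed `uint64_t` fields, as in the struct above) and that `crc64` is stored little-endian. The diff does not name a CRC-64 variant, so CRC-64/XZ is used purely as a placeholder; `crc64_xz`, `load_le64`, and `segment_crc_ok` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FOOTER_SIZE 24u /* crc64 + seal_snapshot + seal_time_ns, packed */

/* Bitwise CRC-64/XZ (reflected polynomial). Placeholder variant only:
 * the spec text shown here does not say which CRC-64 is intended. */
static uint64_t crc64_xz(const uint8_t *p, size_t n)
{
    uint64_t crc = ~UINT64_C(0);
    for (size_t i = 0; i < n; i++) {
        crc ^= p[i];
        for (int b = 0; b < 8; b++)
            crc = (crc & 1) ? (crc >> 1) ^ UINT64_C(0xC96C5795D7870F42)
                            : (crc >> 1);
    }
    return ~crc;
}

/* Decodes a little-endian 64-bit integer independent of host byte order. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* The CRC covers every byte before the footer; the footer is excluded. */
static bool segment_crc_ok(const uint8_t *file, size_t file_size)
{
    if (file_size < FOOTER_SIZE)
        return false;
    size_t covered = file_size - FOOTER_SIZE;
    return crc64_xz(file, covered) == load_le64(file + covered);
}
```

Since the footer is outside the covered range, the seal fields can be written after the CRC is computed without invalidating it.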
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 8. Bloom Filter
|
## 8. DigestBytes
|
||||||
|
|
||||||
|
* Digest bytes are concatenated in a single byte array.
|
||||||
|
* Each IndexRecord references its digest via `digest_offset` and `digest_len`.
|
||||||
|
* The digest bytes MUST be immutable once the segment is sealed.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 9. Bloom Filter
|
||||||
|
|
||||||
* The bloom filter is **optional** and opaque to semantics.
|
* The bloom filter is **optional** and opaque to semantics.
|
||||||
* Its purpose is **lookup acceleration**.
|
* Its purpose is **lookup acceleration**.
|
||||||
|
|
@ -179,7 +195,7 @@ typedef struct {
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 9. Versioning and Compatibility
|
## 10. Versioning and Compatibility
|
||||||
|
|
||||||
* `version` field in header defines encoding.
|
* `version` field in header defines encoding.
|
||||||
* Readers must **reject unsupported versions**.
|
* Readers must **reject unsupported versions**.
|
||||||
|
|
@ -187,10 +203,11 @@ typedef struct {
|
||||||
* Existing fields must **never change meaning**.
|
* Existing fields must **never change meaning**.
|
||||||
* Version `1` implies single-extent layout (legacy).
|
* Version `1` implies single-extent layout (legacy).
|
||||||
* Version `2` introduces `ExtentRecord` lists and `extents_offset` / `extent_count`.
|
* Version `2` introduces `ExtentRecord` lists and `extents_offset` / `extent_count`.
|
||||||
|
* Version `3` introduces variable-length digest bytes with `hash_id` and `digest_offset`.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 10. Alignment and Packing
|
## 11. Alignment and Packing
|
||||||
|
|
||||||
* All structures are **packed** (no compiler padding)
|
* All structures are **packed** (no compiler padding)
|
||||||
* Multi-byte integers are **little-endian**
|
* Multi-byte integers are **little-endian**
|
||||||
|
|
@ -199,7 +216,7 @@ typedef struct {
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 11. Summary of Encoding Guarantees
|
## 12. Summary of Encoding Guarantees
|
||||||
|
|
||||||
The ENC-ASL-CORE-INDEX specification ensures:
|
The ENC-ASL-CORE-INDEX specification ensures:
|
||||||
|
|
||||||
|
|
@ -211,7 +228,7 @@ The ENC-ASL-CORE-INDEX specification ensures:
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 12. Relationship to Other Layers
|
## 13. Relationship to Other Layers
|
||||||
|
|
||||||
| Layer | Responsibility |
|
| Layer | Responsibility |
|
||||||
| ------------------ | ---------------------------------------------------------- |
|
| ------------------ | ---------------------------------------------------------- |
|
||||||
|
|
|
||||||