Merge federation addendum into core index encoding

Carl Niklas Rydberg 2026-01-17 07:13:06 +01:00
parent c2000cb6d7
commit 063a1835b9
2 changed files with 85 additions and 3 deletions


@ -2,6 +2,8 @@
Base spec: `tier1/enc-asl-core-index.md`
Status: Merged into `tier1/enc-asl-core-index.md`

---

## 1. Purpose


@ -10,6 +10,7 @@ This document defines the **exact encoding of ASL index segments** and records f
It translates the **semantic model of ASL/1-CORE-INDEX** and **store contracts of ASL-STORE-INDEX** into a deterministic **bytes-on-disk layout**.
Variable-length digest requirements are defined in ASL/1-CORE-INDEX (`tier1/asl-core-index.md`).
This document incorporates the federation encoding addendum.

It is intended for:
@ -23,6 +24,7 @@ It does **not** define:
* Index semantics (see ASL/1-CORE-INDEX)
* Store lifecycle behavior (see ASL-STORE-INDEX)
* Acceleration semantics (see ASL/INDEX-ACCEL/1)
* Federation semantics (see federation/domain policy layers)

---
@ -34,6 +36,7 @@ It does **not** define:
4. **Packed structures**; no compiler-introduced padding
5. **Forward compatibility** via version field
6. **CRC or checksum protection** for corruption detection
7. **Federation metadata** embedded in index records for deterministic cross-domain replay

All multi-byte integers are little-endian unless explicitly noted.
@ -68,6 +71,44 @@ Each index segment file is laid out as follows:
Offsets in the header define locations of Bloom filter and index records.

### 3.1 Fixed Constants and Sizes

**Magic bytes (SegmentHeader.magic):** `ASLIDX03`

* ASCII bytes: `0x41 0x53 0x4c 0x49 0x44 0x58 0x30 0x33`
* Little-endian uint64 value: `0x33305844494c5341`

**Current encoding version:** `3`

**Fixed struct sizes (bytes):**

* `SegmentHeader`: 112
* `IndexRecord`: 48
* `ExtentRecord`: 16
* `SegmentFooter`: 24

**Section packing (no gaps):**

* `records_offset = header_size + bloom_size`
* `digests_offset = records_offset + (record_count * sizeof(IndexRecord))`
* `extents_offset = digests_offset + digests_size`
* `SegmentFooter` starts at `extents_offset + (extent_count * sizeof(ExtentRecord))`

All offsets MUST be file-relative, 8-byte aligned, and point to their respective arrays exactly as above.
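
The packing rules above can be sketched as a small layout calculator. This is an illustration under stated assumptions, not part of the spec: `asl_layout`, `asl_load_le64`, and `asl_compute_layout` are hypothetical names, and overflow checking is left out to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed sizes from section 3.1. */
#define ASL_HEADER_SIZE        UINT64_C(112)
#define ASL_INDEX_RECORD_SIZE  UINT64_C(48)
#define ASL_EXTENT_RECORD_SIZE UINT64_C(16)

/* "ASLIDX03" interpreted as a little-endian uint64. */
#define ASL_MAGIC_LE UINT64_C(0x33305844494c5341)

/* Hypothetical helper type; not defined by the spec. */
typedef struct {
    uint64_t records_offset;
    uint64_t digests_offset;
    uint64_t extents_offset;
    uint64_t footer_offset;
} asl_layout;

/* Decode 8 bytes as little-endian, independent of host byte order. */
static uint64_t asl_load_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Compute the gap-free section layout. Returns false when one of the
 * header offsets would not be 8-byte aligned. Overflow checks omitted. */
static bool asl_compute_layout(uint64_t bloom_size, uint64_t record_count,
                               uint64_t digests_size, uint64_t extent_count,
                               asl_layout *out)
{
    assert(out != NULL);
    out->records_offset = ASL_HEADER_SIZE + bloom_size;
    out->digests_offset = out->records_offset
                        + record_count * ASL_INDEX_RECORD_SIZE;
    out->extents_offset = out->digests_offset + digests_size;
    out->footer_offset  = out->extents_offset
                        + extent_count * ASL_EXTENT_RECORD_SIZE;
    return (out->records_offset | out->digests_offset
            | out->extents_offset) % 8 == 0;
}
```

A reader could compare the computed offsets against the values stored in the header and reject the segment on any mismatch.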
### 3.2 Federation Defaults

This encoding integrates federation metadata into segments and records.
Legacy segments without federation fields MUST be treated as:

* `segment_domain_id = local`
* `segment_visibility = internal`
* `domain_id = local`
* `visibility = internal`
* `has_cross_domain_source = 0`
* `cross_domain_source = 0`
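
As a rough sketch of how a reader might apply these defaults: the struct below is a hypothetical in-memory view, not an on-disk layout, and the numeric value of `local` is assumed to be supplied by the store configuration, since this document does not define it.

```c
#include <assert.h>
#include <stdint.h>

enum { ASL_VIS_INTERNAL = 0, ASL_VIS_PUBLISHED = 1 };

/* Hypothetical in-memory view of the federation fields. */
typedef struct {
    uint32_t segment_domain_id;
    uint8_t  segment_visibility;
    uint32_t domain_id;
    uint8_t  visibility;
    uint8_t  has_cross_domain_source;
    uint32_t cross_domain_source;
} asl_federation_view;

/* Defaults for a legacy segment. local_domain_id is an assumption:
 * the caller passes whatever identifier the store uses for "local". */
static asl_federation_view asl_legacy_federation_defaults(uint32_t local_domain_id)
{
    asl_federation_view v;
    v.segment_domain_id       = local_domain_id;
    v.segment_visibility      = ASL_VIS_INTERNAL;
    v.domain_id               = local_domain_id;
    v.visibility              = ASL_VIS_INTERNAL;
    v.has_cross_domain_source = 0;
    v.cross_domain_source     = 0;
    return v;
}
```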
---

## 4. SegmentHeader
@ -95,7 +136,12 @@ typedef struct {
uint64_t extents_offset; // File offset of ExtentRecord array
uint64_t extent_count; // Total number of ExtentRecord entries
- uint64_t flags; // Reserved for future use
+ uint32_t segment_domain_id; // Domain owning this segment
+ uint8_t segment_visibility; // 0 = internal, 1 = published
+ uint8_t federation_version; // 0 if unused
+ uint16_t reserved0; // Reserved (must be 0)
+ uint64_t flags; // Segment flags (must be 0 in version 3)
} SegmentHeader;
#pragma pack(pop)
```
@ -105,6 +151,12 @@ typedef struct {
* `magic` ensures the reader validates the segment type.
* `version` allows forward-compatible extension.
* `snapshot_min` / `snapshot_max` are reserved for future use and carry no visibility semantics in version 3.
* `segment_domain_id` identifies the owning domain for all records in this segment.
* `segment_visibility` MUST be the maximum visibility of all records in the segment.
* `federation_version` MUST be `0` unless a future federation encoding version is defined.
* `reserved0` MUST be `0`.
* `header_size` MUST be `112`.
* `flags` MUST be `0`. Readers MUST reject non-zero values.
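
The header constraints in these notes could be checked along the following lines. This is a sketch under assumptions: `asl_header_view` is a hypothetical decoded copy of the relevant fields (the width chosen for `header_size` is a guess), the status codes are invented for illustration, and an empty segment is assumed to have maximum visibility `0`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    ASL_HDR_OK = 0,
    ASL_HDR_BAD_SIZE,
    ASL_HDR_BAD_VISIBILITY,
    ASL_HDR_BAD_FEDERATION_VERSION,
    ASL_HDR_BAD_RESERVED,
    ASL_HDR_BAD_FLAGS
} asl_hdr_status;

/* Hypothetical decoded view of the header fields named in the notes. */
typedef struct {
    uint64_t header_size;
    uint8_t  segment_visibility;
    uint8_t  federation_version;
    uint16_t reserved0;
    uint64_t flags;
} asl_header_view;

/* Maximum visibility across all records; 0 for an empty list (assumed). */
static uint8_t asl_max_visibility(const uint8_t *vis, size_t n)
{
    uint8_t max = 0;
    for (size_t i = 0; i < n; i++)
        if (vis[i] > max)
            max = vis[i];
    return max;
}

/* Apply the MUST rules from the SegmentHeader notes, in order. */
static asl_hdr_status asl_check_header(const asl_header_view *h,
                                       uint8_t max_record_visibility)
{
    if (h->header_size != 112)
        return ASL_HDR_BAD_SIZE;
    if (h->segment_visibility != max_record_visibility)
        return ASL_HDR_BAD_VISIBILITY;
    if (h->federation_version != 0)
        return ASL_HDR_BAD_FEDERATION_VERSION;
    if (h->reserved0 != 0)
        return ASL_HDR_BAD_RESERVED;
    if (h->flags != 0)
        return ASL_HDR_BAD_FLAGS;
    return ASL_HDR_OK;
}
```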
---
@ -122,8 +174,13 @@ typedef struct {
uint32_t extent_count; // Number of ExtentRecord entries for this artifact
uint32_t total_length; // Total artifact length in bytes
- uint32_t flags; // Optional flags (tombstone, reserved, etc.)
- uint32_t reserved; // Reserved for alignment/future use
+ uint32_t domain_id; // Domain identifier for this artifact
+ uint8_t visibility; // 0 = internal, 1 = published
+ uint8_t has_cross_domain_source; // 0 or 1
+ uint16_t reserved1; // Reserved (must be 0)
+ uint32_t cross_domain_source; // Source domain if imported (valid if has_cross_domain_source=1)
+ uint32_t flags; // Optional flags (tombstone, reserved, etc.)
} IndexRecord;
#pragma pack(pop)
```
@ -132,10 +189,26 @@ typedef struct {
* `hash_id` + `digest_len` + `digest_offset` store the artifact key deterministically.
* `digest_len` MUST be explicit in the encoding and MUST match the length implied by `hash_id` and StoreConfig.
* `digest_offset` MUST be within `[digests_offset, digests_offset + digests_size)`.
* `extents_offset` references the first ExtentRecord for this entry.
* `extent_count` defines how many extents to read (may be 0 for tombstones; see ASL/1-CORE-INDEX in `tier1/asl-core-index.md`).
* `total_length` is the exact artifact size in bytes.
* Flags may indicate tombstone or other special status.
* `domain_id` MUST be present and stable across replay.
* `visibility` MUST be `0` or `1`.
* `has_cross_domain_source` MUST be `0` or `1`.
* `cross_domain_source` MUST be `0` when `has_cross_domain_source=0`.
* `reserved0` and `reserved1` MUST be `0`.
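
A minimal sketch of the per-record checks listed above, assuming the fields have already been decoded into a hypothetical `asl_record_view` (only the fields needed here are included; the full `IndexRecord` layout is defined by the struct in this section, not by this example).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded subset of IndexRecord used for validation. */
typedef struct {
    uint64_t digest_offset;
    uint8_t  visibility;
    uint8_t  has_cross_domain_source;
    uint16_t reserved1;
    uint32_t cross_domain_source;
} asl_record_view;

/* Returns true when the record satisfies the range and federation rules.
 * digests_offset and digests_size come from the segment header. */
static bool asl_record_fields_valid(const asl_record_view *r,
                                    uint64_t digests_offset,
                                    uint64_t digests_size)
{
    /* digest_offset must lie in [digests_offset, digests_offset + digests_size). */
    if (r->digest_offset < digests_offset)
        return false;
    if (r->digest_offset - digests_offset >= digests_size)
        return false;
    if (r->visibility > 1 || r->has_cross_domain_source > 1)
        return false;
    if (r->has_cross_domain_source == 0 && r->cross_domain_source != 0)
        return false;
    return r->reserved1 == 0;
}
```

Writing the range test as a subtraction avoids overflow in `digests_offset + digests_size` for hostile inputs.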
### 5.1 IndexRecord Flags
```
IDX_FLAG_TOMBSTONE = 0x00000001
```
* If `IDX_FLAG_TOMBSTONE` is set, then `extent_count`, `total_length`, and `extents_offset` MUST be `0`.
* All other bits are reserved and MUST be `0`. Readers MUST reject unknown flag bits.
* Tombstones MUST retain valid `domain_id` and `visibility` to ensure domain-local shadowing.
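
The flag rules can be expressed as a single predicate. The sketch below is illustrative: `IDX_FLAG_KNOWN_MASK` and `asl_record_flags_valid` are hypothetical names, and the final check for non-tombstone records relies on the extent rule stated elsewhere in this document (`extent_count` MUST be > 0 for visible entries).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IDX_FLAG_TOMBSTONE  UINT32_C(0x00000001)
/* Hypothetical mask of every flag bit defined in version 3. */
#define IDX_FLAG_KNOWN_MASK IDX_FLAG_TOMBSTONE

/* Returns true when flags and the extent fields are mutually consistent. */
static bool asl_record_flags_valid(uint32_t flags, uint64_t extents_offset,
                                   uint32_t extent_count, uint32_t total_length)
{
    /* Unknown bits are reserved; readers reject them. */
    if (flags & ~IDX_FLAG_KNOWN_MASK)
        return false;
    /* A tombstone carries no extents and no length. */
    if (flags & IDX_FLAG_TOMBSTONE)
        return extents_offset == 0 && extent_count == 0 && total_length == 0;
    /* Visible entries need at least one extent. */
    return extent_count > 0;
}
```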
---
@ -156,6 +229,7 @@ typedef struct {
* Extents are concatenated in order to produce artifact bytes.
* `extent_count` MUST be > 0 for visible (non-tombstone) entries.
* `total_length` MUST equal the sum of `length` across the extents.
* `offset` and `length` MUST describe a contiguous slice within the referenced block.
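
The extent rules above might be checked as follows. This is a sketch with assumptions: the hunk does not show the `ExtentRecord` layout, so `asl_extent_view` uses guessed 32-bit widths for `offset` and `length`, and the block size is assumed to be known to the caller from the block store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view; field widths are assumed, not from the spec. */
typedef struct {
    uint32_t offset;
    uint32_t length;
} asl_extent_view;

/* True when [offset, offset + length) lies inside a block of block_size bytes. */
static bool asl_extent_in_block(uint64_t offset, uint64_t length,
                                uint64_t block_size)
{
    return offset <= block_size && length <= block_size - offset;
}

/* True when a visible entry has at least one extent and the extent
 * lengths add up to total_length. */
static bool asl_extents_match_total(const asl_extent_view *ext, size_t n,
                                    uint32_t total_length)
{
    if (n == 0)
        return false;
    uint64_t sum = 0; /* 64-bit accumulator so 32-bit lengths cannot wrap */
    for (size_t i = 0; i < n; i++)
        sum += ext[i].length;
    return sum == total_length;
}
```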
---
@ -204,6 +278,12 @@ typedef struct {
* Version `1` implies single-extent layout (legacy).
* Version `2` introduces `ExtentRecord` lists and `extents_offset` / `extent_count`.
* Version `3` introduces variable-length digest bytes with `hash_id` and `digest_offset`.
* Version `3` also integrates federation metadata in segment headers and index records.
### 10.1 Federation Compatibility Rules
* Legacy segments without federation fields are treated as local/internal (see 3.2).
* Tombstones MUST NOT shadow artifacts from other domains; domain matching is required.
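
One way to read the domain-matching rule is as a predicate applied during lookup. The function below is a hypothetical sketch: key comparison is reduced to a `same_key` flag supplied by the caller, and the actual shadowing semantics are defined by ASL/1-CORE-INDEX and the federation layers, not here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IDX_FLAG_TOMBSTONE UINT32_C(0x00000001)

/* Returns true only when a tombstone record may shadow an artifact:
 * same artifact key and same domain. */
static bool asl_tombstone_shadows(uint32_t tomb_flags, uint32_t tomb_domain_id,
                                  uint32_t artifact_domain_id, bool same_key)
{
    if (!(tomb_flags & IDX_FLAG_TOMBSTONE))
        return false;
    return same_key && tomb_domain_id == artifact_domain_id;
}
```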
---