# HASH/ASL1 — ASL1 Hash Algorithm Registry

Status: Approved
Owner: Niklas Rydberg
Version: 0.2.4
SoT: Yes
Last Updated: 2025-11-16
Linked Phase Pack: N/A
Tags: [deterministic, registry]

<!-- Source: /amduat/docs/new/hash-asl.md | Canonical: /amduat/tier1/hash-asl1.md -->

**Document ID:** `HASH/ASL1`
**Layer:** Substrate primitive profile (over ASL/1-CORE)

**Depends on (normative):**

* `ASL/1-CORE v0.4.x` — value substrate: `HashId`, `Reference`, `Artifact`, `EncodingProfileId`
* `ENC/ASL1-CORE v1.x` — canonical encoding for `Reference` (`ReferenceBytes`)

**Informative references:**

* `ASL/1-STORE v0.4.x` — content-addressable store model
* `TGK/1-CORE v0.7.x` — trace graph kernel (uses `Reference`)
* `PEL/1` — execution substrate
* `CIL/1`, `FCT/1`, `FER/1`, `OI/1` — profiles that depend on stable `Reference` semantics
* (future) `CID/1` — content identifier and domain-separation rules

© 2025 Niklas Rydberg.

## License

Except where otherwise noted, this document (text and diagrams) is licensed under
the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId
assignments, EdgeTypeId tables) are additionally made available under CC0 1.0
Universal (CC0) to enable unrestricted reuse in implementations and derivative
specifications.

Code examples in this document are provided under the Apache License 2.0 unless
explicitly stated otherwise. Test vectors, where present, are dedicated to the
public domain under CC0 1.0.

---

## 0. Purpose & Context

`HASH/ASL1` defines the **ASL1 hash algorithm family** for Amduat 2.0:

* assigns stable `HashId` (`uint16`) values to concrete cryptographic hash algorithms;
* defines the **mandatory** baseline algorithm `HASH-ASL1-256`;
* reserves ranges for future classical and post-quantum algorithms;
* specifies how these algorithms are used when deriving `Reference` values:

  * via `ASL/CORE-REF-DERIVE/1` in `ASL/1-CORE`, and
  * via `ENC/ASL1-CORE v1` binary encoding of `ReferenceBytes`.

This is a **substrate primitive profile**, not kernel, but:

> In Amduat 2.0, all **identity-critical** `Reference.hash_id` values used by the standard stack (ASL/1-STORE, TGK/1-CORE, PEL/1, CIL/1, FER/1, FCT/1, OI/1) MUST be interpreted according to this registry.

---

## 1. Scope

### 1.1 In scope

This specification standardizes:

1. The **ASL1 hash family**: common properties all algorithms must satisfy.

2. A **registry** from `HashId` → algorithm descriptor:

   * `HashId` (`uint16`),
   * digest length (bytes),
   * normative definition and status.

3. How these algorithms connect to:

   * `ASL/1-CORE`’s Reference derivation rule (`ASL/CORE-REF-DERIVE/1`),
   * `ENC/ASL1-CORE v1`’s `ReferenceBytes` encoding.

4. Rules for **algorithm evolution**:

   * immutability of assignments,
   * constraints for adding new algorithms.

### 1.2 Out of scope

This specification does **not** define:

* storage APIs, replication, or retention,
* execution runtimes, scheduling, or side effects,
* keyed constructions (MACs, KDFs, PRFs, etc.),
* non-cryptographic hashes,
* domain-separation rules at the CID layer (those belong in `CID/1` and/or encoding profiles),
* migration policy (it only provides primitives).

---

## 2. Terminology & Conventions

The RFC 2119 terms **MUST**, **SHOULD**, **MAY**, etc. apply.

From `ASL/1-CORE`:

* `OctetString` — finite byte sequence (`0x00–0xFF`),
* `HashId` — `uint16`, used as `Reference.hash_id`,
* `Reference` — `{ hash_id: HashId; digest: OctetString }`,
* `EncodingProfileId` — `uint16` identifying canonical encodings (e.g. `ASL_ENC_CORE_V1`),
* `ASL/CORE-REF-DERIVE/1` — normative Reference derivation rule.

From `ENC/ASL1-CORE v1` (current):

* `ReferenceBytes` — canonical encoding:

  ```text
  u16 hash_id
  digest[...] // remaining bytes in the frame are the digest
  ```

**Note:** `Reference` carries only `hash_id` and `digest`. There is no extra “family” field on-wire. For Amduat 2.0, `HashId` values in ASL/1 contexts are **globally** interpreted using this `HASH/ASL1` registry.

---

## 3. The ASL1 Hash Family

### 3.1 Family properties

All `"ASL1"` algorithms MUST be **cryptographic hash functions**:

* **Preimage resistance** – infeasible to find `x` for a given digest `d` with `H(x) = d`.
* **Second-preimage resistance** – infeasible, given `x`, to find `x' ≠ x` with `H(x') = H(x)`.
* **Collision resistance** – infeasible to find any `(x, x')`, `x ≠ x'` with `H(x) = H(x')`.

Each `"ASL1"` algorithm:

* accepts arbitrary-length `OctetString` inputs,
* produces a **fixed-length** `OctetString` digest,
* MUST support **incremental / streaming** operation:

  * a single forward-only pass over input,
  * no need to buffer entire input.

These properties allow:

* hashing large canonical encodings incrementally,
* use in streaming stores and execution engines.
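
The incremental requirement can be illustrated with a minimal sketch. It assumes SHA-256 from Python's standard `hashlib` as the underlying function (the only algorithm this registry currently defines, see §4.2). The helper name `hash_stream` and the chunk size are hypothetical, not part of this specification.

```python
import hashlib
import io
from typing import BinaryIO


def hash_stream(stream: BinaryIO, chunk_size: int = 64 * 1024) -> bytes:
    """Hash a stream in one forward-only pass, never holding the whole input."""
    hasher = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # end of stream
            break
        hasher.update(chunk)
    return hasher.digest()


# Sample input: 256 KiB of repeating bytes (illustrative only).
data = bytes(range(256)) * 1024
streamed = hash_stream(io.BytesIO(data), chunk_size=4096)
one_shot = hashlib.sha256(data).digest()
```

Both `streamed` and `one_shot` hold the same 32-byte digest, since chunk boundaries do not affect the result of an incremental hash.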

### 3.2 Family name and global use

* Family name: `"ASL1"`.

Within Amduat 2.0:

* all **identity-critical** `Reference.hash_id` values used by the standard stack are interpreted as entries in this `"ASL1"` registry;
* `HASH/ASL1` is therefore the **global assignment** for `HashId` in ASL/1 identity contexts.

If other hash families are used in non-ASL contexts (e.g., external APIs), they **MUST NOT** reuse `HashId` values defined here for `Reference.hash_id` in ASL/1-CORE. They should either:

* live in separate fields / structures; or
* use distinct namespaces not confused with `Reference.hash_id`.

### 3.3 HashId space

`HashId` is `uint16` and appears in `Reference.hash_id` and in `ReferenceBytes.hash_id`.

This registry reserves:

* `0x0000` — **Reserved** (never a valid algorithm).
* `0x0001–0x7FFF` — classical (pre-quantum) `"ASL1"` algorithms.
* `0x8000–0xFFFF` — post-quantum or specialized `"ASL1"` algorithms.
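
The partition above could be sketched as a small classifier. The names `HashIdRange` and `classify_hash_id` are hypothetical. The function reports only which block a value falls in, not whether an algorithm has actually been assigned there.

```python
from enum import Enum


class HashIdRange(Enum):
    RESERVED = "reserved"
    CLASSICAL = "classical"
    POST_QUANTUM_OR_SPECIALIZED = "post-quantum or specialized"


def classify_hash_id(hash_id: int) -> HashIdRange:
    """Map a uint16 HashId to the block of the HashId space it belongs to."""
    if not 0 <= hash_id <= 0xFFFF:
        raise ValueError(f"HashId out of uint16 range: {hash_id!r}")
    if hash_id == 0x0000:
        return HashIdRange.RESERVED  # never a valid algorithm
    if hash_id <= 0x7FFF:
        return HashIdRange.CLASSICAL
    return HashIdRange.POST_QUANTUM_OR_SPECIALIZED
```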

Each algorithm has an intrinsic digest length `L` (>0 bytes), defined by its normative spec. This document does not impose an upper bound beyond “finite and practically representable in implementations.” (ENC/ASL1-CORE v1 does not carry the length explicitly; length is implied by framing and cross-checked against `L` when the algorithm is known.)

---

## 4. Algorithm Registry

### 4.1 Registry (v0.2.4)

The `"ASL1"` registry is a mapping:

```text
HashId (uint16) -> Algorithm descriptor
```

At version 0.2.4:

| HashId        | Name          | Digest (bytes) | Status    | Notes                                      |
| ------------: | ------------- | -------------- | --------- | ------------------------------------------ |
| **0x0001**    | HASH-ASL1-256 | 32             | MANDATORY | Canonical default for `ASL_ENC_CORE_V1`    |
| 0x0002        | HASH-ASL1-512 | 64 (reserved)  | RESERVED  | Intended classical 512-bit algorithm       |
| 0x8001        | HASH-ASL1-PQ1 | TBD            | RESERVED  | First PQ algorithm placeholder             |
| 0x8002–0x80FF | —             | varies         | RESERVED  | Reserved range for future PQ / specialized |

Only `0x0001` is defined normatively at this version; others are reserved for future assignment.
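
One way an implementation might hold this table in memory is sketched below. The type `AlgorithmDescriptor` and the helpers `lookup` and `is_usable` are hypothetical names chosen for illustration. Digest lengths marked TBD or "varies" in the table are represented as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlgorithmDescriptor:
    hash_id: int
    name: str
    digest_length: Optional[int]  # bytes; None where the table says TBD/varies
    status: str


REGISTRY = {
    0x0001: AlgorithmDescriptor(0x0001, "HASH-ASL1-256", 32, "MANDATORY"),
    0x0002: AlgorithmDescriptor(0x0002, "HASH-ASL1-512", 64, "RESERVED"),
    0x8001: AlgorithmDescriptor(0x8001, "HASH-ASL1-PQ1", None, "RESERVED"),
}


def lookup(hash_id: int) -> Optional[AlgorithmDescriptor]:
    """Return the registry entry for hash_id, or None if it is unassigned."""
    entry = REGISTRY.get(hash_id)
    if entry is None and 0x8002 <= hash_id <= 0x80FF:
        # Unnamed reserved block for future PQ / specialized algorithms.
        return AlgorithmDescriptor(hash_id, "", None, "RESERVED")
    return entry


def is_usable(hash_id: int) -> bool:
    """Only normatively defined (non-reserved) entries may be used."""
    entry = lookup(hash_id)
    return entry is not None and entry.status != "RESERVED"
```

At version 0.2.4, `is_usable` returns `True` only for `0x0001`.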

### 4.2 HASH-ASL1-256 (mandatory)

* **Name:** `HASH-ASL1-256`
* **HashId:** `0x0001`
* **Digest length:** 32 bytes
* **Status:** MANDATORY for all Amduat 2.0–conformant implementations

#### 4.2.1 Normative definition

`HASH-ASL1-256` is **bit-for-bit identical** to SHA-256 as defined in FIPS 180-4 (or any successor that preserves SHA-256 semantics).

For all `data : OctetString`:

```text
HASH-ASL1-256(data) == SHA-256(data)
```

Any implementation whose output differs from SHA-256 for any input MUST NOT claim to implement `HASH-ASL1-256`.

`HASH-ASL1-256` MUST be deterministic and support incremental processing of input.
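
As a quick self-check, an implementation could compare its output against widely published SHA-256 digests. The sketch below wraps Python's `hashlib.sha256` under the hypothetical name `hash_asl1_256` and checks it against the well-known digests of the empty input and of the ASCII string `abc`. These two values are general SHA-256 vectors, not test vectors published by this document.

```python
import hashlib


def hash_asl1_256(data: bytes) -> bytes:
    """HASH-ASL1-256 is defined as plain SHA-256 over the input bytes."""
    return hashlib.sha256(data).digest()


# Well-known SHA-256 digests (hex) for two short inputs.
KNOWN_VECTORS = {
    b"": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    b"abc": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
}


def check_vectors() -> bool:
    """Return True if every known vector matches the implementation."""
    return all(
        hash_asl1_256(message).hex() == expected
        for message, expected in KNOWN_VECTORS.items()
    )
```

Passing such a check is necessary but not sufficient: conformance requires agreement with SHA-256 on all inputs.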

#### 4.2.2 Relationship to ASL/1-CORE & ASL_ENC_CORE_V1

`ASL/1-CORE` defines `ASL/CORE-REF-DERIVE/1`:

```text
ArtifactBytes = encode_P(A)
digest = H(ArtifactBytes)
Reference = { hash_id = HID, digest = digest }
```

For:

* `P = ASL_ENC_CORE_V1` (`EncodingProfileId = 0x0001`),
* `HID = 0x0001`,
* `H = HASH-ASL1-256`,

this becomes the **canonical default** Reference derivation for Amduat 2.0.

Unless a profile explicitly opts out, all identity-critical `Reference` values for Artifacts encoded under `ASL_ENC_CORE_V1` **MUST** use this `(P, H)` pair.

### 4.3 Reserved IDs

The following identifiers are reserved:

* `0x0002` — `HASH-ASL1-512`, digest length 64 bytes; classical 512-bit algorithm (e.g. SHA-512 or similar), TBD.
* `0x8001` — `HASH-ASL1-PQ1`; first post-quantum algorithm, TBD.
* `0x8002–0x80FF` — reserved block for additional post-quantum / specialized algorithms.

Implementations MUST NOT treat these IDs as usable until a future `HASH/ASL1` revision defines them normatively.

---

## 5. Interaction with ASL/1-CORE & ENC/ASL1-CORE v1

### 5.1 Reference derivation

`ASL/1-CORE` defines `ASL/CORE-REF-DERIVE/1`. `HASH/ASL1` simply supplies the `"ASL1"` algorithms and `HashId`s.

Given:

* Artifact `A`,
* encoding profile `P`,
* algorithm `H` with `HashId = HID`,

then:

```text
ArtifactBytes = encode_P(A)
digest = H(ArtifactBytes)
Reference = { hash_id = HID, digest = digest }
```

All ASL/1 conformant components **MUST** use this procedure for any `(EncodingProfileId, HashId)` pair they claim to support.
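
The derivation steps could be sketched as follows. The `Reference` dataclass, the `HASHERS` table, and `derive_reference` are hypothetical names. The caller is assumed to have already produced `ArtifactBytes` with the encoder for its chosen profile, since the canonical encoding itself is defined by `ENC/ASL1-CORE`, not here.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Reference:
    hash_id: int
    digest: bytes


# HashId -> hash function; only 0x0001 is defined at registry v0.2.4.
HASHERS: Dict[int, Callable[[bytes], bytes]] = {
    0x0001: lambda data: hashlib.sha256(data).digest(),
}


def derive_reference(artifact_bytes: bytes, hash_id: int) -> Reference:
    """Apply digest = H(ArtifactBytes) for the algorithm named by hash_id.

    artifact_bytes must already be the canonical encoding encode_P(A).
    """
    hasher = HASHERS.get(hash_id)
    if hasher is None:
        raise ValueError(f"unsupported HashId 0x{hash_id:04X}")
    return Reference(hash_id=hash_id, digest=hasher(artifact_bytes))
```

Keeping the encoder outside this function mirrors the split in the text: the hash registry supplies `H` and `HID`, while the encoding profile supplies `encode_P`.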

### 5.2 ReferenceBytes under ENC/ASL1-CORE v1

`ENC/ASL1-CORE v1` encodes a `Reference` as:

```text
u16 hash_id
digest[...] // remaining bytes in the enclosing frame are the digest
```

This profile does **not** carry an explicit digest length; framing is provided by the enclosing structure (e.g., length-prefix, message boundary).

When an implementation both:

* decodes `ReferenceBytes` under `ENC/ASL1-CORE v1`, and
* implements `HASH/ASL1` and recognizes `hash_id`,

then it MUST enforce:

```text
len(digest) == canonical_digest_length(hash_id)
```

where `canonical_digest_length(hash_id)` is taken from this registry.

Any mismatch MUST be treated as an encoding / integrity error by the consumer.

If a `hash_id` is unknown (or HASH/ASL1 is not implemented), an implementation MAY still treat the bytes as a generic `Reference { hash_id, digest }`, but:

* it cannot recompute or verify the digest cryptographically, and
* higher layers MAY treat such a `Reference` as unsupported or lower-trust.
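
A decoder following these rules might look like the sketch below. All names are hypothetical. Big-endian byte order for `hash_id` is an assumption inferred from the hex layout in the §9 example (`0001` for `0x0001`); the authoritative byte order is defined by `ENC/ASL1-CORE v1`. The input `frame` is assumed to be exactly the bytes delimited by the enclosing structure. The third return value reports whether the `hash_id` was recognized, so callers can treat unknown algorithms as unverifiable.

```python
import struct
from typing import Tuple

# Canonical digest lengths taken from the registry (v0.2.4 defines one entry).
CANONICAL_DIGEST_LENGTH = {0x0001: 32}


class ReferenceDecodeError(ValueError):
    """Raised when ReferenceBytes are malformed or fail the length check."""


def encode_reference_bytes(hash_id: int, digest: bytes) -> bytes:
    expected = CANONICAL_DIGEST_LENGTH.get(hash_id)
    if expected is not None and len(digest) != expected:
        raise ValueError("digest length does not match the registry")
    # Assumption: u16 hash_id is big-endian.
    return struct.pack(">H", hash_id) + digest


def decode_reference_bytes(frame: bytes) -> Tuple[int, bytes, bool]:
    """Return (hash_id, digest, recognized) for one framed Reference."""
    if len(frame) < 2:
        raise ReferenceDecodeError("frame too short for u16 hash_id")
    (hash_id,) = struct.unpack(">H", frame[:2])
    digest = frame[2:]  # the rest of the frame is the digest
    expected = CANONICAL_DIGEST_LENGTH.get(hash_id)
    if expected is None:
        # Unknown algorithm: structurally accepted, not verifiable.
        return hash_id, digest, False
    if len(digest) != expected:
        raise ReferenceDecodeError(
            f"expected {expected} digest bytes, got {len(digest)}"
        )
    return hash_id, digest, True
```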

---

## 6. Crypto Agility & Evolution

### 6.1 Immutability of assignments

Once a `HashId` is assigned to an algorithm, its:

* digest length,
* underlying construction,
* behavior on all inputs,

MUST NOT change in any way that alters output values for the **same input bytes**.

For example:

* `HashId = 0x0001` MUST always denote SHA-256 semantics; future revisions cannot redefine it as anything that changes the digest for the same input bytes (e.g. “SHA-256 plus domain separator”).

If domain separation or similar techniques are required, they MUST be expressed at the **input construction** level (e.g. in `CID/1` or encoding profiles), not by changing the hash function definition.

### 6.2 Adding new algorithms

A new `"ASL1"` algorithm MAY be added in a future `HASH/ASL1` version if and only if:

* it satisfies the family properties in §3.1;

* it has a fixed digest length `L > 0` bytes;

* its spec includes:

  * assigned `HashId`,
  * digest length,
  * normative algorithm definition (via external standard or full spec),
  * status (`MANDATORY`, `RECOMMENDED`, `OPTIONAL`, `EXPERIMENTAL`);

* it is introduced via:

  * a new `HASH/ASL1` version,
  * at least one ADR,
  * published test vectors.

Existing `HashId` assignments MUST NOT be repurposed.

### 6.3 Coexistence and migration (informative)

Higher layers can use `"ASL1"`’s crypto agility by:

* computing more than one `Reference` for the same Artifact (multi-hash),
* storing those in receipts, overlays, or catalogs,
* defining profile-specific policies like:

  * “from date D, compute both `HASH-ASL1-256` and `HASH-ASL1-PQ1` for all new Artifacts; prefer 0x8001 for new dependencies.”

`HASH/ASL1` itself:

* does not prescribe when to migrate,
* only guarantees that `HashId` mappings and algorithms are stable.

---

## 7. Conformance

An implementation is **HASH/ASL1–conformant** (v0.2.4) if:

1. **Correct HASH-ASL1-256 implementation**

   * Provides a `HASH-ASL1-256` function:

     * accepts arbitrary-length `OctetString` input,
     * returns a 32-byte `OctetString` digest,

   * matches SHA-256 exactly for all inputs,

   * behaves deterministically and supports incremental operation.

2. **Consistent Reference use with ENC/ASL1-CORE v1**

   * When encoding `ReferenceBytes`, emits:

     * `hash_id` as `u16`,
     * digest bytes equal in length to the algorithm’s canonical digest length.

   * When decoding `ReferenceBytes`:

     * for known `hash_id` values, enforces `len(digest) == canonical_digest_length(hash_id)` and treats mismatches as errors;
     * for unknown `hash_id` values, MAY accept `Reference` structurally but MUST treat the algorithm as unsupported for verification.

3. **Registry immutability**

   * Does not change the meaning of any assigned `HashId`,
   * Does not use reserved IDs as custom algorithms outside the formal registry process.

4. **Family compliance for extra algorithms**

   * For any additional `"ASL1"` algorithms claimed:

     * ensures they satisfy §3.1,
     * documents their digest length and behavior.

5. **Integration with ASL/1-CORE**

   * Uses `ASL/CORE-REF-DERIVE/1` when deriving References in the ASL/1 context,
   * For `ASL_ENC_CORE_V1`, uses `HASH-ASL1-256` (`hash_id = 0x0001`) unless a profile explicitly specifies another algorithm.

---

## 8. Security Considerations

1. **Collision risk**

   * Collisions in `HASH-ASL1-256` would be a severe substrate-level integrity issue for systems that rely only on `HashId = 0x0001`.
   * Higher layers (CIL/1, FCT/1, FER/1, OI/1, TGK/PROV-style profiles) SHOULD:

     * assume collisions are possible in principle,
     * provide detection and mitigation strategies (e.g. optional dual-hash, anomaly logging).

2. **Algorithm deprecation**

   * If `HASH-ASL1-256` becomes weak:

     * future specs MAY introduce a new mandatory algorithm,
     * migration strategies SHOULD be defined at profile / domain layers.

   * Existing References with `HashId = 0x0001` remain valid as historical IDs; their meaning MUST NOT be changed.

3. **Side-channel resistance**

   * Implementations SHOULD mitigate timing/cache/power side channels, especially in shared environments.
   * Use well-reviewed crypto libraries where possible.

4. **Non-ASL1 hash usage**

   * Systems MAY use other hash functions (e.g., for local caches, external APIs),
   * Such functions MUST NOT reuse `HashId`s defined in this registry for `Reference.hash_id`,
   * They MUST be clearly separated from ASL/1 identity semantics.

---

## 9. Example (Non-Normative)

Given:

* `EncodingProfileId = ASL_ENC_CORE_V1 (0x0001)`,
* algorithm `HASH-ASL1-256` (`HashId = 0x0001`),
* Artifact:

  ```text
  Artifact {
    bytes = 0xDE AD
    type_tag = none
  }
  ```

Assume `ENC/ASL1-CORE v1` canonical Artifact encoding:

```text
00                ; has_type_tag = false
0000000000000002  ; bytes_len = 2 (u64)
DEAD              ; bytes
```

Then:

1. `ArtifactBytes = encode_artifact_core_v1(Artifact)`.
2. `digest = HASH-ASL1-256(ArtifactBytes)` (SHA-256).
3. `Reference = { hash_id = 0x0001, digest = digest }`.
4. `ReferenceBytes` under `ENC/ASL1-CORE v1`:

   ```text
   0001 <32 bytes of digest>
   ```

The frame boundary (e.g., length prefix or message boundary) determines where the digest ends. A consumer that knows `hash_id = 0x0001` and implements HASH/ASL1 will:

* expect exactly 32 digest bytes,
* treat any other length as an error.
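
The steps above can be reproduced with a short script. It hard-codes the Artifact encoding exactly as assumed in this example (flag byte, `u64` length, payload) rather than calling a real `ENC/ASL1-CORE v1` encoder. Big-endian integers are an assumption read off the hex layout shown above. Variable names are illustrative.

```python
import hashlib
import struct

payload = bytes.fromhex("DEAD")

# Step 1: ArtifactBytes = has_type_tag (0x00) || bytes_len (u64) || bytes.
artifact_bytes = b"\x00" + struct.pack(">Q", len(payload)) + payload

# Step 2: digest = HASH-ASL1-256(ArtifactBytes), i.e. SHA-256.
digest = hashlib.sha256(artifact_bytes).digest()

# Steps 3 and 4: ReferenceBytes = u16 hash_id || digest.
reference_bytes = struct.pack(">H", 0x0001) + digest

print(artifact_bytes.hex())
print(reference_bytes.hex())
```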

This `Reference` can be used consistently across `ASL/1-STORE`, `TGK/1-CORE`, `PEL/1`, `CIL/1`, `FER/1`, `FCT/1`, `OI/1`, with equality defined by `ASL/1-CORE`.

---

## Document History

* **0.2.4 (2025-11-16):** Registered as Tier-1 spec and aligned to the Amduat 2.0 substrate baseline.