amduat/tier1/hash-asl1.md
2025-12-19 19:22:40 +01:00


HASH/ASL1 — ASL1 Hash Algorithm Registry

Status: Approved | Owner: Niklas Rydberg | Version: 0.2.4 | SoT: Yes | Last Updated: 2025-11-16 | Linked Phase Pack: N/A | Tags: [deterministic, registry]

Document ID: HASH/ASL1 | Layer: Substrate primitive profile (over ASL/1-CORE)

Depends on (normative):

  • ASL/1-CORE v0.4.x — value substrate: HashId, Reference, Artifact, EncodingProfileId
  • ENC/ASL1-CORE v1.x — canonical encoding for Reference (ReferenceBytes)

Informative references:

  • ASL/1-STORE v0.4.x — content-addressable store model
  • TGK/1-CORE v0.7.x — trace graph kernel (uses Reference)
  • PEL/1 — execution substrate
  • CIL/1, FCT/1, FER/1, OI/1 — profiles that depend on stable Reference semantics
  • (future) CID/1 — content identifier and domain-separation rules

© 2025 Niklas Rydberg.

License

Except where otherwise noted, this document (text and diagrams) is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId assignments, EdgeTypeId tables) are additionally made available under CC0 1.0 Universal (CC0) to enable unrestricted reuse in implementations and derivative specifications.

Code examples in this document are provided under the Apache License 2.0 unless explicitly stated otherwise. Test vectors, where present, are dedicated to the public domain under CC0 1.0.


0. Purpose & Context

HASH/ASL1 defines the ASL1 hash algorithm family for Amduat 2.0:

  • assigns stable HashId (uint16) values to concrete cryptographic hash algorithms;

  • defines the mandatory baseline algorithm HASH-ASL1-256;

  • reserves ranges for future classical and post-quantum algorithms;

  • specifies how these algorithms are used when deriving Reference values:

    • via ASL/CORE-REF-DERIVE/1 in ASL/1-CORE, and
    • via ENC/ASL1-CORE v1 binary encoding of ReferenceBytes.

This is a substrate primitive profile rather than a kernel specification, but it carries one global requirement:

In Amduat 2.0, all identity-critical Reference.hash_id values used by the standard stack (ASL/1-STORE, TGK/1-CORE, PEL/1, CIL/1, FER/1, FCT/1, OI/1) MUST be interpreted according to this registry.


1. Scope

1.1 In scope

This specification standardizes:

  1. The ASL1 hash family: common properties all algorithms must satisfy.

  2. A registry from HashId → algorithm descriptor:

    • HashId (uint16),
    • digest length (bytes),
    • normative definition and status.
  3. How these algorithms connect to:

    • ASL/1-CORE's Reference derivation rule (ASL/CORE-REF-DERIVE/1),
    • ENC/ASL1-CORE v1's ReferenceBytes encoding.
  4. Rules for algorithm evolution:

    • immutability of assignments,
    • constraints for adding new algorithms.

1.2 Out of scope

This specification does not define:

  • storage APIs, replication, or retention,
  • execution runtimes, scheduling, or side effects,
  • keyed constructions (MACs, KDFs, PRFs, etc.),
  • non-cryptographic hashes,
  • domain-separation rules at the CID layer (those belong in CID/1 and/or encoding profiles),
  • migration policy (it only provides primitives).

2. Terminology & Conventions

The RFC 2119 terms MUST, SHOULD, MAY, etc. apply.

From ASL/1-CORE:

  • OctetString: finite byte sequence (0x00–0xFF),
  • HashId: uint16, used as Reference.hash_id,
  • Reference: { hash_id: HashId; digest: OctetString },
  • EncodingProfileId: uint16 identifying canonical encodings (e.g. ASL_ENC_CORE_V1),
  • ASL/CORE-REF-DERIVE/1: normative Reference derivation rule.

From ENC/ASL1-CORE v1 (current):

  • ReferenceBytes — canonical encoding:

    u16 hash_id
    digest[...]  // remaining bytes in the frame are the digest
    

Note: Reference carries only hash_id and digest. There is no extra “family” field on-wire. For Amduat 2.0, HashId values in ASL/1 contexts are globally interpreted using this HASH/ASL1 registry.


3. The ASL1 Hash Family

3.1 Family properties

All "ASL1" algorithms MUST be cryptographic hash functions:

  • Preimage resistance: infeasible to find x for a given digest d with H(x) = d.
  • Second-preimage resistance: infeasible, given x, to find x' ≠ x with H(x') = H(x).
  • Collision resistance: infeasible to find any (x, x'), x ≠ x' with H(x) = H(x').

Each "ASL1" algorithm:

  • accepts arbitrary-length OctetString inputs,

  • produces a fixed-length OctetString digest,

  • MUST support incremental / streaming operation:

    • a single forward-only pass over input,
    • no need to buffer entire input.

These properties allow:

  • hashing large canonical encodings incrementally,
  • use in streaming stores and execution engines.
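
As a minimal sketch of the streaming property, the following hashes a binary stream in fixed-size chunks and checks that the result equals a one-shot digest. It uses SHA-256 from Python's standard hashlib, since §4.2.1 defines HASH-ASL1-256 as identical to SHA-256. The function name digest_streaming and the chunk size are illustrative choices, not part of this specification.

```python
import hashlib
import io


def digest_streaming(stream, chunk_size=64 * 1024):
    """Hash a binary stream in one forward-only pass.

    Only one chunk is held in memory at a time. The name and the default
    chunk size are assumptions made for this illustration.
    """
    h = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        h.update(chunk)
    return h.digest()


# The incremental digest equals the one-shot digest regardless of chunking.
data = bytes(range(256)) * 1000
assert digest_streaming(io.BytesIO(data), chunk_size=7) == hashlib.sha256(data).digest()
```

The chunk boundaries do not affect the digest, which is what allows a store or execution engine to hash canonical encodings as they are produced.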

3.2 Family name and global use

  • Family name: "ASL1".

Within Amduat 2.0:

  • all identity-critical Reference.hash_id values used by the standard stack are interpreted as entries in this "ASL1" registry;
  • HASH/ASL1 is therefore the global assignment for HashId in ASL/1 identity contexts.

If other hash families are used in non-ASL contexts (e.g., external APIs), they MUST NOT reuse HashId values defined here for Reference.hash_id in ASL/1-CORE. They should either:

  • live in separate fields / structures; or
  • use distinct namespaces not confused with Reference.hash_id.

3.3 HashId space

HashId is uint16 and appears in Reference.hash_id and in ReferenceBytes.hash_id.

This registry reserves:

  • 0x0000: Reserved (never a valid algorithm).
  • 0x0001–0x7FFF: classical (pre-quantum) "ASL1" algorithms.
  • 0x8000–0xFFFF: post-quantum or specialized "ASL1" algorithms.

Each algorithm has an intrinsic digest length L (>0 bytes), defined by its normative spec. This document does not impose an upper bound beyond “finite and practically representable in implementations.” (ENC/ASL1-CORE v1 does not carry the length explicitly; length is implied by framing and cross-checked against L when the algorithm is known.)
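
The range partition above can be sketched as a small classifier. The function name classify_hash_id and the returned labels are hypothetical and chosen only for illustration; the range boundaries come from the list above.

```python
def classify_hash_id(hash_id: int) -> str:
    """Map a HashId to the range it falls in (labels are illustrative)."""
    if not 0 <= hash_id <= 0xFFFF:
        raise ValueError("HashId must fit in a uint16")
    if hash_id == 0x0000:
        return "reserved"  # never a valid algorithm
    if hash_id <= 0x7FFF:
        return "classical"  # 0x0001..0x7FFF
    return "post-quantum-or-specialized"  # 0x8000..0xFFFF


print(classify_hash_id(0x0001))
print(classify_hash_id(0x8001))
```

Falling inside a range does not make an identifier usable; only identifiers that the registry in §4 defines normatively denote an algorithm.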


4. Algorithm Registry

4.1 Registry (v0.2.4)

The "ASL1" registry is a mapping:

HashId (uint16) -> Algorithm descriptor

At version 0.2.4:

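| HashId        | Name          | Digest (bytes) | Status    | Notes                                      |
|---------------|---------------|----------------|-----------|--------------------------------------------|
| 0x0001        | HASH-ASL1-256 | 32             | MANDATORY | Canonical default for ASL_ENC_CORE_V1      |
| 0x0002        | HASH-ASL1-512 | 64 (reserved)  | RESERVED  | Intended classical 512-bit algorithm       |
| 0x8001        | HASH-ASL1-PQ1 | TBD            | RESERVED  | First PQ algorithm placeholder             |
| 0x8002–0x80FF | -             | varies         | RESERVED  | Reserved range for future PQ / specialized |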

Only 0x0001 is defined normatively at this version; others are reserved for future assignment.
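
One possible in-memory representation of this registry is sketched below. The type AlgorithmDescriptor and the helpers lookup, is_usable, and canonical_digest_length are hypothetical names for this illustration; the entries mirror the registry at version 0.2.4, with None standing in for a digest length that is still TBD.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlgorithmDescriptor:
    name: str
    digest_len: Optional[int]  # None where the registry says TBD or varies
    status: str


# Entries copied from the v0.2.4 registry; the structure is an assumption.
REGISTRY = {
    0x0001: AlgorithmDescriptor("HASH-ASL1-256", 32, "MANDATORY"),
    0x0002: AlgorithmDescriptor("HASH-ASL1-512", 64, "RESERVED"),
    0x8001: AlgorithmDescriptor("HASH-ASL1-PQ1", None, "RESERVED"),
}


def lookup(hash_id: int) -> Optional[AlgorithmDescriptor]:
    """Return the descriptor for a HashId, or None if it is not listed."""
    if hash_id in REGISTRY:
        return REGISTRY[hash_id]
    if 0x8002 <= hash_id <= 0x80FF:  # reserved block without names
        return AlgorithmDescriptor("(unassigned)", None, "RESERVED")
    return None


def is_usable(hash_id: int) -> bool:
    """Only normatively defined (non-reserved) entries may be used."""
    entry = lookup(hash_id)
    return entry is not None and entry.status != "RESERVED"


def canonical_digest_length(hash_id: int) -> Optional[int]:
    """Digest length for usable IDs; None for reserved or unknown IDs."""
    return lookup(hash_id).digest_len if is_usable(hash_id) else None


print(canonical_digest_length(0x0001))
```

Treating reserved entries as unusable keeps an implementation from acting on an identifier before a future revision defines it.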

4.2 HASH-ASL1-256 (mandatory)

  • Name: HASH-ASL1-256
  • HashId: 0x0001
  • Digest length: 32 bytes
  • Status: MANDATORY for all Amduat 2.0-conformant implementations

4.2.1 Normative definition

HASH-ASL1-256 is bit-for-bit identical to SHA-256 as defined in FIPS 180-4 (or any successor that preserves SHA-256 semantics).

For all data : OctetString:

HASH-ASL1-256(data) == SHA-256(data)

Any implementation whose output differs from SHA-256 for any input MUST NOT claim to implement HASH-ASL1-256.

HASH-ASL1-256 MUST be deterministic and support incremental processing of input.
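
A simple way to check an implementation against this definition is a known-answer test. The sketch below wraps Python's hashlib.sha256 and compares it with two widely published SHA-256 values (the empty message and the ASCII string "abc"). The wrapper name hash_asl1_256 is an assumption for this illustration, and these two vectors are not the published test vectors of this specification.

```python
import hashlib


def hash_asl1_256(data: bytes) -> bytes:
    """HASH-ASL1-256 sketch: delegates to SHA-256 from the standard library."""
    return hashlib.sha256(data).digest()


# Well-known SHA-256 digests, keyed by input message.
KNOWN_ANSWERS = {
    b"": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    b"abc": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
}

for message, expected_hex in KNOWN_ANSWERS.items():
    digest = hash_asl1_256(message)
    assert len(digest) == 32
    assert digest.hex() == expected_hex
```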

4.2.2 Relationship to ASL/1-CORE & ASL_ENC_CORE_V1

ASL/1-CORE defines ASL/CORE-REF-DERIVE/1:

ArtifactBytes = encode_P(A)
digest        = H(ArtifactBytes)
Reference     = { hash_id = HID, digest = digest }

For:

  • P = ASL_ENC_CORE_V1 (EncodingProfileId = 0x0001),
  • HID = 0x0001,
  • H = HASH-ASL1-256,

this becomes the canonical default Reference derivation for Amduat 2.0.

Unless a profile explicitly opts out, all identity-critical Reference values for Artifacts encoded under ASL_ENC_CORE_V1 MUST use this (P, H) pair.

4.3 Reserved IDs

The following identifiers are reserved:

  • 0x0002: HASH-ASL1-512, digest length 64 bytes; classical 512-bit algorithm (e.g. SHA-512 or similar), TBD.
  • 0x8001: HASH-ASL1-PQ1; first post-quantum algorithm, TBD.
  • 0x8002–0x80FF: reserved block for additional post-quantum / specialized algorithms.

Implementations MUST NOT treat these IDs as usable until a future HASH/ASL1 revision defines them normatively.


5. Interaction with ASL/1-CORE & ENC/ASL1-CORE v1

5.1 Reference derivation

ASL/1-CORE defines ASL/CORE-REF-DERIVE/1. HASH/ASL1 simply supplies the "ASL1" algorithms and HashIds.

Given:

  • Artifact A,
  • encoding profile P,
  • algorithm H with HashId = HID,

then:

ArtifactBytes = encode_P(A)
digest        = H(ArtifactBytes)
Reference     = { hash_id = HID, digest = digest }

All ASL/1 conformant components MUST use this procedure for any (EncodingProfileId, HashId) pair they claim to support.
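
The procedure can be sketched as a function that takes the encoder as a parameter, so that no particular encoding profile is assumed. The names Reference, HASHES, and derive_reference are hypothetical, and the encoder passed in the usage line is a stand-in rather than a real ENC/ASL1-CORE implementation.

```python
import hashlib
from typing import Callable, NamedTuple


class Reference(NamedTuple):
    hash_id: int
    digest: bytes


# Only 0x0001 is normatively defined at v0.2.4.
HASHES = {0x0001: lambda data: hashlib.sha256(data).digest()}


def derive_reference(artifact, encode: Callable[[object], bytes], hash_id: int) -> Reference:
    """Apply ASL/CORE-REF-DERIVE/1: encode, hash, then pair with the HashId."""
    if hash_id not in HASHES:
        raise ValueError(f"unsupported HashId 0x{hash_id:04X}")
    artifact_bytes = encode(artifact)         # ArtifactBytes = encode_P(A)
    digest = HASHES[hash_id](artifact_bytes)  # digest = H(ArtifactBytes)
    return Reference(hash_id, digest)         # { hash_id = HID, digest }


# Usage with a stand-in encoder that returns the artifact bytes unchanged.
ref = derive_reference(b"example", lambda a: a, 0x0001)
print(hex(ref.hash_id), ref.digest.hex())
```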

5.2 ReferenceBytes under ENC/ASL1-CORE v1

ENC/ASL1-CORE v1 encodes a Reference as:

u16 hash_id
digest[...]  // remaining bytes in the enclosing frame are the digest

This profile does not carry an explicit digest length; framing is provided by the enclosing structure (e.g., length-prefix, message boundary).

When an implementation both:

  • decodes ReferenceBytes under ENC/ASL1-CORE v1, and
  • implements HASH/ASL1 and recognizes hash_id,

then it MUST enforce:

len(digest) == canonical_digest_length(hash_id)

where canonical_digest_length(hash_id) is taken from this registry.

Any mismatch MUST be treated as an encoding / integrity error by the consumer.

If a hash_id is unknown (or HASH/ASL1 is not implemented), an implementation MAY still treat the bytes as a generic Reference { hash_id, digest }, but:

  • it cannot recompute or verify the digest cryptographically, and
  • higher layers MAY treat such a Reference as unsupported or lower-trust.
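
A decoder that follows these rules might look like the sketch below. It assumes the enclosing frame has already been delimited and that hash_id is big-endian, which is inferred from the 0001 prefix shown in §9 rather than stated here. The names decode_reference_bytes, ReferenceDecodeError, and CANONICAL_DIGEST_LENGTH are hypothetical.

```python
import struct

# Digest lengths for normatively defined IDs only (v0.2.4).
CANONICAL_DIGEST_LENGTH = {0x0001: 32}


class ReferenceDecodeError(ValueError):
    """Raised for encoding / integrity errors in ReferenceBytes."""


def decode_reference_bytes(frame: bytes):
    """Split a frame into (hash_id, digest, verifiable).

    `verifiable` is False when the hash_id is unknown: the Reference is
    accepted structurally but cannot be checked cryptographically.
    """
    if len(frame) < 2:
        raise ReferenceDecodeError("frame too short for u16 hash_id")
    (hash_id,) = struct.unpack(">H", frame[:2])  # big-endian is an assumption
    digest = frame[2:]  # remaining bytes in the frame are the digest
    expected = CANONICAL_DIGEST_LENGTH.get(hash_id)
    if expected is not None and len(digest) != expected:
        raise ReferenceDecodeError(
            f"hash_id 0x{hash_id:04X} expects {expected} digest bytes, got {len(digest)}"
        )
    return hash_id, digest, expected is not None


print(decode_reference_bytes(b"\x00\x01" + bytes(32))[2])
```

Returning a flag instead of rejecting unknown identifiers leaves the trust decision to higher layers.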

6. Crypto Agility & Evolution

6.1 Immutability of assignments

Once a HashId is assigned to an algorithm, its:

  • digest length,
  • underlying construction,
  • behavior on all inputs,

MUST NOT change in any way that alters output values for the same input bytes.

For example:

  • HashId = 0x0001 MUST always denote SHA-256 semantics; future revisions cannot redefine it as anything that changes the digest for the same input bytes (e.g. “SHA-256 plus domain separator”).

If domain separation or similar techniques are required, they MUST be expressed at the input construction level (e.g. in CID/1 or encoding profiles), not by changing the hash function definition.

6.2 Adding new algorithms

A new "ASL1" algorithm MAY be added in a future HASH/ASL1 version if and only if:

  • it satisfies the family properties in §3.1;

  • it has a fixed digest length L > 0 bytes;

  • its spec includes:

    • assigned HashId,
    • digest length,
    • normative algorithm definition (via external standard or full spec),
    • status (MANDATORY, RECOMMENDED, OPTIONAL, EXPERIMENTAL);
  • it is introduced via:

    • a new HASH/ASL1 version,
    • at least one ADR,
    • published test vectors.

Existing HashId assignments MUST NOT be repurposed.
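
For implementations that hold the registry in a mutable structure, the no-repurposing rule can be enforced at the point of assignment. The class below is a hypothetical sketch: it rejects the reserved value 0x0000, non-positive digest lengths, and any second assignment to an identifier that is already taken. It does not model the versioning, ADR, or test-vector requirements.

```python
class HashRegistry:
    """Write-once mapping from HashId to (name, digest length)."""

    def __init__(self):
        self._entries = {}

    def assign(self, hash_id: int, name: str, digest_len: int) -> None:
        if not 0x0001 <= hash_id <= 0xFFFF:
            raise ValueError("HashId must be a non-zero uint16")
        if digest_len <= 0:
            raise ValueError("digest length must be > 0 bytes")
        if hash_id in self._entries:
            # Existing assignments MUST NOT be repurposed.
            raise ValueError(f"HashId 0x{hash_id:04X} is already assigned")
        self._entries[hash_id] = (name, digest_len)

    def get(self, hash_id: int):
        return self._entries.get(hash_id)


registry = HashRegistry()
registry.assign(0x0001, "HASH-ASL1-256", 32)
print(registry.get(0x0001))
```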

6.3 Coexistence and migration (informative)

Higher layers can use the crypto agility of "ASL1" by:

  • computing more than one Reference for the same Artifact (multi-hash),

  • storing those in receipts, overlays, or catalogs,

  • defining profile-specific policies like:

    • “from date D, compute both HASH-ASL1-256 and HASH-ASL1-PQ1 for all new Artifacts; prefer 0x8001 for new dependencies.”

HASH/ASL1 itself:

  • does not prescribe when to migrate,
  • only guarantees that HashId mappings and algorithms are stable.

7. Conformance

An implementation is HASH/ASL1-conformant (v0.2.4) if:

  1. Correct HASH-ASL1-256 implementation

    • Provides a HASH-ASL1-256 function:

      • accepts arbitrary-length OctetString input,
      • returns a 32-byte OctetString digest,
    • matches SHA-256 exactly for all inputs,

    • behaves deterministically and supports incremental operation.

  2. Consistent Reference use with ENC/ASL1-CORE v1

    • When encoding ReferenceBytes, emits:

      • hash_id as u16,
      • digest bytes equal in length to the algorithm's canonical digest length.
    • When decoding ReferenceBytes:

      • for known hash_id values, enforces len(digest) == canonical_digest_length(hash_id) and treats mismatches as errors;
      • for unknown hash_id values, MAY accept Reference structurally but MUST treat the algorithm as unsupported for verification.
  3. Registry immutability

    • Does not change the meaning of any assigned HashId,
    • Does not use reserved IDs as custom algorithms outside the formal registry process.
  4. Family compliance for extra algorithms

    • For any additional "ASL1" algorithms claimed:

      • ensures they satisfy §3.1,
      • documents their digest length and behavior.
  5. Integration with ASL/1-CORE

    • Uses ASL/CORE-REF-DERIVE/1 when deriving References in the ASL/1 context,
    • For ASL_ENC_CORE_V1, uses HASH-ASL1-256 (hash_id = 0x0001) unless a profile explicitly opts out and specifies another algorithm with its own HashId.

8. Security Considerations

  1. Collision risk

    • Collisions in HASH-ASL1-256 would be a severe substrate-level integrity issue for systems that rely only on HashId = 0x0001.

    • Higher layers (CIL/1, FCT/1, FER/1, OI/1, TGK/PROV-style profiles) SHOULD:

      • assume collisions are possible in principle,
      • provide detection and mitigation strategies (e.g. optional dual-hash, anomaly logging).
  2. Algorithm deprecation

    • If HASH-ASL1-256 becomes weak:

      • future specs MAY introduce a new mandatory algorithm,
      • migration strategies SHOULD be defined at profile / domain layers.
    • Existing References with HashId = 0x0001 remain valid as historical IDs; their meaning MUST NOT be changed.

  3. Side-channel resistance

    • Implementations SHOULD mitigate timing/cache/power side channels, especially in shared environments.
    • Use well-reviewed crypto libraries where possible.
  4. Non-ASL1 hash usage

    • Systems MAY use other hash functions (e.g., for local caches, external APIs),
    • Such functions MUST NOT reuse HashIds defined in this registry for Reference.hash_id,
    • They MUST be clearly separated from ASL/1 identity semantics.

9. Example (Non-Normative)

Given:

  • EncodingProfileId = ASL_ENC_CORE_V1 (0x0001),

  • algorithm HASH-ASL1-256 (HashId = 0x0001),

  • Artifact:

    Artifact {
      bytes    = 0xDE AD
      type_tag = none
    }
    

Assume ENC/ASL1-CORE v1 canonical Artifact encoding:

00                 ; has_type_tag = false
0000000000000002   ; bytes_len = 2 (u64)
DEAD               ; bytes

Then:

  1. ArtifactBytes = encode_artifact_core_v1(Artifact).

  2. digest = HASH-ASL1-256(ArtifactBytes) (SHA-256).

  3. Reference = { hash_id = 0x0001, digest = digest }.

  4. ReferenceBytes under ENC/ASL1-CORE v1:

    0001 <32 bytes of digest>
    

The frame boundary (e.g., length prefix or message boundary) determines where the digest ends. A consumer that knows hash_id = 0x0001 and implements HASH/ASL1 will:

  • expect exactly 32 digest bytes,
  • treat any other length as an error.

This Reference can be used consistently across ASL/1-STORE, TGK/1-CORE, PEL/1, CIL/1, FER/1, FCT/1, OI/1, with equality defined by ASL/1-CORE.
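
The example can be reproduced with the sketch below. It hard-codes the untyped Artifact layout assumed above (a has_type_tag byte, a big-endian u64 length, then the bytes) and does not attempt the typed case, whose layout this document does not show. The function encode_artifact_core_v1 here is an illustrative stand-in, not the ENC/ASL1-CORE reference encoder.

```python
import hashlib
import struct


def encode_artifact_core_v1(data: bytes, type_tag=None) -> bytes:
    """Encode an untyped Artifact using the layout assumed in this example."""
    if type_tag is not None:
        raise NotImplementedError("typed Artifact layout is not shown here")
    return b"\x00" + struct.pack(">Q", len(data)) + data


artifact_bytes = encode_artifact_core_v1(bytes.fromhex("DEAD"))
# Matches the three fields listed above: flag, bytes_len, bytes.
assert artifact_bytes == bytes.fromhex("00" "0000000000000002" "DEAD")

digest = hashlib.sha256(artifact_bytes).digest()     # HASH-ASL1-256
reference_bytes = struct.pack(">H", 0x0001) + digest  # u16 hash_id, then digest

assert len(reference_bytes) == 2 + 32
print(reference_bytes.hex())
```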


Document History

  • 0.2.4 (2025-11-16): Registered as Tier-1 spec and aligned to the Amduat 2.0 substrate baseline.