ENC/ASL1-CORE v1 — Core Canonical Encoding Profile
Status: Approved
Owner: Niklas Rydberg
Version: 1.0.5
SoT: Yes
Last Updated: 2025-11-16
Linked Phase Pack: N/A
Tags: [deterministic, binary-minimalism]
Document ID: ENC/ASL1-CORE
Profile ID: ASL_ENC_CORE_V1 = 0x0001
Layer: Substrate Primitive Profile (Canonical Encoding)
Depends on (normative):
- ASL/1-CORE v0.4.1 (value model: Artifact, TypeTag, Reference, HashId)
Integrates with (cross-profile rules):
- HASH/ASL1 v0.2.4 (ASL1 hash family: registry of HashId → algorithm, digest length)
  - This profile does not depend on HASH/ASL1 to define its layouts.
  - When both profiles are implemented, additional cross-checks apply (see §4.4, §5).
Used by (descriptive):
- ASL/1-CORE identity semantics (canonical encodings as the basis for hashing)
- ASL/1-STORE (persistence and integrity)
- PEL/1 (execution artifacts and results)
- CIL/1, FER/1, FCT/1, OI/1 (typed envelopes, receipts, facts, overlays)
- HASH/ASL1 (interpretation and checking of ReferenceBytes)
The Profile ID ASL_ENC_CORE_V1 and this document’s version are not encoded into ArtifactBytes or ReferenceBytes. Encoding version is selected by context (deployment, profile, or store configuration), not embedded per value.
© 2025 Niklas Rydberg.
License
Except where otherwise noted, this document (text and diagrams) is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
The identifier registries and mapping tables (e.g. TypeTag IDs, HashId assignments, EdgeTypeId tables) are additionally made available under CC0 1.0 Universal (CC0) to enable unrestricted reuse in implementations and derivative specifications.
Code examples in this document are provided under the Apache License 2.0 unless explicitly stated otherwise. Test vectors, where present, are dedicated to the public domain under CC0 1.0.
0. Overview
ENC/ASL1-CORE v1 defines the canonical, streaming-friendly, injective binary encoding used across the Amduat 2.0 substrate for two core value types from ASL/1-CORE:
- ArtifactBytes: canonical bytes for an ASL/1 Artifact
- ReferenceBytes: canonical bytes for an ASL/1 Reference
This profile ensures:
- Injectivity: each ASL/1 value maps to exactly one byte string, and distinct values map to distinct byte strings.
- Determinism: identical values yield identical encodings across implementations.
- Stability: bytes never depend on platform, locale, endianness, or environment.
- Streaming-compatibility: encoders, decoders, and hashers operate in forward-only mode.
ASL_ENC_CORE_V1 is the canonical ASL/1 encoding profile used by the Amduat 2.0 substrate stack for:
- ASL/1 identity model (via canonical encoding + ASL1 hashing),
- the hashing substrate (HASH/ASL1),
- ASL/1-STORE persistence semantics,
- PEL/1 execution input/output artifacts,
- and canonical near-core profiles.
The encodings defined in this profile satisfy all canonical encoding requirements in ASL/1-CORE §3.2: injectivity, stability, determinism, explicit structure, type-sensitivity, byte-transparency, and streaming-friendliness.
1. Scope & Layering
1.1 Purpose
This specification defines:
- The canonical binary layout for ArtifactBytes and ReferenceBytes.
- Normative encoding and decoding procedures.
- How these encodings interact with the ASL1 hash family.
- Required consistency checks when HASH/ASL1 is present.
- Streaming and injectivity requirements.
1.2 Non-goals
This profile does not define:
- Any filesystem, transport, or database representation.
- Chunking or multipart strategies for large artifacts.
- Any alternative encoding families (those are separate profiles).
- Semantics of TypeTag values or registry rules.
- Storage layout, replication, or policy.
Those concerns belong to ASL/1-STORE, PEL/1, HASH/ASL1, and higher layers.
1.3 Layering constraints
In line with the substrate overview:
- ENC/ASL1-CORE is a near-core substrate profile, not a kernel primitive.
- It MUST NOT re-define Artifact, Reference, TypeTag, or HashId; those are defined solely by ASL/1-CORE.
- It is storage-neutral and policy-neutral.
- It defines exactly one canonical encoding profile: ASL_ENC_CORE_V1.
2. Conventions
The key words MUST, SHOULD, MAY, etc. follow RFC 2119.
2.1 Integer encodings
All multi-byte integers are encoded as big-endian:
- u8: 1 byte
- u16: 2 bytes
- u32: 4 bytes
- u64: 8 bytes
Only fixed-width integers are used.
2.2 Booleans (presence flags)
Booleans used as presence flags are encoded as:
- false → 0x00
- true → 0x01
Booleans are only used for presence flags, never for general logical conditions.
2.3 OctetString
Except where explicitly overridden, an OctetString is encoded as:
[length (u64)] [raw bytes]
- length is the number of bytes.
- length MAY be zero.
- There is no implicit terminator or padding.
Whenever this profile says an ASL/1 field is an OctetString, its canonical encoding is this u64 + bytes form unless explicitly stated otherwise.
Exception: Reference.digest is encoded without an explicit length field; see §4.2.
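The conventions in §2 can be sketched with Python's standard struct module. This is a minimal illustration, not part of the profile; the helper names (encode_u16, encode_presence_flag, encode_octet_string, and so on) are hypothetical and chosen only for this example.

```python
import struct

# Hypothetical helper names; the profile itself defines only the byte layouts.

def encode_u16(value: int) -> bytes:
    """Big-endian 2-byte unsigned integer (raises struct.error if out of range)."""
    return struct.pack(">H", value)

def encode_u32(value: int) -> bytes:
    """Big-endian 4-byte unsigned integer."""
    return struct.pack(">I", value)

def encode_u64(value: int) -> bytes:
    """Big-endian 8-byte unsigned integer."""
    return struct.pack(">Q", value)

def encode_presence_flag(present: bool) -> bytes:
    """Presence flag: false -> 0x00, true -> 0x01."""
    return b"\x01" if present else b"\x00"

def encode_octet_string(data: bytes) -> bytes:
    """Generic OctetString: u64 length prefix followed by the raw bytes."""
    return encode_u64(len(data)) + data
```

Note that encode_octet_string does not apply to Reference.digest, which is emitted without a length prefix (§4.2).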
3. Artifact Encoding
3.1 Logical structure (from ASL/1-CORE)
From ASL/1-CORE:
TypeTag {
    tag_id: uint32
}
Artifact {
    bytes: OctetString
    type_tag: optional TypeTag
}
TypeTag semantics (registries, meaning of tag IDs) are opaque at this layer.
3.2 Canonical layout: ArtifactBytes
The canonical binary layout for an Artifact is:
+----------------------+-------------------------+---------------------------+
| has_type_tag (u8) | [type_tag (u32)] | bytes_len (u64) |
+----------------------+-------------------------+---------------------------+
| bytes (b[bytes_len]) ...
+------------------------------------------------------------------------
Fields:
- has_type_tag (u8): presence flag for type_tag
  - 0x00 → no type_tag
  - 0x01 → type_tag is present and follows immediately
- type_tag (u32): only present if has_type_tag == 0x01
  - Encodes TypeTag.tag_id as a 32-bit unsigned integer.
- bytes_len (u64)
  - Length in bytes of Artifact.bytes.
  - MAY be zero.
- bytes
  - Raw bytes of Artifact.bytes (payload).
No padding, alignment, or variant tags are introduced beyond what is explicitly described above.
3.3 Encoding (normative)
Let A be an Artifact. The canonical encoding function:
encode_artifact_core_v1 : Artifact → ArtifactBytes
is defined as:
- Emit has_type_tag (u8):
  - 0x00 if A.type_tag is absent.
  - 0x01 if A.type_tag is present.
- If A.type_tag is present, emit A.type_tag.tag_id as u32.
- Let bytes_len = len(A.bytes); emit bytes_len as u64.
- Emit the raw bytes of A.bytes.
The result is the canonical ArtifactBytes.
This encoding satisfies the ASL/1-CORE §3.2 requirements: injective, stable, deterministic, explicit in structure, type-sensitive, byte-transparent, and streaming-friendly.
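The encoding steps in §3.3 could be sketched in Python as follows. The dataclasses are illustrative stand-ins for the ASL/1-CORE value model; the field name payload (standing for Artifact.bytes) is an assumption made to avoid shadowing Python's built-in bytes type.

```python
import struct
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class TypeTag:
    tag_id: int  # uint32 in ASL/1-CORE

@dataclass(frozen=True)
class Artifact:
    payload: bytes  # Artifact.bytes in the spec (renamed here; an assumption)
    type_tag: Optional[TypeTag] = None

def encode_artifact_core_v1(artifact: Artifact) -> bytes:
    """Sketch of the §3.3 steps; emits fields strictly in layout order."""
    out = bytearray()
    if artifact.type_tag is None:
        out += b"\x00"  # has_type_tag = 0x00
    else:
        out += b"\x01"  # has_type_tag = 0x01
        # struct.pack raises struct.error if tag_id does not fit in a u32.
        out += struct.pack(">I", artifact.type_tag.tag_id)
    out += struct.pack(">Q", len(artifact.payload))  # bytes_len (u64)
    out += artifact.payload                           # raw payload bytes
    return bytes(out)
```

For instance, encode_artifact_core_v1(Artifact(b"", TypeTag(5))) yields the 13-byte string 01 00000005 0000000000000000.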
3.4 Decoding (normative)
Given a byte slice known to contain exactly one ArtifactBytes value, the canonical decoding function:
decode_artifact_core_v1 : ArtifactBytes → Artifact
is defined as:
- Read has_type_tag (u8).
  - If the value is neither 0x00 nor 0x01, fail with an encoding error.
- If has_type_tag == 0x01, read tag_id (u32) and construct TypeTag{ tag_id }.
- Read bytes_len (u64).
- Read exactly bytes_len bytes; this is bytes.
- Construct Artifact{ bytes, type_tag } where type_tag is either None or Some(TypeTag{ tag_id }) per steps above.
Decoders MUST reject:
- Invalid presence flags (has_type_tag not in {0x00, 0x01}).
- Truncated sequences (insufficient bytes for declared lengths).
- Over-long sequences where bytes_len cannot be represented or allocated safely in the implementation’s execution model (encoding error).
- Trailing bytes if the decoding context expects an isolated ArtifactBytes value.
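A decoder following the steps and rejection rules above might look like the sketch below. It assumes the input is an isolated ArtifactBytes value, so trailing bytes are rejected. EncodingError and the field name payload (for Artifact.bytes) are hypothetical names introduced for this example.

```python
import struct
from dataclasses import dataclass
from typing import Optional

class EncodingError(ValueError):
    """Hypothetical error type for malformed canonical bytes."""

@dataclass(frozen=True)
class TypeTag:
    tag_id: int

@dataclass(frozen=True)
class Artifact:
    payload: bytes  # Artifact.bytes in the spec
    type_tag: Optional[TypeTag] = None

def decode_artifact_core_v1(buf: bytes) -> Artifact:
    """Sketch of §3.4 for a buffer holding exactly one ArtifactBytes value."""
    pos = 0

    def take(n: int) -> bytes:
        # Forward-only read; the length check runs before slicing, so an
        # absurd bytes_len is rejected without attempting an allocation.
        nonlocal pos
        if len(buf) - pos < n:
            raise EncodingError("truncated ArtifactBytes")
        chunk = buf[pos:pos + n]
        pos += n
        return chunk

    flag = take(1)[0]
    if flag not in (0x00, 0x01):
        raise EncodingError(f"invalid has_type_tag: {flag:#04x}")
    type_tag = None
    if flag == 0x01:
        (tag_id,) = struct.unpack(">I", take(4))
        type_tag = TypeTag(tag_id)
    (bytes_len,) = struct.unpack(">Q", take(8))
    payload = take(bytes_len)
    if pos != len(buf):
        raise EncodingError("trailing bytes after ArtifactBytes")
    return Artifact(payload, type_tag)
```

A streaming decoder would read from a stream instead of a slice, but the order of reads and the rejection conditions would stay the same.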
3.5 Injectivity
The mapping:
Artifact → ArtifactBytes
defined by encode_artifact_core_v1 is injective:
- Each Artifact value has exactly one canonical byte string.
- Decoding the canonical bytes via decode_artifact_core_v1 yields exactly that Artifact.
3.6 Streaming properties
Encoders and decoders MUST NOT require backtracking:
- The header (has_type_tag, optional type_tag, bytes_len) is computed and emitted/read once, in order.
- bytes MAY be streamed directly:
  - Encoders MAY produce the payload incrementally after emitting bytes_len.
  - Decoders MAY pass the payload through to a consumer or hasher as it is read.
Incremental hashing (e.g., computing digests over ArtifactBytes) MUST be possible with a single forward pass over the byte stream.
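As an illustration of the single-forward-pass property, the sketch below hashes ArtifactBytes without materializing them: the header is fed to the hasher first, then the payload in chunks. The function name hash_artifact_streaming and its parameters are hypothetical, and SHA-256 is assumed as the hash function. The payload length must be known up front because bytes_len precedes the payload in the layout.

```python
import hashlib
import struct
from typing import BinaryIO, Optional

def hash_artifact_streaming(
    payload_stream: BinaryIO,
    payload_len: int,
    tag_id: Optional[int] = None,
    chunk_size: int = 65536,
) -> bytes:
    """Digest of ArtifactBytes computed in one forward pass (SHA-256 assumed)."""
    h = hashlib.sha256()
    # Header: has_type_tag, optional type_tag, bytes_len, in layout order.
    if tag_id is None:
        h.update(b"\x00")
    else:
        h.update(b"\x01" + struct.pack(">I", tag_id))
    h.update(struct.pack(">Q", payload_len))
    # Payload: streamed through the hasher chunk by chunk.
    remaining = payload_len
    while remaining > 0:
        chunk = payload_stream.read(min(chunk_size, remaining))
        if not chunk:
            raise ValueError("payload stream ended before declared length")
        h.update(chunk)
        remaining -= len(chunk)
    return h.digest()
```

The result equals the digest of the fully materialized ArtifactBytes, since a hash over concatenated updates is the hash of the concatenation.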
4. Reference Encoding
4.1 Logical structure (from ASL/1-CORE)
From ASL/1-CORE:
Reference {
    hash_id: HashId // uint16
    digest: OctetString
}
HashId = uint16
For encoding purposes, Reference.digest is treated as a raw digest byte string, not as a generic encoded u64 + bytes OctetString.
4.2 Canonical layout: ReferenceBytes
The canonical binary layout for a Reference is:
+----------------+---------------------------+
| hash_id (u16) | digest (b[?]) ...
+----------------+---------------------------+
Fields:
- hash_id (u16)
  - Encodes Reference.hash_id.
  - Semantically, an element of the HashId space defined by ASL/1-CORE (and populated by HASH/ASL1 when present).
- digest
  - Raw digest bytes.
  - The length of digest is not encoded explicitly in this profile.
  - Digest length is determined by the decoding context:
    - by the frame boundary of the ReferenceBytes value (e.g. “this message consists of exactly one ReferenceBytes”), or
    - by an outer length-prefix in a higher-level enclosing structure.
  - This layout is an explicit exception to the general OctetString = u64 + bytes rule. It keeps ReferenceBytes compact and relies on framing + the hash registry for length.
4.3 Encoding (normative)
Let R be a Reference. The canonical encoding function:
encode_reference_core_v1 : Reference → ReferenceBytes
is defined as:
- Emit hash_id = R.hash_id as u16.
- Emit the raw bytes of R.digest.
When HASH/ASL1 is implemented and the hash_id is known, the encoder MUST ensure:
len(R.digest) == expected_digest_length(hash_id)
where expected_digest_length is taken from the HASH/ASL1 registry.
The result is the canonical ReferenceBytes.
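A minimal Python sketch of this encoder follows. The dictionary EXPECTED_DIGEST_LENGTH is a hypothetical stand-in for the HASH/ASL1 registry; its single entry (0x0001 → 32 bytes) reflects the canonical deployment assignment described in §5.2 and is an assumption for any other deployment.

```python
import struct
from dataclasses import dataclass

# Hypothetical stand-in for the HASH/ASL1 registry (HashId -> digest length).
EXPECTED_DIGEST_LENGTH = {0x0001: 32}

@dataclass(frozen=True)
class Reference:
    hash_id: int   # HashId (uint16)
    digest: bytes  # raw digest bytes

def encode_reference_core_v1(ref: Reference) -> bytes:
    """Sketch of §4.3: u16 hash_id followed by the raw digest, no length field."""
    expected = EXPECTED_DIGEST_LENGTH.get(ref.hash_id)
    if expected is not None and len(ref.digest) != expected:
        raise ValueError(
            f"digest length {len(ref.digest)} != expected {expected} "
            f"for hash_id {ref.hash_id:#06x}"
        )
    # struct.pack raises struct.error if hash_id does not fit in a u16.
    return struct.pack(">H", ref.hash_id) + ref.digest
```

Unknown hash_id values pass through unchecked here, since no expected length is available for them.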
4.4 Decoding & consistency checks (normative)
Given a byte slice known to contain exactly one ReferenceBytes value, the canonical decoding function:
decode_reference_core_v1 : ReferenceBytes → Reference
is defined as:
- Read hash_id as u16.
- Treat all remaining bytes in the slice as the digest.
- Construct Reference{ hash_id, digest }.
Boundary requirement:
Decoding contexts MUST provide explicit boundaries for ReferenceBytes values (e.g., via an external length-prefix or by framing the entire message as a single ReferenceBytes value). A decoder MUST NOT read beyond the slice that defines the ReferenceBytes frame.
Cross-profile consistency with HASH/ASL1 (when present):
If the implementation also implements HASH/ASL1 and recognizes this hash_id, then:
-
Let
expected_len = expected_digest_length(hash_id)from the ASL1 registry. -
The implementation MUST enforce:
len(digest) == expected_len -
Any mismatch MUST result in an encoding/integrity error.
If the implementation does not implement HASH/ASL1 or does not recognize the hash_id:
- It MAY accept the value as a structurally well-formed Reference.
- It MUST treat the algorithm as unsupported for digest recomputation or verification.
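The decoding procedure and the optional cross-profile check can be sketched as below. The frame argument is assumed to be exactly the ReferenceBytes slice supplied by the decoding context. The registry parameter (a mapping from HashId to digest length) and the EncodingError type are hypothetical; passing registry=None models an implementation without HASH/ASL1.

```python
import struct
from dataclasses import dataclass
from typing import Mapping, Optional

class EncodingError(ValueError):
    """Hypothetical error type for encoding/integrity failures."""

@dataclass(frozen=True)
class Reference:
    hash_id: int
    digest: bytes

def decode_reference_core_v1(
    frame: bytes,
    registry: Optional[Mapping[int, int]] = None,
) -> Reference:
    """Sketch of §4.4: the frame boundary alone determines the digest length."""
    if len(frame) < 2:
        raise EncodingError("truncated ReferenceBytes: missing hash_id")
    (hash_id,) = struct.unpack(">H", frame[:2])
    digest = frame[2:]  # all remaining bytes in the frame
    if registry is not None:
        expected_len = registry.get(hash_id)
        # Only recognized hash_ids are checked; unknown ones are accepted
        # as structurally well-formed.
        if expected_len is not None and len(digest) != expected_len:
            raise EncodingError(
                f"digest length {len(digest)} != expected {expected_len}"
            )
    return Reference(hash_id, digest)
```

A caller that accepts an unrecognized hash_id this way would still need to treat the algorithm as unsupported for digest recomputation or verification.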
4.5 Injectivity
The mapping:
Reference → ReferenceBytes
defined by encode_reference_core_v1 is injective:
- Each Reference value has exactly one canonical byte string.
- Equality of ReferenceBytes implies equality of the underlying Reference (same hash_id, same digest bytes).
No additional normalization is performed.
5. Hash Interactions & Canonicality
5.1 Canonical hashing rule
For encoding profile ASL_ENC_CORE_V1, the canonical rule for constructing Reference values from Artifact values is:
ArtifactBytes = encode_artifact_core_v1(A)
digest = H(ArtifactBytes)
Reference = { hash_id = HID, digest = digest }
where:
- A is an Artifact (ASL/1-CORE),
- H is a hash function associated with HID in the ASL1 hash family,
- HID is a HashId (u16).
This is ASL/CORE-REF-DERIVE/1 instantiated with ASL_ENC_CORE_V1.
REF-DERIVE INV/ENC/1: Under ASL_ENC_CORE_V1, any component that claims to derive Reference values from Artifact values MUST use this rule.
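The derivation rule can be sketched end to end in Python. This example assumes HID = 0x0001 with SHA-256 as H (the canonical deployment default described in §5.2); the function name derive_reference and the flat (payload, tag_id) argument style are hypothetical simplifications.

```python
import hashlib
import struct
from typing import Optional, Tuple

HASH_ASL1_256 = 0x0001  # assumed HashId for SHA-256, per canonical deployments

def encode_artifact_core_v1(payload: bytes, tag_id: Optional[int] = None) -> bytes:
    """Canonical ArtifactBytes: flag, optional u32 tag, u64 length, payload."""
    header = b"\x00" if tag_id is None else b"\x01" + struct.pack(">I", tag_id)
    return header + struct.pack(">Q", len(payload)) + payload

def derive_reference(
    payload: bytes,
    tag_id: Optional[int] = None,
    hash_id: int = HASH_ASL1_256,
) -> Tuple[int, bytes]:
    """Returns (hash_id, digest) where digest = H(ArtifactBytes)."""
    if hash_id != HASH_ASL1_256:
        raise ValueError(f"unsupported hash_id in this sketch: {hash_id:#06x}")
    artifact_bytes = encode_artifact_core_v1(payload, tag_id)
    digest = hashlib.sha256(artifact_bytes).digest()
    return hash_id, digest
```

Because the hash input is ArtifactBytes rather than the raw payload, two artifacts with the same payload but different type tags yield different digests.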
5.2 Default algorithm in canonical deployments
In canonical Amduat 2.0 substrate deployments (per HASH/ASL1):
- HashId = 0x0001 is assigned to HASH-ASL1-256.
- Digest length is 32 bytes.
- HASH-ASL1-256 is SHA-256 or semantically equivalent.
This profile does not force any particular HashId in all deployments, but:
- if a deployment adopts HashId = 0x0001 as HASH-ASL1-256, then any Reference with hash_id = 0x0001 MUST have a 32-byte digest.
5.3 Deterministic agreement
If two implementations:
- implement ASL_ENC_CORE_V1, and
- use the same hash algorithm H for a given HashId,
then for any Artifact A they MUST:
- produce identical ArtifactBytes = encode_artifact_core_v1(A),
- produce identical digest = H(ArtifactBytes),
- produce identical Reference and ReferenceBytes = encode_reference_core_v1(Reference).
This is the determinism foundation used by ASL/1-STORE, PEL/1, FER/1, and FCT/1.
5.4 Identity contexts and encoding profile selection
For any context where Reference values are derived (e.g. a store, a PEL engine, a profile), the encoding profile MUST be fixed and explicit.
If a context adopts ASL_ENC_CORE_V1:
- All Reference values in that context MUST be derived via encode_artifact_core_v1 and the canonical hashing rule (§5.1).
- The context MUST NOT mix References derived from different canonical encoding profiles inside the same logical identity space.
This ensures that for a given (hash_id, digest) pair, there is a unique underlying ArtifactBytes and Artifact (modulo cryptographic collisions).
6. Examples (Non-Normative)
Hex values are shown compactly. Spaces separate fields for readability only and are not part of the encoding.
6.1 Artifact without type tag
Artifact:
bytes = DE AD // two bytes: 0xDE, 0xAD
type_tag = none
Encoding:
has_type_tag = 00
bytes_len = 0000000000000002
bytes = DEAD
Canonical ArtifactBytes:
00 0000000000000002 DEAD
Digest with HASH-ASL1-256 (SHA-256):
digest = SHA-256(00 0000000000000002 DEAD)
Assuming HashId = 0001 for HASH-ASL1-256, the ReferenceBytes are:
hash_id = 0001
digest = <32 digest bytes>
Canonical ReferenceBytes:
0001 <32 digest bytes>
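The example above can be reproduced with a few lines of Python. This is a sketch that assumes HASH-ASL1-256 is plain SHA-256 and HashId = 0x0001; the digest value itself is left to the script rather than quoted here.

```python
import hashlib

# Artifact from the example: bytes = DE AD, no type tag.
payload = bytes.fromhex("DEAD")

artifact_bytes = (
    b"\x00"                            # has_type_tag = 00
    + len(payload).to_bytes(8, "big")  # bytes_len as big-endian u64
    + payload                          # raw payload
)
# Matches the canonical ArtifactBytes shown above (fromhex ignores spaces).
assert artifact_bytes == bytes.fromhex("00 0000000000000002 DEAD")

digest = hashlib.sha256(artifact_bytes).digest()  # assumes SHA-256
reference_bytes = (0x0001).to_bytes(2, "big") + digest

# 2 bytes of hash_id plus 32 digest bytes.
assert len(reference_bytes) == 34
print("ArtifactBytes: ", artifact_bytes.hex())
print("ReferenceBytes:", reference_bytes.hex())
```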
6.2 Artifact with type tag & empty bytes
Artifact:
bytes = "" (empty)
type_tag = TypeTag{ tag_id = 5 }
Encoding:
has_type_tag = 01
type_tag = 00000005
bytes_len = 0000000000000000
bytes = (none)
Canonical ArtifactBytes:
01 00000005 0000000000000000
Hashing and ReferenceBytes proceed as in §6.1.
7. Conformance
An implementation conforms to ENC/ASL1-CORE v1.0.5 if and only if it:
- Correctly encodes and decodes Artifacts
  - Implements encode_artifact_core_v1 and decode_artifact_core_v1 exactly as in §3.3 and §3.4.
  - Produces and accepts only the canonical layout for ArtifactBytes.
  - Ensures injectivity and exact round-tripping.
- Correctly encodes and decodes References
  - Implements encode_reference_core_v1 and decode_reference_core_v1 exactly as in §4.3 and §4.4.
  - Produces and accepts only the canonical layout for ReferenceBytes (no digest_len field).
  - When HASH/ASL1 is also implemented:
    - Enforces digest-length consistency for all known HashIds, i.e. len(digest) == expected_digest_length(hash_id).
- Implements canonical hashing correctly
  - Uses ArtifactBytes from encode_artifact_core_v1 as the only input to ASL1 hash functions when deriving References under this profile.
  - Computes Reference via the canonical rule in §5.1.
  - Does not derive References from non-canonical or alternative encodings in contexts that claim to use ASL_ENC_CORE_V1.
- Preserves streaming-friendliness
  - Does not require backward reads or multi-pass parsing for either ArtifactBytes or ReferenceBytes.
  - Supports incremental hashing and streaming of payload bytes.
  - Ensures that decoding contexts provide explicit boundaries for each ReferenceBytes value.
- Respects layering and identity semantics
  - Does not re-define Artifact, Reference, TypeTag, or HashId (those come from ASL/1-CORE).
  - Treats storage, transport, and policy as out-of-scope (delegated to ASL/1-STORE and higher profiles).
  - Ensures that two logical ASL/1 values encode identically under this profile if and only if they are identical under ASL/1-CORE semantics.
Everything else — transport, storage layout, replication, indexing, overlays, and policy — belongs to ASL/1-STORE, HASH/ASL1, TGK/1, and higher profiles.
Document History
- 1.0.5 (2025-11-16): Registered as Tier-1 spec and aligned to the Amduat 2.0 substrate baseline.