Carl Niklas Rydberg b47b914224 Scaffold C layout and ASL registry model

2025-12-19 19:22:40 +01:00


ASL/1-CORE — Artifact Substrate Layer (Core)

Status: Approved
Owner: Niklas Rydberg
Version: 0.4.1
SoT: Yes
Last Updated: 2025-11-16
Linked Phase Pack: N/A
Tags: [deterministic, binary-minimalism]

Document ID: ASL/1-CORE
Layer: L0, pure logical value model (no persistence / execution semantics)

Depends on (normative):

None (foundational model)

Informative references:

ENC/ASL1-CORE v1.x — canonical encoding profile (ASL_ENC_CORE_V1)
HASH/ASL1 v0.2.2 — ASL1 hash family and HashId assignments
ASL/1-STORE v0.4.0 — content-addressable store over ASL/1-CORE
TGK/1-CORE v0.7.0 — trace graph kernel over Reference
PEL/1 — execution substrate

License

Except where otherwise noted, this document (text and diagrams) is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

The identifier registries and mapping tables (e.g. TypeTag IDs, HashId assignments, EdgeTypeId tables) are additionally made available under CC0 1.0 Universal (CC0) to enable unrestricted reuse in implementations and derivative specifications.

Code examples in this document are provided under the Apache License 2.0 unless explicitly stated otherwise. Test vectors, where present, are dedicated to the public domain under CC0 1.0.

0. Conventions

The key words MUST, MUST NOT, REQUIRED, SHOULD, and MAY are to be interpreted as in RFC 2119.

ASL/1-CORE defines only logical values and their equality. It does not define storage formats, protocols, runtime APIs, or policy.

Primitive logical types:

OctetString — finite sequence of bytes 0x00–0xFF.
uint16, uint32, uint64 — fixed-width unsigned integers.

Binary layout, endianness, and on-wire representation come from encoding profiles, not from ASL/1-CORE itself.

1. Purpose & Non-Goals

1.1 Purpose

ASL/1-CORE defines the artifact substrate for Amduat 2.0:

what an Artifact is,
what a TypeTag is,
what a Reference is, and
how content-addressed identity is defined via canonical encodings and hash functions.

It aims to keep the layers built on it predictable by enforcing that:

content and type hints are explicit,
identity is precise and field-based,
logical values are immutable,
all higher behavior (store, execution, provenance, policy) is layered on top.

All other Amduat layers — STORE, PEL, CIL, FCT, FER, OI, TGK — must respect and build on this substrate.

1.2 Non-goals

ASL/1-CORE explicitly does not define:

Store APIs or persistence guarantees.
Execution runtimes, scheduling, or side-effects.
Certificates, trust semantics, or authorization.
Networks, transports, or wire formats.
Compression, chunking, encryption, or indexing.

Those are defined by ASL/1-STORE, PEL/1, CIL/1, FCT/1, FER/1, OI/1, TGK/1-CORE, and other profiles.

2. Core Value Model

2.1 OctetString

OctetString = finite sequence of 8-bit bytes (0x00–0xFF)

ASL/1-CORE does not assign any structure (e.g., text vs binary) to OctetString. Structure, if any, is introduced by higher-layer semantics keyed off TypeTag.

2.2 TypeTag

A TypeTag identifies how higher layers intend to interpret an Artifact’s bytes.

TypeTag {
  tag_id: uint32
}

Properties:

tag_id is opaque at this layer.
No particular tag_id semantics are defined here.
tag_id participates in identity: changing the tag yields a different Artifact.

2.2.1 Tag ranges (conventions only)

By convention (non-normative here):

0x00000000–0x0FFFFFFF — core stack / shared profiles.
0x10000000–0xFFFFFFFF — extension / domain-specific tags.

Concrete registries and governance of tag_id live in separate documents.
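
As an illustration of the conventional split above, the following Python sketch classifies a tag_id by range. The helper name classify_tag_id and the labels "core" and "extension" are hypothetical; the ranges are conventions only and carry no normative weight at this layer.

```python
# Conventional tag_id ranges from section 2.2.1 (non-normative at this layer).
CORE_MAX = 0x0FFFFFFF
UINT32_MAX = 0xFFFFFFFF


def classify_tag_id(tag_id: int) -> str:
    """Return the conventional range label for a uint32 tag_id (hypothetical helper)."""
    if not 0 <= tag_id <= UINT32_MAX:
        raise ValueError("tag_id must fit in uint32")
    # 0x00000000-0x0FFFFFFF: core stack / shared profiles; everything above: extensions.
    return "core" if tag_id <= CORE_MAX else "extension"
```

A real deployment would consult the concrete registries rather than rely on the range alone.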

2.3 Artifact

An Artifact is the fundamental immutable value in ASL/1:

Artifact {
  bytes:    OctetString
  type_tag: optional TypeTag
}

Properties:

Immutable logical value.
Two identity-sensitive dimensions:
- bytes — exact content bytes.
- type_tag — presence + tag_id if present.

ASL/CORE-ART-EQ/1 Two Artifacts A and B are identical in ASL/1-CORE iff:

A.bytes and B.bytes are byte-for-byte equal; and

either both have no type_tag, or both have a type_tag and A.type_tag.tag_id == B.type_tag.tag_id.

No encoding profile, store, or runtime may alter this equality.

ASL/CORE-IMMUT/1 Once an Artifact value is created, it is considered immutable. Any change to bytes or type_tag produces a different Artifact.
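
One possible in-memory shape for these rules is sketched below in Python, using frozen dataclasses so that field-wise equality and immutability come from the language. This is a minimal sketch, not a reference implementation; the field name data (standing in for Artifact.bytes) and the validation choices are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TypeTag:
    tag_id: int  # uint32, opaque at this layer

    def __post_init__(self) -> None:
        if not 0 <= self.tag_id <= 0xFFFFFFFF:
            raise ValueError("tag_id must fit in uint32")


@dataclass(frozen=True)
class Artifact:
    data: bytes                          # Artifact.bytes in the spec (renamed here)
    type_tag: Optional[TypeTag] = None   # None models "no type_tag"

    def __post_init__(self) -> None:
        # Insist on the immutable bytes type so equality is strictly byte-for-byte.
        if not isinstance(self.data, bytes):
            raise TypeError("data must be bytes")
```

The generated equality compares (data, type_tag) field by field, which matches ASL/CORE-ART-EQ/1: an untagged Artifact never equals a tagged one, and assigning to a field raises an error rather than mutating the value.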

2.4 HashId

HashId = uint16

A HashId identifies a particular hash algorithm in a given family.

ASL/1-CORE itself is hash-family agnostic. The Amduat 2.0 core stack uses the "ASL1" family defined in HASH/ASL1 as the canonical family for identity-critical References.

2.5 Reference

A Reference is a content address for an Artifact:

Reference {
  hash_id: HashId
  digest:  OctetString
}

Interpretation:

hash_id selects a hash algorithm (e.g. HASH-ASL1-256).
digest is that algorithm’s digest of a canonical encoding of some Artifact.

ASL/CORE-REF-EQ/1 Two References R1 and R2 are identical iff:

R1.hash_id == R2.hash_id, and

R1.digest and R2.digest are byte-for-byte equal.

No cross-hash_id equivalence is defined at this layer. If two different (hash_id, digest) pairs refer to Artifacts that happen to be “the same” in some application sense, that is strictly a higher-layer interpretation.
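
The Reference equality rule can be sketched the same way. The Python below is an illustrative model only; the class layout and the digest snapshotting are assumptions, and the hash_id values used in the usage note are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    hash_id: int   # HashId, uint16
    digest: bytes  # digest length is determined by the algorithm, not by this layer

    def __post_init__(self) -> None:
        if not 0 <= self.hash_id <= 0xFFFF:
            raise ValueError("hash_id must fit in uint16")
        # Snapshot the digest as immutable bytes (frozen dataclass, so bypass setattr).
        object.__setattr__(self, "digest", bytes(self.digest))
```

Under this model, Reference(1, d) and Reference(2, d) are distinct values even when the digest bytes d are equal, because no cross-hash_id equivalence exists at this layer.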

3. Encoding Profiles

ASL/1-CORE separates logical values from concrete encodings via encoding profiles.

3.1 EncodingProfileId

EncodingProfileId = uint16

Each encoding profile (e.g. ASL_ENC_CORE_V1) is defined in its own document and specifies:

canonical ArtifactBytes encodings;
optionally ReferenceBytes encodings;
invariants required to preserve ASL/1-CORE identity.

The baseline encoding profile in Amduat 2.0 is:

ASL_ENC_CORE_V1 = 0x0001 — defined in ENC/ASL1-CORE v1.x.

3.2 Profile requirements

Any encoding profile used with ASL/1-CORE MUST satisfy:

Identity preservation
- For all Artifacts A and B: A and B are identical under ASL/CORE-ART-EQ/1 ⇔ their canonical encodings under that profile are bit-identical.
Injectivity
- Distinct Artifacts MUST NOT produce identical canonical encodings.
Stability and determinism
- For any Artifact, canonical encoding MUST be stable across time and implementations.
- For any Artifact, canonical encoding MUST NOT depend on environment, clock, locale, or configuration.
Explicit structure
- Field ordering and numeric formats MUST be fixed and unambiguous.
Byte transparency
- Artifact.bytes MUST be encoded exactly as-is (no hidden transcoding).
Streaming-friendliness
- Canonical encodings MUST be producible and consumable in a single forward-only pass.

Encoding profiles MAY impose extra constraints (e.g. on particular TypeTag subsets) but MUST NOT break the above.
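
To make these requirements concrete, here is a toy encoding in Python that is intended to satisfy them. It is NOT ASL_ENC_CORE_V1, whose layout is defined in ENC/ASL1-CORE and is not reproduced here; the layout (presence flag, big-endian tag_id, big-endian length, raw bytes) and the names toy_encode and toy_decode are assumptions made purely for illustration.

```python
import struct
from typing import Optional, Tuple


def toy_encode(data: bytes, tag_id: Optional[int]) -> bytes:
    """Toy layout: flag(1) [tag_id(4, big-endian)] length(8, big-endian) data."""
    if tag_id is None:
        header = b"\x00"                              # no type_tag
    else:
        header = b"\x01" + struct.pack(">I", tag_id)  # type_tag present
    # Artifact bytes are appended unchanged (byte transparency).
    return header + struct.pack(">Q", len(data)) + data


def toy_decode(buf: bytes) -> Tuple[bytes, Optional[int]]:
    """Inverse of toy_encode; rejects any non-canonical input."""
    if not buf or buf[0] not in (0, 1):
        raise ValueError("bad presence flag")
    pos = 1
    tag_id = None
    if buf[0] == 1:
        (tag_id,) = struct.unpack_from(">I", buf, pos)
        pos += 4
    (length,) = struct.unpack_from(">Q", buf, pos)
    pos += 8
    if pos + length != len(buf):
        raise ValueError("length mismatch or trailing bytes")
    return buf[pos:], tag_id
```

Because every field has a fixed width or an explicit length and the decoder accepts exactly one encoding per value, decoding recovers the original (bytes, tag_id) pair, so distinct Artifacts cannot share an encoding. Note that an absent tag and tag_id 0 encode differently, as the identity rule requires.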

4. Hashing and Reference Derivation

ASL/1-CORE defines how canonical encodings and hash functions combine to produce References.

4.1 Canonical encoding step

Given:

Artifact A,
encoding profile P with canonical encoder encode_P(A) -> ArtifactBytes,

encode_P MUST satisfy §3.2.

4.2 Reference derivation rule

Given:

Artifact A,
encoding profile P,
hash algorithm H with:
- HashId = HID,
- fixed digest length L bytes,

then the Reference R for A under (P, H) is:

ArtifactBytes = encode_P(A)
digest        = H(ArtifactBytes)
Reference     = { hash_id = HID, digest = digest }

ASL/CORE-REF-DERIVE/1 Any component that claims to derive References from Artifacts for a given (EncodingProfileId, HashId) MUST use this exact procedure.
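
The three-step procedure can be sketched in Python as follows. Two loud assumptions: the encoder is a toy stand-in rather than ASL_ENC_CORE_V1, and SHA-256 from hashlib is used only as a placeholder for H, since the concrete algorithm behind each HashId is defined in HASH/ASL1 and not in this document. The names encode_p and derive_reference are hypothetical.

```python
import hashlib
import struct
from typing import Callable, NamedTuple, Optional


class Reference(NamedTuple):
    hash_id: int
    digest: bytes


def encode_p(data: bytes, tag_id: Optional[int]) -> bytes:
    # Toy canonical encoder (assumption): flag, optional tag_id, length, raw bytes.
    tag = b"\x00" if tag_id is None else b"\x01" + struct.pack(">I", tag_id)
    return tag + struct.pack(">Q", len(data)) + data


def sha256_digest(artifact_bytes: bytes) -> bytes:
    # Placeholder hash; the real mapping from HashId to algorithm lives in HASH/ASL1.
    return hashlib.sha256(artifact_bytes).digest()


def derive_reference(
    data: bytes,
    tag_id: Optional[int],
    hash_id: int = 0x0001,
    hash_fn: Callable[[bytes], bytes] = sha256_digest,
) -> Reference:
    artifact_bytes = encode_p(data, tag_id)  # ArtifactBytes = encode_P(A)
    digest = hash_fn(artifact_bytes)         # digest = H(ArtifactBytes)
    return Reference(hash_id, digest)        # Reference = {hash_id, digest}
```

The hash covers the canonical encoding, not the raw bytes alone, so the same content with a different type_tag yields a different Reference.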

4.3 Deterministic agreement lemma (informative)

If two conformant implementations share:

the same encoding profile P, and
the same hash algorithm H with HashId = HID,

then for any Artifact A:

both will compute identical ArtifactBytes,
both will compute identical digest = H(ArtifactBytes),
both will form identical Reference {hash_id = HID, digest = digest}.

This is the basis for cross-Store and cross-domain determinism in Amduat.
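
A small Python sketch of the lemma: two differently structured implementations, one hashing the whole encoding at once and one feeding it to the hash in a single forward pass, arrive at the same Reference. The toy header layout, the use of SHA-256 as a stand-in for H, and both function names are assumptions for illustration.

```python
import hashlib
import struct
from typing import Iterable, Optional, Tuple

HID = 0x0001  # HashId shared by both implementations


def toy_header(length: int, tag_id: Optional[int]) -> bytes:
    # Toy canonical header (assumption): flag, optional tag_id, length.
    tag = b"\x00" if tag_id is None else b"\x01" + struct.pack(">I", tag_id)
    return tag + struct.pack(">Q", length)


def reference_one_shot(data: bytes, tag_id: Optional[int]) -> Tuple[int, bytes]:
    # Implementation 1: materialize the full encoding, then hash it.
    artifact_bytes = toy_header(len(data), tag_id) + data
    return (HID, hashlib.sha256(artifact_bytes).digest())


def reference_streaming(
    chunks: Iterable[bytes], total_len: int, tag_id: Optional[int]
) -> Tuple[int, bytes]:
    # Implementation 2: stream the same encoding into the hash chunk by chunk.
    h = hashlib.sha256()
    h.update(toy_header(total_len, tag_id))
    for chunk in chunks:
        h.update(chunk)
    return (HID, h.digest())
```

For example, reference_one_shot(b"hello world", 7) equals reference_streaming([b"hello", b" ", b"world"], 11, 7), because both hash the identical byte sequence.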

4.4 Canonical family for Amduat 2.0 (informative)

While ASL/1-CORE is conceptually family-agnostic, the Amduat 2.0 substrate standardizes:

ASL_ENC_CORE_V1 as the canonical Artifact encoding profile;
HASH-ASL1-256 (HashId = 0x0001) as the canonical default hash algorithm for identity-critical surfaces.

Other (EncodingProfileId, HashId) pairs are allowed but must be explicitly declared by the consuming profile or implementation.

4.5 Crypto agility

ASL/1-CORE supports evolution by:

delegating algorithm definitions and HashId assignments to HASH/ASL1;
delegating binary encodings to ENC/* profiles.

Higher layers MAY:

compute multiple References for the same Artifact (multi-hash, multi-encoding),
define migration policies,
mark some References as “preferred” or “legacy”.

ASL/1-CORE itself:

treats References as opaque (hash_id, digest) pairs;
does not specify any relationship between different References to “the same” Artifact; the only relation it defines is equality of (hash_id, digest) pairs under ASL/CORE-REF-EQ/1.
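
As a sketch of what a higher layer might do with this freedom, the Python below derives one Reference per configured hash algorithm for the same Artifact. The HashId values in the table (including 0x8001), the pairing of those ids with SHA-256 and SHA-512, the toy encoder, and the name derive_all are all hypothetical; real assignments come from HASH/ASL1.

```python
import hashlib
import struct
from typing import Callable, Dict, List, Optional, Tuple

Reference = Tuple[int, bytes]  # (hash_id, digest)

# Hypothetical HashId table, for illustration only.
HASHES: Dict[int, Callable[[bytes], bytes]] = {
    0x0001: lambda b: hashlib.sha256(b).digest(),
    0x8001: lambda b: hashlib.sha512(b).digest(),
}


def toy_encode(data: bytes, tag_id: Optional[int]) -> bytes:
    # Toy canonical encoder (assumption): flag, optional tag_id, length, raw bytes.
    tag = b"\x00" if tag_id is None else b"\x01" + struct.pack(">I", tag_id)
    return tag + struct.pack(">Q", len(data)) + data


def derive_all(data: bytes, tag_id: Optional[int] = None) -> List[Reference]:
    """Derive one Reference per configured hash, in ascending hash_id order."""
    artifact_bytes = toy_encode(data, tag_id)
    return [(hid, fn(artifact_bytes)) for hid, fn in sorted(HASHES.items())]
```

The resulting References are unrelated values as far as ASL/1-CORE is concerned; any record that they address the same Artifact, and any notion of "preferred" or "legacy", would have to be kept by the higher layer.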

5. Logical vs Physical Representation

5.1 Logical-only substrate

Artifacts and References are logical values.

ASL/1-CORE:

does not care where or how they’re stored;
does not care how they’re transported;
does not assume any particular API shape.

5.2 Internal representation freedom

Implementations MAY represent values as:

structs,
slices,
memory-mapped buffers,
immutable trees,
or any other structure,

so long as they can:

emit canonical encodings for supported profiles,
compute hashes correctly,
respect ASL/1-CORE identity semantics.

5.3 Passing values between layers

Passing an Artifact or Reference between components means passing a value, not a mutable object.

Implementations:

MAY share underlying buffers internally,
MUST treat the logical value as immutable,
MUST NOT let in-place mutation change a value that has already been observed as an Artifact or Reference.
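
One way to honor these rules in Python is to snapshot any caller-supplied buffer at construction time, as in the sketch below. The class layout and property names are assumptions for illustration; other strategies (for example, sharing a buffer that is never written again) are equally valid under the rules above.

```python
from typing import Optional, Union


class Artifact:
    """Illustrative immutable value; snapshots mutable input on construction."""

    __slots__ = ("_data", "_tag_id")

    def __init__(
        self,
        data: Union[bytes, bytearray, memoryview],
        tag_id: Optional[int] = None,
    ) -> None:
        # bytes() copies mutable inputs, so later writes to the caller's
        # buffer cannot change the value observed through this Artifact.
        self._data = bytes(data)
        self._tag_id = tag_id

    @property
    def data(self) -> bytes:
        return self._data  # bytes is immutable, so returning it directly is safe

    @property
    def tag_id(self) -> Optional[int]:
        return self._tag_id
```

Usage: after buf = bytearray(b"abc"), a = Artifact(buf), and buf[0] = 0x7A, the value a.data is still b"abc".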

6. Identity, Equality, and Collisions

6.1 Artifact identity

Restating for emphasis:

ASL/CORE-ART-ID/1 Artifact identity is purely field-based:

bytes equality + type_tag presence + tag_id equality (if present).

Encoding profiles and hash functions MUST preserve this identity; they MUST NOT introduce alternative notions of “the same artifact” at this layer.

6.2 Reference identity

ASL/CORE-REF-ID/1 Reference identity is purely:

hash_id equality + digest byte equality.

Different (hash_id, digest) pairs are always distinct References, even if they logically point to the same underlying Artifact as understood by some higher layer.

6.3 Collision assumptions

ASL/1-CORE assumes the configured hash algorithms are cryptographically strong:

collisions are treated as extraordinary substrate failures, not supported behavior.

If two distinct Artifacts produce the same (hash_id, digest):

ASL/1-CORE itself does not define remediation;
ASL/1-STORE is responsible for surfacing this as an integrity error;
higher profiles (e.g. CIL/1, FCT/1) MAY define detection and response strategies.
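
The following Python sketch shows how a store-like component might surface a collision as an integrity error. It is not the ASL/1-STORE API; ToyStore, IntegrityError, the toy encoder, and the deliberately weak one-byte digest (used only to make a collision easy to trigger) are all hypothetical.

```python
import struct
from typing import Callable, Dict, Optional, Tuple

Reference = Tuple[int, bytes]                # (hash_id, digest)
ArtifactValue = Tuple[bytes, Optional[int]]  # (bytes, tag_id or None)


class IntegrityError(Exception):
    """Two distinct Artifacts mapped to the same Reference."""


def toy_encode(data: bytes, tag_id: Optional[int]) -> bytes:
    # Toy canonical encoder (assumption): flag, optional tag_id, length, raw bytes.
    tag = b"\x00" if tag_id is None else b"\x01" + struct.pack(">I", tag_id)
    return tag + struct.pack(">Q", len(data)) + data


def weak_digest(artifact_bytes: bytes) -> bytes:
    # Deliberately weak 1-byte "hash": byte sum modulo 256. Never use in practice.
    return bytes([sum(artifact_bytes) % 256])


class ToyStore:
    def __init__(self, hash_id: int, hash_fn: Callable[[bytes], bytes]) -> None:
        self._hash_id = hash_id
        self._hash_fn = hash_fn
        self._items: Dict[Reference, ArtifactValue] = {}

    def put(self, data: bytes, tag_id: Optional[int] = None) -> Reference:
        ref = (self._hash_id, self._hash_fn(toy_encode(data, tag_id)))
        existing = self._items.get(ref)
        # Same Reference but a different Artifact means the hash has collided.
        if existing is not None and existing != (data, tag_id):
            raise IntegrityError("collision for reference %r" % (ref,))
        self._items[ref] = (data, tag_id)
        return ref
```

With ToyStore(0xFFFF, weak_digest), putting b"ab" twice returns the same Reference, while putting b"ba" afterwards raises IntegrityError because both encodings have the same byte sum.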

7. Relationship to Other Layers (Informative)

7.1 ASL/1-STORE

ASL/1-STORE:

models StoreInstances as partial mappings Reference -> Artifact,
parameterizes each StoreInstance by a single StoreConfig = {encoding_profile, hash_id},
uses ASL/CORE-REF-DERIVE/1 to compute References in put,
respects ASL/CORE-ART-ID/1 and ASL/CORE-REF-ID/1.

STORE adds persistence, error semantics, and StoreConfig; it does not change the core value model.

7.2 TGK/1-CORE

TGK/1-CORE:

treats References as graph nodes,
treats specific Artifacts (EdgeArtifacts) as encodings of graph edges,
defines a ProvenanceGraph as a projection over Artifacts and configured profiles.

TGK relies on ASL/1-CORE to ensure:

Artifacts are immutable,
References are stable and deterministic across implementations,
all provenance evidence is expressed as Artifacts and References.

7.3 PEL/1, CIL/1, FCT/1, FER/1, OI/1

These layers:

allocate specific TypeTag ranges and schemas,
encode programs, execution traces, certificates, facts, overlays as Artifacts,
use References consistently via ASL/CORE-REF-DERIVE/1,
may store those values in ASL/1-STORE and expose them through TGK.

They must not override or reinterpret ASL/1-CORE equality; they build on it.

8. Conformance

An implementation is ASL/1-CORE–conformant if it:

Implements the value types
- Provides logical structures for Artifact, TypeTag, and Reference with at least the fields described in §2.
Respects Artifact and Reference equality
- Implements identity exactly as in ASL/CORE-ART-EQ/1 and ASL/CORE-REF-EQ/1 (and the derived ID invariants).
Uses encoding profiles appropriately
- Uses only encoding profiles that satisfy §3.2.
- For any encoding profile it claims to support, can produce canonical encodings for all Artifacts.
Derives References correctly
- Derives References strictly according to ASL/CORE-REF-DERIVE/1 for the declared (EncodingProfileId, HashId) pair.
Enforces immutability
- Treats Artifacts and References as immutable logical values.
- Does not leak any mechanism that would let a consumer mutate an Artifact or Reference “in place”.
Maintains separation of concerns
- Does not embed storage, execution, policy, or graph semantics into ASL/1-CORE constructs.
- Leaves stores, execution engines, and graph kernels to their respective layers.

Everything else — API design, transport formats, performance characteristics, deployment topology — lies outside ASL/1-CORE and MUST be specified by separate surfaces.

Document History

0.4.1 (2025-11-16): Registered as Tier-1 spec and aligned to the Amduat 2.0 substrate baseline.
