amduat-api/notes/on-disk layout.md
2026-01-17 00:19:49 +01:00


Good, this is the right moment to pin this down, because on-disk shape is where abstraction leaks if we're sloppy.

I'll answer in layers:

  1. What ASL actually requires on disk (minimal contract)
  2. A concrete, sane on-disk layout that fits what you already have
  3. How shards, blocks, artifacts, logs, and snapshots map
  4. How this scales from “single laptop” to “Common”
  5. Why this does not over-spec ASL-CORE

I'll stay very explicit about what is normative vs conventional.


1. What ASL actually requires on disk (the hard minimum)

ASL itself does not care about directories.

ASL-STORE / ASL-STORE-INDEX require only that the store can provide:

  1. Immutable blocks
  2. An append-only log
  3. Sealed snapshots
  4. Deterministic replay
  5. CAS addressing

So the only hard requirements are:

  • Blocks are immutable after seal
  • Index entries point to (block_id, offset, length)
  • Logs are append-only and ordered
  • Snapshots pin a prefix of state

Everything else is an implementation choice.

That means: 👉 your current “store root with shards” is already compatible.
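The minimal contract above can be written down as an interface sketch. This is illustrative only: the method names (`put_block`, `read`, `append_log`, `snapshot`) are hypothetical, not taken from ASL-STORE itself.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class AslStore(Protocol):
    """Sketch of the ASL-STORE hard minimum; names are hypothetical."""

    def put_block(self, payload: bytes) -> str:
        """Seal bytes into an immutable block; return its CAS block_id."""
        ...

    def read(self, block_id: str, offset: int, length: int) -> bytes:
        """Read a slice of a sealed block (what index entries point at)."""
        ...

    def append_log(self, record: bytes) -> int:
        """Append-only and strictly ordered; return the sequence number."""
        ...

    def snapshot(self) -> str:
        """Seal a snapshot pinning the current prefix of state."""
        ...
```

Any backend that satisfies these four operations, whatever its directory shape, is a valid store.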


2. A concrete, sane on-disk layout

Here is a layout that is:

  • compatible with your current shard idea
  • works on POSIX
  • maps cleanly to ZFS later
  • easy to reason about during rescue / forensics

Store root (per domain)

/asl/
└── domains/
    └── <domain-id>/
        ├── meta/
        ├── blocks/
        ├── index/
        ├── log/
        ├── snapshots/
        └── tmp/

Everything below is domain-local.


3. Blocks (the real storage substrate)

3.1 Block storage (immutable)

blocks/
├── open/
│   └── blk_<uuid>.tmp
└── sealed/
    ├── 00/
    │   └── <blockid>.blk
    ├── 01/
    │   └── <blockid>.blk
    └── ff/
        └── <blockid>.blk

  • blockid = CAS hash
  • Sharded by prefix (first byte or two)
  • Open blocks are never visible
  • Sealed blocks are immutable

This directly matches your block + offset mental model.

Important: artifacts do not live as files. They live inside blocks.
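The open → sealed lifecycle is small enough to sketch. This assumes SHA-256 as the CAS hash and a two-hex-character shard prefix; both are conventions, not requirements (a production store would also fsync the directories):

```python
import hashlib
import os
import tempfile


def seal_block(store_root: str, payload: bytes) -> str:
    """Seal an open block: hash its bytes (CAS), then atomically
    move it from blocks/open/ into blocks/sealed/<prefix>/."""
    open_dir = os.path.join(store_root, "blocks", "open")
    os.makedirs(open_dir, exist_ok=True)

    # Write into blocks/open/ first; open blocks are never visible.
    fd, tmp_path = tempfile.mkstemp(prefix="blk_", suffix=".tmp", dir=open_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())

    block_id = hashlib.sha256(payload).hexdigest()  # CAS address

    sealed_dir = os.path.join(store_root, "blocks", "sealed", block_id[:2])
    os.makedirs(sealed_dir, exist_ok=True)
    final_path = os.path.join(sealed_dir, f"{block_id}.blk")

    # rename() is atomic on POSIX within one filesystem, so a block
    # either appears fully sealed or not at all.
    os.rename(tmp_path, final_path)
    return block_id
```

After the rename the block is immutable by convention: nothing ever opens a sealed `.blk` for writing.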


4. Index (where artifacts become visible)

Your shard-based approach fits perfectly here.

index/
├── shard-000/
│   ├── segment-0001.idx
│   ├── segment-0002.idx
│   └── bloom.bin
├── shard-001/
│   └── ...
└── shard-fff/

Each segment-XXXX.idx is:

  • append-only while open
  • immutable once sealed
  • referenced by the log

Index records point to:

ArtifactKey → (BlockID, offset, length)

This is exactly ASL-STORE-INDEX.
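The record shape and the shard routing can be sketched in a few lines. The sharding scheme here (top 12 bits of a SHA-256 of the key, giving `shard-000` through `shard-fff`) is an assumption chosen to match the naming above, not something ASL-STORE-INDEX mandates:

```python
import hashlib
from typing import NamedTuple


class IndexEntry(NamedTuple):
    """One index record: where an artifact's bytes live inside a block."""
    block_id: str
    offset: int
    length: int


def shard_of(artifact_key: str, shard_bits: int = 12) -> str:
    """Route a key to one of 2**shard_bits shards (hypothetical scheme
    matching the shard-000 .. shard-fff naming above)."""
    h = hashlib.sha256(artifact_key.encode("utf-8")).digest()
    shard = int.from_bytes(h[:2], "big") >> (16 - shard_bits)
    return f"shard-{shard:03x}"


def read_artifact(block_path: str, entry: IndexEntry) -> bytes:
    """Resolve an index entry: seek into the sealed block, read the slice."""
    with open(block_path, "rb") as f:
        f.seek(entry.offset)
        return f.read(entry.length)
```

Note that the artifact itself is never a file: `read_artifact` only ever touches a sealed block.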


5. Append-only log (the spine of truth)

log/
├── log-0000000000000000.asl
├── log-0000000000001000.asl
└── current

Log records include:

  • index additions
  • tombstones
  • segment seals
  • DAM updates
  • witness rotation artifacts

Rules:

  • Logs are strictly ordered
  • Never rewritten
  • Replayable from snapshot
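An append path obeying those rules could look like this. The JSON-lines encoding, the 4096-records-per-segment split, and using `current` to record the last sequence number are all illustrative choices (the layout above may instead use `current` as a pointer to the active segment); only "append, never rewrite" is the rule:

```python
import json
import os


RECORDS_PER_SEGMENT = 4096  # illustrative; gives log-...0000, log-...1000 (hex)


def append_record(log_dir: str, record: dict) -> int:
    """Append one record to the active log segment; return its sequence
    number. Segments are only ever opened in append mode."""
    os.makedirs(log_dir, exist_ok=True)

    seq_path = os.path.join(log_dir, "current")
    seq = 0
    if os.path.exists(seq_path):
        with open(seq_path) as f:
            seq = int(f.read()) + 1

    base = (seq // RECORDS_PER_SEGMENT) * RECORDS_PER_SEGMENT
    segment = os.path.join(log_dir, f"log-{base:016x}.asl")

    line = json.dumps({"seq": seq, **record}, sort_keys=True)
    with open(segment, "a", encoding="utf-8") as f:  # append-only
        f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())

    with open(seq_path, "w") as f:  # remember the last sequence number
        f.write(str(seq))
    return seq
```

Replay is then trivial: read segments in filename order, records in line order.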

6. Snapshots (checkpoints, not magic)

snapshots/
├── snapshot-00000123/
│   ├── manifest.yaml
│   ├── pinned-segments.txt
│   └── pinned-blocks.txt
└── snapshot-00000124/

A snapshot is:

  • a commit point
  • a pin-set (segments + blocks)
  • a replay anchor

ZFS snapshots fit perfectly here but are not required.
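Materializing a snapshot is just writing the pin-set down. The manifest fields below are placeholders, not a defined ASL schema:

```python
import os
import time
from typing import List


def write_snapshot(store_root: str, snap_id: int,
                   segments: List[str], blocks: List[str]) -> str:
    """Create a snapshot directory: a pin-set of sealed segments and
    blocks, plus a small manifest (illustrative fields only)."""
    snap_dir = os.path.join(store_root, "snapshots", f"snapshot-{snap_id:08d}")
    os.makedirs(snap_dir)

    with open(os.path.join(snap_dir, "pinned-segments.txt"), "w") as f:
        f.write("\n".join(segments) + "\n")
    with open(os.path.join(snap_dir, "pinned-blocks.txt"), "w") as f:
        f.write("\n".join(blocks) + "\n")

    # Minimal manifest: enough to anchor replay, nothing more.
    with open(os.path.join(snap_dir, "manifest.yaml"), "w") as f:
        f.write(f"snapshot: {snap_id}\n")
        f.write(f"created: {int(time.time())}\n")
        f.write(f"segments: {len(segments)}\n")
        f.write(f"blocks: {len(blocks)}\n")
    return snap_dir
```

Everything a snapshot pins is already immutable, which is why a ZFS snapshot of the same directories is a drop-in accelerator rather than a requirement.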


7. Meta (authority, policy, identity)

meta/
├── dam/
│   └── dam-current.art
├── policy/
│   └── policy-current.art
├── keys/
│   ├── pub/
│   └── priv/
└── host/
    └── host-id

Important rule:

Meta artifacts are still artifacts.

They must:

  • be sealed
  • be indexed
  • appear in the log

The files here are cached materializations.


8. Tmp (explicitly unsafe)

tmp/
├── ingest/
├── scratch/
└── pty/

This is where:

  • asl-capture writes streams
  • open blocks live
  • crashes are allowed

Nothing here is trusted.


9. How shards fit (your current design)

You said:

“I have a store root with shards with artifacts in them.”

That maps cleanly if you reinterpret it as:

  • shards own index segments
  • artifacts are logical entries, not files
  • shard locality = lookup acceleration

So instead of:

shard/
└── artifact-file

You have:

shard/
└── index segments → blocks

This is the key mental shift you already sensed.


10. Common vs Personal vs Group (on disk)

They look identical on disk.

Differences live in:

  • DAM
  • policy
  • witness artifacts

Common

  • quorum witness
  • stricter validation
  • often read-only on most hosts

Personal

  • self-authority
  • multiple active writers allowed
  • same block/index/log structure

Group

  • hybrid policies

No special filesystem layout needed.


11. Why this is the right abstraction boundary

You said:

“I used to look at the data as artifacts. Now it seems I'm better off looking at data as blocks with artifacts in them.”

That's exactly right.

Artifacts are semantic. Blocks are physical.

This layout lets you:

  • reason in artifacts
  • store in blocks
  • snapshot blocks
  • index artifacts
  • federate semantics

without mixing concerns.


12. One sentence to lock it in

Artifacts are indexed meanings; blocks are frozen bytes; snapshots pin the relationship.


13. What you can safely implement now

Even before ASL-CORE / PEL are finished, you can implement:

  • block store
  • block sealing
  • index segments
  • append-only log
  • snapshot pinning
  • DAM storage and validation

All of that is infrastructure, not policy.


If you want next, I strongly suggest one of these:

  • walk through one artifact write end-to-end on disk
  • show how asl-capture writes into blocks
  • show how a personal domain syncs across two machines
  • map this layout directly onto ZFS datasets

You're no longer at risk of mixing projects — you're converging.