Article Object Model (AOM)

Crumpet Media — AOM (Spec v0.4, Multi-Asset Bundle)

Status: Final draft for implementation

Core upgrades vs v0.3:

  • Canonical AOM Bundle (multi-file package) using deterministic CAR (UnixFS dir).

  • AOM Document = manifest (CBOR) inside the bundle, binding every file via hashes.

  • Still IPFS-first, compression required, encryption-at-rest mandatory, and Commit→Reveal with fees.

  • Three encryption modes (PAR/passphrase/recipient) unchanged, but now apply to the entire bundle.


0) Design Tenets

  • Bundle-first: An article (or any post) is a package: manifest + content parts (markdown, preview, images, video, attachments).

  • Deterministic & verifiable: Manifest binds every file by hash; rendering is reproducible; preview.html is trust-but-verify.

  • IPFS-first + L1-gated: The bundle is ciphertext on IPFS; Reveal on L1 releases the key (PAR) or the parameters/wraps (other modes).

  • Standardized pipeline: CAR → zstd → AEAD encrypt (no deviations).

  • Strict indexer gate: A post is Published only if Commit→Reveal, fees, signatures, and storage checks pass.


1) The AOM Bundle (multi-asset package)

1.1 Bundle format

  • Format: CAR v1 containing a UnixFS directory as root.

  • Canonicalization: File paths and links are stable, lowercase, UTF-8; directory entries sorted byte-lexicographically; timestamps zeroed.

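  • Required layout (paths relative to bundle root): /crumpet.cbor is always present; /body.md, /preview.html, /media/* and /attachments/* are included as the post requires.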

  • Deterministic build: Given the same inputs, the CAR bytes MUST be identical (golden vectors recommended).

You can still do “image-only” or “video-only” posts — the manifest exists, but body.md may be absent. The manifest’s components[] (below) describes what’s inside.
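
The canonical ordering rule can be illustrated with a minimal sketch. It assumes ordering is applied to full bundle-relative paths and that non-lowercase names are rejected rather than silently rewritten; the function name canonical_entries is hypothetical, and the final CAR rules are still listed under Open Items.

```python
def canonical_entries(paths):
    """Return bundle paths in canonical order, rejecting non-lowercase names."""
    for p in paths:
        if p != p.lower():
            raise ValueError(f"path is not lowercase: {p!r}")
    # Sort by the UTF-8 byte encoding, not by locale collation.
    return sorted(paths, key=lambda p: p.encode("utf-8"))


entries = canonical_entries(
    ["media/hero.png", "crumpet.cbor", "body.md", "media/chart.webp"]
)
```

Sorting on encoded bytes keeps the result independent of the builder's locale, which is what a byte-identical CAR needs.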

1.2 AOM Document (manifest) — crumpet.cbor

  • Canonical CBOR object that binds the entire bundle:

    • Identifies content type (article, image, video, comment, press_release, …).

    • Lists components and per-file hashes (SHA-256) and sizes.

    • Records preview.html hash if present (for trust-but-verify).

    • Tracks language, author, tags, licensing, and lineage (versioning, previous CID).

  • AOMID = CIDv1(sha2-256) over the canonical CBOR bytes of /crumpet.cbor.
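
As a rough illustration of the AOMID rule, the sketch below builds a CIDv1 string from already-canonical manifest bytes. The spec fixes CIDv1 and sha2-256 but does not name the multicodec, so the dag-cbor code (0x71) and the base32 text form are assumptions, and cid_v1_sha256 is a hypothetical helper name.

```python
import base64
import hashlib

DAG_CBOR = 0x71  # ASSUMPTION: codec is not stated in the spec


def cid_v1_sha256(data: bytes, codec: int = DAG_CBOR) -> str:
    """Return a base32 CIDv1 (sha2-256) string for the given bytes."""
    if codec >= 0x80:
        raise ValueError("this sketch only handles single-byte varint codecs")
    digest = hashlib.sha256(data).digest()
    multihash = bytes([0x12, 0x20]) + digest      # sha2-256 code, 32-byte length
    cid_bytes = bytes([0x01, codec]) + multihash  # CID version 1, then codec
    b32 = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + b32                              # "b" = multibase base32 prefix


# 0xa0 is an empty CBOR map, used here only as placeholder manifest bytes.
aomid = cid_v1_sha256(b"\xa0")
```

The same value is what a reader later compares against doc_cid, so the manifest bytes must be hashed exactly as stored in /crumpet.cbor.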

1.3 Components model (flexible, multi-type)

Inside the manifest, components[] lists each file in the bundle with its sha256 and size.
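
A components[] entry could be assembled along the following lines. This is a minimal sketch: only components, sha256 and size come from this spec, while the key names "path" and "type" and both function names are assumptions, and canonical CBOR serialization is left out.

```python
import hashlib


def component_entry(path: str, data: bytes) -> dict:
    """One components[] entry: the path plus the hash and size binding the file."""
    return {
        "path": path,  # hypothetical key name
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
    }


def build_manifest(content_type: str, files: dict) -> dict:
    """Assemble a manifest dict; key names are illustrative, not normative."""
    return {
        "type": content_type,  # hypothetical key name
        "components": [component_entry(p, files[p]) for p in sorted(files)],
    }


# An image-only post: the manifest exists, body.md is absent.
manifest = build_manifest("image", {"media/hero.png": b"\x89PNG..."})
```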

Deterministic rendering:

  • Clients MAY show preview.html immediately but MUST re-render body.md with the reference renderer and compare hashes; on mismatch, discard preview and show local render.
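
The trust-but-verify step might look like the sketch below. The render argument stands in for the reference renderer, which is not defined here; choose_html and its return shape are hypothetical.

```python
import hashlib
from typing import Callable, Optional, Tuple


def choose_html(
    body_md: bytes,
    preview_html: Optional[bytes],
    render: Callable[[bytes], bytes],
) -> Tuple[bytes, bool]:
    """Return (html_to_show, preview_verified) after re-rendering body.md."""
    local = render(body_md)
    if preview_html is None:
        return local, False
    same = hashlib.sha256(local).digest() == hashlib.sha256(preview_html).digest()
    # On mismatch the preview is discarded and the local render is shown.
    return (preview_html if same else local), same


def toy_render(md: bytes) -> bytes:
    """Toy stand-in renderer, for illustration only."""
    return b"<p>" + md + b"</p>"


html, verified = choose_html(b"hello", b"<p>tampered</p>", toy_render)
```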


2) Bundle → Ciphertext → IPFS

Build pipeline (MUST):

  1. Assemble Bundle as CAR v1 (UnixFS dir), deterministic ordering.

  2. Compress: zstd level 6 → produces bundle.car.zst.

  3. Encrypt: AEAD (XChaCha20-Poly1305) → produces ciphertext bytes.

  4. IPFS add: store ciphertext bytes; record:

    • stored_ipfs_cid (CIDv1 of ciphertext),

    • stored_sha256 (hash of ciphertext),

    • stored_len (length of ciphertext),

    • stored_codec = "car+zstd" (note: compression format of the pre-ciphertext is CAR+zstd; encryption yields opaque bytes, but we fix this label to indicate the pipeline).

Note: We do not publish any plaintext CIDs for internal files; everything at rest is ciphertext. Integrity of plaintext is ensured by verifying component hashes from the manifest after decryption.
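
The shape of the pipeline and the recorded fields can be sketched as follows. The compress and encrypt callables are injected because zstd and XChaCha20-Poly1305 are not in the Python standard library; the stand-ins in the usage lines (zlib and a toy XOR) are placeholders only and are not the required algorithms. The CID is omitted because it comes from the IPFS add step, and the key names inside the record are assumptions.

```python
import hashlib
import zlib
from typing import Callable, Tuple


def build_stored_record(
    car_bytes: bytes,
    compress: Callable[[bytes], bytes],
    encrypt: Callable[[bytes], bytes],
) -> Tuple[bytes, dict]:
    """Run CAR -> compress -> encrypt and describe the resulting ciphertext."""
    ciphertext = encrypt(compress(car_bytes))  # fixed order, no alternatives
    stored = {
        "sha256": hashlib.sha256(ciphertext).hexdigest(),
        "len": len(ciphertext),
        "codec": "car+zstd",  # fixed pipeline label
    }
    return ciphertext, stored


def toy_encrypt(data: bytes) -> bytes:
    """Placeholder only: XOR is NOT an AEAD and must not be used for real data."""
    return bytes(b ^ 0x5A for b in data)


ciphertext, stored = build_stored_record(b"car-bytes", zlib.compress, toy_encrypt)
```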


3) Encryption Modes (unchanged, now over the bundle)

All modes encrypt the whole bundle.car.zst. Modes define how the CEK is disclosed:

  1. PAR (Public-After-Reveal) (default)

    • Random CEK.

    • Reveal envelope includes priv.cek_clear (plaintext CEK).

    • Anyone can decrypt after Reveal; before Reveal, bundle on IPFS is inert.

  2. Passphrase-Locked

    • CEK = Argon2id(passphrase, salt, params).

    • Reveal includes KDF params; CEK/passphrase not on-chain.

    • Reader supplies passphrase to decrypt.

  3. Recipient-Locked

    • Random CEK; envelope wraps CEK to declared recipients’ pubkeys (X25519 default; secp256k1-ECIES optional).

    • Only recipients can decrypt.

Order is fixed: CAR → zstd → AEAD (no alternative orderings). Clients MUST verify stored_sha256/stored_len on ciphertext before decrypting.
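
The pre-decrypt check is small enough to show in full. This sketch assumes the stored values arrive as a dict with "sha256" (hex) and "len" keys; that shape and the function name are illustrative.

```python
import hashlib
import hmac


def ciphertext_matches(ciphertext: bytes, stored: dict) -> bool:
    """Check length and SHA-256 of fetched ciphertext before any decryption."""
    if len(ciphertext) != stored["len"]:
        return False
    actual = hashlib.sha256(ciphertext).hexdigest()
    return hmac.compare_digest(actual, stored["sha256"].lower())


blob = b"opaque ciphertext bytes"
record = {"sha256": hashlib.sha256(blob).hexdigest(), "len": len(blob)}
```

A client would only hand the bytes to the AEAD routine when ciphertext_matches returns True.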


4) Commit → Reveal (same guarantees, stricter wording)

  • Commit binds the exact Reveal envelope bytes without the sign field:

    • commit_hash = SHA-256(envelope_without_sign_cbor_bytes)

    • Commit TX MUST include commit_hash and MUST pay protocol_fee.

  • Reveal publishes the full envelope, pays miner_tip, and MUST reference commit_ref.

  • Indexers treat posts as Published iff Commit→Reveal checks, fees, signatures, and storage and mode-specific gates all pass (see §6).
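
The binding between Commit and Reveal can be sketched as below. The encoder is passed in because canonical CBOR is not in the Python standard library; the JSON encoder used in the usage lines is a stand-in for illustration and would not produce valid commit hashes. The envelope keys other than sign are made up for the example.

```python
import hashlib
import json
from typing import Callable


def commit_hash(envelope: dict, encode: Callable[[dict], bytes]) -> str:
    """SHA-256 over the encoded envelope with the sign field removed."""
    unsigned = {k: v for k, v in envelope.items() if k != "sign"}
    return hashlib.sha256(encode(unsigned)).hexdigest()


def stand_in_encode(obj: dict) -> bytes:
    """Stand-in for canonical CBOR: deterministic JSON, illustration only."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


envelope = {"doc_cid": "bafy-example", "sign": "signature-goes-here"}
h = commit_hash(envelope, stand_in_encode)
```

Because sign is excluded, the author can compute commit_hash before signing and the indexer can recompute it from the revealed envelope.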


5) Envelope (Reveal payload) — v4 fields

Why doc_cid is enough for plaintext integrity: The manifest inside the bundle lists every component with sha256 and size. After decrypt+decompress, clients validate each component hash against the manifest; then compute the CBOR CID of the manifest and compare to doc_cid.
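
The component check described above could be written as the following sketch. It assumes the decoded manifest is a dict whose components entries carry "path", "sha256" (hex) and "size"; "path" and the function name are assumptions, and the doc_cid comparison is not shown.

```python
import hashlib


def failed_components(manifest: dict, files: dict) -> list:
    """Return the paths of components that are missing or do not match."""
    bad = []
    for entry in manifest["components"]:
        data = files.get(entry["path"])
        if (
            data is None
            or len(data) != entry["size"]
            or hashlib.sha256(data).hexdigest() != entry["sha256"]
        ):
            bad.append(entry["path"])
    return bad


body = b"# Title\n"
example_manifest = {
    "components": [
        {"path": "body.md", "sha256": hashlib.sha256(body).hexdigest(), "size": len(body)}
    ]
}
```

An empty result means every listed file was found with the expected hash and size.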


6) Indexer Acceptance (unchanged in spirit, updated for bundle)

An indexer MUST mark a post Published iff all hold:

  1. Commit valid: on-chain; commit_hash == SHA-256(envelope_without_sign_cbor_bytes); protocol fee paid.

  2. Reveal valid: on-chain; envelope decodes; references Commit; miner tip meets minimum; sign verifies.

  3. Stored-ciphertext integrity: stored.codec=="car+zstd", len>0, sha256(ciphertext) matches, ipfs_cid valid CIDv1.

  4. Mode gate satisfied:

    • PAR: priv.cek_clear present and correct key length.

    • Passphrase: valid kdf object; no CEK in envelope.

    • Recipient: non-empty wrap[] of valid entries.

  5. Manifest identity: If decryption is possible (PAR or reader has keys), client MUST:

    • Decrypt → decompress → extract /crumpet.cbor, compute doc_cid and match envelope.

    • Verify every component listed in the manifest exists with matching sha256 and size.

     If keys are unavailable (passphrase/recipient), the indexer STILL marks the post Published but MUST defer the “content-verified” badge until a decrypt occurs (same rule as v0.3).

No other gates; no optionality.
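
The mode gate in item 4 can be sketched as a single function. The mode labels "par", "passphrase" and "recipient", the placement of kdf and wrap under priv, and the function name are assumptions; the 32-byte key length is the XChaCha20-Poly1305 key size. Deeper validation of the kdf object and of individual wrap entries is not shown.

```python
CEK_LEN = 32  # XChaCha20-Poly1305 key size in bytes


def mode_gate_ok(mode: str, priv: dict) -> bool:
    """Shallow check of the mode-specific fields in a Reveal envelope."""
    if mode == "par":
        cek = priv.get("cek_clear")
        return isinstance(cek, (bytes, bytearray)) and len(cek) == CEK_LEN
    if mode == "passphrase":
        # KDF parameters must be present and no CEK may appear in the envelope.
        return isinstance(priv.get("kdf"), dict) and "cek_clear" not in priv
    if mode == "recipient":
        wrap = priv.get("wrap")
        return isinstance(wrap, list) and len(wrap) > 0
    return False  # unknown mode: gate fails
```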


7) Rendering & Safety (carryover from AOM v1)

  • Reference renderer defines deterministic Markdown→HTML (WASM+native).

  • Sanitizer allow-list; strip JS/event handlers; allow only http/https/ipfs links.

  • preview.html flow: show quickly, then re-render and compare to sha256(preview.html) from manifest; on mismatch, discard preview.
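
The link allow-list could be enforced with a check like this minimal sketch. It reads the rule strictly, so relative links (no scheme) are rejected; whether the reference sanitizer does the same is not stated here, and the function name is hypothetical.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https", "ipfs"}


def is_allowed_link(url: str) -> bool:
    """True only for links whose scheme is http, https or ipfs."""
    try:
        scheme = urlsplit(url.strip()).scheme
    except ValueError:
        return False  # malformed URL: reject
    return scheme.lower() in ALLOWED_SCHEMES
```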


8) Versioning & Lineage

  • Each edit yields a new bundle and a new manifest (version++, previousCid = prior doc_cid).

  • doc_cid in Envelope always points at this edition’s manifest.

  • Explorers can show history by following previousCid.
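
An explorer's history walk might be as simple as the sketch below. It assumes decoded manifests are available in a dict keyed by doc_cid and that the first edition has no previousCid; the lookup structure and the function name are illustrative.

```python
def edition_history(doc_cid: str, manifests: dict) -> list:
    """Follow previousCid links from the given edition back to the first one."""
    chain, seen = [], set()
    cid = doc_cid
    while cid is not None and cid in manifests and cid not in seen:
        seen.add(cid)  # guards against a cyclic previousCid chain
        chain.append(cid)
        cid = manifests[cid].get("previousCid")
    return chain


known = {
    "cid-v3": {"version": 3, "previousCid": "cid-v2"},
    "cid-v2": {"version": 2, "previousCid": "cid-v1"},
    "cid-v1": {"version": 1},
}
```

The walk stops at the first edition, at a manifest the explorer has not fetched, or when a CID repeats.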


9) Fees (unchanged; now based on ciphertext length)

  • Protocol fee (Commit) tiered by stored.len.

  • Miner tip (Reveal) must meet dynamic minimum.

  • Envelope records both in publication.fees.


10) Build & Publish (author CLI shape)


11) Worked Examples

11.1 Article with images + preview (PAR)

  • Bundle contains: /crumpet.cbor, /body.md, /preview.html, /media/hero.png, /media/chart.webp.

  • Build CAR → zstd → AEAD; IPFS add → fill stored.*.

  • Envelope: priv.enc="par-...", priv.cek_clear=CEK.

  • Verify (reader): fetch ciphertext → sha256 check → decrypt with CEK → decompress → verify all component hashes → compute doc_cid → render.

11.2 Press release with PDF attachment (Passphrase)

  • Bundle adds /attachments/press.pdf.

  • Envelope: priv.enc="passphrase-xchacha20-poly1305", includes KDF params.

  • Reader supplies passphrase to decrypt; rest identical.

11.3 Image post for editors (Recipient)

  • Components include several /media/*.webp and no preview.html.

  • Envelope: priv.enc="xchacha20-poly1305", wrap[] for the editors.

  • Only recipients can decrypt to verify/preview.


12) Protocol Constants (recap)

  • Bundle: CAR v1 (UnixFS dir), deterministic ordering.

  • Compress: zstd level 6 (REQUIRED).

  • Encrypt: XChaCha20-Poly1305 (REQUIRED).

  • Hashing: sha2-256 for doc_cid (CBOR) and all component/file hashes.

  • Signatures: ed25519 (default), secp256k1 (alt).

  • KDF (passphrase): Argon2id {time:3, mem_kib:65536, lanes:1}, 16–32B salt.

  • Commit hash material: exact Envelope CBOR bytes without sign.


13) Back-compat notes vs AOM v1

  • Your previous crumpet.json becomes crumpet.cbor (canonical CBOR) but can be exported as JSON for dev tooling.

  • The old file layout (/body.md, /preview.html, /media/*) is preserved; we’ve just formalized packaging as a deterministic CAR.

  • All the old “deterministic render” and “trust-but-verify preview” rules remain — now enforced after decrypting the bundle.


14) Open Items (tight, implementation-ready)

  • Deterministic CAR rules: finalize “no-mtime”, path sort, and canonical UnixFS node shape in a short appendix.

  • Fee schedule constants: publish Tier S/M/L and per-block price for very large bundles.

  • Reference renderer + sanitizer: ship test corpus and golden vectors.

  • Explorer badges: “Published (PAR)”, “Published (Passphrase)”, “Published (Recipient)”, plus “Content-verified” once a decrypt has proven doc_cid and all component hashes.
