CIN API Reference (HTTP/JSON, WebSocket, gRPC)

Status: Planned for v0 / v0.1
Last updated: YYYY‑MM‑DD


0) Overview

The CIN (Crumpet Indexer Node) API exposes:

  • HTTP/JSON (public): search, article detail, trending, snapshot metadata, health/version.

  • WebSocket (public): live events stream (score updates, disputes, snapshot rotation).

  • gRPC (peer/control): snapshot announcements, metadata fetch, attestations, health.

A single node serves its own view of the corpus. Quorum/consensus across nodes is implemented in clients/SDKs by comparing responses (see headers in §1.5).


1) Conventions

1.1 Base URL & TLS

  • Public HTTP: https://<host>:<port>/v1/* (HTTP/2; HTTP/1.1 compatible)

  • WebSocket: wss://<host>:<port>/v1/ws/events

  • gRPC: grpcs://<host>:8081 (HTTP/2) — peer/control plane

1.2 Formats & Encodings

  • JSON; UTF‑8; timestamps are UNIX seconds.

  • CIDs in responses are CIDv1 strings; query and path parameters accept either a CIDv1 string or a 32‑byte digest in hex (0x…).
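A client can tell the two identifier forms apart before sending a request. The sketch below is illustrative only: classifyArticleId is a hypothetical helper, not part of the SDK. It strictly checks only the 0x form (64 hex characters for 32 bytes) and passes anything else through as a CID string for the node to validate.

```javascript
// Hypothetical client-side helper; not part of the CIN SDK.
// 32-byte digest = "0x" followed by exactly 64 hex characters.
const DIGEST_HEX = /^0x[0-9a-fA-F]{64}$/;

function classifyArticleId(id) {
  if (typeof id !== "string" || id.length === 0) {
    throw new TypeError("article id must be a non-empty string");
  }
  if (id.startsWith("0x")) {
    if (!DIGEST_HEX.test(id)) {
      throw new RangeError("digest hex must be 0x followed by 64 hex characters");
    }
    return { kind: "digest", value: id };
  }
  // Anything else is assumed to be a CIDv1 string; the node performs
  // the real CID validation.
  return { kind: "cid", value: id };
}
```

Rejecting a malformed digest locally avoids a round trip that would presumably end in 400 INVALID_PARAM.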

1.3 Pagination & Limits

  • Pagination: from (offset, default 0), size (default 20, max 50).

  • Requests with size > 50 are rejected with 400 INVALID_PARAM.
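The pagination rules above can be enforced client-side so that an oversized page never reaches the node. This is a minimal sketch. pageParams and pageOffsets are hypothetical names, and the lower bound of 1 for size is taken from the /v1/search parameter list.

```javascript
const DEFAULT_SIZE = 20; // default from section 1.3
const MAX_SIZE = 50;     // cap from section 1.3

// Hypothetical helper: apply defaults and validate the from/size pair.
function pageParams({ from = 0, size = DEFAULT_SIZE } = {}) {
  if (!Number.isInteger(from) || from < 0) {
    throw new RangeError("from must be an integer >= 0");
  }
  if (!Number.isInteger(size) || size < 1 || size > MAX_SIZE) {
    throw new RangeError("size must be an integer between 1 and " + MAX_SIZE);
  }
  return { from, size };
}

// Offsets needed to walk `total` results page by page.
function pageOffsets(total, size = DEFAULT_SIZE) {
  const { size: step } = pageParams({ size });
  const offsets = [];
  for (let from = 0; from < total; from += step) offsets.push(from);
  return offsets;
}
```

For example, pageOffsets(45, 20) yields the offsets 0, 20 and 40.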

1.4 Language & Ranking

  • lang uses BCP‑47 (e.g., en, en-US, zh-Hans).

  • sort / ranker: top | trending | new.

    • trending uses a default half‑life of 36h; each node reports the value it uses in /v1/version.
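This reference gives the half-life but not the decay formula. To show what a 36h half-life means in practice, the sketch below assumes plain exponential decay, where a signal's weight halves every half-life. decayWeight is a hypothetical function, and a node's actual ranker may differ.

```javascript
const DEFAULT_HALF_LIFE_SECONDS = 36 * 3600; // 36h default from section 1.4

// ASSUMPTION: exponential decay; the real trending ranker is not specified here.
function decayWeight(ageSeconds, halfLifeSeconds = DEFAULT_HALF_LIFE_SECONDS) {
  if (ageSeconds <= 0) return 1; // clamp future-dated signals (an assumption)
  return Math.pow(0.5, ageSeconds / halfLifeSeconds);
}
```

Under this assumption a signal that is 36h old counts half as much as a fresh one, and a 72h-old signal counts a quarter as much.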

1.5 Response Headers (introspection)

Node returns metadata to help multi‑indexer clients:

Header                        Meaning
x-crumpet-node                Node self‑name (operator‑defined)
x-crumpet-params-hash         Current Config parameter hash used at index time
x-crumpet-snapshot-root       Merkle root of the language segment used (for list endpoints)
x-crumpet-snapshot-to-time    Snapshot logical upper time (seconds)
etag                          Fingerprint of the response payload for caching
cache-control                 public, max-age=30, stale-while-revalidate=120 (typical)

Clients may include x-crumpet-client: <name>/<version> for telemetry attribution (optional).
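A multi-indexer client might use these headers to check whether several nodes answered from the same snapshot. This is a sketch under stated assumptions: readNodeMeta and majoritySnapshot are hypothetical names, and the strict-majority rule is one possible quorum policy, not one this reference mandates. The headers argument is any object with a get(name) method, such as the Headers object returned by fetch.

```javascript
// Read the x-crumpet-* introspection headers (names from section 1.5).
function readNodeMeta(headers) {
  const toTime = headers.get("x-crumpet-snapshot-to-time");
  return {
    node: headers.get("x-crumpet-node") ?? null,
    paramsHash: headers.get("x-crumpet-params-hash") ?? null,
    snapshotRoot: headers.get("x-crumpet-snapshot-root") ?? null,
    snapshotToTime: toTime == null ? null : Number(toTime),
  };
}

// ASSUMPTION: quorum = strict majority of nodes agreeing on both the
// params hash and the snapshot root. Returns the winning group or null.
function majoritySnapshot(metas) {
  const groups = new Map();
  for (const m of metas) {
    const key = `${m.paramsHash}|${m.snapshotRoot}`;
    const g = groups.get(key) ||
      { paramsHash: m.paramsHash, snapshotRoot: m.snapshotRoot, count: 0 };
    g.count += 1;
    groups.set(key, g);
  }
  let best = null;
  for (const g of groups.values()) {
    if (best === null || g.count > best.count) best = g;
  }
  return best !== null && best.count * 2 > metas.length ? best : null;
}
```

Nodes that disagree on x-crumpet-params-hash are grouped separately, because different parameters can produce different rankings even from the same data.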

1.6 Errors

Uniform error body:

Common codes: INVALID_PARAM, NOT_FOUND, TOO_MANY, RATE_LIMITED, INTERNAL.

1.7 Rate Limits

Operators may apply per‑IP token buckets. Standard headers:

  • x-rate-limit-limit, x-rate-limit-remaining, x-rate-limit-reset (epoch seconds). Requests over the limit receive 429 (RATE_LIMITED).
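A client can turn the reset header into a wait time before retrying. This is a minimal sketch: retryDelayMs is a hypothetical helper, and the 1-second fallback used when the header is missing or unparsable is an assumption. The headers argument is any object with a get(name) method.

```javascript
// How long to wait before retrying, in milliseconds.
function retryDelayMs(status, headers, nowMs = Date.now()) {
  if (status !== 429) return 0; // not rate limited
  const reset = Number(headers.get("x-rate-limit-reset")); // epoch seconds
  if (!Number.isFinite(reset) || reset <= 0) {
    return 1000; // ASSUMPTION: fallback when the header is absent or invalid
  }
  return Math.max(0, reset * 1000 - nowMs);
}
```

The nowMs parameter exists so the calculation can be tested with a fixed clock.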


2) HTTP/JSON — Public Endpoints

2.1 Search

GET /v1/search

Query params

  • q (string, 1–256 chars, required)

  • lang (string, optional; filter)

  • sort (top|trending|new, default top)

  • from (int ≥0, default 0)

  • size (int 1–50, default 20)

  • Filters: author=0x… (20‑byte hex), tag=foo (repeatable), status=normal|under_dispute|action_taken, fromTime, toTime (unix seconds)
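The parameter rules above can be checked while building the request URL. The sketch below is illustrative: buildSearchUrl is a hypothetical helper, the option name tags is an assumption that maps to the repeatable tag parameter, and counting q in Unicode code points is one interpretation of "chars".

```javascript
const SORTS = ["top", "trending", "new"];
const STATUSES = ["normal", "under_dispute", "action_taken"];

// Hypothetical helper: validate options and build a /v1/search URL.
function buildSearchUrl(base, opts) {
  const { q, lang, sort = "top", from = 0, size = 20,
          author, tags = [], status, fromTime, toTime } = opts;
  // ASSUMPTION: "1 to 256 chars" is counted in Unicode code points.
  const qLen = typeof q === "string" ? [...q].length : 0;
  if (qLen < 1 || qLen > 256) throw new RangeError("q must be 1 to 256 characters");
  if (!SORTS.includes(sort)) throw new RangeError("invalid sort");
  if (!Number.isInteger(from) || from < 0) throw new RangeError("invalid from");
  if (!Number.isInteger(size) || size < 1 || size > 50) throw new RangeError("invalid size");
  if (author !== undefined && !/^0x[0-9a-fA-F]{40}$/.test(author)) {
    throw new RangeError("author must be 20-byte hex");
  }
  if (status !== undefined && !STATUSES.includes(status)) throw new RangeError("invalid status");

  const url = new URL("/v1/search", base);
  const p = url.searchParams;
  p.set("q", q);
  if (lang !== undefined) p.set("lang", lang);
  p.set("sort", sort);
  p.set("from", String(from));
  p.set("size", String(size));
  if (author !== undefined) p.set("author", author);
  for (const t of tags) p.append("tag", t); // tag is repeatable
  if (status !== undefined) p.set("status", status);
  if (fromTime !== undefined) p.set("fromTime", String(fromTime));
  if (toTime !== undefined) p.set("toTime", String(toTime));
  return url.toString();
}
```

For example, buildSearchUrl("https://node.example:8443", { q: "tea", tags: ["a", "b"] }) produces a URL with two tag parameters; node.example is a placeholder host.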

200 Response

Errors

  • 400 INVALID_PARAM (empty q, invalid lang, size>50)

  • No matches is not an error: the node returns 200 with total: 0.


2.2 Article Detail

GET /v1/article/{cid}

Path

  • {cid}: CIDv1 (string) or digest hex 0x… (32 bytes)

200 Response

Errors

  • 404 NOT_FOUND (unknown CID)

  • 502 when dependent subsystems unavailable (rare)


2.3 Trending

GET /v1/trending

Query params

  • lang (string; required when server indexes multiple langs)

  • window (6h|12h|24h|48h|7d, default 24h)

  • size (1–50)

200 Response


2.4 Snapshot Metadata (per language)

GET /v1/snapshot/{lang}/latest

200 Response

Errors

  • 404 if language not served.


2.5 Health

GET /v1/health

2.6 Version

GET /v1/version


3) WebSocket — Live Events

GET /v1/ws/events → JSON Lines (one JSON object per message). Clients should auto‑reconnect with backoff and resume from last seq if provided.

Common envelope

Event types & payloads

  • article_published

  • score_updated

  • dispute_opened

  • dispute_resolved

  • snapshot_rotated

Close codes

  • 1000 normal; 1013 try again later (maintenance); 1011 server error.
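The reconnect guidance and close codes above can be combined into a small policy function. This is a sketch under stated assumptions: backoffDelayMs and reconnectPlan are hypothetical names, the backoff constants and the full-jitter strategy are illustrative choices, and this section does not specify how lastSeq is sent to the server on resume, so the sketch only carries the value along.

```javascript
// Exponential backoff with full jitter. Constants are assumptions.
function backoffDelayMs(attempt, { baseMs = 500, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Decide what to do after the socket closes.
// closeCode: WebSocket close code; attempt: 0-based reconnect attempt;
// lastSeq: last seq seen in an event envelope, if any.
function reconnectPlan(closeCode, attempt, lastSeq, opts) {
  if (closeCode === 1000) {
    return { reconnect: false }; // normal closure: do not reconnect
  }
  // 1013 (maintenance), 1011 (server error) and any other code:
  // reconnect after a jittered delay.
  return {
    reconnect: true,
    delayMs: backoffDelayMs(attempt, opts),
    resumeFrom: lastSeq === undefined ? null : lastSeq,
  };
}
```

Jitter spreads reconnects out so that many clients dropped by the same maintenance window do not all return at once.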


4) gRPC — Control & Federation

4.1 Service Definition (excerpt, proto/cin.proto)

Transport

  • HTTP/2 + TLS by default; optional mTLS between known peers.

  • Timeouts: client default 3s; retry with backoff on UNAVAILABLE.

Semantics

  • AnnounceSnapshot: producer nodes notify peers; peers may pull the bundle (HTTP/3/IPFS) and verify it.

  • AttestSnapshot: watchers/peers post their acceptance; the node records it for local federation decisions (on‑chain attestation is handled separately by Watchers → NodeRewards).


5) Object Schemas (JSON)

5.1 ArticleSummary

5.2 ArticleDetail

5.3 SnapshotMeta

See §2.4.

5.4 Error

See §1.6.


6) Examples

6.2 cURL — Article

6.3 WebSocket — NodeJS

6.4 gRPC — grpcurl


7) Caching & Freshness

  • List endpoints (/search, /trending) are cacheable for short TTL (e.g., 30s); include etag.

  • The content behind /article/{cid} is immutable, but the score and dispute fields change, so keep the TTL at or below 60s.

  • Responses carry Cache-Control and ETag; clients may revalidate with If-None-Match and receive 304 on a match.
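Client-side revalidation can be sketched as a small in-memory cache keyed by URL. EtagCache is a hypothetical class, not part of the SDK. It only tracks ETags and bodies; sending the request (for example with fetch) is left to the caller.

```javascript
// Minimal conditional-GET cache: url -> { etag, body }.
class EtagCache {
  constructor() {
    this.entries = new Map();
  }

  // Extra request headers to send for this URL.
  requestHeaders(url) {
    const hit = this.entries.get(url);
    return hit ? { "if-none-match": hit.etag } : {};
  }

  // Given the response status, its etag header and parsed body,
  // return the body the caller should use.
  resolve(url, status, etag, body) {
    if (status === 304) {
      const hit = this.entries.get(url);
      if (!hit) throw new Error("304 received without a cached entry");
      return hit.body; // unchanged: reuse the cached payload
    }
    if (status === 200 && etag) {
      this.entries.set(url, { etag, body });
    }
    return body;
  }
}
```

A 304 response has no body, so the cached payload is returned in its place.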


8) Security & CORS

  • TLS required; HSTS recommended.

  • CORS: node operators may restrict origins. Default: allow GET from any origin; disallow credentials.

  • DoS protection: pagination caps; q length cap; server timeouts (2s per request, 10s hard).


9) Status & Error Codes (HTTP)

Code    When
200     Success
304     If-None-Match hit
400     Validation failed (INVALID_PARAM)
404     Not found (article/lang)
408     Upstream timeout (rare)
429     Rate limited
500     Internal error
502     Dependent subsystem unavailable (rare; see §2.2)
503     Maintenance / overload


10) Compatibility & Versioning

  • The path version (/v1/) changes only for breaking changes.

  • Additive fields are non‑breaking.

  • x-crumpet-params-hash signals changes in economic parameters that might affect ranking/scoring; clients may reconcile across nodes.


11) SDK Mapping (JS)

  • cin.search({ … }) → /v1/search

  • cin.getArticle(cid) → /v1/article/{cid}

  • cin.trending({ … }) → /v1/trending

  • cin.getLatestSnapshot(lang) → /v1/snapshot/{lang}/latest

  • cin.events() → WebSocket consumer

  • cin.headers(resp) → parse x-crumpet-* for multi‑indexer quorum.


12) Operational Notes for Operators

  • Keep /v1/health cheap; don’t block on external calls.

  • Ensure HTTP/2 at edge for better multiplexing; enable WebSocket upgrade.

  • Expose Prometheus metrics at /metrics (scrape path); do not serve it on the public origin unless it is access‑protected.

  • If you serve multiple languages, snapshots may rotate independently per language; the snapshot headers on list endpoints describe the language segment used for that response.


13) Acceptance Criteria

  • All listed endpoints implemented with shapes above.

  • Headers present; ETag consistent; cache directives set.

  • WebSocket emits events promptly after on‑chain changes and snapshot rotations.

  • gRPC control plane compiles from proto/cin.proto and passes interop tests.

  • SDK smoke tests pass against a fresh node.


Appendix A — Query Grammar & Normalization

  • q is passed to analyzers per language; unsupported tokens ignored.

  • Phrase queries: quoted "exact phrase" supported.

  • Field filters (optional in v0.1): tag:foo and author:0xabc are recognized as filters when supported; otherwise they are treated as plain text.

  • Stopwords per language follow ICU defaults.
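The grammar above can be illustrated with a small client-side tokenizer. This is a sketch only: parseQuery is a hypothetical function, real analysis happens on the node per language, and the exact rules for malformed input (for example an unterminated quote, treated here as plain text) are assumptions.

```javascript
// Split a query into phrases, field filters and plain terms.
function parseQuery(q) {
  const out = { phrases: [], terms: [], tags: [], authors: [] };
  // Either a double-quoted phrase or a run of non-space characters.
  const tokenRe = /"([^"]*)"|(\S+)/g;
  let m;
  while ((m = tokenRe.exec(q)) !== null) {
    if (m[1] !== undefined) {
      if (m[1].length > 0) out.phrases.push(m[1]); // skip empty ""
      continue;
    }
    const tok = m[2];
    if (tok.startsWith("tag:") && tok.length > 4) {
      out.tags.push(tok.slice(4));
    } else if (/^author:0x[0-9a-fA-F]+$/.test(tok)) {
      out.authors.push(tok.slice(7)); // drop the "author:" prefix
    } else {
      out.terms.push(tok); // everything else is plain text
    }
  }
  return out;
}
```

For the input tag:foo author:0xabc "exact phrase" hello, this yields one tag (foo), one author (0xabc), one phrase (exact phrase) and one plain term (hello).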

Appendix B — Fingerprint for Quorum (client‑side)

  • Compute SHA‑256 over canonical JSON:

    where scoreNetBucket = floor(scoreNet / 5) to absorb tiny deltas.
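The bucketing rule can be sketched as follows. The field list of the canonical JSON is not reproduced in this appendix, so the object shape used below (cid plus scoreNetBucket per item) is purely an assumption for illustration, as are the names canonicalJson and quorumFingerprint. Only the SHA-256 hash and the floor(scoreNet / 5) bucketing come from the text above.

```javascript
const { createHash } = require("crypto");

// Serialize with object keys sorted recursively, so that equal data
// always produces the same string regardless of key order.
function canonicalJson(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalJson).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value).sort();
    return "{" + keys
      .map((k) => JSON.stringify(k) + ":" + canonicalJson(value[k]))
      .join(",") + "}";
  }
  return JSON.stringify(value);
}

// ASSUMPTION: each item contributes only { cid, scoreNetBucket }.
function quorumFingerprint(items) {
  const reduced = items.map(({ cid, scoreNet }) => ({
    cid,
    scoreNetBucket: Math.floor(scoreNet / 5), // absorbs tiny score deltas
  }));
  return createHash("sha256")
    .update(canonicalJson(reduced), "utf8")
    .digest("hex");
}
```

Two nodes whose scores for the same article differ only within a bucket (for example 101 and 104) produce the same fingerprint, while a score of 105 falls into the next bucket and changes it.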
