HTTP API reference

REST-ish JSON API. Auth via X-Service-Token header. All paths are scoped by {tenant}.

Base URL

https://storage.ceradela.com (one origin, tenant in the path)

Authentication

Every request includes:

X-Service-Token: <64-hex-token-issued-at-signup>

Token validity is checked against the tenant in the URL. A valid token for tenant acme that tries to read /api/tenant/globex/... gets 403, not 200.

Rotation / revocation: email us. On our side it takes seconds.
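
A minimal request-builder sketch in Python, showing how the token and tenant pair up. build_request is a hypothetical helper, not part of any SDK; the token value is a placeholder:

```python
# Sketch: build a tenant-scoped URL + auth headers. The token is
# checked against the tenant segment in the path, so pairing a token
# with the wrong tenant yields a 403.
BASE_URL = "https://storage.ceradela.com"

def build_request(tenant: str, token: str, path: str) -> tuple[str, dict]:
    """Return (url, headers) for a tenant-scoped API call."""
    url = f"{BASE_URL}/api/tenant/{tenant}{path}"
    headers = {
        "X-Service-Token": token,
        "Content-Type": "application/json",
    }
    return url, headers

# Placeholder 64-hex token, not a real credential.
url, headers = build_request("acme", "a" * 64, "/usage")
```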

Endpoints

POST /api/tenant/{tenant}/ingest

Push a batch of rows into cold storage. Each call produces one .cbebomth file.

Body:
{
  "table": "orders",
  "rows": [ {...}, {...} ],
  "idempotency_key": "batch-2026-04-19-001",   // optional, 64 chars max
  "partition": "2026-03"                         // optional; overrides auto-derive
}

Response 200:
{
  "ok": true,
  "tenant": "acme",
  "table": "orders",
  "partition": "2026-04",
  "rows_stored": 24812,
  "bytes_stored": 134857,
  "storage_key": "acme/tables/orders/2026-04_ingest_1776563607442.cbebomth",
  "columns": 7,
  "replayed": false            // true if idempotency_key matched a prior successful ingest
}

Constraints:

  • Request body capped at 10 MB (enforced — returns 413 if exceeded)
  • Max 50,000 rows per request
  • table must match [a-zA-Z_][a-zA-Z0-9_]*
  • Row shape is inferred from the first row of each batch; columns can vary across calls (each file stands alone)

Partition selection — in priority order:

  1. Explicit partition field in the body (must be YYYY-MM)
  2. First row's created_at (or timestamp, event_time, ts) as ISO8601 → UTC month
  3. Current UTC month

Use an explicit partition when you're backfilling older data, or when your row timestamps don't match the month you want them bucketed in.

Idempotency — if idempotency_key is provided and a prior ingest for the same (tenant, key) succeeded, this call returns that row's metadata with replayed: true and does not create a new file. Safe to retry on 500s. Keys are scoped per tenant, so two tenants can use the same key value without collision.
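
The replay contract can be illustrated with an in-memory stand-in for the endpoint (the dict below is not the real server, just a model of its (tenant, key) scoping):

```python
# Sketch of the idempotency contract: a retried key returns the
# original metadata with replayed: true; keys are scoped per tenant.
def ingest(server: dict, tenant: str, body: dict) -> dict:
    key = body.get("idempotency_key")
    if key and (tenant, key) in server:
        # Prior success for this (tenant, key): replay, no new file.
        return {**server[(tenant, key)], "replayed": True}
    result = {"ok": True, "rows_stored": len(body["rows"]), "replayed": False}
    if key:
        server[(tenant, key)] = result
    return result

server = {}
body = {"table": "orders", "rows": [{"id": 1}], "idempotency_key": "batch-001"}
first = ingest(server, "acme", body)    # stores a new file
retry = ingest(server, "acme", body)    # safe retry: replays metadata
other = ingest(server, "globex", body)  # same key, other tenant: no collision
```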

Error codes:

  • 400 — bad table name, empty rows, malformed JSON, bad partition format
  • 403 — token invalid for this tenant
  • 413 — request body > 10 MB or more than 50,000 rows
  • 429 — rate limit (60 req/min per tenant) — retry after the Retry-After seconds
  • 500 — encode or storage failure (rare; we roll back on index-insert failure)

GET /api/tenant/{tenant}/archives

List your archive files. Pagination via page + limit.

Query params:
  table    filter to one table (optional)
  page     default 1
  limit    default 50, max 500

Response 200:
{
  "tenant": "acme",
  "rows": [{
    "id": 1,
    "table_name": "orders",
    "partition": "2026-04",
    "storage_key": "acme/tables/orders/2026-04_ingest_1776563607442.cbebomth",
    "row_count": 24812,
    "byte_size": 134857,
    "columns": "id,customer_id,total_cents,status,created_at",
    "created_at": "2026-04-19T01:53:27Z"
  }, ...],
  "total": 42,
  "page": 1,
  "limit": 50
}
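
A pagination loop sketch for walking every page of /archives. fetch_page is a stand-in you would back with a real HTTP call; here it pages over a local list so the termination logic is the point:

```python
# Sketch: drain all pages of a page/limit listing using the
# response's total field to decide when to stop.
def list_all_archives(fetch_page, limit: int = 50) -> list:
    rows, page = [], 1
    while True:
        resp = fetch_page(page=page, limit=limit)
        rows.extend(resp["rows"])
        if page * limit >= resp["total"]:
            break
        page += 1
    return rows

def make_fake_fetch(all_rows):
    """Local stand-in for GET /archives, for demonstration only."""
    def fetch_page(page, limit):
        start = (page - 1) * limit
        return {"rows": all_rows[start:start + limit], "total": len(all_rows)}
    return fetch_page

archives = list_all_archives(make_fake_fetch(list(range(120))), limit=50)
```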

GET /api/tenant/{tenant}/usage

Billing + usage snapshot. One SQL query, very cheap.

Response 200:
{
  "tenant": "acme",
  "files": 142,
  "tables": 5,
  "total_rows": 1842391,
  "total_bytes": 18485723,
  "first_ingest": "2026-01-03T00:14:22Z",
  "last_ingest":  "2026-04-19T01:53:27Z"
}

DELETE /api/tenant/{tenant}/archives/{id}

Delete a single archive file + its index row. 404 if the id doesn't belong to this tenant (prevents cross-tenant enumeration).

DELETE /api/tenant/acme/archives/42

Response 200:
{ "ok": true, "deleted_id": 42, "storage_key": "acme/tables/orders/..." }

DELETE /api/tenant/{tenant}/data

GDPR "right to be forgotten" — remove every file + index row for a tenant. Requires a confirmation header matching the tenant name, so a stray DELETE doesn't wipe production:

DELETE /api/tenant/acme/data
X-Service-Token: ...
X-Confirm-Wipe: acme        // must equal the tenant name in the path

Response 200:
{
  "ok": true,
  "tenant": "acme",
  "files": 142,
  "total_rows": 1842391,
  "total_bytes": 18485723
}

Both the S3 objects (bulk DeleteObjects) and the index rows go away. Irreversible.
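
The confirmation guard can be sketched as a check against the path's tenant segment. check_wipe is a hypothetical server-side helper, and the rejection status code for a mismatch is an assumption (the doc doesn't specify it):

```python
# Sketch of the X-Confirm-Wipe guard: the header must equal the
# tenant name in the path or the wipe is rejected.
def check_wipe(path_tenant: str, headers: dict) -> int:
    if headers.get("X-Confirm-Wipe") != path_tenant:
        return 400  # assumed: mismatched/missing confirmation rejected
    return 200      # confirmed: wipe proceeds

ok = check_wipe("acme", {"X-Confirm-Wipe": "acme"})
bad = check_wipe("acme", {"X-Confirm-Wipe": "globex"})
```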

GET /api/tenant/{tenant}/cold/{table}

Read rows back. Merges every file matching (tenant, table, partition), decodes, paginates.

Query params:
  partition   YYYY-MM month label, required
  page        default 1
  limit       default 50, max 500

Response 200:
{
  "tenant": "acme",
  "table": "orders",
  "partition": "2026-04",
  "columns": ["id","customer_id","total_cents","status","created_at"],
  "rows": [ {...}, {...} ],
  "total": 24812,
  "page": 1,
  "limit": 50,
  "files": 3
}

files tells you how many physical archives were merged. 1 = one ingest call landed in this partition. N > 1 = multiple ingests across the month.

Error shape

We return a single generic error string. Specifics are logged server-side; keeping the client-facing message opaque is deliberate, to avoid leaking validation details.

HTTP/1.1 400 Bad Request
Content-Type: application/json

{"error": "Something went wrong"}

Rate limits

Enforced in-process per Lambda container. Limits are per-tenant-per-minute:

  Ingest (POST /ingest)                   60 req/min
  Cold reads (GET /cold/...)             600 req/min
  Listings + usage (/archives, /usage)   120 req/min

Hitting a limit returns 429 with a Retry-After: 60 header. Back off with exponential jitter.

Idempotency

Ingest is idempotent only when you supply an idempotency_key (see POST /ingest above): retrying with the same key returns the prior result with replayed: true instead of writing a second file. Without a key, resending the same batch produces a second file with the same rows; if a key isn't practical, carry a client-side dedup field in your row data and filter on read.

Partitioning

Every ingest call gets bucketed by the priority order above: an explicit partition field wins, then the first row's timestamp column as a UTC month, then the current UTC month. A batch whose first row has created_at = 2026-03-28, pushed on 2026-04-01 with no explicit partition, lands in the 2026-03 partition. Only batches with no explicit partition and no recognizable timestamp column fall back to the ingest month. Derivation uses only the request itself, so the API stays stateless.

If you need physical partitioning that matches an event-time column, split your ingest calls accordingly — each call is atomic and lands in one file.
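
Splitting by event time can be sketched as grouping rows by the UTC month of a timestamp column, then sending one ingest call per group with an explicit partition. split_by_month is a hypothetical client-side helper:

```python
# Sketch: group rows by the UTC month of an event-time column so each
# ingest call (one file) lands in the partition its rows belong to.
from collections import defaultdict
from datetime import datetime, timezone

def split_by_month(rows: list, column: str = "created_at") -> dict:
    batches = defaultdict(list)
    for row in rows:
        ts = datetime.fromisoformat(row[column].replace("Z", "+00:00"))
        batches[ts.astimezone(timezone.utc).strftime("%Y-%m")].append(row)
    return dict(batches)

rows = [
    {"created_at": "2026-03-28T23:59:00Z"},
    {"created_at": "2026-04-01T00:01:00Z"},
]
batches = split_by_month(rows)  # one batch per partition month
```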

Data format

Files are BEBO columnar archives. Full on-disk spec at /docs/spec. To read outside the API, pull down the file and use the bebo CLI (/docs/cli). The decoder is MIT-licensed and open-source; the format is documented and stable.

Internal endpoints

Ceradela's own webstore + admin databases run through a separate scope-based path (/api/storage/{scope}/...) gated by a shared service token. These are not exposed to external tenants and are not documented here.