bebo CLI reference

Standalone decoder for BEBO archive files. Read, verify, and export without hitting the Ceradela API. MIT-licensed, source on GitHub.

Install

# Homebrew (macOS / Linux)
brew install ceradela/tap/bebo

# Direct download
curl -sSL https://install.ceradela.com/bebo | sh

# Verify
bebo --version
# bebo 0.1.0

Global flags

-o, --output    table | ndjson | json | csv | tsv — default table on a TTY, ndjson when piping
-c, --columns   comma-separated list — skips non-matching columns during decode
-n, --limit     row limit for streaming commands
--where         predicate like "id>100" or "total<1000" (numeric columns only, v0.1)
--dict          path to trained zstd dictionary if the archive was compressed with one
--no-color      disable ANSI escape codes
--version       print version and exit

Read / inspect commands

bebo head <file>

First N rows (default 10).

$ bebo head orders.cbebomth -n 3 -c id,total_cents
id    total_cents
──    ───────────
1     12500
2     8900
3     45000

bebo cat <file>

Stream all rows. Honors --where for filtering and -c for column pruning. Output is NDJSON when piped.

$ bebo cat orders.cbebomth --where "total_cents>10000" -c id,customer_id | jq -s length
1842
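
Because piped output is one JSON object per line, downstream processing is easy in any language, not just jq. A minimal sketch of an NDJSON consumer (column names follow the `bebo schema` output below; the threshold logic and function name are illustrative, not part of the CLI):

```python
import json

def sum_totals_over(lines, threshold_cents):
    """Sum total_cents across NDJSON rows above a threshold.

    Intended to sit on the receiving end of something like:
      bebo cat orders.cbebomth -c id,total_cents | python this_script.py
    """
    total = 0
    for line in lines:
        row = json.loads(line)
        if row["total_cents"] > threshold_cents:
            total += row["total_cents"]
    return total
```

In practice you would pass `sys.stdin` as `lines`; `--where` pushes the same filter into the decoder, so prefer it when the predicate is a simple numeric comparison.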

bebo schema <file>

Column schema with inferred Go types.

$ bebo schema orders.cbebomth
orders.cbebomth (monthly, 24812 rows, 7 cols)
  id                       int64
  customer_id              int64
  total_cents              int64
  status                   string
  created_at               time.Time
  metadata                 jsonb
  tags                     string[]

bebo meta <file>

File metadata: kind, bytes, row count, columns. JSON output by default.

bebo count <file>

Row count. O(1) for monthly archives (reads footer), O(N) for bundles (decodes each partition).

bebo sizes <file>

Per-column uncompressed size — useful for debugging why an archive is bigger than expected.

bebo stats <file>

Per-column non-null, null, distinct, min, max counts.

bebo sample <file>

Random sample of N rows. --seed for reproducibility.

DR / verify commands

bebo verify <file>

CRC32 per page + SHA-256 of the whole file. Quick integrity check.

bebo verify --deep <file>

Re-decodes every page — proves the archive is restorable, not just on-disk intact. Takes longer (equivalent to a full read), but this is the check that catches GitLab-style silent corruption before you ship a month of bad backups.

$ bebo verify --deep orders.cbebomth
{
  "file": "orders.cbebomth",
  "kind": "monthly",
  "bytes": 104857,
  "sha256": "bd77...",
  "crc32": "2144df1c",
  "decode_ok": true,
  "columns": 7,
  "rows_decoded": 24812
}
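
For scheduled DR checks, `decode_ok` is the field that matters. A hypothetical monitoring helper that inspects the JSON report above (the function name and the `rows_decoded > 0` sanity check are this sketch's own additions):

```python
import json

def verify_report_ok(report_json):
    """Return True only if a deep verify report shows a full, clean decode."""
    report = json.loads(report_json)
    return bool(report.get("decode_ok")) and report.get("rows_decoded", 0) > 0

# Intended usage (illustrative):
#   bebo verify --deep orders.cbebomth > report.json
#   then feed report.json's contents to verify_report_ok()
```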

bebo diff <a> <b>

Schema + row-count diff between two archives. Useful for detecting schema drift between versions.

Portability / exit commands

bebo export <file> --to=<format>

Convert to another format. Use --out path to write a file, otherwise stdout.

Format     Use case
ndjson     Streaming, typed numbers, jq-friendly. Default when piping.
json       One JSON array. Smaller files only.
csv        RFC 4180. Excel / Google Sheets.
tsv        Tab-separated. Unix pipes.
parquet    Via DuckDB passthrough (install DuckDB). Industry-standard columnar.

# Parquet via the DuckDB passthrough path (v0.1 approach)
$ bebo export orders.cbebomth --to=ndjson | \
  duckdb -c "COPY (SELECT * FROM read_ndjson_auto('/dev/stdin')) TO 'orders.parquet' (FORMAT PARQUET)"

# CSV direct
$ bebo export orders.cbebomth --to=csv --out orders.csv

bebo merge <files...> --out <path>

Concatenate compatible archives into one NDJSON stream. Writing back to .cbebomth isn't supported from the CLI — re-ingest via the archiver instead.

Bundle helpers

bebo list <bundle>

List the labels inside a .cbeboqtr (monthly labels) or .cbeboyr (quarterly labels).

bebo extract <bundle> --label <label>

Pull one inner file out of a bundle. Use --out path to write a file.

$ bebo extract 2026-Q1_v1.cbeboqtr --label 2026-02 --out feb.cbebomth
extracted 2026-02 → feb.cbebomth (1147 bytes)

$ bebo count feb.cbebomth
32

Exit codes

0    success
1    usage / argument error
2    decode or integrity failure
3    file not found / I/O error
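
A sketch of how a backup cron job might branch on these codes. The `run_check` wrapper is hypothetical (not part of the CLI) and works with any command that follows the table above:

```shell
#!/bin/sh
# Dispatch on the exit code of a verification command.
# Intended usage: run_check bebo verify --deep orders.cbebomth
run_check() {
  "$@"
  status=$?
  case $status in
    0) echo "ok" ;;
    1) echo "usage error: check flags" ;;
    2) echo "integrity failure: page a human" ;;
    3) echo "file missing: check the S3 sync" ;;
    *) echo "error (exit $status)" ;;
  esac
  return $status
}
```

Preserving the original exit code (`return $status`) lets cron's own failure reporting still fire after the message is logged.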

Troubleshooting

zstd: Unknown frame descriptor
You're feeding a non-BEBO file, or the file was compressed with a trained dictionary you don't have locally. Pass --dict path/to/DICT.bin if your tenant uses one.
CRC32 verification failed
The file was modified after it was written. Check your S3 sync didn't truncate it. If it's corrupt, fetch a fresh copy from the bucket (versioning retains old versions).
decode panic
Most likely an archive written before we shipped the array/JSONB/nullable-TS codec fix (pre-2026-04-18). Re-archive the source data — the new codec handles those column types correctly.