Cloud-native pipeline

The pipeline’s rule of thumb, established by the pipeline explorer: verification math is cheap everywhere; grid operators are the GPU’s job; format decode stays offline.

GRIB2 / prepBUFR / .stat  (the archive)
        │  offline, once  (cfgrib/ecCodes decode dominates: ~90% of ~73 s/cycle)
        ▼
Zarr v3 grids + Parquet tables        (sibling project: metplus-data-store)
        │  one R2 upload (maintainer-run)
        ▼
private R2 bucket ──► gated Worker (Range/206 + CORS) ──► browser
                                                        zarrita.js · DuckDB-WASM · WebGPU
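The "Range/206 + CORS" hop in the diagram can be illustrated with a small sketch. It is not the Worker's code: `resolve_range` and the `allow_origin` default are hypothetical, only a single `bytes` range is handled, and it shows just the byte-span arithmetic and the headers a browser client needs to see.

```python
def resolve_range(range_header, size, allow_origin="https://example.invalid"):
    """Map a single-range Range header onto (status, headers, byte_span)."""
    cors = {
        "Access-Control-Allow-Origin": allow_origin,  # hypothetical origin
        # Content-Range is not CORS-safelisted, so it must be exposed explicitly.
        "Access-Control-Expose-Headers": "Content-Range",
    }
    unsatisfiable = (416, {**cors, "Content-Range": f"bytes */{size}"}, None)

    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("this sketch handles one bytes range only")
    first_s, _, last_s = spec.strip().partition("-")

    if first_s == "":
        # Suffix form "bytes=-N": the last N bytes of the object.
        suffix = int(last_s)
        if suffix == 0 or size == 0:
            return unsatisfiable
        first, last = max(size - suffix, 0), size - 1
    else:
        # "bytes=A-B" or open-ended "bytes=A-"; the end is clamped to the object.
        first = int(first_s)
        last = min(int(last_s), size - 1) if last_s else size - 1
        if first >= size or first > last:
            # Simplification: a reversed range is also answered with 416 here.
            return unsatisfiable

    headers = {
        **cors,
        "Content-Range": f"bytes {first}-{last}/{size}",
        "Content-Length": str(last - first + 1),
    }
    return 206, headers, (first, last)
```

A Zarr or Parquet reader in the browser issues exactly this kind of request for one chunk or one row group, which is why the store needs no server-side query layer.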

Stores

Store	Size	Content
`metplus-input.zarr`	30 GB	Decoded model/analysis grids (bitround-12 + zstd; URMA lossless), 256² inner tiles
`metplus-grid-output.zarr`	810 MB	2,335 MET `_pairs.nc` cubes (superseded by the v2 direction)
`metplus_stat.parquet`	8.8 MB	The full 1.53M-row `.stat` archive — all models/vars/line types
`gdas_points.parquet`	45 MB	11.7M point observations
`web-demo-v2.zarr`	83 MB	The v2 common-grid demo store (in R2 now)
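For readers unfamiliar with bitrounding, here is a minimal numpy sketch of the idea: keep the top `keepbits` mantissa bits of each float32 (rounding to nearest, ties to even) and zero the rest, so the trailing zeros compress well. The function name `bitround` is illustrative; the stores may well use a library codec rather than hand-written code like this.

```python
import numpy as np

def bitround(a, keepbits):
    """Round float32 values to `keepbits` mantissa bits (nearest, ties to even)."""
    if not 1 <= keepbits <= 22:
        raise ValueError("keepbits must be in 1..22 for float32")
    a = np.asarray(a, dtype=np.float32)
    bits = a.view(np.uint32).copy()          # reinterpret the raw IEEE-754 bits
    drop = 23 - keepbits                     # float32 has 23 mantissa bits
    mask = np.uint32((0xFFFFFFFF >> drop) << drop)
    half = np.uint32(1 << (drop - 1))
    # Ties-to-even: add (half - 1) plus the lowest kept bit, then truncate.
    lsb = (bits >> np.uint32(drop)) & np.uint32(1)
    bits = (bits + (half - np.uint32(1)) + lsb) & mask
    return bits.view(np.float32)

# Example: 12 kept bits, matching the "bitround-12" label in the table above.
rounded = bitround(np.array([273.15, 0.1, 1e30], dtype=np.float32), 12)
```

The relative error is bounded by 2^-(keepbits+1). A rounded 1e30 sentinel is no longer bit-identical to 1e30, so any missing-data test downstream of a lossy array is safer as a threshold comparison than as an equality check.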

The economics are striking: the entire .stat statistics archive compresses to 8.8 MB of Parquet, so DuckDB-WASM can query any variable, line type, or statistic in the browser with no query server at all.

The v2 store — raw-first, pairs on the fly

A from-scratch redesign (two independent design teams, then user decisions) replaced “store MET’s _pairs.nc intermediates” with:

One common grid (URMA 2.5 km CONUS; the demo ships a ×3-downsampled 533×782 version) onto which every forecast is regridded once at conversion (hand-rolled numpy bilinear, weights cached, ~0.008 s).
Model is a dimension; each variable is a separate array (per-variable precision: bitround t2m=10 / si10=9 bits; APCP lossless for categorical parity); lead is a dimension, with valid_time as the join key to truth.
Whole-frame chunks: the app’s dominant read is a full 2-D frame, so each frame costs 1–2 range GETs, versus roughly 10× as many with 256² tiles.
No stored pairs at all: the browser (or GPU) computes fcst − truth live for any model/var/lead/region, which is strictly more powerful than MET’s fixed pairings. A tiny de-identified oracle (6 precip cases + expected CTC counts) preserves the browser == .stat parity proof.
Missing data uses the 1e30 sentinel (GPU-safe), and regional models cost ≈ their footprint via omitted/fill chunks.
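The "regrid once, weights cached" step can be sketched as below. This is an illustration, not the project's converter. It assumes the fractional source indices `fi`, `fj` of every target cell are already known (the projection math is out of scope), and the names `build_weights`, `regrid`, and `MISSING_MIN` are hypothetical.

```python
import numpy as np

MISSING = np.float32(1e30)   # sentinel named in the text
MISSING_MIN = 1e29           # assumption: treat anything this large as missing

def build_weights(fi, fj, src_shape):
    """Precompute the 4 corner indices and bilinear weights for each target cell."""
    ny, nx = src_shape
    inside = (fi >= 0) & (fi <= ny - 1) & (fj >= 0) & (fj <= nx - 1)
    i0 = np.clip(np.floor(fi).astype(np.intp), 0, ny - 2)
    j0 = np.clip(np.floor(fj).astype(np.intp), 0, nx - 2)
    di = np.clip(fi - i0, 0.0, 1.0)
    dj = np.clip(fj - j0, 0.0, 1.0)
    # Flat indices of the four surrounding source cells.
    idx = np.stack([i0 * nx + j0, i0 * nx + j0 + 1,
                    (i0 + 1) * nx + j0, (i0 + 1) * nx + j0 + 1])
    w = np.stack([(1 - di) * (1 - dj), (1 - di) * dj,
                  di * (1 - dj), di * dj]).astype(np.float32)
    return idx, w, inside

def regrid(src, idx, w, inside):
    """Apply cached weights to one source frame; cheap enough to run per frame."""
    corners = src.ravel()[idx]                       # shape (4, *target_shape)
    bad = (corners >= MISSING_MIN).any(axis=0) | ~inside
    out = (corners * w).sum(axis=0)
    return np.where(bad, MISSING, out).astype(np.float32)
```

The weights depend only on the pair of grids, so they are built once per source grid and reused for every variable, lead, and cycle. Target cells outside a regional model's footprint come back as the sentinel.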

Result: the pairs collection (55,331 objects / 810 MB — a tiny-object anti-pattern on an object store) collapses to a few oracle objects, and live pairs for any selection cost ~4 GETs / 11–17 MB.
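To make "live pairs" concrete, here is a minimal numpy sketch of pairing one forecast frame with its truth frame on the common grid and deriving contingency counts. The function names, the `>=` threshold convention, and the `MISSING_MIN` cutoff are assumptions for illustration; the count keys follow MET's CTC line type columns.

```python
import numpy as np

MISSING_MIN = 1e29  # assumption: values at or above this are the 1e30 sentinel

def live_pairs(fcst, truth, region=None):
    """Return matched (fcst, truth) values where both are present, optionally masked."""
    valid = (fcst < MISSING_MIN) & (truth < MISSING_MIN)
    if region is not None:
        valid &= region              # boolean mask on the common grid
    return fcst[valid], truth[valid]

def ctc(f, o, thresh):
    """2x2 contingency counts for the event 'value >= thresh'."""
    fy, oy = f >= thresh, o >= thresh
    return {
        "TOTAL": int(f.size),
        "FY_OY": int(np.sum(fy & oy)),    # hits
        "FY_ON": int(np.sum(fy & ~oy)),   # false alarms
        "FN_OY": int(np.sum(~fy & oy)),   # misses
        "FN_ON": int(np.sum(~fy & ~oy)),  # correct negatives
    }

fcst = np.array([[0.0, 3.0], [5.0, 1e30]], dtype=np.float32)
truth = np.array([[0.0, 1.0], [6.0, 2.0]], dtype=np.float32)
f, o = live_pairs(fcst, truth)
counts = ctc(f, o, thresh=2.0)
mean_error = float(np.mean(f - o))
```

Because both frames share one grid, pairing is a mask and a subtraction. That is what makes the operation cheap enough to run in the browser for any selection.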

Conversion benchmarks (M2 Pro)

Step	Cost
One full cycle, all models (v1: GRIB2→Zarr + pairs + stat→Parquet)	~73 s (GRIB2 decode ≈ 90%)
Full 24-cycle archive	≈ 25–30 min
v2 regrid + encode (from decoded Zarr; 160 frames + truth)	~7.9 s

GRIB2→Zarr expands on disk (1,226 → 1,692 MB, about 38% larger, even with bitrounding). GRIB2’s native packing is excellent; the win is random access, not size.

Scaling direction

For global-plus-regional storage the growth path is a multiscale pyramid (GeoZarr-style multiscales: coarse global level + fine regional level), which keeps per-view fetch cost flat while the dataset grows — see the interactive multiscale storage explainer.
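As a rough illustration of the pyramid idea (generic 2× coarsening, not the GeoZarr layout itself), the sketch below builds coarser levels by averaging 2×2 blocks while skipping the missing-data sentinel. `coarsen2`, `build_pyramid`, and `MISSING_MIN` are hypothetical names, and odd trailing rows or columns are simply trimmed.

```python
import numpy as np

MISSING = np.float32(1e30)
MISSING_MIN = 1e29  # assumption: values at or above this count as missing

def coarsen2(a):
    """Halve resolution by averaging the valid cells of each 2x2 block."""
    ny, nx = (a.shape[0] // 2) * 2, (a.shape[1] // 2) * 2
    blocks = a[:ny, :nx].reshape(ny // 2, 2, nx // 2, 2)
    valid = blocks < MISSING_MIN
    n = valid.sum(axis=(1, 3))
    total = np.where(valid, blocks, 0.0).sum(axis=(1, 3), dtype=np.float64)
    # Blocks with no valid cell stay missing.
    return np.where(n > 0, total / np.maximum(n, 1), MISSING).astype(np.float32)

def build_pyramid(frame, levels):
    """Level 0 is the full-resolution frame; each later level is 2x coarser."""
    out = [np.asarray(frame, dtype=np.float32)]
    for _ in range(levels):
        out.append(coarsen2(out[-1]))
    return out

# The demo grid size from the text: 533x782 -> 266x391 -> 133x195.
shapes = [lvl.shape for lvl in build_pyramid(np.zeros((533, 782)), 2)]
```

A zoomed-out view reads a coarse level and a zoomed-in view reads a small window of a fine level, so the number of cells fetched per view stays roughly constant as coverage grows.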