Roadmap & planned work

Planning artifacts live in three places: the idea board (ideas.html, the capture surface), the real-data blueprint (docs/REAL_DATA_INTEGRATION.md), and the session memory that tracks queued next steps. This page consolidates them.

Direction (current)

R2-first, not offline-first. The proof-of-concept is Cloudflare-hosted; offline single-file builds remain an option, not a requirement. Default: load real data from R2, keep a load-a-file path, and use synthetic data only where no real data exists (currently the ensemble views).
v2 store: raw-first, pairs-on-the-fly. Rather than bulk-storing MET’s _pairs.nc intermediates (55,331 objects / 810 MB), store raw forecasts and truth, each regridded once to a single common grid (URMA 2.5 km CONUS). The browser then computes pairs live (fcst − truth, even on the GPU) for any model/variable/lead/region. A tiny de-identified oracle preserves the proof that browser results match the .stat output. This is strictly more powerful than MET’s fixed pairings, because pairing is no longer limited to the combinations MET precomputed.
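
As a rough illustration of pairs-on-the-fly, the sketch below pairs a forecast field with a truth field that already share one grid and accumulates mean error and RMSE over an optional region mask. It assumes flat typed arrays in the same cell order; `pairStats`, `PairStats`, and the mask convention (nonzero = inside the region) are hypothetical names for this page, not the project's actual API.

```typescript
// Hypothetical sketch: names and the mask convention are assumptions.

/** Summary of forecast-minus-truth pairs over a region. */
interface PairStats {
  count: number; // valid pairs used
  me: number; // mean error (bias)
  rmse: number; // root-mean-square error
}

/**
 * Pair two fields on a shared grid and accumulate error statistics.
 * Cells outside the mask, or where either value is NaN, are skipped.
 */
function pairStats(
  fcst: Float32Array,
  truth: Float32Array,
  mask?: Uint8Array,
): PairStats {
  if (fcst.length !== truth.length) {
    throw new Error("forecast and truth must be on the same grid");
  }
  if (mask && mask.length !== fcst.length) {
    throw new Error("mask must match the grid size");
  }
  let count = 0;
  let sumErr = 0;
  let sumSqErr = 0;
  for (let i = 0; i < fcst.length; i++) {
    if (mask && mask[i] === 0) continue;
    const err = fcst[i] - truth[i]; // the pair, computed on demand
    if (Number.isNaN(err)) continue;
    count++;
    sumErr += err;
    sumSqErr += err * err;
  }
  if (count === 0) return { count, me: NaN, rmse: NaN };
  return { count, me: sumErr / count, rmse: Math.sqrt(sumSqErr / count) };
}

// Usage: a 4-cell toy grid with one missing truth value.
const demoFcst = new Float32Array([1, 2, 3, 4]);
const demoTruth = new Float32Array([0, 2, 5, NaN]);
console.log(pairStats(demoFcst, demoTruth));
console.log(pairStats(demoFcst, demoTruth, new Uint8Array([1, 1, 0, 1])));
```

Because nothing is precomputed, changing the region only means passing a different mask; the same loop maps naturally onto a GPU reduction.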

Recently landed (formerly the “no upload needed” queue)

✅ Real-Data Demo DuckDB panel: variable/line-type/stat/model pickers over the full 1.53M-row .stat Parquet + an IO monitor (range-GET count, bytes, time-to-first-frame — Parquet reads metered via a pass-through service worker).
✅ Stat Explorer (02) and Guided Journey (04) now load the real de-identified bundle by default (served with the site; lib/met-data-source.mjs is the shared funnel), with synthetic fallbacks and .stat file-drop.
✅ METcalcpy/METplotpy gaps: bootstrap CIs (percentile + BCa, in the shared lib and the apps), the scorecard (paired, event-equalized bootstrap significance), the Taylor diagram, and ROC curves — card 12.
✅ Load-a-file for cards 09 (case JSON) and 11 (fields JSON).
✅ MODE faithful-core engine (lib/met-mode.mjs) + MODE Lab (card 13) + the MTD memory/time spike.
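
For readers unfamiliar with the bootstrap CIs listed above, here is a minimal sketch of the percentile variant only (BCa is not shown). It is not the shared lib's implementation: `percentileCI`, the seeded `mulberry32` generator, and the simple order-statistic quantile rule are assumptions chosen to keep the example small and reproducible.

```typescript
// Hypothetical sketch of a percentile bootstrap CI; not the shared lib's code.

/** Small seeded PRNG (mulberry32) so resampling is reproducible. */
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function mean(xs: number[]): number {
  let sum = 0;
  for (const x of xs) sum += x;
  return sum / xs.length;
}

/**
 * Percentile bootstrap: resample with replacement, recompute the statistic,
 * and read the interval off the sorted replicates.
 */
function percentileCI(
  sample: number[],
  stat: (xs: number[]) => number,
  nBoot = 1000,
  alpha = 0.05,
  seed = 1,
): [number, number] {
  if (sample.length === 0) throw new Error("empty sample");
  const rand = mulberry32(seed);
  const n = sample.length;
  const reps: number[] = [];
  for (let b = 0; b < nBoot; b++) {
    const draw: number[] = [];
    for (let i = 0; i < n; i++) {
      draw.push(sample[Math.floor(rand() * n)]);
    }
    reps.push(stat(draw));
  }
  reps.sort((x, y) => x - y);
  // Simple order-statistic quantile; other interpolation rules are possible.
  const at = (q: number): number =>
    reps[Math.min(nBoot - 1, Math.max(0, Math.floor(q * nBoot)))];
  return [at(alpha / 2), at(1 - alpha / 2)];
}

// Usage: 95% interval for the mean of ten per-case errors (made-up values).
const demoErrors = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
console.log(percentileCI(demoErrors, mean, 2000, 0.05, 42));
```

Passing the statistic as a function lets the same resampling loop serve any score computed from the resampled cases.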

Queued next steps (need one R2 upload by the maintainer)

Tier 1: pairs cubes (~0.8 GB) → real data for Spatial Maps’ full workflow.
Tier 2: extend the web-demo store with 10 m wind, MSLP, and precip (re-slice from the existing input store, no re-decode).
Tier 3 (optional): convert GFS/CREDIT surface forecasts (GRIB2→Zarr) for true multi-model maps — their stats are already in the Parquet.

Open decisions (from the blueprint §6)

Eight decisions gate the remaining app wiring. The two MVP-critical ones are:

Default forecast variable — APCP_06@A06 (best categorical coverage) vs TMP@Z2 (cleaner continuous). Recommended: APCP_06.
Per-pair reconstruction for client-side-wrapper — synthesize pairs from SL1L2 moments (fast, approximate) vs refactor to consume sums directly (higher fidelity). Recommended: synthesize for MVP behind a parity test, then refactor.
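
To make the second option concrete, the sketch below shows what "consume sums directly" could look like: continuous statistics derived from SL1L2 partial sums with no per-pair reconstruction. The identities are standard (ME = FBAR − OBAR, MSE = FFBAR − 2·FOBAR + OOBAR), but the `Sl1l2` interface, `aggregate`, and `statsFromSums` are hypothetical names for illustration, not the existing client-side-wrapper code.

```typescript
// Hypothetical sketch; field names mirror MET's SL1L2 columns in lower case.

/** Partial sums from one SL1L2 row (each "bar" is a mean over `total` pairs). */
interface Sl1l2 {
  total: number; // number of matched pairs
  fbar: number; // mean forecast
  obar: number; // mean observation
  fobar: number; // mean of forecast * observation
  ffbar: number; // mean of forecast squared
  oobar: number; // mean of observation squared
}

interface ContinuousStats {
  me: number; // mean error
  mse: number; // mean squared error
  rmse: number; // root-mean-square error
  pr: number; // Pearson correlation
}

/** Combine rows by weighting each mean with its pair count. */
function aggregate(rows: Sl1l2[]): Sl1l2 {
  const total = rows.reduce((n, r) => n + r.total, 0);
  if (total === 0) throw new Error("no pairs to aggregate");
  const weighted = (pick: (r: Sl1l2) => number): number =>
    rows.reduce((s, r) => s + pick(r) * r.total, 0) / total;
  return {
    total,
    fbar: weighted((r) => r.fbar),
    obar: weighted((r) => r.obar),
    fobar: weighted((r) => r.fobar),
    ffbar: weighted((r) => r.ffbar),
    oobar: weighted((r) => r.oobar),
  };
}

/** Derive statistics from the sums alone, with no per-pair data. */
function statsFromSums(s: Sl1l2): ContinuousStats {
  const me = s.fbar - s.obar;
  const mse = s.ffbar - 2 * s.fobar + s.oobar;
  const rmse = Math.sqrt(Math.max(0, mse)); // clamp tiny negative round-off
  const fVar = s.ffbar - s.fbar * s.fbar;
  const oVar = s.oobar - s.obar * s.obar;
  const cov = s.fobar - s.fbar * s.obar;
  const pr = cov / Math.sqrt(fVar * oVar); // NaN if either field is constant
  return { me, mse, rmse, pr };
}

// Usage: two rows covering the pairs (1,1),(2,1) and (3,4),(4,4).
const demoRows: Sl1l2[] = [
  { total: 2, fbar: 1.5, obar: 1, fobar: 1.5, ffbar: 2.5, oobar: 1 },
  { total: 2, fbar: 3.5, obar: 4, fobar: 14, ffbar: 12.5, oobar: 16 },
];
console.log(statsFromSums(aggregate(demoRows)));
```

These results are exact for the statistics the sums determine. Anything that needs the individual pairs, such as resampling them for a bootstrap, still requires either synthesized pairs or a proxy, which is the trade-off this decision weighs.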

Others: single-region UX (repurpose the region axis as MODEL), no-data view policy (banner, not hide), spatial-maps gridding probe, bootstrap-CI proxy acceptability, ingestion entry point, vector-wind UI now-or-later.

The idea-board backlog (not yet prototyped)

Of the 23 idea cards, roughly 12 are covered by the existing apps. The rest, in priority order:

Tier 1 — bridges to real data: ~~stat-ingest~~ (done) · columnar-points (DuckDB-WASM over matched pairs — partially realized by the Real-Data Demo) · zarr-gridded (client Zarr reader feeding a spatial successor — partially realized).

Tier 2 — scale & compute: ~~gpu-compute~~ (done as WebGPU FSS) · uncertainty (bootstrap CIs everywhere) · scale-streaming (LOD/streaming for campaign-scale output).

Tier 3 — experience: nl-query · notebook-sessions · collab-annotation · accessibility.

Constraints worth remembering

The archive is deterministic-only — no ensemble (ECNT/ORANK/RHIST) or probabilistic (PCT/PSTD/PRC) line types — so ensemble views stay synthetic until such data exists.
There is a single verification region in the archive; apps repurpose the region axis as a MODEL axis.
The agent workflow cannot push to R2 or deploy Workers (the safety classifier blocks both); those remain one-command maintainer actions, with the scripts already staged.