# Data notes — `stat_inline.mjs` (`export const INLINE_STAT_DATA`)

> **⚠ SYNTHETIC DATA.** Every value in this dataset is **fabricated** by
> `data/gen_data.mjs` to *resemble* MET `.stat` line-type output. Nothing here is
> computed from real forecasts or observations. Do not use for any scientific
> conclusion. It exists solely to exercise the interaction prototype.

## How it is produced

A deterministic, seeded generator (`mulberry32`, seed `0x4d455441` = "META")
emits the file, so regenerating yields byte-identical output:

```sh
node data/gen_data.mjs
```
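
For orientation, the widely circulated form of the `mulberry32` PRNG is sketched below. Whether `data/gen_data.mjs` uses exactly this variant is an assumption; the names `SEED`, `seedTag`, and `firstDraws` are illustrative only. The point is that a fixed seed fixes the whole sequence, which is what makes regeneration reproducible.

```javascript
// Commonly circulated mulberry32 PRNG (assumed, not confirmed, to match the generator).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0; // advance the 32-bit state
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // uniform in [0, 1)
  };
}

const SEED = 0x4d455441;

// The four seed bytes, read as ASCII, spell the tag quoted above.
const seedTag = String.fromCharCode(
  SEED >>> 24,
  (SEED >>> 16) & 0xff,
  (SEED >>> 8) & 0xff,
  SEED & 0xff
);

// Two generators built from the same seed emit the same sequence.
const rngA = mulberry32(SEED);
const rngB = mulberry32(SEED);
const firstDraws = Array.from({ length: 5 }, () => [rngA(), rngB()]);
```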

The generator writes two files:

- `data/stat_inline.mjs` — **the single committed copy.** An ES module exporting
  `INLINE_STAT_DATA`, statically imported by `src/main.js`. Because it is a module
  (not a classic `<script>` global), `tools/inline.mjs` folds it into the module
  graph, so the single-file build is fully self-contained. It is also the runtime
  fallback when `fetch` is unavailable, so the app can run with no network request.
- `data/stat_sample.json` — an identical JSON mirror. **Git-ignored / not committed**
  (committing it would duplicate the full dataset). `src/main.js` prefers it via
  `fetch` when served over `http://`, and otherwise falls back to `INLINE_STAT_DATA`.
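
The prefer-then-fall-back behaviour can be sketched roughly as follows. This is not the code in `src/main.js`: the function name `loadStatData`, its options object, and the choice to treat `https:` the same as `http:` are all assumptions made for illustration.

```javascript
// Hypothetical loader: prefer the JSON mirror over HTTP, else use the inlined module data.
async function loadStatData(inlineData, options = {}) {
  const {
    fetchImpl = globalThis.fetch, // injectable so the sketch can be exercised offline
    protocol = globalThis.location?.protocol, // e.g. "http:" or "file:"
    url = "data/stat_sample.json",
  } = options;

  // Assumption: both http: and https: count as "served over HTTP".
  const overHttp = protocol === "http:" || protocol === "https:";
  if (!overHttp || typeof fetchImpl !== "function") return inlineData;

  try {
    const response = await fetchImpl(url);
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return await response.json();
  } catch {
    // Missing mirror, network error, or bad JSON: use the committed copy.
    return inlineData;
  }
}
```

Because the mirror is git-ignored, a fresh checkout served over HTTP would hit the `catch` branch until the generator has been run once.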

This keeps the committed data lean — a single ~444 KB module. CI bounds are stored at
3 dp (they feed only the optional confidence-band display, never a derived metric),
which trims ~20 KB versus the earlier 4 dp without changing any computed statistic.

## Grain: one row per CASE, carrying RAW AGGREGABLES

The dataset is **one row per fully-crossed case** — a `(model, region, lead, thresh)`
cell for CTS, or a `(model, region, lead)` cell for CNT. Crucially, each case carries
the **raw aggregables** MET sums across cases *before* deriving a metric. This is what
makes correct cross-case aggregation (ratio-of-sums) possible.

### CTS case (categorical, threshold-dependent)

```json
{
  "model": "GFS", "line_type": "CTS", "region": "CONUS",
  "lead": 0, "thresh": ">=1.0", "n": 852,
  "c": [190, 58, 100, 504],
  "ci": { "CSI": [0.528, 0.564], "GSS": [0.382, 0.418], "FAR": [...], "POD": [...], "FBIAS": [...] }
}
```

- `c` = the **2×2 contingency counts**, positional: `[fy_oy, fy_on, fn_oy, fn_on]`
  = `[hits, false-alarms, misses, correct-negatives]`. `n = fy_oy+fy_on+fn_oy+fn_on`.
- The five CTS metrics are **derived** from these counts (single source of truth), so
  they are mutually consistent. The per-case derived value is **not stored**; it is
  re-derived from `c` at load. For a single case, ratio-of-sums reduces to the metric
  computed from that case's own counts, so nothing is lost by omitting it.
- `ci[metric]` = stored bootstrap `[bcl, bcu]` bounds (for the CI band / tooltip).
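
As a minimal sketch of that derivation, the function below computes the five metrics from the positional counts using the standard 2×2 contingency-table definitions. The name `deriveCts` is hypothetical and is not taken from `src/model.js`; zero denominators are not handled here.

```javascript
// Derive the five CTS metrics from positional counts [fy_oy, fy_on, fn_oy, fn_on].
// Hypothetical helper; no guard against zero denominators.
function deriveCts([hits, falseAlarms, misses, correctNegatives]) {
  const n = hits + falseAlarms + misses + correctNegatives;
  const fcstYes = hits + falseAlarms;
  const obsYes = hits + misses;
  const hitsRandom = (fcstYes * obsYes) / n; // hits expected by chance
  const hitsOrErrors = hits + falseAlarms + misses;
  return {
    POD: hits / obsYes,
    FAR: falseAlarms / fcstYes,
    CSI: hits / hitsOrErrors,
    GSS: (hits - hitsRandom) / (hitsOrErrors - hitsRandom),
    FBIAS: fcstYes / obsYes,
  };
}

// Counts from the example case above.
const ctsExample = deriveCts([190, 58, 100, 504]);
```

Because every metric is computed from the same four numbers, identities such as `FBIAS = POD / (1 − FAR)` hold automatically.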

### CNT case (continuous, threshold-independent, `thresh = "NA"`)

```json
{
  "model": "GFS", "line_type": "CNT", "region": "CONUS",
  "lead": 0, "thresh": "NA", "n": 5467,
  "s": [1.607, 1.545, 5.083, 5.05, 4.352],
  "sae": 3999.1,
  "ci": { "ME": [...], "RMSE": [...], "MAE": [...], "PR_CORR": [...] }
}
```

- `s` = the **SL1L2 partial sums** (stored as means, following MET), positional:
  `[FBAR, OBAR, FFBAR, OOBAR, FOBAR]`
  = `[mean(f), mean(o), mean(f²), mean(o²), mean(f·o)]` over the case's `n` pairs.
- `sae` = sum of `|f−o|` over the case's `n` pairs (MAE is not recoverable from SL1L2
  alone, so we carry the absolute-error sum to keep MAE a ratio-of-sums: `MAE = Σsae / Σn`).
- From `s`: `ME = FBAR−OBAR`, `MSE = FFBAR+OOBAR−2·FOBAR`, `RMSE = √MSE`,
  `PR_CORR = (FOBAR−FBAR·OBAR)/√((FFBAR−FBAR²)(OOBAR−OBAR²))`.
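
The formulas above translate directly into code. The sketch below uses a hypothetical helper name, `deriveCnt`, and assumes the positional order `[FBAR, OBAR, FFBAR, OOBAR, FOBAR]` given for `s`; it does not guard against zero variance.

```javascript
// Derive the four CNT metrics from the SL1L2 means, the absolute-error sum, and n.
// Hypothetical helper; no guard against zero variance in PR_CORR.
function deriveCnt([fbar, obar, ffbar, oobar, fobar], sae, n) {
  const mse = ffbar + oobar - 2 * fobar;
  const varF = ffbar - fbar * fbar; // variance of forecasts
  const varO = oobar - obar * obar; // variance of observations
  const cov = fobar - fbar * obar; // covariance of forecasts and observations
  return {
    ME: fbar - obar,
    RMSE: Math.sqrt(mse),
    MAE: sae / n,
    PR_CORR: cov / Math.sqrt(varF * varO),
  };
}

// Values from the example case above.
const cntExample = deriveCnt([1.607, 1.545, 5.083, 5.05, 4.352], 3999.1, 5467);
```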

## Dimensions (the universal axes of MET `.stat` output)

| Axis | Column | Values |
|------|--------|--------|
| Model | `MODEL` | GFS, ECMWF, HRRR, NAM |
| Lead time | `FCST_LEAD` | 0, 6, 12, … 120 h |
| Region / mask | `VX_MASK` | CONUS, EAST, WEST, GREAT_PLAINS, GULF |
| Threshold | `FCST_THRESH` | >=1.0, >=5.0, >=10.0, >=25.0 (mm) — CTS only |
| Variable | `FCST_VAR` | APCP_03 (constant; in `meta`, not per-row) |

## Metrics

- **CTS** (categorical, **threshold-dependent**): `CSI`, `GSS`, `FAR`, `POD`, `FBIAS`
  — all derived from the per-case 2×2 counts `c`.
- **CNT** (continuous, **threshold-independent**): `ME`, `RMSE`, `MAE`, `PR_CORR`
  — derived from the per-case SL1L2 sums `s` (+ `sae` for MAE). For these rows
  `thresh = "NA"`.

## Cross-case aggregation — CORRECTED (ratio-of-sums)

> This is the fix the REVIEW flagged as the top correctness follow-up.

When the app groups cases (e.g. one line per model, aggregating over regions /
thresholds / leads), `src/model.js` now:

1. **SUMS the raw aggregables** across the grouped cases:
   - CTS: counts add directly — `Σfy_oy`, `Σfy_on`, `Σfn_oy`, `Σfn_on`.
   - CNT: the SL1L2 terms are *means*, so it re-weights each by its case's `n`
     (`Σ(mean_i · n_i)`, tracking `Σn_i`) and sums `Σsae`.
2. **DERIVES the metric** from the combined totals (e.g.
   `POD = Σfy_oy / Σ(fy_oy+fn_oy)`;
   `RMSE = √((Σ FFBAR_i·n_i + Σ OOBAR_i·n_i − 2·Σ FOBAR_i·n_i) / Σ n_i)`).
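
The two steps can be sketched as below. The function names and the toy cases are hypothetical (the numbers are made up for arithmetic convenience and are not from the dataset); the actual implementation in `src/model.js` may be organised differently.

```javascript
// Step 1 (CTS): counts add directly across cases.
function sumCounts(cases) {
  const total = [0, 0, 0, 0];
  for (const { c } of cases) c.forEach((count, i) => { total[i] += count; });
  return total;
}

// Step 1 (CNT): re-weight each mean by its case's n, then divide by the pooled n.
function poolSl1l2(cases) {
  let n = 0;
  let sae = 0;
  const weighted = [0, 0, 0, 0, 0];
  for (const cs of cases) {
    n += cs.n;
    sae += cs.sae;
    cs.s.forEach((mean, i) => { weighted[i] += mean * cs.n; });
  }
  return { n, sae, s: weighted.map((w) => w / n) };
}

// Toy cases, for illustration only.
const toyCts = [{ c: [90, 10, 10, 890] }, { c: [1, 5, 9, 985] }];
const toyCnt = [
  { n: 10, s: [1, 1, 2, 2, 1.5], sae: 5 },
  { n: 30, s: [3, 2, 10, 5, 6.5], sae: 30 },
];

// Step 2: derive metrics from the combined totals.
const [sumHits, , sumMisses] = sumCounts(toyCts);
const pooledPod = sumHits / (sumHits + sumMisses);
const pooled = poolSl1l2(toyCnt);
const [, , ffbar, oobar, fobar] = pooled.s;
const pooledRmse = Math.sqrt(ffbar + oobar - 2 * fobar);
const pooledMae = pooled.sae / pooled.n;
```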

This is **ratio-of-sums**, which is what MET does, **not** the previous mean of
per-case derived statistics (mean-of-ratios). The two differ: on the committed data,
ECMWF GSS over all CONUS/threshold cases is **0.420** by ratio-of-sums versus **0.371**
by the old mean-of-derived, a gap of 0.049 (about 12% of the correct value). For a
**single** case the two agree exactly, so the single-case display is unchanged.
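
A small illustration of why the two approaches diverge, using POD and two toy cases with very different event counts. The counts are invented for the example and are not from the committed data; the helper names are hypothetical.

```javascript
// POD for one case from positional counts [fy_oy, fy_on, fn_oy, fn_on].
const podOf = ([hits, , misses]) => hits / (hits + misses);

// Ratio-of-sums: pool the counts first, then derive once.
function podRatioOfSums(cases) {
  const hits = cases.reduce((sum, { c }) => sum + c[0], 0);
  const misses = cases.reduce((sum, { c }) => sum + c[2], 0);
  return hits / (hits + misses);
}

// Mean-of-ratios: derive per case, then average (the old, incorrect approach).
function podMeanOfRatios(cases) {
  return cases.reduce((sum, { c }) => sum + podOf(c), 0) / cases.length;
}

// Toy cases: 100 observed events with POD 0.9, and 10 observed events with POD 0.1.
const frequent = { c: [90, 10, 10, 890] };
const rare = { c: [1, 5, 9, 985] };

const ros = podRatioOfSums([frequent, rare]); // 91 / 110
const mor = podMeanOfRatios([frequent, rare]); // (0.9 + 0.1) / 2
```

Mean-of-ratios gives the 10-event case the same weight as the 100-event case, while ratio-of-sums weights each case by how many events it actually contains.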

CI bands for an aggregated cell: stored per-case `[bcl,bcu]` bootstrap bounds cannot be
recombined without the raw resamples, so the aggregate carries an **n-weighted mean** of
the per-case bounds as an honest band proxy (exact for a single case). Recomputing true
bootstrap CIs on brushed subsets remains an open question (see PLAN.md).
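
The n-weighted band proxy described above might look like the sketch below. The function name `aggregateCiBand` and the toy cases are hypothetical. As the text says, this is a display proxy and not a recomputed bootstrap interval.

```javascript
// n-weighted mean of per-case [bcl, bcu] bounds for one metric (a proxy, not a true CI).
function aggregateCiBand(cases, metric) {
  let n = 0;
  let lower = 0;
  let upper = 0;
  for (const cs of cases) {
    const [bcl, bcu] = cs.ci[metric];
    n += cs.n;
    lower += bcl * cs.n;
    upper += bcu * cs.n;
  }
  return [lower / n, upper / n];
}

// Toy cases, for illustration only.
const bandCases = [
  { n: 100, ci: { CSI: [0.4, 0.6] } },
  { n: 300, ci: { CSI: [0.2, 0.4] } },
];
const band = aggregateCiBand(bandCases, "CSI"); // close to [0.25, 0.45]
```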

## Realism baked in

- Skill **decays with lead time** (GSS/CSI down, RMSE up), with per-model decay
  rates chosen so model rankings **cross** at some leads (makes brushing interesting).
- **Higher thresholds → rarer events → lower categorical skill + wider CIs +
  smaller `n`.**
- **Regional spread:** WEST is hardest for precip; GULF easiest — so faceting by
  region reveals structure.
- Small Gaussian per-case noise so lines are not sterile.

## Honest simplifications

- Values are **fabricated**, not from matched pairs (verification math is Theme 01).
- Tidy per-case form is used instead of MET's wide fixed-column `.stat` layout, but the
  carried columns (CTC counts / SL1L2 sums) **are** the genuine MET aggregables.

Case count: **2,100** = 4 models × 21 leads × (20 CTS cases [5 regions × 4 thresholds]
+ 5 CNT cases [5 regions]). Cells (case × metric): **10,080** = 1,680 CTS cases × 5
metrics + 420 CNT cases × 4 metrics.
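
These totals can be rechecked from the dimension and metric lists above. The snippet is a plain arithmetic check with illustrative variable names; it does not read the dataset.

```javascript
// Dimension sizes taken from the tables in this document.
const leads = Array.from({ length: 120 / 6 + 1 }, (_, i) => i * 6); // 0, 6, ..., 120
const sizes = { models: 4, leads: leads.length, regions: 5, thresholds: 4 };
const metricsPerType = { CTS: 5, CNT: 4 };

const ctsCases = sizes.models * sizes.leads * sizes.regions * sizes.thresholds;
const cntCases = sizes.models * sizes.leads * sizes.regions;
const caseCount = ctsCases + cntCases;
const cellCount = ctsCases * metricsPerType.CTS + cntCases * metricsPerType.CNT;

console.log({ ctsCases, cntCases, caseCount, cellCount });
```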
