MET-AL · storage notes

Pyramids for planet-scale data
COG · OME-Zarr · GeoZarr

Three cloud-native formats solve the same problem — how do you look at one small piece of a dataset far too big to download? All three answer with the same two tricks: tiling (fetch only the area you see) and overviews / pyramids (fetch only the resolution you need). This page explains each, how they relate, and how they shape the MET-AL "global-capable" store.

The problem, in one breath

A single global weather field can be gigabytes; an archive is terabytes. A browser can't download that to draw one region at one zoom level. The fix is to never send the whole thing — store the data so a client can ask for exactly the bytes it needs over plain HTTP range requests. Two ideas make that possible, and all three formats below are variations on them:

tiling: the array is cut into small blocks, so a viewport fetches only the blocks it overlaps, not the whole grid.
overviews: pre-computed coarser copies (½, ¼, ⅛, …). Zoomed out, you read a small coarse level instead of millions of full-res tiles.
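
The tiling idea reduces to integer arithmetic. The following is a minimal sketch, assuming square tiles and a viewport given in array pixels; the function name and the 256-pixel default are illustrative choices, not part of any of the formats.

```python
def tiles_for_viewport(x0, y0, x1, y1, tile_size=256):
    """Return (col, row) indices of every tile a viewport overlaps.

    The viewport is the half-open pixel box [x0, x1) x [y0, y1).
    """
    if x1 <= x0 or y1 <= y0:
        return []  # empty viewport, nothing to fetch
    first_col, first_row = x0 // tile_size, y0 // tile_size
    # The last pixel covered is x1 - 1, so the last tile is (x1 - 1) // tile_size.
    last_col, last_row = (x1 - 1) // tile_size, (y1 - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# A 600 x 400 pixel viewport touches 3 columns and 2 rows of 256-pixel tiles.
tiles = tiles_for_viewport(300, 100, 900, 500)
print(len(tiles), tiles[0], tiles[-1])  # 6 (1, 0) (3, 1)
```

Only those six blocks need to be requested, however large the full grid is.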

Why it matters here: the MET-AL store must serve interactive maps and GPU compute from Cloudflare R2 with no server-side decoder. Getting tiling + overviews right is the difference between "one quick range read" and "download the planet." The rest of this page is the vocabulary behind that design.

See it: a pyramid keeps fetch cost flat

Drag the zoom. The dataset quadruples at every finer level, but the number of tiles your screen actually fetches stays roughly constant — because you read the level that matches the zoom, and only the tiles under the viewport. That invariant is the whole point of overviews.

[Interactive demo. Readouts: resolution level used for this view; tiles that exist at this level (whole world); tiles fetched for the current viewport; cost vs. fetching this view from the finest level.]

The blue box is the viewport (what you see on screen). Filled cells are the tiles fetched; faint cells exist in the store but aren't requested. Switch to finest level only to see what happens with no overviews: a zoomed-out view has to pull every full-res tile.
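
The invariant can be checked with a few lines of arithmetic. This sketch assumes a square world of 65536 pixels per side at full resolution, 256-pixel tiles, a 1024-pixel screen, and tile-aligned viewports; all of these numbers are hypothetical, chosen only to make the pattern visible. An unaligned viewport would add at most one extra row and column of tiles.

```python
TILE = 256      # tile edge in pixels (assumed)
N_LEVELS = 7    # level 0 = full res; each further level halves both axes
SCREEN = 1024   # viewport edge in screen pixels (hypothetical)


def pick_level(view_px, screen_px=SCREEN, n_levels=N_LEVELS):
    """Coarsest level that still has at least one data pixel per screen pixel.

    view_px is the viewport width measured in full-resolution pixels.
    """
    level = 0
    while level + 1 < n_levels and (view_px >> (level + 1)) >= screen_px:
        level += 1
    return level


def tiles_per_side(pixels, tile=TILE):
    return -(-pixels // tile)  # ceiling division


def fetch_cost(view_px):
    """Return (level used, tiles fetched with the pyramid, tiles at finest level only)."""
    level = pick_level(view_px)
    with_pyramid = tiles_per_side(view_px >> level) ** 2
    finest_only = tiles_per_side(view_px) ** 2
    return level, with_pyramid, finest_only


# From the whole world (65536 px wide) down to a 1024 px wide region.
for view_px in (65536, 16384, 4096, 1024):
    print(view_px, fetch_cost(view_px))
# 65536 (6, 16, 65536)
# 16384 (4, 16, 4096)
# 4096 (2, 16, 256)
# 1024 (0, 16, 16)
```

With the pyramid the fetch stays at 16 tiles at every zoom; without it, the whole-world view needs 65536.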

Cloud-Optimized GeoTIFF (COG) · mature · OGC standard

A COG is a perfectly ordinary GeoTIFF laid out so a client can range-read it. Three ingredients: the file opens with its metadata (so one read reveals every tile's byte offset), the pixels are internally tiled (256²/512² blocks, not scanline strips), and it carries overviews — embedded lower-resolution copies at ½, ¼, ⅛… A viewer reads the header, figures out which tiles (at which overview) cover the current map view, and fetches just those byte ranges. No tiling server, no reformatting — the file on S3/R2 is the tile service.

[Figure: COG file layout from byte 0 to end of file: header (IFD + tile byte offsets), then overviews ⅛, ¼, ½, then full-resolution tiles (256² blocks). Metadata first, coarse to fine; a viewport reads the header, then range-GETs a few highlighted tiles.]
One file, front-loaded metadata, coarse overviews before full-res tiles — the layout that makes range reads cheap.
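
Once the header has been parsed, fetching a tile is a single ranged GET. The sketch below assumes the per-tile offsets and byte counts have already been read from the header (in TIFF these are the TileOffsets and TileByteCounts tags); the URL and the numbers are made up for illustration, and the request is only built, not sent.

```python
from urllib.request import Request


def tile_range_request(url, tile_offsets, tile_byte_counts, tile_index):
    """Build an HTTP request for exactly one tile's bytes."""
    offset = tile_offsets[tile_index]
    count = tile_byte_counts[tile_index]
    # HTTP byte ranges are inclusive on both ends, hence the "- 1".
    return Request(url, headers={"Range": f"bytes={offset}-{offset + count - 1}"})


# Hypothetical offsets and sizes for three tiles of one overview level.
req = tile_range_request(
    "https://example.com/field.tif",
    tile_offsets=[8192, 20480, 36864],
    tile_byte_counts=[12288, 16384, 9000],
    tile_index=1,
)
print(req.get_header("Range"))  # bytes=20480-36863
```

A server that honours the range replies with 206 Partial Content and only those 16384 bytes, which is why no tiling server is needed.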

OME-Zarr / NGFF · mature · community standard

When microscopy hit the same wall (single images running to terabytes, across five dimensions: time, channel, z, y, x), the Open Microscopy Environment built a "next-generation file format" (NGFF) on top of Zarr. Its key idea is multiscales: a Zarr group holds several arrays, one per resolution level (0 = full res; 1, 2, … progressively coarser), and a small JSON metadata block describes them: the ordered datasets, each with a coordinateTransformations scale, and the named axes. It's COG's overview idea, but Zarr-native and n-dimensional: viewers such as vizarr and napari stream just the level and chunks in view.

[Figure: an image/ Zarr group holding the multiscales metadata and four arrays: 0 (full res), 1 (½), 2 (¼), 3 (⅛). The metadata lists "axes": [t, c, z, y, x] and "datasets" ordered from high to low resolution: {path: "0", scale: [1,1,1,1,1]}, {path: "1", scale: [1,1,1,2,2]}, {path: "2", scale: [1,1,1,4,4]}, …]
A Zarr group of resolution-level arrays plus multiscales metadata that names the axes and the per-level scale. Each level is itself chunked, so a viewer streams only the chunks in view.
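
To make the metadata concrete, here is a sketch that generates a multiscales block for a pyramid that halves y and x at each level. The field layout is modelled on the NGFF 0.4 metadata as I understand it; treat the exact keys as an assumption and check them against the spec version you target. The function name is hypothetical.

```python
import json


def multiscales_metadata(n_levels, axes=("t", "c", "z", "y", "x")):
    """Build multiscales metadata where only y and x are downsampled by 2 per level."""
    axis_types = {"t": "time", "c": "channel"}  # any other axis is treated as spatial
    datasets = []
    for level in range(n_levels):
        factor = float(2 ** level)
        scale = [factor if name in ("y", "x") else 1.0 for name in axes]
        datasets.append({
            "path": str(level),  # array name inside the group: "0", "1", ...
            "coordinateTransformations": [{"type": "scale", "scale": scale}],
        })
    return {"multiscales": [{
        "version": "0.4",  # assumed spec version
        "axes": [{"name": n, "type": axis_types.get(n, "space")} for n in axes],
        "datasets": datasets,  # ordered from highest to lowest resolution
    }]}


meta = multiscales_metadata(3)
print(json.dumps(meta["multiscales"][0]["datasets"][2]))
```

A viewer reads this one small JSON document, picks the dataset whose scale matches the current zoom, and then requests only that array's chunks.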

GeoZarr · emerging · OGC draft (2026)

GeoZarr is the effort to make Zarr a first-class geospatial raster format — informally, "COG for the n-dimensional Zarr world." It's being standardized by an OGC Standards Working Group as a set of small, composable Zarr Conventions rather than one monolith: a multiscales convention (the pyramid idea, straight from OME-Zarr's lineage) for progressive visualisation, plus spatial conventions that tie array indices to real-world coordinates (CF grid_mapping/CRS, dimension names). So a single GeoZarr store can carry time, vertical level, and multiple resolution levels with a proper projection — the things a stack of COGs can't.
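
What "tying array indices to real-world coordinates" means can be shown with a regular lat/lon grid. This is a generic sketch, not the GeoZarr encoding: the parameter names, the 0.25° step and the north-west origin are assumptions for illustration, and a real store would carry this information in its CRS and coordinate metadata.

```python
def cell_center(row, col, level, lon_west=-180.0, lat_north=90.0, step_deg=0.25):
    """Longitude/latitude of a cell center at a given pyramid level.

    Assumes a north-up regular grid whose level-0 cells are step_deg wide,
    with each coarser level doubling the cell size.
    """
    step = step_deg * 2 ** level
    lon = lon_west + (col + 0.5) * step
    lat = lat_north - (row + 0.5) * step  # rows count downward from the north edge
    return lon, lat


print(cell_center(0, 0, level=0))  # (-179.875, 89.875)
print(cell_center(0, 0, level=2))  # (-179.5, 89.5)
```

The same index means a different place at each level, which is why a multiscale store needs the spatial metadata per level rather than once per file.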

Status (mid-2026): an OGC Standards Working Group is active, a V1 release candidate is on the 2026 roadmap, and it already has implementations across GDAL, rioxarray, TiTiler, OpenLayers and the Copernicus EOPF data model. It is stabilizing, not frozen — worth following its conventions without hard-coding against a draft.

Side by side

               | COG                       | OME-Zarr (NGFF)                | GeoZarr
container      | one GeoTIFF file          | a Zarr store (group of arrays) | a Zarr store (group of arrays)
home domain    | geospatial imagery        | bio-imaging / microscopy       | geospatial / earth-observation
dimensions     | x, y, bands (2-D)         | up to t, c, z, y, x (n-D)      | n-D incl. time, level, + CRS
"tiling"       | internal TIFF tiles       | Zarr chunks (+ v3 shards)      | Zarr chunks + v3 shards
"overviews"    | embedded overviews        | multiscales group              | multiscales convention
CRS / geo      | yes (GeoTIFF tags)        | no (physical units)            | yes (CF grid_mapping)
range reads    | HTTP byte ranges          | per-chunk object reads         | per-chunk / per-shard reads
maturity       | OGC standard              | community standard             | emerging (2026)
browser tools  | TiTiler, OpenLayers, GDAL | vizarr, napari, zarrita        | zarrita, TiTiler, OpenLayers, GDAL

The one-sentence relationship: COG put a tiled pyramid inside a single 2-D geo file; OME-Zarr generalised the pyramid to n-D arrays in a Zarr store; GeoZarr brings that n-D Zarr pyramid back to geospatial with a real CRS — so it's the natural target for a multi-dimensional, multi-resolution earth-data store.

What this means for the MET-AL store

The MET-AL "global-capable" design is a GeoZarr-style multiscale pyramid: a coarse global level (a regular lat/lon grid, for global models like AIGFS/GFS) and a fine regional level (a Lambert CONUS grid, for regional models), each dataset stored at its honest native resolution. A reader picks the level that matches the zoom — coarse when you're looking at the hemisphere, fine when you're zoomed into a region — exactly the pyramid invariant from the demo above.
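
A reader's level choice for a two-level store like this could look roughly as follows. Everything concrete here is hypothetical: the resolutions (25 km and 3 km), the bounding boxes, and the names are placeholders, and the regional Lambert grid is simplified to a lon/lat box for the coverage test. The rule is the same as for any pyramid: take the coarsest level that covers the view and is still at least as sharp as the screen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    name: str
    km_per_pixel: float
    bbox: tuple  # (west, south, east, north) in degrees, a simplification


def covers(level, view):
    west, south, east, north = level.bbox
    v_west, v_south, v_east, v_north = view
    return west <= v_west and south <= v_south and v_east <= east and v_north <= north


def choose_level(levels, view, view_km_per_pixel):
    """Pick the coarsest covering level that is at least as sharp as the screen."""
    covering = [lv for lv in levels if covers(lv, view)]
    if not covering:
        raise ValueError("no level covers this viewport")
    sharp_enough = [lv for lv in covering if lv.km_per_pixel <= view_km_per_pixel]
    if sharp_enough:
        return max(sharp_enough, key=lambda lv: lv.km_per_pixel)
    # Zoomed in past every level: fall back to the finest data available.
    return min(covering, key=lambda lv: lv.km_per_pixel)


# Hypothetical levels: a coarse global grid and a fine regional one.
LEVELS = [
    Level("global", 25.0, (-180.0, -90.0, 180.0, 90.0)),
    Level("regional", 3.0, (-125.0, 24.0, -66.0, 50.0)),
]

print(choose_level(LEVELS, (-180.0, 0.0, 0.0, 90.0), 20.0).name)     # global
print(choose_level(LEVELS, (-100.0, 35.0, -90.0, 42.0), 5.0).name)   # regional
print(choose_level(LEVELS, (-100.0, 35.0, -90.0, 42.0), 40.0).name)  # global
```

The hemisphere view falls back to the global level because the regional grid does not cover it; the same regional viewport flips between levels purely on zoom.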