Data Formats

The calibration engine reads L0 Zarr stores and writes L1 Zarr stores. Both use Zarr v3 with zstd compression and vlen-utf8 string arrays.

Data Levels

Level  Content                       Producer            Consumer
-----  ----------------------------  ------------------  ------------------
L0     Raw counts + metadata         zarr-fits-core      calibrate (cal-io)
L1     Calibrated \(T_A^*\) spectra  calibrate (cal-io)  reduction_pipeline
L1b    Baseline-subtracted, flagged  reduction_pipeline  reduction_pipeline
L2     Gridded maps                  reduction_pipeline  Science users

L0 Zarr Structure

session.zarr/
├── zarr.json                          # Zarr v3 root metadata
├── scan_025650/
│   ├── source/
│   │   ├── data_5d     [C, D, R, A, S]   i32  raw counts
│   │   ├── sobsmode    [S]               str  ON/OFF/OTF-ON/OTF-OFF
│   │   ├── mjd         [S]               f64  Modified Julian Date
│   │   ├── exptime     [S]               f32  integration time (s)
│   │   ├── elevation   [S]               f32  elevation (rad)
│   │   ├── azimuth     [S]               f32  azimuth (rad)
│   │   ├── pamb        [S]               f32  ambient pressure (Torr)
│   │   ├── tamb        [S]               f32  ambient temperature (K)
│   │   ├── signal_freq [S]               f64  signal sideband freq (Hz)
│   │   ├── image_freq  [S]               f64  image sideband freq (Hz)
│   │   ├── freq_res    [S]               f64  channel width (Hz)
│   │   ├── freq_off    [S]               f64  frequency offset (Hz)
│   │   ├── ref_channel [S]               f32  reference channel index
│   │   ├── lloadsn     [S]               i32  last-load scan number
│   │   ├── otf_lon     [S]               f64  OTF longitude offset (deg)
│   │   ├── otf_lat     [S]               f64  OTF latitude offset (deg)
│   │   ├── frontend_backends [S]         str  pixel IDs (e.g. HFAV_PX00)
│   │   └── raw_fits_headers  [S]         str  original FITS headers (JSON)
│   └── calibration/
│       ├── data_5d     [C, D, R, A, S]   i32  HOT/COLD counts
│       ├── sobsmode    [S]               str  HOT/COL/COLD/SKY
│       ├── thot        [S]               f32  hot load temperature (K)
│       ├── tcold       [S]               f32  cold load temperature (K)
│       └── ...                           (same coords as source)
└── scan_025651/
    └── ...
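
The sobsmode arrays in the tree above label each subscan along the S axis with its observing phase. A minimal sketch of how a reader might classify those labels follows. The `Phase` enum and the `subscans_with` helper are hypothetical names, not part of the documented crates, and treating "COL" as an alias for "COLD" is an assumption based only on both values appearing in the list.

```rust
/// Observing phase of a subscan, from the sobsmode values listed in the tree.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    On,
    Off,
    OtfOn,
    OtfOff,
    Hot,
    Cold,
    Sky,
}

impl Phase {
    /// Parse one sobsmode string; unknown values yield None.
    /// Assumption: "COL" and "COLD" both denote the cold load.
    fn parse(s: &str) -> Option<Phase> {
        match s.trim() {
            "ON" => Some(Phase::On),
            "OFF" => Some(Phase::Off),
            "OTF-ON" => Some(Phase::OtfOn),
            "OTF-OFF" => Some(Phase::OtfOff),
            "HOT" => Some(Phase::Hot),
            "COL" | "COLD" => Some(Phase::Cold),
            "SKY" => Some(Phase::Sky),
            _ => None,
        }
    }
}

/// Hypothetical helper: indices along S whose sobsmode matches `phase`.
fn subscans_with(modes: &[&str], phase: Phase) -> Vec<usize> {
    let mut out = Vec::new();
    for (i, m) in modes.iter().enumerate() {
        if Phase::parse(m) == Some(phase) {
            out.push(i);
        }
    }
    out
}

fn main() {
    // Illustrative values only; real stores hold one sobsmode array per group.
    let source_modes = ["OFF", "ON", "ON"];
    let calibration_modes = ["HOT", "COLD"];
    println!("ON subscans:  {:?}", subscans_with(&source_modes, Phase::On));
    println!("HOT subscans: {:?}", subscans_with(&calibration_modes, Phase::Hot));
}
```

Returning `None` for unrecognised strings leaves the decision about unknown phases to the caller rather than silently dropping subscans.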

5D Array Layout

The primary data array has shape [C, D, R, A, S]:

Axis  Name       Description
----  ---------  -----------------------------------------------------
C     Channels   Spectral channels (typically 1024–16384)
D     Dumps      Time samples within a subscan
R     Receivers  Pixels within an array (e.g. 7 for HFA)
A     Arrays     Frontend-backend combinations (e.g. HFA, LFA, 4G1–4G4)
S     Subscans   Observing phases (ON, OFF, HOT, COLD, etc.)
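
To make the axis order concrete, the sketch below maps a (c, d, r, a, s) index to a flat offset. It assumes a row-major (C-order) buffer in which S varies fastest; the document does not state the in-memory or chunk order, so treat that as an assumption. `Shape5` is a hypothetical type used only for illustration.

```rust
/// Extents of the five axes, in the documented order [C, D, R, A, S].
#[derive(Debug, Clone, Copy)]
struct Shape5 {
    channels: usize,
    dumps: usize,
    receivers: usize,
    arrays: usize,
    subscans: usize,
}

impl Shape5 {
    /// Total number of elements in the array.
    fn len(&self) -> usize {
        self.channels * self.dumps * self.receivers * self.arrays * self.subscans
    }

    /// Flat offset of element (c, d, r, a, s), assuming row-major order.
    /// Returns None if any index is out of bounds.
    fn offset(&self, c: usize, d: usize, r: usize, a: usize, s: usize) -> Option<usize> {
        if c >= self.channels
            || d >= self.dumps
            || r >= self.receivers
            || a >= self.arrays
            || s >= self.subscans
        {
            return None;
        }
        Some((((c * self.dumps + d) * self.receivers + r) * self.arrays + a) * self.subscans + s)
    }
}

fn main() {
    // Illustrative extents: 1024 channels, 4 dumps, 7 pixels, 2 arrays, 3 subscans.
    let shape = Shape5 { channels: 1024, dumps: 4, receivers: 7, arrays: 2, subscans: 3 };
    println!("elements: {}", shape.len());
    println!("offset of (1, 0, 0, 0, 0): {:?}", shape.offset(1, 0, 0, 0, 0));
}
```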

Missing dumps are padded with i32::MIN (converted to NaN on read).
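
The padding rule above can be sketched as a small conversion step. This is not the engine's actual reader: the function names are hypothetical, and the choice of f64 as the output type is an assumption (the document only says the sentinel becomes NaN).

```rust
/// Sentinel used to pad missing dumps in the i32 count arrays.
const MISSING: i32 = i32::MIN;

/// Convert raw counts to floating point, mapping the padding sentinel to NaN.
/// Assumption: f64 output; every i32 value is exactly representable in f64.
fn counts_to_f64(raw: &[i32]) -> Vec<f64> {
    raw.iter()
        .map(|&v| if v == MISSING { f64::NAN } else { f64::from(v) })
        .collect()
}

/// Mean over valid (non-NaN) samples; None if every sample is missing.
fn nan_mean(values: &[f64]) -> Option<f64> {
    let (sum, n) = values
        .iter()
        .filter(|v| !v.is_nan())
        .fold((0.0_f64, 0_usize), |(s, n), v| (s + *v, n + 1));
    if n == 0 {
        None
    } else {
        Some(sum / n as f64)
    }
}

fn main() {
    // Three dumps for one channel; the middle one was padded.
    let raw = [120, i32::MIN, 80];
    let counts = counts_to_f64(&raw);
    println!("{:?}", counts);
    println!("{:?}", nan_mean(&counts)); // Some(100.0)
}
```

Any later averaging over the D axis has to skip NaN values explicitly, as `nan_mean` does, because a plain sum would propagate NaN into the result.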

L1 Zarr Structure

l1_output/
├── zarr.json                          # cal_schema_version, cal_engine_version
└── scan_025650/
    ├── spectra      [C, D, R, A, S]   f64  calibrated T_A* (K)
    ├── t_sys        [C, R, A, S]      f64  system temperature (K)
    ├── t_int        [S]               f64  integration time (s)
    ├── t_rec_ssb    [C, R, A]         f64  receiver temperature SSB (K)
    ├── gamma        [C, R, A]         f64  gain calibration factor
    ├── tau_signal   [C]               f64  zenith opacity, signal (Np)
    ├── tau_image    [C]               f64  zenith opacity, image (Np)
    ├── t_sky        [C, R, A]         f64  sky temperature (K)
    ├── flags        [C, D, R, A, S]   u16  quality bitmask
    ├── signal_freqs [C]               f64  signal frequencies (Hz)
    ├── image_freqs  [C]               f64  image frequencies (Hz)
    ├── otf_lon      [D, S]            f64  OTF longitude (deg, if OTF)
    ├── otf_lat      [D, S]            f64  OTF latitude (deg, if OTF)
    └── otf_airmass  [D, S]            f64  per-dump airmass (if OTF)

Scan-level attributes include scan_number, source, instmode, rest_freq_hz, pwv_mm, cal_strategy, ref_strategy, and QA metrics.
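
Because the L1 arrays share axes with the 5D spectra array, a consumer can sanity-check shapes before processing a scan. The sketch below transcribes the shapes from the L1 tree above into a lookup. `Dims` and `expected_shape` are hypothetical names, not part of the documented crates.

```rust
/// Axis extents in the documented order [C, D, R, A, S].
#[derive(Debug, Clone, Copy)]
struct Dims {
    c: usize,
    d: usize,
    r: usize,
    a: usize,
    s: usize,
}

/// Expected shape of an L1 array, transcribed from the tree above.
/// Returns None for names that are not listed there.
fn expected_shape(name: &str, dims: &Dims) -> Option<Vec<usize>> {
    let Dims { c, d, r, a, s } = *dims;
    let shape = match name {
        "spectra" | "flags" => vec![c, d, r, a, s],
        "t_sys" => vec![c, r, a, s],
        "t_int" => vec![s],
        "t_rec_ssb" | "gamma" | "t_sky" => vec![c, r, a],
        "tau_signal" | "tau_image" | "signal_freqs" | "image_freqs" => vec![c],
        // Only present for OTF scans, per the tree.
        "otf_lon" | "otf_lat" | "otf_airmass" => vec![d, s],
        _ => return None,
    };
    Some(shape)
}

fn main() {
    // Illustrative extents only.
    let dims = Dims { c: 1024, d: 4, r: 7, a: 2, s: 3 };
    for name in ["spectra", "t_sys", "tau_signal", "otf_airmass"] {
        println!("{:12} {:?}", name, expected_shape(name, &dims));
    }
}
```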

Schema Constants

All array and group names are defined in the cal-schema crate (crates/cal-schema/src/lib.rs), the single source of truth for both the FITS→Zarr writer and the calibration reader.
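
As an illustration of the single-source-of-truth idea, the sketch below shows how a shared constants module might keep a writer and a reader in agreement. The module layout, constant names, and helper functions are hypothetical; only the string values are taken from the trees above, and the real cal-schema crate may be organised differently.

```rust
/// Hypothetical shared names module; the actual cal-schema layout may differ.
pub mod names {
    pub const GROUP_SOURCE: &str = "source";
    pub const GROUP_CALIBRATION: &str = "calibration";
    pub const DATA_5D: &str = "data_5d";
    pub const THOT: &str = "thot";
}

/// Writer side: the path where raw counts would be stored for a scan group.
fn writer_counts_path(scan_group: &str) -> String {
    format!("{}/{}/{}", scan_group, names::GROUP_SOURCE, names::DATA_5D)
}

/// Reader side: the path where raw counts would be looked up.
/// It uses the same constants, so renaming an array updates both sides at once.
fn reader_counts_path(scan_group: &str) -> String {
    format!("{}/{}/{}", scan_group, names::GROUP_SOURCE, names::DATA_5D)
}

fn main() {
    let group = "scan_025650";
    println!("{}", writer_counts_path(group));
    println!("{}/{}/{}", group, names::GROUP_CALIBRATION, names::THOT);
    assert_eq!(writer_counts_path(group), reader_counts_path(group));
}
```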