IO#
- class quantem.widget.io.IOResult(data: ndarray, pixel_size: float | None = None, units: str | None = None, title: str = '', labels: list[str] = <factory>, metadata: dict = <factory>, frame_metadata: list[dict] = <factory>)[source]#
Bases: object
Result of reading a file or folder.
- data#
Float32 array, 2D (H, W) or 3D (N, H, W).
- Type:
np.ndarray
- frame_metadata#
Per-frame metadata for 5D stacks. One dict per frame, extracted from each source file (e.g. defocus, tilt angle, timestamp).
- property array#
- property name#
- describe(keys: list[str] | None = None, diff: bool = True)[source]#
Print a per-frame metadata table.
- Parameters:
keys (list of str, optional) – Metadata keys to show (short names, matched against the end of full HDF5 paths). If None, shows all keys (filtered by diff when multiple frames exist).
diff (bool, default True) – Only show columns where values differ across frames. Set diff=False to show all columns, including constants.
Examples
>>> result = IO.arina_folder("/data/session/", det_bin=8)
>>> result.describe()              # diff columns only
>>> result.describe(diff=False)    # all columns
>>> result.describe(keys=["count_time", "photon_energy"])
File Loading
- static IO.file(source: str | Path | list[str | Path], *, dataset_path: str | None = None, file_type: str | None = None, recursive: bool = False) IOResult[source]#
Read one or more files.
- Parameters:
source (str, pathlib.Path, or list) – Single file path or list of file/folder paths.
dataset_path (str, optional) – Explicit HDF dataset path for .emd files.
file_type (str, optional) – Filter for folders in a list (passed to IO.folder()).
recursive (bool, default False) – For folders in a list (passed to IO.folder()).
- Return type:
IOResult
- static IO.folder(folder: str | Path | list, *, file_type: str | None = None, recursive: bool = False, dataset_path: str | None = None) IOResult[source]#
Read a folder of files into a stacked IOResult.
- Parameters:
folder (str, pathlib.Path, or list) – Folder path, or list of folder paths to merge into one stack.
file_type (str, optional) – File type to select (e.g. "png", "tiff", "dm4"). If omitted, auto-detects from the folder contents.
recursive (bool, default False) – Include files in subdirectories.
dataset_path (str, optional) – Explicit HDF dataset path for .emd files.
- Return type:
IOResult
- static IO.supported_formats() list[str][source]#
Return sorted list of supported file extensions (without dots).
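A quick pre-flight check can be built on this. The helper below is an illustration, not part of the API; the hard-coded list stands in for an actual `IO.supported_formats()` call, and it assumes only what is documented (extensions are returned without dots):

```python
from pathlib import Path

def can_load(path: str, supported: list[str]) -> bool:
    """Check a file's extension against IO.supported_formats() output."""
    ext = Path(path).suffix.lstrip(".").lower()
    return ext in supported

# Hypothetical stand-in for: supported = IO.supported_formats()
supported = ["dm3", "dm4", "npy", "png", "tiff"]
print(can_load("scan_001.DM4", supported))  # True
```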
GPU-Accelerated Loading
- static IO.arina_file(master_path: str | list[str], det_bin: int | str = 1, scan_bin: int = 1, scan_shape: tuple[int, int] | None = None, hot_pixel_filter: bool = True, backend: str = 'auto') IOResult[source]#
Load Arina 4D-STEM data with GPU-accelerated decompression.
Accepts a single master file path (returns 4D) or a list of paths (returns 5D stacked along a new leading axis).
- Parameters:
master_path (str or list of str) – Path to one Arina master HDF5 file, or a list of paths. A list of two or more files produces a 5D result.
det_bin (int or "auto", optional) – Detector binning factor (applied to both axes). "auto" picks the smallest factor that fits in available RAM. Default 1.
scan_bin (int, optional) – Scan (navigation) binning factor. Applied after loading as a mean over scan_bin x scan_bin neighborhoods. Requires 4D output (i.e. scan_shape must be set or inferred). Default 1.
scan_shape (tuple of (int, int), optional) – Reshape into (scan_rows, scan_cols, det_rows, det_cols). If None and det_bin > 1, inferred as (sqrt(n), sqrt(n)).
hot_pixel_filter (bool, optional) – Zero out hot pixels on the detector (pixels > 5σ above median in the mean diffraction pattern). Default True.
backend (str, optional) – GPU backend: "auto" (detect best available), "mps" (Apple Metal), "cuda", "intel", or "cpu". Default "auto".
- Returns:
Single file: .data has shape (scan_r, scan_c, det_r, det_c). Multiple files: .data has shape (n_files, scan_r, scan_c, det_r, det_c).
- Return type:
IOResult
Examples
Single file → 4D:
>>> result = IO.arina_file("SnMoS2s_001_master.h5", det_bin=2)
>>> result.data.shape
(512, 512, 96, 96)
Auto-detect smallest bin that fits in RAM:
>>> result = IO.arina_file("master.h5", det_bin="auto")
Cherry-pick specific files → 5D:
>>> result = IO.arina_file([
...     "scan_00_master.h5",
...     "scan_03_master.h5",
...     "scan_07_master.h5",
... ], det_bin=4)
>>> result.data.shape
(3, 256, 256, 48, 48)
Free GPU memory when done (important for large datasets):
>>> del widget   # free MPS tensor held by Show4DSTEM
>>> del result   # free numpy array from IOResult
>>> import torch, gc
>>> gc.collect()
>>> torch.mps.empty_cache()  # release MPS allocator cache
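The hot-pixel rule stated above (zero detector pixels more than 5σ above the median of the mean diffraction pattern) can be sketched in NumPy. This is an illustration of the documented criterion, not the library's implementation; the exact σ estimator used internally is an assumption here:

```python
import numpy as np

def zero_hot_pixels(data: np.ndarray, n_sigma: float = 5.0) -> np.ndarray:
    """Zero detector pixels > n_sigma above the median of the mean pattern.

    data: 4D array (scan_r, scan_c, det_r, det_c).
    """
    mean_dp = data.mean(axis=(0, 1))          # mean diffraction pattern
    med = np.median(mean_dp)
    sigma = mean_dp.std()                     # assumed estimator for σ
    hot = mean_dp > med + n_sigma * sigma     # hot-pixel mask on the detector
    out = data.copy()
    out[..., hot] = 0.0
    return out

# Demo on synthetic 4D data with one hot detector pixel
data = np.ones((4, 4, 8, 8), dtype=np.float32)
data[..., 3, 3] = 1000.0
cleaned = zero_hot_pixels(data)
```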
- static IO.arina_folder(folder: str | Path | list[str | Path], det_bin: int | str = 1, scan_bin: int = 1, scan_shape: tuple[int, int] | None = None, hot_pixel_filter: bool = True, backend: str = 'auto', max_files: int = 50, recursive: bool = False, pattern: str | None = None) IOResult[source]#
Load Arina master files from one or more folders into a 5D stack.
Finds every *_master.h5, loads each with IO.arina_file(), and stacks them along a new leading axis.
- Parameters:
folder (str, pathlib.Path, or list) – Directory (or list of directories) containing *_master.h5 files.
det_bin – Forwarded to IO.arina_file() for each file.
scan_bin – Forwarded to IO.arina_file() for each file.
scan_shape – Forwarded to IO.arina_file() for each file.
hot_pixel_filter – Forwarded to IO.arina_file() for each file.
backend – Forwarded to IO.arina_file() for each file.
max_files (int, default 50) – Maximum number of master files to load. Prevents accidentally loading hundreds of files into RAM. Set to 0 for no limit.
recursive (bool, default False) – Search subdirectories for *_master.h5 files.
pattern (str, optional) – Only load files whose stem contains this string (case-insensitive). E.g. pattern="SnMoS2" loads only files with "SnMoS2" in the name.
- Returns:
.data has shape (n_files, scan_rows, scan_cols, det_rows, det_cols) (5D) when each file produces 4D output. .labels contains the stem of each master file.
- Return type:
IOResult
Examples
Load all scans in a session folder:
>>> result = IO.arina_folder("/data/20260208/", det_bin=4)
>>> result.data.shape
(10, 256, 256, 48, 48)
Filter by sample name:
>>> result = IO.arina_folder("/data/", pattern="SnMoS2", det_bin=2)
Merge scans from multiple session folders:
>>> result = IO.arina_folder(["/data/day1/", "/data/day2/"], det_bin=8)
Search subdirectories:
>>> result = IO.arina_folder("/data/", recursive=True, pattern="focal", det_bin=4)
Free GPU memory when done (important for large datasets):
>>> del widget   # free MPS tensor held by Show4DSTEM
>>> del result   # free numpy array from IOResult
>>> import torch, gc
>>> gc.collect()
>>> torch.mps.empty_cache()  # release MPS allocator cache
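The discovery behavior documented above (find *_master.h5 files, filter by a case-insensitive substring, cap at max_files) can be sketched with pathlib. The helper name find_master_files is hypothetical, not part of the API:

```python
from pathlib import Path

def find_master_files(folder, pattern=None, recursive=False, max_files=50):
    """Collect *_master.h5 paths the way IO.arina_folder() is documented to."""
    glob = Path(folder).rglob if recursive else Path(folder).glob
    files = sorted(glob("*_master.h5"))
    if pattern:  # case-insensitive substring match against the stem
        files = [f for f in files if pattern.lower() in f.stem.lower()]
    if max_files:  # max_files=0 means no limit
        files = files[:max_files]
    return files
```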
Examples#
Read a single file (any format):
from quantem.widget import IO, Show2D
result = IO.file("gold_nanoparticles.dm4")
print(result.pixel_size, result.units) # 1.43 Å
Show2D(result, show_fft=True, log_scale=True)
Read a folder of images as a stack:
from quantem.widget import IO, Show3D
result = IO.folder("/path/to/focal_series/", file_type="dm4")
print(result.data.shape) # (20, 4096, 4096)
print(result.labels) # ['image_001', 'image_002', ...]
Show3D(result, title="Focal Series")
Auto-detect file type (omit file_type):
# Auto-detects from folder contents (raises if mixed types)
result = IO.folder("/path/to/tiff_scans/")
Read multiple files into a stack:
result = IO.file([
"sample_region_A.dm4",
"sample_region_B.dm4",
"sample_region_C.dm4",
])
Show3D(result)
Merge multiple folders into one stack:
result = IO.folder([
"/path/to/session_1/",
"/path/to/session_2/",
], file_type="dm3")
# All images across both folders stacked into one (N, H, W) array
Read 4D-STEM data:
result = IO.file("4dstem_binned.h5")
print(result.data.shape) # (256, 256, 128, 128)
from quantem.widget import Show4DSTEM
Show4DSTEM(result, title="4D-STEM")
IOResult duck typing#
IOResult forwards NumPy array methods to the underlying .data array,
so you can use it directly in expressions:
result = IO.file("image.dm4")
result.shape # (1024, 1024)
result.dtype # float32
result.mean() # 0.42
# Reduce a 4D-STEM dataset to a virtual bright-field image
result = IO.arina_file("master.h5", det_bin=2)
vbf = result.sum(axis=(2, 3))
Show2D(vbf, title="Virtual Bright Field")
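The same duck-typed access supports a virtual annular dark-field image. virtual_adf below is a hypothetical helper (it works on result.data or any 4D array), and the radii are illustration values:

```python
import numpy as np

def virtual_adf(data: np.ndarray, r_inner: float, r_outer: float) -> np.ndarray:
    """Virtual annular dark-field: sum detector pixels between two radii."""
    det_r, det_c = data.shape[-2:]
    yy, xx = np.indices((det_r, det_c))
    r = np.hypot(yy - det_r / 2, xx - det_c / 2)   # radius from detector center
    mask = (r >= r_inner) & (r < r_outer)          # annular detector mask
    return (data * mask).sum(axis=(-2, -1))

# Synthetic stand-in for result.data; radii chosen arbitrarily
data = np.random.rand(8, 8, 32, 32).astype(np.float32)
adf = virtual_adf(data, r_inner=10, r_outer=15)
print(adf.shape)  # (8, 8)
```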
print(result) gives a human-readable summary:
IOResult
shape: 512 x 512 x 96 x 96
dtype: float32
title: SnMoS2s_001
pixel_size: 1.4298 Å
labels: ['frame_001', 'frame_002', 'frame_003', ...] (20 total)
metadata: ['General', 'Signal']
IO.arina_file — GPU-accelerated 4D-STEM loading#
IO.arina_file() decompresses bitshuffle+LZ4 data on the GPU via Apple Metal,
and optionally bins detector and/or scan axes on the fly.
(IO.arina() still works as an alias for backward compatibility.)
from quantem.widget import IO
# 2x2 detector binning (most common)
data = IO.arina_file("master.h5", det_bin=2)
# Auto-select bin factor based on available RAM
data = IO.arina_file("master.h5", det_bin="auto")
# Bin both detector and scan axes
data = IO.arina_file("master.h5", det_bin=2, scan_bin=2)
# Disable hot pixel filtering (on by default)
data = IO.arina_file("master.h5", det_bin=2, hot_pixel_filter=False)
Performance benchmarks#
Benchmarked on SnMoS2 dataset (262,144 frames, 192×192 detector, Apple M5). Steady-state times (second+ call — first call adds ~0.5s for JIT/Metal warmup):
| Configuration | Output shape | Memory | Time |
|---|---|---|---|
| det_bin=2 | 512 × 512 × 96 × 96 | 9.0 GB | 1.8 s |
| det_bin=4 | 512 × 512 × 48 × 48 | 2.3 GB | 1.7 s |
| det_bin=8 | 512 × 512 × 24 × 24 | 0.6 GB | 1.8 s |
| det_bin=2, scan_bin=2 | 256 × 256 × 96 × 96 | 2.3 GB | 2.0 s |
| det_bin=2, scan_bin=4 | 128 × 128 × 96 × 96 | 0.6 GB | 2.0 s |
The pipeline is double-buffered (CPU reads chunk N+1 while GPU decompresses chunk N). The bottleneck is GPU decompression: 262k frames of bitshuffle+LZ4 takes ~1.5s on M5 regardless of bin factor. The 1.7 GB disk read (8.2 GB/s SSD) is fully hidden.
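The double-buffering idea can be illustrated with a one-slot queue: a reader thread stages chunk N+1 while the consumer works on chunk N. This is a generic sketch of the pattern, not the Metal pipeline itself:

```python
import queue
import threading

def read_chunks(n_chunks):
    """Stand-in for disk reads; yields chunk indices."""
    for i in range(n_chunks):
        yield i

def pipeline(n_chunks, process):
    buf = queue.Queue(maxsize=1)          # one chunk staged ahead
    SENTINEL = object()

    def reader():
        for chunk in read_chunks(n_chunks):
            buf.put(chunk)                # blocks while the consumer is busy
        buf.put(SENTINEL)                 # signal end of stream

    threading.Thread(target=reader, daemon=True).start()
    results = []
    while (chunk := buf.get()) is not SENTINEL:
        results.append(process(chunk))    # "GPU" work overlaps the next read
    return results

print(pipeline(4, lambda c: c * 2))  # [0, 2, 4, 6]
```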
Note
det_bin=1 (no binning) for this dataset requires ~18 GB of contiguous GPU
memory. Use det_bin="auto" to let IO pick the smallest bin factor that
fits in available RAM.
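One way such an auto selection could work, assuming float32 output and power-of-two candidate factors (the library's actual heuristic is not documented here):

```python
def auto_det_bin(n_frames, det_size, avail_bytes, factors=(1, 2, 4, 8)):
    """Smallest bin factor whose float32 output fits in avail_bytes."""
    for b in factors:
        need = n_frames * (det_size // b) ** 2 * 4  # float32 bytes
        if need <= avail_bytes:
            return b
    raise MemoryError("no bin factor fits")

# SnMoS2 numbers from above: 262,144 frames on a 192x192 detector.
# Unbinned float32 output alone is ~38.7 GB, so 16 GiB of RAM forces det_bin=2.
print(auto_det_bin(262_144, 192, 16 * 1024**3))  # 2
```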
IO.arina_folder — batch 5D-STEM loading#
IO.arina_folder() finds all *_master.h5 files in a folder, loads each
with IO.arina_file(), and stacks them into a 5D dataset (time/tilt series).
from quantem.widget import IO
# Load all scans in a folder → 5D (n_files, scan_r, scan_c, det_r, det_c)
result = IO.arina_folder("/path/to/session/", det_bin=8)
print(result.data.shape) # (10, 256, 256, 24, 24)
# Incomplete files are auto-skipped with a warning
# "SKIPPED: [Errno 2] ... data_000003.h5 ... No such file or directory"
# View as 5D-STEM time series with frame slider
from quantem.widget import Show4DSTEM
Show4DSTEM(result, frame_dim_label="Scan")
Benchmarked on 12 Arina scans (65,536 frames each, 192×192 uint32 detector, Apple M5). 2 incomplete files auto-skipped, 10 loaded:
| Configuration | Output shape | Memory | Load | |
|---|---|---|---|---|
| det_bin=8 | 10 × 256 × 256 × 24 × 24 | 1.5 GB | 9.5 s | 11.0 s |
| det_bin=4 | 10 × 256 × 256 × 48 × 48 | 6.0 GB | 10.8 s | 16.3 s |
Standard file loading performance#
Single files load in under 200 ms on any machine — no GPU required:
| Format | Size | Time |
|---|---|---|
| NPY | 1024 × 1024 | 1 ms |
| DM3 | 4096 × 4096 | 14 ms |
| DM4 | 4096 × 4096 | 14 ms |
| TIFF | 2049 × 2040 | 41 ms |
| PNG | 2048 × 2048 | 45 ms |
| EMD (Velox) | 2048 × 2048 | 105 ms |
Folder loading scales linearly:
| Folder | Stack shape | Time |
|---|---|---|
| 40 TIFFs (256×256) | 40 × 256 × 256 | 43 ms |
| 6 EMDs (2048×2048) | 6 × 2048 × 2048 | 65 ms |
| 3 PNGs (2048×2048) | 3 × 2048 × 2048 | 117 ms |
| 5 DM3s (4096×4096) | 5 × 4096 × 4096 | 150 ms |
Supported formats#
Native (no extra dependencies): PNG, JPEG, BMP, TIFF, EMD, HDF5, NPY, NPZ
Via rosettasciio (pip install rosettasciio): DM3, DM4, MRC, SER, and
60+ more formats.
GPU-accelerated: Arina 4D-STEM master files (IO.arina_file()) — requires
pyobjc-framework-Metal on macOS. CUDA and Intel GPU backends coming soon.
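A pre-flight availability check following the split above. The extension sets mirror this section (common suffix variants added), and rsciio as rosettasciio's import name is an assumption worth verifying in your environment:

```python
import importlib.util
from pathlib import Path

# Native formats per this section, plus common suffix variants (jpg, tif, h5)
NATIVE = {"png", "jpeg", "jpg", "bmp", "tiff", "tif", "emd", "h5", "hdf5", "npy", "npz"}
ROSETTA = {"dm3", "dm4", "mrc", "ser"}  # small subset of the 60+ formats

def loader_available(path: str) -> bool:
    """True if the extension is native, or rosettasciio covers it and is installed."""
    ext = Path(path).suffix.lstrip(".").lower()
    if ext in NATIVE:
        return True
    if ext in ROSETTA:
        return importlib.util.find_spec("rsciio") is not None
    return False

print(loader_available("gold_nanoparticles.png"))  # True
```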