HyperBench: a shared testbed for fair and broad evaluation of hyperspectral super-resolution methods
This paper introduces HyperBench, a software framework that standardizes how researchers evaluate hyperspectral super-resolution methods. Hyperspectral super-resolution (HSR) aims to reconstruct a high-spatial-resolution hyperspectral image by fusing a low-resolution hyperspectral image with a high-resolution multispectral image of the same scene. Because real paired training and test data are scarce, the community normally evaluates methods on synthetic pairs made by degrading a ground-truth hyperspectral image. HyperBench aims to make those synthetic tests reproducible and broader in scope.
The authors point out that many past studies use a single, simple blur model (usually a Gaussian) and only one or two spectral response settings. That narrow practice makes results hard to compare across papers and may hide methods’ weaknesses. HyperBench implements Wald’s protocol for generating synthetic pairs: the ground-truth image is spatially blurred and downsampled to produce the low-resolution hyperspectral input, spectrally degraded to produce the high-resolution multispectral input, and then used as the reference for scoring. It also normalizes the ground-truth image consistently, using percentile clipping (the paper uses the 1st and 99th percentiles) to reduce the effect of outliers.
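The normalization and pair-generation steps described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not HyperBench's own code: the function names (`normalize_percentile`, `gaussian_psf`, `blur_bands`, `make_pair`), the rescaling to [0, 1] after clipping, the circular FFT-based blur, and the downsampling by simple striding are all choices made for this example. The source specifies only Wald's protocol and the 1st/99th percentile clipping.

```python
import numpy as np


def normalize_percentile(cube, lo=1.0, hi=99.0):
    """Clip a cube to its [lo, hi] percentiles, then rescale to [0, 1].

    The rescaling to [0, 1] is an assumption of this sketch.
    """
    cube = np.asarray(cube, dtype=np.float64)
    p_lo, p_hi = np.percentile(cube, [lo, hi])
    if p_hi <= p_lo:  # constant image: nothing to rescale
        return np.zeros_like(cube)
    return (np.clip(cube, p_lo, p_hi) - p_lo) / (p_hi - p_lo)


def gaussian_psf(size=9, sigma=1.5):
    """Isotropic Gaussian kernel of shape (size, size) that sums to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()


def blur_bands(cube, psf):
    """Blur every band of an (H, W, B) cube with the same PSF.

    Uses circular (wrap-around) convolution via the FFT; the boundary
    handling is an assumption of this sketch.
    """
    height, width, _ = cube.shape
    kh, kw = psf.shape
    padded = np.zeros((height, width))
    padded[:kh, :kw] = psf
    # Move the kernel centre to index (0, 0) so the blur introduces no shift.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    otf = np.fft.fft2(padded)
    spectrum = np.fft.fft2(cube, axes=(0, 1)) * otf[:, :, None]
    return np.real(np.fft.ifft2(spectrum, axes=(0, 1)))


def make_pair(ground_truth, psf, srf, scale):
    """Build a synthetic (LR-HSI, HR-MSI) pair from an (H, W, B) ground truth.

    srf has shape (M, B): one row of band weights per multispectral channel.
    """
    lr_hsi = blur_bands(ground_truth, psf)[::scale, ::scale, :]
    hr_msi = ground_truth @ srf.T
    return lr_hsi, hr_msi
```

With a 32 x 32 x 20 ground truth, a 4 x 20 spectral response matrix, and `scale=4`, `make_pair` returns an 8 x 8 x 20 low-resolution hyperspectral image and a 32 x 32 x 4 multispectral image. The normalized ground truth itself remains the reference against which reconstructions are scored.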
HyperBench provides a library of configurable degradations and an automated pipeline. It supports ten point spread functions (PSFs), of which the paper lists Gaussian, Kolmogorov, Airy, Moffat, Sinc, Lorentzian-squared, and Hermite as examples, and four spectral response functions (SRFs) derived from real multispectral sensors. Users can pick spatial downsampling factors and matched additive white Gaussian noise levels. The framework runs each method across a grid of these settings, computes standardized reconstruction scores, and logs the full experimental context so that results are directly comparable.
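A configurable degradation library and settings grid of this kind might look roughly like the sketch below. It is illustrative only: the registry `PSFS`, the functions `moffat_psf`, `add_awgn`, and `config_grid`, the kernel parameters, and the placeholder SRF names in the usage note are hypothetical and not taken from HyperBench. Only two of the listed PSF families are shown, and specifying noise as a signal-to-noise ratio in dB is an assumption about how noise levels are set.

```python
import itertools

import numpy as np


def _radius_grid(size):
    """Distance of each kernel pixel from the kernel centre."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    return np.sqrt(xx**2 + yy**2)


def gaussian_psf(size=9, sigma=1.5):
    """Gaussian kernel, normalized to sum to 1."""
    kernel = np.exp(-_radius_grid(size) ** 2 / (2.0 * sigma**2))
    return kernel / kernel.sum()


def moffat_psf(size=9, alpha=2.0, beta=2.5):
    """Moffat kernel (1 + r^2 / alpha^2)^(-beta), normalized to sum to 1."""
    kernel = (1.0 + (_radius_grid(size) / alpha) ** 2) ** (-beta)
    return kernel / kernel.sum()


# Hypothetical registry: name -> kernel factory.
PSFS = {"gaussian": gaussian_psf, "moffat": moffat_psf}


def add_awgn(image, snr_db, rng):
    """Add white Gaussian noise at a target SNR in dB (assumed convention)."""
    signal_power = np.mean(np.asarray(image, dtype=np.float64) ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return image + rng.normal(0.0, np.sqrt(noise_power), size=np.shape(image))


def config_grid(psf_names, srf_names, scales, snrs_db):
    """Yield one settings dict per combination of the given options."""
    for psf, srf, scale, snr in itertools.product(
        psf_names, srf_names, scales, snrs_db
    ):
        yield {"psf": psf, "srf": srf, "scale": scale, "snr_db": snr}
```

For example, `list(config_grid(PSFS, ["srf_a", "srf_b"], [4, 8], [30.0]))` enumerates eight settings dictionaries. Storing each dictionary alongside the resulting scores is one simple way to keep the experimental context attached to every number.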
To show why this matters, the authors reran six recent HSR methods across 70 degradation configurations on four commonly used hyperspectral scenes. The gap between methods, measured in peak signal-to-noise ratio (PSNR), grew from about 5 dB on the easiest PSF to over 13 dB on the hardest. Several methods that looked state-of-the-art under the single-Gaussian protocol lost 8–12 dB under other spatial degradations. These results reveal a fragility that single-configuration testing can miss.
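For reference, the PSNR figures and the per-PSF gap discussed above could be computed along the following lines. This is a sketch, not the paper's scoring code: `psnr` and `psnr_gap_per_psf` are hypothetical names, PSNR is computed here from a single mean squared error over all pixels and bands with a peak value of 1.0 (matching a ground truth normalized to [0, 1]), and the gap is taken as best minus worst method. The paper may instead average per-band PSNR or define the gap differently.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """PSNR in dB from the global MSE over all pixels and bands."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range**2 / mse))


def psnr_gap_per_psf(scores):
    """Map {psf_name: {method_name: psnr_db}} to {psf_name: best - worst}."""
    return {
        psf: max(by_method.values()) - min(by_method.values())
        for psf, by_method in scores.items()
    }
```

Comparing the values returned by `psnr_gap_per_psf` across PSFs shows whether the spread between methods widens as the spatial degradation gets harder.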