
Image Preprocessing, GPU‑accelerated

11/25/2025
5 min read
By RunMat Team

If you ship geospatial or vision workloads, you’ve likely written this stage countless times: standardize each 4K tile, apply a small radiometric correction, gamma‑correct, and run a quick QC metric. On CPUs this is fine until the batch grows and wall‑clock time explodes.

In this article, we benchmark RunMat against NumPy and PyTorch.

The math is deliberately simple and realistic: compute a per‑image mean and standard deviation, normalize, apply a modest gain/bias and a gamma curve, then validate with a mean‑squared error.
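
Written out, the steps in the paragraph above look roughly as follows. The symbols are this sketch's own notation, not the article's: $x$ is the input batch, $g$ the gain, $\beta$ the bias, $\gamma$ the gamma exponent, and $\varepsilon$ a small stabilizer, chosen to line up with the `gain`, `bias`, `gamma`, and `eps0` variables in the RunMat listing.

```latex
\begin{aligned}
\mu_b &= \frac{1}{HW}\sum_{h,w} x_{b,h,w} \\
\sigma_b &= \sqrt{\frac{1}{HW}\sum_{h,w}\bigl(x_{b,h,w}-\mu_b\bigr)^2 + \varepsilon} \\
y_{b,h,w} &= g\,\frac{x_{b,h,w}-\mu_b}{\sigma_b} + \beta \\
z_{b,h,w} &= y_{b,h,w}^{\gamma} \\
\mathrm{MSE} &= \frac{1}{BHW}\sum_{b,h,w}\bigl(z_{b,h,w}-x_{b,h,w}\bigr)^2
\end{aligned}
```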


Results

RunMat is up to 10× faster than NumPy (at B = 64) and 1.8× to 5.6× faster than PyTorch

4K Image Pipeline Perf Sweep (B = batch size)

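| B  | RunMat (ms) | PyTorch (ms) | NumPy (ms) | NumPy ÷ RunMat | PyTorch ÷ RunMat |
|----|-------------|--------------|------------|----------------|------------------|
| 4  | 142.97      | 801.29       | 500.34     | 3.50×          | 5.60×            |
| 8  | 212.77      | 808.92       | 939.27     | 4.41×          | 3.80×            |
| 16 | 241.56      | 907.73       | 1783.47    | 7.38×          | 3.76×            |
| 32 | 389.25      | 1141.92      | 3605.95    | 9.26×          | 2.93×            |
| 64 | 683.54      | 1203.20      | 6958.28    | 10.18×         | 1.76×            |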
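
The ratio columns can be re-derived from the raw timings. The short script below does that, and also prints the time per image, which is a derived figure that the sweep itself does not report. The timings are copied from the sweep above; the variable names are illustrative.

```python
# Timings copied from the sweep above: B -> (RunMat, PyTorch, NumPy), in ms.
timings_ms = {
    4: (142.97, 801.29, 500.34),
    8: (212.77, 808.92, 939.27),
    16: (241.56, 907.73, 1783.47),
    32: (389.25, 1141.92, 3605.95),
    64: (683.54, 1203.20, 6958.28),
}

speedups = {}   # B -> (NumPy / RunMat, PyTorch / RunMat)
per_image = {}  # B -> (RunMat, PyTorch, NumPy) ms per image

for b, (runmat, pytorch, numpy_ms) in timings_ms.items():
    speedups[b] = (numpy_ms / runmat, pytorch / runmat)
    per_image[b] = (runmat / b, pytorch / b, numpy_ms / b)
    print(
        f"B={b:>2}  NumPy/RunMat={speedups[b][0]:5.2f}x  "
        f"PyTorch/RunMat={speedups[b][1]:4.2f}x  "
        f"RunMat per image={per_image[b][0]:6.2f} ms"
    )
```

On these numbers, RunMat's cost per image drops from about 35.7 ms at B = 4 to about 10.7 ms at B = 64, while NumPy stays above 100 ms per image at every batch size. That is why the NumPy ratio keeps growing with B.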

Core implementation in RunMat (MATLAB-syntax)

The script below implements this pipeline for a batch of B = 16 single‑precision 4K images (H = 2160, W = 3840).

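% Fixed seed and problem size: B images of H x W pixels (4K frames).
rng(0); B=16; H=2160; W=3840;
% Pipeline constants, kept in single precision to match the image data.
gain=single(1.0123); bias=single(-0.02); gamma=single(1.8); eps0=single(1e-6);

imgs = rand(B, H, W, 'single');                     % synthetic input batch
mu = mean(imgs, [2 3]);                             % per-image mean, size B x 1 x 1
sigma = sqrt(mean((imgs - mu).^2, [2 3]) + eps0);   % per-image std; eps0 guards against division by zero
out = ((imgs - mu) ./ sigma) * gain + bias;         % standardize, then apply gain and bias
out = out .^ gamma;                                 % gamma curve
mse = mean((out - imgs).^2, 'all');                 % QC metric against the input

fprintf('RESULT_ok MSE=%.6e\n', double(mse));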
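
For readers more at home in NumPy, here is a minimal sketch of the same steps on a small batch. It is not the NumPy script used in the benchmark. The function name `preprocess` and the tiny 64 × 96 image size are illustrative. The sign‑preserving gamma step is also an assumption of this sketch: NumPy returns NaN when a negative base is raised to a fractional exponent, and standardized pixels are negative roughly half the time.

```python
import numpy as np


def preprocess(imgs, gain=1.0123, bias=-0.02, gamma=1.8, eps0=1e-6):
    """Standardize each image in a (B, H, W) batch, then apply gain/bias and gamma."""
    imgs = imgs.astype(np.float32, copy=False)

    # Per-image statistics: reduce over H and W, keep the reduced axes so the
    # (B, 1, 1) results broadcast against the (B, H, W) batch.
    mu = imgs.mean(axis=(1, 2), keepdims=True)
    var = ((imgs - mu) ** 2).mean(axis=(1, 2), keepdims=True)
    sigma = np.sqrt(var + np.float32(eps0))

    norm = (imgs - mu) / sigma
    out = norm * np.float32(gain) + np.float32(bias)

    # Assumption of this sketch: apply gamma to the magnitude and restore the
    # sign, so negative standardized values do not turn into NaN.
    out = np.sign(out) * np.abs(out) ** np.float32(gamma)

    mse = float(np.mean((out - imgs) ** 2))
    return norm, out, mse


rng = np.random.default_rng(0)
imgs = rng.random((4, 64, 96), dtype=np.float32)  # small stand-in for 4K frames
norm, out, mse = preprocess(imgs)
print(f"MSE={mse:.6e}")
```

The `axis=(1, 2), keepdims=True` reductions play the role of `mean(imgs, [2 3])` in the listing above: both leave one value per image in a shape that broadcasts back over the full batch.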

Full sources: see the benchmarks directory in the RunMat repo on GitHub (runmat-org/runmat/benchmarks).

Note: MATLAB’s license agreement restricts usage of their runtime for benchmarking, so we do not include MATLAB runs. If you have numbers, consider sharing them on GitHub Discussions.


Why RunMat is fast (accelerate + fusion)

RunMat fuses elementwise stages and keeps tensors resident on device between steps, while random number generation and updates execute in large, coalesced kernels—a strong fit for GPUs. For the big picture on fusion and residency, see the Introduction to RunMat on the GPU document.
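
To get a feel for why fusion matters at this size, the sketch below estimates memory traffic for the elementwise stage under an idealized model. It assumes five elementwise operations (subtract, divide, multiply, add, power), each reading and writing one full tensor when unfused, and it ignores caches and the tiny per‑image statistics. It is a back‑of‑envelope calculation, not a description of RunMat's actual kernels.

```python
# Back-of-envelope memory traffic for the elementwise stage (idealized model).
B, H, W = 16, 2160, 3840
BYTES_PER_ELEM = 4  # single precision

tensor_bytes = B * H * W * BYTES_PER_ELEM

# Assumed elementwise ops: subtract mean, divide by sigma, multiply by gain,
# add bias, raise to gamma.
n_ops = 5

# Unfused: every op reads one full tensor and writes one full tensor.
unfused_bytes = n_ops * 2 * tensor_bytes
# Fused: one kernel reads the input once and writes the output once.
fused_bytes = 2 * tensor_bytes

print(f"one tensor : {tensor_bytes / 1e9:.2f} GB")
print(f"unfused    : {unfused_bytes / 1e9:.2f} GB")
print(f"fused      : {fused_bytes / 1e9:.2f} GB")
print(f"ratio      : {unfused_bytes / fused_bytes:.1f}x")
```

At B = 16 a single tensor is about 0.53 GB, so under this model the unfused chain moves roughly five times as much data as one fused pass.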


Reproduce the benchmarks

See the benchmarks directory in the RunMat repo on GitHub for the full source code and instructions to reproduce the benchmarks: runmat-org/runmat/benchmarks.
