What does the trace function do in MATLAB / RunMat?
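trace(A) returns the sum of the elements on the main diagonal of A. RunMat accepts scalars,
vectors, rectangular matrices, logical masks, and complex inputs, and the result matches MATLAB
wherever MATLAB itself accepts the input (MathWorks MATLAB's trace requires a square matrix). When the argument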
is a gpuArray, RunMat keeps the result on the GPU whenever the active provider exposes the
required hooks.
How does the trace function behave in MATLAB / RunMat?
- Operates on the leading two dimensions. Higher dimensions must be singleton; otherwise an error is raised.
- Works for non-square matrices by summing up to min(size(A, 1), size(A, 2)).
- Scalars (real or complex) return their own value.
- Logical inputs are promoted to double precision (true → 1.0, false → 0.0).
- Complex inputs retain both real and imaginary parts in the result.
- Empty matrices yield 0. Empty complex matrices yield 0 + 0i.
- gpuArray inputs stay on the device when the provider implements diagonal extraction and sum reductions; otherwise RunMat gathers once, computes on the host, and uploads a 1×1 scalar.
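The rules above can be sketched as a small reference function. This is only an illustration of the documented semantics, not RunMat's implementation (which lives in the Rust runtime); the name trace_ref and its error identifier are hypothetical.

```matlab
% Usage: each call exercises one of the documented rules.
t_square = trace_ref([1 2 3; 4 5 6; 7 8 9]);    % 1 + 5 + 9 = 15
t_rect   = trace_ref([4 2; 1 3; 5 6]);          % 4 + 3 = 7
t_logic  = trace_ref(logical([1 0; 1 1]));      % promoted to double: 2
t_empty  = trace_ref(zeros(0, 5));              % empty diagonal: 0

% trace_ref is a hypothetical helper that mirrors the rules listed above.
function t = trace_ref(A)
    % Trailing singleton dimensions are dropped by ndims, so anything
    % above 2 means a non-singleton higher dimension.
    if ndims(A) > 2
        error('trace_ref:ndims', 'Input must be a 2-D matrix.');
    end
    n = min(size(A, 1), size(A, 2));        % diagonal length
    idx = (0:n-1) * (size(A, 1) + 1) + 1;   % linear indices of A(k, k)
    d = A(idx);
    t = sum(double(d(:)));                  % logical -> double, complex kept
end
```

Linear indexing is used instead of sum(diag(A)) because diag builds a matrix when it is given a vector, which would not return a scalar for vector inputs.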
trace Function GPU Execution Behaviour
- When the input already lives on the GPU and the active provider exposes both diag_extract and reduce_sum, RunMat extracts the diagonal on device and performs the reduction there, returning a 1×1 gpuArray that stays resident for downstream work.
- If either hook is missing or the provider declines (unsupported precision, shape, or size), RunMat gathers the matrix exactly once, computes the diagonal sum on the CPU, and uploads the scalar back to the provider so subsequent GPU-friendly code keeps running on device memory.
- Mixed-residency calls automatically upload host matrices before these steps, matching MATLAB's gpuArray behaviour while letting the auto-offload planner decide which tier benefits the most.
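The two paths described above can be modelled with a toy dispatcher. Everything here is an assumption made for illustration: the provider struct, its function-handle fields, and trace_dispatch are hypothetical stand-ins, and plain host arrays play the role of device buffers.

```matlab
A = [1 2 3; 4 5 6; 7 8 9];

% Hypothetical provider exposing both hooks named in the text.
fullProvider = struct( ...
    'diag_extract', @(M) M((0:min(size(M)) - 1) * (size(M, 1) + 1) + 1), ...
    'reduce_sum',   @(v) sum(v(:)));

% Hypothetical provider missing diag_extract, which forces the fallback.
partialProvider = struct('reduce_sum', @(v) sum(v(:)));

[t1, path1] = trace_dispatch(A, fullProvider);      % device path
[t2, path2] = trace_dispatch(A, partialProvider);   % host fallback path

function [t, path] = trace_dispatch(A, provider)
    if isfield(provider, 'diag_extract') && isfield(provider, 'reduce_sum')
        % Both hooks present: extract and reduce without leaving the "device".
        t = provider.reduce_sum(provider.diag_extract(A));
        path = 'device';
    else
        % A hook is missing: one gather, host computation, then a real
        % runtime would upload the 1x1 result again.
        host = A;                                   % stands in for the gather
        n = min(size(host, 1), size(host, 2));
        d = host((0:n-1) * (size(host, 1) + 1) + 1);
        t = sum(d(:));
        path = 'host-fallback';
    end
end
```

Both calls return the same value; only the path label differs, which is the point of the fallback design.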
Examples of using the trace function in MATLAB / RunMat
Summing the diagonal of a square matrix
A = [1 2 3; 4 5 6; 7 8 9];
t = trace(A);
Expected output:
t = 15
Computing the trace of a rectangular matrix
B = [4 2; 1 3; 5 6];
result = trace(B);
Expected output:
result = 7
Getting the trace of a triangular matrix
U = [4 1 2; 0 5 3; 0 0 6];
tri_trace = trace(U);
Expected output:
tri_trace = 15
Working with complex-valued matrices
Z = [1+2i 2; 3 4-5i];
zTrace = trace(Z);
Expected output:
zTrace = 5.0000 - 3.0000i
Tracing a gpuArray without gathering
G = gpuArray(rand(1024));
gpuResult = trace(G); % stays on the GPU
scalarHost = gather(gpuResult);
scalarHost holds the diagonal sum of the random matrix G (close to 512 on average, since the
diagonal has 1024 entries drawn uniformly from (0, 1)). The value is computed on the GPU whenever
the provider supports diagonal extraction plus reductions.
Handling empty matrices safely
E = zeros(0, 5);
value = trace(E);
Expected output:
value = 0
GPU residency in RunMat (Do I need gpuArray?)
You usually do NOT need to call gpuArray yourself in RunMat (unlike MATLAB).
The auto-offload planner keeps residency on the GPU when expressions benefit from it. When the
active provider exposes both diag_extract and reduce_sum, trace executes entirely on the GPU.
If either hook is missing, RunMat performs a single gather, computes the scalar on the CPU, and
uploads a 1×1 result back to the device so downstream fused expressions continue to operate on GPU
data.
To preserve backwards compatibility with MathWorks MATLAB—and for situations where you want to
explicitly manage residency—you can wrap inputs with gpuArray. This mirrors MATLAB while still
letting RunMat's planner decide whether the GPU offers an advantage for the surrounding code.
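As a brief illustration of the two styles, the sketch below computes the same trace with and without explicit gpuArray calls. The matrix size and variable names are arbitrary; in MathWorks MATLAB the gpuArray form needs Parallel Computing Toolbox, and which tier RunMat actually picks for the first form depends on the planner and provider.

```matlab
A = rand(256);

% Implicit residency: no gpuArray call, RunMat's planner chooses the tier.
t_auto = trace(A * A');

% Explicit residency: MATLAB-style code that manages the device itself.
G = gpuArray(A);
t_gpu = gather(trace(G * G'));

% The two results should agree up to floating-point rounding.
rel_diff = abs(t_auto - t_gpu) / abs(t_auto);
```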
FAQ
What happens if my matrix is not square?
trace sums along the main diagonal up to min(m, n). MathWorks MATLAB's own trace rejects non-square input, so this behaviour is specific to RunMat.
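For code that has to run in more than one environment, the same min(m, n) diagonal sum can be written with diag and sum, which accept rectangular matrices. This is a sketch of an equivalent formulation, not a statement about how RunMat computes the result.

```matlab
B = [4 2; 1 3; 5 6];          % 3-by-2, so the diagonal has min(3, 2) = 2 entries

d = diag(B);                  % main diagonal as a column vector: [4; 3]
portable_trace = sum(d);      % 4 + 3 = 7
```

This form is only equivalent for true matrices; when diag receives a vector it builds a diagonal matrix instead of extracting one element.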
Does trace accept higher-dimensional arrays?
Only when trailing dimensions are singleton. Otherwise it raises an error because MATLAB restricts trace to 2-D matrix slices.
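A short sketch of what this means in practice: a 3-D array with more than one page is rejected, but each 2-D page can be traced separately. The variable names are illustrative, and the exact error message is not specified here.

```matlab
A3 = cat(3, [1 2; 3 4], [5 6; 7 8]);    % 2-by-2-by-2 array

% Calling trace on the whole array is expected to raise an error.
raised = false;
try
    trace(A3);
catch err
    raised = true;
    disp(err.message);
end

% Tracing each 2-D page individually works.
page_traces = zeros(1, size(A3, 3));
for k = 1:size(A3, 3)
    page_traces(k) = trace(A3(:, :, k));    % 1 + 4 = 5, then 5 + 8 = 13
end
```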
How are logical inputs handled?
Logical values are promoted to double precision (0.0 or 1.0) before summing, mirroring MATLAB semantics.
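For example, tracing a logical mask counts the true entries on its diagonal and returns that count as a double (the mask below is arbitrary):

```matlab
mask = logical([1 0 1; 0 1 0; 1 0 0]);

t = trace(mask);      % diagonal is true, true, false, so the sum is 2
cls = class(t);       % 'double', not 'logical'
```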
What is returned for empty inputs?
Empty real matrices produce 0; empty complex matrices produce 0 + 0i, exactly like MATLAB.
Does the result stay on the GPU?
Yes, when the provider implements the required hooks. Otherwise RunMat re-uploads the scalar so later GPU-friendly code still sees a gpuArray.
Can I call trace on complex data?
Yes. The result is a complex scalar whose real part is the sum of the diagonal's real parts and whose imaginary part is the sum of its imaginary parts.
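The part-by-part behaviour can be checked by tracing the real and imaginary parts on their own, reusing the matrix from the complex example earlier on this page:

```matlab
Z = [1+2i 2; 3 4-5i];

t = trace(Z);                 % (1 + 4) + (2 - 5)i = 5 - 3i
re_part = trace(real(Z));     % 1 + 4 = 5
im_part = trace(imag(Z));     % 2 + (-5) = -3
```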
Is there any precision loss with large matrices?
trace accumulates in double precision (f64), matching MATLAB's default numeric type.
Does trace modify the input matrix?
No. It reads the diagonal and returns a new scalar without altering the original matrix or its residency.
How does trace interact with sparse matrices?
Sparse support is planned; current releases treat all inputs as dense arrays.
Can I rely on trace inside fused GPU expressions?
Fused kernels treat trace as a scalar reduction boundary. The planner emits GPU kernels when hooks are available; otherwise it falls back gracefully.
See Also
diag, sum, mtimes, gpuArray, gather
Source & Feedback
- Implementation: crates/runmat-runtime/src/builtins/math/linalg/ops/trace.rs
- Found a behavioural difference? Please open an issue with details and a minimal repro.