I have a 3 dimensional array
x = rand(6,6,2^10)
I want to multiply each matrix along the third dimension by a vector. Is there a more clean way to do this than:
y = rand(6,1)
z = zeros(6,1,2^10)
for i in 1:2^10
z[:,:,i] = x[:,:,i] * y
end
If you are working with matrices, it may be appropriate to consider x as a vector of matrices instead of a 3D array. Then you could do
x = [rand(6,6) for _ in 1:2^10]
y = [rand(6)]
z = x .* y
z is now a vector of vectors.
And if z is preallocated, that would be
z .= x .* y
And, if you want it really fast, use vectors of StaticArrays
using StaticArrays
x = [@SMatrix rand(6, 6) for _ in 1:2^10]
y = [@SVector rand(6)]
z = x .* y
That's showing a 10x speedup on my computer, running in 12us.
mapslices(i->i*y, x, (1,2)) is maybe "cleaner" but it will be slower.
Read as: apply the function "times by y" to each slice of the first two dimensions.
function tst(x,y)
z = zeros(6,1,2^10)
for i in 1:2^10
z[:,:,i] = x[:,:,i] * y
end
return z
end
tst2(x,y) = mapslices(i->i*y, x, (1,2))
time tst(x,y);
0.002152 seconds (4.10 k allocations: 624.266 KB)
@time tst2(x,y);
0.005720 seconds (13.36 k allocations: 466.969 KB)
sum(x.*y',2) is a clean short solution.
It also has good speed and memory properties. The trick is to view matrix-vector multiplication as a linear combination of matrix's columns scaled by the vector elements. Instead of doing each linear combination for matrix x[:,:,i], we use the same scale y[i] for x[:,i,:]. In code:
const x = rand(6,6,2^10);
const y = rand(6,1);
function tst(x,y)
z = zeros(6,1,2^10)
for i in 1:2^10
z[:,:,i] = x[:,:,i]*y
end
return z
end
tst2(x,y) = mapslices(i->i*y,x,(1,2))
tst3(x,y) = sum(x.*y',2)
Benchmarking gives:
julia> using BenchmarkTools
julia> z = tst(x,y); z2 = tst2(x,y); z3 = tst3(x,y);
julia> @benchmark tst(x,y)
BenchmarkTools.Trial:
memory estimate: 688.11 KiB
allocs estimate: 8196
--------------
median time: 759.545 μs (0.00% GC)
samples: 6068
julia> @benchmark tst2(x,y)
BenchmarkTools.Trial:
memory estimate: 426.81 KiB
allocs estimate: 10798
--------------
median time: 1.634 ms (0.00% GC)
samples: 2869
julia> @benchmark tst3(x,y)
BenchmarkTools.Trial:
memory estimate: 336.41 KiB
allocs estimate: 12
--------------
median time: 114.060 μs (0.00% GC)
samples: 10000
So tst3 using sum has better performance (~7x over tst and ~15x over tst2).
Using StaticArrays as suggested by @DNF is also an option, and it would be nice to compare it to the solutions here.