2 votes

I am a Julia newbie, and I have a baby assignment to write a function that converts a vector of vectors to a matrix. This was pretty easy to do by iterating over the elements.

However, I have read that broadcasting tends to be more efficient, but I wasn't sure how to do it here: a .= assignment cannot work, because the vector of vectors broadcasts as a single n-element array, so the operation would involve two arrays of different shapes.

Is there a way to broadcast?
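
To make the mismatch concrete, here is a minimal sketch (the small vecvec and sizes are made up for illustration):

vecvec = [[1, 2, 3], [4, 5, 6]]
M = zeros(Int, 2, 3)

size(M)        # (2, 3)
size(vecvec)   # (2,) -- a 2-element vector whose elements are themselves vectors
# M .= vecvec  # errors: broadcasting pairs each whole inner vector with a single
#              # Int slot of M instead of filling the rows element by element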

My code is below

function vecvec_to_matrix(vecvec)
    dim1 = length(vecvec)
    dim2 = length(vecvec[1])
    my_array = zeros(Int64, dim1, dim2)
    for i in 1:dim1
        for j in 1:dim2
            my_array[i,j] = vecvec[i][j]
        end
    end
    return my_array
end
The biggest performance bottleneck in your code is memory layout. You store the vectors from vecvec as rows of my_array, and in Julia it is more efficient to process data by columns (so if you stored them as columns instead, e.g. with reduce(hcat, vecvec), it would be faster). – Bogumił Kamiński

Broadcasting is not more efficient than iterating; broadcasting does iteration behind the scenes anyway. – DNF
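
For example, a minimal sketch of the column-oriented layout suggested above (permutedims is only needed if you really want the inner vectors as rows, as in the original function):

vecvec = [[1, 2, 3], [4, 5, 6]]

cols = reduce(hcat, vecvec)   # 3×2 Matrix: each inner vector becomes a column
rows = permutedims(cols)      # 2×3 Matrix: same result as vecvec_to_matrix(vecvec)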

2 Answers

3 votes

If your vectors are short and of fixed size (e.g., a list of points in 3 dimensions), then you should strongly consider using the StaticArrays package and then calling reinterpret. Demo:

julia> using StaticArrays

julia> A = rand(3, 8)
3×8 Array{Float64,2}:
 0.153872  0.361708  0.39703   0.405625  0.0881371  0.390133  0.185328  0.585539
 0.467841  0.846298  0.884588  0.798848  0.14218    0.156283  0.232487  0.22629
 0.390566  0.897737  0.569882  0.491681  0.499163   0.377012  0.140902  0.513979

julia> reinterpret(SVector{3,Float64}, A)
1×8 reinterpret(SArray{Tuple{3},Float64,1,3}, ::Array{Float64,2}):
 [0.153872, 0.467841, 0.390566]  [0.361708, 0.846298, 0.897737]  [0.39703, 0.884588, 0.569882]  …  [0.390133, 0.156283, 0.377012]  [0.185328, 0.232487, 0.140902]  [0.585539, 0.22629, 0.513979]

julia> B = vec(copy(ans))
8-element Array{SArray{Tuple{3},Float64,1,3},1}:
 [0.1538721224514592, 0.467840786943454, 0.39056612358281706]
 [0.3617079493961777, 0.8462982350893753, 0.8977366743282564]
 [0.3970299970547111, 0.884587972864584, 0.5698823030478959]
 [0.40562472747685074, 0.7988484677138279, 0.49168126614394647]
 [0.08813706434793178, 0.14218012559727544, 0.499163319341982]
 [0.3901332827772166, 0.15628284837250006, 0.3770117394226711]
 [0.18532803309577517, 0.23248748941275688, 0.14090166962667428]
 [0.5855387782654986, 0.22628968661452897, 0.5139790762185006]

julia> reshape(reinterpret(Float64, B), (3, 8))
3×8 reshape(reinterpret(Float64, ::Array{SArray{Tuple{3},Float64,1,3},1}), 3, 8) with eltype Float64:
 0.153872  0.361708  0.39703   0.405625  0.0881371  0.390133  0.185328  0.585539
 0.467841  0.846298  0.884588  0.798848  0.14218    0.156283  0.232487  0.22629
 0.390566  0.897737  0.569882  0.491681  0.499163   0.377012  0.140902  0.513979
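
Applied to a vector of ordinary Vectors like the one in the question, a sketch of the same idea (assuming all inner vectors share the same small, fixed length; the helper name is my own) is to convert the inner vectors to SVectors and then reinterpret:

using StaticArrays

function vecvec_to_matrix_static(vecvec)
    n = length(vecvec[1])
    svecs = [SVector{n}(v) for v in vecvec]   # Vector of SVector{n, eltype}
    # reinterpret + reshape views the same memory as an n × length(vecvec) matrix;
    # each inner vector becomes a column (the layout recommended in the comments above)
    reshape(reinterpret(eltype(first(vecvec)), svecs), (n, length(vecvec)))
end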
3 votes

Your way is intuitive and already fast. You can squeeze out a little more with @inbounds, and that's about it. vcat is also fast. I don't think broadcasting is necessary in your case. Here are benchmarks of the various approaches I can think of:

function vecvec_to_matrix(vecvec)
    dim1 = length(vecvec)
    dim2 = length(vecvec[1])
    my_array = zeros(Int64, dim1, dim2)
    for i in 1:dim1
        for j in 1:dim2
            my_array[i,j] = vecvec[i][j]
        end
    end
    return my_array
end

function vecvec_to_matrix2(vecvec::AbstractVector{T}) where T <: AbstractVector
    dim1 = length(vecvec)
    dim2 = length(vecvec[1])
    my_array = Array{eltype(vecvec[1]), 2}(undef, dim1, dim2)
    @inbounds @fastmath for i in 1:dim1, j in 1:dim2
        my_array[i,j] = vecvec[i][j]
    end
    return my_array
end

function vecvec_to_matrix3(vecvec::AbstractVector{T}) where T <: AbstractVector
    dim1 = length(vecvec)
    dim2 = length(vecvec[1])
    my_array = Array{eltype(vecvec[1]), 2}(undef, dim1, dim2)
    Threads.@threads for i in 1:dim1
        for j in 1:dim2
            my_array[i,j] = vecvec[i][j]
        end
    end
    return my_array
end

using Tullio

function using_tullio(vecvec::AbstractVector{T}) where T <: AbstractVector
    dim1 = length(vecvec)
    dim2 = length(vecvec[1])
    my_array = Array{eltype(vecvec[1]), 2}(undef, dim1, dim2)

    @tullio my_array[i, j] = vecvec[i][j]

    my_array
end

function using_vcat(vecvec::AbstractVector{T}) where T <: AbstractVector
    # splats the inner vectors into vcat; note this concatenates them into one
    # long Vector rather than a dim1 × dim2 Matrix
    vcat(vecvec...)
end

using BenchmarkTools
vecvec = [rand(Int, 100) for i in 1:100];
@benchmark vecvec_to_matrix(vecvec)
@benchmark vecvec_to_matrix2(vecvec)
@benchmark vecvec_to_matrix3(vecvec)
@benchmark using_tullio(vecvec)
@benchmark using_vcat(vecvec)

with the following results:

julia> @benchmark vecvec_to_matrix(vecvec)
BenchmarkTools.Trial:
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     12.701 μs (0.00% GC)
  median time:      15.001 μs (0.00% GC)
  mean time:        24.465 μs (10.98% GC)
  maximum time:     3.884 ms (98.30% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark vecvec_to_matrix2(vecvec)
BenchmarkTools.Trial:
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     8.600 μs (0.00% GC)
  median time:      9.800 μs (0.00% GC)
  mean time:        19.532 μs (12.37% GC)
  maximum time:     3.834 ms (98.82% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark vecvec_to_matrix3(vecvec)
BenchmarkTools.Trial:
  memory estimate:  83.28 KiB
  allocs estimate:  32
  --------------
  minimum time:     8.399 μs (0.00% GC)
  median time:      14.600 μs (0.00% GC)
  mean time:        28.178 μs (11.82% GC)
  maximum time:     8.269 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark using_tullio(vecvec)
BenchmarkTools.Trial:
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     8.299 μs (0.00% GC)
  median time:      10.101 μs (0.00% GC)
  mean time:        19.476 μs (12.15% GC)
  maximum time:     3.661 ms (98.74% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark using_vcat(vecvec)
BenchmarkTools.Trial: 
  memory estimate:  78.20 KiB
  allocs estimate:  2
  --------------
  minimum time:     5.540 μs (0.00% GC)
  median time:      7.480 μs (0.00% GC)
  mean time:        16.236 μs (15.30% GC)
  maximum time:     876.400 μs (97.85% GC)
  --------------
  samples:          10000
  evals/sample:     5
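
As noted in the comment on using_vcat, the splatted vcat returns one long Vector. A sketch of a vcat-based variant that does return the row layout of the original function (the name is my own):

function using_vcat_rows(vecvec::AbstractVector{T}) where T <: AbstractVector
    # permutedims turns each length-n vector into a 1×n row,
    # and reduce(vcat, ...) stacks the rows into a dim1 × dim2 Matrix
    reduce(vcat, permutedims.(vecvec))
end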