2 votes

I am working on a program that requires me to assign and invert a large 100 x 100 matrix 𝐌 inside a loop that repeats roughly 1000 times.

I was originally using the inv() function, but since it is taking a lot of time, I want to optimize my program to make it run faster. Hence I wrote some dummy code as a test to see what could be slowing things down:

using LinearAlgebra

function test1()
  for i in (1:100)
    𝐌 = Diagonal(rand(100))
    inverse_𝐌 = inv(B)   # B here is a typo; see EDIT below
  end
end

using BenchmarkTools
@benchmark test1()
---------------------------------
BenchmarkTools.Trial: 
memory estimate:  178.13 KiB
allocs estimate:  400
--------------
minimum time:     67.991 μs (0.00% GC)
median time:      71.032 μs (0.00% GC)
mean time:        89.125 μs (19.43% GC)
maximum time:     2.490 ms (96.64% GC)
--------------
samples:          10000
evals/sample:     1

When I use the '\' operator to evaluate the inverse:

function test2()
  for i in (1:100)
    𝐌 = Diagonal(rand(100))
    inverse_𝐌 = 𝐌 \ Diagonal(ones(100))
  end
end
using BenchmarkTools
@benchmark test2()
-----------------------------------
BenchmarkTools.Trial: 
memory estimate:  267.19 KiB
allocs estimate:  600
--------------
minimum time:     53.728 μs (0.00% GC)
median time:      56.955 μs (0.00% GC)
mean time:        84.430 μs (30.96% GC)
maximum time:     2.474 ms (96.95% GC)
--------------
samples:          10000
evals/sample:     1

I can see that inv() allocates less memory than the '\' operator, although '\' is faster overall.

Is this because I am using an extra identity matrix, Diagonal(ones(100)), in test2()? Does it mean that every time I run my loop, a new portion of memory is allocated to store the identity matrix?

My original matrix 𝐌 is a tridiagonal matrix. Does inverting a matrix with such a large number of zeros cost more memory allocation? For such a matrix, which is better to use: inv() or the '\' operator, or is there some other better strategy?

P.S.: How does inverting matrices in Julia compare to other languages like C and Python? When I ran the same algorithm in my older program written in C, it took considerably less time, so I was wondering if the inv() function was the culprit here.


EDIT:

So, as pointed out, I had made a typo while typing in the test1() function. It's actually:

function test1()
  for i in (1:100)
    𝐌 = Diagonal(rand(100))
    inverse_𝐌 = inv(𝐌)
  end
end

However, my problem stayed the same: the test1() function allocates less memory but takes more time:

using BenchmarkTools
@benchmark test1()

BenchmarkTools.Trial: 
memory estimate:  178.13 KiB
allocs estimate:  400
--------------
minimum time:     68.640 μs (0.00% GC)
median time:      71.240 μs (0.00% GC)
mean time:        90.468 μs (20.23% GC)
maximum time:     3.455 ms (97.41% GC)
--------------
samples:          10000
evals/sample:     1

using BenchmarkTools

@benchmark test2()

BenchmarkTools.Trial: 
memory estimate:  267.19 KiB
allocs estimate:  600
--------------
minimum time:     54.368 μs (0.00% GC)
median time:      57.162 μs (0.00% GC)
mean time:        86.380 μs (31.68% GC)
maximum time:     3.021 ms (97.52% GC)
--------------
samples:          10000
evals/sample:     1

I also tested some other variants of the test2() function:

function test3()
  for i in (1:100)
    𝐌 = Diagonal(rand(100))
    𝐁 = Diagonal(ones(100))
    inverse_𝐌 = 𝐌 \ 𝐁
  end
end


function test4(𝐁)
  for i in (1:100)
    𝐌 = Diagonal(rand(100))
    inverse_𝐌 = 𝐌 \ 𝐁
  end
end

using BenchmarkTools

@benchmark test3()

BenchmarkTools.Trial: 
memory estimate:  267.19 KiB
allocs estimate:  600
--------------
minimum time:     54.248 μs (0.00% GC)
median time:      57.120 μs (0.00% GC)
mean time:        86.628 μs (32.01% GC)
maximum time:     3.151 ms (97.23% GC)
--------------
samples:          10000
evals/sample:     1

using BenchmarkTools

@benchmark test4(Diagonal(ones(100)))

BenchmarkTools.Trial: 
memory estimate:  179.02 KiB
allocs estimate:  402
--------------
minimum time:     48.556 μs (0.00% GC)
median time:      52.731 μs (0.00% GC)
mean time:        72.193 μs (25.48% GC)
maximum time:     3.015 ms (97.32% GC)
--------------
samples:          10000
evals/sample:     1

test2() and test3() are equivalent. I realised I could avoid the extra memory allocation in test2() by passing the identity matrix in as an argument instead, as done in test4(). It also speeds up the function.
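A further variant (my own sketch, not one of the original benchmarks; test5! is a hypothetical name) drops the identity matrix altogether: for a Diagonal, the inverse is just the elementwise reciprocal of the stored diagonal, so the output buffer can be preallocated and reused across iterations.

```julia
using LinearAlgebra

# Sketch only: test5! is a hypothetical variant, not from the post above.
# For a Diagonal matrix the inverse is the elementwise reciprocal of the
# stored diagonal, so no identity matrix is needed at all, and the output
# buffer can be reused across iterations.
function test5!(buf)
    for i in 1:100
        𝐌 = Diagonal(rand(100))
        buf .= inv.(𝐌.diag)   # in-place elementwise inversion
    end
    return buf
end

buf = zeros(100)
test5!(buf)
```

This trades the Diagonal wrapper for a raw vector of reciprocals, which is enough if the inverse is only consumed elementwise afterwards.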

No question about C here. – Swordfish
x=inv(𝐌)*b is slower than x=𝐌\b, since the first inverts the matrix (an O(n^3) operation) and then multiplies, while the second solves the system of equations directly (using Gauss-Seidel or LU), where the substitution phase is only an O(n^2) operation, and hence is much faster. – John Alexiou
As I have shown below, in this case inv will be a bit faster than \, because both inv and \ actually use O(n) operations, and inv allocates less in the implementation presented. @akira specifically asked about matrix inversion, not about solving a system of equations. If you needed to solve a system of equations and 𝐌 did not have a special structure, then what @ja72 has written would be true; but if I understand correctly, something else is required here. – Bogumił Kamiński
The \ operator actually works faster than inv for me. – akira

1 Answer

3 votes

The question you ask is tricky and the answer depends on the context. I can answer within your context, but if you post your real problem the answer might change.

So, to your question: the two codes are not equivalent, because in the first you use some matrix B in inv(B) which is undefined (and is probably a global, type-unstable variable). If you change B to 𝐌, the first code is actually a bit faster:

julia> function test1()
         for i in (1:100)
           𝐌 = Diagonal(rand(100))
           inverse_𝐌 = inv(𝐌)
         end
       end
test1 (generic function with 1 method)

julia> function test2()
         for i in (1:100)
           𝐌 = Diagonal(rand(100))
           inverse_𝐌 = 𝐌 \ Diagonal(ones(100))
         end
       end
test2 (generic function with 1 method)

julia> using BenchmarkTools

julia> @benchmark test1()
BenchmarkTools.Trial:
  memory estimate:  178.13 KiB
  allocs estimate:  400
  --------------
  minimum time:     28.273 μs (0.00% GC)
  median time:      32.900 μs (0.00% GC)
  mean time:        43.447 μs (14.28% GC)
  maximum time:     34.779 ms (99.70% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark test2()
BenchmarkTools.Trial:
  memory estimate:  267.19 KiB
  allocs estimate:  600
  --------------
  minimum time:     28.273 μs (0.00% GC)
  median time:      33.928 μs (0.00% GC)
  mean time:        45.907 μs (15.25% GC)
  maximum time:     34.718 ms (99.74% GC)
  --------------
  samples:          10000
  evals/sample:     1

Now, the second thing is that your code uses diagonal matrices, and Julia is smart enough to have specialized methods for inv and \ for this kind of matrix. Their definitions are as follows:

(\)(Da::Diagonal, Db::Diagonal) = Diagonal(Da.diag .\ Db.diag)

function inv(D::Diagonal{T}) where T
    Di = similar(D.diag, typeof(inv(zero(T))))
    for i = 1:length(D.diag)
        if D.diag[i] == zero(T)
            throw(SingularException(i))
        end
        Di[i] = inv(D.diag[i])
    end
    Diagonal(Di)
end

And you can see that such an example is not fully representative of the general case (if the matrices were not diagonal, other methods would be used). You can check which methods are used like this:

julia> @which 𝐌\Diagonal(ones(100))
\(Da::Diagonal, Db::Diagonal) in LinearAlgebra at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\LinearAlgebra\src\diagonal.jl:493

julia> @which inv(𝐌)
inv(D::Diagonal{T,V} where V<:AbstractArray{T,1}) where T in LinearAlgebra at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\LinearAlgebra\src\diagonal.jl:496

and look up the code yourself.

I assume that in your real exercise you do not have diagonal matrices. In particular, if you have block matrices you might have a look at the https://github.com/JuliaArrays/BlockArrays.jl package, as it might have optimized methods for your use case. You might also have a look at https://github.com/JuliaMatrices/BandedMatrices.jl.
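Since the matrix in the question is tridiagonal, the stdlib Tridiagonal type is also worth a look. A minimal sketch (my own, using an arbitrary diagonally dominant example so the matrix is guaranteed nonsingular): note that the inverse of a tridiagonal matrix is generally dense, so solving with \ keeps the structural advantage while forming the inverse does not.

```julia
using LinearAlgebra

# Tridiagonal stores only the three diagonals; \ dispatches to a
# specialized tridiagonal solve, while inv(T) would materialize a
# generally fully populated dense inverse. The shifted main diagonal
# makes the matrix diagonally dominant, hence nonsingular.
n = 100
T = Tridiagonal(rand(n - 1), rand(n) .+ 2.0, rand(n - 1))
b = rand(n)

x = T \ b   # structured solve; no dense inverse is ever formed
```

If the same tridiagonal matrix were reused across iterations, the factorization could also be computed once with factorize(T) and reused for every right-hand side.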

In summary: you can expect Julia to try to highly optimize the code for the specific use case, so in order to get a definitive answer the detailed specification of your problem matters. If you share it, a more specific answer can be given.