3
votes

In MATLAB, to delete all rows of a matrix that contain NaN values, I write:

myMatrix( any(isnan(myMatrix), 2), :) = [] 

Where:

any(isnan(myMatrix), 2) 

returns a logical vector.

Is there a way to do this in Julia?

I cannot seem to find one, so I am forced to write ugly loops.

3 Answers

3
votes

You can use broadcasting to achieve this:

julia> x = rand([NaN; 1:10], 10, 4)
10×4 Array{Float64,2}:
 4.0    9.0   2.0    6.0
 3.0   10.0   2.0    2.0
 3.0    1.0   3.0    6.0
 7.0    8.0   5.0   10.0
 5.0   10.0  10.0  NaN
 4.0    3.0   7.0    5.0
 1.0    8.0   9.0    4.0
 6.0  NaN     3.0    5.0
 9.0  NaN     7.0    1.0
 9.0    4.0   6.0   10.0

julia> x[.!any.(isnan, eachrow(x)), :]
7×4 Array{Float64,2}:
 4.0   9.0  2.0   6.0
 3.0  10.0  2.0   2.0
 3.0   1.0  3.0   6.0
 7.0   8.0  5.0  10.0
 4.0   3.0  7.0   5.0
 1.0   8.0  9.0   4.0
 9.0   4.0  6.0  10.0

or

julia> x[vec(.!any(isnan.(x), dims=2)), :]
7×4 Array{Float64,2}:
 4.0   9.0  2.0   6.0
 3.0  10.0  2.0   2.0
 3.0   1.0  3.0   6.0
 7.0   8.0  5.0  10.0
 4.0   3.0  7.0   5.0
 1.0   8.0  9.0   4.0
 9.0   4.0  6.0  10.0
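
Note that, unlike the MATLAB `= []` assignment, indexing in Julia returns a new array and leaves `x` untouched, so the result has to be assigned back if you want to "delete" the rows. A minimal sketch, assuming a small hand-written matrix; the helper name `drop_nan_rows` is hypothetical and not part of any library:

```julia
# Hypothetical helper: returns a copy of `m` without the rows that contain a NaN.
function drop_nan_rows(m::AbstractMatrix)
    keep = vec(.!any(isnan, m, dims=2))  # Bool vector, true for rows without NaN
    return m[keep, :]
end

x = [1.0 2.0; NaN 3.0; 4.0 5.0]
x = drop_nan_rows(x)   # rebind x to the filtered copy; the original array is not mutated
```

After this, `x` refers to a 2×2 matrix holding the first and third rows of the input.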
2
votes

Yes, just use a logical array as the index.

julia> v = rand(5)
5-element Array{Float64,1}:
 0.6377159558422454
 0.1205285547043713
 0.04902451987818024
 0.737928505686815
 0.34881071296002175

julia> i = v .> 0.5
5-element BitArray{1}:
 1
 0
 0
 1
 0

julia> v[i]
2-element Array{Float64,1}:
 0.6377159558422454
 0.737928505686815

The same thing works with 2D arrays:

julia> m = rand(3,2)
3×2 Array{Float64,2}:
 0.377744  0.0296205
 0.682967  0.366501
 0.906793  0.791147

julia> m[[true,true,false],:]
2×2 Array{Float64,2}:
 0.377744  0.0296205
 0.682967  0.366501

In Julia, the equivalent of any(isnan(myMatrix), 2) is any(isnan, myMatrix, dims=2). Since you want to remove those rows, what you actually need is a mask of the rows to keep: all(!isnan, myMatrix, dims=2). Either way, the result is a one-column 2D array, which you can't use directly as a row index. You can either convert it to a vector or map over the rows to get a vector directly:

julia> myMatrix = rand([NaN, 1:5...], 5,2)
5×2 Array{Float64,2}:
 1.0    3.0
 5.0  NaN
 4.0    1.0
 5.0  NaN
 1.0    2.0

julia> rowfilter = all(!isnan, myMatrix, dims=2)[:,1]
5-element Array{Bool,1}:
 1
 0
 1
 0
 1

julia> myMatrix[rowfilter, :]
3×2 Array{Float64,2}:
 1.0  3.0
 4.0  1.0
 1.0  2.0

or

julia> myMatrix[map(row-> all(!isnan, row), eachrow(myMatrix)), :]
3×2 Array{Float64,2}:
 1.0  3.0
 4.0  1.0
 1.0  2.0

or with broadcasting instead of map():

julia> myMatrix[all.(!isnan, eachrow(myMatrix)), :]
3×2 Array{Float64,2}:
 1.0  3.0
 4.0  1.0
 1.0  2.0
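
The shape issue mentioned above can be checked directly. A small sketch on a hand-written matrix (the variable names `mask2d` and `mask` are only illustrative):

```julia
m = [1.0 3.0; 5.0 NaN; 4.0 1.0]

mask2d = all(!isnan, m, dims=2)   # a reduction with dims keeps the reduced dimension
size(mask2d)                      # (3, 1)

mask = vec(mask2d)                # flatten to a plain vector; mask2d[:, 1] also works
size(mask)                        # (3,)

m[mask, :]                        # keeps rows 1 and 3
```

`vec` reshapes without copying the data, whereas `[:, 1]` allocates a new vector; for a mask of this size the difference is negligible.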
2
votes

This improves on Bogumił Kamiński's answer and adds some benchmarks.

This seems to be the fastest:

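# `!f` is shorthand for `x -> !f(x)`, so `!isnan` means `x -> !isnan(x)`
x[vec(all(!isnan, x, dims = 2)), :]

Benchmark code:

using BenchmarkTools

A = rand(1000, 1000)

# Broadcast `any` over the rows, then negate
function version1(x)
    x[.!any.(isnan, eachrow(x)), :]
end

# Build a full `isnan` mask, reduce along dims = 2, then negate
function version2(x)
    x[vec(.!any(isnan.(x), dims = 2)), :]
end

# Broadcast `all(!isnan, row)` over the rows
function version3(x)
    x[all.(!isnan, eachrow(x)), :]
end

# Reduce with `all(!isnan, ...)` along dims = 2, without a temporary mask
function version4(x)
    x[vec(all(!isnan, x, dims = 2)), :]
end

# `$A` interpolates the global so that @btime does not measure global-variable access
@btime version1($A);
@btime version2($A);
@btime version3($A);
@btime version4($A);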

Results, in the same order (version1 to version4):

5.439 ms (1011 allocations: 7.69 MiB)
6.370 ms (21 allocations: 7.75 MiB)
5.548 ms (1011 allocations: 7.69 MiB)
4.119 ms (14 allocations: 7.63 MiB)
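
Note that `A` above comes from `rand`, so it contains no NaN values and every row is kept during the benchmark. As a sanity check, the four versions can be compared on a small input that does contain NaN. This is only a sketch; the matrix `B` and the `expected` result are made up for illustration:

```julia
version1(x) = x[.!any.(isnan, eachrow(x)), :]
version2(x) = x[vec(.!any(isnan.(x), dims = 2)), :]
version3(x) = x[all.(!isnan, eachrow(x)), :]
version4(x) = x[vec(all(!isnan, x, dims = 2)), :]

# Rows 2 and 3 each contain a NaN and should be dropped
B = [1.0 2.0; NaN 3.0; 4.0 NaN; 5.0 6.0]
expected = [1.0 2.0; 5.0 6.0]

results = [f(B) for f in (version1, version2, version3, version4)]
all(r == expected for r in results)   # true
```

Timings on data with NaN rows may differ from the numbers above, since fewer rows are copied into the result.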