I would like to create a function that deals with missing values. However, when I tried to specify the missing type Array{Missing, 1}, it errors.

function f(x::Array{<:Number, 1})
    # do something complicated
    println("no missings.")
    println(sum(x))
end

function f(x::Array{Missing, 1})
    x = collect(skipmissing(x))
    # do something complicated
    println("removed missings.")
    f(x)
end

f([2, 3, 5])
f([2, 3, 5, missing])

I understand that my type is not Missing but Array{Union{Missing, Int64},1}

When I specify this type, it works in the case above. However, I would like to work with all types (strings, floats etc., not only Int64).

I tried

function f(x::Array{Any, 1})
    ...
end

But it errors again, saying:

f (generic function with 1 method)
ERROR: LoadError: MethodError: no method matching f(::Array{Union{Missing, Int64},1})
Closest candidates are:
  f(::Array{Any,1}) at ...

How can I say that I want the element type to be a union of Missing with any other type?


EDIT (reformulation)

Let's have these 4 vectors and two functions dealing with strings and numbers.

x1 = [1, 2, 3]
x2 = [1, 2, 3, missing]
x3 = ["1", "2", "3"]
x4 = ["1", "2", "3", missing]


function f(x::Array{<:Number,1})
    println(sum(x))
end
function f(x::Array{String,1})
    println(join(x))
end

f(x) doesn't work for x2 and x4, because they are of type Array{Union{Missing, Int64},1} and Array{Union{Missing, String},1}, respectively.

Is it possible to have only one function that detects whether the vector allows missings, removes them, and then dispatches to the appropriate method?

For instance:

function f(x::Array{Any, 1})
    x = collect(skipmissing(x))
    print("removed missings")
    f(x)
end

But this doesn't work, because Array{Any, 1} only matches arrays whose element type is exactly Any (e.g., a mix of strings and numbers); it does not mean "strings OR numbers OR whatever".


EDIT 2 Partial fix

This works:

function f(x::Array)
    x = collect(skipmissing(x))
    print("removed missings")
    f(x)
end

[But how, then, to specify the shape (number of dimensions) of the array...? (this might be an unrelated topic though)]

1 Answer

You can do it in the following way:

function f(x::Vector{<:Number})
    # do something complicated
    println("no missings.")
    println(sum(x))
end

function f(x::Vector{Union{Missing,T}}) where {T<:Number}
    x = collect(skipmissing(x))
    # do something complicated
    println("removed missings.")
    f(x)
end

and now it works:

julia> f([2, 3, 5])
no missings.
10

julia> f([2, 3, 5, missing])
removed missings.
no missings.
10
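Since the question also asks about strings, here is a minimal sketch of the same pattern with the bound on T dropped, so that a single method strips missings for any element type and then re-dispatches. The name `process` and the two leaf methods are illustrative assumptions, not part of the original code. Edge cases such as Vector{Any} (whose element type stays Any after skipmissing, so the stripping method would call itself forever) and all-missing vectors are not handled by this sketch.

```julia
# Hypothetical leaf methods: element types that cannot hold missing.
process(x::Vector{<:Number}) = sum(x)
process(x::Vector{<:AbstractString}) = join(x)

# One method for Union{Missing, T} with any T: drop the missings,
# then dispatch again on the now-narrower element type.
# Caveat (assumption of this sketch): do not call this with a Vector{Any}.
process(x::Vector{Union{Missing,T}}) where {T} = process(collect(skipmissing(x)))

println(process([1, 2, 3]))                  # numbers, no missings
println(process([1, 2, 3, missing]))         # numbers with a missing
println(process(["1", "2", "3", missing]))   # strings with a missing
```

The `where {T}` clause sits outside Vector, so T is matched separately for each array (Int64 in the second call, String in the third).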

EDIT:

I will try to answer the questions raised (if I miss something please add a comment).

First, Vector{Union{Missing, <:Number}} is the same as Vector{Union{Missing, Number}}. As tibL indicated, this follows from the scoping rules: Vector{Union{Missing, <:Number}} translates to Array{Union{Missing, T} where T<:Number,1}, so the where clause is inside Array rather than around it.

Second (here I am not sure if this is what you want), I understand you want the following behavior:
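The scoping difference can be checked directly with isa. This is only a small sketch; the variable names `v`, `inner` and `outer` are made up for illustration.

```julia
v = [1, 2, missing]   # element type is Union{Missing, Int}

# `<:Number` written inside Union is scoped to the Union, so the array's
# element type would have to be the whole Union{Missing, Number}.
# v's element type is narrower, and type parameters are invariant: no match.
inner = v isa Vector{Union{Missing, <:Number}}

# With the where clause outside Vector, T is chosen per array (here T = Int).
outer = v isa (Vector{Union{Missing, T}} where {T<:Number})

println(inner)   # false
println(outer)   # true
```

This is why the method signature above needs the explicit `where {T<:Number}` after the argument list.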

julia> g(x::Array{>:Missing,1}) = "$(eltype(x)) allows missing"
g (generic function with 1 method)

julia> g(x::Array{T,1}) where T = "$(eltype(x)) does not allow missing"
g (generic function with 2 methods)

julia> g([1,2,3])
"Int64 does not allow missing"

julia> g([1,2,missing])
"Union{Missing, Int64} allows missing"

julia> g(["a",'a'])
"Any allows missing"

julia> g(Union{String,Char}["a",'a'])
"Union{Char, String} does not allow missing"

Note the last two cases: although ["a", 'a'] does not contain missing, the array has element type Any, so it could hold a missing. The explicit Union{String,Char} element type in the last case rules that out.

Also you can see that you could change the second parameter of Array{T,N} to something else to get a different dimensionality.

This example works because the first method, being more specific, catches all element types that allow Missing, and the second method, being more general, catches what is left (i.e. essentially everything that does not allow Missing).
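On the dimensionality point, here is a minimal sketch that fixes the second parameter to 2 for matrices and leaves it free in a fallback. The function name `h` and the returned strings are illustrative assumptions.

```julia
# Matrices (N = 2) whose element type allows missing.
h(x::Array{>:Missing,2}) = "matrix, allows missing"

# Any other matrix.
h(x::Array{T,2}) where {T} = "matrix, does not allow missing"

# Fallback for every other dimensionality; N is available in the body.
h(x::Array{T,N}) where {T,N} = "$(N)-dimensional array"

println(h(Union{Missing,Int}[1 2; 3 missing]))   # matrix, allows missing
println(h([1 2; 3 4]))                           # matrix, does not allow missing
println(h([1, 2, 3]))                            # 1-dimensional array
```

Vector{T} and Matrix{T} are aliases for Array{T,1} and Array{T,2}, so the same signatures can be written with those names.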