How should I impute missing values in DataFrames.jl? E.g., for a given DataFrame, how do I turn all missings into 0? Thanks in advance!
1 vote
1 Answer
4 votes
Use coalesce and broadcasting. Assuming your data frame is stored in the variable df, just do:
df .= coalesce.(df, 0)
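As a minimal sketch of the line above, assuming DataFrames.jl is installed and using a made-up data frame (the column names a and b are illustrative only):

```julia
using DataFrames

# Hypothetical example data; the column names are made up for illustration.
df = DataFrame(a = [1, missing, 3], b = [missing, 20, missing])

# Replace every missing value in every column with 0, in place.
df .= coalesce.(df, 0)

# Check that no missing values remain in any column.
has_missing = any(col -> any(ismissing, col), eachcol(df))
```

Here has_missing should come out false, since coalesce returns its first non-missing argument and 0 is used as the fallback.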
If you want to perform this substitution only in selected columns, do:
@. df[!, cols] = coalesce(df[!, cols], 0)
where cols is a column selector.
An alternative way to achieve this is to use transform!:
transform!(df, cols .=> ByRow(x -> coalesce(x, 0)), renamecols=false)
where cols is your column selector. Use names(df) for cols if you want to do the imputation in all columns of the DataFrame.
This approach is a bit more verbose in this case, but it is more flexible in general.
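To make the two column-selective variants concrete, here is a small sketch on made-up data. It assumes cols is a vector of column names, which is just one possible column selector; the data frame and its column names are hypothetical.

```julia
using DataFrames

# Hypothetical data: only columns a and b are imputed, c is left untouched.
df1 = DataFrame(a = [1, missing], b = [missing, 2], c = [missing, "x"])
df2 = copy(df1)
cols = [:a, :b]

# Broadcasting variant: assign the imputed values to the selected columns.
@. df1[!, cols] = coalesce(df1[!, cols], 0)

# transform! variant: apply coalesce row by row to each selected column,
# keeping the original column names via renamecols=false.
transform!(df2, cols .=> ByRow(x -> coalesce(x, 0)), renamecols=false)
```

Both data frames should end up with the same values in a and b, while the missing in c is preserved because c is not part of cols.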