I want to define a new column based on column a that is afterwards not linked to the original. With the code below, changing x.b also changes x.a:
using DataFrames
x = DataFrame(a=1:3)
x.b = x.a
x.b[1] += 1
There are several ways to do it; the main ones are:
x[:, :b] = x.a
or
x.b = x[:, :a]
You can also write:
x[!, :b] = x[:, :a]
(this can be useful if :b were a variable)
Finally you could also just write:
x.b = copy(x.a)
or
x.b = x.a[:]
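To check that a new column is really detached from the original, one option is to compare the two vectors with `===` and then mutate the copy. This is only a minimal sketch that reuses the `x` from the question; the column name `:c` is made up for illustration.

```julia
using DataFrames

x = DataFrame(a=1:3)

# Aliasing assignment: both column names refer to the same vector.
x.b = x.a
@assert x.b === x.a

# Copying assignment: :c (a hypothetical extra column) gets its own storage.
x.c = copy(x.a)
@assert x.c !== x.a

x.c[1] += 1          # mutate only the copy
@assert x.a[1] == 1  # the original column is untouched
@assert x.c[1] == 2
```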
All indexing rules for DataFrames.jl can be found at https://juliadata.github.io/DataFrames.jl/stable/lib/indexing/.
In short (simplifying a bit, but these rules are enough to know in practice):
- df.col is non-copying for both getting and setting a column.
- df[!, :col] is the same as df.col, except that you can easily use a variable instead of a literal for indexing, and it works with broadcasting, while df.col does not work with broadcasting if :col is not present in the data frame.
- df[:, :col] copies when getting a column and is an in-place operation when setting a column, unless :col is not present in df, in which case it freshly allocates the column when setting.
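
The rules above can be checked on a small data frame. The following is a rough sketch rather than a complete account of the indexing rules; the names `df`, `old`, `v`, `name`, `:new`, `:other`, and `:filled` are placeholders chosen for this example.

```julia
using DataFrames

df = DataFrame(col=[1, 2, 3])

# Getting: df.col and df[!, :col] return the stored vector; df[:, :col] returns a copy.
@assert df.col === df[!, :col]
@assert df[:, :col] !== df.col

# Setting with `:` on an existing column writes into the existing vector.
old = df.col
df[:, :col] = [10, 20, 30]
@assert df.col === old
@assert old == [10, 20, 30]

# Setting with `!` stores the right-hand side vector itself, without copying.
v = [7, 8, 9]
df[!, :col] = v
@assert df.col === v

# Setting with `:` on a column that does not exist yet allocates a fresh vector.
df[:, :new] = v
@assert df.new !== v
@assert df.new == v

# A column name held in a variable works with the indexing syntax.
name = :other
df[!, name] = copy(v)
@assert df.other == [7, 8, 9]

# Broadcasting assignment with `!` can create a column that is not present yet.
df[!, :filled] .= 0
@assert df.filled == [0, 0, 0]
```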