
I want to define a new column based on column a that is not linked to the original afterwards. In the code below, x.b and x.a share the same vector, so modifying x.b also changes x.a:

using DataFrames
x = DataFrame(a=1:3)
x.b = x.a
x.b[1] += 1

1 Answer


There are several ways to do it. The main ones are:

x[:, :b] = x.a

or

x.b = x[:, :a]

You can also write:

x[!, :b] = x[:, :a]

(this is useful when the column name is stored in a variable instead of being written as the literal :b)
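As a minimal sketch of the variable case, assuming DataFrames.jl 1.x: the name newcol is a hypothetical variable introduced only for illustration.

```julia
using DataFrames

x = DataFrame(a=1:3)

newcol = :b                 # hypothetical variable holding the target column name
x[!, newcol] = x[:, :a]     # x[:, :a] returns a copy, so the new column is independent

x[1, newcol] += 1           # modify the new column only

x.a                         # [1, 2, 3], unchanged
x[!, newcol]                # [2, 2, 3]
```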

Finally, you could also just write:

x.b = copy(x.a)

or

x.b = x.a[:]
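To check whether two columns are linked, one option is to compare them with ===, which tests whether both names refer to the same vector. The following is a sketch assuming DataFrames.jl 1.x; the column c is added only for the comparison.

```julia
using DataFrames

x = DataFrame(a=1:3)

x.b = x.a           # alias: both columns refer to the same vector
x.c = copy(x.a)     # independent copy

x.b === x.a         # true
x.c === x.a         # false

x.c[1] += 1         # only c changes
x.a                 # [1, 2, 3]

x.b[1] += 1         # a changes too, because b is an alias
x.a                 # [2, 2, 3]
```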

All indexing rules for DataFrames.jl can be found at https://juliadata.github.io/DataFrames.jl/stable/lib/indexing/.

In short (simplifying a bit, but these rules are enough in practice):

  • df.col is non-copying for getting and for setting a column
  • df[!, :col] is the same as df.col, except that you can easily use a variable instead of a literal for indexing, and it works with broadcasting even if :col is not yet present in the data frame, whereas df.col does not
  • df[:, :col] copies when getting a column and works in place when setting a column, unless :col is not present in df, in which case setting allocates a fresh column
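The rules above can be checked with a short sketch, assuming DataFrames.jl 1.x. The data frame and the names old and v are made up for illustration.

```julia
using DataFrames

df = DataFrame(a=[1, 2, 3])

# Getting: df.col and df[!, :col] return the stored vector itself
df.a === df[!, :a]           # true

# Getting: df[:, :col] returns a fresh copy
df[:, :a] === df.a           # false

# Setting an existing column with `:` writes into the existing vector
old = df.a
df[:, :a] = [10, 20, 30]
df.a === old                 # true, and old is now [10, 20, 30]

# Setting a new column with `:` allocates a fresh vector
v = [7, 8, 9]
df[:, :c] = v
df.c === v                   # false

# Setting with `!` stores the passed vector without copying
df[!, :d] = v
df.d === v                   # true

# Broadcasting with `!` can create a column that does not exist yet
df[!, :e] .= 0
df.e                         # [0, 0, 0]
```

Note that after df[!, :d] = v, the column d and the variable v are linked in exactly the way the question wants to avoid.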