4
votes

I am trying to iterate over the rows of a DataFrame in Julia to generate a new column for the data frame. I haven't come across a clear example of how to do this. In R this type of thing is vectorized, but from my understanding not all of Julia's operations are vectorized, so I need to loop over the rows. I know I can do this with indexing, but I believe there must be a better way. I want to be able to reference the column values by name. Here is what I have:

test_df = DataFrame(A = [1, 2, 3, 4, 5], B = [2, 3, 4, 5, 6])
test_df["C"] = [test_df[i, "A"] * test_df[i, "B"] for i in 1:size(test_df, 1)]

Is this the Julia/DataFrames way of doing this? Is there a more Julia-esque way of doing this? Thanks for any feedback.

1 Answer

3
votes

You'd be better off doing test_df["A"] .* test_df["B"], which multiplies the two columns as whole vectors instead of indexing row by row. In general, Julia uses a dot prefix to indicate operations that are elementwise. All of these elementwise operations are vectorized.
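As a rough illustration of the dotted operator, here is a minimal sketch that uses plain vectors in place of the two columns. The names a, b, c and c_loop are made up for the example, and the vectors stand in for columns A and B, because the exact column-indexing syntax depends on the DataFrames version you have installed.

```julia
# Plain vectors standing in for columns A and B of test_df (illustrative names).
a = [1, 2, 3, 4, 5]
b = [2, 3, 4, 5, 6]

# Dotted operator: multiplies matching elements, no explicit loop needed.
c = a .* b

# The row-by-row comprehension from the question, for comparison.
c_loop = [a[i] * b[i] for i in 1:length(a)]

# Both approaches give the same vector.
@assert c == c_loop
@assert c == [2, 6, 12, 20, 30]
```

The elementwise form is shorter and leaves the looping to the library, which is why it is usually preferred when the new column is a simple function of existing ones.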

You also don't want to use an Array comprehension since you probably want a DataArray as your output. There are no DataArray comprehensions for now since comprehensions are built into the Julia parser, which makes them hard to override in libraries like DataArrays.jl.