I would like to easily apply multiple functions to a single column in a Julia dataframe. Here is a simple example from notebook 5 of the DataFrames.jl course on Julia Academy.

Bogumil shows us how to easily calculate the mean of the jumps column by doing the following:

combine(df, :jumps => mean)
jumps_mean
Float64
1 2.7186

But what if I want to apply multiple functions to jumps to get multiple summary statistics? So far I can get the following to work:

combine(df, :jumps => (x -> [(mean(x), std(x), minimum(x), maximum(x))]) => [:mean, :std, :min, :max])
mean std min max
Float64 Float64 Int64 Int64
1 2.7186 0.875671 2 11

Is there a cleaner syntax for doing this, without needing to wrap the function's return value in [ ] or to use an anonymous function?

For example, I would like to do:

combine(df, :jumps => (mean, std, minimum, maximum))

1 Answer

Broadcast the pair operator over a vector of functions using .=>:

combine(df, :jumps .=> [mean, std, minimum, maximum])
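
To see why this works, here is a minimal sketch on made-up data. The small `df` below is a hypothetical stand-in, not the course dataset, and the name `ops` is only for illustration. Broadcasting `.=>` expands into one `:jumps => function` pair per function, and `combine` accepts that vector of pairs:

```julia
using DataFrames
using Statistics  # provides mean and std

# Hypothetical stand-in data; the course's `df` is not reproduced here.
df = DataFrame(jumps = [2, 3, 2, 4, 11])

# Broadcasting `=>` over the vector builds one `source => function` pair per function,
# i.e. [:jumps => mean, :jumps => std, :jumps => minimum, :jumps => maximum].
ops = :jumps .=> [mean, std, minimum, maximum]

# `combine` takes the vector of pairs; output columns are auto-named
# jumps_mean, jumps_std, jumps_minimum, jumps_maximum.
stats = combine(df, ops)

# Optionally broadcast a second `.=>` to choose the output column names explicitly.
named = combine(df, :jumps .=> [mean, std, minimum, maximum] .=> [:mean, :std, :min, :max])
```

The second form should give the same column names as the anonymous-function version in the question, without wrapping the result in a vector.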

See also "Multiple summary statistics on grouped column in Julia" for some more advanced examples.