4
votes

I am a beginner trying to use dplyr for data analysis. My data basically come from a few Operations ("Ops") and are well ordered. I often need to apply different functions to the observations ("Num") according to the type of Operation, then combine the results for analysis.

A trivial example is below:

  X      Num  Ops
  0       37   S
  1       18   R
  2       11   S
  3        3   R
  4       11   S
  5       13   R
  ...     ... ...

I want to add a new column "Num2" according to the values of column "Ops", e.g.:

df %>% mutate(Num2 = ifelse(Ops == "S", Num - 1, Num + 1))

I am not sure if I should do a lot of ifelse assignments -- it feels redundant and inefficient.

There must be a much better solution, maybe using some combinations of "group_by, select, filter". Any suggestions?

Basically I want to figure out if there is a way to group the data according to certain criteria, then apply different functions to different subsets, and finally merge the results back together. Typical dplyr examples I found apply the same function(s) to all subsets.
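
To make the goal concrete, here is a rough sketch of what such a "group, apply a different function per group, recombine" step might look like in dplyr. The lookup list op_funs and its two functions are hypothetical names made up for illustration, and the toy data only mirrors the first rows of the example above.

```r
library(dplyr)

# Toy data mirroring the first rows of the example above
df <- data.frame(
  X   = 0:5,
  Num = c(37, 18, 11, 3, 11, 13),
  Ops = c("S", "R", "S", "R", "S", "R"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one function per operation type
op_funs <- list(
  S = function(x) x - 1,
  R = function(x) x + 1
)

result <- df %>%
  group_by(Ops) %>%
  # Ops is constant within a group, so its first value picks the function;
  # as.character() guards against Ops being stored as a factor
  mutate(Num2 = op_funs[[as.character(first(Ops))]](Num)) %>%
  ungroup()

print(result)
```

Because a grouped mutate keeps the original row order, the results come back already merged, with no separate join step.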

@eddi below provided a more general solution using data.table. Is there a dplyr equivalent?

You can try the following approach: stackoverflow.com/a/19054962/817778 - eddi
Check this, this, and this for ideas and possible alternative techniques. - JasonAizkalns
Thanks for the suggestions. Those are not exactly what I want. Basically I want to figure out if there is a way to group the data according to certain criteria, apply different functions to different subsets, then merge the results back together. Typical dplyr examples apply the same function(s) to all subsets. - Dong
@eddi It looks like you indeed provided a more general solution with data.table. Is there a dplyr equivalent? - Dong
@Dong Not sure, I'm not a dplyr expert. - eddi

2 Answers

1
votes

There is a dplyrExtras package that includes a mutate_if function.

# install dplyrExtras
library(devtools)
install_github(repo="skranz/dplyrExtras")
require(dplyrExtras)
# code using mutate_if
df %>% 
  mutate(Num2 = Num+1) %>% 
  mutate_if(Ops=="S", Num2 = Num-1)
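
If installing an extra package is not an option, a similar result can probably be had with dplyr alone. The sketch below uses case_when (available in recent dplyr versions) as a stand-in for the conditional update; it is not the dplyrExtras approach itself, and the toy data frame is an assumption based on the question's example.

```r
library(dplyr)

# Toy data assumed from the question's example
df <- data.frame(
  X   = 0:5,
  Num = c(37, 18, 11, 3, 11, 13),
  Ops = c("S", "R", "S", "R", "S", "R"),
  stringsAsFactors = FALSE
)

result <- df %>%
  mutate(Num2 = case_when(
    Ops == "S" ~ Num - 1,  # rule for "S" rows
    Ops == "R" ~ Num + 1   # rule for "R" rows; unmatched rows become NA
  ))

print(result)
```

Each additional operation type is one more condition ~ value line, which avoids nesting ifelse calls.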
0
votes

You can easily avoid the ifelse for numeric return values. Just convert the condition to numeric and use appropriate numeric calculations.

df %>% mutate(Num2 = Num - 2*(Ops=="S") + 1)
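
Why this works: Ops=="S" is a logical vector, and in arithmetic R treats TRUE as 1 and FALSE as 0. So "S" rows get Num - 2 + 1 = Num - 1 and all other rows get Num - 0 + 1 = Num + 1. A quick check against the ifelse version, on toy data assumed from the question:

```r
library(dplyr)

# Toy data assumed from the question's example
df <- data.frame(
  X   = 0:5,
  Num = c(37, 18, 11, 3, 11, 13),
  Ops = c("S", "R", "S", "R", "S", "R"),
  stringsAsFactors = FALSE
)

# Arithmetic version: the logical comparison is coerced to 0/1
arith <- df %>% mutate(Num2 = Num - 2 * (Ops == "S") + 1)

# Reference version using ifelse
cond <- df %>% mutate(Num2 = ifelse(Ops == "S", Num - 1, Num + 1))

# Both approaches should agree row by row
stopifnot(isTRUE(all.equal(arith$Num2, cond$Num2)))
```

Note that this trick only fits cases where the branches differ by simple numeric offsets or factors; it gets hard to read with more than two operation types.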