4
votes

I am having trouble with the mutate function in dplyr. The error says:

Error: incompatible size (0), expecting 5 (the group size) or 1

There are some previous posts on this error. I tried several of the suggested solutions, but none of them worked in my case.

group-factorial-data-with-multiple-factors-error-incompatible-size-0-expe

r-dplyr-using-mutate-with-na-omit-causes-error-incompatible-size-d

grouped-operations-that-result-in-length-not-equal-to-1-or-length-of-group-in-dp

Here is what I tried,

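library(dplyr)  # group_by, mutate, arrange, do
library(tidyr)  # spread

ff <- c(seq(0,0.2,0.1),seq(0,-0.2,-0.1))  # 6 values, recycled to 12 rows below
flip <- c(c(0,0,1,1,1,1),c(1,1,0,0,0,0))
df <- data.frame(ff,flip,group=gl(2,6))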

> df
     ff flip group
1   0.0    0     1
2   0.1    0     1
3   0.2    1     1
4   0.0    1     1
5  -0.1    1     1
6  -0.2    1     1
7   0.0    1     2
8   0.1    1     2
9   0.2    0     2
10  0.0    0     2
11 -0.1    0     2
12 -0.2    0     2

I want to add new columns called c1 and c2 based on some conditions, as follows:

 dff <- df%>%
      group_by(group)%>%
      mutate(flip=as.numeric(flip),direc=ifelse(c(0,diff(ff))<0,"backward","forward"))%>%
      spread(direc,flip)%>%
      arrange(group,group)%>%
      mutate(c1=ff[head(which(forward>0),1)],c2=ff[tail(which(backward>0),1)])

Error: incompatible size (0), expecting 5 (the group size) or 1

I also tried using do instead:

do(data.frame(., c1=ff[head(which(.$forward>0),1)],c2=ff[tail(which(.$backward>0),1)]))

Error in data.frame(., c1 = ff[head(which(.$forward > 0), 1)], c2 = ff[tail(which(.$backward > : arguments imply differing number of rows: 5, 1, 0

But when I only mutate the c1 column, everything seems to work. Why?

The last line should be mutate(c1=head(ff[forward>0], 1), c2 = tail(ff[backward > 0], 1)) (both give NA with this example, though). - Sotos
There is no value in group 2 for which backward is larger than 0. - alistaire
@alistaire Even if there is no value, I should at least get NA values for the empty rows, right? - Alexander
If you adjust your mutate to return NA if the length of your computation is 0, yes. - alistaire
@alistaire Ok. I got it. Thanks! - Alexander

2 Answers

3
votes

It might be informative to step through the pipe to see what is going on.

df %>%
  group_by(group)%>%
  mutate(flip=as.numeric(flip),direc=ifelse(c(0,diff(ff))<0,"backward","forward"))%>%
  spread(direc,flip)%>%
  arrange(group,group)
# Source: local data frame [10 x 4]
# Groups: group [2]
#       ff  group backward forward
#    <dbl> <fctr>    <dbl>   <dbl>
# 1   -0.2      1        1      NA
# 2   -0.1      1        1      NA
# 3    0.0      1        1       0
# 4    0.1      1       NA       0
# 5    0.2      1       NA       1
# 6   -0.2      2        0      NA
# 7   -0.1      2        0      NA
# 8    0.0      2        0       1
# 9    0.1      2       NA       1
# 10   0.2      2       NA       0

BTW: Why arrange(group,group)? Doubling the order variable is pointless.

Looking here, you'll see that group 2 has no backward values greater than 0. When you run something like which(FALSE) you get integer(0), and indexing ff with integer(0) returns a zero-length vector. dplyr needs the right-hand side of each mutate expression to have either length 1 or the same length as the number of rows in the group, which is what the error message is telling you.
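To make the zero-length issue concrete, here is a minimal base R sketch. The vectors are copied by hand from group 2 of the printed table and are only an illustration, not part of the original pipeline.

```r
# Group 2 of the spread table, typed in by hand for illustration.
ff       <- c(-0.2, -0.1, 0, 0.1, 0.2)
backward <- c(0, 0, 0, NA, NA)

# which() ignores NA and finds no TRUE values, so it returns integer(0).
idx <- which(backward > 0)
length(idx)

# tail() of an empty vector is still empty, and indexing ff with it
# yields a zero-length numeric vector, which mutate() cannot recycle.
c2 <- ff[tail(idx, 1)]
length(c2)
```

Both length() calls return 0, which is the "incompatible size (0)" that the error message reports.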

Instead of your mutate, I'll show it with a slight modification: for c2, return the number of indices that the which call finds:

df %>%
  group_by(group)%>%
  mutate(flip=as.numeric(flip),direc=ifelse(c(0,diff(ff))<0,"backward","forward"))%>%
  spread(direc,flip)%>%
  arrange(group,group)%>%
  mutate(
    c1 = ff[head(which(forward>0),1)],
    c2len = length(which(backward > 0))
  )
# Source: local data frame [10 x 6]
# Groups: group [2]
#       ff  group backward forward    c1 c2len
#    <dbl> <fctr>    <dbl>   <dbl> <dbl> <int>
# 1   -0.2      1        1      NA   0.2     3
# 2   -0.1      1        1      NA   0.2     3
# 3    0.0      1        1       0   0.2     3
# 4    0.1      1       NA       0   0.2     3
# 5    0.2      1       NA       1   0.2     3
# 6   -0.2      2        0      NA   0.0     0
# 7   -0.1      2        0      NA   0.0     0
# 8    0.0      2        0       1   0.0     0
# 9    0.1      2       NA       1   0.0     0
# 10   0.2      2       NA       0   0.0     0

In order to meaningfully index on ff, you need something other than integer(0) in your returns.
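One way to do that, as suggested in the comments, is to fall back to NA when nothing matches. The sketch below is only one possible approach: first_or_na and last_or_na are hypothetical helpers written for this example (they are not dplyr functions), and the wide data frame is the spread table from above rebuilt by hand so the example is self-contained.

```r
library(dplyr)

# Hypothetical helpers (not part of dplyr): return the first or last
# element, or NA when the input is empty, so the result has length 1.
first_or_na <- function(x) if (length(x) == 0) NA_real_ else x[1]
last_or_na  <- function(x) if (length(x) == 0) NA_real_ else x[length(x)]

# The spread table shown earlier, rebuilt by hand.
wide <- data.frame(
  ff       = rep(c(-0.2, -0.1, 0, 0.1, 0.2), times = 2),
  group    = gl(2, 5),
  backward = c(1, 1, 1, NA, NA, 0, 0, 0, NA, NA),
  forward  = c(NA, NA, 0, 0, 1, NA, NA, 1, 1, 0)
)

res <- wide %>%
  group_by(group) %>%
  mutate(
    c1 = first_or_na(ff[which(forward > 0)]),   # first ff where forward > 0
    c2 = last_or_na(ff[which(backward > 0)])    # last ff where backward > 0
  ) %>%
  ungroup()

res
```

With this data, c2 is NA for every row of group 2 instead of raising the size error, while c1 keeps the values shown above (0.2 and 0.0).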

3
votes

Just expanding on @alistaire's comment.

  1. Your specified conditions are the cause of the error, specifically tail(which(backward>0),1), which is empty for group 2.
  2. The given code can be simplified to get rid of the spread().

You can try:

dff <- df%>%
  group_by(group)%>%
  mutate(flip=as.numeric(flip),direc=ifelse(c(0,diff(ff))<0,"backward","forward"))%>%
  arrange(group)%>%
  mutate(c1=ff[head(which(direc=="forward" & flip > 0),1)])

It seems you want to identify, for each group, the points where the direction changes. If so, please clarify exactly how flip is related. Alternatively, if you change flip <- c(c(0,0,1,1,1,1),c(1,1,0,0,0,0)) to flip <- c(c(0,0,1,1,1,1),c(1,1,0,1,1,1)), so that flip marks the change in direction of ff, you can use:

dff <- df%>%
  group_by(group)%>%
  mutate(flip=as.numeric(flip),direc=ifelse(c(0,diff(ff))<0,"backward","forward"))%>%
  arrange(group)%>%
  mutate(c1=ff[head(which(direc=="forward" & flip > 0),1)]) %>%
  mutate(c2=ff[tail(which(direc=="backward"& flip >0),1)])

which gives:

Source: local data frame [12 x 6]
Groups: group [2]

      ff  flip  group    direc    c1    c2
   <dbl> <dbl> <fctr>    <chr> <dbl> <dbl>
1    0.0     0      1  forward   0.2  -0.2
2    0.1     0      1  forward   0.2  -0.2
3    0.2     1      1  forward   0.2  -0.2
4    0.0     1      1 backward   0.2  -0.2
5   -0.1     1      1 backward   0.2  -0.2
6   -0.2     1      1 backward   0.2  -0.2
7    0.0     1      2  forward   0.0  -0.2
8    0.1     1      2  forward   0.0  -0.2
9    0.2     0      2  forward   0.0  -0.2
10   0.0     1      2 backward   0.0  -0.2
11  -0.1     1      2 backward   0.0  -0.2
12  -0.2     1      2 backward   0.0  -0.2