1
votes

I am trying to make a new column in my data.frame based on another column.

My data frame is called dat.cp2, and its "ar" column holds a year between 1990 and 2017.

I need to make a new column called "TB" that groups the years into periods. For example, period one is 1990-1996 and I want that period to be called "TB1", 1997-2003 is "TB2", and so on. So for a person born in 1995, the new column should say "TB1".

I tried:

dat.cp2 %>% mutate(TB =
                     case_when(ar <=1996 ~ "TB1",
                               ar >=1997&<=2003 ~ "TB2",
                               ar >=2004&<=2010 ~ "TB3",
                               ar >=2011 ~ "TB4")

But I get this error message:

Error: unexpected '<=' in:
"                     case_when(ar <=1996 ~ "TB1",
                               ar >=1997&<="

I have tried looking for answers but can't find any. Can anyone help?

3
Please provide a sample of your data by inputting it into dput() and posting the output. - rjen
The syntax &<= is wrong for TB2 and TB3. It should be ar >= 1997 & ar <= 2003. - akrun

3 Answers

2
votes

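You don't actually need the & here: case_when() checks its conditions in order and each row takes the first one that matches, so by the time ar <= 2003 is tested, every row with ar <= 1996 has already been assigned. You can also finish with TRUE as a catch-all: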

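library(dplyr)

dat.cp2 %>%
  mutate(
    # conditions are tried top to bottom; the first match wins
    TB = case_when(ar <= 1996 ~ 'TB1',
                   ar <= 2003 ~ 'TB2',
                   ar <= 2010 ~ 'TB3',
                   TRUE ~ 'TB4')  # everything else, i.e. ar > 2010
  )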
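
To see the "first match wins" behaviour in action, here is a minimal sketch on made-up data (the `toy` data frame and its values are hypothetical; only the column name ar comes from the question). One thing to watch: a comparison on a missing year returns NA, which case_when() treats as "no match", so a missing ar would fall through to the TRUE branch. The extra is.na() line is my own addition to keep those rows as NA, assuming that is what you want.

```r
library(dplyr)

# Hypothetical sample data; only the column name "ar" is from the question.
toy <- data.frame(ar = c(1990, 1996, 1997, 2003, 2004, 2010, 2011, 2017, NA))

res <- toy %>%
  mutate(
    TB = case_when(
      is.na(ar)  ~ NA_character_,  # keep missing years missing
      ar <= 1996 ~ "TB1",
      ar <= 2003 ~ "TB2",          # only reached when ar > 1996
      ar <= 2010 ~ "TB3",          # only reached when ar > 2003
      TRUE       ~ "TB4"           # everything left over
    )
  )

print(res)
```

Without the is.na() line, the last row would be labelled "TB4", which is probably not intended.
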
2
votes

The syntax &<= may be acceptable in some other languages, but in R each side of & has to be a complete expression, so ar must appear in both comparisons:

library(dplyr)

dat.cp2 %>%
  mutate(TB = case_when(ar <= 1996 ~ "TB1",
                        ar >= 1997 & ar <= 2003 ~ "TB2",
                        ar >= 2004 & ar <= 2010 ~ "TB3",
                        ar >= 2011 ~ "TB4"))

NOTE: There are many ways to simplify this, but this is just to show where the mistake in the OP's code is.
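
If writing ar twice feels repetitive, one possible shorthand is dplyr's between(), which tests left <= x & x <= right with both ends included. This is only a sketch on hypothetical data (the `toy` data frame is made up for illustration), not a claim that it is better than the explicit form:

```r
library(dplyr)

# Hypothetical sample data; only the column name "ar" is from the question.
toy <- data.frame(ar = c(1995, 1997, 2003, 2004, 2012))

res <- toy %>%
  mutate(TB = case_when(
    ar <= 1996              ~ "TB1",
    between(ar, 1997, 2003) ~ "TB2",  # same as ar >= 1997 & ar <= 2003
    between(ar, 2004, 2010) ~ "TB3",  # same as ar >= 2004 & ar <= 2010
    ar >= 2011              ~ "TB4"
  ))

print(res)
```

Because every condition here spells out its own bounds, the order of the branches no longer matters.
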

1
votes

You could also use cut(), which bins the years into intervals (closed on the right by default) and returns a factor:

dat.cp2 %>%
  mutate(TB = cut(ar,
                  breaks = c(1989, 1996, 2003, 2010, 2017),
                  labels = c("TB1", "TB2", "TB3", "TB4")))
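
A few details of cut() are easy to miss, so here is a small base R sketch on a made-up vector (the `years` values are hypothetical). With the default right = TRUE each interval includes its upper break but not its lower one, which is why the first break is 1989 rather than 1990. The result is a factor, and any year outside the breaks becomes NA.

```r
# Hypothetical years, including one (2018) that lies outside the breaks.
years <- c(1990, 1996, 1997, 2010, 2011, 2017, 2018)

tb <- cut(years,
          breaks = c(1989, 1996, 2003, 2010, 2017),  # (1989,1996], (1996,2003], ...
          labels = c("TB1", "TB2", "TB3", "TB4"))

print(tb)         # a factor; 2018 falls outside the breaks and is NA
print(class(tb))

# Convert to character if you need plain strings, as case_when() would give.
tb_chr <- as.character(tb)
print(tb_chr)
```

If later years can show up in the data, you could extend the last break (for example to Inf) so they still land in "TB4".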