0
votes

I have a data frame with monthly data and I want to add a column that gives the season for each month. Months 3 (Mar) to 5 (May) are defined as spring, 6 (Jun) to 8 (Aug) as summer, 9 (Sep) to 11 (Nov) as autumn, and 12 (Dec) to 2 (Feb) as winter.

Sample data:

MONTH <- sample(1:12, 10, replace = TRUE)
SALES <- sample(30:468, 10, replace = TRUE)
df <- data.frame(MONTH, SALES)

   MONTH SALES
1      9   209
2      3   273
3      9   249
4      7    99
5      9   442
6      6   202
7      7   347
8      3   428
9      1    67
10     2   223

I reached my goal by using nested ifelse:

df$SEASON<-ifelse(df$MONTH>=3 & df$MONTH<=5,"SPRING",
                 ifelse(df$MONTH>=6 & df$MONTH<=8,"SUMMER",
                       ifelse(df$MONTH>=9 & df$MONTH<=11,"AUTUMN",
                              ifelse(df$MONTH>=12 | df$MONTH<=2,"WINTER",NA))))

   MONTH SALES SEASON
1      9   209 AUTUMN
2      3   273 SPRING
3      9   249 AUTUMN
4      7    99 SUMMER
5      9   442 AUTUMN
6      6   202 SUMMER
7      7   347 SUMMER
8      3   428 SPRING
9      1    67 WINTER
10     2   223 WINTER

However, nested ifelse is not very elegant, and it gets laborious when there are more than four character values to assign (for example, adding names to twenty different IDs). What would be a more elegant way to solve this kind of problem?

3
Instead of listing the conditions one by one, use cut with labels - stackoverflow.com/questions/13559076/… - Ronak Shah
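
A minimal sketch of the cut approach suggested in this comment, using made-up example rows in the same shape as the question's data frame. Winter wraps around the year end (12, 1, 2), so the sketch first maps December to 0 with %% 12 to make every season one contiguous interval. That shift is an assumption added here, not something taken from the linked question.

```r
# Made-up example data with the same columns as the question's data frame
df <- data.frame(MONTH = c(9, 3, 7, 6, 1, 2, 12),
                 SALES = c(209, 273, 99, 202, 67, 223, 150))

# Map December (12) to 0 so winter becomes the contiguous range 0..2
shifted <- df$MONTH %% 12

# Intervals are right-closed by default: (-1,2], (2,5], (5,8], (8,11]
df$SEASON <- cut(shifted,
                 breaks = c(-1, 2, 5, 8, 11),
                 labels = c("WINTER", "SPRING", "SUMMER", "AUTUMN"))

# cut() returns a factor; convert it if plain strings are preferred
df$SEASON <- as.character(df$SEASON)
```

With more categories, only the breaks and labels vectors grow; the call itself stays the same.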

3 Answers

0
votes

Does this work?

> library(dplyr)
> df %>% mutate(SEASON = case_when(MONTH %in% 3:5 ~ 'Spring', MONTH %in% 6:8 ~ 'Summer', MONTH %in% 9:11 ~ 'Autumn', TRUE ~ 'Winter'))
# A tibble: 10 x 4
      X1 MONTH SALES SEASON
   <dbl> <dbl> <dbl> <chr> 
 1     1     9   209 Autumn
 2     2     3   273 Spring
 3     3     9   249 Autumn
 4     4     7    99 Summer
 5     5     9   442 Autumn
 6     6     6   202 Summer
 7     7     7   347 Summer
 8     8     3   428 Spring
 9     9     1    67 Winter
10    10     2   223 Winter
> 
0
votes

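What you're looking for is dplyr's mutate combined with case_when. Write one condition ~ value pair per case; the condition needs == (comparison, not =) and the column name as it appears in your data (MONTH):

library(dplyr)
df <- df %>% mutate(new_season = case_when(MONTH == 1 ~ "January"))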

0
votes

You can create a vector of your expected values, then index off of it.

seasons <- c(
  "WINTER", "WINTER",
  rep(c("SPRING", "SUMMER", "AUTUMN"), each = 3),
  "WINTER"
)

df$SEASON <- seasons[df$MONTH]
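
For the other case mentioned in the question (names for twenty different IDs), positional indexing only works if the IDs happen to be 1, 2, 3, and so on. A rough sketch of the same idea with a named vector follows; the IDs, names, and the orders data frame are all hypothetical and only there for illustration.

```r
# Hypothetical ID-to-name table; in practice it might be read from a file
id_names <- c("101" = "Alice", "102" = "Bob", "103" = "Carol")

# Hypothetical data frame with an ID column
orders <- data.frame(ID = c(102, 101, 103, 102, 999))

# Index the named vector by name (as character), not by position
orders$NAME <- unname(id_names[as.character(orders$ID)])

# IDs that are missing from the table (999 here) come back as NA
```

Adding a twenty-first ID then means adding one entry to id_names rather than another condition.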