Ifelse conditional on same strings in multiple columns

Question

So I guess this is possible to achieve by just making a veeery long line code using mutate() and ifelse() but I want to know if there is a way of doing it without writing a tone of code.

I have data where the degree of each individual is written in a non-ordered fashion. The data looks like this:

id <- c(1, 2, 3, 4, 5, 6)
degree1 <- c("masters", "bachelors", "PhD", "bachelors", "bachelors", NA)
degree2 <- c("PhD", "masters", "bachelors", NA, NA, NA)
degree3 <- c("bachelors", NA, "masters", NA, "masters", NA)

Now I want to create a new column containing the string for the highest degree, like this

dat$highest_degree <- c("PhD", "masters", "PhD", "bachelors", "masters", NA)

How can I achieve this?

akrun akrun · Accepted Answer · 2021-01-19T20:46:29

An option is to loop over the rows for the selected 'degree' column, convert to factor with levels specified in the order, drop the levels to remove the unused levels and select the first level

v1 <- c("PhD", "masters", "bachelors")
dat$highest_degree <- apply(dat[-1], 1, function(x)  
        levels(droplevels(factor(x, levels = v1)))[1])
dat$highest_degree
#[1] "PhD"       "masters"   "PhD"       "bachelors" "masters"   NA

Or using tidyverse, reshape into 'long' format, then slice the first row after arrangeing the long format column by matching with an ordered degree vector and grouping by 'id', then join with the original data

library(dplyr)
library(tidyr)
dat %>% 
     pivot_longer(cols = starts_with('degree'), values_to = 'highest_degree') %>%
     select(-name) %>% 
     arrange(id, match(highest_degree, v1)) %>% 
     group_by(id) %>%
     slice_head(n = 1) %>%
     ungroup %>% 
     left_join(dat, .)

data

dat <- data.frame(id, degree1, degree2, degree3)

Ifelse conditional on same strings in multiple columns

2 Answers

data