A handful of options:
df <- read.table(text = 'var1 var2 var3
m01 Y NA
m02 Y NA
m03 NA Y
m04 NA Y
m05 Y Y
m06 Y NA
m07 Y Y', header = TRUE, stringsAsFactors = FALSE)
A typical base R approach would be to use apply to iterate rowwise across the requisite columns. apply silently coerces the data frame to a matrix, which is why some avoid this approach.
apply(df[-1], 1, function(x){sum(which(x == 'Y'))})
#> [1] 1 1 2 2 3 1 3
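To see why this works, here is a minimal sketch on a three-row toy data frame (toy is a hypothetical stand-in, not the df above). It makes the matrix coercion explicit and shows how sum(which(...)) turns the positions of 'Y' into the codes 1, 2, or 3:

```r
# Hypothetical toy data, one row per case: var2 only, var3 only, both
toy <- data.frame(var1 = c("m01", "m03", "m05"),
                  var2 = c("Y", NA, "Y"),
                  var3 = c(NA, "Y", "Y"),
                  stringsAsFactors = FALSE)

# apply() works on a matrix, so the selected columns are converted first
m <- as.matrix(toy[-1])
stopifnot(is.matrix(m), is.character(m))

# For a single row, which() returns the positions that equal 'Y'
# (comparisons with NA are treated as not matching), and sum() adds
# those positions: 1 for var2, 2 for var3, 1 + 2 = 3 for both.
pos_both <- which(m[3, ] == "Y")
code_both <- sum(pos_both)

# The same logic applied to every row
codes <- apply(toy[-1], 1, function(x) sum(which(x == "Y")))
```

Note that the coding depends on there being exactly two columns; with three or more, different combinations could sum to the same value.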
You could translate it to dplyr with rowwise, which does not coerce to a matrix but is usually not the fastest approach:
library(dplyr)
df %>%
    rowwise() %>%
    mutate(var4 = sum(which(c(var2, var3) == 'Y')))
#> Source: local data frame [7 x 4]
#> Groups: <by row>
#>
#> # A tibble: 7 x 4
#> var1 var2 var3 var4
#> <chr> <chr> <chr> <int>
#> 1 m01 Y <NA> 1
#> 2 m02 Y <NA> 1
#> 3 m03 <NA> Y 2
#> 4 m04 <NA> Y 2
#> 5 m05 Y Y 3
#> 6 m06 Y <NA> 1
#> 7 m07 Y Y 3
As written, this will also fail for factors (which c converted to their integer codes before R 4.1.0), but they can be coerced to character beforehand or inside the call, or you could test with is.na instead of checking equality.
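A minimal base R sketch of those two workarounds, using a hypothetical toy_f data frame whose columns are factors. The is.na variant assumes that 'Y' is the only non-missing value that can occur:

```r
# Hypothetical toy data with factor columns
toy_f <- data.frame(var1 = c("m01", "m03", "m05"),
                    var2 = factor(c("Y", NA, "Y")),
                    var3 = factor(c(NA, "Y", "Y")))

# Option 1: coerce the factor columns to character beforehand
toy_chr <- toy_f
toy_chr[-1] <- lapply(toy_chr[-1], as.character)
codes_chr <- apply(toy_chr[-1], 1, function(x) sum(which(x == "Y")))

# Option 2: test for missingness instead of equality.
# is.na() on a data frame returns a logical matrix, so the column
# types no longer matter.
present <- !is.na(toy_f[-1])
codes_na <- apply(present, 1, function(x) sum(which(x)))
```

Both variants should give 1, 2, 3 for the three toy rows.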
More creative base options include pasting the columns together to create a factor that can be deliberately leveled for coercion to integer:
as.integer(factor(paste0(df$var2, df$var3), levels = c('YNA', 'NAY', 'YY')))
#> [1] 1 1 2 2 3 1 3
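Broken into steps on hypothetical toy vectors, the trick looks like this: paste0 turns NA into the string "NA", and the order of levels decides which integer each combination gets.

```r
# Hypothetical toy vectors: var2 only, var3 only, both
var2 <- c("Y", NA, "Y")
var3 <- c(NA, "Y", "Y")

# Step 1: paste0() converts NA to the string "NA"
keys <- paste0(var2, var3)

# Step 2: the position of each key in `levels` becomes its integer code
f <- factor(keys, levels = c("YNA", "NAY", "YY"))
codes <- as.integer(f)

# A combination missing from `levels` (here both NA, pasted as "NANA")
# becomes NA rather than raising an error
code_none <- as.integer(factor(paste0(NA, NA),
                               levels = c("YNA", "NAY", "YY")))
```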
or using do.call to pass a list of a function and each desired variable of df (flattened into one list with c) to mapply:
do.call(mapply,
        c(function(...){sum(which(!is.na(c(...))))},
          df[-1],
          USE.NAMES = FALSE))
#> [1] 1 1 2 2 3 1 3
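As a sketch of what do.call is doing here, the call can be compared with an explicit mapply call on a hypothetical toy data frame. The names count_present and args are introduced only for illustration:

```r
# Hypothetical toy data: var2 only, var3 only, both
toy <- data.frame(var1 = c("m01", "m03", "m05"),
                  var2 = c("Y", NA, "Y"),
                  var3 = c(NA, "Y", "Y"),
                  stringsAsFactors = FALSE)

# Sums the positions of the non-missing arguments
count_present <- function(...) sum(which(!is.na(c(...))))

# c() builds one flat list: the function, then each column, then the option
args <- c(count_present, toy[-1], USE.NAMES = FALSE)

# do.call() spreads that list out as the arguments of mapply();
# the unnamed first element is matched to FUN
via_do_call <- do.call(mapply, args)

# The same call written out by hand for two columns
direct <- mapply(count_present, toy$var2, toy$var3, USE.NAMES = FALSE)
```

The do.call form has the advantage of not needing to name the columns, so it keeps working if more columns are added to the selection.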
If you really want the ifelse logic, dplyr::case_when lets you use cascading conditionals without the messy nested syntax:
df %>% mutate(var4 = case_when(var2 == 'Y' & var3 == 'Y' ~ 3,
                               var2 == 'Y' ~ 1,
                               var3 == 'Y' ~ 2))
#> var1 var2 var3 var4
#> 1 m01 Y <NA> 1
#> 2 m02 Y <NA> 1
#> 3 m03 <NA> Y 2
#> 4 m04 <NA> Y 2
#> 5 m05 Y Y 3
#> 6 m06 Y <NA> 1
#> 7 m07 Y Y 3
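For comparison, a rough base R sketch of the same cascade with nested ifelse, on hypothetical toy vectors. It uses %in% rather than ==, on the assumption that the columns contain NA: with ==, a test such as TRUE & NA evaluates to NA, and ifelse then returns NA for that row instead of moving on to the next condition.

```r
# Hypothetical toy vectors: var2 only, var3 only, both, neither
var2 <- c("Y", NA, "Y", NA)
var3 <- c(NA, "Y", "Y", NA)

# %in% returns FALSE (not NA) for missing values, so each test is
# strictly TRUE or FALSE and the cascade falls through as intended
codes <- ifelse(var2 %in% "Y" & var3 %in% "Y", 3,
         ifelse(var2 %in% "Y", 1,
         ifelse(var3 %in% "Y", 2, NA)))

# The first condition written with == is NA for the "var2 only" row
naive_first <- (var2 == "Y" & var3 == "Y")[1]
```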
df$var4 <- apply(df[-1], 1, function(x){sum(which(x == 'Y'))})
– alistaire