R ifelse and NAs within dataframes

Question

I have a problem with an ifelse evaluation.

The following function evaluates based on 3 conditions:

mk <- function(a, b, c, d, e_1, e_2, f, k)
  # condition 1
  ifelse (!is.na(e_1) & !(k %in% 1),
    mk <- d - e_1 * c,
  # condition 2
    ifelse (!is.na(e_2) & !(k %in% 1),
    mk <- e_2 - d * c,
      # condition 3
        ifelse((a - b) <= 11,
          mk <- c * a - b * f,
          mk <- c * f
        ))
  )

If I pass a single element, the function evaluates correctly, but if I pass columns of a data frame as input values, the function only ever uses the computation in the last condition, even if the previous conditions are met. The columns containing the values for e_1, e_2 and k have some NAs in them, and I suspect that is the problem. What I don't get is why the NAs force the whole vector to be evaluated as condition 3, even though they are never actually used in the computation because the conditions should rule out their usage. If I replace the calculations with character strings, i.e. write "uses condition 1/2/3" instead of the formulas, the conditions are evaluated correctly.

How can I avoid this problem?

Are there some rows that meet more than one condition, that is, e_1 and e_2 are not NA and a - b <= 11? — John Paul
It is possible, but is it relevant? This last condition should only be evaluated when neither 1 nor 2 applies. — Chris
Can you post the code where you feed the data.frame to the function? This is hard to test without that and some data. — John Paul
It turns out the problem was a rounding function that I didn't even suspect to be part of the problem. — Chris

Chris · Accepted Answer · 2014-10-28T11:11:36

It turns out the NAs weren't the cause of the problem at all. The cause was a rounding operation that is performed after the initial evaluation. The round() call was not in my original question because I didn't suspect it.

A simpler form of my problem is:

mktest <- function(a, b, e_1, e_2, k) {
  # condition 1
  ifelse (!is.na(e_1) & !(k %in% 1),
    mk <- 1 - e_1,
  # condition 2
    ifelse (!is.na(e_2) & !(k %in% 1),
    mk <- 2 - e_2,
      # condition 3
        ifelse((a - b) <= 1,
          mk <- -a * b,
          mk <- a * 2
        ))
  )
  round(mk,0)
  }

# some testdata with all possible combinations of values in my data frame
test <- data.frame(expand.grid(2:3, 1, c(1,NA), c(1,NA), c(0,1,NA)))
names(test)[1]    <- "a"
names(test)[2]    <- "b"
names(test)[3]    <- "e_1"
names(test)[4]    <- "e_2"
names(test)[5]    <- "k"

# visualize conditions
test$cond1 <- !is.na(test$e_1) & !(test$k %in% 1)
test$cond2 <- !is.na(test$e_2) & !(test$k %in% 1)
test$cond3 <- ((test$a - test$b) <= 1)

# results
test$result <- mktest(test$a, test$b, test$e_1, test$e_2, test$k)

If I evaluate the function without the round(mk, 0) at the end, it evaluates the conditions correctly. If the rounding is done, only the last condition is used. The reason for this behaviour is still beyond me, since the rounding operation is done AFTER the evaluation of the conditions, but at least the problem at hand is solved.
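
For readers who hit the same thing, one explanation that fits base R's ifelse() is that the `mk <- ...` assignments inside its arguments are side effects. ifelse() evaluates the whole `yes` vector when any test is TRUE and the whole `no` vector when any test is FALSE, so each nested `mk <- ...` overwrites the previous one and `mk` ends up holding the last branch evaluated (here `a * 2`) for every row. Without round(mk, 0), the function returns the value of the outer ifelse(), which is element-wise and correct. With it, the function returns the overwritten `mk`. The sketch below first shows the side effect in isolation and then a variant that assigns the result of ifelse() once. The names `x`, `res`, `tmp`, `not_k1` and `mktest_fixed` are hypothetical and not part of the original post.

```r
# Side effect in isolation: both arguments are evaluated in full,
# so the assignment evaluated last is the one that survives.
x <- c(1, 5)
res <- ifelse(x > 3, tmp <- "big", tmp <- "small")
res  # element-wise result returned by ifelse()
tmp  # value left behind by the last assignment that was evaluated

# Hypothetical corrected variant: no assignments inside the branches;
# the result of the nested ifelse() is assigned to mk exactly once.
mktest_fixed <- function(a, b, e_1, e_2, k) {
  not_k1 <- !(k %in% 1)  # NA %in% 1 is FALSE, so an NA in k counts as "not 1"
  mk <- ifelse(!is.na(e_1) & not_k1,
               1 - e_1,                               # condition 1
               ifelse(!is.na(e_2) & not_k1,
                      2 - e_2,                        # condition 2
                      ifelse((a - b) <= 1,            # condition 3
                             -a * b,
                             a * 2)))
  round(mk, 0)
}

# Same combinations as the test data in the answer, with column names set directly
test <- expand.grid(a = 2:3, b = 1, e_1 = c(1, NA), e_2 = c(1, NA),
                    k = c(0, 1, NA))
test$result <- mktest_fixed(test$a, test$b, test$e_1, test$e_2, test$k)
head(test, 8)
```

With this variant the rounding is applied to the element-wise result, so rows that meet condition 1 or 2 are no longer replaced by the condition 3 value.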
