How to assign a value to a new variable using multiple relational operators in data with missing values in [r]?

Question

I have a dataset with 20 variables and quite a bit of missing data. I am trying to add a new variable whose value in each row is based on the value of another variable in that row. Below is code and a smaller dataset that gives the same errors as my larger dataset. Any suggestions?

A=seq(1,6); B=seq(2,4)
length(A)=7; length(B)=7
m=cbind(A,B)

I do not understand completely what converting from a matrix to a dataframe does.

df=as.data.frame(m)
df

First, I try to create a categorical variable to use when assigning the value of the new variable:

df$Acat=cut(df$A,
              breaks=c(-Inf,2.5,4.5,Inf),
              labels=c("low","mod","hi"))
df$Acat

The code below is where I get the error "argument is of length zero":

if (df$Acat.=="low"){
  df$C=1
}else if (df$Acat.=="mod"){
  df$C=2
}else if(df$Acat.=="hi"){
  df$C=3
}else {
  df$C=NA
}
df$C

I also tried it this way, using the numeric variable for assigning the value of the new variable but I am getting this error:

the condition has length > 1 and only the first element will be used

if (df$A<2.5){
  df$D=1
} else if (df$A>=2.5 && df$A<4.5){
  df$D=2
} else if (df$A>=4.5){
  df$D=3
} else {
  df$D=NA
}
df$D

mrhd · Accepted Answer · 2019-12-16T15:38:41

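You seem to be new to R. As you go on, you will find that some things are done quite differently in R than in other languages. In particular, if expects a single TRUE or FALSE, whereas a comparison such as df$A<2.5 returns a whole vector of them, one per row; that is what the "condition has length > 1" message is about. Also note that df$Acat. (with the trailing dot) is not a column of df, so it evaluates to NULL, which is what triggers "argument is of length zero".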

For instance, to set the column C according to your conditions, you would do:

# ifelse() is vectorized: it tests every row at once.
# Rows where Acat is NA come out as NA.
df$C = ifelse(
  df$Acat=="low", 1, ifelse(
    df$Acat=="mod", 2, ifelse(
      df$Acat=="hi", 3, NA
    )))
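
The same pattern should carry over to the numeric attempt from the question. The following is a minimal sketch, assuming the example data from the question is rebuilt as shown; the only real change from the original attempt is that the element-wise & replaces &&, and ifelse() replaces the if/else chain.

```r
# Rebuild the example data from the question (A ends with one NA).
A <- seq(1, 6); B <- seq(2, 4)
length(A) <- 7; length(B) <- 7
df <- as.data.frame(cbind(A, B))

# `&` compares element by element; `&&` and `if` expect a single TRUE/FALSE.
# Where A is NA the test is NA, so ifelse() returns NA for that row.
df$D <- ifelse(
  df$A < 2.5, 1, ifelse(
    df$A >= 2.5 & df$A < 4.5, 2, ifelse(
      df$A >= 4.5, 3, NA
    )))
df$D
```

If the categorical column already exists, as.integer(df$Acat) may be a shorter route to the same codes, since cut() returns a factor whose levels are stored in the order low, mod, hi.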

If you are working with tidyverse, you can also use case_when.
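
A rough sketch of what that could look like, assuming the dplyr package is installed and using the same example data as in the question. Rows that match none of the conditions (here, the row where Acat is NA) are left as NA.

```r
library(dplyr)  # assumption: dplyr is installed

# Rebuild column A from the question and derive the categorical column.
A <- seq(1, 6); length(A) <- 7
df <- data.frame(A = A)
df$Acat <- cut(df$A,
               breaks = c(-Inf, 2.5, 4.5, Inf),
               labels = c("low", "mod", "hi"))

# case_when() checks the conditions in order for every row;
# a row that matches none of them gets NA.
df <- df %>%
  mutate(C = case_when(
    Acat == "low" ~ 1,
    Acat == "mod" ~ 2,
    Acat == "hi"  ~ 3
  ))
df$C
```

Compared with nested ifelse() calls, this keeps each condition and its value on one line, which tends to be easier to read once there are more than two or three categories.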
