1
votes

I am trying to combine the male and female columns I have created into one column. I tried using some answers I found on stack, but the second sex I queried was excluded.

Build Data Frame:

ID <- 1:10
SPAYDT <- c("", "2011-12-01", "", "2006-05-01", "", "", "", "", "", "")
SPAYDTU <- c(1, NA, NA, NA, NA, NA, NA, NA, NA, NA)
NEUTDT <- c("", "", "", "", "", "", "2013-03-01", "", "", "")
NEUTDTU <- c(NA, NA, NA, NA, NA, NA, NA, 1, NA, NA)
df <- as.data.frame(cbind(ID, SPAYDT, SPAYDTU, NEUTDT, NEUTDTU))
df

The goal is to have a column for sex, formated as a factor with 2 levels - Male and Female It should say female if the SPAYDT or SPAYDTU have a value in them, and male if the NEUTDT or NEUTDTU have a value in them. What I have tried:

  • using a nested if-else statement to build one sex column
  • making two columns then combining using
df$male <- ifelse(NEUTDT!="", "Male",
                  ifelse(NEUTDTU=1, "Male", NA))
df$female <- ifelse(SPAYDT!="", "Female",
                    ifelse(SPAYDTU==1, "Female", NA))
df$sex <- ifelse(!is.na(df$female), df$female, df$male)

and

df$sex <- ifelse(SPAYDT!="", "Female",
                 ifelse(SPAYDTU==1, "Female",
                        ifelse(NEUTDT!="", "Male",
                               ifelse(NEUTDTU=1, "Male", NA))))

However, no matter what I do, the sex column at the end only has one sex. I made sure my df was attached for use of column names as variables. I tried restarting R and running the setup code again. I just don't know why the ifelse statement is ignoring the second sex input.

Any help is greatly appreciated!

Clarifications: In the larger dataframe I am working with I have done data clean up so that each ID only corresponds to 1 sex. Sorry about the mistake in the code.

Desired output:

ID <- 1:10
SPAYDT <- c("", "2011-12-01", "", "2006-05-01", "", "", "", "", "", "")
SPAYDTU <- c(1, NA, NA, NA, NA, NA, NA, NA, NA, NA)
NEUTDT <- c("", "", "", "", "", "", "2013-03-01", "", "", "")
NEUTDTU <- c(NA, NA, NA, NA, NA, NA, NA, 1, NA, NA)
SEX <- c("Female", "Female", NA, "Female", NA, NA, "Male", "Male", NA, NA)
df <- as.data.frame(cbind(ID, SPAYDT, SPAYDTU, NEUTDT, NEUTDTU, SEX))
df
1
One column by definition has one value. How do you want to handle the case where both genders are assigned?Tim Biegeleisen
Are you sure you have created the dataframe correctly ? I think in the last line it should be cbind instead of rbind. Also can you update your post with the expected output ?Ronak Shah

1 Answers

0
votes

Is this what you are after?

ID <- 1:10
SPAYDT <- c("", "2011-12-01", "", "2006-05-01", "", "", "", "", "", "")
SPAYDTU <- c(1,NA,NA,NA,NA,NA,NA,NA,NA,NA)
NEUTDT <- c("", "", "", "", "", "", "2013-03-01", "", "", "")
NEUTDTU <- c(NA,NA,NA,1,NA,NA,NA,NA,NA,NA)
df <- data.frame(ID, SPAYDT, SPAYDTU, NEUTDT, NEUTDTU)

df %>% 
 mutate(
   sex = case_when(
      NEUTDT!="" | NEUTDTU==1 ~ "Male", 
      SPAYDT!="" | SPAYDTU==1 ~ "Female", 
      TRUE ~ NA_character_))