I have a dataset like this (but this is just a subset; the real dataset has hundreds of ID_Desc variables), where each data point has a person's gender, and whether they checked off a number of descriptors (1) or not (NA):

Gender  ID1_Desc_1  ID1_Desc_2  ID1_Desc_3  ID2_Desc_1  ID2_Desc_2  ID2_Desc_3  ID3_Desc_1  ID3_Desc_2  ID3_Desc_3
1       NA          NA          1           NA          NA          1           NA          NA           NA
2       NA          1           1           NA          NA          NA          1           1            NA
1       1           1           1           NA          1           NA          NA          NA           NA

I'm trying to write a loop that will (1) check their gender, (2) based on their gender, check whether they checked off the same descriptor in the first list they saw (lists ID1 and ID2 for Gender=1 and lists ID1 and ID3 for Gender=2), and (3) create a new variable (Same#) that indicates whether they checked off the same descriptor in both lists (by writing a 1) or not (by writing a 0).

I've been working with this code, which seems to be checking their gender ok and creating the new variables (Same#), but it's writing 0's for everything, which is not correct:

for (i in 1:3){
  assign(paste("Same",i,sep=""),
  ifelse(Gender=="1",
         ifelse(paste("ID1_Desc_",i,sep="")==paste("ID2_Desc_",i,sep=""),1,0),
         ifelse(paste("ID1_Desc_",i,sep="")==paste("ID3_Desc_",i,sep=""),1,0)
         )
  )
}

Based on the data I provided, the values of Same1, Same2, Same3 should be 0 0 1 for the first row (since Gender=1 and they chose Desc_3 in both the ID1 and ID2 lists), 0 1 0 for the second row (since Gender=2 and they chose Desc_2 in both the ID1 and ID3 lists), and 0 1 0 for the third row (since Gender=1 and they chose Desc_2 in both the ID1 and ID2 lists), but right now all three variables come out as 0 0 0.

I know using loops may not be the best way to do this, but I'd really like to know how to do it with a loop if it's possible. If not, anything that works would be incredibly appreciated. Thanks.

Would you be able to rephrase what you are trying to do? I cannot see what you are trying to get. - jazzurro
Sorry; I want to get 39 new data frames, named Same1 through Same39, where each one contains a series of 0's and 1's indicating whether, for each data point, the value of the column ID1_Desc_# == ID2_Desc_# (if Gender = 1) or ID1_Desc_# == ID3_Desc_# (if Gender = 2). Does that make sense? - abclist19

1 Answer

You may try this, which compares the ID1 columns with the ID2 and ID3 columns element-wise:

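 # Column positions for each list (assumes descriptors appear in the same order)
 ind1 <- grep("^ID1", colnames(df))
 ind2 <- grep("^ID2", colnames(df))
 ind3 <- grep("^ID3", colnames(df))
 # Element-wise comparisons: TRUE where both are 1, NA where either is NA
 cond1 <- do.call(cbind,Map(`==` , df[ind1], df[ind2]))
 cond2 <- do.call(cbind,Map(`==` , df[ind1], df[ind3]))
 # Per respondent, keep a match from either comparison (Gender is not consulted)
 Finalind <- do.call(cbind, Map(`|`, as.data.frame(t(cond1)),
                     as.data.frame(t(cond2))))
 # Convert TRUE/NA to 1/0; rows are descriptors here, so transpose for display
 res <- (!is.na(Finalind))+0
 rownames(res) <- paste0("Same", seq_along(ind1))
 t(res)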
 #    Same1 Same2 Same3
 #V1     0     0     1
 #V2     0     1     0
 #V3     0     1     0


 cbind(df, t(res))
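
Since the question asks for a loop: the original loop returns all 0's because `paste()` only builds the column *names*, so it ends up comparing the strings `"ID1_Desc_1"` and `"ID2_Desc_1"`, which are never equal. A minimal sketch of a loop version, assuming the data frame is called `df` (as in the data section) and that the number of descriptors per list is known (`n_desc` is a name introduced here for illustration), could fetch the columns by name with `[[` instead:

```r
# Sample data from the question (1 = checked, NA = not checked)
df <- data.frame(
  Gender     = c(1L, 2L, 1L),
  ID1_Desc_1 = c(NA, NA, 1L), ID1_Desc_2 = c(NA, 1L, 1L), ID1_Desc_3 = c(1L, 1L, 1L),
  ID2_Desc_1 = c(NA, NA, NA), ID2_Desc_2 = c(NA, NA, 1L), ID2_Desc_3 = c(1L, NA, NA),
  ID3_Desc_1 = c(NA, 1L, NA), ID3_Desc_2 = c(NA, 1L, NA), ID3_Desc_3 = c(NA, NA, NA)
)

n_desc <- 3  # descriptors per list (assumption: known in advance)

for (i in seq_len(n_desc)) {
  # Look up the actual columns; paste0() alone only produces the name string
  first  <- df[[paste0("ID1_Desc_", i)]]
  # Gender 1 is compared against list ID2, everyone else against list ID3
  second <- ifelse(df$Gender == 1,
                   df[[paste0("ID2_Desc_", i)]],
                   df[[paste0("ID3_Desc_", i)]])
  # 1 only when the descriptor is checked in both lists; NA counts as unchecked
  df[[paste0("Same", i)]] <- as.integer(!is.na(first) & !is.na(second) & first == second)
}

df[paste0("Same", 1:3)]
#   Same1 Same2 Same3
# 1     0     0     1
# 2     0     1     0
# 3     0     1     0
```

This writes the results as new columns of `df` rather than as separate objects via `assign()`, which tends to be easier to work with when there are many descriptors.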

data

df <- structure(list(Gender = c(1L, 2L, 1L), ID1_Desc_1 = c(NA, NA, 
1L), ID1_Desc_2 = c(NA, 1L, 1L), ID1_Desc_3 = c(1L, 1L, 1L), 
ID2_Desc_1 = c(NA, NA, NA), ID2_Desc_2 = c(NA, NA, 1L), ID2_Desc_3 = c(1L, 
NA, NA), ID3_Desc_1 = c(NA, 1L, NA), ID3_Desc_2 = c(NA, 1L, 
NA), ID3_Desc_3 = c(NA, NA, NA)), .Names = c("Gender", "ID1_Desc_1", 
"ID1_Desc_2", "ID1_Desc_3", "ID2_Desc_1", "ID2_Desc_2", "ID2_Desc_3", 
"ID3_Desc_1", "ID3_Desc_2", "ID3_Desc_3"), class = "data.frame",
 row.names = c(NA, -3L))
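
The `|` step in the code above keeps a match from either comparison without looking at `Gender`. That gives the expected result on this sample, but a Gender=1 respondent who matched only in list ID3 would also be counted. A possible vectorized variation that selects the comparison by `Gender` is sketched below; the names `both12`, `both13` and `use13` are introduced here for illustration, and it assumes `Gender` is always 1 or 2.

```r
# Sample data from the question (1 = checked, NA = not checked)
df <- data.frame(
  Gender     = c(1L, 2L, 1L),
  ID1_Desc_1 = c(NA, NA, 1L), ID1_Desc_2 = c(NA, 1L, 1L), ID1_Desc_3 = c(1L, 1L, 1L),
  ID2_Desc_1 = c(NA, NA, NA), ID2_Desc_2 = c(NA, NA, 1L), ID2_Desc_3 = c(1L, NA, NA),
  ID3_Desc_1 = c(NA, 1L, NA), ID3_Desc_2 = c(NA, 1L, NA), ID3_Desc_3 = c(NA, NA, NA)
)

ind1 <- grep("^ID1", colnames(df))
ind2 <- grep("^ID2", colnames(df))
ind3 <- grep("^ID3", colnames(df))

# Logical matrices: TRUE where a descriptor is checked (non-NA) in both lists
both12 <- !is.na(df[ind1]) & !is.na(df[ind2])
both13 <- !is.na(df[ind1]) & !is.na(df[ind3])

# Start from the ID1-vs-ID2 result, then overwrite the Gender=2 rows
# with the ID1-vs-ID3 result
res <- both12
use13 <- df$Gender == 2
res[use13, ] <- both13[use13, , drop = FALSE]

# Convert TRUE/FALSE to 1/0 and label the columns Same1, Same2, ...
res <- res + 0L
colnames(res) <- paste0("Same", seq_along(ind1))

cbind(df, res)
```

Because the checked/unchecked values are only ever 1 or NA, testing for non-NA in both columns is equivalent to testing equality here.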