I'm brand new to R and Stack Overflow, so I hope I'm asking this question correctly.
I have numerous string variables that I need to recode into unique columns. The data are collected from a survey. For example, if a respondent selected "2-black" and "22-hispanic", the data are recorded in the variable "string" as "2;22".
I need to recode the variables into unique binary variables with colnames as: "Black", "White", "Hispanic", etc. The columns should be populated as "TRUE" or "FALSE" by searching for number patterns in the string value.
I tried writing a function using "grepl" but it's no good. First I had to create an object "string" from the data frame (code not included). Then I ran into problems distinguishing between, say, "2" and "22".
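To show the "2" versus "22" problem in isolation, here is a minimal sketch using the same sample values as my data frame below (the names `x`, `hits_2` and `hits_22` are just for illustration). As far as I can tell, `grepl` matches substrings, so the pattern "2" is found inside "22" and "20" as well:

```r
# Sample values, same as in the data frame further down
x <- c("22;2", "20", "40,20", "2")

# Substring matching: "2" is found in every element, not only where the code is exactly 2
hits_2 <- grepl("2", x)

# "22" only appears in the first element
hits_22 <- grepl("22", x)

print(hits_2)
print(hits_22)
```

So searching for "2" cannot tell respondents who picked code 2 apart from those who picked 20 or 22.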
If you run the code below, you can see it's not working as intended:
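strg_to_many <- function(newcol, string, number) {
  # Add one logical column per code in `number`, named after `newcol`
  for (i in seq_along(number)) {
    string <- newcol[i]  # name of the column to create
    # grepl() matches substrings, so "2" also matches "22" and "20"
    df_temp[string] <- grepl(number[i], df_temp$string)
  }
  return(df_temp)
}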
df_temp<-data.frame(string=c("22;2", "20", "40,20", "2"))
newcol<-c("black" , "white", "hispanic", "other")
number<-c("2", "20", "22", "40")
string<-c("22;2", "20", "40;20", "2")
df <- strg_to_many(newcol, string, number)
The output I expect is:
- string black white hispanic other
- 22;2 TRUE FALSE TRUE FALSE
- 20 FALSE TRUE FALSE FALSE
- 40;20 FALSE TRUE FALSE TRUE
- 2 TRUE FALSE FALSE FALSE
Thank you for any help!
For 40,20: will that be Other == TRUE & white == TRUE? In the case of two numbers, how are they separated? In your example you seem to have both a semicolon and a comma. It would help if you were to provide the full expected output for the sample data you give (not just one row). – Maurits Evers
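One possible approach, offered as a sketch rather than a definitive answer: split each string into whole codes and test membership, instead of searching for substrings. This assumes that both ";" and "," act as separators (the sample data contains both, and the question does not settle which is intended). The function name `codes_to_flags` and its `sep` argument are hypothetical, not taken from the question.

```r
# Hypothetical helper: one logical column per code, based on whole-code matching
codes_to_flags <- function(df, newcol, number, sep = "[;,]") {
  # strsplit() returns one character vector of codes per row
  parts <- strsplit(as.character(df$string), sep)
  for (i in seq_along(number)) {
    # %in% compares whole codes, so "2" does not match "22" or "20"
    df[[newcol[i]]] <- vapply(
      parts,
      function(p) number[i] %in% trimws(p),
      logical(1)
    )
  }
  df
}

df_temp <- data.frame(string = c("22;2", "20", "40,20", "2"))
newcol <- c("black", "white", "hispanic", "other")
number <- c("2", "20", "22", "40")

result <- codes_to_flags(df_temp, newcol, number)
print(result)
```

Passing the data frame in as an argument, rather than reading `df_temp` from the global environment, keeps the function reusable for the other string variables. If only ";" is a valid separator, `sep = ";"` should be enough.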