I have a dataframe, dfregion, which looks as follows:
dput(dfregion)
structure(list(region = structure(c(1L, 2L, 3L, 3L, 1L), .Label = c("East",
"New England", "Southeast"), class = "factor"), words = structure(c(4L,
2L, 1L, 3L, 5L), .Label = c("buildings, tallahassee", "center, mass, visitors",
"god, instruct, estimated", "seeks, metropolis, convey", "teaching, academic, metropolis"
), class = "factor")), .Names = c("region", "words"), row.names = c(NA,
-5L), class = "data.frame")
       region                          words
1        East      seeks, metropolis, convey
3 New England         center, mass, visitors
4   Southeast         buildings, tallahassee
5   Southeast       god, instruct, estimated
6        East teaching, academic, metropolis
I am trying to "melt" or "reshape" this dataframe by region and then paste the words for each region together.
The following code is what I have tried:
dfregionnew<-dcast(dfregion, region ~ words,fun.aggregate= function(x) paste(x) )
dfregionnew<-dcast(dfregion, region ~ words, paste)
dfregionnew <- melt(dfregion,id=c("region"),variable_name="words")
Finally, I did the following; however, I am not sure this is the best way to accomplish what I want:
dfregionnew<-ddply(dfregion, .(region), mutate, index= paste0('words', 1:length(region)))
dfregionnew<-dcast(dfregionnew, region~ index, value.var ='words')
The result is a dataframe reshaped in the right way, but each "words" column is separate. I then tried to paste these columns together and got various errors:
dfregionnew$new<-lapply(dfregionnew[,2:ncol(dfregionnew)], paste, sep=",")
dfregionnew$new<-ldply(apply(dfregionnew, 1, function(x) data.frame(x = paste(x[2:ncol(dfregionnew)], sep=",", collapse=NULL))))
dfregionnew$new <- apply( dfregionnew[ , 2:ncol(dfregionnew) ] , 1 , paste , sep = "," )
I was able to solve that problem with something similar to the following:
dfregionnew$new <- apply( dfregionnew[ , 2:5] , 1 , paste , collapse = "," )
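The difference between the failing and the working calls seems to come down to `sep` versus `collapse` in `paste`. A minimal sketch, using a hypothetical vector `x` in place of one row of the reshaped dataframe:

```r
# Hypothetical stand-in for the word columns of a single row
x <- c("seeks, metropolis, convey", "teaching, academic, metropolis")

# sep only goes between the separate arguments passed to paste();
# with a single vector argument the result keeps one element per input
sep_result <- paste(x, sep = ",")
length(sep_result)

# collapse joins the elements of the result into a single string
collapse_result <- paste(x, collapse = ",")
length(collapse_result)
```

So `sep` alone leaves one string per column, while `collapse` produces the single pasted value per row.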
I guess my real question is: would it be possible to do this in one step using melt or dcast, without having to paste the various columns together after they are output? I am very interested in improving my skills and would love to learn faster or better practices in R. Thanks in advance!
dput of the input, but I'm still not clear on the exact output you want. You just want all the values in the "word" column pasted together grouped by "region"? – A5C1D2H2I1M1N2O1R2T1
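If the goal is what the comment describes (one row per region, with all of that region's words pasted into a single string), the reshape step may not be needed at all. The following is only a sketch under that assumption, using base R's `aggregate`; the separator `", "` is an arbitrary choice, and the data frame is rebuilt here from the `dput` values so the snippet runs on its own:

```r
# Rebuild the example data (same values as the dput output in the question)
dfregion <- data.frame(
  region = c("East", "New England", "Southeast", "Southeast", "East"),
  words  = c("seeks, metropolis, convey", "center, mass, visitors",
             "buildings, tallahassee", "god, instruct, estimated",
             "teaching, academic, metropolis"),
  stringsAsFactors = TRUE
)

# Group by region and collapse each group's words into one string.
# paste() converts the factor to character before joining.
dfregionnew <- aggregate(words ~ region, data = dfregion,
                         FUN = function(x) paste(x, collapse = ", "))

dfregionnew
```

Since plyr is already in use in the question, `ddply(dfregion, .(region), summarise, words = paste(words, collapse = ", "))` should give an equivalent result, though whether either is "best" depends on the size of the data and the packages already in play.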