I have a question about the cast and melt function of reshape2 (I think).
I have a dataset with one patient per row and all observation in columns twice (for observer x and y seperate).
Edit: - I changed the example data a little bit to resemble my real data better because variables have been tested on multiple levels in the patients -
It looks like this:
data <- data.frame(letters[1:2], 1:2, 3:4, 5:6, 7:8, 9:10, 11:12, 13:14, 15:16 )
colnames(data) <- c("pat_id", "var1_1_x", "var1_1_y", "var1_2_x", "var1_2_y", "var2_1_x", "var2_1_y", "var2_2_x", "var2_2_y")
data
For analysis, I would like to have one column per observer with all the values for all variables in it. So it would look like this (sorry for the crappy table):
pat.id variable level obs_x obs_y
a var1 1 1 3
b var1 1 2 4
a var1 2 5 7
b var1 2 6 8
a var2 1 9 11
b var2 1 10 12
a var2 2 13 15
b var2 2 14 16
When I use
dataM <- melt(data, id="pat_id")
dataM
I end up with a data frame with dim 3 x 16 with all values in one column. I can't seem to seperate the values from the two observers.
I also tried to use the 'grep' function to 'rbind' all the columns from obs_x and obs_y seperately into a new data frame and to 'cbind' from the molten data set rows with values for x and y, but some variable names contain x's and y's in their names.
Edit: - I hope this doesn't overcomplicate my question. I tried using the reshape2 function to melt the data and than colsplit based on the underscore and it works perfect. When I dcast the data frame with sample data it works as well, but when I use it on my real data (original dimensions 600x250) it doens't: the dimensions and colnames work fine, but the values for the observers x and y are changed to '1' and I don't understand why. -
Edit 2: - I renamed the real data to dummy
str(dummmy_melt)
'data.frame': 299008 obs. of 3 variables:
$ DesirNr : Factor w/ 584 levels "1001","1002",..: 10 20 31 41 43 44 45 46 47 57 ...
$ variable: Factor w/ 512 levels "X.1_F","ReaderNr.1_F",..: 1 1 1 1 1 1 1 1 1 1 ...
$ value : Factor w/ 1024 levels "01 01 MRI.pdf",..: 1 2 3 4 5 6 7 8 9 10 ...
str(dummy_split)
'data.frame': 299008 obs. of 6 variables:
$ DesirNr : Factor w/ 584 levels "1001","1002",..: 10 20 31 41 43 44 45 46 47 57 ...
$ variable: Factor w/ 512 levels "X.1_F","ReaderNr.1_F",..: 1 1 1 1 1 1 1 1 1 1 ...
$ value : Factor w/ 1024 levels "01 01 MRI.pdf",..: 1 2 3 4 5 6 7 8 9 10 ...
$ observer: chr "F" "F" "F" "F" ...
$ afwijk : chr "X" "X" "X" "X" ...
$ level : int 1 1 1 1 1 1 1 1 1 1 ...
- End of edit 2.
Maybe I am totally overlooking the obvious solution, if someone has some help I would really appreciate it!