I have a very large dataset, so I want to avoid loops.
I have three columns of data:
col1 = time presented as 10000, 10001, 10002, 10100, 10101, 10102, 10200, 10201, 10202, 10300, ... (total 18000 times)
col2 = id number 1 2 3 4 ... (total 500 ids)
col3 = reading associated with a particular id at a particular time: 0.1 0.5 0.6 0.7 ... Say the whole three-column data set is called Data3.
10000 1 0.1
10001 1 0.5
10002 1 0.6
10100 1 0.7
10200 1 0.6 (NOTE - some random entries missing)
I want to present this as a matrix (called DataMatrix), but there is missing data, so a simple reshape will not do. I want to have the missing data as NA entries.
DataMatrix is currently an NA matrix of 500 columns and 18000 rows, where the row names and column names are the times and ids respectively.
1 2 3 4 ....
10000 NA NA NA NA ....
10001 NA NA NA NA ....
Is there a way to get R to go through each row of Data3 and fill in DataMatrix, placing the reading Data3[,3] in the row and column of the matrix whose names match Data3[,1] and Data3[,2], without using loops?
Thanks to all you smart people out there.
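One loop-free way to do the fill described above is base R's matrix indexing: look up each reading's row and column position with match(), then assign through a two-column index matrix. The following is only a minimal sketch on toy data. The values and the names times, ids, rowIdx and colIdx are made up for illustration, and it assumes every time and id in Data3 appears in the row and column names of DataMatrix.

```r
# Toy stand-in for Data3 (hypothetical values): time, id, reading.
Data3 <- data.frame(
  col1 = c(10000L, 10001L, 10002L, 10000L, 10002L),
  col2 = c(1L, 1L, 1L, 2L, 2L),
  col3 = c(0.1, 0.5, 0.6, 0.9, 0.4)
)

# NA matrix with times as row names and ids as column names.
times <- c(10000L, 10001L, 10002L)
ids <- c(1L, 2L, 3L)
DataMatrix <- matrix(NA_real_, nrow = length(times), ncol = length(ids),
                     dimnames = list(as.character(times), as.character(ids)))

# Find the row and column position of each reading by name.
rowIdx <- match(as.character(Data3[, 1]), rownames(DataMatrix))
colIdx <- match(as.character(Data3[, 2]), colnames(DataMatrix))

# A two-column index matrix assigns one cell per (row, column) pair;
# cells with no reading in Data3 stay NA.
DataMatrix[cbind(rowIdx, colIdx)] <- Data3[, 3]
```

Because the matrix is created up front, times or ids that have no readings at all still keep their (all-NA) row or column.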
library(reshape2); DataMatrix <- dcast(Data3, col1~col2, value.var="col3")
? - lukeA
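As a rough sketch of what the comment's suggestion does, here it is on toy data. The values are hypothetical, the reshape2 package is assumed to be installed, and the columns of Data3 are assumed to be named col1, col2 and col3 as in the comment. dcast returns a data frame with col1 as its first column. If a matrix with row and column names is wanted, acast from the same package takes the same formula.

```r
library(reshape2)  # assumes the reshape2 package is installed

# Toy stand-in for Data3 (hypothetical values): time, id, reading.
Data3 <- data.frame(
  col1 = c(10000L, 10001L, 10002L, 10000L, 10002L),
  col2 = c(1L, 1L, 1L, 2L, 2L),
  col3 = c(0.1, 0.5, 0.6, 0.9, 0.4)
)

# dcast: a data frame, with the times in a col1 column and one column per id.
wide <- dcast(Data3, col1 ~ col2, value.var = "col3")

# acast: a matrix, with the times as row names and the ids as column names.
DataMatrix <- acast(Data3, col1 ~ col2, value.var = "col3")
```

Missing time/id combinations come out as NA. Unlike filling a pre-built matrix, this only produces rows and columns for times and ids that occur somewhere in Data3.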