2
votes

I am struggling to combine two matrices with different numbers of rows that span the same (or a similar) time frame. I want to merge the two matrices into one according to the time dimension, filling in zeroes on the rows where the second matrix has no data.

In the following example, I have a 5x2 and a 3x1 matrix whose row names are the corresponding times.

Input 1

                   [,1] [,2]
20160518  15:31:00    1    1
20160518  15:32:00    2    1
20160518  15:33:00    3    1
20160518  15:34:00    4    1
20160518  15:35:00    5    1

Input 2

                  [,1]                      
20160518  15:31:00 100
20160518  15:34:00 101
20160518  15:35:00 102

Desired result

                   [,1] [,2] [,3]
20160518  15:31:00    1    1  100
20160518  15:32:00    2    1    0
20160518  15:33:00    3    1    0
20160518  15:34:00    4    1  101
20160518  15:35:00    5    1  102

My second question is very similar. Instead of matching on identical row names, I want to match on identical values in a column. That is, imagine the row names are a separate column of each matrix (so I have a 5x3 and a 3x2 matrix), and I want to combine them into one following the same logic as above.

I would really appreciate your help. I have searched for many hours for a solution and tried all sorts of merge, cbind and dplyr commands. I am probably missing some small detail, but I cannot figure it out. The topic that came closest is the following, but I still cannot tailor it to my problem:

combining two data frames of different lengths

Best, P.

2 Answers

1
votes

If your row names are properly set, unique, and so on, you can do:

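idx <- match (rownames (input1), rownames (input2))  # row positions in input2, NA where absent
input3 <- input2[idx, , drop = FALSE]                # reorder input2 as input1, keep matrix shape
input3[is.na (input3)] <- 0                          # replace the missing values by 0
rownames (input3) <- rownames (input1)               # restore the time stamps on the filled rows
cbind (input1, input3)                               # combine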

Regarding your second question, you can always use the column you want to match on as row names (use paste if you need several columns to identify your rows uniquely).

As an alternative to the above solution, you can use data frames instead of matrices and include the row names as a character column. Then you should be able to use functions like merge or dplyr::full_join.
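
For the second question, here is a minimal sketch of matching on a key column instead of on row names, using base R's match. The matrices a and b, their column names, and the numeric key values are made up for illustration; they are not the asker's data.

```r
# Hypothetical inputs: column "key" identifies each row (e.g. a numeric time stamp)
a <- cbind(key = c(1531, 1532, 1533, 1534, 1535), v1 = 1:5, v2 = 1)
b <- cbind(key = c(1531, 1534, 1535), w = 100:102)

# Position of each key of `a` within `b`; NA where `b` has no such key
idx <- match(a[, "key"], b[, "key"])

# Take the non-key columns of `b` in the row order of `a`, keeping matrix shape
extra <- b[idx, -1, drop = FALSE]
extra[is.na(extra)] <- 0  # rows absent from `b` are filled with zeros

res <- cbind(a, extra)
res
```

If several columns together identify a row, one option is to paste them into a single key string first and run match on that.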

1
votes

In my opinion you should be working with data frames, not matrices. Matrices are meant to be used with numerical data, whereas here you have a mixture of numerical and categorical data.

> x <- cbind(t=rownames(x), as.data.frame(unname(x)))
> y <- cbind(t=rownames(y), as.data.frame(unname(y)))
> xy <- merge(x, y, by='t', all=TRUE)
> xy[is.na(xy)] <- 0
> xy
                   t V1.x V2 V1.y
1 20160518  15:31:00    1  1  100
2 20160518  15:32:00    2  1    0
3 20160518  15:33:00    3  1    0
4 20160518  15:34:00    4  1  101
5 20160518  15:35:00    5  1  102

Then if you really want the result in matrix form you can do as.matrix(xy[-1]).
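
Note that as.matrix(xy[-1]) drops the time stamps. If you want them back as row names, something along these lines should work. This is only a sketch: it rebuilds the same example inputs in a more compact way, and the names times and m are introduced here for illustration.

```r
# Same example inputs as in the question, built compactly
times <- sprintf("20160518  15:%d:00", 31:35)
x <- matrix(c(1:5, rep(1L, 5)), ncol = 2, dimnames = list(times, NULL))
y <- matrix(100:102, ncol = 1, dimnames = list(times[c(1, 4, 5)], NULL))

# Move the row names into a key column `t`, then do a full outer join on it
xd <- cbind(t = rownames(x), as.data.frame(unname(x)))
yd <- cbind(t = rownames(y), as.data.frame(unname(y)))
xy <- merge(xd, yd, by = "t", all = TRUE)

# Back to a matrix: drop the key column, zero-fill, restore the time stamps
m <- as.matrix(xy[-1])
m[is.na(m)] <- 0
rownames(m) <- as.character(xy$t)
m
```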

The data:

x <- structure(c(1L, 2L, 3L, 4L, 5L, 1L, 1L, 1L, 1L, 1L),
               .Dim = c(5L, 2L),
               .Dimnames = list(c("20160518  15:31:00",
                                  "20160518  15:32:00",
                                  "20160518  15:33:00",
                                  "20160518  15:34:00",
                                  "20160518  15:35:00"), NULL))

y <- structure(100:102, .Dim = c(3L, 1L),
               .Dimnames = list(c("20160518  15:31:00",
                                  "20160518  15:34:00",
                                  "20160518  15:35:00"), NULL))