A matrix cov_mat stores the covariances between variables:
a_plane a_boat a_train b_plane b_boat b_train c_plane c_boat c_train d_plane …
a_plane 4.419 -0.583 0.446 0.018 -1.291 3.159 -0.954 0.488 3.111 1.100
a_boat -0.583 2.636 1.813 -1.511 -0.420 -0.757 1.698 1.668 1.091 0.120
a_train 0.446 1.813 2.668 -0.365 -0.183 1.040 1.347 1.813 0.806 -0.324
b_plane 0.018 -1.511 -0.365 2.498 1.153 1.498 -0.465 -1.157 -0.775 0.133
b_boat -1.291 -0.420 -0.183 1.153 1.043 -0.194 0.243 -0.361 -0.981 -0.040
b_train 3.159 -0.757 1.040 1.498 -0.194 4.153 -0.208 0.257 1.922 1.434
c_plane -0.954 1.698 1.347 -0.465 0.243 -0.208 1.791 0.909 0.259 0.394
c_boat 0.488 1.668 1.813 -1.157 -0.361 0.257 0.909 2.290 1.572 0.269
c_train 3.111 1.091 0.806 -0.775 -0.981 1.922 0.259 1.572 4.097 2.001
d_plane 1.100 0.120 -0.324 0.133 -0.040 1.434 0.394 0.269 2.001 2.231
…
final_need is a data frame with a row for each category of transportation (plane, boat, train) and a column for every possible covariance within a given category:
aa ab ac ad ba bb bc bd ca cb … <dd>
plane 4.419 0.018 -0.954 1.100 0.018 2.498 -0.465 0.133 -0.954 -0.465 …
boat 2.636 -0.420 1.668 0.120 -0.420 1.043 -0.361 …
train …
<…>
To get from cov_mat to final_need, I've converted the matrix to an edgelist via igraph, then eliminated the rows of that edgelist that held out-of-category covariances (e.g., a_plane covaries with a_boat, but I couldn't care less). Here's the result:
> head(cov_edgelist_slim)
from to covariance
a_plane a_plane 4.419
a_plane b_plane 0.018
a_plane c_plane -0.954
a_plane d_plane 1.100
b_plane a_plane …
… … …
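For illustration, the same kind of slimmed-down edgelist can be built in base R. This is a minimal sketch, not the igraph route described above: it uses only a 4x4 subset of the values shown in cov_mat (groups a and b, plane and boat), and the helper name category() is hypothetical.

```r
# Toy subset of cov_mat: groups a and b, categories plane and boat only.
vars <- c("a_plane", "a_boat", "b_plane", "b_boat")
cov_mat <- matrix(
  c( 4.419, -0.583,  0.018, -1.291,
    -0.583,  2.636, -1.511, -0.420,
     0.018, -1.511,  2.498,  1.153,
    -1.291, -0.420,  1.153,  1.043),
  nrow = 4, byrow = TRUE, dimnames = list(vars, vars)
)

# Long format: one row per (row name, column name) pair.
cov_edgelist <- as.data.frame(as.table(cov_mat), stringsAsFactors = FALSE)
names(cov_edgelist) <- c("from", "to", "covariance")

# Hypothetical helper: strip the group prefix, so "a_plane" becomes "plane".
category <- function(x) sub("^[^_]*_", "", x)

# Keep only the pairs whose two variables share a category.
keep <- category(cov_edgelist$from) == category(cov_edgelist$to)
cov_edgelist_slim <- cov_edgelist[keep, ]
```

The sketch assumes every variable name has the form group_category with a single underscore; other naming schemes would need a different pattern.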
I then tried to use dcast() from reshape2, but I'm stuck on how to use the function to produce final_need. Any thoughts? If there's a simpler way than the one I'm heading down, I'm happy to hear it!
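One possible shape for the dcast() call, as a rough sketch rather than a confirmed solution: derive a category column and a group-pair column from the variable names, then cast category against pair. The column names category and pair are assumptions introduced here, and the toy edgelist holds only the a and b groups for plane and boat.

```r
library(reshape2)

# Toy version of cov_edgelist_slim (a and b groups, plane and boat only).
cov_edgelist_slim <- data.frame(
  from = c("a_plane", "a_plane", "b_plane", "b_plane",
           "a_boat",  "a_boat",  "b_boat",  "b_boat"),
  to   = c("a_plane", "b_plane", "a_plane", "b_plane",
           "a_boat",  "b_boat",  "a_boat",  "b_boat"),
  covariance = c(4.419, 0.018, 0.018, 2.498,
                 2.636, -0.420, -0.420, 1.043),
  stringsAsFactors = FALSE
)

# Category is the part after the underscore ("plane", "boat").
cov_edgelist_slim$category <- sub("^[^_]*_", "", cov_edgelist_slim$from)

# Pair label is the two group letters glued together ("aa", "ab", ...).
cov_edgelist_slim$pair <- paste0(
  sub("_.*$", "", cov_edgelist_slim$from),
  sub("_.*$", "", cov_edgelist_slim$to)
)

# One row per category, one column per group pair.
final_need <- dcast(cov_edgelist_slim, category ~ pair, value.var = "covariance")
```

In this sketch the categories end up in an ordinary column rather than in the row names; setting rownames(final_need) from that column afterwards would match the layout shown earlier.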