I am a beginner programmer using R for social-network analysis, and I have a problem that I am not sure how to solve.

WHAT I HAVE:

  1. An adjacency matrix stored as a csv file with the following information:
     a) The households in the first row and the households in the first column interact with each other by sharing resources.
     b) The interactions are ties represented by kinship numbers. The smaller the number, the closer (or stronger) the kinship connection. For example, 1 is parent-child kinship and 100 is no kinship. A household's tie to itself is NA.
     c) File snippet:
     [,1] [,2] [,3] [,4] [,5]
 [1,]  NA   100  2    1    100
 [2,]  4    NA   100  100  3
 [3,]  100  3    NA   2    4
 [4,]  100  1    5    NA   100
 [5,]  1    100  4    100  NA

WHAT I NEED:

  1. I need to convert this adjacency matrix into an edge list with three columns ("HH1", "HH2", "HHKinRank") in order to complete additional kinship calculations.

  2. This edge list must be saved as a new csv file for further analysis.

  3. My main concern is that the edge list should contain only the numerical values. If there is no tie (NA), will that pair still show up in the edge list?

WHAT I HAVE DONE:

I tried reading the csv file into a new variable: HHKinRank.el <- read.csv("HouseholdKinRank.csv").

The most frustrating part was working out which libraries I might need. There are many candidate functions, such as melt, and troubleshooting is hard because I may also be assigning values incorrectly.

I can go from an edge list to a matrix, but I cannot work out the commands for the opposite direction.

Thank you for any assistance with this.

2 Answers

0
votes

You can do this with the network package for R, and probably with igraph as well.

library(network)

# create the example data
adjMat <- matrix(c(NA, 100,  2,    1,    100,
                    4, NA,   100,  100,  3,
                  100, 3,    NA,   2,    4,
                  100, 1,    5,    NA,   100,
                  1,   100,  4,    100,  NA),
                 ncol = 5,byrow=TRUE)

# create a network object
net<-as.network(adjMat,matrix.type='adjacency',
                ignore.eval = FALSE,  # read edge values from matrix as attribute
                names.eval='kinship', # name the attribute
                loops=FALSE)   # ignore self-edges

# convert to an edgelist matrix
el <-as.edgelist(net,attrname = 'kinship')

# relabel the columns
colnames(el)<-c("HH1", "HH2", "HHKinRank")

# check results
el
      HH1 HH2 HHKinRank
 [1,]   1   2       100
 [2,]   1   3         2
 [3,]   1   4         1
 [4,]   1   5       100
 [5,]   2   1         4
 [6,]   2   3       100
 [7,]   2   4       100
 [8,]   2   5         3
 [9,]   3   1       100
[10,]   3   2         3
[11,]   3   4         2
[12,]   3   5         4
[13,]   4   1       100
[14,]   4   2         1
[15,]   4   3         5
[16,]   4   5       100
[17,]   5   1         1
[18,]   5   2       100
[19,]   5   3         4
[20,]   5   4       100

# write edgelist matrix to csv file
# row.names = FALSE keeps the file to the three edge list columns
write.csv(el, file = 'myEdgelist.csv', row.names = FALSE)
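
The question describes 100 as "no kinship", so you may also want to leave those rows out before saving. Below is a minimal sketch of that filter on a small hand-made edge list (not the full output above). The names `no_kin`, `kin_only` and `out_file` are just illustrative, and treating 100 as "no tie" is an assumption about how you want to handle it.

```r
# A small hand-made edge list with the same column names as above
el <- matrix(c(1, 2, 100,
               1, 3, 2,
               2, 1, 4),
             ncol = 3, byrow = TRUE,
             dimnames = list(NULL, c("HH1", "HH2", "HHKinRank")))

# Assumption: 100 is the code for "no kinship" and should be dropped
no_kin <- 100
kin_only <- el[el[, "HHKinRank"] != no_kin, , drop = FALSE]

# Write to a temporary file without row names, then read it back to check
out_file <- tempfile(fileext = ".csv")
write.csv(kin_only, file = out_file, row.names = FALSE)
check <- read.csv(out_file)
check
```

`drop = FALSE` keeps the result a matrix even if only one row survives the filter.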
0
votes

Original data:

adj_mat <- matrix(
  c(NA, 100, 2, 1, 100,
    4, NA, 100, 100, 3,
    100, 3, NA, 2, 4,
    100, 1, 5, NA, 100,
    1, 100, 4, 100, NA
    ),
  nrow = 5, ncol = 5, byrow = TRUE
  )

adj_mat
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]   NA  100    2    1  100
#> [2,]    4   NA  100  100    3
#> [3,]  100    3   NA    2    4
#> [4,]  100    1    5   NA  100
#> [5,]    1  100    4  100   NA

1) Assemble row indices, column indices, and the adjacency matrix's values into a list of 3 matrices:

rows_cols_vals_matrices <-  list(row_indices = row(adj_mat),
                                 col_indices = col(adj_mat), 
                                 values = adj_mat)
rows_cols_vals_matrices
#> $row_indices
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    1    1    1    1
#> [2,]    2    2    2    2    2
#> [3,]    3    3    3    3    3
#> [4,]    4    4    4    4    4
#> [5,]    5    5    5    5    5
#> 
#> $col_indices
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    2    3    4    5
#> [2,]    1    2    3    4    5
#> [3,]    1    2    3    4    5
#> [4,]    1    2    3    4    5
#> [5,]    1    2    3    4    5
#> 
#> $values
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]   NA  100    2    1  100
#> [2,]    4   NA  100  100    3
#> [3,]  100    3   NA    2    4
#> [4,]  100    1    5   NA  100
#> [5,]    1  100    4  100   NA

2) Flatten the matrices:

vectorized_matrices <- lapply(rows_cols_vals_matrices, as.vector)
vectorized_matrices
#> $row_indices
#>  [1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
#> 
#> $col_indices
#>  [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5
#> 
#> $values
#>  [1]  NA   4 100 100   1 100  NA   3   1 100   2 100  NA   5   4   1 100
#> [18]   2  NA 100 100   3   4 100  NA
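
The ordering above follows from R storing matrices column by column, so `as.vector()` walks down the first column, then the second, and so on. A tiny sketch on a made-up 2 x 3 matrix (the names `m`, `flat_rows` etc. are only for illustration) shows why the row indices cycle fastest:

```r
m <- matrix(1:6, nrow = 2)   # filled column by column: (1,2), (3,4), (5,6)

flat_vals <- as.vector(m)        # values in column-major order
flat_rows <- as.vector(row(m))   # row index of each value
flat_cols <- as.vector(col(m))   # column index of each value

# For a row-by-row ordering instead, transpose before flattening
flat_by_row <- as.vector(t(m))

flat_vals
flat_rows
flat_cols
flat_by_row
```

Because all three matrices are flattened in the same order, position i in each vector always refers to the same cell.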

3) Bind the vectors into a three-column matrix:

melted <- do.call(cbind, vectorized_matrices)
head(melted)
#>      row_indices col_indices values
#> [1,]           1           1     NA
#> [2,]           2           1      4
#> [3,]           3           1    100
#> [4,]           4           1    100
#> [5,]           5           1      1
#> [6,]           1           2    100

4) Drop rows where column 3 is NA:

filtered <- melted[!is.na(melted[, 3]), ]
filtered
#>       row_indices col_indices values
#>  [1,]           2           1      4
#>  [2,]           3           1    100
#>  [3,]           4           1    100
#>  [4,]           5           1      1
#>  [5,]           1           2    100
#>  [6,]           3           2      3
#>  [7,]           4           2      1
#>  [8,]           5           2    100
#>  [9,]           1           3      2
#> [10,]           2           3    100
#> [11,]           4           3      5
#> [12,]           5           3      4
#> [13,]           1           4      1
#> [14,]           2           4    100
#> [15,]           3           4      2
#> [16,]           5           4    100
#> [17,]           1           5    100
#> [18,]           2           5      3
#> [19,]           3           5      4
#> [20,]           4           5    100
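
If you don't need the intermediate steps, base R's `which(..., arr.ind = TRUE)` should get you to the same place more compactly. This is a sketch of an alternative route, not what the steps above do internally. The names `idx` and `el` are illustrative.

```r
adj_mat <- matrix(
  c(NA, 100, 2, 1, 100,
    4, NA, 100, 100, 3,
    100, 3, NA, 2, 4,
    100, 1, 5, NA, 100,
    1, 100, 4, 100, NA),
  nrow = 5, ncol = 5, byrow = TRUE
)

# Row/column positions of every non-NA cell (column-major order)
idx <- which(!is.na(adj_mat), arr.ind = TRUE)

# Indexing a matrix with a two-column index matrix picks out those cells
el <- cbind(idx, adj_mat[idx])
colnames(el) <- c("HH1", "HH2", "HHKinRank")
head(el)
```

Here HH1 is the row index and HH2 is the column index, matching the order used in the steps above.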

5) Wrap it all up into a function:

as_edgelist.adj_mat <- function(x, .missing = NA) {
  # If there are row/column names or non-numeric data, you'll need to use a
  # data frame instead to handle the heterogeneous data types.
  stopifnot(is.numeric(x), is.null(dimnames(x)))
  melted <- do.call(cbind, lapply(list(row(x), col(x), x), as.vector))
  # Always drop NA cells; if .missing is given, also drop cells equal to it.
  # (Comparing NA with != yields NA, which would otherwise create all-NA rows.)
  keep <- !is.na(melted[, 3])
  if (!is.na(.missing)) {
    keep <- keep & melted[, 3] != .missing
  }
  melted[keep, , drop = FALSE]
}

6) Take it for a spin:

as_edgelist.adj_mat(adj_mat)
#>       [,1] [,2] [,3]
#>  [1,]    2    1    4
#>  [2,]    3    1  100
#>  [3,]    4    1  100
#>  [4,]    5    1    1
#>  [5,]    1    2  100
#>  [6,]    3    2    3
#>  [7,]    4    2    1
#>  [8,]    5    2  100
#>  [9,]    1    3    2
#> [10,]    2    3  100
#> [11,]    4    3    5
#> [12,]    5    3    4
#> [13,]    1    4    1
#> [14,]    2    4  100
#> [15,]    3    4    2
#> [16,]    5    4  100
#> [17,]    1    5  100
#> [18,]    2    5    3
#> [19,]    3    5    4
#> [20,]    4    5  100
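
The function above deliberately rejects matrices with row/column names. If the real csv has household IDs in its first row and first column, something like the following sketch might work instead. The IDs (`HH_A` etc.), the temporary files, and the names `kin` and `kin_el` are all hypothetical stand-ins, and the sketch assumes the file layout described in the question.

```r
# Hypothetical household IDs; the real file would supply its own
ids <- c("HH_A", "HH_B", "HH_C")
m <- matrix(c(NA, 100, 2,
              4,  NA,  100,
              100, 3,  NA),
            nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

# Stand-in for "HouseholdKinRank.csv": IDs in the first row and first column
in_file <- tempfile(fileext = ".csv")
write.csv(m, in_file)

# row.names = 1 turns the first column into row names;
# check.names = FALSE keeps the header IDs exactly as written
kin <- as.matrix(read.csv(in_file, row.names = 1, check.names = FALSE))

# as.table() + as.data.frame() gives one row per cell, labelled by the IDs
kin_el <- as.data.frame(as.table(kin), stringsAsFactors = FALSE)
names(kin_el) <- c("HH1", "HH2", "HHKinRank")

# Drop the NA self-ties so only numerical values remain
kin_el <- kin_el[!is.na(kin_el$HHKinRank), ]

# Save the edge list without row names
out_file <- tempfile(fileext = ".csv")
write.csv(kin_el, out_file, row.names = FALSE)
kin_el
```

HH1 comes from the row labels and HH2 from the column labels, so the direction of each tie matches the matrix layout.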