Update dataframe B with values from dataframe A in R

Question

I am doing social network analysis and working with two data frames. Data frame A ("nodes") holds the information for each node of the network (i.e. id and name). Data frame B ("links") has two columns, "from" and "to", which show how the nodes are connected: each row represents a link from one node to another.

I want to use the networkD3 package to visualize the network, but it has some requirements: ids must start from zero and be consecutive (0, 1, 2, etc.). Because my nodes and links are a random subset of a larger database, the ids are not consecutive. I sorted the "nodes" data frame by id and created a new column (new_id) of consecutive numbers starting from zero, but I don't know how to update the "links" data frame to use the new ids. Currently, I convert the values in the "links" data frame to characters and then revalue them using the plyr package, but I need to do this for a larger dataset. Here is a sample of the two data frames I have now:

set.seed(10)
nodes_df <- data.frame(id = c(1,3,5,6,8,10), 
     name = c("Agriculture", "Agriculture_in_Mesoamerica", "Agriculture_in_ancient_Greece",
     "Agriculture_in_ancient_Rome", "Agriculture_in_India", "Agriculture_in_China"), 
     new_id = seq(0,5))

links_df <- data.frame(from = c(3,3,5,6,8,10), 
           to = c(1,5,6,8,10,3))

In summary, I need to update the values in the links_df to correspond to the new_id values from the nodes_df.

Thank you so much in advance. I hope I was clear enough. Best regards,

This looks like either merge or just nodes_df$new_id[ match(links_df$from, nodes_df$id) ]. - r2evans

Chriss Paul · Accepted Answer · 2021-03-26T21:22:03

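In base R you can look up each id with match(), which keeps the rows of links_df in their original order. Extracting a column from the result of merge() does not, because merge() sorts its output by the key, so the values end up assigned to the wrong rows.

# Look up the new id for every "to" and "from" value, row by row
links_df$new_to <- nodes_df$new_id[match(links_df$to, nodes_df$id)]
links_df$new_from <- nodes_df$new_id[match(links_df$from, nodes_df$id)]
links_df <- links_df[, c("from", "to", "new_from", "new_to")] # Reordering columns
links_df
  from to new_from new_to
1    3  1        1      0
2    3  5        1      2
3    5  6        2      3
4    6  8        3      4
5    8 10        4      5
6   10  3        5      1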
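
As a minimal sketch of the same remapping, the zero-based ids can also be derived directly from row positions with match(), with a check for links that point at ids missing from the nodes table. The helper name remap_ids and the result name links_new are hypothetical, introduced only for illustration. The sketch assumes the node ids are unique.

```r
# Sample data from the question
nodes_df <- data.frame(
  id = c(1, 3, 5, 6, 8, 10),
  name = c("Agriculture", "Agriculture_in_Mesoamerica",
           "Agriculture_in_ancient_Greece", "Agriculture_in_ancient_Rome",
           "Agriculture_in_India", "Agriculture_in_China")
)
links_df <- data.frame(from = c(3, 3, 5, 6, 8, 10),
                       to   = c(1, 5, 6, 8, 10, 3))

# Hypothetical helper: map original ids to zero-based positions in `ids`
remap_ids <- function(x, ids) {
  idx <- match(x, ids)  # 1-based position of each x in ids, NA if absent
  if (anyNA(idx)) {
    stop("some link endpoints are not present in the node ids")
  }
  idx - 1L              # shift to zero-based
}

# Sort nodes first so that row position and new id agree
nodes_df <- nodes_df[order(nodes_df$id), ]
nodes_df$new_id <- seq_len(nrow(nodes_df)) - 1L

links_new <- data.frame(
  from = remap_ids(links_df$from, nodes_df$id),
  to   = remap_ids(links_df$to,   nodes_df$id)
)
print(links_new)
```

Stopping on an unmatched id is a design choice for this sketch. With a random subset of a larger database, a link whose endpoint was not sampled would otherwise become NA without any warning.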
