python - Load nodes with attributes and edges from DataFrame to NetworkX

Question

I am new to using Python to work with graphs via NetworkX. Until now I have used Gephi, where the standard steps (though not the only possible ones) are:

Load the node information from a table/spreadsheet; one of the columns should be the ID and the rest are metadata about the nodes (the nodes are people, so gender, groups, and so on, normally used for coloring). For example:
```
id;NormalizedName;Gender
per1;Jesús;male
per2;Abraham;male
per3;Isaac;male
per4;Jacob;male
per5;Judá;male
per6;Tamar;female
...
```
Then load the edges, also from a table/spreadsheet, using the same node names as in the ID column of the node spreadsheet, normally with four columns (Target, Source, Weight and Type):
```
Target;Source;Weight;Type
per1;per2;3;Undirected
per3;per4;2;Undirected
...
```

These are the two DataFrames that I have and want to load in Python. From reading about NetworkX, it does not seem straightforward to load two tables (one for nodes, one for edges) into the same graph, and I am not sure what the best approach would be:

Should I create a graph with only the node information from one DataFrame, and then add (append) the edges from the other DataFrame? If so, since nx.from_pandas_dataframe() expects information about the edges, I guess I shouldn't use it to create the nodes. Should I just pass the information as lists?
Should I create a graph with only the edge information from one DataFrame, and then add the information from the other DataFrame to each node as attributes? Is there a better way of doing that than iterating over the DataFrame and the nodes?

harryscholes · Accepted Answer · 2017-03-02T14:50:35

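Create the weighted graph from the edge table using nx.from_pandas_dataframe:

```
import networkx as nx
import pandas as pd

# Edge table: one row per edge, with an optional weight column
edges = pd.DataFrame({'source': [0, 1],
                      'target': [1, 2],
                      'weight': [100, 50]})

# Node table: one row per node, keyed by the 'node' column
nodes = pd.DataFrame({'node': [0, 1, 2],
                      'name': ['Foo', 'Bar', 'Baz'],
                      'gender': ['M', 'F', 'M']})

# NetworkX 1.x API; on 2.0 and later the equivalent call is
# nx.from_pandas_edgelist(edges, 'source', 'target', 'weight')
G = nx.from_pandas_dataframe(edges, 'source', 'target', 'weight')
```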
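For an edge table shaped like the one in the question, a minimal sketch on NetworkX 2.0 or later might look like the following. It assumes the table is semicolon-separated text and uses `nx.from_pandas_edgelist` in place of `nx.from_pandas_dataframe`; the names `edges_csv`, `edge_table` and `graph` are illustrative and not part of the original answer.

```python
import io

import networkx as nx
import pandas as pd

# Edge table in the question's layout (semicolon-separated).
# io.StringIO stands in for a real file path here.
edges_csv = io.StringIO(
    "Target;Source;Weight;Type\n"
    "per1;per2;3;Undirected\n"
    "per3;per4;2;Undirected\n"
)
edge_table = pd.read_csv(edges_csv, sep=";")

# edge_attr lists the columns to copy onto each edge as attributes.
graph = nx.from_pandas_edgelist(
    edge_table, source="Source", target="Target", edge_attr=["Weight", "Type"]
)

print(graph.edges(data=True))
```

Only nodes that occur in the edge table are created this way, so a node that has no edges would have to be added separately.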

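Then add the node attributes from dictionaries using set_node_attributes:

```
# Build {node_id: value} dictionaries keyed by the 'node' column.
# set_index avoids relying on the DataFrame's row index matching the node IDs.
# Argument order here is the NetworkX 1.x one: (G, name, values)
nx.set_node_attributes(G, 'name', nodes.set_index('node')['name'].to_dict())
nx.set_node_attributes(G, 'gender', nodes.set_index('node')['gender'].to_dict())
```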

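Or iterate over the graph to add the node attributes:

```
# G.node is the NetworkX 1.x accessor.
# nodes.name[i] looks up by DataFrame index label, which matches the
# node ID only because this example uses the default 0, 1, 2 index.
for i in sorted(G.nodes()):
    G.node[i]['name'] = nodes.name[i]
    G.node[i]['gender'] = nodes.gender[i]
```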

Update:

As of nx 2.0 the argument order of nx.set_node_attributes has changed: (G, values, name=None)

Using the example from above:

```
# NetworkX 2.0+ argument order: (G, values, name)
nx.set_node_attributes(G, nodes.set_index('node')['gender'].to_dict(), 'gender')
```
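With the 2.0 signature, all attribute columns can also be set in one call by passing a dictionary of dictionaries and leaving `name` unset. The sketch below rebuilds the example tables so it runs on its own and assumes NetworkX 2.0 or later; the variable `attrs` is an illustrative name.

```python
import networkx as nx
import pandas as pd

edges = pd.DataFrame({"source": [0, 1], "target": [1, 2], "weight": [100, 50]})
nodes = pd.DataFrame({"node": [0, 1, 2],
                      "name": ["Foo", "Bar", "Baz"],
                      "gender": ["M", "F", "M"]})

G = nx.from_pandas_edgelist(edges, "source", "target", edge_attr="weight")

# One inner dict per node, keyed by node ID:
# {0: {'name': 'Foo', 'gender': 'M'}, 1: {...}, 2: {...}}
attrs = nodes.set_index("node").to_dict(orient="index")

# With a dict of dicts and no name argument, each inner dict is
# merged into the attributes of the matching node.
nx.set_node_attributes(G, attrs)

print(G.nodes(data=True))
```

This avoids one `set_node_attributes` call per column, which may be convenient when the node table has many metadata columns.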

And as of nx 2.4, G.node[] is replaced by G.nodes[].
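The first option from the question, creating the nodes first and then adding the edges, can be sketched as follows. This is not from the original answer; it assumes NetworkX 2.0 or later, and `node_table`, `edge_table` and `graph` are illustrative names holding the question's sample rows. One reason to prefer this order is that nodes without any edge (here per5 and per6) still end up in the graph.

```python
import networkx as nx
import pandas as pd

node_table = pd.DataFrame({
    "id": ["per1", "per2", "per3", "per4", "per5", "per6"],
    "NormalizedName": ["Jesús", "Abraham", "Isaac", "Jacob", "Judá", "Tamar"],
    "Gender": ["male", "male", "male", "male", "male", "female"],
})
edge_table = pd.DataFrame({
    "Target": ["per1", "per3"],
    "Source": ["per2", "per4"],
    "Weight": [3, 2],
})

graph = nx.Graph()

# add_nodes_from accepts (node, attribute_dict) pairs.
graph.add_nodes_from(node_table.set_index("id").to_dict(orient="index").items())

# add_weighted_edges_from accepts (u, v, weight) triples and stores
# the third element under the 'weight' edge attribute.
graph.add_weighted_edges_from(
    edge_table[["Source", "Target", "Weight"]].itertuples(index=False, name=None)
)

print(graph.nodes(data=True))
print(graph.edges(data=True))
```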
