10. Graphing Network Data with Pandas
10.1. Covered in this Chapter
How to Get Network Data from Pandas to NetworkX
How to Graph the Data
How to Customize the Graph
10.2. Getting the Data from Pandas to NetworkX
Pandas on its own cannot plot network data. Instead, we must rely on two other libraries, NetworkX and Matplotlib. NetworkX is the standard Python library for working with networks. I have a forthcoming textbook, like this one, that walks users through NetworkX. Matplotlib is one of the standard plotting libraries. The purpose of this brief notebook is to provide the code necessary for making Pandas work with NetworkX and Matplotlib, so that the relationships stored in a Pandas DataFrame can be turned into graphs.
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
Let’s now load our data and see what it looks like.
df = pd.read_csv("data/network.csv")
df
| | source | target |
|---|---|---|
| 0 | Tom | Rose |
| 1 | Rose | Rosita |
| 2 | Jerry | Jeff |
| 3 | Jeff | Larry |
| 4 | Carmen | Carmen |
| 5 | Rosita | Rosita |
| 6 | Larry | Carmen |
| 7 | Larry | Jerry |
This is a pretty standard format for networks. We have two columns of data, a source and a target. Imagine drawing a line between two points: the source is where you start drawing the line and the target is where that line ends. This is known as the direction of an edge in network theory and is important for understanding the relationship between nodes, or individual points, in a network graph.
We can use NetworkX’s built-in function from_pandas_edgelist() to load that edge list straight into a graph.
G = nx.from_pandas_edgelist(df, "source", "target")
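As a rough illustration of what from_pandas_edgelist() produces, the sketch below rebuilds the table above inline (rather than reading data/network.csv) and inspects the result. By default the function returns an undirected nx.Graph, so the source/target order is not kept; passing create_using=nx.DiGraph is one way to preserve it. The names G_undirected and G_directed are assumptions made for this example only.

```python
import pandas as pd
import networkx as nx

# Same rows as the table above, typed inline so the example is self-contained
df = pd.DataFrame({
    "source": ["Tom", "Rose", "Jerry", "Jeff", "Carmen", "Rosita", "Larry", "Larry"],
    "target": ["Rose", "Rosita", "Jeff", "Larry", "Carmen", "Rosita", "Carmen", "Jerry"],
})

# Default: an undirected graph, where Tom-Rose and Rose-Tom are the same edge
G_undirected = nx.from_pandas_edgelist(df, "source", "target")

# Hypothetical variant: keep the source -> target direction with a DiGraph
G_directed = nx.from_pandas_edgelist(df, "source", "target", create_using=nx.DiGraph)

print(G_undirected.number_of_nodes(), G_undirected.number_of_edges())  # 7 8
print(nx.number_of_selfloops(G_undirected))  # 2 (Carmen-Carmen and Rosita-Rosita)
print(G_undirected.has_edge("Rose", "Tom"))  # True: direction is ignored
print(G_directed.has_edge("Rose", "Tom"))    # False: only Tom -> Rose exists
```

Note that two rows in the data connect a person to themselves; NetworkX keeps these as self-loops rather than dropping them.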
10.3. Graphing the Data
And with just two more lines of code we can plot that data out.
nx.draw(G)
plt.show()
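When no positions are given, nx.draw() computes a spring layout, which starts from random positions, so the picture can look different on every run. One possible way to get a repeatable picture is to compute the positions once with a fixed seed and pass them in. This is a minimal sketch: the graph is rebuilt inline from the same edges so the block runs on its own, and the seed value 42 is an arbitrary choice.

```python
import networkx as nx
import matplotlib.pyplot as plt

# Same edges as the DataFrame above, typed inline for a self-contained example
edges = [
    ("Tom", "Rose"), ("Rose", "Rosita"), ("Jerry", "Jeff"), ("Jeff", "Larry"),
    ("Carmen", "Carmen"), ("Rosita", "Rosita"), ("Larry", "Carmen"), ("Larry", "Jerry"),
]
G = nx.Graph()
G.add_edges_from(edges)

# Compute node positions once; the same seed gives the same layout each run
pos = nx.spring_layout(G, seed=42)

# Reuse the stored positions when drawing
nx.draw(G, pos=pos)
plt.show()
```

The same pos dictionary can be passed to later nx.draw() calls so that the nodes stay in place while labels and colors change.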
10.4. Customize the Graph
We have a problem with the image above, however: it is difficult to tell which person each node represents. Let’s give them some labels.
nx.draw(G, with_labels=True)
plt.show()
Now that we have labels, we need to make them a bit easier to read. We can do this by changing the font color to “whitesmoke” and setting the background to gray. To achieve this, we first create a fig object so that we can set attributes on the figure itself. Next, we draw the network graph and give it the font_color we want. Finally, we set the figure’s facecolor to gray and plot it.
fig = plt.figure()
nx.draw(G, with_labels=True, font_color="whitesmoke")
fig.set_facecolor('gray')
plt.show()
What if I wanted each node in our network to have an individual color? We can do that too by setting up a color map.
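# Give each node a distinct integer; Matplotlib maps these numbers to colors
val = list(range(len(G.nodes)))
# Store each number on its node as the 'val' attribute (needs a node -> value dict)
nx.set_node_attributes(G, dict(zip(G.nodes, val)), 'val')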
fig = plt.figure()
nx.draw(G, with_labels=True, node_color=val, font_color="whitesmoke")
fig.set_facecolor('gray')
plt.show()
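The numbers passed to node_color do not have to be arbitrary. As one possible variation, the sketch below colors each node by its degree (its number of connections) and names the colormap explicitly through the cmap argument. The choice of degree and of the viridis colormap are assumptions made for illustration, and the graph is rebuilt inline so the block runs on its own.

```python
import networkx as nx
import matplotlib.pyplot as plt

# Same edges as the DataFrame above, typed inline for a self-contained example
edges = [
    ("Tom", "Rose"), ("Rose", "Rosita"), ("Jerry", "Jeff"), ("Jeff", "Larry"),
    ("Carmen", "Carmen"), ("Rosita", "Rosita"), ("Larry", "Carmen"), ("Larry", "Jerry"),
]
G = nx.Graph()
G.add_edges_from(edges)

# One number per node, in the same order as G.nodes.
# A self-loop adds 2 to a node's degree in an undirected graph.
degrees = [G.degree(node) for node in G.nodes]

fig = plt.figure()
nx.draw(
    G,
    with_labels=True,
    node_color=degrees,     # numeric values, mapped to colors by the colormap
    cmap=plt.cm.viridis,    # explicit colormap instead of the default
    font_color="whitesmoke",
)
fig.set_facecolor("gray")
plt.show()
```

With this coloring, nodes with the same number of connections share a color, so the best-connected people stand out from those with a single link.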