Efficient sending graph data structure over WAN

Question

I am looking for an efficient way of sending property graph over WAN. For the sake of latency and bandwidth I should make it as efficient as possible. One solution would be serializing the graph data structure and sending that in a chunk of smaller size over socket. This approach could be implemented in different ways. Probably the most obvious implementation would be sending graph in a serialized List of triple (vertex-edge-vertex) for all edges and serialized vertices for all disconnected vertices. However, it is clear that this solution is not efficient since vertices could be sent multiple times and can use lots of bandwidth.

I know graph data structure can be transferred using adjacency list and adjacency matrix, but I can not figure it out the way of serialization and sending that over WAN in these forms. I would be really grateful if somebody can help me with an efficient solution.

Eugen Hotaj Eugen Hotaj · Accepted Answer · 2016-04-22T15:58:51

If you use an adjacency matrix you still run into the issue of sending over unneeded data since you have to specify n^2 vertex connections (or lack thereof) so if you have a very sparsely connected graph you'll still have to send over at least n^2 bits of information.

One simple solution where you only use n^2 bits + 1 byte of information is to implicitly send the adjacency matrix.

Say you have n vertices. Number them arbitrarily from 1...n. Then create an n x nadjacency matrix A where A(i,j)==1 if an edge exists from vertex i to j and 0 if no such connection exists. As you can see, you will only need n^2 bits (not bytes!) to specify your adjacency matrix.

Then you serialize in this way: At the beginning use 1 byte (32 bits), i.e an int to specify the size of your matrix (n). Then the next n^2 bits should be your adjacency matrix.

To deserialize, read the first byte to figure out the size of your adjacency matrix then read the next n^2 bits for the edge connections.

Note: even though I'm using the term matrix here to make it conceptually easy to think about, the data need not be explicitly structured in matrix fashion but can be stored in a linear fashion. Then to access row i column j you simply do A(i*n+j).

Obviously this solution only send over the graph structure and not any actual data contained in the graph. However, I personally think this is still a good idea to do even if you need to send over actual data contained in the graph since once you have the graph structure, you can serialize the rest of the data however you want and reconstruct it on the other end.

Efficient sending graph data structure over WAN

1 Answers