5
votes

I am planning on representing a fairly large, sparse, undirected graph structure in C++. This will be of the order of 10,000+ vertices, with each vertex having a degree of around 10.

I have read some background on representing graphs as adjaciency matrices or lists, but neighther of them seem suitable for what I want to do. In my scenario:

  • Each edge in the graph will have some properties (numeric values) attached
  • After the initial graph creation, edges can be deleted, but never created
  • Vertices will never be created or deleted
  • The main query operation on the graph will be a lookup, for an edge E, as to which other edges are connected to it. This is equivalent to finding the edges connected to the vertices at each end of E.

It is this final point which makes the adjacency matrix seem unsuitable. As far as I can see, each query would require 2*N operations, where N is the number of nodes in the graph.

I believe that the adjacency list would reduce the required operations, but seems unsuitable because of the parameters I am including with each edge - i.e. because the adjacency list stores each

Is there a better way to store my data such that these query operations will be quicker, and I can store each edge with its parameters? I don't want to start implementing something which isn't the right way to do it.

2
Adjacency list is OK, you can attach the data along with the edge. What's wrong? Adjacency matrix is out of the question, unless you have a lot of memory available (near 100 MB), while there are only 10^5 edges in total. - nhahtdh
Ok, thanks. But how do I deal with the repetition present in the adjacency list? Each edge will be represented twice, once in the row of node i and once in the row of node j. - Bill Cheatham

2 Answers

7
votes

I don't see a problem with the usual object oriented approach, here. Have your Edge<V,E> and Vertex<V,E> types where V is the content stored by vertices and E is the content stored by edges. Each edge has two references to its respective vertices and two indices that say at which slot in the respective vertex to look for this edge:

template <typename V, typename E>
class Edge {
  struct Incidence {
    size_t index;
    Vertex<V,E>& v;
  };
  std::array<Incidence,2> vertices;
  E content;
};

template <typename V, typename E>
class Vertex {
  std::vector<Edge<V,E>*> edges;
};

If you remove an edge e, you go Vertex<V,E>::edges and move the position at the back to the previous position of e. Constant time removal. Linear time (in the size of the operation's result) for enumerating all adjacent edges to a particular edge. Sounds good to me.

1
votes

Does something like this seem plausible? This stores edge adjacency:

#include <vector>
#include <map>

typedef int vertex;

struct edge
{
    vertex a,b;
    // other data
};

bool operator< (const edge &a, const edge &b)
{
    unsigned long long ahash = a.a;
    ahash << 32;
    ahash |= a.b;
    unsigned long long bhash = b.a;
    bhash << 32;
    bhash |= b.b;
    return ahash < bhash;
}

// support for your query operation
typedef std::map<edge, std::vector<edge &> > edge_adjacency;

It seems like you kind of want to map edges to vertices, and then use something pretty standard.