I'm trying to wrap my mind around graph data right now, and I'm finding it difficult to think in terms of property graphs. On the vertex-centric indices docs page, there is an example involving Twitter data. The Gremlin code is:
g = TitanFactory.open(conf)
// graph schema construction
g.makeKey('name').dataType(String.class).indexed(Vertex.class).make()
time = g.makeKey('time').dataType(Long.class).make()
if(useVertexCentricIndices)
    g.makeLabel('tweets').sortKey(time).make()
else
    g.makeLabel('tweets').make()
g.commit()
// graph instance construction
g.addVertex([name:'v1000']);
g.addVertex([name:'v10000']);
g.addVertex([name:'v100000']);
g.addVertex([name:'v1000000']);
for(i=1000; i<1000001; i=i*10) {
    v = g.V('name','v' + i).next();
    (1..i).each {
        v.addEdge('tweets',g.addVertex(),[time:it])
        if(it % 10000 == 0) g.commit()
    }; g.commit()
}
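To check my reading of the loop, here is a minimal sketch of what it appears to build, written in plain Python rather than Gremlin so that it runs without Titan. `ToyGraph`, `Vertex`, `Edge`, and `build` are hypothetical names of my own, not part of Titan's API, and the sketch creates the named vertex directly instead of looking it up by name:

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Vertex:
    id: int
    props: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_v: Vertex   # tail: the vertex addEdge is called on
    in_v: Vertex    # head: the vertex passed to addEdge
    props: dict = field(default_factory=dict)


class ToyGraph:
    """Hypothetical in-memory stand-in for a property graph; not Titan's API."""

    def __init__(self):
        self.vertices = []
        self.edges = []
        self._ids = count(1)

    def add_vertex(self, **props):
        v = Vertex(next(self._ids), props)
        self.vertices.append(v)
        return v

    def add_edge(self, out_v, label, in_v, **props):
        e = Edge(label, out_v, in_v, props)
        self.edges.append(e)
        return e


def build(graph, name, n_edges):
    """Mirror one pass of the outer loop: one named vertex, n_edges 'tweets' edges."""
    user = graph.add_vertex(name=name)
    for t in range(1, n_edges + 1):
        tweet = graph.add_vertex()                  # fresh vertex with no properties
        graph.add_edge(user, "tweets", tweet, time=t)
    return user


g = ToyGraph()
user = build(g, "v1000", 1000)
print(len(g.vertices), len(g.edges))  # 1001 1000
```

If this reading is right, every `tweets` edge has exactly one tail (the named vertex) and one head (a freshly created vertex), and the `time` property lives on the edge.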
The docs explain that each edge represents a user tweeting, with the tweet itself modeled as a vertex. This doesn't make sense to me as a schema. Why should any two of these nodes be connected? If the answer is that an edge connects the different tweets a user has tweeted, then a single edge connects more than two nodes. That would mean Titan is a hypergraph, which I thought it wasn't.
In short, can someone explain this example better than the docs?