1
votes

I have a simple graph model. Each node has a single attribute, {NodeId}, and each edge links two nodes and carries no attributes of its own. It is a directed graph with about 10^6 nodes.

Here is my situation: I first created an index on the {NodeId} attribute, then created 10^6 nodes. At this point I have a graph with 10^6 nodes and no edges. When I try to add edges between random pairs of nodes, it is very slow: I can only add about 40 edges per second.

Did I miss some configuration? This does not seem like a reasonable speed.

The code for adding edges:

public static void addAnEdge(GraphClient client, Node a, Node b)
{
    // Look up both endpoints by Id, then create a directed relationship between them.
    client.Cypher
        .Match("(node1:Node)", "(node2:Node)")
        .Where((Node node1) => node1.Id == a.Id)
        .AndWhere((Node node2) => node2.Id == b.Id)
        .Create("(node1)-[:Edge]->(node2)")
        .ExecuteWithoutResults();
}

Should I add an index on the edges? If so, how do I do that in Neo4jClient? Thanks for your help.


Batching all my queries into one transaction is a good idea. I executed the following statement in the Neo4j browser (http://localhost:7474):

MATCH (user1:Node), (user2:Node)
WHERE user1.Id >= 5000000 and user1.Id <= 5000100 and user2.Id >= 5000000 and user2.Id <= 5000100
CREATE user1-[:Edge]->user2

This statement creates about 10,000 edges in one transaction, so I don't think the HTTP overhead is the main cost any more. The result is:

Created 10201 relationships, statement executed in 322969 ms.

That works out to about 32 edges per second (10201 relationships in roughly 323 seconds).

2
Are you issuing the same request every time? If yes, then you have the HTTP overhead on every edge creation. You should batch your statements in transactions. - Christophe Willemsen
I don't know anything about Neo4jClient, but in general you could try to group queries in transactions: github.com/Readify/Neo4jClient/wiki/Transactions - Martin Preusse
Thanks for your reply. I want to provide an interface that adds a single edge, so I don't want to batch them into one transaction. Is the HTTP overhead really that serious if I use a local Neo4j? @ChristopheWillemsen - IamVeryClever
The HTTP overhead is serious with any HTTP client for Neo4j. - Christophe Willemsen
I tried another way to reduce the HTTP overhead. Could you please look at my modification? Thank you @ChristopheWillemsen - IamVeryClever

2 Answers

3
votes

The ideal solution is to pass the pairs of nodes to be related in a single parameter map and iterate over them with UNWIND, creating one relationship per pair. This performs well as long as you have an index on the Id property of the Node nodes.

I don't know how to do this with Neo4jClient, but here is the Cypher statement:

UNWIND {pairs} as pair
MATCH (a:Node), (b:Node)
WHERE a.Id = pair.start AND b.Id = pair.end
CREATE (a)-[:EDGE]->(b)

The parameters sent along with the query should have this form:

{
  "parameters": {
    "pairs": [
      {
        "start": 1,
        "end": 2
      },
      {
        "start": 3,
        "end": 4
      }
    ]
  }
}

UPDATE

The Neo4jClient author kindly gave me the equivalent code in Neo4jClient:

var parameters = new[]
{
    new { start = 1, end = 2 },
    new { start = 3, end = 4 }
};

client.Cypher
    .Unwind(parameters, "pair")
    .Match("(a:Node),(b:Node)")
    .Where("a.Id = pair.start AND b.Id = pair.end")
    .Create("(a)-[:EDGE]->(b)")
    .ExecuteWithoutResults();
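If the full list of pairs is very large, it may help to send it in several moderately sized UNWIND requests rather than one huge one. The following is only a sketch of that idea using plain .NET collections: `EdgePair`, `EdgeBatcher`, and the batch size are hypothetical names and values chosen for illustration, not part of Neo4jClient.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pair type; the property names mirror the start/end keys used by the query.
public sealed class EdgePair
{
    public long start { get; set; }
    public long end { get; set; }
}

public static class EdgeBatcher
{
    // Lazily splits the pairs into lists of at most batchSize items.
    // Note: as an iterator, the argument check runs when enumeration starts.
    public static IEnumerable<List<EdgePair>> Chunk(IEnumerable<EdgePair> pairs, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<EdgePair>(batchSize);
        foreach (var pair in pairs)
        {
            batch.Add(pair);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<EdgePair>(batchSize);
            }
        }

        // Emit the final, possibly smaller, batch.
        if (batch.Count > 0) yield return batch;
    }
}
```

Each list produced by `Chunk` could then be passed as the first argument of the `Unwind` call in place of the two-element array, one request per chunk. A suitable batch size depends on the server and is best found by measuring.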

0
votes

In your updated Cypher query, you MATCH a cartesian product of all your nodes. That is very slow. Have a look at the EXPLAIN of your query.
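To see how the numbers in the question fit together, here is a small back-of-the-envelope sketch. It assumes exactly one node per Id in the matched range, which the question does not state explicitly; `CartesianEstimate` and its methods are hypothetical helpers written only for this calculation.

```csharp
public static class CartesianEstimate
{
    // Number of integer Ids in the inclusive range [low, high].
    public static long RangeCount(long low, long high) => high - low + 1;

    // Every node on the left is paired with every node on the right.
    public static long PairCount(long low, long high)
    {
        long perSide = RangeCount(low, high);
        return perSide * perSide;
    }

    // Throughput in relationships per second for a given elapsed time in milliseconds.
    public static double PerSecond(long created, long elapsedMs) => created / (elapsedMs / 1000.0);
}
```

With the range from the question, `PairCount(5000000, 5000100)` gives 101 × 101 = 10201, which matches the reported "Created 10201 relationships", and `PerSecond(10201, 322969)` comes out at a little under 32 relationships per second.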

See also this question for an explanation of how to deal with cartesian products: Why does neo4j warn: "This query builds a cartesian product between disconnected patterns"?

Do you have an index on the Id property? Ideally, you should use a uniqueness constraint. This automatically adds a very fast index.
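As a rough illustration, the Cypher statement for such a constraint can be built as below. The statement syntax changed between Neo4j releases, so both forms are shown; which one applies depends on your server version, and `ConstraintStatements` is a hypothetical helper, not a Neo4jClient API. The resulting string can be pasted into the Neo4j browser.

```csharp
public static class ConstraintStatements
{
    // Older Cypher syntax (ON ... ASSERT), used by earlier Neo4j releases.
    public static string OlderSyntax(string label, string property) =>
        $"CREATE CONSTRAINT ON (n:{label}) ASSERT n.{property} IS UNIQUE";

    // Newer Cypher syntax (FOR ... REQUIRE), used by recent Neo4j releases.
    public static string NewerSyntax(string label, string property) =>
        $"CREATE CONSTRAINT FOR (n:{label}) REQUIRE n.{property} IS UNIQUE";
}
```

For the graph in the question, `ConstraintStatements.OlderSyntax("Node", "Id")` yields `CREATE CONSTRAINT ON (n:Node) ASSERT n.Id IS UNIQUE`. Creating the constraint will fail if duplicate Id values already exist, so it is easiest to add it before loading the nodes.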

In your query, try to first MATCH the first nodes, use WITH to collect them in a list and then MATCH the second batch of nodes:

MATCH (user1:Node)
WHERE user1.Id >= 50000 AND user1.Id <= 50100
WITH collect(user1) AS list1
MATCH (user2:Node)
WHERE user2.Id >= 50000 AND user2.Id <= 50100
UNWIND list1 AS user1
CREATE (user1)-[:EDGE]->(user2)