2
votes

The same Cypher query takes very different amounts of time depending on where it is executed:

Executed via spring-data-neo4j: (took 8 seconds)

@Query(
"MATCH (user:User {uid:{0}})-[:FRIEND]-(friend:User) " +
"RETURN friend"
)
public List<User> getFriends(String userId);

Executed via http://localhost:7474/browser/ : (took 250 ms)

Executed via http://localhost:7474/webadmin/#/console/ : (took 18 ms)

Queries executed via either console are fast and well within an acceptable range, but in production I have to execute them from my Java app, where the time they take is totally unacceptable.

Edit:

@NodeEntity
public class User extends AbstractEntity {

    @RelatedToVia(elementClass = Friendship.class, type = FRIEND, direction = BOTH)
    private Set<Friendship>     friendships;

    ...
}
1
What if you repeat the query? Possibly the first query is slow (250 ms) and then from the console it is fast? What if you run the query from the console first? - František Hartman
How did you model the entities? Are friends included in users as a set annotated with @Fetch? In that case the reason for the long time taken from SDN could be the loading of objects into memory. Also, advanced mapping is faster than the simple one. - remigio
@remigio is right, especially if you use the REST database, which is still very inefficient for these kinds of queries - František Hartman
@frant.hartm I executed the same query alternately several times and experienced the same behaviour every time - Yatendra
Have a look at this link for an explanation of the difference between simple and advanced mapping. It could make the difference. Moreover, @frant.hartm made a good point: are you using the REST server or the embedded version of Neo4j? - remigio

1 Answer

-1
votes

When benchmarking, make sure that you run each test at least 3 times and plan to drop the first run. I find that it takes one or two runs to warm the Neo4j cache for any given query. This is different from most RDBMSs, which don't cache the same way.

My practice is to run test queries 5 times and drop the first 2. This works pretty consistently for me regardless of the size of the data set (I test with tens of thousands and hundreds of thousands of nodes) and the complexity of the script (I have some Cypher statements that run to 50 or more lines, with multiple WITHs, etc.). What I've found is that after the first two runs, the time taken by the same query stays roughly the same from one run to the next.
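As a rough illustration of the "run 5 times, drop the first 2" practice, here is a minimal Java sketch. The class and method names (QueryBenchmark, timeRuns, averageMillis) are made up for this example and are not part of spring-data-neo4j or Neo4j; the workload in main is only a stand-in for a real repository call such as getFriends.

```java
import java.util.Arrays;

final class QueryBenchmark {

    /**
     * Runs the task totalRuns times and returns the elapsed time in
     * nanoseconds of every run after the first warmupRuns runs.
     */
    static long[] timeRuns(Runnable task, int totalRuns, int warmupRuns) {
        if (warmupRuns < 0 || totalRuns <= warmupRuns) {
            throw new IllegalArgumentException("need 0 <= warmupRuns < totalRuns");
        }
        long[] kept = new long[totalRuns - warmupRuns];
        for (int run = 0; run < totalRuns; run++) {
            long start = System.nanoTime();
            task.run();
            long elapsed = System.nanoTime() - start;
            // Warm-up runs are executed but their timings are discarded.
            if (run >= warmupRuns) {
                kept[run - warmupRuns] = elapsed;
            }
        }
        return kept;
    }

    /** Average of the given nanosecond timings, converted to milliseconds. */
    static double averageMillis(long[] nanos) {
        return Arrays.stream(nanos).average().orElse(0.0) / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption). In a real test this would be
        // something like () -> userRepository.getFriends("some-uid"),
        // where userRepository is a hypothetical injected repository.
        Runnable query = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                System.out.println(sum); // keeps the loop from being optimized away
            }
        };

        long[] kept = timeRuns(query, 5, 2); // 5 runs, first 2 dropped
        System.out.println("kept timings (ns): " + Arrays.toString(kept));
        System.out.println("average (ms): " + averageMillis(kept));
    }
}
```

Timing the call from inside the Java app this way measures the whole round trip (query plus object mapping), which is the number that matters for the question above.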

So make sure that in production you've warmed your cache for your common queries. And make sure that you've got sufficient memory available to the JVM and Neo4j. Most data sets are smaller than a few GB and so can fit in memory, if Neo4j has access to enough memory.

Finally, be sure you have your indexes in place (e.g. CREATE INDEX ON :User(uid)). With most of the graph and its properties loaded in memory, and the right indexes in place, Neo4j should really perform.