This is my first project using Neo4j and the associated Spatial plugin. I am seeing performance well below what I expected and below what this project needs. As a noob I may be missing or misunderstanding something. Help is appreciated and needed.
I am seeing very slow response times from Neo4j and the Spatial plugin when finding the OSM ways surrounding a point given by lat/lon, which I need in order to process GPS readings from a driven trip. I am calling spatial.closest("layer", {lon, lat}, 0.01), which takes 6-11 seconds to process and returns approximately 25-100 nodes.
I am running Neo4j Community Edition 3.0.4 with Spatial 0.20 on a MacBook Pro (16GB RAM / 512GB SSD). The OSM data is massachusetts-latest.osm (Massachusetts, USA). I am accessing it via Bolt and Cypher. Instrumented testing has been done from the browser client, a Python client and a Java client, as well as with a custom version of Spatial that reports timing for the spatial stored procedure. The Neo4j database is approximately 44GB in size and contains 76.5M nodes and 118.2M relationships. The schema and data are 'as-is' from the OSMImport.
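For anyone who wants to reproduce the client-side numbers, here is a minimal sketch of how the call could be timed from Python. It is not the exact harness used above. It assumes `session` is a session from the official Neo4j Python driver (anything with a compatible `run()` method works), and the Cypher text, the `node` YIELD column and the `lon`/`lat` map keys are assumptions about the Spatial plugin. Older servers may need the `{param}` placeholder syntax instead of `$param`.

```python
import statistics
import time

# Assumed Cypher text; the YIELD column name and the lon/lat map keys are
# assumptions about the spatial plugin, and older servers may need the
# {param} placeholder syntax instead of $param.
CLOSEST_QUERY = (
    "CALL spatial.closest($layer, {lon: $lon, lat: $lat}, $distance) "
    "YIELD node RETURN count(node) AS n"
)


def time_closest(session, layer, lon, lat, distance_km, repeats=5):
    """Run the query `repeats` times and return (row_counts, seconds) lists."""
    counts, seconds = [], []
    for _ in range(repeats):
        start = time.perf_counter()
        result = session.run(
            CLOSEST_QUERY, layer=layer, lon=lon, lat=lat, distance=distance_km
        )
        # Consume the result fully so the timing includes streaming the rows.
        rows = list(result)
        seconds.append(time.perf_counter() - start)
        counts.append(rows[0]["n"] if rows else 0)
    return counts, seconds


def summarize(seconds):
    """Return min, median and max of the measured wall-clock times."""
    return min(seconds), statistics.median(seconds), max(seconds)
```

Repeating the call several times helps separate a cold first run from warm-cache runs, which matters when comparing against the 6-11 second figure.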
To isolate the performance problem I added a custom version of spatial.closest() named spatial.timedClosest(). The timedClosest() stored procedure takes the same input and makes the same calls as spatial.closest(), but returns a Stream of timing results instead of the Stream of nodes that spatial.closest() returns. The returned Stream carries the timing information for the stored procedure.
The stored procedure execution time is split evenly between the internal call to getLayerOrThrow( ) and SpatialTopologyUtils.findClosestEdges( ).
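To illustrate the kind of per-phase breakdown described above, here is a small Python sketch of a phase timer. The actual instrumentation lives in the Java stored procedure and is not shown here. `PhaseTimer` and the phase names are hypothetical and only show how time can be attributed to each internal call.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock seconds per named phase (hypothetical helper)."""

    def __init__(self):
        self.totals = {}

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the timed block raises.
            elapsed = time.perf_counter() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed

    def shares(self):
        """Return each phase's fraction of the total measured time."""
        total = sum(self.totals.values())
        if total == 0:
            return {name: 0.0 for name in self.totals}
        return {name: t / total for name, t in self.totals.items()}


if __name__ == "__main__":
    timer = PhaseTimer()
    with timer.phase("getLayerOrThrow"):
        time.sleep(0.01)  # stand-in for the layer lookup
    with timer.phase("findClosestEdges"):
        time.sleep(0.01)  # stand-in for the proximity search
    print(timer.shares())
```

An even split would show up as shares close to 0.5 for each phase.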
1) Why does getLayer(layerName) take so long to execute? I am very surprised that it takes 2.5-5 seconds. There is only one layer, the OSM layer, directly off the root node. I see the same hit on calls to spatial.getLayer(). Since the layer is an argument to many of the spatial procedures, this is a big deal. Does anyone have insight into this?
2) Is there a way to speed up SpatialTopologyUtils.findClosestEdges()? Are there additional indexes that could be added to speed up the spatial proximity search?
My understanding is that Neo4j is capable of handling billions of nodes and relationships. For this project I am planning to load the North America OSM data. As I understand it, the Spatial plugin has spatial management and search capabilities that would provide a good starting foundation.