Here's the rundown of what I need:
- A graph database
- Each node is a document. There will be several hundred node types, each with its own consistent schema.
- Can scale to billions of nodes
- Each node also has a (lat,lng) coordinate in addition to the edges between nodes
- I want to use (lat,lng) as a shard key so this can be scaled out to a large sharded, replicated cluster. About 95% of edge traversals will stay between nodes at nearby (lat,lng) locations.
- I want to be able to issue geo+document queries. For example, "Show me all the graph nodes/documents matching this query { ... } ordered by distance from (lat_0, lng_0)"
- I want something that's well-documented, has an active developer community, is recommended for production use, and is likely to be around for years.
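To make the shard-key and geo+document points concrete, here is a rough Python sketch of the semantics I have in mind. It is not any particular database's API: the names `shard_cell` and `geo_query`, the 1-degree grid cell, and the dict-based node layout are all made up for illustration, and a real system would use a proper spatial index instead of scanning a list.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; good enough for ordering results


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two (lat, lng) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def shard_cell(lat, lng, cell_deg=1.0):
    """Hypothetical shard key: the grid cell containing (lat, lng).

    Nearby nodes fall into the same or adjacent cells, so most edge
    traversals would stay on one shard. The cell size is an assumption.
    """
    return (math.floor((lat + 90.0) / cell_deg), math.floor((lng + 180.0) / cell_deg))


def geo_query(nodes, predicate, lat0, lng0):
    """Return nodes matching `predicate`, ordered by distance from (lat0, lng0).

    This is a linear scan that only illustrates the desired result,
    not how a database should execute it.
    """
    matches = [n for n in nodes if predicate(n)]
    return sorted(matches, key=lambda n: haversine_km(lat0, lng0, n["lat"], n["lng"]))


# Illustrative data only.
nodes = [
    {"id": "a", "type": "cafe", "lat": 0.0, "lng": 1.0},
    {"id": "b", "type": "cafe", "lat": 0.0, "lng": 0.5},
    {"id": "c", "type": "bar", "lat": 0.0, "lng": 0.1},
]
nearest_cafes = geo_query(nodes, lambda n: n["type"] == "cafe", 0.0, 0.0)
```

Here `nearest_cafes` contains only the cafe documents, nearest first; the document filter and the distance ordering are applied in the same query, which is the combination I need the database to support natively.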
Here are the problems I see with existing databases:
- MongoDB: no graph support, no joins
- Neo4j: no sharding
- OrientDB: no geospatial indexing
- ArangoDB: supports WITHIN queries but cannot combine them with additional query clauses (by contrast, MongoDB's geoNear accepts a query parameter)
Is there anything that fits my use case?