8
votes

This is a best-practice question. I understand there are many ways to do this, but I would like your opinions on how you would approach the problem. Please assume performance is critical, in other words that the system must scale.

I recently discovered graph databases, so I came up with a theoretical scenario. A company wants to manage its customer relationships and decides to use Neo4j, which works well for managing customers, staff members, and the relationships between them. The company now wants a web-based interface that requires authentication. Anyone in the Neo4j database should be able to log in to see how they are related to other people in the company's database, so each user must have a password, email, and ID associated with their name.

So my question is: in this scenario, is it better to store the password_hash/password_salt/id/email in a MySQL database and look them up there based on the node, or to store them as properties (the key/value tables) on the nodes themselves?

Also, each store has thousands of products. I could store them in the graph database, or I could keep them in MySQL and look them up and update them there. Since the products are not related to each other, there seems little point in putting them in the graph database. Should they be kept out of it to improve performance?

So my question boils down to this: for large projects, is it best to use a graph database alongside a more common RDBMS such as MySQL? If not, at what point do you start using the two systems together?

Apologies in advance for my lack of knowledge of database terminology.

3 Answers

13
votes

A graph DB is mainly used for maintaining relationships. Having a graph DB in your app does not mean the app needs to store everything in it.

Every node request on the graph is served from memory, so unnecessary properties bloat the nodes, which may make things slower and use more memory. I usually decide what goes in the graph and what goes in the relational DB by a very simple rule.

High-level properties (those that define the relationships, plus other important properties that define the node) go in the graph, whereas additional information goes in the RDBMS.

For example, on Facebook the FBID and name might go in the graph, since they define how one node relates to another. But when a user clicks on someone's Facebook ID, they get to see that user's DOB, age, and college. All of these can go in the RDBMS.
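To make the split concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: plain dicts stand in for the graph store (this is not Neo4j's API), an in-memory SQLite database stands in for the RDBMS, and the `profile` table, the helper names, and the sample rows are made up. The only point is that the two stores share a user ID, traversal touches just the lightweight properties, and the extra details are fetched only on demand.

```python
import sqlite3

# Graph side (stand-in dicts, not a real graph DB): only the identifying
# property and the relationships live here.
graph_nodes = {
    1: {"name": "Alice"},
    2: {"name": "Bob"},
}
graph_edges = {1: {2}, 2: {1}}  # adjacency sets: user id -> friend ids

# Relational side (SQLite standing in for the RDBMS): details that are only
# needed when someone opens a profile. Table and rows are hypothetical.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE profile (user_id INTEGER PRIMARY KEY, dob TEXT, college TEXT)"
)
db.executemany(
    "INSERT INTO profile VALUES (?, ?, ?)",
    [(1, "1990-04-12", "Example College"), (2, "1988-11-03", "Sample University")],
)


def friends_of(user_id):
    """Traverse relationships using only the lightweight graph properties."""
    return sorted(graph_nodes[f]["name"] for f in graph_edges.get(user_id, ()))


def profile_of(user_id):
    """Fetch the extra details from the relational store via the shared id."""
    row = db.execute(
        "SELECT dob, college FROM profile WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None
    return {"name": graph_nodes[user_id]["name"], "dob": row[0], "college": row[1]}
```

Listing friends never touches the `profile` table, and `profile_of` runs only when a profile is actually opened. The cost is that the shared ID has to be kept consistent across both stores.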

PS: An RDBMS has another advantage: it can be used for quick analytics. I know you can do that with a graph too, but I am not sure it is as scalable and easy as with an RDBMS.

The downside to this approach: you need to maintain two DBs.

3
votes

Unless you have a proven case for a two-DB solution, I'd say fewer moving parts would keep you more agile and more able to change things quickly. If you later find a use case that is difficult, then weigh up the cost/benefit of introducing a second data store. A two-DB architecture is not unheard of, but it comes with an overhead.

Specific to security, there is no reason why Neo4j or any other reasonable NoSQL solution couldn't do that: http://spring.neo4j.org/docs#tutorial_security
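As a rough illustration of keeping credentials as ordinary node properties, here is a sketch in Python using only the standard library. It is not taken from the linked tutorial. A plain dict stands in for a node's property map, the property names follow the question (`password_hash`, `password_salt`), and the helper names, the iteration count, and the sample user are assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your own hardware


def make_credentials(password):
    """Return salt and hash as hex strings, suitable for string properties."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return {"password_salt": salt.hex(), "password_hash": digest.hex()}


def verify(node, password):
    """Recompute the hash from the stored salt and compare in constant time."""
    salt = bytes.fromhex(node["password_salt"])
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest.hex(), node["password_hash"])


# A dict standing in for one node's properties (hypothetical sample user).
node = {"id": 42, "email": "alice@example.com", **make_credentials("s3cret")}
```

At login you would look the node up by email and call `verify(node, submitted_password)`. Since the salt and hash are plain strings, nothing here requires a separate relational store.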

1
votes

You should use both when some of the data does not make much sense to store in a graph DB such as Neo4j or OrientDB (and other data is better off in a graph DB than in a relational DB). Forcing all data onto one platform may cause performance and scalability issues down the line.