3
votes

Is it bad design / difficult to implement two or more databases for an application?

For instance, let's say I have User objects which I would like to store into a relational database. These 'User' objects have relationships with one another and have user feeds (think of Twitter / Facebook) and I want to store these relationships to be able to find friends of a friend, to see how "deep" I am into a chain of feeds, etc. These relationships would be stored in the graph database.

Is there any better way to go about this or would using a graph database for relationships and a relational database for data storage be the best solution?

4
I wanna to implement the same approach in my project - store all data in RDBMS and all relationships in neo4j. If u already implemented this - is it a good idea or not ??? If select (without any relationship) some huge data from RDBMS, it's faster than with graph DB??? Any other pros and cons? - Тарас Красниця

4 Answers

4
votes

Disclaimer: I haven't used the enterprise version of neo4j yet, so it may have capabilities to help you here that I am not aware of.

If you can keep everything in neo4j, that's best as it keeps complexity low on several fronts, including data modeling, keeping data in sync, and keeping queries easy and atomic instead of splitting them between separate databases.

It would help to know what your requirements are for using RDBMS, and if those requirements justify the complexity introduced from above.

If you are determined to do this, then you've got a choice between high data redundancy, or going with a more skeletal neo4j db which only keeps IDs, relationships, and minimal data.

With high data redundancy, mirroring most if not all data in neo4j, then you've got the added complexity of keeping everything in sync and consistent between your dbs (not trivial at all). This buys you richer queries through neo4j since most data is all in the same db, and greater ability to cut off from your rdbms and go with just neo4j in the future. But any query that alters both dbs will not be automatically atomic...you'll have to do some kind of enforcement in your server side code for this, and that's likely to be tricky.

With a skeletal approach, most of your neo4j queries will have ID inputs and ID outputs. Rich data may not be available, so then you'll take those IDs and do your select on your rdbms on those ids for the data you need. Any queries involving relationships and expecting data back will require usage of both dbs, which can be troublesome for developers, though probably fine on your server-side code. You'll be avoiding issues of synchronization and probably atomicity as well, since the common data between the two dbs will be minimal.

It's worth noting that there are some solutions for integrating with other databases in a similar pattern to what you've proposed.

GraphAware has a concept of graph-aided search which offers integration between ElasticSearch and neo4j, though this is primarily to address requirements for rich searchability. This can be used to either feed ElasticSearch for use as a pure search engine, or it can let you store all your data in ES and boost or affect results based upon relational data in neo4j.

1
votes

A little bit late for the party. Generally it is not recommended to use two databases in one project at the same time. Just like @InverseFalcon pointed out in the answer above, there will be plenty of problems with such an architecture.

To my understanding, you can use a distributed graph database, which can not only handle the relationships perfectly well, but also solve the storage problem of large amounts of data.

1
votes

I am going to disagree with all of those who say that you shouldn't use two different kinds of databases in one project. I've been building systems for 30+ years and in all of the enterprise systems that I've had an opportunity to work on, we always used the best tools for each part of the project. It's common to combine object/graph databases with relational databases and document databases into the same project. We will support high-speed link analysis in the object/graph database and then hop out to the relational database for metadata and then into the document database for the original source information. Or you can reverse the entire process and start with a document search, then do the high-speed link analysis.

In many cases, it's a matter of being forced to use exabytes of legacy relational data, but needing to add new capabilities such has high-speed link analysis. Instead of attempting to port all of the relational data over to the object/graph database, we just move over the information we need.

Don't force a square peg into a round hole. Use the best tool for each part of the project.

0
votes

If you are looking to combine the relational and graph data model. You can use SQL Server 2017+ versions, where graph capabilities are added. The graph relationships are integrated into Transact-SQL and receive the benefits of using SQL Server as the foundational database management system.

As mentioned in the documentation:

Graph extensions are fully integrated in SQL Server engine. Use the same storage engine, metadata, query processor, etc. to store and query graph data. Query across graph and relational data in a single query. Combining graph capabilities with other SQL Server technologies like columnstore, HA, R services, etc. SQL graph database also supports all the security and compliance features available with SQL Server.

References