4
votes

I am working hierarchical data, as in the tree structure. i want to know what is the best way to store them in database.

I started with adjacency list, in MySQL. But the performance seems to dip as the data is increasing. I have around 20,000 rows stored in a MySQL table with parent child relationship and will increase in future. Fetching data is taking very long time as I have to write many self joins depending upon the depth of the tree.

So I was searching for best way to store this kind of data. In once place I found Nested Sets is better way than adjacency lists. Then I was advised to look upon NoSQL, if that would solve my problem. So I am confused now whether to remain in SQL or go into No SQL or if there is any other best way to handle this kind of data.

So can anyone suggest me what is the best way??

2
How do you intend to manipulate your data ? Do you need strong consistency or performance on a particular kind of operation ? Do you want SQL ?LMeyer
I need better performance, as I'm using many joins in my SQL queries. I'm already using MySQL. I intend to have many reads, than writes in to database.Kushi

2 Answers

4
votes

If MySQL is giving you more troubles than it solves, I'd take a look at MongoDB, CouchDB or ElasticSearch (depending on your use case). Maybe even Neo4j. Your choice should come down to several points such as replication, scaling capacity, consistency... I advise you to read carefully some official documentations before you decide. Here's a starting point for comparison.

Going NoSQL will get rid of all the joins and improve your performance but you'll still need to implement a proper hierarchy using adjacency list, nested sets, materialized paths and such...

Keep in mind NoSQL technologies above pretty much all use eventual consistency, which essentially means that your data might not be consistent at a given time among some nodes. If this is a problem you should stick to RDBMS.

1
votes

Postgres has native support for it, using ltree:

-- Ltree type presentation
-- Farshid Ashorui

-- First of all, this is an extension (included with standard installation)
CREATE EXTENSION IF NOT EXISTS ltree;

-- We need to specify `ltree` type.
CREATE TABLE IF NOT EXISTS tree(
    id serial primary key,
    letter char,
    path ltree
);


-- we are using `gist` index for super fast indexing of the path.
-- read more here: http://patshaughnessy.net/2017/12/15/looking-inside-postgres-at-a-gist-index
-- This is Postgres’s GiST index API to find and match descendant nodes
CREATE INDEX IF NOT EXISTS tree_path_idx ON tree USING GIST (path);


-- @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

-- Root of heirarchy
insert into tree (letter, path) values ('A', 'A');
insert into tree (letter, path) values ('B', 'A.B');
-- Notice here, we are deviating 
insert into tree (letter, path) values ('C', 'A.C');
insert into tree (letter, path) values ('D', 'A.C.D');
insert into tree (letter, path) values ('E', 'A.C.E');
insert into tree (letter, path) values ('F', 'A.C.F');
-- Back to B path
insert into tree (letter, path) values ('G', 'A.B.G');





-- @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
-- Search for A.C path
select * from tree where path <@ 'A.C';
-- More advanced one:
select * from tree where strpos(path::varchar, 'A.B.G') = 1;