355
votes

I have a MySQL table which is as follows:

id name parent_id
19 category1 0
20 category2 19
21 category3 20
22 category4 21
... ... ...

Now, I want to have a single MySQL query to which I simply supply the id [for instance say id=19] then I should get all its child ids [i.e. result should have ids '20,21,22']....

The hierarchy of the children is not known; it can vary....

I know how to do it using a for loop... but how to achieve the same using a single MySQL query?

16
Suppose the hierarchy is 7 levels deep. What do you expect the output table to look like?Jonathan Leffler
MySQL (still) doesn't support hierarchical queries (as other modern DBMS do). You will need to write a stored procedure or use a different datamodel.a_horse_with_no_name
MYSQL 8.0 will support Recursive query using CTE (Common Table Expressions)user3712320
What about getting the full list of posts starting from the last comment id? Or the last child?joe

16 Answers

510
votes

For MySQL 8+: use the recursive with syntax.
For MySQL 5.x: use inline variables, path IDs, or self-joins.

MySQL 8+

with recursive cte (id, name, parent_id) as (
  select     id,
             name,
             parent_id
  from       products
  where      parent_id = 19
  union all
  select     p.id,
             p.name,
             p.parent_id
  from       products p
  inner join cte
          on p.parent_id = cte.id
)
select * from cte;

The value specified in parent_id = 19 should be set to the id of the parent you want to select all the descendants of.

MySQL 5.x

For MySQL versions that do not support Common Table Expressions (up to version 5.7), you would achieve this with the following query:

select  id,
        name,
        parent_id 
from    (select * from products
         order by parent_id, id) products_sorted,
        (select @pv := '19') initialisation
where   find_in_set(parent_id, @pv)
and     length(@pv := concat(@pv, ',', id))

Here is a fiddle.

Here, the value specified in @pv := '19' should be set to the id of the parent you want to select all the descendants of.

This will work also if a parent has multiple children. However, it is required that each record fulfills the condition parent_id < id, otherwise the results will not be complete.

Variable assignments inside a query

This query uses specific MySQL syntax: variables are assigned and modified during its execution. Some assumptions are made about the order of execution:

  • The from clause is evaluated first. So that is where @pv gets initialised.
  • The where clause is evaluated for each record in the order of retrieval from the from aliases. So this is where a condition is put to only include records for which the parent was already identified as being in the descendant tree (all descendants of the primary parent are progressively added to @pv).
  • The conditions in this where clause are evaluated in order, and the evaluation is interrupted once the total outcome is certain. Therefore the second condition must be in second place, as it adds the id to the parent list, and this should only happen if the id passes the first condition. The length function is only called to make sure this condition is always true, even if the pv string would for some reason yield a falsy value.

All in all, one may find these assumptions too risky to rely on. The documentation warns:

you might get the results you expect, but this is not guaranteed [...] the order of evaluation for expressions involving user variables is undefined.

So even though it works consistently with the above query, the evaluation order may still change, for instance when you add conditions or use this query as a view or sub-query in a larger query. It is a "feature" that will be removed in a future MySQL release:

Previous releases of MySQL made it possible to assign a value to a user variable in statements other than SET. This functionality is supported in MySQL 8.0 for backward compatibility but is subject to removal in a future release of MySQL.

As stated above, from MySQL 8.0 onward you should use the recursive with syntax.

Efficiency

For very large data sets this solution might get slow, as the find_in_set operation is not the most ideal way to find a number in a list, certainly not in a list that reaches a size in the same order of magnitude as the number of records returned.

Alternative 1: with recursive, connect by

More and more databases implement the SQL:1999 ISO standard WITH [RECURSIVE] syntax for recursive queries (e.g. Postgres 8.4+, SQL Server 2005+, DB2, Oracle 11gR2+, SQLite 3.8.4+, Firebird 2.1+, H2, HyperSQL 2.1.0+, Teradata, MariaDB 10.2.2+). And as of version 8.0, also MySQL supports it. See the top of this answer for the syntax to use.

Some databases have an alternative, non-standard syntax for hierarchical look-ups, such as the CONNECT BY clause available on Oracle, DB2, Informix, CUBRID and other databases.

MySQL version 5.7 does not offer such a feature. When your database engine provides this syntax or you can migrate to one that does, then that is certainly the best option to go for. If not, then also consider the following alternatives.

Alternative 2: Path-style Identifiers

Things become a lot easier if you would assign id values that contain the hierarchical information: a path. For example, in your case this could look like this:

ID NAME
19 category1
19/1 category2
19/1/1 category3
19/1/1/1 category4

Then your select would look like this:

select  id,
        name 
from    products
where   id like '19/%'

Alternative 3: Repeated Self-joins

If you know an upper limit for how deep your hierarchy tree can become, you can use a standard sql query like this:

select      p6.parent_id as parent6_id,
            p5.parent_id as parent5_id,
            p4.parent_id as parent4_id,
            p3.parent_id as parent3_id,
            p2.parent_id as parent2_id,
            p1.parent_id as parent_id,
            p1.id as product_id,
            p1.name
from        products p1
left join   products p2 on p2.id = p1.parent_id 
left join   products p3 on p3.id = p2.parent_id 
left join   products p4 on p4.id = p3.parent_id  
left join   products p5 on p5.id = p4.parent_id  
left join   products p6 on p6.id = p5.parent_id
where       19 in (p1.parent_id, 
                   p2.parent_id, 
                   p3.parent_id, 
                   p4.parent_id, 
                   p5.parent_id, 
                   p6.parent_id) 
order       by 1, 2, 3, 4, 5, 6, 7;

See this fiddle

The where condition specifies which parent you want to retrieve the descendants of. You can extend this query with more levels as needed.

88
votes

From the blog Managing Hierarchical Data in MySQL

Table structure

+-------------+----------------------+--------+
| category_id | name                 | parent |
+-------------+----------------------+--------+
|           1 | ELECTRONICS          |   NULL |
|           2 | TELEVISIONS          |      1 |
|           3 | TUBE                 |      2 |
|           4 | LCD                  |      2 |
|           5 | PLASMA               |      2 |
|           6 | PORTABLE ELECTRONICS |      1 |
|           7 | MP3 PLAYERS          |      6 |
|           8 | FLASH                |      7 |
|           9 | CD PLAYERS           |      6 |
|          10 | 2 WAY RADIOS         |      6 |
+-------------+----------------------+--------+

Query:

SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4
FROM category AS t1
LEFT JOIN category AS t2 ON t2.parent = t1.category_id
LEFT JOIN category AS t3 ON t3.parent = t2.category_id
LEFT JOIN category AS t4 ON t4.parent = t3.category_id
WHERE t1.name = 'ELECTRONICS';

Output

+-------------+----------------------+--------------+-------+
| lev1        | lev2                 | lev3         | lev4  |
+-------------+----------------------+--------------+-------+
| ELECTRONICS | TELEVISIONS          | TUBE         | NULL  |
| ELECTRONICS | TELEVISIONS          | LCD          | NULL  |
| ELECTRONICS | TELEVISIONS          | PLASMA       | NULL  |
| ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS  | FLASH |
| ELECTRONICS | PORTABLE ELECTRONICS | CD PLAYERS   | NULL  |
| ELECTRONICS | PORTABLE ELECTRONICS | 2 WAY RADIOS | NULL  |
+-------------+----------------------+--------------+-------+

Most users at one time or another have dealt with hierarchical data in a SQL database and no doubt learned that the management of hierarchical data is not what a relational database is intended for. The tables of a relational database are not hierarchical (like XML), but are simply a flat list. Hierarchical data has a parent-child relationship that is not naturally represented in a relational database table. Read more

Refer the blog for more details.

EDIT:

select @pv:=category_id as category_id, name, parent from category
join
(select @pv:=19)tmp
where parent=@pv

Output:

category_id name    parent
19  category1   0
20  category2   19
21  category3   20
22  category4   21

Reference: How to do the Recursive SELECT query in Mysql?

16
votes

Did the same thing for another quetion here

Mysql select recursive get all child with multiple level

The query will be :

SELECT GROUP_CONCAT(lv SEPARATOR ',') FROM (
  SELECT @pv:=(
    SELECT GROUP_CONCAT(id SEPARATOR ',')
    FROM table WHERE parent_id IN (@pv)
  ) AS lv FROM table 
  JOIN
  (SELECT @pv:=1)tmp
  WHERE parent_id IN (@pv)
) a;
13
votes

Try these:

Table definition:

DROP TABLE IF EXISTS category;
CREATE TABLE category (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(20),
    parent_id INT,
    CONSTRAINT fk_category_parent FOREIGN KEY (parent_id)
    REFERENCES category (id)
) engine=innodb;

Experimental rows:

INSERT INTO category VALUES
(19, 'category1', NULL),
(20, 'category2', 19),
(21, 'category3', 20),
(22, 'category4', 21),
(23, 'categoryA', 19),
(24, 'categoryB', 23),
(25, 'categoryC', 23),
(26, 'categoryD', 24);

Recursive Stored procedure:

DROP PROCEDURE IF EXISTS getpath;
DELIMITER $$
CREATE PROCEDURE getpath(IN cat_id INT, OUT path TEXT)
BEGIN
    DECLARE catname VARCHAR(20);
    DECLARE temppath TEXT;
    DECLARE tempparent INT;
    SET max_sp_recursion_depth = 255;
    SELECT name, parent_id FROM category WHERE id=cat_id INTO catname, tempparent;
    IF tempparent IS NULL
    THEN
        SET path = catname;
    ELSE
        CALL getpath(tempparent, temppath);
        SET path = CONCAT(temppath, '/', catname);
    END IF;
END$$
DELIMITER ;

Wrapper function for the stored procedure:

DROP FUNCTION IF EXISTS getpath;
DELIMITER $$
CREATE FUNCTION getpath(cat_id INT) RETURNS TEXT DETERMINISTIC
BEGIN
    DECLARE res TEXT;
    CALL getpath(cat_id, res);
    RETURN res;
END$$
DELIMITER ;

Select example:

SELECT id, name, getpath(id) AS path FROM category;

Output:

+----+-----------+-----------------------------------------+
| id | name      | path                                    |
+----+-----------+-----------------------------------------+
| 19 | category1 | category1                               |
| 20 | category2 | category1/category2                     |
| 21 | category3 | category1/category2/category3           |
| 22 | category4 | category1/category2/category3/category4 |
| 23 | categoryA | category1/categoryA                     |
| 24 | categoryB | category1/categoryA/categoryB           |
| 25 | categoryC | category1/categoryA/categoryC           |
| 26 | categoryD | category1/categoryA/categoryB/categoryD |
+----+-----------+-----------------------------------------+

Filtering rows with certain path:

SELECT id, name, getpath(id) AS path FROM category HAVING path LIKE 'category1/category2%';

Output:

+----+-----------+-----------------------------------------+
| id | name      | path                                    |
+----+-----------+-----------------------------------------+
| 20 | category2 | category1/category2                     |
| 21 | category3 | category1/category2/category3           |
| 22 | category4 | category1/category2/category3/category4 |
+----+-----------+-----------------------------------------+
10
votes

The best approach I've come up with is

  1. Use lineage to store\sort\trace trees. That's more than enough, and works thousands times faster for reading than any other approach. It also allows to stay on that pattern even if DB will change(as ANY db will allow that pattern to be used)
  2. Use function that determines lineage for specific ID.
  3. Use it as you wish (in selects, or on CUD operations, or even by jobs).

Lineage approach descr. can be found wherever, for example Here or here. As of function - that is what enspired me.

In the end - got more-or-less simple, relatively fast, and SIMPLE solution.

Function's body

-- --------------------------------------------------------------------------------
-- Routine DDL
-- Note: comments before and after the routine body will not be stored by the server
-- --------------------------------------------------------------------------------
DELIMITER $$

CREATE DEFINER=`root`@`localhost` FUNCTION `get_lineage`(the_id INT) RETURNS text CHARSET utf8
    READS SQL DATA
BEGIN

 DECLARE v_rec INT DEFAULT 0;

 DECLARE done INT DEFAULT FALSE;
 DECLARE v_res text DEFAULT '';
 DECLARE v_papa int;
 DECLARE v_papa_papa int DEFAULT -1;
 DECLARE csr CURSOR FOR 
  select _id,parent_id -- @n:=@n+1 as rownum,T1.* 
  from 
    (SELECT @r AS _id,
        (SELECT @r := table_parent_id FROM table WHERE table_id = _id) AS parent_id,
        @l := @l + 1 AS lvl
    FROM
        (SELECT @r := the_id, @l := 0,@n:=0) vars,
        table m
    WHERE @r <> 0
    ) T1
    where T1.parent_id is not null
 ORDER BY T1.lvl DESC;
 DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
    open csr;
    read_loop: LOOP
    fetch csr into v_papa,v_papa_papa;
        SET v_rec = v_rec+1;
        IF done THEN
            LEAVE read_loop;
        END IF;
        -- add first
        IF v_rec = 1 THEN
            SET v_res = v_papa_papa;
        END IF;
        SET v_res = CONCAT(v_res,'-',v_papa);
    END LOOP;
    close csr;
    return v_res;
END

And then you just

select get_lineage(the_id)

Hope it helps somebody :)

9
votes

If you need quick read speed, the best option is to use a closure table. A closure table contains a row for each ancestor/descendant pair. So in your example, the closure table would look like

ancestor | descendant | depth
0        | 0          | 0
0        | 19         | 1
0        | 20         | 2
0        | 21         | 3
0        | 22         | 4
19       | 19         | 0
19       | 20         | 1
19       | 21         | 3
19       | 22         | 4
20       | 20         | 0
20       | 21         | 1
20       | 22         | 2
21       | 21         | 0
21       | 22         | 1
22       | 22         | 0

Once you have this table, hierarchical queries become very easy and fast. To get all the descendants of category 20:

SELECT cat.* FROM categories_closure AS cl
INNER JOIN categories AS cat ON cat.id = cl.descendant
WHERE cl.ancestor = 20 AND cl.depth > 0

Of course, there is a big downside whenever you use denormalized data like this. You need to maintain the closure table alongside your categories table. The best way is probably to use triggers, but it is somewhat complex to correctly track inserts/updates/deletes for closure tables. As with anything, you need to look at your requirements and decide what approach is best for you.

Edit: See the question What are the options for storing hierarchical data in a relational database? for more options. There are different optimal solutions for different situations.

6
votes

Simple query to list child's of first recursion:

select @pv:=id as id, name, parent_id
from products
join (select @pv:=19)tmp
where parent_id=@pv

Result:

id  name        parent_id
20  category2   19
21  category3   20
22  category4   21
26  category24  22

... with left join:

select
    @pv:=p1.id as id
  , p2.name as parent_name
  , p1.name name
  , p1.parent_id
from products p1
join (select @pv:=19)tmp
left join products p2 on p2.id=p1.parent_id -- optional join to get parent name
where p1.parent_id=@pv

The solution of @tincot to list all child's:

select  id,
        name,
        parent_id 
from    (select * from products
         order by parent_id, id) products_sorted,
        (select @pv := '19') initialisation
where   find_in_set(parent_id, @pv) > 0
and     @pv := concat(@pv, ',', id)

Test it online with Sql Fiddle and see all results.

http://sqlfiddle.com/#!9/a318e3/4/0

5
votes

Based on the @trincot answer, very good explained, I use WITH RECURSIVE () statement to create a breadcrumb using id of the current page and go backwards in the hierarchy to find the every parent in my route table.

So, the @trincot solution is adapted here in the opposite direction to find parents instead of descendants.

I also added depth value which is usefull to invert result order (otherwise the breadcrumb would be upside down).

WITH RECURSIVE cte (
    `id`,
    `title`,
    `url`,
    `icon`,
    `class`,
    `parent_id`,
    `depth`
) AS (
    SELECT   
        `id`,
        `title`,
        `url`,
        `icon`,
        `class`,
        `parent_id`,
        1 AS `depth` 
    FROM     `route`
    WHERE    `id` = :id
      
    UNION ALL 
    SELECT 
        P.`id`,
        P.`title`,
        P.`url`,
        P.`icon`,
        P.`class`,
        P.`parent_id`,
        `depth` + 1
    FROM `route` P
        
    INNER JOIN cte
        ON P.`id` = cte.`parent_id`
)
SELECT * FROM cte ORDER BY `depth` DESC;

Before upgrade to mySQL 8+, I was using vars but it's deprecated and no more working on my 8.0.22 version !

EDIT 2021-02-19 : Example for hierarchical menu

After @david comment I decided to try to make a full hierarchical menu with all nodes and sorted as I want (with sorting column which sort items in each depth). Very usefull for my user/authorization matrix page.

This really simplifies my old version with one query on each depth (PHP loops).

ERP authroization Matrix

This example intergrates an INNER JOIN with url table to filter route by website (multi-websites CMS system).

You can see the essential path column that contains CONCAT() function to sort the menu in the right way.

SELECT R.* FROM (
    WITH RECURSIVE cte (
        `id`,
        `title`,
        `url`,
        `icon`,
        `class`,
        `parent`,
        `depth`,
        `sorting`,
        `path`
    ) AS (
        SELECT 
            `id`,
            `title`,
            `url`,
            `icon`,
            `class`,
            `parent`,
            1 AS `depth`,
            `sorting`,
            CONCAT(`sorting`, ' ' , `title`) AS `path`
        FROM `route`
        WHERE `parent` = 0
        UNION ALL SELECT 
            D.`id`,
            D.`title`,
            D.`url`,
            D.`icon`,
            D.`class`,
            D.`parent`,
            `depth` + 1,
            D.`sorting`,
            CONCAT(cte.`path`, ' > ', D.`sorting`, ' ' , D.`title`)
        FROM `route` D
        INNER JOIN cte
            ON cte.`id` = D.`parent`
    )
    SELECT * FROM cte
) R

INNER JOIN `url` U
    ON R.`id` = U.`route_id`
    AND U.`site_id` = 1

ORDER BY `path` ASC  
4
votes

You can do it like this in other databases quite easily with a recursive query (YMMV on performance).

The other way to do it is to store two extra bits of data, a left and right value. The left and right value are derived from a pre-order traversal of the tree structure you're representing.

This is know as Modified Preorder Tree Traversal and lets you run a simple query to get all parent values at once. It also goes by the name "nested set".

4
votes

Something not mentioned here, although a bit similar to the second alternative of the accepted answer but different and low cost for big hierarchy query and easy (insert update delete) items, would be adding a persistent path column for each item.

some like:

id | name        | path
19 | category1   | /19
20 | category2   | /19/20
21 | category3   | /19/20/21
22 | category4   | /19/20/21/22

Example:

-- get children of category3:
SELECT * FROM my_table WHERE path LIKE '/19/20/21%'
-- Reparent an item:
UPDATE my_table SET path = REPLACE(path, '/19/20', '/15/16') WHERE path LIKE '/19/20/%'

Optimise the path length and ORDER BY path using base36 encoding instead real numeric path id

 // base10 => base36
 '1' => '1',
 '10' => 'A',
 '100' => '2S',
 '1000' => 'RS',
 '10000' => '7PS',
 '100000' => '255S',
 '1000000' => 'LFLS',
 '1000000000' => 'GJDGXS',
 '1000000000000' => 'CRE66I9S'

https://en.wikipedia.org/wiki/Base36

Suppressing also the slash '/' separator by using fixed length and padding to the encoded id

Detailed optimization explanation here: https://bojanz.wordpress.com/2014/04/25/storing-hierarchical-data-materialized-path/

TODO

building a function or procedure to split path for retreive ancestors of one item

3
votes

Just use BlueM/tree php class for make tree of a self-relation table in mysql.

Tree and Tree\Node are PHP classes for handling data that is structured hierarchically using parent ID references. A typical example is a table in a relational database where each record’s “parent” field references the primary key of another record. Of course, Tree cannot only use data originating from a database, but anything: you supply the data, and Tree uses it, regardless of where the data came from and how it was processed. read more

Here is an example of using BlueM/tree:

<?php 
require '/path/to/vendor/autoload.php'; $db = new PDO(...); // Set up your database connection 
$stm = $db->query('SELECT id, parent, title FROM tablename ORDER BY title'); 
$records = $stm->fetchAll(PDO::FETCH_ASSOC); 
$tree = new BlueM\Tree($records); 
...
2
votes

Its a little tricky one, check this whether it is working for you

select a.id,if(a.parent = 0,@varw:=concat(a.id,','),@varw:=concat(a.id,',',@varw)) as list from (select * from recursivejoin order by if(parent=0,id,parent) asc) a left join recursivejoin b on (a.id = b.parent),(select @varw:='') as c  having list like '%19,%';

SQL fiddle link http://www.sqlfiddle.com/#!2/e3cdf/2

Replace with your field and table name appropriately.

2
votes

enter image description here

It's a category table.

SELECT  id,
        NAME,
        parent_category 
FROM    (SELECT * FROM category
         ORDER BY parent_category, id) products_sorted,
        (SELECT @pv := '2') initialisation
WHERE   FIND_IN_SET(parent_category, @pv) > 0
AND     @pv := CONCAT(@pv, ',', id)

Output:: enter image description here

0
votes

This works for me, hope this will work for you too. It will give you a Record set Root to Child for any Specific Menu. Change the Field name as per your requirements.

SET @id:= '22';

SELECT Menu_Name, (@id:=Sub_Menu_ID ) as Sub_Menu_ID, Menu_ID 
FROM 
    ( SELECT Menu_ID, Menu_Name, Sub_Menu_ID 
      FROM menu 
      ORDER BY Sub_Menu_ID DESC
    ) AS aux_table 
    WHERE Menu_ID = @id
     ORDER BY Sub_Menu_ID;
-1
votes

I found it more easily to :

1) create a function that will check if a item is anywhere in the parent hierarchy of another one. Something like this (I will not write the function, make it with WHILE DO) :

is_related(id, parent_id);

in your example

is_related(21, 19) == 1;
is_related(20, 19) == 1;
is_related(21, 18) == 0;

2) use a sub-select , something like this:

select ...
from table t
join table pt on pt.id in (select i.id from table i where is_related(t.id,i.id));
-2
votes

I have made a query for you. This will give you Recursive Category with a Single Query:

SELECT id,NAME,'' AS subName,'' AS subsubName,'' AS subsubsubName FROM Table1 WHERE prent is NULL
UNION 
SELECT b.id,a.name,b.name AS subName,'' AS subsubName,'' AS subsubsubName FROM Table1 AS a LEFT JOIN Table1 AS b ON b.prent=a.id WHERE a.prent is NULL AND b.name IS NOT NULL 
UNION 
SELECT c.id,a.name,b.name AS subName,c.name AS subsubName,'' AS subsubsubName FROM Table1 AS a LEFT JOIN Table1 AS b ON b.prent=a.id LEFT JOIN Table1 AS c ON c.prent=b.id WHERE a.prent is NULL AND c.name IS NOT NULL 
UNION 
SELECT d.id,a.name,b.name AS subName,c.name AS subsubName,d.name AS subsubsubName FROM Table1 AS a LEFT JOIN Table1 AS b ON b.prent=a.id LEFT JOIN Table1 AS c ON c.prent=b.id LEFT JOIN Table1 AS d ON d.prent=c.id WHERE a.prent is NULL AND d.name IS NOT NULL 
ORDER BY NAME,subName,subsubName,subsubsubName

Here is a fiddle.