Tree关系或者说层次结构是开发过程中遇到的普遍的问题,如何把这种关系持久化到数据库中,并能支持高效的SQL查询,这方面有不少讨论和解决方案,这边文章是对这主题相关内容的总结,介绍的比较肤浅,有兴趣的同学可以阅读一下我在参考资料中列举的文章和书籍。
Adjacency List Model(邻接列表模式)
样例表结构和数据准备SQL
CREATE TABLE category(
category_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20) NOT NULL,
parent INT DEFAULT NULL);
INSERT INTO category
VALUES(1,'ELECTRONICS',NULL),(2,'TELEVISIONS',1),(3,'TUBE',2),
(4,'LCD',2),(5,'PLASMA',2),(6,'PORTABLE ELECTRONICS',1),
(7,'MP3 PLAYERS',6),(8,'FLASH',7),
(9,'CD PLAYERS',6),(10,'2 WAY RADIOS',6);
SELECT * FROM category ORDER BY category_id;
+-------------+----------------------+--------+
| category_id | name | parent |
+-------------+----------------------+--------+
| 1 | ELECTRONICS | NULL |
| 2 | TELEVISIONS | 1 |
| 3 | TUBE | 2 |
| 4 | LCD | 2 |
| 5 | PLASMA | 2 |
| 6 | PORTABLE ELECTRONICS | 1 |
| 7 | MP3 PLAYERS | 6 |
| 8 | FLASH | 7 |
| 9 | CD PLAYERS | 6 |
| 10 | 2 WAY RADIOS | 6 |
+-------------+----------------------+--------+
在Adjacency List Model中,每一项都要保留一个指针指向它的父项,最顶层的用NULL或者其他的特殊值来表示。
下面我们来看看Adjacency List Model中主要其中应用场景下SQL的写法:
获取整棵树
SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4
FROM category AS t1
LEFT JOIN category AS t2 ON t2.parent = t1.category_id
LEFT JOIN category AS t3 ON t3.parent = t2.category_id
LEFT JOIN category AS t4 ON t4.parent = t3.category_id
WHERE t1.name = 'ELECTRONICS';
+-------------+----------------------+--------------+-------+
| lev1 | lev2 | lev3 | lev4 |
+-------------+----------------------+--------------+-------+
| ELECTRONICS | TELEVISIONS | TUBE | NULL |
| ELECTRONICS | TELEVISIONS | LCD | NULL |
| ELECTRONICS | TELEVISIONS | PLASMA | NULL |
| ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS | FLASH |
| ELECTRONICS | PORTABLE ELECTRONICS | CD PLAYERS | NULL |
| ELECTRONICS | PORTABLE ELECTRONICS | 2 WAY RADIOS | NULL |
+-------------+----------------------+--------------+-------+
6 rows in set (0.00 sec)
查找树中的全部叶子节点
SELECT t1.name FROM
category AS t1 LEFT JOIN category as t2
ON t1.category_id = t2.parent
WHERE t2.category_id IS NULL;
+--------------+
| name |
+--------------+
| TUBE |
| LCD |
| PLASMA |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
+--------------+
获取单一路径
SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4
FROM category AS t1
LEFT JOIN category AS t2 ON t2.parent = t1.category_id
LEFT JOIN category AS t3 ON t3.parent = t2.category_id
LEFT JOIN category AS t4 ON t4.parent = t3.category_id
WHERE t1.name = 'ELECTRONICS' AND t4.name = 'FLASH';
+-------------+----------------------+-------------+-------+
| lev1 | lev2 | lev3 | lev4 |
+-------------+----------------------+-------------+-------+
| ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS | FLASH |
+-------------+----------------------+-------------+-------+
1 row in set (0.01 sec)
Adjacency List Model的局限性
- 普通的SQL很难处理Adjacency List Model,例如想要获取某个目录的全路径就必须知道目录所在的目录层级。
- 另外,删除某个节点的时候也必须要小心,否则容易造成该节点的子节点成为孤儿节点。
- 通过客户端代码或者存储过程可以解决其中的一些问题。通过过程语句,我们可以从顶层迭代从而获取整个树路径和单一路径;也可以在删除节点的时候,重置子节点的父节点值。
路径实体化(Materialized Path)模式
这种方式用的也很多,就是将节点的层次路径加以保存。
| ENAME |
PATH |
| KING |
1 |
| JONES |
1.1 |
| SCOTT |
1.1.1 |
| ADAMS |
1.1.1.1 |
| FORD |
1.1.2 |
| SMITH |
1.1.2.1 |
| BLAKE |
1.2 |
| ALLEN |
1.2.1 |
| WARD |
1.2.2 |
| CLARK |
1.3 |
| MILLER |
1.3.1 |
各种查询通过借助PATH值来实现,例如
获取某个节点的全部父节点
Nested Set Model(嵌套集合)模式
在嵌套集合模式父节点会包含它的子节点,这种层次关系通过节点的左、右值来体现的,如下图。
CREATE TABLE nested_category (
category_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(20) NOT NULL,
lft INT NOT NULL,
rgt INT NOT NULL
);
INSERT INTO nested_category
VALUES(1,'ELECTRONICS',1,20),(2,'TELEVISIONS',2,9),(3,'TUBE',3,4),
(4,'LCD',5,6),(5,'PLASMA',7,8),(6,'PORTABLE ELECTRONICS',10,19),
(7,'MP3 PLAYERS',11,14),(8,'FLASH',12,13),
(9,'CD PLAYERS',15,16),(10,'2 WAY RADIOS',17,18);
SELECT * FROM nested_category ORDER BY category_id;
+-------------+----------------------+-----+-----+
| category_id | name | lft | rgt |
+-------------+----------------------+-----+-----+
| 1 | ELECTRONICS | 1 | 20 |
| 2 | TELEVISIONS | 2 | 9 |
| 3 | TUBE | 3 | 4 |
| 4 | LCD | 5 | 6 |
| 5 | PLASMA | 7 | 8 |
| 6 | PORTABLE ELECTRONICS | 10 | 19 |
| 7 | MP3 PLAYERS | 11 | 14 |
| 8 | FLASH | 12 | 13 |
| 9 | CD PLAYERS | 15 | 16 |
| 10 | 2 WAY RADIOS | 17 | 18 |
+-------------+----------------------+-----+-----+
树状的显示如下图
左、右值的赋值逻辑是“When working with a tree, we work from left to right, one layer at a time, descending to each node’s children before assigning a right-hand number and moving on to the right. This approach is called the modified preorder tree traversal algorithm.”
在给某个节点右值进行赋值之前,先降低到该节点的子节点,
获取整棵树
SELECT node.name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND parent.name = 'ELECTRONICS'
ORDER BY node.lft;
+----------------------+
| name |
+----------------------+
| ELECTRONICS |
| TELEVISIONS |
| TUBE |
| LCD |
| PLASMA |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
+----------------------+
跟Adjacency List Model的不同之处是查询不需要关注树的深度
查找树中的全部叶子节点
SELECT name
FROM nested_category
WHERE rgt = lft + 1;
+--------------+
| name |
+--------------+
| TUBE |
| LCD |
| PLASMA |
| FLASH |
| CD PLAYERS |
| 2 WAY RADIOS |
+--------------+
获取单一路径
SELECT parent.name
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
AND node.name = 'FLASH'
ORDER BY parent.lft;
+----------------------+
| name |
+----------------------+
| ELECTRONICS |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS |
| FLASH |
+----------------------+
获取节点深度
SELECT node.name, (COUNT(parent.name) - 1) AS depth
FROM nested_category AS node,
nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;
+----------------------+-------+
| name | depth |
+----------------------+-------+
| ELECTRONICS | 0 |
| TELEVISIONS | 1 |
| TUBE | 2 |
| LCD | 2 |
| PLASMA | 2 |
| PORTABLE ELECTRONICS | 1 |
| MP3 PLAYERS | 2 |
| FLASH | 3 |
| CD PLAYERS | 2 |
| 2 WAY RADIOS | 2 |
+----------------------+-------+
如果新增节点
如果想要在“TELEVISIONS”和“PORTABLE ELECTRONICS”节点间新增一个节点,新节点的左、右值就是10和11,该节点右边节点的左、右值都要加2,MYSQL 4.x的实现如下!
LOCK TABLE nested_category WRITE;
SELECT @myRight := rgt FROM nested_category
WHERE name = 'TELEVISIONS';
UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft + 2 WHERE lft > @myRight;
INSERT INTO nested_category(name, lft, rgt) VALUES('GAME CONSOLES', @myRight + 1, @myRight + 2);
UNLOCK TABLES;
如果要给一个没有子节点的节点增加一个子节点,上面的SQL要稍作调整。
LOCK TABLE nested_category WRITE;
SELECT @myLeft := lft FROM nested_category
WHERE name = '2 WAY RADIOS';
UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myLeft;
UPDATE nested_category SET lft = lft + 2 WHERE lft > @myLeft;
INSERT INTO nested_category(name, lft, rgt) VALUES('FRS', @myLeft + 1, @myLeft + 2);
UNLOCK TABLES;
删除节点
如果删除的是一个叶子节点,除了删除节点本身,还要修改该节点右边节点的左、右值。SQL如下
LOCK TABLE nested_category WRITE;
SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1
FROM nested_category
WHERE name = 'GAME CONSOLES';
DELETE FROM nested_category WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight;
UNLOCK TABLES;
如果要删除某个节点以及它的子节点。SQL如下
LOCK TABLE nested_category WRITE;
SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1
FROM nested_category
WHERE name = 'MP3 PLAYERS';
DELETE FROM nested_category WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight;
UNLOCK TABLES;
如果删除父节点的时候,不要求删除子节点,而是将子节点提升到删除节点的级别。SQL如下
LOCK TABLE nested_category WRITE;
SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1
FROM nested_category
WHERE name = 'PORTABLE ELECTRONICS';
DELETE FROM nested_category WHERE lft = @myLeft;
UPDATE nested_category SET rgt = rgt - 1, lft = lft - 1 WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - 2 WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - 2 WHERE lft > @myRight;
UNLOCK TABLES;
参考资料
Storing Hierarchical Data in a Database
Managing Hierarchical Data in MySQL
一种理想的在关系数据库中存储树型结构数据的方法
Joe Celko’s Trees and Hierarchies in SQL for Smarties
A Nested Set Implementation in Java and PostgreSQL
Text Based Nested Set Implementation
Trees in SQL: Nested Sets and Materialized Path