Trees In SQL
Tree关系或者说层次结构是开发过程中遇到的普遍的问题,如何把这种关系持久化到数据库中,并能支持高效的SQL查询,这方面有不少讨论和解决方案,这边文章是对这主题相关内容的总结,介绍的比较肤浅,有兴趣的同学可以阅读一下我在参考资料中列举的文章和书籍。
Adjacency List Model(邻接列表模式)
样例表结构和数据准备SQL
CREATE TABLE category( category_id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(20) NOT NULL, parent INT DEFAULT NULL);
INSERT INTO category VALUES(1,'ELECTRONICS',NULL),(2,'TELEVISIONS',1),(3,'TUBE',2), (4,'LCD',2),(5,'PLASMA',2),(6,'PORTABLE ELECTRONICS',1), (7,'MP3 PLAYERS',6),(8,'FLASH',7), (9,'CD PLAYERS',6),(10,'2 WAY RADIOS',6);
SELECT * FROM category ORDER BY category_id; +-------------+----------------------+--------+ | category_id | name | parent | +-------------+----------------------+--------+ | 1 | ELECTRONICS | NULL | | 2 | TELEVISIONS | 1 | | 3 | TUBE | 2 | | 4 | LCD | 2 | | 5 | PLASMA | 2 | | 6 | PORTABLE ELECTRONICS | 1 | | 7 | MP3 PLAYERS | 6 | | 8 | FLASH | 7 | | 9 | CD PLAYERS | 6 | | 10 | 2 WAY RADIOS | 6 | +-------------+----------------------+--------+
在Adjacency List Model中,每一项都要保留一个指针指向它的父项,最顶层的用NULL或者其他的特殊值来表示。
下面我们来看看Adjacency List Model中主要其中应用场景下SQL的写法:
获取整棵树
SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4 FROM category AS t1 LEFT JOIN category AS t2 ON t2.parent = t1.category_id LEFT JOIN category AS t3 ON t3.parent = t2.category_id LEFT JOIN category AS t4 ON t4.parent = t3.category_id WHERE t1.name = 'ELECTRONICS'; +-------------+----------------------+--------------+-------+ | lev1 | lev2 | lev3 | lev4 | +-------------+----------------------+--------------+-------+ | ELECTRONICS | TELEVISIONS | TUBE | NULL | | ELECTRONICS | TELEVISIONS | LCD | NULL | | ELECTRONICS | TELEVISIONS | PLASMA | NULL | | ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS | FLASH | | ELECTRONICS | PORTABLE ELECTRONICS | CD PLAYERS | NULL | | ELECTRONICS | PORTABLE ELECTRONICS | 2 WAY RADIOS | NULL | +-------------+----------------------+--------------+-------+ 6 rows in set (0.00 sec)
查找树中的全部叶子节点
SELECT t1.name FROM category AS t1 LEFT JOIN category as t2 ON t1.category_id = t2.parent WHERE t2.category_id IS NULL; +--------------+ | name | +--------------+ | TUBE | | LCD | | PLASMA | | FLASH | | CD PLAYERS | | 2 WAY RADIOS | +--------------+
获取单一路径
SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4 FROM category AS t1 LEFT JOIN category AS t2 ON t2.parent = t1.category_id LEFT JOIN category AS t3 ON t3.parent = t2.category_id LEFT JOIN category AS t4 ON t4.parent = t3.category_id WHERE t1.name = 'ELECTRONICS' AND t4.name = 'FLASH'; +-------------+----------------------+-------------+-------+ | lev1 | lev2 | lev3 | lev4 | +-------------+----------------------+-------------+-------+ | ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS | FLASH | +-------------+----------------------+-------------+-------+ 1 row in set (0.01 sec)
Adjacency List Model的局限性
- 普通的SQL很难处理Adjacency List Model,例如想要获取某个目录的全路径就必须知道目录所在的目录层级。
- 另外,删除某个节点的时候也必须要小心,否则容易造成该节点的子节点成为孤儿节点。
- 通过客户端代码或者存储过程可以解决其中的一些问题。通过过程语句,我们可以从顶层迭代从而获取整个树路径和单一路径;也可以在删除节点的时候,重置子节点的父节点值。
路径实体化(Materialized Path)模式
这种方式用的也很多,就是将节点的层次路径加以保存。
| ENAME | PATH |
| KING | 1 |
| JONES | 1.1 |
| SCOTT | 1.1.1 |
| ADAMS | 1.1.1.1 |
| FORD | 1.1.2 |
| SMITH | 1.1.2.1 |
| BLAKE | 1.2 |
| ALLEN | 1.2.1 |
| WARD | 1.2.2 |
| CLARK | 1.3 |
| MILLER | 1.3.1 |
各种查询通过借助PATH值来实现,例如
获取某个节点的全部父节点
Nested Set Model(嵌套集合)模式
在嵌套集合模式父节点会包含它的子节点,这种层次关系通过节点的左、右值来体现的,如下图。
CREATE TABLE nested_category ( category_id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(20) NOT NULL, lft INT NOT NULL, rgt INT NOT NULL ); INSERT INTO nested_category VALUES(1,'ELECTRONICS',1,20),(2,'TELEVISIONS',2,9),(3,'TUBE',3,4), (4,'LCD',5,6),(5,'PLASMA',7,8),(6,'PORTABLE ELECTRONICS',10,19), (7,'MP3 PLAYERS',11,14),(8,'FLASH',12,13), (9,'CD PLAYERS',15,16),(10,'2 WAY RADIOS',17,18); SELECT * FROM nested_category ORDER BY category_id; +-------------+----------------------+-----+-----+ | category_id | name | lft | rgt | +-------------+----------------------+-----+-----+ | 1 | ELECTRONICS | 1 | 20 | | 2 | TELEVISIONS | 2 | 9 | | 3 | TUBE | 3 | 4 | | 4 | LCD | 5 | 6 | | 5 | PLASMA | 7 | 8 | | 6 | PORTABLE ELECTRONICS | 10 | 19 | | 7 | MP3 PLAYERS | 11 | 14 | | 8 | FLASH | 12 | 13 | | 9 | CD PLAYERS | 15 | 16 | | 10 | 2 WAY RADIOS | 17 | 18 | +-------------+----------------------+-----+-----+
树状的显示如下图
左、右值的赋值逻辑是“When working with a tree, we work from left to right, one layer at a time, descending to each node’s children before assigning a right-hand number and moving on to the right. This approach is called the modified preorder tree traversal algorithm.”
在给某个节点右值进行赋值之前,先降低到该节点的子节点,
获取整棵树
SELECT node.name FROM nested_category AS node, nested_category AS parent WHERE node.lft BETWEEN parent.lft AND parent.rgt AND parent.name = 'ELECTRONICS' ORDER BY node.lft; +----------------------+ | name | +----------------------+ | ELECTRONICS | | TELEVISIONS | | TUBE | | LCD | | PLASMA | | PORTABLE ELECTRONICS | | MP3 PLAYERS | | FLASH | | CD PLAYERS | | 2 WAY RADIOS | +----------------------+
跟Adjacency List Model的不同之处是查询不需要关注树的深度
查找树中的全部叶子节点
SELECT name FROM nested_category WHERE rgt = lft + 1; +--------------+ | name | +--------------+ | TUBE | | LCD | | PLASMA | | FLASH | | CD PLAYERS | | 2 WAY RADIOS | +--------------+
获取单一路径
SELECT parent.name FROM nested_category AS node, nested_category AS parent WHERE node.lft BETWEEN parent.lft AND parent.rgt AND node.name = 'FLASH' ORDER BY parent.lft; +----------------------+ | name | +----------------------+ | ELECTRONICS | | PORTABLE ELECTRONICS | | MP3 PLAYERS | | FLASH | +----------------------+
获取节点深度
SELECT node.name, (COUNT(parent.name) - 1) AS depth FROM nested_category AS node, nested_category AS parent WHERE node.lft BETWEEN parent.lft AND parent.rgt GROUP BY node.name ORDER BY node.lft; +----------------------+-------+ | name | depth | +----------------------+-------+ | ELECTRONICS | 0 | | TELEVISIONS | 1 | | TUBE | 2 | | LCD | 2 | | PLASMA | 2 | | PORTABLE ELECTRONICS | 1 | | MP3 PLAYERS | 2 | | FLASH | 3 | | CD PLAYERS | 2 | | 2 WAY RADIOS | 2 | +----------------------+-------+
如果新增节点
如果想要在“TELEVISIONS”和“PORTABLE ELECTRONICS”节点间新增一个节点,新节点的左、右值就是10和11,该节点右边节点的左、右值都要加2,MYSQL 4.x的实现如下!
LOCK TABLE nested_category WRITE; SELECT @myRight := rgt FROM nested_category WHERE name = 'TELEVISIONS'; UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myRight; UPDATE nested_category SET lft = lft + 2 WHERE lft > @myRight; INSERT INTO nested_category(name, lft, rgt) VALUES('GAME CONSOLES', @myRight + 1, @myRight + 2); UNLOCK TABLES;
如果要给一个没有子节点的节点增加一个子节点,上面的SQL要稍作调整。
LOCK TABLE nested_category WRITE; SELECT @myLeft := lft FROM nested_category WHERE name = '2 WAY RADIOS'; UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myLeft; UPDATE nested_category SET lft = lft + 2 WHERE lft > @myLeft; INSERT INTO nested_category(name, lft, rgt) VALUES('FRS', @myLeft + 1, @myLeft + 2); UNLOCK TABLES;
删除节点
如果删除的是一个叶子节点,除了删除节点本身,还要修改该节点右边节点的左、右值。SQL如下
LOCK TABLE nested_category WRITE; SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1 FROM nested_category WHERE name = 'GAME CONSOLES'; DELETE FROM nested_category WHERE lft BETWEEN @myLeft AND @myRight; UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight; UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight; UNLOCK TABLES;
如果要删除某个节点以及它的子节点。SQL如下
LOCK TABLE nested_category WRITE; SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1 FROM nested_category WHERE name = 'MP3 PLAYERS'; DELETE FROM nested_category WHERE lft BETWEEN @myLeft AND @myRight; UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight; UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight; UNLOCK TABLES;
如果删除父节点的时候,不要求删除子节点,而是将子节点提升到删除节点的级别。SQL如下
LOCK TABLE nested_category WRITE; SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1 FROM nested_category WHERE name = 'PORTABLE ELECTRONICS'; DELETE FROM nested_category WHERE lft = @myLeft; UPDATE nested_category SET rgt = rgt - 1, lft = lft - 1 WHERE lft BETWEEN @myLeft AND @myRight; UPDATE nested_category SET rgt = rgt - 2 WHERE rgt > @myRight; UPDATE nested_category SET lft = lft - 2 WHERE lft > @myRight; UNLOCK TABLES;
参考资料
Storing Hierarchical Data in a Database
Managing Hierarchical Data in MySQL
Joe Celko’s Trees and Hierarchies in SQL for Smarties
A Nested Set Implementation in Java and PostgreSQL
