Tree-structured data is common in project development. For example, it is typical for Forum topic messages.
Each topic node) can have n message subnodes ). These messages can have their own messages. Therefore, this structure is a tree. This article describes how to store this tree structure in a database.
Assume there is a tree as follows:
650) this. width = 650; "style =" border-bottom: 0px; border-left: 0px; border-top: 0px; border-right: 0px "title =" Untitled "border =" 0 "alt =" Untitled "src =" http://www.bkjia.com/uploads/allimg/131228/222KWG6-0.png "height =" 268 "/>
Method 1
Note: In this example, the database is SQLite, so the SQL statement is only valid for SQLite. You can refer to this method for other databases.
To store data in a database, the simplest and most direct method is to store the parent node ID of each element.
Currently, the naming of this method depends on the parent node method. Therefore, the table structure is designed as follows:
650) this. width = 650; "style =" border-bottom: 0px; border-left: 0px; border-top: 0px; border-right: 0px "title =" 598F8C3BAEC249C7B7C21FCAE42C097F "border =" 0 "alt =" 598F8C3BAEC249C7B7C21FCAE42C097F "src =" http://www.bkjia.com/uploads/allimg/131228/222KQY9-1.jpg "height =" 97 "/>
The stored data is in the following format:
650) this. width = 650; "style =" border-bottom: 0px; border-left: 0px; border-top: 0px; border-right: 0px "title =" inherit "border =" 0 "alt =" d91e3167473f4f75b42e8542953be78c "src =" http://www.bkjia.com/uploads/allimg/131228/222KS3A-2.jpg "height =" 250 "/>
In this structure, it is very easy to query the direct subnodes of a node, for example, to query the subnodes of node D.
select * from tree1 where parentid=4
If you want to insert a node, for example, under node D, insert another M node.
You only need the following SQL:
INSERT INTO tree1 (value,parentid) VALUES('M',4);
This structure is a little complicated when you look for all the subnodes of a node. Both SELECT and DELETE may involve obtaining all the subnodes. For example, if you want to delete a node and delete all its subnodes, you must first obtain the IDs of all the subnodes because the subnodes are not just directly subnodes, it may also contain subnodes. For example, to delete a node D and Its subnodes, you must first find all the subnodes under the node D and then delete them. The SQL statement is as follows:
Select nodeid from tree1 where parentid = 4 -- Return 8, 9 select nodeid from tree1 where parentid in (8, 9) -- return 10, 11, 12 select nodeid from tree1 where parentid in (10, 11, 12) -- return NULL delete from tree1 where nodeid in)
If only the D node is deleted, but other nodes are not deleted but upgraded, you must modify the parentid of the child node before deleting the D node.
As demonstrated above, the biggest disadvantage of this method of relying on the parent node is that it cannot directly obtain all the child nodes of a node. Therefore, if you want to select all the sub-nodes, it requires tedious steps, which is not conducive to aggregation.
For some database products that support recursive query statements, such as Microsoft SQL Server, you can use CTE technology to implement recursive query. For example, you want to query all the subnodes of node D. You only need the following statement:
WITH tmp AS(SELECT * FROM Tree1 WHERE nodeid = 4UNION ALLSELECT a.* FROM Tree1 AS a,tmp AS b WHERE a.parentid = b. nodeid)SELECT * FROM tmp
However, it is complicated to implement databases that do not support recursive queries.
Method 2
Another method for comparison is the storage path. Name it Path Enumeration.
This method stores the path from the root node to each node.
650) this. width = 650; "style =" border-bottom: 0px; border-left: 0px; border-top: 0px; border-right: 0px "title =" 55778B9842DC47279FFCFF48B54ABDA1 "border =" 0 "alt =" 55778B9842DC47279FFCFF48B54ABDA1 "src =" http://www.bkjia.com/uploads/allimg/131228/222KW020-3.jpg "height =" 235 "/>
This data structure shows the depth of the subnode at a glance.
If you want to query subnodes under a node, you only need to match them according to the path. For example, you want to query all subnodes under node D.
select * from tree2 where path like '%/4/%'
Or, for efficiency, write it directly
select * from tree2 where path like '1/4/%'
650) this. width = 650; "style =" border-bottom: 0px; border-left: 0px; border-top: 0px; border-right: 0px "title =" 214EF7DB11684064ABB9C4FCBDDD5CD4 "border =" 0 "alt =" 214EF7DB11684064ABB9C4FCBDDD5CD4 "src =" http://www.bkjia.com/uploads/allimg/131228/222KW213-4.jpg "height =" 128 "/>
It is also easy to perform aggregation operations, such as querying the total number of nodes under node D.
Select count (*) from tree2 where path like '2014/1/% ';
It is a little effort to insert a node. Insert yourself, find the Path of the parent node, and update the ID generated by yourself to the path. For example, you need to insert M nodes behind node L.
First insert your own M, and then get a nodeid such as nodeid = 13. Then M is inserted to the end of L. Therefore, we can find that the path of L is 1/4/8/12 /, therefore, the update M path is 1/4/8/12/13.
Update tree2 setpath = (select path from tree2 where nodeid = 12) -- start stitching here | last_insert_rowid () | '/' wherenodeid = last_insert_rowid ();
One obvious disadvantage of this method is that the length of the path field is limited, which means that the node depth cannot be increased without limit. Therefore, this method is suitable for storing small tree structures.
Method 3
The following describes a method called a closure table.
This method records the relationship between all nodes in the tree, not only the direct parent-child relationship, it requires two tables, except the node table itself, you also need to use a table to store the relationship between the node ancestor and the child node and add a row of nodes pointing to itself), and add a field to indicate the depth as needed. Therefore, this method has a large amount of data. The table structure is as follows:
Tree3 table:
650) this. width = 650; "style =" border-bottom: 0px; border-left: 0px; border-top: 0px; border-right: 0px "title =" E1D5EEEE05EF4188ADE17192C9B95ECC "border =" 0 "alt =" E1D5EEEE05EF4188ADE17192C9B95ECC "src =" http://www.bkjia.com/uploads/allimg/131228/222KWT7-5.jpg "height =" 74 "/>
NodeRelation table:
650) this. width = 650; "style =" border-bottom: 0px; border-left: 0px; border-top: 0px; border-right: 0px "title =" C3E90EA4EEBE490D87035F98DFC39EA2 "border =" 0 "alt =" C3E90EA4EEBE490D87035F98DFC39EA2 "src =" http://www.bkjia.com/uploads/allimg/131228/222KS606-6.jpg "height =" 98 "/>
In the example tree, the inserted data is as follows:
Tree3 table data
650) this. width = 650; "style =" border-bottom: 0px; border-left: 0px; border-top: 0px; border-right: 0px "title =" 20ADFF42DB6E45CC9CA0C287DA49C5B5 "border =" 0 "alt =" 20ADFF42DB6E45CC9CA0C287DA49C5B5 "src =" http://www.bkjia.com/uploads/allimg/131228/222KU535-7.jpg "height =" 241 "/>
NodeRelation table data
650) this. width = 650; "style =" border-bottom: 0px; border-left: 0px; border-top: 0px; border-right: 0px "title =" inherit "border =" 0 "alt =" 9F3B8EC76E0B4D67830FF29B6F6EEC4E "src =" http://www.bkjia.com/uploads/allimg/131228/222KR2Z-8.jpg "height =" 637 "/>
We can see that the NodeRelation table has a large amount of data. However, the query is very convenient. For example, to query sub-elements of node D
Only
select * from NodeRelation where ancestor=4;
To query the direct subnode of node D, add depth = 1
select * from NodeRelation where ancestor=4 and depth=1;
To query all the parent nodes of node J, SQL:
select * from NodeRelation where descendant=10;
If you want to insert a new node, such as adding a child node M after the L node, the inserted node has a node relationship besides M itself. That is, the nodes that are new to the inserted M nodes have descendant relationships. This is actually very simple, as long as there is a descendant relationship with the L node, and the M node will certainly have a descendant relationship, and the depth of the L node is X and the depth of the M node must be X + 1. Therefore, after inserting M nodes, find the nodes whose L nodes are descendants and have a descendant relationship with M nodes, and insert them to the data table.
Insert into tree3 (value) VALUES ('M'); -- INSERT the node insert into NodeRelation (ancestor, descendant, depth) select n. ancestor, last_insert_rowid (), n. depth + 1 -- Here depth + 1 serves as the depth of the M node from NodeRelation nwhere n. descendant = 12 Union ALLselect last_insert_rowid (), last_insert_rowid (), 0 -- add itself
In some cases, the depth field is not required even when depth is not required.
It is also easy to delete a node. For example, to delete node D, in addition to deleting node D in the tree3 table, you also need to delete the relationship in the NodeRelation table.
First, the relationships that take node D as the descendant are to be deleted, and those that take node D as the descendant are also deleted:
Delete from NodeRelation where descendant in (select descendant from NodeRelation where ancestor = 4); -- query the nodes with the D node as the ancestor, that is, the descendant of the D node.
Although this method is thorough, it also deletes the relationship between node D and its original subnodes.
If you only want to split the relationship between node D and node A, and retain the relationship between its original child nodes, you need to add restrictions.
Limit that the ancestor of the link to be deleted does not take D as the ancestor, that is, if the relation uses D as the ancestor, it does not need to be deleted. Therefore, add the preceding SQL statement as a condition.
Delete from NodeRelation where descendant in (select descendant from NodeRelation where ancestor = 4); -- query the nodes with the D node as the ancestor, that is, the descendant of the D node. And ancestor not in (select descendant from NodeRelation where ancestor = 4)
In the preceding SQL statement, the descendant of the D node is queried. If the ancestor of a link does not belong to the descendant of the D node, and the descendant of the Link belongs to the descendant of the D node, delete it.
In this case, the relationship between the self-built nodes of node D is retained. In the above example, the actually deleted node relationship is:
650) this. width = 650; "style =" border-bottom: 0px; border-left: 0px; border-top: 0px; border-right: 0px "title =" inherit "border =" 0 "alt =" maid "src =" http://www.bkjia.com/uploads/allimg/131228/222KV554-9.jpg "height =" 151 "/>
If you want to delete node H
650) this. width = 650; "style =" border-bottom: 0px; border-left: 0px; border-top: 0px; border-right: 0px "title =" inherit "border =" 0 "alt =" 8579eb3db87cda-5b5daaeaa9e182395 "src =" http://www.bkjia.com/uploads/allimg/131228/222KT933-10.jpg "height =" 201 "/>
Summary:
The preceding three methods have their own advantages and disadvantages. You can select an appropriate data model based on your actual needs.
---------------------------------
Reference: SQL Antipatterns
This article is from the "one blog" blog, please be sure to keep this source http://cnn237111.blog.51cto.com/2359144/1226911