Schema Design of Database Tables with Tree Structure



In programming, we often use tree structures to represent relationships among data, such as an enterprise's departments, column structures, and product categories. These trees usually have to be persisted in a database. However, relational databases store data as two-dimensional tables, so a tree cannot be stored in the DBMS directly. Designing an appropriate schema and the corresponding CRUD algorithms is the key to storing a tree structure in a relational database.

An ideal tree schema should have the following characteristics: low storage redundancy, an intuitive layout, simple and efficient retrieval and traversal, and efficient CRUD operations (create, read, update, delete) on nodes. I came across a very clever design on the internet by chance. The original is in English, and after reading it I found it interesting enough to write up. This article introduces two schema designs for tree structures: one is simple and intuitive, and the other is an improved scheme based on left-right value coding.

I. Basic Data

This article uses a food family tree as its running example, in which foods are organized by category, color, and variety, as follows:

II. Schema design driven by inheritance relation

The most intuitive way to analyze a tree is through the relationships between nodes: recording the parent of each node is enough to build a two-dimensional table. The table for this scheme is usually designed as {node_id, parent_id}, which can be described as shown in the following illustration:

The advantages of this scheme are obvious: it is natural to design and implement, intuitive, and convenient. The disadvantages are just as prominent: because only the direct relationship between nodes is recorded, CRUD operations on the tree tend to be inefficient. The main cause is frequent recursion: each step of the recursion accesses the database again, and every database I/O has a time cost. This approach is not useless, though. When the tree is relatively small, we can optimize with a cache, loading the tree into memory and processing it there to avoid the cost of direct database I/O.
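The cost of recursion can be made concrete with a minimal sketch. It uses Python's built-in sqlite3 module rather than SQL Server so that it runs anywhere. The sample rows, the `name` column, and the helper names `make_adjacency_db` and `descendants` are assumptions made for illustration, not part of the original design.

```python
import sqlite3

# Hypothetical sample data: (node_id, parent_id, name); the root has no parent.
ROWS = [
    (1, None, "Food"), (2, 1, "Fruit"), (3, 2, "Red"), (4, 3, "Cherry"),
    (5, 2, "Yellow"), (6, 5, "Banana"), (7, 1, "Meat"), (8, 7, "Beef"),
    (9, 7, "Pork"),
]

def make_adjacency_db():
    """Build an in-memory {node_id, parent_id} table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tree (node_id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
    )
    conn.executemany("INSERT INTO tree VALUES (?, ?, ?)", ROWS)
    return conn

def descendants(conn, node_id, counter):
    """Return descendant names in preorder, issuing one query per visited node."""
    counter[0] += 1  # count every round trip to the database
    rows = conn.execute(
        "SELECT node_id, name FROM tree WHERE parent_id = ? ORDER BY node_id",
        (node_id,),
    ).fetchall()
    names = []
    for child_id, name in rows:
        names.append(name)
        names.extend(descendants(conn, child_id, counter))
    return names

conn = make_adjacency_db()
queries = [0]
print(descendants(conn, 2, queries))  # ['Red', 'Cherry', 'Yellow', 'Banana']
print(queries[0])                     # 5 queries for a 4-node subtree
```

The number of queries grows with the size of the subtree, which is the overhead described above.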

III. Schema design based on left-right value coding

In most database-backed applications, queries are far more frequent than deletions and modifications. To avoid recursion when querying a tree, a novel left-right value coding scheme was designed on top of the preorder traversal of the tree. It needs no recursive queries and supports unlimited levels of nesting.

The first time you see this table structure, you will probably not know how the left value (Lft) and the right value (Rgt) are calculated, and the design does not seem to store the parent-child relationship at all. But if you trace the numbers in the table with your finger from 1 to 18, you should notice something: the order in which your finger moves is the order of a preorder traversal of the tree, as shown in the following figure. Start at the left side of the root node Food and write 1. Then follow the traversal, writing the next number each time you pass the left or right side of a node. Finally you return to the root node Food and write 18 on its right.
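The numbering procedure can be sketched in a few lines of Python. The nested tuple `TREE` and the function `number_tree` are hypothetical. The tree shape is an assumption: it is chosen to agree with the values mentioned in the text (Food spanning 1 to 18, Fruit spanning 2 to 11), and node names other than Food, Fruit, Red, and Beef do not come from this article.

```python
# Hypothetical in-memory tree: (name, [children]).
TREE = ("Food", [
    ("Fruit", [("Red", [("Cherry", [])]), ("Yellow", [("Banana", [])])]),
    ("Meat", [("Beef", []), ("Pork", [])]),
])

def number_tree(node, counter=1, out=None):
    """Assign (lft, rgt) by walking the tree depth-first.

    Returns the rows collected so far and the next unused number.
    """
    if out is None:
        out = []
    name, children = node
    row = [name, counter, None]   # left value is written on the way down
    counter += 1
    out.append(row)               # rows are collected in preorder
    for child in children:
        _, counter = number_tree(child, counter, out)
    row[2] = counter              # right value is written on the way back up
    return out, counter + 1

rows, _ = number_tree(TREE)
for name, lft, rgt in rows:
    print(name, lft, rgt)         # first line: Food 1 18
```

A leaf always gets consecutive numbers, and a node's interval encloses the intervals of all of its descendants.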

Based on this design, we can infer that every node whose left value is greater than 2 and whose right value is less than 11 is a descendant of Fruit. The whole tree structure is stored through the left and right values. This is not enough, however. Our goal is to perform CRUD operations on the tree, so we need to construct the matching algorithms.

IV. CRUD algorithms for the tree structure

(1) Getting descendants of a node

A single SQL statement returns the preorder list of a node's descendants. Taking Fruit as an example: SELECT * FROM Tree WHERE Lft BETWEEN 2 AND 11 ORDER BY Lft ASC. The results of the query are as follows:
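A minimal sketch for trying this query out, using Python's sqlite3 module instead of SQL Server. The lowercase table and column names, the helpers `make_tree` and `subtree`, and the sample rows are assumptions: the rows match the values quoted in the text (Food 1 to 18, Fruit 2 to 11), but the remaining node names are illustrative.

```python
import sqlite3

# Assumed sample rows: (name, lft, rgt).
ROWS = [
    ("Food", 1, 18), ("Fruit", 2, 11), ("Red", 3, 6), ("Cherry", 4, 5),
    ("Yellow", 7, 10), ("Banana", 8, 9), ("Meat", 12, 17),
    ("Beef", 13, 14), ("Pork", 15, 16),
]

def make_tree():
    """Build an in-memory left-right coded table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tree (node_id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)"
    )
    conn.executemany("INSERT INTO tree (name, lft, rgt) VALUES (?, ?, ?)", ROWS)
    return conn

def subtree(conn, name):
    """Names of the node and its descendants, in preorder."""
    lft, rgt = conn.execute(
        "SELECT lft, rgt FROM tree WHERE name = ?", (name,)
    ).fetchone()
    # BETWEEN is inclusive, so the node itself is the first row returned.
    rows = conn.execute(
        "SELECT name FROM tree WHERE lft BETWEEN ? AND ? ORDER BY lft ASC",
        (lft, rgt),
    ).fetchall()
    return [r[0] for r in rows]

conn = make_tree()
print(subtree(conn, "Fruit"))  # ['Fruit', 'Red', 'Cherry', 'Yellow', 'Banana']
```

Because BETWEEN is inclusive, the node itself appears in the result. Using strict comparisons (lft > ? AND rgt < ?) would return the descendants only.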

How many descendants does a node have? The left and right values of a node enclose all of its descendants, so the total number of descendants = (right value - left value - 1) / 2. For Fruit, the total is (11 - 2 - 1) / 2 = 4. To display the tree more intuitively, we also need to know the level of each node, which can be obtained with a query on the left and right values. Taking Fruit as an example: SELECT COUNT(*) FROM Tree WHERE Lft <= 2 AND Rgt >= 11. For convenience, we can create a view over Tree that adds a level column. Its value can be computed by a user-defined function, defined as follows:

CREATE FUNCTION dbo.CountLayer
(
    @node_id int
)
RETURNS int
AS
BEGIN
    DECLARE @result int
    SET @result = 0
    DECLARE @lft int
    DECLARE @rgt int
    IF EXISTS (SELECT node_id FROM Tree WHERE node_id = @node_id)
    BEGIN
        SELECT @lft = Lft, @rgt = Rgt FROM Tree WHERE node_id = @node_id
        -- The level is the number of nodes whose interval encloses this node,
        -- counting the node itself
        SELECT @result = COUNT(*) FROM Tree WHERE Lft <= @lft AND Rgt >= @rgt
    END
    RETURN @result
END
GO

Using this level function, we create a view that adds a column recording the level of each node:

CREATE VIEW dbo.TreeView
AS
SELECT node_id, Name, Lft, Rgt, dbo.CountLayer(node_id) AS Layer
FROM dbo.Tree
GO

Next, create a stored procedure that returns all descendants of a given node together with their levels:

CREATE PROCEDURE [dbo].[GetChildrenNodeList]
(
    @node_id int
)
AS
DECLARE @lft int
DECLARE @rgt int
IF EXISTS (SELECT node_id FROM Tree WHERE node_id = @node_id)
BEGIN
    SELECT @lft = Lft, @rgt = Rgt FROM Tree WHERE node_id = @node_id
    -- BETWEEN is inclusive, so the node itself is returned as the first row
    SELECT * FROM TreeView WHERE Lft BETWEEN @lft AND @rgt ORDER BY Lft ASC
END
GO

Now we use the stored procedure above to compute all descendants of the node Fruit and their levels. The query result is as follows:
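The same result can be approximated outside SQL Server with a minimal sqlite3 sketch, where a correlated subquery stands in for the level function and view. The lowercase names, the helpers `make_tree`, `subtree_with_levels`, and `descendant_count`, and the sample rows are assumptions for illustration. Only the Food and Fruit values are taken from the text.

```python
import sqlite3

# Assumed sample rows: (name, lft, rgt).
ROWS = [
    ("Food", 1, 18), ("Fruit", 2, 11), ("Red", 3, 6), ("Cherry", 4, 5),
    ("Yellow", 7, 10), ("Banana", 8, 9), ("Meat", 12, 17),
    ("Beef", 13, 14), ("Pork", 15, 16),
]

def make_tree():
    """Build an in-memory left-right coded table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tree (node_id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)"
    )
    conn.executemany("INSERT INTO tree (name, lft, rgt) VALUES (?, ?, ?)", ROWS)
    return conn

def descendant_count(lft, rgt):
    """Number of descendants enclosed by the interval (lft, rgt)."""
    return (rgt - lft - 1) // 2

def subtree_with_levels(conn, name):
    """(name, layer) for the node and its descendants, in preorder."""
    lft, rgt = conn.execute(
        "SELECT lft, rgt FROM tree WHERE name = ?", (name,)
    ).fetchone()
    return conn.execute(
        """
        SELECT n.name,
               (SELECT COUNT(*) FROM tree AS a
                 WHERE a.lft <= n.lft AND a.rgt >= n.rgt) AS layer
          FROM tree AS n
         WHERE n.lft BETWEEN ? AND ?
         ORDER BY n.lft ASC
        """,
        (lft, rgt),
    ).fetchall()

conn = make_tree()
print(descendant_count(2, 11))  # 4
for name, layer in subtree_with_levels(conn, "Fruit"):
    print("  " * (layer - 1) + name)  # indent each node by its level
```

With these sample rows, Fruit is at level 2 because its interval is enclosed by Food and by itself.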


From the implementation above, we can see that with left-right value coding, traversing a subtree takes only two database queries and no recursion. The query conditions are numeric comparisons, so the queries are very efficient, and as the tree grows, this design increasingly outperforms the traditional recursive scheme. Of course, we have only given a simple algorithm for obtaining a node's descendants. To really use such a tree, we also need to implement insertion, deletion, moving nodes within the same level, and other functions.

(2) Get the tree path of a node

Suppose we want the path from the root to a node. Analyzing the left and right values shows that a single SQL statement is enough. Taking Fruit as an example: SELECT * FROM Tree WHERE Lft < 2 AND Rgt > 11 ORDER BY Lft ASC. A relatively complete stored procedure:

CREATE PROCEDURE [dbo].[GetParentNodePath]
(
    @node_id int
)
AS
DECLARE @lft int
DECLARE @rgt int
IF EXISTS (SELECT node_id FROM Tree WHERE node_id = @node_id)
BEGIN
    SELECT @lft = Lft, @rgt = Rgt FROM Tree WHERE node_id = @node_id
    -- Ancestors are the nodes whose interval strictly encloses this node
    SELECT * FROM TreeView WHERE Lft < @lft AND Rgt > @rgt ORDER BY Lft ASC
END
GO
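The path query can be tried with a minimal sqlite3 sketch. The lowercase names, the helpers `make_tree` and `path_to`, and the sample rows are assumptions. Only the Food and Fruit values are quoted from the text.

```python
import sqlite3

# Assumed sample rows: (name, lft, rgt).
ROWS = [
    ("Food", 1, 18), ("Fruit", 2, 11), ("Red", 3, 6), ("Cherry", 4, 5),
    ("Yellow", 7, 10), ("Banana", 8, 9), ("Meat", 12, 17),
    ("Beef", 13, 14), ("Pork", 15, 16),
]

def make_tree():
    """Build an in-memory left-right coded table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tree (node_id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)"
    )
    conn.executemany("INSERT INTO tree (name, lft, rgt) VALUES (?, ?, ?)", ROWS)
    return conn

def path_to(conn, name):
    """Ancestor names from the root down to the parent of `name`."""
    lft, rgt = conn.execute(
        "SELECT lft, rgt FROM tree WHERE name = ?", (name,)
    ).fetchone()
    # Strict comparisons exclude the node itself from its own path.
    rows = conn.execute(
        "SELECT name FROM tree WHERE lft < ? AND rgt > ? ORDER BY lft ASC",
        (lft, rgt),
    ).fetchall()
    return [r[0] for r in rows]

conn = make_tree()
print(path_to(conn, "Cherry"))  # ['Food', 'Fruit', 'Red']
print(path_to(conn, "Fruit"))   # ['Food']
```

Ordering by the left value lists the ancestors from the root downward, because an enclosing interval always starts earlier than the intervals it contains.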

(3) Add a descendant node to a node

Suppose we want to add a new child node "Apple" under the node "Red". The tree then becomes the one in the following figure, where the red node is the new node.


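By carefully observing how the node values change in the figure, you should be able to infer how to write the SQL script: every left and right value greater than or equal to the parent's original right value increases by 2, and the new node takes the two numbers that are freed up. A relatively complete stored procedure for inserting a child node: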

CREATE PROCEDURE [dbo].[AddSubNode]
(
    @node_id int,
    -- Length 50 is an assumed value; a bare varchar parameter holds only 1 character
    @node_name varchar(50)
)
AS
DECLARE @rgt int
IF EXISTS (SELECT node_id FROM Tree WHERE node_id = @node_id)
BEGIN
    SET XACT_ABORT ON
    BEGIN TRANSACTION
        SELECT @rgt = Rgt FROM Tree WHERE node_id = @node_id
        -- Make room for two new numbers at the parent's old right value
        UPDATE Tree SET Rgt = Rgt + 2 WHERE Rgt >= @rgt
        UPDATE Tree SET Lft = Lft + 2 WHERE Lft >= @rgt
        -- The new node becomes the last child of the parent
        INSERT INTO Tree (Name, Lft, Rgt) VALUES (@node_name, @rgt, @rgt + 1)
    COMMIT TRANSACTION
    SET XACT_ABORT OFF
END
GO
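The effect of the insertion on the whole table can be checked with a minimal sqlite3 sketch that mirrors the three statements of the procedure. The lowercase names, the helpers `make_tree` and `add_child`, and the sample rows are assumptions. Only the Food and Fruit values are quoted from the text.

```python
import sqlite3

# Assumed sample rows: (name, lft, rgt).
ROWS = [
    ("Food", 1, 18), ("Fruit", 2, 11), ("Red", 3, 6), ("Cherry", 4, 5),
    ("Yellow", 7, 10), ("Banana", 8, 9), ("Meat", 12, 17),
    ("Beef", 13, 14), ("Pork", 15, 16),
]

def make_tree():
    """Build an in-memory left-right coded table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tree (node_id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)"
    )
    conn.executemany("INSERT INTO tree (name, lft, rgt) VALUES (?, ?, ?)", ROWS)
    return conn

def add_child(conn, parent, name):
    """Insert `name` as the last child of `parent`."""
    with conn:  # commits on success, rolls back if any statement fails
        (rgt,) = conn.execute(
            "SELECT rgt FROM tree WHERE name = ?", (parent,)
        ).fetchone()
        # Shift everything at or after the parent's right value by 2.
        conn.execute("UPDATE tree SET rgt = rgt + 2 WHERE rgt >= ?", (rgt,))
        conn.execute("UPDATE tree SET lft = lft + 2 WHERE lft >= ?", (rgt,))
        # The new node takes the two numbers that were freed up.
        conn.execute(
            "INSERT INTO tree (name, lft, rgt) VALUES (?, ?, ?)",
            (name, rgt, rgt + 1),
        )

conn = make_tree()
add_child(conn, "Red", "Apple")
for row in conn.execute("SELECT name, lft, rgt FROM tree ORDER BY lft"):
    print(row)  # ('Food', 1, 20) first; ('Apple', 6, 7) sits inside ('Red', 3, 8)
```

Note how one insertion rewrites most rows of the table: every node to the right of the insertion point, and every ancestor of it, is updated.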

(4) Delete a node

If we delete a node, all of its descendants are deleted as well. The number of deleted nodes is (right value of the deleted node - left value of the deleted node + 1) / 2. The left and right values of the remaining nodes must be adjusted if they are greater than the left and right values of the deleted node. To see what happens to the tree, take Beef as an example. The effect of the removal is shown in the following illustration.

Then we can construct the corresponding stored procedure:

CREATE PROCEDURE [dbo].[DelNode]
(
    @node_id int
)
AS
DECLARE @lft int
DECLARE @rgt int
IF EXISTS (SELECT node_id FROM Tree WHERE node_id = @node_id)
BEGIN
    SET XACT_ABORT ON
    BEGIN TRANSACTION
        SELECT @lft = Lft, @rgt = Rgt FROM Tree WHERE node_id = @node_id
        -- Remove the node together with all of its descendants
        DELETE FROM Tree WHERE Lft >= @lft AND Rgt <= @rgt
        -- Close the gap: shift later values down by the width of the removed interval
        UPDATE Tree SET Lft = Lft - (@rgt - @lft + 1) WHERE Lft > @lft
        UPDATE Tree SET Rgt = Rgt - (@rgt - @lft + 1) WHERE Rgt > @rgt
    COMMIT TRANSACTION
    SET XACT_ABORT OFF
END
GO
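A minimal sqlite3 sketch of the same deletion, useful for checking the renumbering by hand. The lowercase names, the helpers `make_tree` and `delete_subtree`, and the sample rows are assumptions. In particular, treating Beef as a leaf under a Meat node is an assumption of this sample data.

```python
import sqlite3

# Assumed sample rows: (name, lft, rgt).
ROWS = [
    ("Food", 1, 18), ("Fruit", 2, 11), ("Red", 3, 6), ("Cherry", 4, 5),
    ("Yellow", 7, 10), ("Banana", 8, 9), ("Meat", 12, 17),
    ("Beef", 13, 14), ("Pork", 15, 16),
]

def make_tree():
    """Build an in-memory left-right coded table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tree (node_id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)"
    )
    conn.executemany("INSERT INTO tree (name, lft, rgt) VALUES (?, ?, ?)", ROWS)
    return conn

def delete_subtree(conn, name):
    """Delete `name` and its descendants; return how many nodes were removed."""
    with conn:  # commits on success, rolls back if any statement fails
        lft, rgt = conn.execute(
            "SELECT lft, rgt FROM tree WHERE name = ?", (name,)
        ).fetchone()
        width = rgt - lft + 1  # how many numbers the removed interval used
        conn.execute("DELETE FROM tree WHERE lft >= ? AND rgt <= ?", (lft, rgt))
        conn.execute("UPDATE tree SET lft = lft - ? WHERE lft > ?", (width, lft))
        conn.execute("UPDATE tree SET rgt = rgt - ? WHERE rgt > ?", (width, rgt))
    return width // 2

conn = make_tree()
print(delete_subtree(conn, "Beef"))  # 1
for row in conn.execute("SELECT name, lft, rgt FROM tree ORDER BY lft"):
    print(row)  # ('Food', 1, 16) first; ('Pork', 13, 14) moves into Beef's old slot
```

The returned count is the formula from the text, (right value - left value + 1) / 2, applied to the deleted node.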

V. Summary

We can summarize the tree schema design that achieves unlimited nesting through left-right value coding:

(1) Advantages: it supports unlimited nesting while eliminating recursion, and the query conditions are integer comparisons, so queries are efficient.

(2) Disadvantages: adding, deleting, and modifying nodes is expensive, because each change touches many rows in the table.

Of course, this article gives only a few common CRUD algorithms. You can add others yourself, such as moving a node among its siblings, moving a node down a level, or moving it up a level. Interested readers can implement these on their own; they are not listed here. Note that these algorithms can be cumbersome to implement, because they involve several UPDATE statements whose execution order matters. If the order is not thought through carefully, a bug can do severe damage to the tree structure across the whole table. Therefore, when making large-scale modifications to the tree, you can use a temporary table as an intermediary to reduce the complexity of the code, and it is strongly recommended to make a full backup of the table before making changes. In most database-backed applications, this scheme is more suitable than the traditional schema built on parent-child relationships.
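Since a mistake in the update order can silently corrupt the numbering, a cheap consistency check run after bulk changes can complement the backup. The following is a minimal sketch, not part of the original design. The function `is_valid_nested_set` is hypothetical, and it assumes the table holds a single tree whose values run from 1 to twice the number of nodes.

```python
def is_valid_nested_set(pairs):
    """Return True if the (lft, rgt) pairs form one well-nested tree.

    Assumes a single root whose interval spans 1 .. 2 * number_of_nodes.
    """
    rows = sorted((lft, rgt) for lft, rgt in pairs)
    n = len(rows)
    values = sorted(v for pair in rows for v in pair)
    if values != list(range(1, 2 * n + 1)):
        return False                 # gaps, duplicates, or wrong range
    if n and rows[0] != (1, 2 * n):
        return False                 # no single root enclosing everything
    open_rights = []                 # right values of currently open ancestors
    for lft, rgt in rows:
        if lft >= rgt:
            return False
        while open_rights and open_rights[-1] < lft:
            open_rights.pop()        # that ancestor closed before this node opened
        if open_rights and rgt > open_rights[-1]:
            return False             # interval overlaps its enclosing interval
        open_rights.append(rgt)
    return True

print(is_valid_nested_set([(1, 6), (2, 3), (4, 5)]))  # True
print(is_valid_nested_set([(1, 6), (2, 4), (3, 5)]))  # False: intervals overlap
```

The pairs could be read with a query such as SELECT Lft, Rgt FROM Tree and passed to the function after each batch of modifications.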

References: "Storing Hierarchical Data in a Database" (article)
