SQL unlimited hierarchical structure

Last Update:2018-12-05 Source: Internet

Author: User

Tags pear

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Recently, for project reasons, we need to designDatabaseThe database mainly describes the structure and file information of the file system and finds a lot of information on the Internet. However, many designs either aim at a large number of queries, you can either modify or insert a large amount of data. For file systems, query, insert, and modify operations are quite frequent. Therefore, database design is extremely important, I want to talk about some of my ideas,I hope you can give me some comments.!!

At the beginning, in order to pull out the prototype system as soon as possible, I used name (local node), parent (parent node), and ancestor (ancestor node) to describe its structure, then the recursive algorithm is used to operate the database. In this way, the function can be basically completed. However, the database cannot afford this burden when the folder depth is too high, so we are looking for a solution.

FirstMethodTo improve the SQL statement logic: This method should be optimized for the existing database design and program design.WorkThe quantity is the smallest, but I have not tested whether it is feasible or not. First, write it out and sort out the ideas:

Through the analysis of system functions and database structures, function operations are nothing more than copying, moving, deleting, and creating files. The main problem of these functions is that when performing similar operations on folders, this is also the reason for recursion. Then, observe the data in the database and find that the child node of the query node only needs to improve the SQL statement:

$ Rs = mysql_query (select ancestor from info where name = $ id); (query the ancestor node of the node to be operated );

$ Ancestor. = $ Rs [0] ['ancestor']. ":". $ ID;

$ RS1 = mysql_query (Select name from info where ancestor like $ ancestor % );

Well, this operation seems to be able to get the child node without traversing. The specific effect will be posted after I test it.

The second method is designed for databases with more queries and fewer modifications. I personally think it is not suitable for the current project. However, the ideas and methods are better and I will post them here. (Reprinted)

In terms of product classification, multi-level tree structure forums, mail lists, and many other places, we all encounter the following problem: how to store multi-level structured data? InPHPIn applications, relational databases are usually used to store background data. They can store a large amount of data and provide efficient data retrieval and Update Services. However, the basic form of relational data is an interactive table, which is a flat structure. If you want to store a multi-level tree structure in a relational database, you need to perform reasonable translation work. Next, I will discuss what I have seen and some practical experiences with you:

There are basically two common design methods for storing hierarchical data in a flat database:

1. adjacency list Model)

2. Modified Preorder tree traversal algorithm)

I am not a computer professional and have never learned anything about data structures. Therefore, these two names are literally translated by myself. If they are wrong, please give me more advice.
These two things seem very scary and easy to understand. Here I use a simple food directory as our sample data.

Our data structure is like this.

Below isCode:


Food
|
|---Fruit
|    |
|    |---Red
|    |    |
|    |    |--Cherry
|    |
|    |---Yellow
|          |
|          |--Banana
|
|---Meat
     |
     |--Beef
     |
     |--Pork

To take care of PHP fans who are confused about English
Food: Food
Fruit: Fruit
RED: red
Cherry: Cherry
Yellow: Yellow
Banana: bananas
Meat: Meat
Beef: Beef
Pork: Pork

Adjacency list model)
This mode is often used.TutorialAnd the book also introduced. We add a property parent to each node to represent the parent node of the node and describe the entire tree structure through a flat table. According to this principle, the data in the example can be converted into the following table:

The following code is used:


+-----------------------+
|   parent |    name    |
+-----------------------+
|          |    Food    |
|   Food   |   Fruit    |
|   Fruit  |    Green   |
|   Green  |    Pear    |
|   Fruit  |    Red     |
|   Red    |    Cherry  |
|   Fruit  |    Yellow  |
|   Yellow |    Banana  |
|   Food   |    Meat    |
|   Meat   |    Beef    |
|   Meat   |    Pork    |
+-----------------------+

We can see that Pear is a child node of Green and Green is a child node of Fruit. The root node 'food' has no parent node. To briefly describe this problem, the name is used in this example to represent a record. In the actual database, you need to use the numerical id to mark each node. The table structure of the database should be like this: id, parent_id, name, descr partition ption.
With such a table, we can store the entire multi-level tree structure through the database.

Show multi-level tree
If we need to display such a multi-level structure, we need a recursive function.

The following code is used:


<? Php
// $ Parent is the parent of the children we want to see
// $ Level is increased when we go deeper into the tree,
// Used to display a nice indented tree

Function display_children ($ parent, $ level)
{
// Obtain all child nodes of a parent node $ parent
$ Result = mysql_query ('select name from Tree '.
'Where parent = "'. $ parent .'";');

// Display each subnode
While ($ ROW = mysql_fetch_array ($ result ))
{
// Display the node name in indentation
Echo str_repeat ('', $ level). $ row ['name']." n ";

// Call this function again to display the subnode of the subnode

Display_children ($ row ['name'], $ level + 1 );
}
}
?>;

Use this function for the root node (Food) of the entire structure to print the entire multi-level tree structure. Because Food is the root node, its parent node is empty, so we call it like this: display_children ('', 0 ). The contents of the entire tree are displayed:


Food 
Fruit 
 Red 
  Cherry 
 Yellow 
  Banana 
Meat 
 Beef 
 Pork

If you only want to display a part of the entire structure, such as the Fruit part, you can call display_children ('fruit', 0 );

Using the same method, we can know the path from the root node to any node. For example, the path of Cherry is "Food>; Fruit>; Red ". To get such a path, we need to start from the deepest level "Cherry" and query it to get its parent node "Red" and add it to the path, then we can query the Red parent node and add it to the path, and so on until "Food" at the highest level"

The following code is used:


<? PHP
// $ Node is the deepest Node
Function get_path ($ node)
{
// Query the parent node of this node
$ Result = mysql_query ('select parent from Tree '.
'Where name = "'. $ node .'";');
$ ROW = mysql_fetch_array ($ result );

// Save the path with an Array
$ Path = array ();

// If it is not the root node, continue to query up
// (The root node does not have a parent node)
If ($ row ['parent']! = '')
{
// The last part of the path to $ node, is the name
// Of the parent of $ node
$ Path [] = $ row ['parent'];

// We shoshould add the path to the parent of this node
// To the path
$ Path = array_merge (get_path ($ row ['parent']), $ path );
}

// Return the path
Return $ path;
}
?>;

If you use this function for "Cherry": print_r (get_path ('cherry'), you will get an array like this:


Array 
( 
  [0] =>; Food 
  [1] =>; Fruit 
  [2] =>; Red 
)

How to print it into the desired format is yours.

Disadvantages:
This method is simple, easy to understand, and easy to use. But there are also some shortcomings. It is mainly because the running speed is very slow. Because each node needs to query the database, when the data volume is large, many queries are required to complete a tree. In addition, due to recursive operations, each level of recursion consumes some memory, so the space utilization efficiency is relatively low.

Now let's take a look at another method that does not use Recursive Computing and is faster. This is the pre-sorted traversal tree (modified preorder tree traversal algorithm)
This method is rarely used and is not easy to understand for the first time. However, because this method does not use recursive query algorithms, it has higher query efficiency.

First, we will draw the multilevel data on the paper in the following way, write 1 on the left side of the root node Food, and then write 2 down the tree to the left side of the Fruit, and then proceed, along the edge of the entire tree, each node is marked with numbers on the left and right. The last digit is 18 on the right of the Food. In the figure below, you can see the multilevel structure of the entire labeled number. (Not understood? Point your finger at the number from 1 to 18 to see what's going on. I still don't understand. I'll try again. Pay attention to moving your finger ).
These numbers indicate the relationship between each node. "Red" numbers are 3 and 6, and it is the child node of "Food" 1-18. Similarly, we can see that all nodes with a left value greater than 2 and a right value less than 11 are child nodes of "Fruit" 2-11.

The following code is used:


                                1 Food 18
                                    |
                +-------------------------------------------+
                |                                                       |
            2 Fruit 11                                    12 Meat 17
                |                                                       |
    +------------------------+                   +-----------------------+
    |                                |                   |                              |
 3 Red 6               7 Yellow 10         13 Beef 14            15 Pork 16
    |                                |
4 Cherry 5               8 Banana 9

In this way, the entire tree structure can be stored in the database through left and right values. Before proceeding, let's take a look at the data tables that have been organized below.

The following code is used:


+-----------------------+-----+-----+
|   parent |    name  | lft   | rgt |
+-----------------------+-----+-----+
|              |    Food    | 1   | 18  |
|   Food   |   Fruit      | 2   | 11  |
|   Fruit    |    Red      | 3   |  6  |
|   Red     |    Cherry  | 4   |  5  |
|   Fruit    |    Yellow  | 7   | 10  |
|   Yellow |    Banana | 8   |  9  |
|   Food   |    Meat     | 12  | 17  |
|   Meat   |    Beef     | 13  | 14  |
|   Meat   |    Pork     | 15  | 16  |
+-----------------------+-----+-----+

Note: Because "Left" and "right" have special meanings in SQL, we need to use "LFT" and "rgt" to represent left and right fields. In addition, the "parent" field is no longer required to represent the tree structure. That is to say, the following table structure is enough.

The following code is used:


+------------+-----+-----+
|    name    | lft | rgt |
+------------+-----+-----+
|    Food    | 1   | 18  |
|   Fruit    | 2   | 11  |
|    Red     | 3   |  6  |
|    Cherry  | 4   |  5  |
|    Yellow  | 7   | 10  |
|    Banana  | 8   |  9  |
|    Meat    | 12  | 17  |
|    Beef    | 13  | 14  |
|    Pork    | 15  | 16  |
+------------+-----+-----+

Now we can get the data from the database. For example, we need to get all the nodes under "Fruit" to write the query statement as follows:
Select * from tree where LFT between 2 and 11;
The query returns the following results.

The following code is used:


+------------+-----+-----+
|    name    | lft | rgt |
+------------+-----+-----+
|   Fruit    | 2   | 11  |
|    Red     | 3   |  6  |
|    Cherry  | 4   |  5  |
|    Yellow  | 7   | 10  |
|    Banana  | 8   |  9  |
+------------+-----+-----+

As you can see, all these nodes can be obtained through a single query. In order to display the entire tree structure as the recursive function above, we also need to sort such queries. Sort by the Left value of the node:

Select * from tree where LFT between 2 and 11 order by lft asc;

The remaining questions show how to indent the hierarchy.

The following code is used:


<? PHP
Function display_tree ($ root)
{
// Obtain the left and right values of the root node.
$ Result = mysql_query ('select lft, rgt FROM Tree'. 'Where name = "'. $ root .'";');
$ Row = mysql_fetch_array ($ result );

// Prepare an empty right value Stack
$ Right = array ();

// Obtain all child nodes of the Root Node
$ Result = mysql_query ('select name, lft, rgt FROM Tree '.
'Where lft between'. $ row ['lft ']. 'and '.
$ Row ['rgt ']. 'order by lft asc ;');

// Display each row
While ($ ROW = mysql_fetch_array ($ result ))
{
// Only check stack if there is one
If (count ($ right)>; 0)
{
// Check whether the node should be removed from the stack
While ($ right [count ($ right)-1] <$ row ['rgt '])
{
Array_pop ($ right );
}
}

// Indent the node name
Echo str_repeat ('', count ($ right). $ row ['name']." N ";

// Add this node to the stack
$ Right [] = $ row ['rgt '];
}
}
?>;

If you run the above functions, the results will be the same as those of recursive functions. But our new function may be faster, because there are only two database queries.

It is easier to know the path of a node. If we want to know the path of cherry, we will use its left and right values 4 and 5 for a query.

SELECT name FROM tree WHERE lft <4 AND rgt>; 5 order by lft ASC;

The following result is displayed:

The following code is used:


+------------+
|    name    |
+------------+
|   Food     |
|   Fruit    |
|    Red     |
+------------+

How many child nodes does a node have? Very simple, descendant Total = (right-left-1)/2
Descendants = (right-left-1)/2
Don't believe it? Calculate it by yourself.
Using this simple formula, we can quickly calculate that the "Fruit 2-11" node has four child nodes, while the "Banana 8-9" node has no child nodes, that is to say, it is not a parent node.

Amazing, right? Although I have used this method many times, it is amazing every time I do this.

This is indeed a good method, but is there any way to help us establish such a data table with left and right values? Here we will introduce a function that can automatically convert a table with the name and parent structure into a data table with the left and right values.

The following code is used:


<?php
function rebuild_tree($parent, $left) {
   // the right value of this node is the left value + 1
   $right = $left+1;

   // get all children of this node
   $result = mysql_query('SELECT name FROM tree '.
                          'WHERE parent="'.$parent.'";');
   while ($row = mysql_fetch_array($result)) {
       // recursive execution of this function for each
       // child of this node
       // $right is the current right value, which is
       // incremented by the rebuild_tree function
       $right = rebuild_tree($row['name'], $right);
   }

   // we've got the left value, and now that we've processed
   // the children of this node we also know the right value
   mysql_query('UPDATE tree SET lft='.$left.', rgt='.
                $right.' WHERE name="'.$parent.'";');

   // return the right value of this node + 1
   return $right+1;
}
?>;

Of course, this function is a recursive function. We need to run this function from the root node to reconstruct a tree with left and right values.

Rebuild_tree ('food', 1 );

This function looks a little complicated, but its function is the same as the number of a table manually. It is to convert a three-dimensional multi-layer structure into a data table with left and right values.

How can we add, update, and delete a node for such a structure?

There are two methods to add a node:

First, retain the original name and parent structure, and add data to the data in the old method. After each added piece of data, use the rebuild_tree function to re-number the entire structure.

Second, the more efficient way is to change all the values on the right of the new node. For example, we want to add a new fruit "Strawberry" (Strawberry) which will become the last child node of the "Red" node. First, we need to make some space for it. The right value of "Red" should be changed from 6 to 8, and the left and right values of "Yellow 7-10" should be changed to 9-12. And so on, we can know that if we want to free up space for the new value, we need to add 2 to all nodes with the left and right values greater than 5 (5 is the right value of the last "Red" subnode. So we perform database operations like this:

UPDATE tree SET rgt = rgt + 2 WHERE rgt>; 5;
UPDATE tree SET lft = lft + 2 WHERE lft>; 5;

In this way, we can create a new data node in the new inserted value. Its left and right values are 6 and 7 respectively.

Insert into tree SET lft = 6, rgt = 7, name = 'strawberry ';

Make another query! How is it? Soon.

Now, you can use two different methods to design your multi-level database structure. It depends on your personal judgment, however, I prefer the second method for a structure with a large number of layers. If you want to query a small amount of data that requires frequent addition and updates, the first method is simpler.

In addition, if the database supports rebuild_tree () and space-free operations, you can also write the database trigger function, which is automatically executed during insertion and update, in this way, the running efficiency is better, and the SQL statement you add to the new node becomes simpler.

The third method, without in-depth research, is mainly to see the idea, selfLearning(Reprinted)

I agree to adopt the encoding marking method for marking any level of data.
For example, each level of data is a four-digit value with a full value of 9999. VChar is used for number access.
"Num." is used as the prefix, followed by level-1, level-2, and level-3 encoding... For example, if a piece of information is on level 13, its number is "num.000100010013"
The recursive column tree is based on the condition restrictions during the search, for example, using "Num. xxxx % "is used as a keyword to retrieve all data at the first level. To place a class in another parent class, you only need to retrieve" Num. xxxxxxxx... "Replace the previous" Num. xxxx... "Replace it with the ID of the parent class.
Note the following two points:
First, this data table must have a Max value table, that is, a Max value should be used to record the maximum value already stored in each class;
Second, when deleting data, it cannot be deleted. Instead, it is marked with a delete tag, which is updated when a new record is inserted, effectively saving resources.

This tag seems to use a large number of tags. In fact, it is a string operation and does not consume too much CPU resources. After the index is set, the conditional search of the right wildcard can be customized on the left end to achieve high efficiency.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More