Chapter One: tree structure data in an RDB

Source: Internet
Author: User

Chapter One: tree structure data in an RDB

In this chapter, I will write a basic knowledge to understand the problem

The role of a model

The function of RBD processing tree model is summarized as two points:

1 saving data for a tree in an RDB table

2 related nodes of the query node for efficiency

1 saving data for a tree in an RDB table

We can define the criteria for whether the model has the ability to store hierarchical data

The original hierarchy can be reproduced by all the nodes that are saved

If the original tree structure cannot be reproduced through the saved data, it cannot be said that the model implements the tree.

2 related nodes of the query node for efficiency

In general, we save data to a database for search, data is stored in hierarchical data, may query any node of the relational data (such as sub-tree, parent node, child nodes, ancestor nodes, descendant nodes, etc.). This is an important function of a tree model.

The following [A], [B], [C] and so on represent the node.

                [A]               |           +---+---+           |       |          [B]     [C]           |       |         +-+-+   +-+-+         |   |   |   |        [D] [E] [F] [G]                     |                   +-+-+                   |   |                  [H] [I]
When we query the subtree of [C], we get all the nodes by tracking the branches that are stretched down.
           [C]           |         +-+-+         |   |        [F] [G]             |           +-+-+           |   |          [H] [I]
Tracking down branches is very regular, so many people think that if you keep the hierarchical data in the database, we will easily find the relevant nodes of any node.
The tree structure data will be searched frequently as a collection of data.

node

relationship

node set

[b] parent node [a]
[b" child nodes [d] [E]
[h] ancestor node [a] [C] [G]
[c] descendant node [f] [G] [H] [I]
[c] partial wood [c] [F] [G] [H] [I]
The term effective is very important here, otherwise there is no need to design queries. The first thing to do when querying is to get all the nodes, and then filter by query criteria. However, in a table of millions data, it takes a lot of execution time and memory to get all the nodes, and it is unrealistic to get all the nodes each time. If the table has a large number of millions associated nodes, the design of this system will be very difficult to reach the target at speed. Two examples

Examples show how to use a query to find related nodes in hierarchical data.

Find ancestor nodes using the "breadcrumbs list" on the web.
1 instances of ancestor node lookups
        [Top Page]            | --[products]            |     | --[smart Phone Apps]            |     +--[desktop Apps]            |--[news            |     | --[2015/10]            |     | --[2015/09]            |     | --[2015/08]            |     +--[2015/07]            +--[about US]                  |--[corporate philosophy]                  |--[access]                  +--[contact Us]
When we query [Corporate philosophy] nodes, the nodes that are stretched up by the trace up to their top are as follows.
Top Page > About Us > Corporate philosophy
All excessive pages are considered to be tree structures, and the home page is the root node. The breadcrumbs Listcan be reproduced when the ancestor node of the Corporate philosophy is able to be found.
2 Examples of descendant node lookups

The Forum has the ability to reply to comments as a tree, and when all annotations are treated as nodes, the stored data has a hierarchy. When querying the comments in the forum, we will use the Find subtree node.

        [2015/03/31]            | --[let ' s Party tonight! (John)]            |     | --[i'll join! (Mike)]            |     |     +--[thanks! (John)]            |     | --[can not join tonight. (Anne)]            |     |     +--[when OK? (John)]            |     +--[ok, but the 2 hours only. (Tom)]            |           | --[me too. (Eucen)]            |           +--[i see! (John)]            +--[i losted my smart phone. (Bill)]                  | --[when? (Diana)]                  |     +--[yesterday. (Bill)]                  +--[did you check GPS? (Fred)]                        +--[yet (Bill)]
As above, on March 31, 2015 There are two threads, when only show Let's party tonight! (John) You will need to query all of John's subtrees for comment. In this case, the method of finding all nodes cannot be used because of the pure efficiency of the forum's massive data nodes.
Problem points of three-relationship model

Again, the role of the RBD processing tree model is reiterated.

1 saving data for a tree in an RDB table

2 related nodes of the query node for efficiency

It's hard to write an efficient query for a few reasons.

1 to search, first save all ancestor nodes of each node

The relative depth of 2-node relational data increases exponentially

3 levels of depth do not depend on the content of the data

4 It is difficult to assign indexes to hierarchical data in a database. Because the tree structure has two-dimensional diffusion, but the exponent is one-dimensional structure

5, wait.

If you find a group of descendants from a specified node, you can determine by tracing the branch direction, which is a simple matter. However, SQL is descriptive programming (declarative programming), so it is not possible to determine a relationship for each node, such as using imperative programming in a separate process.

In declarative programming, how can we express the relationship between a search query subtree and an ancestor node?

Database engineers around the world are challenging this issue, and the key to the problem with hierarchical data in an RDB is "how to efficiently query any node-related nodes".

Append of four columns

To address the storage hierarchy tree data, engineers have developed four models to solve this problem.

adjacency table Models (adjacency list model)

Route enumeration model (path enumeration models)

Nested collection models (nested set model)

Close table model (closure table models)

In either model, it is all about adding columns to the RDB to hold the hierarchical data. The columns you add can be divided into two categories depending on their role.

1 Hierarchy Columns

2 Efficiency Search Columns

1 Hierarchy Columns

You need to save the hierarchical data in the RDB table. Also used in queries as a column for a search condition.

2 Efficiency Search Columns

Chapter One: tree structure data in an RDB

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.