Optimal Binary Search Tree

Source: Internet
Author: User

1. Problem description: Let S = {x_1, x_2, ..., x_n} be an ordered set with x_1 < x_2 < ... < x_n. A binary search tree that represents S stores the elements of S in the vertices of a binary tree and has the following property: the element x stored in each vertex is greater than every element stored in its left subtree and less than every element stored in its right subtree. The leaf vertices of the tree are open intervals of the form (x_i, x_{i+1}).

Searching for an element x in the binary search tree that represents S has two possible outcomes: (1) x = x_i is found at an internal vertex of the tree; (2) x is determined to lie in an interval (x_i, x_{i+1}) at a leaf vertex. Let b_i be the probability of finding x = x_i in case (1), and let a_i be the probability that x lies in (x_i, x_{i+1}) in case (2), with the conventions x_0 = -∞ and x_{n+1} = +∞. The tuple (a_0, b_1, a_1, ..., b_n, a_n) is called the access probability distribution of S; its entries are non-negative and sum to 1.

Optimal binary search tree: in a binary search tree T that represents S, let c_i be the depth of the vertex storing element x_i, and let d_j be the depth of the leaf vertex (x_j, x_{j+1}). The root has depth 0.

Note: each comparison made during a search descends one level. A successful search needs as many comparisons as the level of the matching vertex plus 1: an internal vertex at level 0 needs one comparison, at level 1 two comparisons, and at level 2 three comparisons. An unsuccessful search ends at the external (leaf) vertex whose interval contains the key, and the number of comparisons equals the level of that leaf. Therefore

$$ p = \sum_{i=1}^{n} b_i (1 + c_i) + \sum_{j=0}^{n} a_j d_j $$

is the average number of comparisons required for a search in the binary search tree T. p is also called the average path length of T. In general, different binary search trees for the same set have different average path lengths.
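The average number of comparisons can be evaluated directly once the depth of every vertex is known: each stored element contributes its probability times (1 + depth), and each leaf interval contributes its probability times its depth. The following is a minimal sketch of that calculation; the function name `averagePathLength` and the zero-based vectors are assumptions made for illustration and are not part of the original text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// b[i], c[i]: access probability and depth of the vertex storing element i+1.
// a[j], d[j]: probability and depth of the leaf for the j-th gap (j = 0..n).
// The root has depth 0. Zero-based storage is an assumption of this sketch.
double averagePathLength(const std::vector<double>& b, const std::vector<int>& c,
                         const std::vector<double>& a, const std::vector<int>& d) {
    assert(b.size() == c.size() && a.size() == d.size());
    assert(a.size() == b.size() + 1);  // n elements leave n + 1 gaps
    double p = 0.0;
    // Successful searches: reaching depth c[i] costs 1 + c[i] comparisons.
    for (std::size_t i = 0; i < b.size(); ++i) p += b[i] * (1 + c[i]);
    // Unsuccessful searches: a leaf at depth d[j] costs d[j] comparisons.
    for (std::size_t j = 0; j < a.size(); ++j) p += a[j] * d[j];
    return p;
}
```

For a balanced tree on three elements (middle element at the root) the depths are c = {1, 0, 1} and d = {2, 2, 2, 2}; any probability vectors of matching length can be passed in.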
The optimal binary search tree problem: given an ordered set S and its access probability distribution (a_0, b_1, a_1, ..., b_n, a_n), find, among all binary search trees that represent S, one with the minimum average path length.

The example below uses a second notation. The keys are a_1 < a_2 < ... < a_n, p_i is the probability of searching for key a_i, and q_i is the probability of searching for an identifier X with a_i < X < a_{i+1}, 0 <= i <= n (taking a_0 = -∞ and a_{n+1} = +∞). In other words, p_i and q_i play the roles of b_i and a_i above.

A set of n keys has n! different orderings, but the number of distinct binary search trees that can be built from it is C(2n, n) / (n + 1), the number of distinct binary trees with n nodes (the Catalan number). These trees can be compared by their search efficiency.

For example, the identifier set (a_1, a_2, a_3) = (do, if, stop) has five possible binary search trees:

(a) root do, right child if, whose right child is stop
(b) root do, right child stop, whose left child is if
(c) root if, left child do, right child stop
(d) root stop, left child do, whose right child is if
(e) root stop, left child if, whose left child is do

With p_1 = 0.5, p_2 = 0.1, p_3 = 0.05, q_0 = 0.15, q_1 = 0.1, q_2 = 0.05, q_3 = 0.05, the average number of comparisons (cost) of each tree is:

Pa(n) = 1 × p_1 + 2 × p_2 + 3 × p_3 + 1 × q_0 + 2 × q_1 + 3 × (q_2 + q_3) = 1 × 0.5 + 2 × 0.1 + 3 × 0.05 + 1 × 0.15 + 2 × 0.1 + 3 × (0.05 + 0.05) = 1.5

Pb(n) = 1 × p_1 + 2 × p_3 + 3 × p_2 + 1 × q_0 + 2 × q_3 + 3 × (q_1 + q_2) = 1 × 0.5 + 2 × 0.05 + 3 × 0.1 + 1 × 0.15 + 2 × 0.05 + 3 × (0.1 + 0.05) = 1.6

Pc(n) = 1 × p_2 + 2 × (p_1 + p_3) + 2 × (q_0 + q_1 + q_2 + q_3) = 1 × 0.1 + 2 × (0.5 + 0.05) + 2 × (0.15 + 0.1 + 0.05 + 0.05) = 1.9

Pd(n) = 1 × p_3 + 2 × p_1 + 3 × p_2 + 1 × q_3 + 2 × q_0 + 3 × (q_1 + q_2) = 1 × 0.05 + 2 × 0.5 + 3 × 0.1 + 1 × 0.05 + 2 × 0.15 + 3 × (0.1 + 0.05) = 2.15

Pe(n) = 1 × p_3 + 2 × p_2 + 3 × p_1 + 1 × q_3 + 2 × q_2 + 3 × (q_0 + q_1) = 1 × 0.05 + 2 × 0.1 + 3 × 0.5 + 1 × 0.05 + 2 × 0.05 + 3 × (0.15 + 0.1) = 2.65

Therefore, in this example the minimum average path length is Pa(n) = 1.5. The deeper a node lies in the binary search tree, the more comparisons are needed to reach it. To build a binary search tree of minimum cost, the nodes with higher search probability should generally be placed closer to the root.
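The worked example can be cross-checked by brute force. The sketch below counts the binary search trees on n keys and finds the cheapest one by trying every key as the root at every level. The function names `countTrees` and `minCost` are hypothetical, and the exhaustive search takes exponential time, so it is only meant as a check on small inputs, not as the algorithm developed in the following sections.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Number of distinct binary search trees on n keys (the Catalan number),
// built up from the choice of root: C(n) = sum over k of C(k-1) * C(n-k).
long long countTrees(int n) {
    assert(n >= 0);
    std::vector<long long> c(n + 1, 0);
    c[0] = 1;
    for (int size = 1; size <= n; ++size)
        for (int k = 1; k <= size; ++k) c[size] += c[k - 1] * c[size - k];
    return c[n];
}

// Cheapest tree on keys i..j whose root sits at the given depth.
// p[1..n]: key probabilities (p[0] unused); q[0..n]: gap probabilities.
double minCost(const std::vector<double>& p, const std::vector<double>& q,
               int i, int j, int depth) {
    if (i > j) return q[i - 1] * depth;  // empty range: the leaf for gap q[i-1]
    double best = std::numeric_limits<double>::infinity();
    for (int k = i; k <= j; ++k) {
        // Key k at this depth costs depth + 1 comparisons when found.
        double cost = p[k] * (depth + 1)
                    + minCost(p, q, i, k - 1, depth + 1)
                    + minCost(p, q, k + 1, j, depth + 1);
        best = std::min(best, cost);
    }
    return best;
}
```

With the probabilities of the example, `countTrees(3)` gives 5 and `minCost(p, q, 1, 3, 0)` gives 1.5 up to floating-point rounding, in agreement with Pa(n).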
2. Optimal substructure: If k is selected as the root, then the keys 1, 2, ..., k-1 and the leaves a_0, a_1, ..., a_{k-1} all lie in the left subtree L, and the remaining nodes (the keys k+1, ..., n and the leaves a_k, a_{k+1}, ..., a_n) lie in the right subtree R. Let COST(L) and COST(R) be the costs of the left and right subtrees of the binary search tree T. The cost of T is then P(k) + COST(L) + COST(R) + ... If T is optimal, COST(L) and COST(R) in this expression must both be minimal.

Proof: a subtree of the binary search tree T that contains the vertices x_i, ..., x_j and the leaf vertices (x_{i-1}, x_i), ..., (x_j, x_{j+1}) can be regarded as a binary search tree of the ordered set {x_i, ..., x_j} with respect to the universe (x_{i-1}, x_{j+1}). (T itself can be regarded as a binary search tree of the ordered set {x_1, ..., x_n} with respect to the universe (x_0, x_{n+1}).) By the access probability distribution of S, the probability that a search ends at a vertex of this subtree is

$$ w_{ij} = a_{i-1} + b_i + a_i + \cdots + b_j + a_j . $$

The access probability distribution of {x_i, ..., x_j} is therefore (\bar{a}_{i-1}, \bar{b}_i, ..., \bar{b}_j, \bar{a}_j), where \bar{a}_h and \bar{b}_k are the conditional probabilities

$$ \bar{b}_k = b_k / w_{ij} \quad (i \le k \le j), \qquad \bar{a}_h = a_h / w_{ij} \quad (i-1 \le h \le j). $$

Let T_{ij} be an optimal binary search tree for the ordered set {x_i, ..., x_j} under this distribution, let p_{ij} be its average path length, let x_m be the element stored in its root, and let p_l and p_r be the average path lengths of its left subtree T_l and right subtree T_r. Because the depth of a vertex in T_l or T_r is its depth in T_{ij} minus 1, we get

$$ w_{ij}\, p_{ij} = w_{ij} + w_{i,m-1}\, p_l + w_{m+1,j}\, p_r . $$

Because T_l is a binary search tree for the set {x_i, ..., x_{m-1}}, we have p_l >= p_{i,m-1}. If p_l > p_{i,m-1}, replacing T_l with T_{i,m-1} would produce a binary search tree with a shorter average path length than T_{ij}, which contradicts the optimality of T_{ij}. Therefore T_l is an optimal binary search tree. Likewise, T_r is an optimal binary search tree. Hence the optimal binary search tree problem has the optimal substructure property.
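The relation between the average path length of a tree and those of its two subtrees can be derived in a few steps. The derivation below is a sketch under the definitions used in this section: w_{ij} = a_{i-1} + b_i + ... + b_j + a_j is the total probability of the subtree, x_m is its root, and the primed depths c'_k and d'_h (a notation introduced here for illustration) are depths measured inside the left or right subtree, so that each is one less than the corresponding depth in T_{ij}.

```latex
\begin{aligned}
w_{ij}\,p_{ij}
  &= \sum_{k=i}^{j} b_k (1 + c_k) + \sum_{h=i-1}^{j} a_h d_h \\
  &= b_m
     + \Big[\sum_{k=i}^{m-1} b_k (2 + c'_k) + \sum_{h=i-1}^{m-1} a_h (1 + d'_h)\Big]
     + \Big[\sum_{k=m+1}^{j} b_k (2 + c'_k) + \sum_{h=m}^{j} a_h (1 + d'_h)\Big] \\
  &= b_m + \big(w_{i,m-1} + w_{i,m-1}\,p_l\big) + \big(w_{m+1,j} + w_{m+1,j}\,p_r\big) \\
  &= w_{ij} + w_{i,m-1}\,p_l + w_{m+1,j}\,p_r ,
\end{aligned}
```

The last step uses b_m + w_{i,m-1} + w_{m+1,j} = w_{ij}. An empty subtree is a single leaf with average path length 0, so it contributes only its own probability.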
3. Recursive relationship: Based on the optimal substructure property of the optimal binary search tree problem, the recursive formula for p_{ij} is

$$ w_{ij}\, p_{ij} = w_{ij} + \min_{i \le k \le j} \{ w_{i,k-1}\, p_{i,k-1} + w_{k+1,j}\, p_{k+1,j} \} , $$

with the initial values p_{i,i-1} = 0 for 1 <= i <= n + 1. Write m(i, j) for w_{ij} p_{ij}; then m(1, n) = w_{1,n} p_{1,n} = p_{1,n} is the optimal value, because w_{1,n} = 1. The recursive formula for m(i, j) is

$$ m(i, j) = w_{ij} + \min_{i \le k \le j} \{ m(i, k-1) + m(k+1, j) \}, \quad i \le j, $$
$$ w_{ij} = w_{i,j-1} + a_j + b_j, \qquad w_{i,i-1} = a_{i-1}, \qquad m(i, i-1) = 0 . $$

4. Solution process:

1) Construct the trees with no internal node: T[1][0], T[2][1], T[3][2], ..., T[n+1][n].
2) Construct the optimal binary search trees with exactly one internal node, T[1][1], T[2][2], ..., T[n][n]. This yields m[i][i]. At the same time an array s records the root element of each tree: s[1][1] = 1, s[2][2] = 2, ..., s[n][n] = n.
3) Construct the optimal binary search trees with 2, 3, ..., n internal nodes, in order of increasing subscript difference r:

r (subscript difference)   trees constructed
0                          T[1][1], T[2][2], ..., T[n][n]
1                          T[1][2], T[2][3], ..., T[n-1][n]
2                          T[1][3], T[2][4], ..., T[n-2][n]
...
r                          T[1][r+1], T[2][r+2], ..., T[i][i+r], ..., T[n-r][n]
...
n-1                        T[1][n]

The code is as follows:

```cpp
// 3d11-1 Optimal binary search tree: dynamic programming
#include <iostream>
using namespace std;

const int N = 3;

void OptimalBinarySearchTree(double a[], double b[], int n, double **m, int **s, double **w);
void Traceback(int n, int i, int j, int **s, int f, char ch);

int main() {
    double a[] = {0.15, 0.1, 0.05, 0.05};
    double b[] = {0.00, 0.5, 0.1, 0.05};

    cout << "Probability distribution of the ordered set:" << endl;
    for (int i = 0; i < N + 1; i++) {
        cout << "a" << i << "=" << a[i] << ", b" << i << "=" << b[i] << endl;
    }

    double **m = new double *[N + 2];
    int **s = new int *[N + 2];
    double **w = new double *[N + 2];
    for (int i = 0; i < N + 2; i++) {
        m[i] = new double[N + 2];
        s[i] = new int[N + 2];
        w[i] = new double[N + 2];
    }

    OptimalBinarySearchTree(a, b, N, m, s, w);
    cout << "Minimum average path length of the binary search tree: " << m[1][N] << endl;
    cout << "The optimal binary search tree constructed is:" << endl;
    Traceback(N, 1, N, s, 0, '0');

    // Rows were allocated with new[], so release them with delete[].
    for (int i = 0; i < N + 2; i++) {
        delete[] m[i];
        delete[] s[i];
        delete[] w[i];
    }
    delete[] m;
    delete[] s;
    delete[] w;
    return 0;
}

void OptimalBinarySearchTree(double a[], double b[], int n, double **m, int **s, double **w) {
    // Initialization: subtrees with no internal node.
    for (int i = 0; i <= n; i++) {
        w[i + 1][i] = a[i];
        m[i + 1][i] = 0;
    }

    for (int r = 0; r < n; r++) {           // r: difference between end and start subscripts
        for (int i = 1; i <= n - r; i++) {  // i: subscript of the first element
            int j = i + r;                  // j: subscript of the last element

            // Build T[i][j]: fill in w[i][j], m[i][j], s[i][j].
            // First take i as the root: the left subtree is empty and
            // the right subtree holds the nodes i+1, ..., j.
            w[i][j] = w[i][j - 1] + a[j] + b[j];
            m[i][j] = m[i + 1][j];
            s[i][j] = i;

            // Then try k = i+1, ..., j as the root: the left subtree holds
            // the nodes i, ..., k-1 and the right subtree holds k+1, ..., j.
            for (int k = i + 1; k <= j; k++) {
                double t = m[i][k - 1] + m[k + 1][j];
                if (t < m[i][j]) {
                    m[i][j] = t;
                    s[i][j] = k;  // root element
                }
            }
            m[i][j] += w[i][j];
        }
    }
}

void Traceback(int n, int i, int j, int **s, int f, char ch) {
    int k = s[i][j];
    if (k > 0) {
        if (f == 0) {
            // Root of the whole tree
            cout << "Root:" << k << " (i:j):(" << i << "," << j << ")" << endl;
        } else {
            // Root of a subtree: ch is 'L' or 'R', f is the parent
            cout << ch << " of " << f << ":" << k << " (i:j):(" << i << "," << j << ")" << endl;
        }

        int t = k - 1;
        if (t >= i && t <= n) {
            // Trace back the left subtree
            Traceback(n, i, t, s, k, 'L');
        }
        t = k + 1;
        if (t <= j) {
            // Trace back the right subtree
            Traceback(n, t, j, s, k, 'R');
        }
    }
}
```

5. Construct the optimal solution: in the OptimalBinarySearchTree algorithm, s[i][j] stores the element in the root of the optimal subtree T(i, j). When s[1][n] = k, x_k is the root element of the optimal binary search tree, and its left subtree is T(1, k-1). Likewise, i = s[1][k-1] means that the root element of T(1, k-1) is x_i. Continuing in this way, the optimal binary search tree can be constructed in O(n) time from the information recorded in s.

6. Complexity analysis and optimization: the algorithm uses three arrays m, s, and w, so the space complexity is O(n^2). The main work of the algorithm is computing

$$ \min_{i \le k \le j} \{ m(i, k-1) + m(k+1, j) \} . $$

For fixed r = j - i this takes O(j - i + 1) = O(r + 1) time for each i. Therefore the total time consumed by the algorithm is

$$ \sum_{r=0}^{n-1} \sum_{i=1}^{n-r} O(r + 1) = O(n^3) . $$

In fact, this state transition equation satisfies the quadrangle inequality, the basis of the dynamic programming speedup principle. As a result the optimal roots are monotone, s[i][j-1] <= s[i][j] <= s[i+1][j], so the search for k can be restricted to that range and the time complexity drops to O(n^2). The improved code is as follows:

```cpp
// 3d11-1 Optimal binary search tree: dynamic programming speedup (quadrangle inequality)
#include <iostream>
using namespace std;

const int N = 3;

void OptimalBinarySearchTree(double a[], double b[], int n, double **m, int **s, double **w);
void Traceback(int n, int i, int j, int **s, int f, char ch);

int main() {
    double a[] = {0.15, 0.1, 0.05, 0.05};
    double b[] = {0.00, 0.5, 0.1, 0.05};

    cout << "Probability distribution of the ordered set:" << endl;
    for (int i = 0; i < N + 1; i++) {
        cout << "a" << i << "=" << a[i] << ", b" << i << "=" << b[i] << endl;
    }

    double **m = new double *[N + 2];
    int **s = new int *[N + 2];
    double **w = new double *[N + 2];
    for (int i = 0; i < N + 2; i++) {
        m[i] = new double[N + 2];
        s[i] = new int[N + 2];
        w[i] = new double[N + 2];
    }

    OptimalBinarySearchTree(a, b, N, m, s, w);
    cout << "Minimum average path length of the binary search tree: " << m[1][N] << endl;
    cout << "The optimal binary search tree constructed is:" << endl;
    Traceback(N, 1, N, s, 0, '0');

    // Rows were allocated with new[], so release them with delete[].
    for (int i = 0; i < N + 2; i++) {
        delete[] m[i];
        delete[] s[i];
        delete[] w[i];
    }
    delete[] m;
    delete[] s;
    delete[] w;
    return 0;
}

void OptimalBinarySearchTree(double a[], double b[], int n, double **m, int **s, double **w) {
    // Initialization: subtrees with no internal node.
    for (int i = 0; i <= n; i++) {
        w[i + 1][i] = a[i];
        m[i + 1][i] = 0;
        s[i + 1][i] = 0;
    }

    for (int r = 0; r < n; r++) {           // r: difference between end and start subscripts
        for (int i = 1; i <= n - r; i++) {  // i: subscript of the first element
            int j = i + r;                  // j: subscript of the last element

            // Candidate roots are limited to [i1, j1] = [s[i][j-1], s[i+1][j]].
            // For r = 0 both table entries are 0, so the range falls back to [i, j].
            int i1 = s[i][j - 1] > i ? s[i][j - 1] : i;
            int j1 = s[i + 1][j] > i ? s[i + 1][j] : j;

            // Build T[i][j]: fill in w[i][j], m[i][j], s[i][j].
            // Start with i1 as the root.
            w[i][j] = w[i][j - 1] + a[j] + b[j];
            m[i][j] = m[i][i1 - 1] + m[i1 + 1][j];
            s[i][j] = i1;

            // Then try k = i1+1, ..., j1 as the root: the left subtree holds
            // the nodes i, ..., k-1 and the right subtree holds k+1, ..., j.
            for (int k = i1 + 1; k <= j1; k++) {
                double t = m[i][k - 1] + m[k + 1][j];
                if (t < m[i][j]) {
                    m[i][j] = t;
                    s[i][j] = k;  // root element
                }
            }
            m[i][j] += w[i][j];
        }
    }
}

void Traceback(int n, int i, int j, int **s, int f, char ch) {
    int k = s[i][j];
    if (k > 0) {
        if (f == 0) {
            // Root of the whole tree
            cout << "Root:" << k << " (i:j):(" << i << "," << j << ")" << endl;
        } else {
            // Root of a subtree: ch is 'L' or 'R', f is the parent
            cout << ch << " of " << f << ":" << k << " (i:j):(" << i << "," << j << ")" << endl;
        }

        int t = k - 1;
        if (t >= i && t <= n) {
            // Trace back the left subtree
            Traceback(n, i, t, s, k, 'L');
        }
        t = k + 1;
        if (t <= j) {
            // Trace back the right subtree
            Traceback(n, t, j, s, k, 'R');
        }
    }
}
```

[Figure: running result]
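One way to gain confidence in the restricted root range is to run the cubic and the quadratic table-filling on the same input and compare the results. The following is a compact sketch of that comparison using std::vector in place of raw arrays; the names `ObstResult` and `optimalBst` and the `bounded` flag are hypothetical and do not appear in the listings above.

```cpp
#include <cassert>
#include <vector>

// Optimal cost m(1, n) together with the root table s.
struct ObstResult {
    double cost;
    std::vector<std::vector<int>> root;
};

// a[0..n]: gap probabilities; b[1..n]: key probabilities (b[0] unused).
// bounded == false: try every root k in [i, j].
// bounded == true : try only k in [s[i][j-1], s[i+1][j]].
ObstResult optimalBst(const std::vector<double>& a,
                      const std::vector<double>& b, bool bounded) {
    assert(a.size() == b.size() && !a.empty());
    const int n = static_cast<int>(a.size()) - 1;
    std::vector<std::vector<double>> m(n + 2, std::vector<double>(n + 2, 0.0));
    std::vector<std::vector<double>> w(n + 2, std::vector<double>(n + 2, 0.0));
    std::vector<std::vector<int>> s(n + 2, std::vector<int>(n + 2, 0));
    for (int i = 0; i <= n; ++i) w[i + 1][i] = a[i];  // empty subtrees
    for (int r = 0; r < n; ++r) {
        for (int i = 1; i <= n - r; ++i) {
            const int j = i + r;
            w[i][j] = w[i][j - 1] + a[j] + b[j];
            int lo = i, hi = j;
            if (bounded && r > 0) {
                lo = s[i][j - 1];
                hi = s[i + 1][j];
            }
            int best = lo;  // smallest root that attains the minimum
            for (int k = lo + 1; k <= hi; ++k) {
                if (m[i][k - 1] + m[k + 1][j] < m[i][best - 1] + m[best + 1][j]) {
                    best = k;
                }
            }
            s[i][j] = best;
            m[i][j] = w[i][j] + m[i][best - 1] + m[best + 1][j];
        }
    }
    return {m[1][n], s};
}
```

Calling `optimalBst(a, b, false)` and `optimalBst(a, b, true)` with the distribution used in the listings should report the same cost and the same root for the whole tree.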
