Optimal binary search tree:
Given a sorted sequence of n distinct keywords K = <k1, k2, ..., kn> (so k1 < k2 < ... < kn), we want to build a binary search tree from these keywords. For each keyword ki there is a probability pi that a search is for ki.
Some searches may be for values that are not in K, so we also have n+1 "pseudo-keywords" d0, d1, d2, ..., dn to represent values that are not in K. d0 represents all values less than k1, dn represents all values greater than kn, and for i = 1, 2, ..., n-1 the pseudo-keyword di represents all values between ki and k(i+1).
For each pseudo-keyword di there is also a probability qi that a search ends at di.
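To make the mapping from search values to keywords and pseudo-keywords concrete, here is a minimal sketch in Python. The function name classify and the sample keys are illustrative assumptions, not part of the original text; it relies on the standard bisect module.

```python
from bisect import bisect_left

def classify(keys, x):
    """Map a search value x to a keyword or pseudo-keyword.

    keys is the sorted list [k1, ..., kn]. Returns ("k", i) if x equals ki,
    otherwise ("d", i) where di is the gap that x falls into.
    """
    pos = bisect_left(keys, x)  # number of keys strictly less than x
    if pos < len(keys) and keys[pos] == x:
        return ("k", pos + 1)   # keywords are numbered from 1
    return ("d", pos)           # d0 is below k1, dn is above kn

keys = [10, 20, 30]             # hypothetical k1, k2, k3
print(classify(keys, 5))        # ('d', 0)
print(classify(keys, 20))       # ('k', 2)
print(classify(keys, 25))       # ('d', 2)
print(classify(keys, 35))       # ('d', 3)
```

With n keys there are always n+1 possible gaps, which is why the pseudo-keywords run from d0 to dn.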
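Suppose that the cost of a search equals the number of nodes examined, that is, the depth of the node found by the search plus 1. The expected cost of a search in a tree T is:
E[search cost in T] = \sum_{i=1}^{n} (depth_T(k_i) + 1) \cdot p_i + \sum_{i=0}^{n} (depth_T(d_i) + 1) \cdot q_i
where depth_T(x) is the depth of node x in T, and the root has depth 0.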
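The cost definition above (each node weighted by its depth plus 1) can be checked numerically. The sketch below is an illustration under assumptions: the function name expected_search_cost is hypothetical, the probabilities are the sample values used by the program later in this article, and the tree shape (k2 at the root, k1 and k5 as its children, k4 under k5, k3 under k4) is hand-picked for the example.

```python
def expected_search_cost(p, q, key_depth, dummy_depth):
    """Sum (depth + 1) * probability over all keywords and pseudo-keywords.

    p[1..n] and key_depth[1..n] describe the keywords (index 0 is unused);
    q[0..n] and dummy_depth[0..n] describe the pseudo-keywords.
    """
    cost = sum((key_depth[i] + 1) * p[i] for i in range(1, len(p)))
    cost += sum((dummy_depth[i] + 1) * q[i] for i in range(len(q)))
    return cost

p = [0, 0.15, 0.1, 0.05, 0.1, 0.2]
q = [0.05, 0.1, 0.05, 0.05, 0.05, 0.1]
# Depths in the hand-picked tree (root depth is 0).
key_depth = [None, 1, 0, 3, 2, 1]    # k1..k5
dummy_depth = [2, 2, 4, 4, 3, 2]     # d0..d5
print(round(expected_search_cost(p, q, key_depth, dummy_depth), 2))  # 2.75
```

The keywords contribute 1.30 and the pseudo-keywords 1.45, for a total of 2.75.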
For a given set of probabilities, we want to construct a binary search tree with the lowest expected search cost, which we call the optimal binary search tree.
This problem can be solved by dynamic programming:
Step 1: The structure of the optimal binary search tree:
Consider any subtree of a binary search tree. It must contain a contiguous range of keywords ki, ..., kj (1 <= i <= j <= n), and its leaf nodes must be the pseudo-keywords d(i-1), ..., dj.
Optimal substructure:
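If an optimal binary search tree T contains a subtree T' with the keywords ki, ..., kj (1 <= i <= j <= n), then T' must be an optimal solution to the subproblem that contains the keywords ki, ..., kj and the pseudo-keywords d(i-1), ..., dj. Otherwise, replacing T' in T with a cheaper subtree over the same keywords would give a tree with lower expected cost than T, contradicting the optimality of T.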
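Step 2: A recursive algorithm
Let e[i, j] be the expected search cost of an optimal binary search tree containing the keywords ki, ..., kj, and let w(i, j) be the total probability of that subtree:
w(i, j) = \sum_{l=i}^{j} p_l + \sum_{l=i-1}^{j} q_l
When a subtree becomes the child of a node, the depth of each of its nodes grows by 1, so its expected cost grows by w(i, j). Trying every keyword kr (i <= r <= j) as the root gives:
e[i, j] = q_{i-1}, if j = i-1
e[i, j] = \min_{i \le r \le j} \{ e[i, r-1] + e[r+1, j] + w(i, j) \}, if i <= j
root[i, j] stores the subscript r of the root kr of an optimal subtree containing ki, ..., kj.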
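The recursion over all possible roots kr can also be written top-down with memoization. This is only a minimal sketch under assumptions: the name expected_cost is hypothetical, p is 1-indexed with p[0] unused, q is 0-indexed, and functools.lru_cache is used as the memo table. It returns only the optimal cost, not the roots.

```python
from functools import lru_cache

def expected_cost(p, q, n):
    """Expected search cost of an optimal tree over k1..kn (top-down)."""

    def w(i, j):
        # Total probability of keywords ki..kj and pseudo-keywords d(i-1)..dj.
        return sum(p[i:j + 1]) + sum(q[i - 1:j + 1])

    @lru_cache(maxsize=None)
    def e(i, j):
        if j == i - 1:
            # Empty range of keywords: only the pseudo-keyword d(i-1) remains.
            return q[i - 1]
        # Try every kr as the root; both subtrees sink one level deeper,
        # which adds w(i, j) to the cost.
        return min(e(i, r - 1) + e(r + 1, j) for r in range(i, j + 1)) + w(i, j)

    return e(1, n)

p = [0, 0.15, 0.1, 0.05, 0.1, 0.2]
q = [0.05, 0.1, 0.05, 0.05, 0.05, 0.1]
print(round(expected_cost(p, q, 5), 2))  # 2.75
```

Recomputing w(i, j) with sum on every call keeps the sketch short but costs extra time; a bottom-up table can instead build w incrementally as w[i][j] = w[i][j-1] + p[j] + q[j].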
Step 3: Compute the expected search cost of the optimal binary search tree
def optimal_bst(p, q, n):
    # e[i][j]: expected search cost of an optimal subtree with keywords ki..kj
    # w[i][j]: total probability of ki..kj and d(i-1)..dj
    # root[i][j]: subscript r of the root kr of that optimal subtree
    e = [[0 for j in range(n + 1)] for i in range(n + 2)]
    w = [[0 for j in range(n + 1)] for i in range(n + 2)]
    root = [[0 for j in range(n + 1)] for i in range(n + 1)]
    # Empty subtrees (j = i-1) hold only the pseudo-keyword d(i-1).
    # i starts at 1; row 0 is unused padding.
    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]
        w[i][i - 1] = q[i - 1]
    for l in range(1, n + 1):  # l = number of keywords in the subtree
        for i in range(1, n - l + 2):
            j = i + l - 1
            e[i][j] = float("inf")
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            for r in range(i, j + 1):  # try each kr as the root
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root

if __name__ == "__main__":
    p = [0, 0.15, 0.1, 0.05, 0.1, 0.2]  # p[0] is unused padding
    q = [0.05, 0.1, 0.05, 0.05, 0.05, 0.1]
    e, root = optimal_bst(p, q, 5)
    for i in range(5 + 2):
        for j in range(5 + 1):
            print(e[i][j], "", end='')
        print()
    for i in range(5 + 1):
        for j in range(5 + 1):
            print(root[i][j], "", end='')
        print()
Run:
>>>
== RESTART: d:\program files\python\test\algorithms\algorithm Introduction\39-optimal-bst.py ==
0 0 0 0 0 0
0.05 0.45000000000000007 0.9 1.25 1.75 2.75
0 0.1 0.4 0.7 1.2 2.0
0 0 0.05 0.25 0.6 1.2999999999999998
0 0 0 0.05 0.30000000000000004 0.9
0 0 0 0 0.05 0.5
0 0 0 0 0 0.1
0 0 0 0 0 0
0 1 1 2 2 2
0 0 2 2 2 4
0 0 0 3 4 5
0 0 0 0 4 5
0 0 0 0 0 5
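The root table printed by the program encodes the shape of the optimal tree: root[1][n] is the root of the whole tree, and the keywords to its left and right are split recursively in the same way. The following sketch rebuilds that shape. The function name describe_tree is hypothetical, the table is hard-coded from the root values shown in the run, and n >= 1 is assumed.

```python
def describe_tree(root, n):
    """Return one line per node of the tree encoded by root[i][j]."""
    lines = []

    def walk(i, j, parent, side):
        if j == i - 1:
            # Empty range of keywords: the leaf is pseudo-keyword d(i-1) = dj.
            lines.append(f"d{j} is the {side} child of k{parent}")
            return
        r = root[i][j]
        if parent is None:
            lines.append(f"k{r} is the root")
        else:
            lines.append(f"k{r} is the {side} child of k{parent}")
        walk(i, r - 1, r, "left")    # keywords smaller than kr
        walk(r + 1, j, r, "right")   # keywords larger than kr

    walk(1, n, None, None)
    return lines

# root[i][j] for the sample run (row 0 and column 0 are unused padding).
root = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 2, 2, 2],
    [0, 0, 2, 2, 2, 4],
    [0, 0, 0, 3, 4, 5],
    [0, 0, 0, 0, 4, 5],
    [0, 0, 0, 0, 0, 5],
]
print("\n".join(describe_tree(root, 5)))
```

For this table the first line printed is "k2 is the root", followed by k1 and k5 as its left and right children.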