The following article comes from AekdyCoin. Its original source is unknown, but it is an impressive piece.
The algorithm described below is rarely used, because it is more complex than Tarjan's offline algorithm and the O(n log n) methods. But out of curiosity, I did some research on it.
As we all know, LCA and RMQ are as closely related as a married couple. A single DFS converts an LCA problem into a ±1 RMQ problem, and ±1 RMQ can be solved in O(n log n)-O(1) or in O(n)-O(1). The latter, however, is hard to write and full of details, so it is not described here.
The O(n)-O(1) online LCA algorithm introduced here has nothing to do with RMQ, and apart from its complicated proof it is concise and easy to implement. The complexity of the proof does not get in the way: to use the algorithm you do not need to work through the whole proof, only to understand the idea.
First, we introduce how to solve LCA on a complete binary tree.
The figure shows a complete binary tree with 15 nodes. Every node is numbered by its position in the inorder traversal of the tree, starting from 1. What does the LCA look like on this tree?
Note that, going down from the root, the number of consecutive zeros at the end of each node's binary number decreases by one per level. For any vertex x whose number ends in k consecutive zeros, its subtree has 2^(k+1) - 1 nodes, and its left and right subtrees have 2^k - 1 nodes each. Every node in the subtree shares with x the prefix above the last k + 1 bits. Bit k (counting from 0 at the right) is 0 for nodes in the left subtree and 1 for nodes in the right subtree. That is, if x = "a1b", where a is any 01 string and b is k consecutive zeros, then every node in the subtree of x has the form "ac", where c is any 01 string of length k + 1 other than all zeros.
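The structure described above can be checked with a few lines of Python. This is only a minimal sketch: the helper names height and subtree_range are made up for illustration and do not come from the original article.

```python
def height(x):
    """Number of consecutive zero bits at the end of x (x >= 1)."""
    return (x & -x).bit_length() - 1


def subtree_range(x):
    """Smallest and largest inorder numbers in the subtree rooted at x."""
    k = height(x)
    span = (1 << k) - 1          # each child subtree has 2^k - 1 nodes
    return x - span, x + span


print(height(12))                # 12 = 0b1100, prints 2
print(subtree_range(12))         # prints (9, 15)
```

For x = 12 = "1100" we have a = "1" and k = 2, so the subtree consists of "1" followed by any nonzero 3-bit string, that is, the numbers 9 to 15.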
Ha! If you are sharp, you will notice that on such a complete binary tree the LCA of any two nodes can be obtained quickly with bitwise operations. Let x and y be the inorder numbers of the two nodes and r that of their LCA. If one of the two nodes is an ancestor of the other, r is that node. Otherwise let z = x ^ y (bitwise XOR) and let k be the position of the leftmost 1 in the binary representation of z. The part of r to the left of bit k is the same as in x and y (that is what the XOR tells us), bit k of r is 1, and all bits to the right of bit k are 0.
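A possible way to write this rule in Python is sketched below. The function name lca_complete is an assumption made for illustration. The sketch also takes the maximum with the heights of x and y, which is one way (not necessarily the author's) to cover the case where one node is an ancestor of the other.

```python
def height(x):
    """Number of consecutive zero bits at the end of x (x >= 1)."""
    return (x & -x).bit_length() - 1


def lca_complete(x, y):
    """LCA of inorder numbers x and y in a complete binary tree."""
    # Leftmost differing bit; bit_length() - 1 is -1 when x == y.
    k = (x ^ y).bit_length() - 1
    # If one node is an ancestor of the other, its own height is larger.
    k = max(k, height(x), height(y))
    # Keep the bits to the left of k, set bit k, clear everything to the right.
    return ((x >> (k + 1)) << (k + 1)) | (1 << k)


print(lca_complete(1, 3))    # prints 2
print(lca_complete(4, 6))    # prints 4, since 4 is an ancestor of 6
print(lca_complete(7, 10))   # prints 8, the root of the 15-node tree
```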
The rule above is neat, although rather awkward to put into words. In short, if bitwise operations take constant time, the LCA of any two nodes in a complete binary tree can be found in O(1). However, the general LCA problem is posed on an arbitrary tree, which is not necessarily binary. How can the complete binary tree help us?
If we can find a mapping that links the two, so that an LCA query on the given tree is converted into one on a complete binary tree, we can answer it in O(1). The mapping is built as follows:
First, define h(x) as the number of consecutive zeros at the end of the binary representation of x, also called the height of x.
1. Traverse the whole tree in preorder (why?) and assign each node its number;
2. For each vertex x, among all nodes in the subtree rooted at x find the one whose h value is largest, and let I(x) be the number of that node.
The figure gives an example of assigning numbers and computing I(x). Note that in the figure every vertex on a path marked in red has the same I value, namely the number of the deepest vertex on that path, which is also that vertex's own I value. Why? One fact is obvious: the height of a father's I value is never less than the height of a son's I value. With that, the red paths are not hard to understand. Note also that within a subtree exactly one vertex attains the maximum h value (the numbers in a subtree are consecutive, and between two numbers of equal height there is always one of greater height).
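The two steps can be sketched in Python as follows, assuming the numbering is a preorder one, as the summary of steps later in the article states. The function name preorder_and_I and the 10-vertex tree are hypothetical. The tree was chosen to agree with the numbers 7, 10, 2 and 9 used in the text and is not necessarily the tree of the original figure.

```python
def height(x):
    """Number of consecutive zero bits at the end of x (x >= 1)."""
    return (x & -x).bit_length() - 1


def preorder_and_I(children, root):
    """Return (pre, I): preorder number of each vertex and its I value."""
    pre, parent, order = {}, {root: None}, []
    stack = [root]
    while stack:                                  # iterative preorder DFS
        v = stack.pop()
        order.append(v)
        pre[v] = len(order)                       # numbers start from 1
        for c in reversed(children.get(v, [])):
            parent[c] = v
            stack.append(c)
    I = {v: pre[v] for v in order}
    for v in reversed(order):                     # descendants before ancestors
        p = parent[v]
        if p is not None and height(I[v]) > height(I[p]):
            I[p] = I[v]                           # pass the taller number upward
    return pre, I


# Hypothetical tree; the labels happen to equal the preorder numbers.
children = {1: [2], 2: [3, 9], 3: [4, 6], 4: [5], 6: [7, 8], 9: [10]}
pre, I = preorder_and_I(children, 1)
print([I[v] for v in range(1, 11)])   # prints [8, 8, 8, 4, 5, 8, 7, 8, 10, 10]
```

In this example the vertices 1, 2, 3, 6 and 8 all have I value 8 and form one chain, and the vertices 9 and 10 form the chain of I value 10.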
At this point it is still too early to connect the general LCA problem with LCA on the complete binary tree. We first need an extremely important conclusion:
If z is an ancestor of x in the tree, then I(z) is an ancestor of I(x) in the complete binary tree (a node counts as its own ancestor).
Proof: First, h(I(z)) ≥ h(I(x)), because the subtree of x is contained in the subtree of z. If I(z) = I(x), the conclusion is obvious. Otherwise h(I(z)) > h(I(x)); let i = h(I(z)). Suppose there is a bit position k (k > i) to the left of bit i such that I(z) and I(x) agree to the left of bit k but differ at bit k. Take N to be the number that agrees, on bit k and everything to its left, with whichever of the two has a 1 at bit k, and that has only zeros to the right of bit k. Then N lies strictly between I(x) and I(z). Because numbers are assigned in preorder, the numbers in the subtree of z are consecutive, so N must belong to a node in the subtree of z. By the definition of the I value, h(N) ≤ h(I(z)) = i, yet h(N) ≥ k > i, a contradiction. So no such k exists, that is, I(z) and I(x) are identical to the left of bit i. Since I(z) has a 1 at bit i followed only by zeros, and I(x), whose height is less than i, has at least one 1 among its last i + 1 bits, the structure of the complete binary tree described earlier tells us that I(z) is an ancestor of I(x).
It follows that the I value of z, the LCA of x and y, must be a common ancestor of I(x) and I(y) in the complete binary tree.
Write out the paths from the root down to x and to y (the figure shows them for 7 and 10). They share a common prefix made up of the common ancestors of x and y, and the last vertex of that prefix is z. Along each path from the root downward, the heights of the I values of successive vertices never increase. In the complete binary tree we can find the ancestor of a node at any given height, so once we know the height of I(z) we can obtain I(z). How do we compute the height of I(z)?
From the way I values are computed, the I values that occur in the tree are only a subset of the nodes of the complete binary tree, and some values never occur. For example, if 2 is the father of 8, the value 2 (0010) never appears as an I value. So the I values along the path from a node to the root are not as contiguous as the ancestors in a complete binary tree. To deal with this, for each node we store the heights of all I values on the path from that node to the root. This is easy to implement: use a binary number whose bit at position i says whether some ancestor of the node has an I value of height i. Since the tree is usually not too large, a 32-bit int is enough. We write A(x) for this number.
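One way to fill in A(x) is a single top-down pass, sketched below. The function name ancestor_masks, the parent table and the I table are assumptions for illustration. They describe a hypothetical 10-vertex tree whose vertices are labelled by their preorder numbers, not the tree of the original figure.

```python
def height(x):
    """Number of consecutive zero bits at the end of x (x >= 1)."""
    return (x & -x).bit_length() - 1


def ancestor_masks(order, parent, I):
    """A[v] has bit i set iff some vertex on the path v..root has h(I) == i."""
    A = {}
    for v in order:                  # preorder: a parent comes before its children
        own = 1 << height(I[v])
        A[v] = own | (A[parent[v]] if parent[v] is not None else 0)
    return A


# Hypothetical tree: father of each vertex and precomputed I values.
parent = {1: None, 2: 1, 3: 2, 4: 3, 5: 4, 6: 3, 7: 6, 8: 6, 9: 2, 10: 9}
I = {1: 8, 2: 8, 3: 8, 4: 4, 5: 5, 6: 8, 7: 7, 8: 8, 9: 10, 10: 10}
A = ancestor_masks(range(1, 11), parent, I)
print(format(A[7], "04b"), format(A[10], "04b"))   # prints 1001 1010
```

Python integers are unbounded, so the 32-bit limit mentioned in the text does not apply to this sketch. In a language with fixed-width integers the same code works as long as the height of the largest number fits in the word.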
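Let i be the height of the LCA of I(x) and I(y) in the complete binary tree, and let j be the lowest position not below i at which both A(x) and A(y) have a 1. Then h(I(z)) = j ≥ i, and I(z) follows from it: I(z) is the ancestor of I(x) at height j.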
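As a small illustration of this step, the sketch below computes i, j and I(z) from values that are assumed to be already known. The function name chain_of_lca is hypothetical, and the sample numbers come from a made-up tree in which I(7) = 0111, I(10) = 1010, A(7) = 1001 and A(10) = 1010.

```python
def height(x):
    """Number of consecutive zero bits at the end of x (x >= 1)."""
    return (x & -x).bit_length() - 1


def chain_of_lca(Ix, Iy, Ax, Ay):
    """Return (j, I(z)) given the I and A values of the two query vertices."""
    # i: height of the LCA of I(x) and I(y) in the complete binary tree.
    i = max((Ix ^ Iy).bit_length() - 1, height(Ix), height(Iy))
    # Keep only the common heights that are >= i, then take the lowest one.
    common = (Ax & Ay) >> i << i
    j = height(common)
    # I(z) is the ancestor of I(x) at height j.
    Iz = ((Ix >> (j + 1)) << (j + 1)) | (1 << j)
    return j, Iz


print(chain_of_lca(0b0111, 0b1010, 0b1001, 0b1010))   # prints (3, 8)
```

The value common is never zero for two vertices of the same tree, because the height of the root's I value is set in every A value and is at least i.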
All that remains is to find z itself. There may be many vertices whose I value equals I(z); which one do we want? When computing I on the tree earlier, each red path was a chain of vertices with the same I value, and that I value appears only on that chain. So in the end we look for the answer on the chain of I(z).
Both x and y lie in the subtree of z. Call the closest ancestors of x and y on the I(z) chain x' and y' respectively; these are our candidate answers. If I(x) = I(z), then clearly x' is x itself. What if they are not equal?
We can take a small detour. In the example of 7 and 10 above, the nearest ancestor of 10 on the I(z) chain is 2, but knowing only that I(z) = "1000" it is hard to locate this 2. Its son 9, however, is easy to find, because 9 is the topmost vertex of the chain whose I value is "1010". If preprocessing records the topmost vertex of every I-value chain, we can quickly find the sons of x' and y' on their paths down to x and y, since the I values of these two sons are easy to compute. In other words, we find x' and y' through their sons. The final answer z is whichever of x' and y' has the smaller depth.
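The detour through the son can be sketched as follows. The function name nearest_on_chain and the tables I, A, head and parent are assumptions for illustration. They describe a hypothetical 10-vertex tree, labelled by preorder numbers, that agrees with the 7 and 10 example but is not taken from the original figure.

```python
def height(x):
    """Number of consecutive zero bits at the end of x (x >= 1)."""
    return (x & -x).bit_length() - 1


def nearest_on_chain(x, j, I, A, head, parent):
    """Closest ancestor of x (possibly x itself) whose I value has height j."""
    if height(I[x]) == j:
        return x
    below = A[x] & ((1 << j) - 1)       # heights on x's root path that are < j
    k = below.bit_length() - 1          # the largest of them
    Iw = ((I[x] >> (k + 1)) << (k + 1)) | (1 << k)   # I value of the son's chain
    return parent[head[Iw]]             # father of the topmost vertex of that chain


# Hypothetical precomputed tables.
parent = {1: None, 2: 1, 3: 2, 4: 3, 5: 4, 6: 3, 7: 6, 8: 6, 9: 2, 10: 9}
I = {1: 8, 2: 8, 3: 8, 4: 4, 5: 5, 6: 8, 7: 7, 8: 8, 9: 10, 10: 10}
A = {1: 8, 2: 8, 3: 8, 4: 12, 5: 13, 6: 8, 7: 9, 8: 8, 9: 10, 10: 10}
head = {8: 1, 4: 4, 5: 5, 7: 7, 10: 9}   # topmost vertex of each I-value chain

x1 = nearest_on_chain(7, 3, I, A, head, parent)
y1 = nearest_on_chain(10, 3, I, A, head, parent)
print(x1, y1, min(x1, y1))   # prints 6 2 2
```

Because both candidates lie on the same chain and the labels here are preorder numbers, the candidate with the smaller number is the one with the smaller depth, so min gives the answer 2.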
With that, the whole problem is solved. The steps are summarized below:
1. Preprocessing:
A) Run a DFS that assigns numbers in preorder (postorder works as well, but inorder does not) and records the father of each vertex;
B) Compute the I value and the A value of each vertex, and for each I value record the vertex of minimum depth on its chain;
2. For a query (x, y), find the LCA of I(x) and I(y) in the complete binary tree and take its height i;
3. Using A(x) and A(y), find the lowest common height j ≥ i, which is the height of I(z), and from it obtain I(z);
4. Compute x': if I(x) = I(z), then x' = x; otherwise take the largest height k < j that is set in A(x), find w, the shallowest ancestor of x whose I value has height k, and let x' be the father of w;
5. Compute y' in the same way as in step 4. z is whichever of x' and y' has the smaller depth.
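Putting the steps together, a compact Python sketch might look like the following. It is one possible reading of the steps, not the author's code: the class name ConstantTimeLCA, the helper naive_lca (a slow cross-check) and the sample tree are all made up for illustration.

```python
def height(x):
    """Number of consecutive zero bits at the end of x (x >= 1)."""
    return (x & -x).bit_length() - 1


class ConstantTimeLCA:
    def __init__(self, children, root):
        # Step 1 A): preorder numbers and fathers (iterative DFS).
        self.pre, self.parent = {}, {root: None}
        order, stack = [], [root]
        while stack:
            v = stack.pop()
            order.append(v)
            self.pre[v] = len(order)
            for c in reversed(children.get(v, [])):
                self.parent[c] = v
                stack.append(c)
        # Step 1 B): I values, bottom-up over the reversed preorder.
        self.I = {v: self.pre[v] for v in order}
        for v in reversed(order):
            p = self.parent[v]
            if p is not None and height(self.I[v]) > height(self.I[p]):
                self.I[p] = self.I[v]
        # Step 1 B): A values and topmost vertex of each chain, top-down.
        self.A, self.head = {}, {}
        for v in order:
            p = self.parent[v]
            self.A[v] = (self.A[p] if p is not None else 0) | (1 << height(self.I[v]))
            self.head.setdefault(self.I[v], v)   # first seen is the shallowest

    def _nearest_on_chain(self, x, j):
        # Step 4: closest ancestor of x whose I value has height j.
        if height(self.I[x]) == j:
            return x
        k = (self.A[x] & ((1 << j) - 1)).bit_length() - 1
        Iw = ((self.I[x] >> (k + 1)) << (k + 1)) | (1 << k)
        return self.parent[self.head[Iw]]

    def query(self, x, y):
        Ix, Iy = self.I[x], self.I[y]
        # Step 2: height i of the LCA of I(x) and I(y) in the complete tree.
        i = max((Ix ^ Iy).bit_length() - 1, height(Ix), height(Iy))
        # Step 3: lowest common height j >= i in A(x) and A(y).
        j = height((self.A[x] & self.A[y]) >> i << i)
        # Steps 4 and 5: the shallower candidate has the smaller preorder number.
        xs, ys = self._nearest_on_chain(x, j), self._nearest_on_chain(y, j)
        return xs if self.pre[xs] < self.pre[ys] else ys


def naive_lca(parent, x, y):
    """Slow reference answer, used only to cross-check the sketch."""
    seen = set()
    while x is not None:
        seen.add(x)
        x = parent[x]
    while y not in seen:
        y = parent[y]
    return y


# Hypothetical tree; the labels happen to equal the preorder numbers.
children = {1: [2], 2: [3, 9], 3: [4, 6], 4: [5], 6: [7, 8], 9: [10]}
t = ConstantTimeLCA(children, 1)
print(t.query(7, 10), t.query(5, 7), t.query(4, 5))   # prints 2 3 4
```

The sketch compares preorder numbers instead of depths in step 5. This is an assumption that relies on x' and y' lying on the same chain, where the shallower vertex always has the smaller preorder number.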
The preprocessing above can be done with two DFS passes in O(n), and each query then takes O(1) using bitwise operations and the precomputed results. The code involves many bitwise operations but is very short, about as concise as offline Tarjan.
Indeed, this is not something an ordinary person would come up with. Either its author is a genius, or the algorithm took many days of struggle to develop. The mapping from the general LCA problem to the complete binary tree is confusing at first; only after seeing all the steps do you realize how subtle the conversion is, and then you can only marvel. Moreover, the correctness of the whole method requires a good deal of proof, and almost every conversion relies on some special property. In short, this online algorithm is strange and wonderful in many ways. That does not stop us from enjoying its elegance and convenience. At the very least it offers a good way to approach problems: move the problem toward a classic one, map it, and unify the two. This way of thinking makes full use of what we already know and helps us learn more.