Tarjan algorithm
LCA: the LCA (Least Common Ancestor, also called the lowest common ancestor), as the name implies, is the common ancestor of two nodes in a tree that is closest to both of them. In other words, the paths from the two nodes up to the root must share at least one node, and among those shared nodes we want the one that is as deep as possible. It can also be put another way: if the tree is viewed as a graph, the shortest path between the two nodes passes through their LCA, so finding the LCA also gives the shortest distance between them.
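To make the definition concrete, here is a minimal sketch of the most direct way to find an LCA: repeatedly lift the deeper of the two nodes to its parent until both meet. This is not Tarjan's algorithm, only an illustration of the definition. The names naive_lca, tree_distance, parent and depth, and the small tree itself, are assumptions made for this example.

```python
def naive_lca(parent, depth, u, v):
    """Climb towards the root from both nodes until the two pointers meet."""
    while u != v:
        if depth[u] < depth[v]:
            u, v = v, u          # keep u as the deeper (or equally deep) node
        u = parent[u]            # lift the deeper node one level
    return u


def tree_distance(parent, depth, u, v):
    """Number of edges on the path u -> LCA -> v."""
    z = naive_lca(parent, depth, u, v)
    return depth[u] + depth[v] - 2 * depth[z]


# Hypothetical tree: 1 is the root, 2 and 3 are its children,
# 5 and 6 are children of 2, and 7 is a child of 3.
parent = {1: None, 2: 1, 3: 1, 5: 2, 6: 2, 7: 3}
depth = {1: 0, 2: 1, 3: 1, 5: 2, 6: 2, 7: 2}

print(naive_lca(parent, depth, 5, 7))      # 1
print(tree_distance(parent, depth, 5, 7))  # 4
```

Each call may walk the full height of the tree, which is why faster methods are worth having when there are many queries.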
LCA algorithms come in online and offline forms. An online algorithm answers each query as it arrives. An offline algorithm has to read all the queries first and then processes them together, not necessarily in the order in which they were given: a query that arrived later may be answered earlier during the run.
Tarjan algorithm. This is an offline algorithm based on a union-find (disjoint-set) structure and DFS.
Now consider the state of the union-find structure at the moment we handle the queries attached to node x. Once a node has been fully processed, it is merged into the set of its parent. So, among the nodes processed so far (including x itself), x together with its own subtree forms the set rooted at x; the parent of x together with the subtrees of all already processed siblings of x forms the set rooted at father[x]; the grandparent of x together with the subtrees of all already processed siblings of father[x] forms the set rooted at father[father[x]]; and so on up to the root (if this wording is hard to follow, see the picture on the right).
Suppose there is a query (x, y) in which y has already been processed. If the union-find lookup shows that y belongs to the set whose root is z, then z is the LCA of x and y, and the path from x to y has length lv[x] + lv[y] - 2 * lv[z], where lv[v] is the depth of node v. Summing these path lengths over all queries gives the answer.
One problem remains: the query (x, y) above assumes that y has already been processed. What if it has not? Simply store every query twice, as (x, y) and as (y, x). Then exactly one of the two copies gets answered: the copy met at the endpoint that finishes first is skipped because its partner is not yet processed, and the copy met at the endpoint that finishes later is answered. A query (x, x) does not need to be stored at all, since its answer is x itself. If the union-find structure uses optimizations such as path compression, the amortized cost of one query is close to constant, so the whole algorithm runs in nearly linear time.
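The procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name tarjan_lca, the children dictionary, the pending list and the example tree are all invented for this sketch, while father and lv follow the names used in the text. Each finished child is linked directly under its parent, so the root of a set is always the candidate ancestor.

```python
from collections import defaultdict


def tarjan_lca(children, root, queries):
    """Answer all queries offline. Returns {query index: (lca, path length)}."""
    father = {}       # union-find links; the root of a set is the candidate LCA
    lv = {}           # lv[v] = depth of v, with lv[root] = 0
    done = set()      # nodes whose whole subtree has been processed
    answers = {}

    # Store each query in both directions, (x, y) and (y, x).
    pending = defaultdict(list)
    for i, (x, y) in enumerate(queries):
        pending[x].append((y, i))
        pending[y].append((x, i))

    def find(v):
        top = v
        while father[top] != top:
            top = father[top]
        while father[v] != top:      # path compression
            nxt = father[v]
            father[v] = top
            v = nxt
        return top

    def dfs(x, level):
        father[x] = x
        lv[x] = level
        for c in children.get(x, []):
            dfs(c, level + 1)
            father[c] = x            # merge the finished subtree of c into x's set
        done.add(x)
        for y, i in pending[x]:
            if y in done:            # only the copy seen at the later endpoint fires
                z = find(y)
                answers[i] = (z, lv[x] + lv[y] - 2 * lv[z])

    dfs(root, 0)
    return answers


# Hypothetical tree: 1 -> 2, 3; 2 -> 5, 6; 3 -> 7.
children = {1: [2, 3], 2: [5, 6], 3: [7]}
queries = [(7, 5), (5, 6), (2, 5)]
answers = tarjan_lca(children, 1, queries)
total = sum(length for _, length in answers.values())
print(answers)  # {1: (2, 2), 2: (2, 1), 0: (1, 4)}
print(total)    # 7
```

The recursive DFS keeps the sketch short; on a very deep tree it would hit Python's recursion limit, and an explicit stack would be the safer choice.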
Tarjan's algorithm is offline: before it starts, all the node pairs waiting to be queried must be stored, and then the program executes TARJANLCA() from the root of the tree. Suppose we have the multi-way tree below.
As the implementation of TARJANLCA shows, the root of a subtree is marked black (every node starts out white) only after the whole subtree has been traversed. Suppose the program traverses the tree above. It starts at node 1 and recursively processes the subtree rooted at 2. When subtree 2 is finished, nodes 2, 5 and 6 are black. It then moves on to the subtree rooted at 3, where the first node to be colored black is node 7 (node 7 is a leaf, so there is nothing deeper to search and it is processed directly). Node 7 then looks at every query pair (7, x). If (7, 5) is among them, node 5 is already black, so the LCA of (7, 5) is find(5).ancestor, that is, node 1: once subtree 2 was finished, it was unioned with node 1, so find(5) returns the root of the merged set, and the ancestor value stored at that root is 1 at this point. Someone may ask: what if there is no query (7, 5) but there is a query (5, 7)? A simple trick handles this: when the program is initialized, store every query in both forms, (a, b) and (b, a), so that no query is missed.
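The walk-through above can be replayed with a small sketch of the colored variant, in which each set root carries an ancestor value. The picture of the tree is not reproduced here, so the tree shape is an assumption chosen to agree with the walk-through, and every name in the code (tarjan_lca_colored, asks, order and so on) is hypothetical. Union by rank is used on purpose: it shows why the ancestor value is needed, because the root of the merged set is not necessarily node 1.

```python
def tarjan_lca_colored(children, root, queries):
    """Returns ({frozenset of the pair: lca}, order in which nodes turn black)."""
    parent, rank = {}, {}   # union-find with union by rank
    ancestor = {}           # meaningful only at the root of a set
    black = set()           # nodes whose subtree is fully processed
    order = []
    result = {}

    # Store each query (a, b) in both forms, (a, b) and (b, a).
    asks = {}
    for a, b in queries:
        asks.setdefault(a, []).append(b)
        asks.setdefault(b, []).append(a)

    def find(v):
        if parent[v] != v:
            parent[v] = find(parent[v])   # path compression
        return parent[v]

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if rank[ra] < rank[rb]:
            ra, rb = rb, ra
        parent[rb] = ra                   # hang the shallower set under the deeper
        if rank[ra] == rank[rb]:
            rank[ra] += 1

    def visit(u):
        parent[u], rank[u], ancestor[u] = u, 0, u
        for c in children.get(u, []):
            visit(c)
            union(u, c)
            ancestor[find(u)] = u         # the merged set now answers with u
        black.add(u)
        order.append(u)
        for v in asks.get(u, []):
            if v in black:
                result[frozenset((u, v))] = ancestor[find(v)]

    visit(root)
    return result, order


# Assumed tree: 1 -> 2, 3; 2 -> 5, 6; 3 -> 7.
tree = {1: [2, 3], 2: [5, 6], 3: [7]}
result, order = tarjan_lca_colored(tree, 1, [(5, 7)])
print(order)                      # [5, 6, 2, 7, 3, 1]
print(result[frozenset((7, 5))])  # 1
```

In this run the root of the set containing node 5 is node 2 when node 7 is handled, yet the ancestor value stored there is 1, which is the value the query returns.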
Tarjan algorithm for finding the lowest common ancestor