Data structures: the disjoint-set (union-find) structure

Last Update:2014-11-28 Source: Internet

Author: User


First, the concepts behind the disjoint set:

To motivate the disjoint-set (union-find) structure, we first introduce a few concepts.

1. Equivalence relation: a relation that is reflexive, symmetric, and transitive. If an equivalence relation holds between a and b, we write a ~ b.

2. Equivalence class: the equivalence class of an element a (a in S) is the subset of S that contains all elements related to a. Note that the equivalence classes form a partition of S: every member of S appears in exactly one equivalence class. To decide whether a ~ b, we only need to check that a and b belong to the same equivalence class.

3. Disjoint set: each set is an equivalence class. Any two elements of the same set are related to each other, and elements of different sets are never related.

4. Related properties (find/union): the input is initially a collection of n sets, each containing one element. This initial state says that all relations are false (except the reflexive ones). Each set has a different element, so S(i) ∩ S(j) = ∅ for i ≠ j, which makes the sets disjoint.

Two operations can be performed: finding the set to which a given element belongs, and merging two sets. The first operation is find, which returns the name of the set (that is, the equivalence class) containing the given element. The other operation adds a relation. To add the relation a ~ b, we first need to know whether a and b are already related. This is done by performing find on a and on b and checking whether they belong to the same equivalence class. If they are not in the same class, we apply the union operation, which merges the two equivalence classes containing a and b into one new equivalence class.
Second, the basic data structure of the disjoint set:

We go straight to the data structure best suited to representing a disjoint set. For why other structures such as linked lists are not chosen, see Introduction to Algorithms and Data Structures and Algorithm Analysis. Here we use a tree to represent each set, because every element of a tree has the same root, so the root can serve as the name of the set. For each element we only need to record the set it belongs to, so the tree can be stored implicitly in an array: entry p[i] is the parent of element i. If i is a root, then p[i] = 0 (alternatively, p[i] = i). With this representation, a union of two sets makes the root of one tree point to the root of the other tree.

Starting from the singleton sets 1 to 8, perform Union(5,6), Union(7,8), and Union(5,7), in that order. Afterwards, the implicit representation of the trees (first row: p[i], the parent of node i; second row: node i) is:

p[i]:  0  0  0  0  0  5  5  7
i:     1  2  3  4  5  6  7  8

or, when a root is marked by p[i] = i:

p[i]:  1  2  3  4  5  5  5  7
i:     1  2  3  4  5  6  7  8
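To make the array representation concrete, here is a minimal sketch in C that reads the first table (the p[i] = 0 root convention) and recovers the root, the depth, and set membership of an element. The names N, parent, root_of, depth_of, and same_set are illustrative assumptions, not part of the original text.

```c
#include <assert.h>

#define N 8   /* elements are numbered 1..N; index 0 is unused */

/* First table from the text: parent[i] == 0 means that i is a root. */
const int parent[N + 1] = {0, 0, 0, 0, 0, 0, 5, 5, 7};

/* Root of the tree containing x, i.e. the name of x's set. */
int root_of(int x)
{
    assert(x >= 1 && x <= N);
    while (parent[x] != 0)
        x = parent[x];
    return x;
}

/* Number of parent links between x and its root. */
int depth_of(int x)
{
    int depth = 0;
    assert(x >= 1 && x <= N);
    while (parent[x] != 0) {
        x = parent[x];
        depth++;
    }
    return depth;
}

/* Two elements are in the same set exactly when their roots coincide. */
int same_set(int x, int y)
{
    return root_of(x) == root_of(y);
}
```

With this data, element 8 reaches the root 5 through 7, so root_of(8) is 5 and depth_of(8) is 2, while elements 1 to 4 are still roots of their own singleton sets.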

Third, implementing the disjoint-set operations:

Now let's consider the concrete implementation of the find and union operations described above. The operations are:

MakeSet(int x[]): creates the initial sets; each element of x becomes a set whose only member is that element.
UnionSet(int x, int y): merges the sets containing x and y (S(x) and S(y)) into one new set (the union of the two).
int FindSet(int x): returns the number of the set that contains x.

Declarations:

p[]: p[i] is the parent of node i.
numsets: a constant giving the number of initial sets, i.e. the number of initial elements.

The MakeSet(int x[]) function:

    void MakeSet(int x[])
    {
        int i;
        for (i = 0; i < numsets; i++) {
            p[x[i]] = 0;   /* or p[x[i]] = x[i] */
        }
    }
Explanation: this initialization assigns a parent to every element, which determines the set each element belongs to. Since every element starts out as its own set, we set each element's parent to 0 (or to the element itself). When 0 marks a root, the elements must be numbered from 1, because 0 cannot also be an element.

The UnionSet(int x, int y) function:

    void UnionSet(int x, int y)
    {
        int a = FindSet(x);
        int b = FindSet(y);
        /* Skip the link when both are already in one set; otherwise
           the root would become its own parent. */
        if (a != b)
            p[b] = a;
    }
Explanation: UnionSet merges the two sets that the two elements belong to. Here we implement the basic method; its optimization is covered in the optimization section below. The basic idea of merging two sets is to set the parent of one set's root to the root of the other set. The steps follow directly: first use FindSet to obtain the set numbers (the roots) of the two elements, then set the parent of one root to the other root. That completes the merge.

The FindSet(int x) function:
    int FindSet(int x)
    {
        if (p[x] <= 0)   /* or: if (p[x] == x) */
            return x;
        else
            return FindSet(p[x]);
    }

Explanation: if the parent of the node is 0 (or the node itself), the node is the root of its tree, so the node itself is returned. Otherwise we move to the parent and return the set number of the parent, which is also the set number of the node. The code is recursive: the recursion terminates when the root is found, and the recursive step is justified by the fact that a node and its parent belong to the same set.
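The text mentions p[i] = i as an alternative root marker but only spells out the p[i] = 0 version. The following is a minimal sketch of that alternative, written with a loop instead of recursion. The names N, parent, make_sets, find_set, and union_sets are assumptions chosen for this illustration.

```c
#include <assert.h>

#define N 8   /* elements are numbered 1..N; index 0 is unused */

int parent[N + 1];

/* Every element starts as the root of its own singleton set. */
void make_sets(void)
{
    int i;
    for (i = 1; i <= N; i++)
        parent[i] = i;
}

/* Walk up until a node that is its own parent, i.e. a root. */
int find_set(int x)
{
    assert(x >= 1 && x <= N);
    while (parent[x] != x)
        x = parent[x];
    return x;
}

/* Make the root of y's tree point to the root of x's tree. */
void union_sets(int x, int y)
{
    int a = find_set(x);
    int b = find_set(y);
    parent[b] = a;   /* harmless when a == b: a root stays its own parent */
}
```

Calling make_sets() followed by union_sets(5, 6), union_sets(7, 8), and union_sets(5, 7) yields the parent array 1 2 3 4 5 5 5 7, the second table shown earlier. One design point of this convention is that linking a root to itself changes nothing, so union needs no special case for elements already in the same set.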
Fourth, optimizing the disjoint-set operations:

First, an example. Initially we have 5 elements forming 5 separate sets. Now we merge some of them with the UnionSet function above: UnionSet(2,1), UnionSet(3,1), UnionSet(4,1). After these three operations the tree has degenerated into a linked list (1 -> 2 -> 3 -> 4, where each arrow points to the parent). The problem is that searching it is very inefficient.

Why does a linked list perform poorly overall? Find and union are the operations we use most often. If a linked list stores a set, the first element of the list is usually designated as the representative of the set. A find on the tail element then has to traverse the whole list to reach the head, which takes O(n) time. For a union, the two lists must be joined, either head to head or head to tail. Either way, the head of a list has to be found first. Overall, find and union cost more with linked lists than with the rooted-tree implementation.

The optimizations used here are heuristics that avoid increasing the depth of the tree after a merge as far as possible. The first heuristic is union by rank. The idea is to make the root of the tree with fewer nodes point to the root of the tree with more nodes. Each node keeps a rank, which is an upper bound on the height of the node. In union by rank, the root with the smaller rank is made to point to the root with the larger rank. The second heuristic is path compression.
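The degenerate case described above can be reproduced with a short sketch. It replays the three unions with the basic, unoptimized union under the p[i] = 0 convention and measures how far element 1 ends up from its root. The names N, parent, find_set, naive_union, depth_of, and build_chain are hypothetical and used only for this illustration.

```c
#include <assert.h>

#define N 5   /* elements are numbered 1..N; index 0 is unused */

int parent[N + 1];   /* parent[i] == 0 means that i is a root */

int find_set(int x)
{
    assert(x >= 1 && x <= N);
    while (parent[x] != 0)
        x = parent[x];
    return x;
}

/* Basic union: the root of y's tree is attached below the root of x's tree. */
void naive_union(int x, int y)
{
    int a = find_set(x);
    int b = find_set(y);
    if (a != b)
        parent[b] = a;
}

/* Number of parent links between x and its root. */
int depth_of(int x)
{
    int depth = 0;
    while (parent[x] != 0) {
        x = parent[x];
        depth++;
    }
    return depth;
}

/* Replay UnionSet(2,1), UnionSet(3,1), UnionSet(4,1) from singletons. */
void build_chain(void)
{
    int i;
    for (i = 1; i <= N; i++)
        parent[i] = 0;
    naive_union(2, 1);
    naive_union(3, 1);
    naive_union(4, 1);
}
```

After build_chain(), the parents of 1, 2, and 3 are 2, 3, and 4, so depth_of(1) is 3: each additional union of this kind adds one more link that every later find on element 1 has to follow.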
Path compression is performed during the find operation, regardless of the method used for the union. The operation is FindSetWithPathCompression(x), and its effect is that every node on the path from x to the root gets the root as its parent. Now let's look at both methods in detail.

1. Union by rank

Here we need an additional field, rank, which must be initialized for every node (set) during initialization.

(1) MakeSet(int x[]):

    void MakeSet(int x[])
    {
        int i;
        for (i = 0; i < numsets; i++) {
            p[x[i]] = 0;      /* or p[x[i]] = x[i] */
            rank[x[i]] = 0;
        }
    }

(2) The union-by-rank step: in the union operation, the root with the smaller rank is made to point to the root with the larger rank.

    void UnionByRank(int x, int y)
    {
        int fx = FindSet(x);
        int fy = FindSet(y);
        if (fx == fy)                 /* already in the same set */
            return;
        if (rank[fx] > rank[fy]) {
            p[fy] = fx;
        } else {
            p[fx] = fy;
            if (rank[fx] == rank[fy])
                rank[fy]++;
        }
    }

Explanation: the main points are:

A. Why do we look up the root nodes?
B. When does the rank need to be updated?
C. What should be done after comparing the two ranks?

Answers:

A. Merging two elements means merging the two sets they belong to, that is, two different trees. To keep the depth of the merged tree as small as possible, we compare the ranks of the trees, which are the ranks of the root nodes, not the ranks of child nodes.
B. The rank is updated when the rank of the merged set changes. That happens in only one case: the two sets have equal rank before the merge. One root must then point to the other, so the rank of the new set increases by 1.
C. The tree with the smaller rank is attached to the tree with the larger rank. After the comparison, the merge is carried out by changing p[i].

To illustrate the benefit of merging this way, consider the operation Union(5,3) on an existing pair of sets.

With the basic union, UnionSet(5,3): first find the sets of 5 and 3, i.e. FindSet(5) = 4 and FindSet(3) = 1; then p[1] = 4. This gives set 1.

With union by rank, UnionByRank(5,3): first find the sets of 5 and 3, i.e. FindSet(5) = 4 and FindSet(3) = 1; then, since the rank of 4 is less than the rank of 1, p[4] = 1. This gives set 2.

Comparing set 1 and set 2, the structure of set 2 is clearly more balanced, which makes later find operations cheaper.

2. Path compression

The key to path compression is to reshape the tree so that more nodes are attached directly to the root, which reduces the cost of find. At its core, every node visited during a find operation is re-linked directly to the root. In other words, the tree structure of the set is adjusted on every find.

    int FindSetWithPathCompression(int x)
    {
        if (p[x] == 0)   /* or: if (p[x] == x) */
            return x;
        else
            return p[x] = FindSetWithPathCompression(p[x]);
    }

Explanation: compare FindSetWithPathCompression() with FindSet(). The difference is the assignment p[x] = FindSetWithPathCompression(p[x]). What effect does it have? As an example, take a set stored as a rooted tree in which the path from 6 to the root is 6 -> 2 -> 1 -> 4, and look up the set number of 6. FindSet(6) visits the nodes 6, 2, 1, 4 and finds the root after four steps, returning the set number 4. FindSetWithPathCompression(6) also visits 6, 2, 1, 4, but because of the assignment p[x] = FindSetWithPathCompression(p[x]), afterwards p[6] = p[2] = p[1] = 4.
As a result, the rooted tree of the whole set is restructured. The change in tree structure simplifies subsequent lookups without changing the contents of the set (we do not care how the set is implemented and stored; we care only about which elements it contains and how they are related).
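Both heuristics can be checked on the examples from this section. The sketch below assumes the p[i] = 0 root convention and six elements; the names N, parent, rank_of, make_sets, find_set, union_by_rank, and load_chain_example are illustrative and not taken from the original. The helper load_chain_example() sets the parent links by hand, purely to recreate the path 6 -> 2 -> 1 -> 4 discussed above.

```c
#include <assert.h>

#define N 6   /* elements are numbered 1..N; index 0 is unused */

int parent[N + 1];    /* parent[i] == 0 means that i is a root */
int rank_of[N + 1];   /* upper bound on the height of each node */

void make_sets(void)
{
    int i;
    for (i = 1; i <= N; i++) {
        parent[i] = 0;
        rank_of[i] = 0;
    }
}

/* Find with path compression: every visited node is re-linked to the root. */
int find_set(int x)
{
    assert(x >= 1 && x <= N);
    if (parent[x] == 0)
        return x;
    return parent[x] = find_set(parent[x]);
}

/* The root of smaller rank is attached below the root of larger rank. */
void union_by_rank(int x, int y)
{
    int fx = find_set(x);
    int fy = find_set(y);
    if (fx == fy)
        return;
    if (rank_of[fx] > rank_of[fy]) {
        parent[fy] = fx;
    } else {
        parent[fx] = fy;
        if (rank_of[fx] == rank_of[fy])
            rank_of[fy]++;
    }
}

/* Hand-built chain 6 -> 2 -> 1 -> 4, bypassing union on purpose. */
void load_chain_example(void)
{
    make_sets();
    parent[6] = 2;
    parent[2] = 1;
    parent[1] = 4;
}
```

After load_chain_example(), a single find_set(6) returns 4 and leaves parent[6], parent[2], and parent[1] all equal to 4. Replaying union_by_rank(2, 1), union_by_rank(3, 1), and union_by_rank(4, 1) on fresh sets attaches 2, 3, and 4 directly below 1, instead of building the chain that the basic union produces for the same sequence.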
Fifth, an application example:

Description

If a family is very large, judging whether two people are relatives is not easy. Given a diagram of relatives, determine whether any two given people are related. Rules: if x and y are relatives and y and z are relatives, then x and z are also relatives. If x and y are relatives, then all relatives of x are relatives of y, and all relatives of y are relatives of x.

Input

First line: three integers n, m, p (n <= 5000, m <= 5000, p <= 5000), meaning that there are n people, m relative relations, and p queries.
Next m lines: two numbers Mi, Mj (1 <= Mi, Mj <= n) per line, meaning that Mi and Mj are relatives.
Next p lines: two numbers Pi, Pj per line, asking whether Pi and Pj are relatives.

Output

p lines, each containing 'Yes' or 'No': the answer to the i-th query is that the two people are or are not relatives.

Problem analysis

The first instinct is to solve this with graph connectivity, but with this much data, answering each query by traversing a graph is expensive. Note that the rules in the problem statement essentially define an equivalence relation between x and y. We can therefore solve the problem with a disjoint set. The task becomes: given n elements, build the sets from m equivalence relations, then decide for each of the p queried pairs whether the two elements are in the same set.

Code solution

    #include <stdio.h>
    #define MAX 5000

    int p[MAX + 1];   /* people are numbered 1..n; 0 marks a root */

    void MakeSet(int n);
    void UnionSet(int x, int y);
    int FindSetWithPathCompression(int x);

    int main(void)
    {
        int n, m, q;   /* q is the number of queries (p in the statement) */
        int a, b;
        int x, y;
        scanf("%d%d%d", &n, &m, &q);
        MakeSet(n);
        while (m--) {
            scanf("%d%d", &a, &b);
            UnionSet(a, b);
        }
        while (q--) {
            scanf("%d%d", &x, &y);
            if (FindSetWithPathCompression(x) == FindSetWithPathCompression(y))
                printf("Yes\n");
            else
                printf("No\n");
        }
        return 0;
    }

    void MakeSet(int n)
    {
        int i;
        for (i = 1; i <= n; i++)
            p[i] = 0;
    }

The other functions are as written in the sections above.
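Since the listing above relies on functions defined in earlier sections, the following is a self-contained sketch of the same approach that can be compiled on its own. It takes the relations and queries from arrays instead of standard input so that it is easy to exercise. The names parent, make_sets, find_set, union_sets, and answer_queries, as well as the sample data in the note below, are assumptions made for this illustration.

```c
#include <assert.h>

#define MAX 5000

int parent[MAX + 1];   /* people are numbered 1..n; 0 marks a root */

void make_sets(int n)
{
    int i;
    for (i = 1; i <= n; i++)
        parent[i] = 0;
}

/* Find with path compression. */
int find_set(int x)
{
    assert(x >= 1 && x <= MAX);
    if (parent[x] == 0)
        return x;
    return parent[x] = find_set(parent[x]);
}

/* Basic union, skipping pairs that are already in the same set. */
void union_sets(int x, int y)
{
    int a = find_set(x);
    int b = find_set(y);
    if (a != b)
        parent[b] = a;
}

/* Build the sets from m relations, then answer q queries.
   out[i] becomes 1 if the i-th queried pair is related, 0 otherwise. */
void answer_queries(int n, int m, int rel[][2], int q, int qry[][2], int out[])
{
    int i;
    make_sets(n);
    for (i = 0; i < m; i++)
        union_sets(rel[i][0], rel[i][1]);
    for (i = 0; i < q; i++)
        out[i] = (find_set(qry[i][0]) == find_set(qry[i][1]));
}
```

As a made-up example, with 5 people and the relations (1,2), (2,3), (4,5), the queries (1,3), (3,4), (5,4) produce 1, 0, 1: persons 1 and 3 are related through 2, 3 and 4 are in different families, and 5 and 4 are directly related.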



This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
