ACM Experience summary [Reproduced]

Source: Internet
Author: User
By admin on November 5th, 2009
First of all, I want to say that I am a very ordinary ACMer. I never took part in any computer science or mathematics competition in high school, and I am nowhere near as talented as the big names. I wrote this article to give a little help to students who have just entered university or just joined the ACM team. I hope it saves you some detours, and I hope you can win for Huali the glory that I never achieved.

(1) Initial Stage

(2) Key Points

(3) Dynamic Programming (DP)
As the saying goes, you can judge a person's algorithmic level just by looking at the level of the DP problems he can solve. In the ever-changing world of ACM, algorithms come and go, but DP has almost never disappeared. If you ask me which type of problem is most likely to appear in a contest, I can tell you without hesitation: DP. You can see from this how important DP is; a problem of this type shows up in almost every contest.

Why start with DP, then, if DP is indeed hard, with many variants and a wide range of knowledge? Let me explain how to get started. Begin with the most basic DP models: LCS (longest common subsequence), LIS (longest increasing subsequence), maximum contiguous subsegment sum, the number tower, and matrix chain multiplication are among the most typical problems. At the beginning it may be hard to grasp basic ideas such as bottom-up computation, overlapping subproblems, and optimal substructure. For these problems, first read other people's code and the explanations in books, and go over them repeatedly until you understand; then type out the code yourself. If something still eludes you, look at other problems and come back to the earlier ones later; that often makes things suddenly clear.

After you have thoroughly understood several typical DP problems, move on to variants of these classics: for maximum subsegment sum, the maximum submatrix and the maximum sum of M subsegments; for LCS and LIS, the longest common increasing subsequence; and so on. After that, try some important applications of DP. The most important of these is the knapsack problem, which is a large branch of DP in its own right (I cannot find a better way to describe it).
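Before going deeper into knapsacks, here is a minimal bottom-up sketch of one of the basic models named above, LCS; the standalone function and its names are my own, for illustration only.

```cpp
#include <string>
#include <vector>
#include <algorithm>

// Longest common subsequence via bottom-up DP.
// dp[i][j] = LCS length of prefixes a[0..i) and b[0..j).
int lcs(const std::string& a, const std::string& b) {
    int n = a.size(), m = b.size();
    std::vector<std::vector<int>> dp(n + 1, std::vector<int>(m + 1, 0));
    for (int i = 1; i <= n; ++i)
        for (int j = 1; j <= m; ++j) {
            if (a[i - 1] == b[j - 1])
                dp[i][j] = dp[i - 1][j - 1] + 1;  // extend a common character
            else
                dp[i][j] = std::max(dp[i - 1][j], dp[i][j - 1]); // drop one
        }
    return dp[n][m];
}
```

The two ideas from the text are both visible here: each cell depends only on already-computed cells (bottom-up), and many paths through the table reuse the same subproblems (overlapping subproblems).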
The knapsack problem also has many variants: the most basic 0/1 knapsack, and its extensions, the multiple (bounded) knapsack, the complete (unbounded) knapsack, and the grouped knapsack; tree DP (introduced shortly) also applies the generic knapsack a great deal. Next I will discuss the most basic three: the 0/1 knapsack, the multiple knapsack, and the complete knapsack. First, the simplest, the 0/1 knapsack. The pseudocode is as follows:
for i = 1 .. n
    for v = V .. 0
        f[v] = max(f[v], f[v - c[i]] + w[i])
Why does the inner loop run backwards here? The principle is simple: this is essentially the rolling-array idea, except that we do not even need two arrays, just one. Why? In the traditional two-dimensional formulation, f[i][v] = max(f[i - 1][v], f[i - 1][v - c[i]] + w[i]). If the inner loop ran upwards, then by the time we computed f[v], the value f[v - c[i]] would already have been overwritten with the new i-th layer's value instead of still holding the (i-1)-th layer's value, which is an error; so the loop must run from large v to small v. The second problem is the multiple knapsack, which can be converted into a 0/1 knapsack: treat k identical copies of one kind of item as k different items with the same cost and value. That is, there is one more loop than in the 0/1 knapsack, and the capacity loop must still run from large to small, for the same reason as in the 0/1 knapsack. Finally, the complete knapsack problem. The pseudocode is as follows:
for i = 1 .. n
    for v = 0 .. V
        f[v] = max(f[v], f[v - c[i]] + w[i])
This pseudocode differs from the 0/1 knapsack pseudocode only in the direction of the v loop. Why is this change correct? First, think about why v = V .. 0 in the 0/1 knapsack must go backwards: it guarantees that the state f[i][v] in the i-th loop is derived from the state f[i - 1][v - c[i]]. In other words, it ensures that each item is selected at most once: when considering the strategy "select item i", we build on a sub-result f[i - 1][v - c[i]] in which item i has never been selected. Now, the defining feature of the complete knapsack is that each kind of item is available in unlimited quantity, so when considering the strategy "add one more copy of item i", we need precisely a sub-result f[i][v - c[i]] in which item i may already have been selected. Hence v = 0 .. V, and that is why this simple program is correct. Here I would like to recommend the "Nine Lectures on the Knapsack Problem" written by dd of Zhejiang University, the classic material for getting started with, and then improving at, knapsack problems.

Now let's talk about tree DP. Tree DP is still just DP, only built on a tree model; although the idea is simple, it is a genuinely hard point within DP, and you need to understand it well and solve more problems. Last comes state-compression DP, another hard point of DP. "State compression" means using binary (or another base) numbers to represent the state, thereby compressing the state space. This kind of state design is usually quite ingenious, and the many bit operations involved are also a real challenge to your coding ability. State-compression DP is often implemented by memoized search (memoized search is the other recursive form of DP, the so-called top-down form), which touches on search knowledge again.
It is recommended that you come back to this topic after learning the relevant search material. Typical state-compression problems include board covering and artillery positions.
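Since memoized search was just described as the top-down form of DP, here is a tiny illustrative sketch of it, using the number-tower problem mentioned earlier; the names are mine, and the -1 sentinel assumes non-negative tower values.

```cpp
#include <vector>
#include <algorithm>

// Number tower: from the apex, step down-left or down-right each row,
// maximizing the sum. Solved top-down with memoization.
// memo[r][c] == -1 means "not computed yet" (values assumed >= 0).
int best(const std::vector<std::vector<int>>& t,
         std::vector<std::vector<int>>& memo, int r, int c) {
    if (r == (int)t.size() - 1) return t[r][c];  // bottom row
    int& m = memo[r][c];
    if (m != -1) return m;                        // reuse cached subproblem
    m = t[r][c] + std::max(best(t, memo, r + 1, c),
                           best(t, memo, r + 1, c + 1));
    return m;
}

int numberTower(const std::vector<std::vector<int>>& t) {
    std::vector<std::vector<int>> memo(t.size());
    for (size_t i = 0; i < t.size(); ++i) memo[i].assign(t[i].size(), -1);
    return best(t, memo, 0, 0);
}
```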

(4) Search (including DFS, BFS, A*)
Search is also a very important part of ACM, and its scope is quite wide. First is the most basic technique, depth-first search (DFS). DFS simply enumerates all possibilities by recursion to reach the desired result. A very important technique in search is pruning: manually eliminating branches that need not be searched, to improve the program's efficiency. The most famous DFS problems are the eight queens, sticks, and so on; in truth there are far too many DFS problems to list, and PKU's judge has plenty for practice.

The other basic technique is breadth-first search (BFS). The basic idea is to maintain a queue (a basic data structure that I will explain in a later section): each time, take one node off the front of the queue, expand all of its possibilities, and push the resulting nodes onto the queue to await their own expansion, until we find the answer or nothing more can be expanded. Typical BFS problems include knight moves and the eight puzzle. BFS has a very common optimization, bidirectional BFS: the idea is the same, but we expand simultaneously from both the start and the goal, and when the two frontiers meet, the answer has been found. This saves a great deal of space and time compared with plain BFS. BFS has another common extension, priority-queue BFS (a priority queue is one implementation of a queue, also explained in a later section): the front of the queue is always the smallest element, so each expansion takes the currently smallest element, i.e. the currently best partial solution. This greedy idea speeds up the search for the answer, though the actual efficiency depends on the problem's data. Having said all this, BFS with a priority queue is in fact a special case of A*.
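As a concrete example of plain queue-driven BFS, here is a sketch of the knight-moves problem mentioned above (my own illustrative function, on a fixed 8x8 board):

```cpp
#include <queue>
#include <cstring>
#include <utility>

// BFS: fewest knight moves from (sr,sc) to (tr,tc) on an 8x8 board.
// The FIFO queue guarantees the first visit to a square is shortest.
int knightMoves(int sr, int sc, int tr, int tc) {
    static const int dr[] = {1, 1, -1, -1, 2, 2, -2, -2};
    static const int dc[] = {2, -2, 2, -2, 1, -1, 1, -1};
    int dist[8][8];
    std::memset(dist, -1, sizeof dist);      // -1 marks "not visited"
    std::queue<std::pair<int, int>> q;
    dist[sr][sc] = 0;
    q.push({sr, sc});
    while (!q.empty()) {
        auto [r, c] = q.front();
        q.pop();
        if (r == tr && c == tc) return dist[r][c];
        for (int k = 0; k < 8; ++k) {
            int nr = r + dr[k], nc = c + dc[k];
            if (nr >= 0 && nr < 8 && nc >= 0 && nc < 8 && dist[nr][nc] == -1) {
                dist[nr][nc] = dist[r][c] + 1;
                q.push({nr, nc});
            }
        }
    }
    return -1;   // unreachable (cannot happen on a full 8x8 board)
}
```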
A* is a heuristic search technique commonly used in AI. Its most basic application is finding shortest paths: evaluate each vertex with an evaluation function, and insert expanded vertices into a priority queue keyed by that value; then isn't the element we pop from the front each time exactly the currently most promising vertex? If STL (C++'s standard template library, also explained in a later section) is used to implement the priority queue, A* is barely any more code than BFS; it is nothing more than BFS plus an evaluation function. The real question is how to design a good evaluation function. Classic A* problems are the snake puzzle and the eight puzzle.
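To keep the sketch short, here is A* on a simple obstacle grid rather than the snake or eight puzzle; the evaluation function is f = g + h with the Manhattan distance as h (an admissible heuristic), and all names are my own.

```cpp
#include <queue>
#include <vector>
#include <tuple>
#include <functional>
#include <cstdlib>

// A* on a grid: g = steps taken so far, h = Manhattan distance to the
// goal (never overestimates), nodes expanded in order of f = g + h.
// grid[r][c] == 1 means blocked. Returns the path length, or -1.
int astar(const std::vector<std::vector<int>>& grid,
          int sr, int sc, int tr, int tc) {
    int R = grid.size(), C = grid[0].size();
    auto h = [&](int r, int c) { return std::abs(r - tr) + std::abs(c - tc); };
    std::vector<std::vector<int>> g(R, std::vector<int>(C, -1));
    using Node = std::tuple<int, int, int, int>;          // (f, g, r, c)
    std::priority_queue<Node, std::vector<Node>, std::greater<Node>> pq;
    pq.emplace(h(sr, sc), 0, sr, sc);
    const int dr[] = {1, -1, 0, 0}, dc[] = {0, 0, 1, -1};
    while (!pq.empty()) {
        auto [f, gc, r, c] = pq.top();
        pq.pop();
        (void)f;                          // f only orders the queue
        if (g[r][c] != -1) continue;      // already settled optimally
        g[r][c] = gc;
        if (r == tr && c == tc) return gc;
        for (int k = 0; k < 4; ++k) {
            int nr = r + dr[k], nc = c + dc[k];
            if (nr >= 0 && nr < R && nc >= 0 && nc < C
                && !grid[nr][nc] && g[nr][nc] == -1)
                pq.emplace(gc + 1 + h(nr, nc), gc + 1, nr, nc);
        }
    }
    return -1;
}
```

Setting h to zero everywhere turns this back into plain priority-queue BFS, which is exactly the "special case of A*" remark above.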

(5) C++ STL Applications
STL is C++'s standard template library, which provides a considerable number of ready-made library functions and data structures. STL can greatly shorten our code, which in turn greatly reduces the chance of errors. You may then wonder: why do some people still dislike STL? The reason is simple: it exacts a price, namely efficiency. Next I will briefly introduce simple uses of STL in ACM. First, the library functions: the most common are the sorting function sort, and search functions such as find, lower_bound, and upper_bound, all of which simplify our code. Then there are the sequence containers and associative containers. Sequence containers can, to a considerable extent, replace common basic data structures: vector can replace variable-length arrays (and makes it easy to implement adjacency lists), list can replace linked lists, stack replaces the stack, deque the double-ended queue, and priority_queue the priority queue we mentioned earlier. Among the associative containers, map can index between any two types of data, and set can test whether an element exists in a collection.
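A few of these idioms in miniature; the functions and sample data below are my own illustrations, not part of any particular problem.

```cpp
#include <algorithm>
#include <vector>
#include <map>
#include <set>
#include <string>

// sort + lower_bound: index of the first element >= x in sorted order.
int firstAtLeast(std::vector<int> v, int x) {
    std::sort(v.begin(), v.end());
    return std::lower_bound(v.begin(), v.end(), x) - v.begin();
}

// map: index data of one type by another (here, name -> score).
int lookupScore(const std::string& name) {
    std::map<std::string, int> score = {{"alice", 90}, {"bob", 85}};
    auto it = score.find(name);
    return it == score.end() ? -1 : it->second;
}

// set: O(log n) membership test.
bool seen(int x) {
    std::set<int> s = {1, 4, 9, 16};
    return s.count(x) > 0;
}
```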

(6) Basics of Data Structures
Data structures are widely used in ACM, but basic data structures are rarely tested on their own; they generally play a supporting role, like the priority queue mentioned earlier. Hashing is also very commonly used. As noted above, during BFS we must filter each expanded vertex before putting it into the queue, discarding the unnecessary ones; usually those are vertices whose state equals that of some previously searched vertex. To decide whether two states are the same, we often use a hash table to store every state searched so far and check, for each expanded vertex, whether its state already exists; if not, we put it into the queue.
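This duplicate-state filter can be sketched with a hash set; here the state is serialized to a string (for the eight puzzle, say, "123405678"). The struct and names are mine, and an ordered set or a hand-rolled hash table would serve equally well.

```cpp
#include <unordered_set>
#include <string>

// Duplicate-state filtering during BFS: remember every expanded state
// in a hash set, and only enqueue states not seen before.
struct SeenStates {
    std::unordered_set<std::string> seen;
    // Returns true if the state is new (and records it); false if a
    // previously searched vertex had the same state.
    bool tryVisit(const std::string& state) {
        return seen.insert(state).second;
    }
};
```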

(7) Advanced Data Structures (including union-find, tree arrays, and segment trees)
There are also some more advanced data structures, and these knowledge points may be tested on their own. The first is the union-find set (disjoint-set union). Its basic idea is to choose one element of each set as the set's representative; merging sets then amounts to operations on these representative elements. Kruskal, a classic minimum-spanning-tree algorithm, also uses union-find, to decide whether the two endpoints of an edge already belong to the same component (this is covered in the graph theory section below). There are plenty of union-find problems; typical ones include the food chain, A Bug's Life, and gangs. One variant of union-find is deleting an element from a set. We know the ordinary union-find has no delete operation, but implementing one is actually very simple: give every vertex an index, initially pointing to itself; to delete an element, create a fresh vertex belonging to no set and redirect the deleted element's index to that new vertex.

The second structure is the tree array (binary indexed tree), which supports computing interval sums in O(log n) per operation. Its idea is clever; to summarize: let c[] be the tree array and a[] the original array; then c[i] is the sum of the 2^k elements ending at a[i], where k is the number of trailing zero bits of i written in binary. From the properties of bit operations we get, for any i, i & (-i) = 2^k, and with that the basic functions of a tree array can be understood. There are many tree-array problems, such as pku_1990, pku_2828, and pku_2155.

Finally, I would like to talk about the segment tree, a very powerful data structure.
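A minimal sketch of the tree array just described, before we get to segment trees; it is 1-indexed, and the struct and method names are my own.

```cpp
#include <vector>

// Fenwick (binary indexed) tree: c[i] covers the 2^k elements ending
// at position i, where 2^k = i & (-i). Point update and prefix sum
// both run in O(log n).
struct Fenwick {
    std::vector<int> c;
    explicit Fenwick(int n) : c(n + 1, 0) {}
    void add(int i, int delta) {              // a[i] += delta
        for (; i < (int)c.size(); i += i & (-i)) c[i] += delta;
    }
    int prefix(int i) const {                 // sum of a[1..i]
        int s = 0;
        for (; i > 0; i -= i & (-i)) s += c[i];
        return s;
    }
    int range(int l, int r) const { return prefix(r) - prefix(l - 1); }
};
```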
The segment tree supports modifying and querying intervals in O(log n) per operation. Unlike union-find and the tree array, its implementation is very flexible; there is almost no fixed template that fits every problem. The most basic idea, of course, is that each node represents an interval, its left and right subtrees represent the left and right halves of that interval, and so on recursively, until an interval is a single point or a single unit. The segment tree has several basic models and typical examples. First, to introduce the concept, a simple but classic application: the painting problem. Take pku_2777 as an example. There is a board of length L centimeters; each centimeter can be regarded as a unit interval, colors are represented by numbers, and every interval initially has color 1. We must now perform O operations on the board, of two kinds: the first paints the range A to B with color C; the second asks how many colors appear in the range A to B. The simplest idea is an array a[] of length L, where a[i] stores the color of cell i, but is that feasible? Look at the data range: L and O both go up to 100,000, so this algorithm's O(LO) complexity obviously times out. Therefore we choose the segment tree to solve the problem. In fact, the name "line segment tree" does not convey its power well; I prefer its more academic name, "interval tree": like other trees, it can be maintained in O(log L) time per operation. It is precisely this efficiency that lets the segment tree be fully applied to a whole series of problems such as RMQ, merging rectangle areas, and perimeter problems.
Here I want to mention the RMQ problem (finding the maximum or minimum value in a range). As is well known, RMQ has an algorithm with O(n log n) preprocessing and O(1) queries for the max/min of any interval: the ST (sparse table) algorithm ("offline" here meaning the values cannot be changed or inserted dynamically). Against it, the segment tree has no efficiency advantage, but the ST algorithm's limitation is exactly that it does not support online updates, while the segment tree has no such restriction; from this you can again see how powerful the segment tree is. Another advanced application of the segment tree is maintaining extremal information over intervals. There are many typical and somewhat difficult problems: pku_2482, pku_1151 (the area of the union of N rectangles), pku_1177 (the perimeter of the union of N rectangles), and so on. The basic idea in maintaining extremal values is to maintain each subtree recursively and use the children's information to maintain the parent.
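As a minimal illustration of "use the children's information to maintain the parent", here is a segment tree maintaining interval minima with point updates; it is not the painting problem itself (which additionally needs lazy propagation), and all names are my own.

```cpp
#include <vector>
#include <algorithm>
#include <climits>

// Segment tree for range minimum with point update. Each node covers
// an interval; its children cover the left and right halves, and the
// parent is recomputed from its children after every change.
struct SegTree {
    int n;
    std::vector<int> t;
    explicit SegTree(const std::vector<int>& a)
        : n(a.size()), t(4 * a.size(), INT_MAX) { build(1, 0, n - 1, a); }
    void build(int node, int l, int r, const std::vector<int>& a) {
        if (l == r) { t[node] = a[l]; return; }
        int mid = (l + r) / 2;
        build(2 * node, l, mid, a);
        build(2 * node + 1, mid + 1, r, a);
        t[node] = std::min(t[2 * node], t[2 * node + 1]);   // pull up
    }
    void update(int node, int l, int r, int pos, int val) {
        if (l == r) { t[node] = val; return; }
        int mid = (l + r) / 2;
        if (pos <= mid) update(2 * node, l, mid, pos, val);
        else update(2 * node + 1, mid + 1, r, pos, val);
        t[node] = std::min(t[2 * node], t[2 * node + 1]);
    }
    int query(int node, int l, int r, int ql, int qr) const {
        if (qr < l || r < ql) return INT_MAX;      // disjoint
        if (ql <= l && r <= qr) return t[node];    // fully covered
        int mid = (l + r) / 2;
        return std::min(query(2 * node, l, mid, ql, qr),
                        query(2 * node + 1, mid + 1, r, ql, qr));
    }
    void set(int pos, int val) { update(1, 0, n - 1, pos, val); }
    int rangeMin(int l, int r) const { return query(1, 0, n - 1, l, r); }
};
```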

(8) Strings (including the KMP algorithm, the trie, and suffix arrays)
String processing is also a considerable area of knowledge in ACM, with considerable practical application. Frankly, I know little about strings myself, so I can only touch briefly on a few basic algorithms. KMP is the most basic linear string-matching algorithm. Its core is the understanding of the next array, obtained by preprocessing the pattern, which successfully reduces the complexity of matching two strings to linear. However, KMP can only match two strings; what if we want to match many strings at once? The trie helps us solve this problem. A trie is really a letter tree: each node has 26 children, one per English letter, and a string is inserted by marking nodes along its path; after inserting n strings we can match against all n simultaneously. The trie's major drawback is memory: each node carries 26 child pointers, so the space it needs grows very quickly, and if the strings are longer than about 15 characters we should consider other methods. Last is the suffix array, whose main essence lies in understanding and applying the height array; two of the national team training papers introduce suffix arrays specifically, so I will not go into detail here.
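A sketch of the KMP next array and linear matching described above, in one common convention (next[i] = length of the longest proper border of p[0..i]); function names are my own.

```cpp
#include <string>
#include <vector>

// next[i] = length of the longest proper prefix of p[0..i] that is
// also a suffix of it. Built by reusing previously computed borders.
std::vector<int> buildNext(const std::string& p) {
    std::vector<int> nxt(p.size(), 0);
    for (size_t i = 1, k = 0; i < p.size(); ++i) {
        while (k > 0 && p[i] != p[k]) k = nxt[k - 1];  // fall back on border
        if (p[i] == p[k]) ++k;
        nxt[i] = k;
    }
    return nxt;
}

// Index of the first occurrence of p in s, or -1. O(|s| + |p|): the
// text pointer i never moves backwards.
int kmpFind(const std::string& s, const std::string& p) {
    if (p.empty()) return 0;
    std::vector<int> nxt = buildNext(p);
    for (size_t i = 0, k = 0; i < s.size(); ++i) {
        while (k > 0 && s[i] != p[k]) k = nxt[k - 1];
        if (s[i] == p[k]) ++k;
        if (k == p.size()) return i - p.size() + 1;
    }
    return -1;
}
```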

(9) Graph Theory (including shortest paths, minimum spanning trees, strongly connected components, and other topics)
I have put graph theory last because it contains too many knowledge points and touches too many areas; I will just summarize briefly from the problems I have done. First, shortest paths. For the all-pairs (multi-source) shortest path, the algorithm is the very classic Floyd algorithm, with complexity O(N^3); it is quite simple to implement, and its main idea is DP. Then there is the single-source shortest path. The most primitive method is Bellman-Ford, but it does not appear often, because it has a very good replacement: SPFA. SPFA's implementation somewhat resembles BFS, and the code is very short; it can completely replace Bellman-Ford for deciding whether a graph contains a negative cycle, and on sparse graphs it can even be faster than Dijkstra optimized with a priority queue. It really is a practical tool. Of course, the most famous, Dijkstra's algorithm, goes without saying: it is the most commonly used shortest-path algorithm, but because it relies on a greedy principle, note that Dijkstra cannot handle graphs with negative edges. Dijkstra also has several very classic variants: one extends the distance state to two dimensions to find the second-shortest path, and the other uses A* to find the k-th shortest path. Have you noticed that the boundaries between knowledge blocks blur as you learn more? That is because your brain is gradually forming a knowledge network in which the points link together organically. The second big topic is the minimum spanning tree, which also has two well-known algorithms: Prim for dense graphs and Kruskal for sparse graphs (the latter uses union-find).
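Returning to shortest paths for a moment, Dijkstra with an STL priority queue can be sketched as follows (illustrative names; adjacency list of (neighbor, weight) pairs, non-negative weights only, as noted above):

```cpp
#include <queue>
#include <vector>
#include <utility>
#include <functional>

// Dijkstra with a min-priority queue. dist[v] == -1 marks "unreachable".
std::vector<long long> dijkstra(
        const std::vector<std::vector<std::pair<int, int>>>& adj, int src) {
    std::vector<long long> dist(adj.size(), -1);
    using State = std::pair<long long, int>;          // (distance, vertex)
    std::priority_queue<State, std::vector<State>, std::greater<State>> pq;
    dist[src] = 0;
    pq.emplace(0, src);
    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d != dist[u]) continue;       // outdated queue entry, skip
        for (auto [v, w] : adj[u])
            if (dist[v] == -1 || d + w < dist[v]) {
                dist[v] = d + w;          // relax edge u -> v
                pq.emplace(dist[v], v);
            }
    }
    return dist;
}
```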
The minimum spanning tree also has some classic variants, such as the second-best spanning tree and the degree-constrained minimum spanning tree. Another very important topic is bipartite matching, which involves many knowledge points: for maximum matching in an unweighted bipartite graph there is the most classic Hungarian algorithm, and for minimum/maximum weighted matching in a weighted bipartite graph, the KM algorithm; derived from maximum matching in the unweighted case are the minimum cover problem and the minimum path cover problem. This material is quite conceptual; indeed, the biggest feature of graph theory is that it is heavy on concepts and rich in variants, and without a deep understanding of each problem and its classic algorithm you will not be able to cope. Next is the Euler circuit problem. This type is simple, with two kinds of questions: judging whether a graph contains an Euler circuit, and finding one; the implementation is also very simple, one DFS does it. There are also strongly connected components. The core here is to find the strongly connected components of a graph and then contract each into a point, converting the graph into a directed acyclic graph, which makes further work much easier. The last big topic is the network flow. I have done little with it; it mainly involves several classic maximum-flow algorithms such as HLPP, ISAP, and EK, and network flow has many variants, so practice is required.
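The Hungarian algorithm mentioned above fits in a few lines via augmenting paths; this sketch (my own names) takes left-side adjacency lists and the number of right-side vertices.

```cpp
#include <vector>

// Hungarian algorithm: maximum matching in an unweighted bipartite
// graph by repeatedly searching for augmenting paths.
// g[u] lists the right-side vertices adjacent to left vertex u.
struct Hungarian {
    const std::vector<std::vector<int>>& g;
    std::vector<int> matchR;    // matchR[v] = left vertex matched to v
    std::vector<bool> used;     // right vertices visited this round

    Hungarian(const std::vector<std::vector<int>>& g, int nRight)
        : g(g), matchR(nRight, -1) {}

    bool tryAugment(int u) {
        for (int v : g[u]) {
            if (used[v]) continue;
            used[v] = true;
            // v is free, or its current partner can be re-matched elsewhere
            if (matchR[v] == -1 || tryAugment(matchR[v])) {
                matchR[v] = u;
                return true;
            }
        }
        return false;
    }

    int maxMatching() {
        int result = 0;
        for (int u = 0; u < (int)g.size(); ++u) {
            used.assign(matchR.size(), false);
            if (tryAugment(u)) ++result;
        }
        return result;
    }
};
```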

(10) ACM Contests
I have taken part in two official contests: the Shanghai Invitational in my sophomore year, and the Hefei regional in my junior year. My level being limited, I won only two bronze medals in the end. Next, some personal thoughts on team composition. First, a team needs at least one member with strong coding ability, mainly responsible for coding, to increase the speed on easy problems. Also, since ACM is more and more fond of mathematical problems, and people with good mathematics usually think broadly, such a member can supply ideas to the whole team. The third member should be someone with wide algorithmic knowledge and a large volume of solved problems; he may not be the strongest at anything in particular, but his rich problem-solving experience ensures that he can sense which algorithm a problem needs and give the whole team direction. My summary ends here; I hope it helps you.
