See http://www.th000fansxjj.com/wordpress/archives/32761 for more information

By admin on November 5th, 2009

First of all, I want to say that I am a very ordinary ACMer. I never took part in any computing or mathematics competitions in high school, and I have never been as talented as Ben. I wrote this article to give a little help to students who have just entered university or just joined the ACM team. I hope you can avoid some detours, and I hope you can help huali achieve the glory I did not achieve.

(1). Initial Stage

I first came into contact with ACM in my sophomore year. My only foundation was the C language course from my freshman year, so my language skills were weak. However, ACM does not demand much of the language at the beginning: as long as you have mastered the C course taught at school, you can start your ACM journey. This only holds at the beginning, though. Once you have progressed to a certain level, the code for each problem gets longer, and you will find that some C++ features can greatly simplify both your code and your thinking. C++ is also a very important language in its own right; it helps both with improving your ACM level and with laying a foundation for later Windows programming. As for the "I don't know where to start" problem that many people run into, here is my opinion. First, do some basic simulation problems on PKU. Why? For someone who has never done much programming, simulation problems improve your coding ability considerably. By coding ability I mean taking an idea and implementing it completely and correctly, in the shortest time, with code that is as beautiful as possible. As for what "beautiful" means, I think you will learn that gradually. You may ask where to find so many simulation problems. Just search Google for keywords such as "PKU water question" (that is, easy problems), or look directly at our school's problem classification (if you want it, you can send me an email). So how many problems do you need? I think about 30 to 50 simple simulation problems are enough for a beginner.

I started doing problems on PKU during the summer vacation before my sophomore year. At that time I had a "just playing" attitude and no plans for competitions. I spent that summer half playing and half learning; I basically had no idea how to approach algorithm problems, so I only solved a few dozen of the simplest ones. I had not made much progress by the time the summer training session ended. Because freshmen had military training at the end of the summer vacation, I went home with the rest of my class, put ACM aside, and did not touch it again. This decadent state lasted quite a long time. Some time after my sophomore year began, I started solving problems again for some reason (I really have forgotten why) and began to learn the various basic algorithms. It was really painful at first, but with the help of Tian Ge and Ben Niu, and by reading other people's problem-solving reports online, I eventually solved about 200 problems. By then the first semester of my sophomore year was almost over. Because I was planning to go abroad, I attended a New Oriental TOEFL training course during the winter vacation, so ACM was put aside for "justified reasons" and I hardly touched it all winter. After the second semester of my sophomore year started, I reviewed for the TOEFL for a while and started afresh once the exam was over. Strictly speaking, this is when my planned, hard-working ACM training began. I trained for the Shanghai Invitational, the first competition I had taken part in since starting ACM, and the first time I saw with my own eyes the gap between myself and others. The competition shocked me. We were all college students of the same age, yet the gap was so large; it really made me think. Only then did I realize how much time I had wasted and how far my level was behind that of others.
Shortly after that, the summer training session of my sophomore year began. Those two months were the period of fastest growth since I started ACM. Next I want to share some of my experiences and feelings, hoping they will help you.

(2). Key Points

First of all, I think the most important thing is to think independently and dare to try. Thinking independently means not getting into the habit of searching for other people's code online before doing anything yourself. You can ask others for their ideas and then try to implement them on your own; after that, you can read other people's code and learn the good parts. Daring to try means not being afraid of mistakes. Programming is special in that it lets you verify your theory on the spot. So do not keep your doubts to yourself; turn on the computer and try. Only in this way will you get happiness from every problem you solve. Second, write problem-solving reports: write down what you learned and what difficulties you met in each problem, then sort out and summarize that knowledge and link it together. If you can stick to this, I want to tell you that you are doing well, but please keep going, because the real fun in ACM, namely algorithms, is only just beginning. I think most people who are new to algorithms feel very confused. In my own experience, almost every time I got a problem I had no idea at all; it was rare to have even a little inspiration, and what I wrote after a long time might still be wrong. You do not need to worry about this, because a considerable number of people are just like you, and many of them have become excellent ACMers, of course as a result of their own efforts. So I believe that as long as you persist, the day of harvest will come. In the following sections I want to talk about the main areas you need to conquer.

(3) Dynamic Programming (DP)

As the saying goes, to judge a person's algorithm level you only need to look at how well he handles DP problems. In the ever-changing world of ACM, many algorithms have come and gone, but DP has almost never disappeared. If you ask me which type of problem is most likely to appear in a contest, I can tell you without hesitation that it is DP. This shows how important DP is. A problem type that appears in almost every contest should be hard, so why start with it? Indeed, DP is very difficult: it has many variants and covers a wide range of knowledge. Still, let me explain how to get started. Begin with the most basic DP models. LCS (longest common subsequence), LIS (longest increasing subsequence), maximum subsegment sum, the number tower (number triangle), and matrix chain multiplication are among the most typical problems. At first it may be hard to understand basic ideas of DP such as bottom-up computation and overlapping subproblems. For these problems, first read other people's code and the explanation in a book, and go over them repeatedly until you understand. Once you understand, type the code yourself. If there is something you really cannot understand, work on other problems for a while and then come back; things may suddenly become clear. After thoroughly understanding several typical DP problems, you can work on variants of these classics, such as the variants of maximum subsegment sum (maximum submatrix sum and maximum sum of M subsegments) and the variant of LCS and LIS (longest common increasing subsequence). In addition, you can try some important applications of DP, the most important of which is the knapsack problem. The knapsack problem is a big branch of DP (I cannot find a better word to describe it).
It also has many variants: the most basic 01 knapsack, the extended multiple knapsack, complete knapsack, and grouped knapsack, and the generalized knapsack that appears frequently in tree DP (which will be introduced shortly). Next I will talk about the most basic ones, the 01 knapsack, the multiple knapsack, and the complete knapsack. The first is the simplest, the 01 knapsack. The pseudocode is as follows:

for i = 1..n
    for v = V..0
        f[v] = max{f[v], f[v - c[i]] + w[i]}
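The 01 knapsack pseudocode above can be sketched in C++ roughly as follows. This is only a minimal sketch: the function name zeroOneKnapsack, the 0-based vectors cost and value (standing in for c[] and w[]), and the assumption that all costs are non-negative integers are mine, not part of the original pseudocode.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// 0/1 knapsack with a one-dimensional table (hypothetical helper name).
// cost[i] and value[i] describe item i; capacity plays the role of V.
// Returns the best total value when each item is used at most once.
int zeroOneKnapsack(const std::vector<int>& cost,
                    const std::vector<int>& value, int capacity) {
    assert(cost.size() == value.size() && capacity >= 0);
    std::vector<int> f(capacity + 1, 0);
    for (std::size_t i = 0; i < cost.size(); ++i) {
        // v runs from large to small, so f[v - cost[i]] still holds the
        // result computed before item i was considered.
        for (int v = capacity; v >= cost[i]; --v) {
            f[v] = std::max(f[v], f[v - cost[i]] + value[i]);
        }
    }
    return f[capacity];
}
```

The inner loop stops at cost[i] instead of 0 simply because the item cannot fit into a smaller capacity; the update order is otherwise the same as in the pseudocode.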

Why do we need to iterate backwards here? The principle is simple. This is essentially the rolling-array idea, except that it needs only one array instead of two. In the traditional two-dimensional formulation, f[i][v] = max(f[i-1][v], f[i-1][v - c[i]] + w[i]). In the one-dimensional version, f[v] is computed from f[v'] (v' <= v) as it stood after the previous iteration of the outer loop. If v ran from small to large, f[v'] would already have been overwritten with this iteration's value by the time f[v] is computed, so v must run from large to small. The second problem, the multiple knapsack, can be converted into a 01 knapsack: several identical items with the same cost and value are treated as several distinct items that happen to have the same cost and value. That is, there is one more loop than in the 01 knapsack. Note that the capacity loop must still run from large to small, for the same reason as in the 01 knapsack. Finally there is the complete knapsack problem. The pseudocode is as follows:

for i = 1..n
    for v = 0..V
        f[v] = max{f[v], f[v - c[i]] + w[i]}
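As a rough C++ sketch of the complete knapsack pseudocode above, together with the multiple knapsack conversion described earlier (each copy handled as one more 01 pass): the function names, the count vector, and the assumption that every cost is a positive integer are my own choices for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Complete knapsack: every item type may be taken any number of times.
// Assumes every cost is positive.
int completeKnapsack(const std::vector<int>& cost,
                     const std::vector<int>& value, int capacity) {
    assert(cost.size() == value.size() && capacity >= 0);
    std::vector<int> f(capacity + 1, 0);
    for (std::size_t i = 0; i < cost.size(); ++i) {
        // v runs from small to large, so f[v - cost[i]] may already
        // include copies of item i.
        for (int v = cost[i]; v <= capacity; ++v) {
            f[v] = std::max(f[v], f[v - cost[i]] + value[i]);
        }
    }
    return f[capacity];
}

// Multiple knapsack: item i is available count[i] times. Each copy is
// treated as a separate 01 item, which adds one loop around the 01 update.
int multipleKnapsack(const std::vector<int>& cost,
                     const std::vector<int>& value,
                     const std::vector<int>& count, int capacity) {
    assert(cost.size() == value.size() && cost.size() == count.size());
    std::vector<int> f(capacity + 1, 0);
    for (std::size_t i = 0; i < cost.size(); ++i) {
        for (int copy = 0; copy < count[i]; ++copy) {
            for (int v = capacity; v >= cost[i]; --v) {  // large to small
                f[v] = std::max(f[v], f[v - cost[i]] + value[i]);
            }
        }
    }
    return f[capacity];
}
```

The copy-by-copy conversion is the simplest form and costs O(V * sum of counts); faster conversions exist but are outside what the text describes.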

The complete-knapsack pseudocode differs from the 01-knapsack pseudocode only in the order of the v loop. Why does this change work? First, think about why the 01 knapsack iterates v = V..0 in reverse. It is to ensure that the state f[i][v] in the i-th iteration is derived from the state f[i-1][v - c[i]]. In other words, it ensures that each item is selected at most once: when considering the choice "select item i", we rely on a sub-result f[i-1][v - c[i]] in which item i has definitely not been selected. The distinguishing feature of the complete knapsack is that each type of item can be taken an unlimited number of times. So when considering the choice "add one more item of type i", we need a sub-result f[i][v - c[i]] in which type i may already have been selected. Therefore we can, and must, iterate v = 0..V. This is why this simple program works. Here I would like to recommend the nine lectures on knapsack problems written by dd Niu of Zhejiang University, the most classic material for getting started with and improving at knapsack problems. Now let's talk about tree DP. Tree DP is still just DP, only built on a tree model. Although the idea is simple, it is one of the difficult parts of DP, and you need to understand it well and do more problems. Last is state-compression DP, another difficult part of DP. State compression means using binary numbers, or numbers in some other base, to represent a state so as to save space. Such state designs are usually clever, and the many bit operations involved are a great test of coding ability. Also, state-compression DP is often implemented with memoized search (memoized search is the recursive form of DP, the so-called top-down approach), which again involves search.
I recommend that you learn the relevant material first and then come back to this topic. Typical state-compression problems include Board Coverage and Artillery Positions.
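To make state compression and memoized search a little more concrete, here is a small sketch on a board-covering problem that I chose for illustration: counting the ways to cover a rows x cols board with 1x2 dominoes. It may not be the exact Board Coverage problem mentioned above, and the names countTilings and tilingsFrom are hypothetical. The state is (pos, mask): cells are numbered row by row, and bit k of mask says whether cell pos + k is already covered.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Top-down DP (memoized search) over the compressed state (pos, mask).
long long tilingsFrom(int pos, int mask, int rows, int cols,
                      std::vector<long long>& memo) {
    if (pos == rows * cols) return mask == 0 ? 1 : 0;
    long long& cached = memo[static_cast<std::size_t>(pos) * (1u << cols) + mask];
    if (cached != -1) return cached;
    if (mask & 1) {
        // The current cell is already covered; move to the next cell.
        return cached = tilingsFrom(pos + 1, mask >> 1, rows, cols, memo);
    }
    long long ways = 0;
    int row = pos / cols, col = pos % cols;
    if (row + 1 < rows) {
        // Vertical domino: also covers the cell directly below (pos + cols).
        ways += tilingsFrom(pos + 1, (mask >> 1) | (1 << (cols - 1)),
                            rows, cols, memo);
    }
    if (col + 1 < cols && !(mask & 2)) {
        // Horizontal domino: also covers the next cell in the same row.
        ways += tilingsFrom(pos + 1, (mask >> 1) | 1, rows, cols, memo);
    }
    return cached = ways;
}

// Number of ways to cover a rows x cols board with 1x2 dominoes.
long long countTilings(int rows, int cols) {
    assert(rows > 0 && cols > 0 && cols < 20);
    std::vector<long long> memo(
        static_cast<std::size_t>(rows) * cols * (1u << cols), -1);
    return tilingsFrom(0, 0, rows, cols, memo);
}
```

For example, countTilings(2, 2) is 2 and countTilings(8, 8) is 12988816. The same recurrence could be filled bottom-up; the recursive form is shown because it matches the top-down description in the text.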

(4). Search (including DFS, BFS, A*)

Search is also a very important part of ACM, and its scope is quite wide. The most basic form is depth-first search (DFS). DFS simply enumerates all possibilities by recursion to reach the desired result. A very important technique in search is pruning, that is, cutting off possibilities that do not need to be searched so as to make the program more efficient. The most famous DFS problems are Eight Queens, Sticks, and so on. There are in fact a great many DFS problems, and PKU has plenty for us to practice on. The other basic form is breadth-first search (BFS). The basic idea of BFS is to maintain a queue (a basic data structure, which I will explain in a later section). Each time, we take a node out of the queue, expand all of its possibilities, and put the newly expanded nodes into the queue to wait for their own expansion, until we find the answer or nothing more can be expanded. Typical BFS problems include knight moves and the eight puzzle. BFS has a very common optimization, bidirectional BFS. The idea is the same, except that we expand from both the start and the goal; when the two searches meet, the answer has been found. This saves a lot of space and time compared with ordinary BFS. BFS also has another common extension, priority-queue BFS. In a priority queue (one implementation of a queue, also explained in a later section) the head element is always the smallest, so each expansion takes the smallest element available. The smallest element here is really the best candidate so far, and this greedy strategy can speed up the search for the answer. Of course, the actual efficiency depends on the problem's data. Priority-queue BFS, which I have spent so long on, is actually a special case of A*.
A* is known in Chinese as heuristic search and is commonly used in AI. Its most basic application is finding the shortest path: an evaluation function estimates the value of the current vertex, and each expanded vertex is put into a priority queue according to that value, so the head of the queue that we take out each time is exactly the point that currently looks best. If the priority queue is implemented with STL (the C++ Standard Template Library, explained in the next section), A* needs hardly any more code than BFS, just one extra evaluation function. The hard part is designing a good evaluation function. The classic A* problems are Snake and the eight puzzle.
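The two basic searches described above can be sketched as follows. This is only an illustration under my own assumptions: countQueens is a DFS with pruning for the N queens problem (Eight Queens is the case n = 8), and knightDistance is a plain BFS for the knight moves problem on an n x n board. Both function names and the 0-based coordinates are hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// DFS: place one queen per row and prune columns and diagonals in use.
void placeQueens(int row, int n, std::vector<char>& colUsed,
                 std::vector<char>& diagUsed, std::vector<char>& antiUsed,
                 int& solutions) {
    if (row == n) { ++solutions; return; }
    for (int c = 0; c < n; ++c) {
        int d = row + c, a = row - c + n - 1;
        if (colUsed[c] || diagUsed[d] || antiUsed[a]) continue;  // pruning
        colUsed[c] = diagUsed[d] = antiUsed[a] = 1;
        placeQueens(row + 1, n, colUsed, diagUsed, antiUsed, solutions);
        colUsed[c] = diagUsed[d] = antiUsed[a] = 0;  // backtrack
    }
}

int countQueens(int n) {
    assert(n > 0);
    std::vector<char> colUsed(n, 0), diagUsed(2 * n - 1, 0), antiUsed(2 * n - 1, 0);
    int solutions = 0;
    placeQueens(0, n, colUsed, diagUsed, antiUsed, solutions);
    return solutions;
}

// BFS: minimum number of knight moves from (sr, sc) to (tr, tc),
// or -1 if the target cannot be reached.
int knightDistance(int n, int sr, int sc, int tr, int tc) {
    static const int dr[8] = {-2, -2, -1, -1, 1, 1, 2, 2};
    static const int dc[8] = {-1, 1, -2, 2, -2, 2, -1, 1};
    std::vector<std::vector<int> > dist(n, std::vector<int>(n, -1));
    std::queue<std::pair<int, int> > q;
    dist[sr][sc] = 0;
    q.push(std::make_pair(sr, sc));
    while (!q.empty()) {
        std::pair<int, int> cur = q.front();
        q.pop();
        if (cur.first == tr && cur.second == tc) break;
        for (int k = 0; k < 8; ++k) {
            int nr = cur.first + dr[k], nc = cur.second + dc[k];
            if (nr < 0 || nr >= n || nc < 0 || nc >= n) continue;
            if (dist[nr][nc] != -1) continue;  // already visited
            dist[nr][nc] = dist[cur.first][cur.second] + 1;
            q.push(std::make_pair(nr, nc));
        }
    }
    return dist[tr][tc];
}
```

Replacing the std::queue with a std::priority_queue ordered by an evaluation function would turn the second routine into the priority-queue BFS or A* form discussed above.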

(5). Applying the C++ STL

STL is the Standard Template Library of C++. It provides a considerable number of ready-made library functions and data structures. STL can greatly shorten our code, which in turn greatly reduces the chance of errors. So you may wonder why some people dislike STL. The reason is simple: we pay a considerable price for it, namely efficiency. Next I will briefly introduce simple uses of STL in ACM. First are the library functions. The most common are the sorting function sort and search functions such as find, lower_bound, and upper_bound, all of which simplify our code. Besides these, the most commonly used parts are the sequence containers and the associative containers. Sequence containers and their adaptors can replace some common basic data structures to a considerable extent: vector can replace variable-length arrays (and makes adjacency lists easy to implement), list can replace linked lists, stack can replace a hand-written stack, deque can replace a double-ended queue, and priority_queue can replace the priority queue mentioned earlier. Among the associative containers, map can implement a mapping between any two data types, and set can check whether an element exists in a collection.
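A few of the STL facilities listed above, shown in a short sketch. The helper functions (countLess, kSmallest, wordCount, countDistinct, buildAdjacency) are hypothetical names made up for this example; only the std:: calls inside them come from the library.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// sort + lower_bound: how many elements are strictly less than x.
int countLess(std::vector<int> a, int x) {
    std::sort(a.begin(), a.end());
    return static_cast<int>(std::lower_bound(a.begin(), a.end(), x) - a.begin());
}

// priority_queue as a min-heap: the k smallest elements in ascending order.
std::vector<int> kSmallest(const std::vector<int>& a, int k) {
    assert(k >= 0);
    std::priority_queue<int, std::vector<int>, std::greater<int> > heap(a.begin(), a.end());
    std::vector<int> out;
    while (k-- > 0 && !heap.empty()) {
        out.push_back(heap.top());
        heap.pop();
    }
    return out;
}

// map: an index from one type (string) to another (int).
std::map<std::string, int> wordCount(const std::vector<std::string>& words) {
    std::map<std::string, int> freq;
    for (std::size_t i = 0; i < words.size(); ++i) ++freq[words[i]];
    return freq;
}

// set: membership test and duplicate removal.
int countDistinct(const std::vector<int>& a) {
    return static_cast<int>(std::set<int>(a.begin(), a.end()).size());
}

// vector of vectors as an adjacency list for an undirected graph.
std::vector<std::vector<int> > buildAdjacency(
        int n, const std::vector<std::pair<int, int> >& edges) {
    std::vector<std::vector<int> > adj(n);
    for (std::size_t i = 0; i < edges.size(); ++i) {
        adj[edges[i].first].push_back(edges[i].second);
        adj[edges[i].second].push_back(edges[i].first);
    }
    return adj;
}
```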

(6) Basic Data Structures

Data structures are widely used in ACM, but few problems test a basic data structure alone. They generally play a supporting role, like the priority queue mentioned earlier. Hashing is also very commonly used. As mentioned above, during BFS we need to pick out, from the vertices produced by each expansion, the ones we need and put them into the queue. The ones we do not need are generally vertices whose state is the same as that of a previously searched vertex. To determine whether two states are the same, we often use a hash table to save the states already searched and check whether the state of each newly expanded vertex is already there. If not, we put it into the queue.
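The use of a hash table to filter repeated states during BFS can be sketched on the eight puzzle. This is an illustration under my own assumptions: the board is a 9-character string with '0' for the blank, the goal is "123456780", the function name eightPuzzleMoves is hypothetical, and std::unordered_map (C++11) stands in for a hand-written hash table.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>

// Minimum number of blank moves to reach "123456780", or -1 if unsolvable.
int eightPuzzleMoves(const std::string& start) {
    assert(start.size() == 9);
    const std::string goal = "123456780";
    const int dr[4] = {-1, 1, 0, 0};
    const int dc[4] = {0, 0, -1, 1};
    std::unordered_map<std::string, int> dist;  // hash of states seen so far
    std::queue<std::string> q;
    dist[start] = 0;
    q.push(start);
    while (!q.empty()) {
        std::string cur = q.front();
        q.pop();
        int d = dist[cur];
        if (cur == goal) return d;
        int p = static_cast<int>(cur.find('0'));
        int r = p / 3, c = p % 3;
        for (int k = 0; k < 4; ++k) {
            int nr = r + dr[k], nc = c + dc[k];
            if (nr < 0 || nr >= 3 || nc < 0 || nc >= 3) continue;
            std::string next = cur;
            std::swap(next[p], next[nr * 3 + nc]);
            // insert() succeeds only for a state not seen before;
            // repeated states are dropped instead of being queued again.
            if (dist.insert(std::make_pair(next, d + 1)).second) q.push(next);
        }
    }
    return -1;
}
```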

(7) Advanced Data Structures (including union-find sets, tree arrays, and segment trees)

There are also some relatively advanced applications of data structures, and these may be tested as topics in their own right. The first is the union-find set (disjoint set). Its basic idea is to choose one element of each set as the representative of that set; sets are merged and queried by operating on these representatives. Kruskal, the classic minimum spanning tree algorithm, also uses union-find to determine whether the two endpoints of an edge belong to the same set; this is described in the graph theory section below. There are plenty of union-find problems; typical ones include Food Chain, A Bug's Life, and the gang groups problem. A variant of union-find supports deleting an element from a set. The ordinary structure has no delete operation, but deletion is actually simple to implement: keep an index for each vertex, which initially points to the vertex itself. To delete an element, first create a new vertex that does not belong to any set, then make the deleted vertex's index point to the new vertex. The second topic is the tree array (binary indexed tree), which supports updating an element and computing the sum of an interval in O(log n) each. Its idea is clever. In short, let c[] be the tree array and a[] the original array. The relationship between them is that c[i] is the sum of the 2^k elements ending at a[i], where k is the number of trailing zeros of i in binary. By the properties of bit operations, i & (-i) = 2^k, and from this the basic operations of the tree array can be understood. Tree arrays also have many problems, such as pku_1990, pku_2828, and pku_2155. Finally, I would like to talk about the segment tree, which is a very powerful data structure.
It supports modifying and querying the elements of a range in O(log n). Unlike union-find and the tree array, the implementation of a segment tree is very flexible, and there is almost no fixed template that fits every problem. The basic idea, of course, is that each node represents a range and its left and right subtrees represent the left half and the right half of that range, defined recursively until the range is a single point or a single unit interval. The segment tree has several basic models and typical examples. To introduce the concept, I will start with a simple but classic application, the coloring problem. Take pku_2777 as an example. There is a board of length L cm, and each centimeter can be regarded as a unit interval. Colors are represented by numbers, and every interval initially has color 1. We need to perform O operations on this board, of two kinds: the first paints the range A to B with color C, and the second asks how many different colors there are in the range A to B. The simplest idea is to open an array a[] of length L, where a[i] stores the color of the i-th cell. But is that feasible? Look at the data range: L is up to 100,000 and O is up to 100,000, so the complexity is O(L*O), which obviously times out. So we use a segment tree to solve this problem. In fact, the name "segment tree" does not convey how powerful it is; I prefer its academic name, "interval tree". Like other trees, it can be maintained in O(log L) time, which means each paint or query operation finishes in O(log L) time. It is precisely because of this efficiency that the segment tree is applied to a whole series of problems such as RMQ and the area and perimeter of a union of rectangles.
Here I want to talk about the RMQ problem (finding the maximum or minimum value in a range). As we all know, RMQ has an offline algorithm with O(n log n) preprocessing and O(1) per query for the maximum or minimum of any interval (offline here means that values cannot be changed or inserted during the process), namely the ST (sparse table) algorithm. Compared with it, the segment tree has no efficiency advantage, but the ST algorithm has a limitation: it does not support online modification, while the segment tree has no such restriction. From this we can see how powerful the segment tree is. Another advanced application of the segment tree is maintaining extreme-value information over intervals. There are many typical problems, some of them difficult, such as pku_2482, pku_1151 (compute the area of the union of N rectangles), and pku_1177 (compute the perimeter of the union of N rectangles). The basic idea is to maintain each subtree recursively and use the information of the subtrees to maintain their parent.
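The three structures in this section can be sketched compactly as follows. These are minimal illustrations under my own assumptions, not templates for the problems listed above: DisjointSet uses path compression only, FenwickTree is the tree array with 1-based indices, and MinSegmentTree answers range-minimum queries with point updates (the simplest RMQ that supports modification). All type and member names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <numeric>
#include <vector>

// Union-find: every set is identified by a representative element.
struct DisjointSet {
    std::vector<int> parent;
    explicit DisjointSet(int n) : parent(n) {
        std::iota(parent.begin(), parent.end(), 0);
    }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    bool unite(int a, int b) {  // returns false if already in the same set
        a = find(a);
        b = find(b);
        if (a == b) return false;
        parent[a] = b;
        return true;
    }
};

// Tree array: c[i] covers the (i & -i) elements ending at index i.
struct FenwickTree {
    std::vector<long long> c;
    explicit FenwickTree(int n) : c(n + 1, 0) {}
    void add(int i, long long delta) {
        assert(i >= 1);
        for (; i < static_cast<int>(c.size()); i += i & (-i)) c[i] += delta;
    }
    long long prefixSum(int i) const {
        long long s = 0;
        for (; i > 0; i -= i & (-i)) s += c[i];
        return s;
    }
    long long rangeSum(int l, int r) const { return prefixSum(r) - prefixSum(l - 1); }
};

// Segment tree: node covers [lo, hi]; its children cover the two halves.
struct MinSegmentTree {
    int n;
    std::vector<int> tree;
    explicit MinSegmentTree(const std::vector<int>& a)
        : n(static_cast<int>(a.size())), tree(4 * a.size(), INT_MAX) {
        assert(n > 0);
        build(1, 0, n - 1, a);
    }
    void build(int node, int lo, int hi, const std::vector<int>& a) {
        if (lo == hi) { tree[node] = a[lo]; return; }
        int mid = (lo + hi) / 2;
        build(2 * node, lo, mid, a);
        build(2 * node + 1, mid + 1, hi, a);
        tree[node] = std::min(tree[2 * node], tree[2 * node + 1]);
    }
    void update(int node, int lo, int hi, int pos, int value) {
        if (lo == hi) { tree[node] = value; return; }
        int mid = (lo + hi) / 2;
        if (pos <= mid) update(2 * node, lo, mid, pos, value);
        else update(2 * node + 1, mid + 1, hi, pos, value);
        // The parent is maintained from the information of its children.
        tree[node] = std::min(tree[2 * node], tree[2 * node + 1]);
    }
    int query(int node, int lo, int hi, int l, int r) const {
        if (r < lo || hi < l) return INT_MAX;
        if (l <= lo && hi <= r) return tree[node];
        int mid = (lo + hi) / 2;
        return std::min(query(2 * node, lo, mid, l, r),
                        query(2 * node + 1, mid + 1, hi, l, r));
    }
    void update(int pos, int value) { update(1, 0, n - 1, pos, value); }
    int query(int l, int r) const { return query(1, 0, n - 1, l, r); }
};
```

A range-painting problem such as the coloring example would additionally need lazy propagation on the segment tree, which is left out here to keep the sketch short.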

(8). Strings (including the KMP algorithm, trie, and suffix array)

String processing is also a sizable area of knowledge in ACM, and it has considerable practical use as well. In fact I know little about strings, so I can only talk briefly about a few basic algorithms. KMP is the most basic linear-time algorithm for string matching. The core of the algorithm is understanding the next array, which is obtained by preprocessing the pattern string and which reduces the complexity of matching two strings to linear. However, KMP can only match one pattern against one text. What if we want to match multiple strings? The trie helps us here. A trie is a letter tree: each node has up to 26 children, one per English letter, and a string is inserted by marking nodes along its path. After inserting n strings, we can match against all n at the same time. The trie has a major disadvantage, which is that the space it needs can grow exponentially with the string length in the worst case. If the strings are all longer than about 15 characters, we should consider other methods. The last topic is the suffix array. The essence of the algorithm lies in understanding and applying the height array. Two of the National Team papers introduce the suffix array specifically, so I will not go into details here.
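The next array and the trie described above can be sketched as follows. The conventions here are my own assumptions: next[i] is the length of the longest proper prefix of pattern[0..i] that is also a suffix of it (other texts shift this by one), the trie handles lowercase letters only, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Preprocess the pattern into the next (failure) array.
std::vector<int> buildNext(const std::string& p) {
    std::vector<int> next(p.size(), 0);
    int k = 0;
    for (int i = 1; i < static_cast<int>(p.size()); ++i) {
        while (k > 0 && p[i] != p[k]) k = next[k - 1];
        if (p[i] == p[k]) ++k;
        next[i] = k;
    }
    return next;
}

// KMP: start positions of every occurrence of pattern in text, in O(n + m).
std::vector<int> kmpSearch(const std::string& text, const std::string& pattern) {
    std::vector<int> hits;
    if (pattern.empty()) return hits;
    std::vector<int> next = buildNext(pattern);
    int k = 0;
    for (int i = 0; i < static_cast<int>(text.size()); ++i) {
        while (k > 0 && text[i] != pattern[k]) k = next[k - 1];
        if (text[i] == pattern[k]) ++k;
        if (k == static_cast<int>(pattern.size())) {
            hits.push_back(i - k + 1);
            k = next[k - 1];  // keep going to find overlapping matches
        }
    }
    return hits;
}

// Trie over 'a'..'z': each node has up to 26 children.
struct Trie {
    struct Node {
        int child[26];
        bool isWord;
        Node() : isWord(false) { std::fill(child, child + 26, -1); }
    };
    std::vector<Node> nodes;
    Trie() : nodes(1) {}
    void insert(const std::string& w) {
        int cur = 0;
        for (std::size_t i = 0; i < w.size(); ++i) {
            int c = w[i] - 'a';
            assert(c >= 0 && c < 26);
            if (nodes[cur].child[c] == -1) {
                nodes[cur].child[c] = static_cast<int>(nodes.size());
                nodes.push_back(Node());
            }
            cur = nodes[cur].child[c];
        }
        nodes[cur].isWord = true;  // mark the end of an inserted string
    }
    bool contains(const std::string& w) const {
        int cur = 0;
        for (std::size_t i = 0; i < w.size(); ++i) {
            int c = w[i] - 'a';
            if (c < 0 || c >= 26 || nodes[cur].child[c] == -1) return false;
            cur = nodes[cur].child[c];
        }
        return nodes[cur].isWord;
    }
};
```

Because nodes are created only along inserted paths, this version uses space proportional to the total length of the inserted strings times 26, rather than allocating a full 26-ary tree in advance.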

(9) Graph Theory (including topics such as shortest paths, minimum spanning trees, and strongly connected components)

The reason I put graph theory at the end is that it has too many topics and covers too much ground, so I will only give a brief summary based on the problems I have done. The first topic is the shortest path. One kind is the all-pairs (multi-source) shortest path. The algorithm used is the classic Floyd algorithm with complexity O(N^3), which is quite simple to implement; its main idea is DP. The other kind is the single-source shortest path. The most primitive method is Bellman-Ford, but it is rarely used because it has a very good alternative, SPFA. The implementation of SPFA is a bit like BFS, and the code is very short. It can completely replace Bellman-Ford for deciding whether a graph contains a negative cycle. In addition, in my experience, on sparse graphs it often finds shortest paths faster than Dijkstra optimized with a priority queue. It is a really practical tool. Of course, the most famous algorithm, Dijkstra, cannot go unmentioned. It is the most commonly used shortest path algorithm, but because it is based on a greedy principle, you must remember that Dijkstra cannot handle graphs with negative edges. Dijkstra also has several classic variants: one extends Dijkstra to two-dimensional states to find shortest paths, and the other uses A* to find them. Have you noticed that as you learn more, the boundaries between different blocks of knowledge become less and less clear? That is because our brain is gradually forming a knowledge network in which every topic is organically linked to the others. The second big topic is the minimum spanning tree, which also has two well-known algorithms: Prim for dense graphs and Kruskal for sparse graphs (Kruskal uses the union-find set).
The minimum spanning tree also has some classic variants, such as the second-best spanning tree and the constrained minimum spanning tree. Another very important topic is bipartite matching, which involves many points: maximum matching in an unweighted bipartite graph, solved by the classic Hungarian algorithm, and minimum or maximum weight matching in a weighted bipartite graph, solved by the KM algorithm. Topics derived from maximum matching in an unweighted bipartite graph include the minimum cover problem and the minimum path cover problem. This area is heavily conceptual. Indeed, the biggest characteristics of graph theory are that it is heavily conceptual and has many variants; without a deep understanding of each problem and its classical algorithms, you will not be able to cope. Next is the Euler circuit problem. This type of problem is simple and comes in two kinds: deciding whether a graph has an Euler circuit, and finding one. The implementation is also simple; a single DFS is enough. There are also strongly connected components. The core of this technique is to find the strongly connected components of a graph, contract each one to a single point, and thereby turn the graph into a directed acyclic graph, which makes further processing easier. The last major topic is network flow. I have done little in this area. It mainly involves several classic network flow algorithms such as HLPP, ISAP, and EK. Network flow also has many variants, and practice is required.
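The three shortest path algorithms discussed in this section can be sketched as follows. This is a minimal illustration under my own assumptions: vertices are numbered from 0, the graph is an adjacency list of a hypothetical Edge struct, INF marks "unreachable", and spfa reports a negative cycle once some vertex has entered the queue more than n times.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

const long long INF = std::numeric_limits<long long>::max() / 4;

struct Edge { int to; long long weight; };
typedef std::vector<std::vector<Edge> > Graph;

// Floyd: dist is an n x n matrix (INF for no edge, 0 on the diagonal),
// updated in place to all-pairs shortest distances in O(N^3).
void floyd(std::vector<std::vector<long long> >& dist) {
    int n = static_cast<int>(dist.size());
    for (int k = 0; k < n; ++k)
        for (int i = 0; i < n; ++i) {
            if (dist[i][k] == INF) continue;
            for (int j = 0; j < n; ++j)
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
        }
}

// Dijkstra with a priority queue; requires non-negative edge weights.
std::vector<long long> dijkstra(const Graph& g, int source) {
    assert(source >= 0 && source < static_cast<int>(g.size()));
    typedef std::pair<long long, int> Item;  // (distance, vertex)
    std::vector<long long> dist(g.size(), INF);
    std::priority_queue<Item, std::vector<Item>, std::greater<Item> > pq;
    dist[source] = 0;
    pq.push(Item(0, source));
    while (!pq.empty()) {
        Item top = pq.top();
        pq.pop();
        int u = top.second;
        if (top.first > dist[u]) continue;  // outdated queue entry
        for (std::size_t i = 0; i < g[u].size(); ++i) {
            const Edge& e = g[u][i];
            if (dist[u] + e.weight < dist[e.to]) {
                dist[e.to] = dist[u] + e.weight;
                pq.push(Item(dist[e.to], e.to));
            }
        }
    }
    return dist;
}

// SPFA: queue-based Bellman-Ford. Returns false if a negative cycle is
// reachable from source; otherwise fills dist and returns true.
bool spfa(const Graph& g, int source, std::vector<long long>& dist) {
    int n = static_cast<int>(g.size());
    dist.assign(n, INF);
    std::vector<char> inQueue(n, 0);
    std::vector<int> pushes(n, 0);
    std::queue<int> q;
    dist[source] = 0;
    q.push(source);
    inQueue[source] = 1;
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        inQueue[u] = 0;
        for (std::size_t i = 0; i < g[u].size(); ++i) {
            const Edge& e = g[u][i];
            if (dist[u] + e.weight < dist[e.to]) {
                dist[e.to] = dist[u] + e.weight;
                if (!inQueue[e.to]) {
                    if (++pushes[e.to] > n) return false;  // negative cycle
                    q.push(e.to);
                    inQueue[e.to] = 1;
                }
            }
        }
    }
    return true;
}
```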

(10) ACM Competitions

I have participated in two formal competitions. One was the Shanghai Invitational in the second semester of my sophomore year, and the other was the Hefei regional in my junior year. Because of my limited level, I won only two bronze medals in the end. Next I want to share my personal views on team composition. First, the team needs at least one person with strong coding ability, who is mainly responsible for coding so as to speed up the easy problems. Second, ACM is becoming more and more fond of mathematics problems, and a person who is good at mathematics usually thinks broadly and can provide ideas to the whole team. The third is a person with wide knowledge of algorithms who has also done a large number of problems. Although he may not be especially strong in any one area, his rich problem-solving experience lets him sense which algorithm a problem needs and point the whole team in the right direction. Here my summary comes to an end. I hope it helps you.