On solving the maximum sub-rectangle problem with the idea of maximality

Source: Internet
Author: User
【Abstract】

This paper addresses the maximum (or optimal) sub-rectangle problem and the variants of it that have appeared in recent contests, and introduces how the idea of maximality applies to this class of problems. Two fairly general algorithms are analyzed, and several examples illustrate how to choose between them and how to use them.

 

【Keywords】 rectangle, obstacle point, maximal sub-rectangle

 

【Main text】

i. The problem

Maximum sub-rectangle problem: given a rectangle containing some obstacle points, find the largest sub-rectangle that contains no obstacle point in its interior and whose sides are parallel to the coordinate axes.

This problem comes up repeatedly. For example, "Cow's Bath" from Winter Camp 2002 is a maximum sub-rectangle problem.

Winter Camp 2002, Cow's Bath

Brief problem statement (for the original statement, see the paper's attachment):

John wants to build a large bath on his rectangular cattle farm. The bath must not contain any milk-producing spot in its interior, although milk-producing spots may lie on its border. Both the farm and the planned bath are rectangles, the bath lies entirely within the farm, and its sides are parallel to, or coincide with, the sides of the farm. Find the largest possible area of the bath.

Parameter conventions: the number of milk-producing spots S is at most 5000, and the farm size N×M is at most 30000×30000.

ii. Definitions and description

First, some concepts need to be made precise.

1. A valid sub-rectangle is a sub-rectangle that contains no obstacle point in its interior and whose sides are parallel to the coordinate axes. As shown in the figure, the first rectangle is a valid sub-rectangle (even though an obstacle point lies on its boundary), while the second is not (its interior contains an obstacle point).

2. A maximal valid sub-rectangle is a valid sub-rectangle for which no other valid sub-rectangle both contains it and is larger than it. (For brevity, it is called a maximal sub-rectangle below.)

3. A maximum valid sub-rectangle is a valid sub-rectangle of largest area among all valid sub-rectangles (there may be more than one). It is called a maximum sub-rectangle below.
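The distinction between "on the boundary" and "in the interior" in definition 1 can be made concrete with a small check. This is only an illustrative sketch: the function name `is_valid` and the representation of a rectangle as a tuple (x1, y1, x2, y2) are assumptions made here, not part of the original paper.

```python
def is_valid(rect, obstacles):
    """Return True if no obstacle point lies strictly inside rect.

    rect is (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (an assumed layout);
    points on the boundary of rect are allowed, as in definition 1.
    """
    x1, y1, x2, y2 = rect
    return not any(x1 < x < x2 and y1 < y < y2 for x, y in obstacles)


obstacles = [(2, 3)]
print(is_valid((0, 0, 2, 5), obstacles))  # obstacle on the right edge -> True
print(is_valid((0, 0, 4, 5), obstacles))  # obstacle in the interior   -> False
```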

iii. The idea of maximality

"Theorem 1": In a rectangle containing obstacle points, a maximum sub-rectangle must be a maximal sub-rectangle.

Proof: Suppose a maximum sub-rectangle A is not maximal. Then, by the definition of a maximal sub-rectangle, there is a valid sub-rectangle that contains A and is larger than A, which contradicts "A is a maximum sub-rectangle". Hence "Theorem 1" holds.

iv. Starting from the characteristics of the problem: two commonly used algorithms

Theorem 1 is obvious, but it is very important. It immediately gives an approach to the problem:

Enumerate all maximal sub-rectangles and take the largest one. Both algorithms below are designed around this idea.

Convention: for convenience, the whole rectangle has size N×M and the number of obstacle points is S.

Algorithm 1

The idea is to find a maximum sub-rectangle by enumerating all maximal sub-rectangles. Under this idea, whenever the algorithm enumerates a sub-rectangle that is not valid, or not maximal, it has done useless work, and that is where optimization is needed. To make sure that every enumerated sub-rectangle is maximal, we start from the characteristics of maximal sub-rectangles.


"Theorem 2": None of the four edges of a maximal sub-rectangle can be extended outward. More precisely, a valid sub-rectangle is maximal if and only if each of its edges either covers an obstacle point or coincides with the boundary of the whole rectangle.



Theorem 2 is clearly correct: if some edge of a valid sub-rectangle neither covers an obstacle point nor coincides with the boundary of the whole rectangle, then that edge can be pushed outward, so there is a larger valid sub-rectangle containing it. Theorem 2 yields an algorithm for enumerating maximal sub-rectangles. For ease of handling, first add the four corner points of the whole rectangle to the set of obstacle points. Then enumerate the top, bottom, left and right boundaries of the sub-rectangle (that is, enumerate the obstacle points they cover) and check whether the result is valid (whether its interior contains an obstacle point). This takes O(S^5) time, which is clearly too slow. Since a maximal sub-rectangle cannot contain obstacle points, enumerating all four boundaries obviously produces a large number of invalid sub-rectangles.
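The four-boundary enumeration just described can be sketched as follows. This is a hedged illustration rather than the paper's program: the name `brute_force_max_area` is hypothetical, and instead of adding corner points to the obstacle set it adds the coordinates 0, width and height to the candidate boundaries, which has the same effect. It is only useful as a slow reference against which faster algorithms can be checked.

```python
from itertools import combinations


def brute_force_max_area(width, height, obstacles):
    """Try every pair of candidate x-boundaries and y-boundaries.

    By Theorem 2, each edge of a maximal sub-rectangle lies on an obstacle
    coordinate or on the outer boundary, so these candidates are sufficient.
    Roughly O(S^4) candidate rectangles, each checked in O(S).
    """
    xs = sorted({0, width} | {x for x, _ in obstacles})
    ys = sorted({0, height} | {y for _, y in obstacles})
    best = 0
    for x1, x2 in combinations(xs, 2):
        for y1, y2 in combinations(ys, 2):
            inside = any(x1 < x < x2 and y1 < y < y2 for x, y in obstacles)
            if not inside:
                best = max(best, (x2 - x1) * (y2 - y1))
    return best


print(brute_force_max_area(4, 4, [(2, 2)]))  # 8
```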

Consider enumerating only the left and right boundaries. Once they are fixed, sort all points lying between them from top to bottom, as shown in Figure 1, where each box represents a valid sub-rectangle. This takes O(S^3) time. Because every rectangle enumerated this way is valid, far fewer rectangles are enumerated than in the previous algorithm. Note, however, that although the enumerated sub-rectangles are all valid, they are not necessarily maximal, so this algorithm still has room for optimization. Optimizing it leads to an efficient algorithm.


Looking back at the algorithm above, the top and bottom boundaries of every enumerated rectangle cover an obstacle point or coincide with the boundary of the whole rectangle; the problem lies with the left and right boundaries. Only those valid sub-rectangles whose left and right boundaries also cover an obstacle point or coincide with the boundary of the whole rectangle are maximal sub-rectangles worth examining, so the previous algorithm does a lot of useless work. The following algorithm (Algorithm 1) reduces this useless work, and it can be used in many problems of this kind.

The idea is to first enumerate the left boundary of the maximal sub-rectangle, then scan the obstacle points from left to right while updating the feasible top and bottom boundaries, thereby enumerating all maximal sub-rectangles whose left boundary is fixed by this point. Consider the three points in Figure 2, and suppose we want to find all maximal sub-rectangles whose left boundary passes through point 1. Sort the points to the right of point 1 by x-coordinate, then scan them from left to right while recording the current feasible top and bottom boundaries.

Initially, the current top and bottom boundaries are the top and bottom boundaries of the whole rectangle. Then the scan begins. The first point encountered is point 2. Taking point 2 as the right boundary, together with the current top and bottom boundaries, gives a maximal sub-rectangle (Figure 3). Since later rectangles must not contain point 2, and point 2 lies below point 1, the current bottom boundary must be updated to the y-coordinate of point 2. The second point encountered is point 3. Taking the x-coordinate of point 3 as the right boundary gives another such rectangle (Figure 4), and the top boundary is then updated in the same way. In general, if a scanned point lies above the current point (the point that fixes the left boundary), update the top boundary; if it lies below, update the bottom boundary; if it lies in the same row, stop the scan (all later rectangles would have area 0). Since the two points at the upper-right and lower-right corners of the whole rectangle have been added to the set of obstacle points, maximal sub-rectangles whose right boundary coincides with the right side of the whole rectangle (Figure 5) are not missed. Note that a scanned point lying outside the current top and bottom boundaries does not need to be processed.

Does this enumerate all maximal sub-rectangles? So far only rectangles whose left boundary covers a point have been considered, so we still need to enumerate those whose left boundary coincides with the left side of the whole rectangle. There are two cases. In the first, the left boundary coincides with the left side of the whole rectangle and the right boundary covers an obstacle point; a similar method handles it by taking each point as the right boundary and scanning from right to left. In the second, the left and right boundaries coincide with the left and right sides of the whole rectangle; this case can be handled in preprocessing: sort all points by y-coordinate and, for every two adjacent points, take their y-coordinates as the top and bottom boundaries and the sides of the whole rectangle as the left and right boundaries. Such a rectangle is clearly also a maximal sub-rectangle, so it must be enumerated.

These two steps enumerate all maximal sub-rectangles. Algorithm 1 runs in O(S^2) time, which is enough to solve most maximum sub-rectangle problems and their variants.
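The scan described for Algorithm 1 might be sketched in Python as below. This is a minimal sketch under stated assumptions, not the paper's Pascal program: the names `sweep` and `max_empty_rectangle` are hypothetical, the rectangle is taken to be [0, width] × [0, height], and instead of adding corner points to the obstacle set, the case "right edge on the outer boundary" is handled explicitly after each scan. The right-to-left scan is obtained by mirroring the x-coordinates and reusing the same function.

```python
def sweep(points, width, height):
    """Largest valid rectangle whose left edge passes through an obstacle point."""
    pts = sorted(points)                      # scan order: increasing x
    best = 0
    for i, (xi, yi) in enumerate(pts):
        low, high = 0, height                 # feasible bottom and top boundaries
        blocked = False
        for j in range(i + 1, len(pts)):
            xj, yj = pts[j]
            # Points on the left edge line or outside the current band are skipped.
            if xj == xi or not (low < yj < high):
                continue
            best = max(best, (xj - xi) * (high - low))   # pts[j] fixes the right edge
            if yj == yi:                      # same row: nothing wider is possible
                blocked = True
                break
            if yj > yi:
                high = yj                     # tighten the top boundary
            else:
                low = yj                      # tighten the bottom boundary
        if not blocked:                       # right edge on the outer boundary
            best = max(best, (width - xi) * (high - low))
    return best


def max_empty_rectangle(width, height, obstacles):
    """Area of the largest rectangle with no obstacle point strictly inside."""
    best = sweep(obstacles, width, height)
    # Mirror the x-axis to cover "left edge on the boundary, right edge on a point".
    mirrored = [(width - x, y) for x, y in obstacles]
    best = max(best, sweep(mirrored, width, height))
    # Full-width strips between vertically adjacent obstacle rows.
    ys = sorted({0, height} | {y for _, y in obstacles})
    for y1, y2 in zip(ys, ys[1:]):
        best = max(best, width * (y2 - y1))
    return best


print(max_empty_rectangle(10, 10, [(3, 3), (7, 7)]))  # 49
```

Each of the two sweeps performs O(S) work per starting point after one O(S log S) sort, which matches the O(S^2) bound stated above.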


Although Algorithm 1 looks efficient, it has its limits. Its complexity depends only on the number of obstacle points. In some problems, however, S can be as large as N×M, and when S is that large the algorithm may not meet the time limit. Can we design an algorithm whose complexity depends on N and M instead, so that there is an alternative when Algorithm 1 is not good enough? Let us start again from the basics.

Algorithm 2

First, by Theorem 1, a maximum valid sub-rectangle must be a maximal sub-rectangle. Unlike the previous algorithm, however, we no longer require every enumerated rectangle to be maximal; we only require that every maximal sub-rectangle be enumerated. This may seem worse than the previous algorithm, but it is not, because the previous algorithm is not perfect either: although every rectangle it examines is maximal, it still does a certain amount of useless work. In particular, when the obstacle points are very dense, it performs many useless comparisons. To fix this we have to step away from the previous line of thought and look for a new algorithm. Note that the number of maximal sub-rectangles does not exceed the number of unit cells in the rectangle, so an algorithm with time complexity O(NM) may exist.

Definitions:

Valid vertical segment: a vertical segment that covers no obstacle point except possibly at its two endpoints.

Hanging line: a valid vertical segment whose upper endpoint covers an obstacle point or lies on the top boundary of the whole rectangle. As shown in the figure, all three valid vertical segments are hanging lines.

For any maximal sub-rectangle, the top boundary either covers an obstacle point or coincides with the top boundary of the whole rectangle. So if a maximal sub-rectangle is cut by x-coordinate into many (in fact infinitely many) vertical segments, at least one of them must be a hanging line. Conversely, moving a hanging line as far as possible to the left and to the right sweeps out a valid sub-rectangle (not necessarily a maximal one, since it may still be extendable downward). This analysis gives an important theorem.

"Theorem 3": Call the valid sub-rectangle obtained by moving a hanging line as far as possible to the left and to the right the sub-rectangle corresponding to that hanging line. Then the set of sub-rectangles corresponding to all hanging lines contains the set of all maximal sub-rectangles.

In Theorem 3, moving "as far as possible" means moving until an obstacle point or the boundary of the whole rectangle is reached.

By "Theorem 3", all maximal sub-rectangles can be enumerated by enumerating all hanging lines. Hanging lines correspond one-to-one with their bottom points, so the number of hanging lines is (N-1)×M (taking each point of the rectangle except those in the top row as the bottom endpoint gives every hanging line exactly once). If each hanging line can be processed in O(1) time, the whole algorithm runs in O(NM) time. This gives hope of solving the problem.

The question now is how to process each hanging line in O(1) time. Every maximal sub-rectangle can be obtained by moving some hanging line left and right. So for each hanging line, identified by its bottom point, we need three quantities: how far up it reaches, and how far it can move to the left and to the right. For the hanging line with bottom point (i,j), let its height be Height[i,j], and let the positions it can move to on the left and right be Left[i,j] and Right[i,j]. To make full use of information already computed, these three functions are given in recursive form.

For the hanging line with bottom point (i,j):

If the point (i-1,j) is an obstacle point, then clearly the hanging line with bottom (i,j) has height 1 and can move left and right all the way to the left and right edges of the whole rectangle, i.e.

Height[i,j] = 1

Left[i,j] = 0

Right[i,j] = M

If the point (i-1,j) is not an obstacle point, then the hanging line with bottom (i,j) is the hanging line with bottom (i-1,j) extended by the segment from (i-1,j) to (i,j). So Height[i,j] = Height[i-1,j] + 1. The left and right limits are more troublesome. Consider Left[i,j] first. As shown in the figure below, the limit for (i,j) can be derived from that of (i-1,j):

Left[i,j] = max(Left[i-1,j], position of the first obstacle point to the left of (i-1,j))

Right[i,j] is found in a similar way, except that the tighter of two limits on the right is the smaller one, so max becomes min. Altogether, the recurrences for the three quantities are:

Height[i,j] = Height[i-1,j] + 1

Left[i,j] = max(Left[i-1,j], position of the first obstacle point to the left of (i-1,j)), where the boundary 0 also counts as an obstacle point

Right[i,j] = min(Right[i-1,j], position of the first obstacle point to the right of (i-1,j)), where the boundary M also counts as an obstacle point

This makes full use of previously computed information, so each hanging line is processed in O(1) time. The sub-rectangle corresponding to the hanging line with bottom point (i,j) has area (Right[i,j] - Left[i,j]) * Height[i,j].

The final answer to the problem is therefore:

Result = max (Right[i,j] - Left[i,j]) * Height[i,j]   (1 <= i < N, 1 <= j <= M)

The whole algorithm has time complexity O(NM) and space complexity O(NM).
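The Height/Left/Right recurrences can be sketched in Python as follows. Several things here are assumptions made for illustration rather than the paper's formulation: the sketch works on a grid of unit cells (as in the matrix variant discussed later) instead of points in a continuous rectangle, obstacles are marked with '#' in a list of strings, column ranges are half-open [left, right), and the name `largest_free_rectangle` is hypothetical. It also keeps only one row of each array, so the extra space is O(M) rather than the O(NM) of the direct formulation.

```python
def largest_free_rectangle(grid):
    """Largest number of free cells in an axis-parallel rectangle of free cells.

    grid is a list of equal-length strings; '#' marks an obstacle cell.
    """
    if not grid or not grid[0]:
        return 0
    m = len(grid[0])
    height = [0] * m          # length of the hanging line ending at this cell
    left = [0] * m            # leftmost column the hanging line can reach
    right = [m] * m           # one past the rightmost column it can reach
    best = 0
    for row in grid:
        cur_left = 0          # first free column after the nearest obstacle on the left
        for j in range(m):
            if row[j] == '#':
                height[j], left[j] = 0, 0     # reset: no line passes through here
                cur_left = j + 1
            else:
                height[j] += 1
                left[j] = max(left[j], cur_left)
        cur_right = m         # column of the nearest obstacle on the right
        for j in range(m - 1, -1, -1):
            if row[j] == '#':
                right[j] = m
                cur_right = j
            else:
                right[j] = min(right[j], cur_right)
                best = max(best, (right[j] - left[j]) * height[j])
    return best


grid = [".#.##",
        ".#...",
        ".....",
        ".##.#"]
print(largest_free_rectangle(grid))  # 6
```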

Comparison of the two algorithms:

The two fairly general algorithms above have time complexity O(S^2) and O(NM) respectively, and they suit different situations. In terms of time complexity, the first algorithm works better when the obstacle points are sparse. The running time of the second algorithm is not directly related to the number of obstacle points (of course, the coordinates of the obstacle points can be discretized to reduce the area of the rectangle to be processed, but that is more trouble and no better than the first algorithm), so it suits situations where the obstacle points are dense.

v. Examples

The two algorithms above are now applied to specific problems.

1. Winter Camp 2002, Cow's Bath

Analysis:

The mathematical model of the problem: given a rectangle and some obstacle points in it, find the maximum valid sub-rectangle. This is exactly the maximum sub-rectangle problem discussed earlier, so both algorithms apply.

The strengths and weaknesses of the two algorithms on this problem are analyzed below:

The first algorithm can be applied directly without any modification. Its time complexity is O(S^2), where S is the number of obstacle points, and its space complexity is O(S).

For the second algorithm, some preprocessing is needed first. Its complexity depends on the area of the farm, which is very large (30000×30000), so the data must be discretized. After discretization the rectangle shrinks to S×S, so the time complexity is O(S^2) and the space complexity is O(S). Note that, for the algorithm to remain correct, S extra points have to be added during discretization, so the actual time and space requirements are larger and the program is more complex.
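The coordinate part of such a discretization might look like the sketch below. It only shows how large coordinates are mapped to small indices while the original values are kept for computing real widths and heights; it does not show the extra points that, according to the text, are needed for correctness. The function name `compress` is hypothetical.

```python
def compress(values, upper):
    """Map coordinates to ranks, keeping 0 and the outer boundary `upper`.

    Returns (coords, index): coords[k] is the original coordinate of rank k,
    and index[v] is the rank of coordinate v.
    """
    coords = sorted({0, upper, *values})
    index = {v: k for k, v in enumerate(coords)}
    return coords, index


xs, x_index = compress([7, 20000, 7], 30000)
print(xs)                 # [0, 7, 20000, 30000]
print(x_index[20000])     # 2
# A real width between two ranks is recovered from the original coordinates:
print(xs[2] - xs[1])      # 19993
```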

From the analysis above, the first algorithm is the better choice for this problem, both in time and space efficiency and in programming complexity.

2. OIBH Mock Contest 1, Advanced Group, Candy

Brief problem statement (for the original statement, see the paper's attachment):

A candy box is divided into N×M cells, and the cell in row i, column j contains a[i,j] candies. However, some cells of the box have been raided by rats, leaving holes. A rectangular candy box now has to be cut out of this box as quickly as possible; the new box must not contain any hole, and the total amount of candy kept in it should be as large as possible.

Parameter conventions: 1 ≤ N, M ≤ 1000

Analysis:

The first thing to note is that the model of this problem is a matrix, not a rectangle. In a matrix the number of points is finite, which gives rise to a new problem: the maximum-weight sub-matrix.

Definitions:

A valid sub-matrix is a sub-matrix that contains no obstacle point. Unlike a valid sub-rectangle, a valid sub-matrix must not contain obstacle points on its boundary either.

The weight of a valid sub-matrix (only valid sub-matrices have a weight) is the sum of the weights of all points it contains.

A maximum-weight valid sub-matrix is a valid sub-matrix of largest weight among all valid sub-matrices. It is called a maximum-weight sub-matrix below.

The mathematical model of this problem is the maximum-weight sub-matrix problem with positive weights. Once again, since all weights in the matrix are positive, a maximum-weight sub-matrix must be a maximal sub-matrix. So we only need to enumerate all maximal sub-matrices and pick the one with the largest weight. Both algorithms can solve the problem with slight modifications. Their strengths and weaknesses on this problem are analyzed below:

For the first algorithm, the number of obstacle points in the matrix is not fixed and can be as large as N×M, so the time complexity may reach O(N^2 M^2), with space complexity O(NM). In addition, because a matrix differs from a rectangle, some details need extra care.

The second algorithm can be used almost directly after a small modification, with time complexity O(NM) and space complexity O(NM).
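One possible form of that small modification is to combine the hanging-line scan with a two-dimensional prefix-sum table, so that the weight of each corresponding sub-matrix is obtained in O(1). The sketch below is an illustration under assumptions, not the paper's program: holes are represented by `None`, all candy counts are assumed positive, and the name `max_candy` is hypothetical.

```python
def max_candy(box):
    """Largest total of a hole-free sub-matrix; box[i][j] is a count or None."""
    n, m = len(box), len(box[0])
    # prefix[i][j] = sum of box[r][c] for r < i, c < j (holes count as 0).
    prefix = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            prefix[i + 1][j + 1] = ((box[i][j] or 0) + prefix[i][j + 1]
                                    + prefix[i + 1][j] - prefix[i][j])
    height, left, right = [0] * m, [0] * m, [m] * m
    best = 0
    for i in range(n):
        cur_left = 0
        for j in range(m):
            if box[i][j] is None:
                height[j], left[j] = 0, 0
                cur_left = j + 1
            else:
                height[j] += 1
                left[j] = max(left[j], cur_left)
        cur_right = m
        for j in range(m - 1, -1, -1):
            if box[i][j] is None:
                right[j] = m
                cur_right = j
            else:
                right[j] = min(right[j], cur_right)
                top = i + 1 - height[j]          # first row of the sub-matrix
                total = (prefix[i + 1][right[j]] - prefix[top][right[j]]
                         - prefix[i + 1][left[j]] + prefix[top][left[j]])
                best = max(best, total)
    return best


box = [[1, 2, None],
       [3, 4, 5],
       [None, 6, 7]]
print(max_candy(box))  # 22 (rows 1-2, columns 1-2: 4 + 5 + 6 + 7)
```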

As can be seen, the first algorithm is not well suited to this problem, so the second algorithm is the best choice.

3. USACO Training, Section 1.5.4, Big Barn

Brief problem statement (for the original statement, see the paper's attachment):

Farmer John wants to build a square barn on his square farm. There are some trees on the farm and he does not want to cut any down, so he needs to find the largest square site that contains no trees. Each tree can be treated as a point.

Parameter conventions: the farm is N×N and the number of trees is T, with N ≤ 1000 and T ≤ 10000.

Analysis:

This is a problem on a rectangle, but it asks for the largest sub-square. First, define some concepts.

1. A valid sub-square is a sub-square that contains no obstacle point in its interior.

2. A maximal valid sub-square is a valid sub-square that cannot be extended outward, called a maximal sub-square for short.

3. A maximum valid sub-square is the largest (there may be more than one) of all valid sub-squares, called a maximum sub-square below.

The model of this problem is somewhat special: it asks for the maximum sub-square in a rectangle containing obstacle points. Is it similar to the models of the previous two problems? Again we start from the properties of the maximum sub-square.

As before, the idea of maximality gives a theorem:

"Theorem 4": In a rectangle containing obstacle points, a maximum valid sub-square must be a maximal valid sub-square.

By "Theorem 4", we only need to enumerate all maximal sub-squares and pick the largest. What characterizes a maximal sub-square? Being maximal means it cannot be extended outward any further. For a maximal sub-rectangle, the necessary and sufficient condition is that all four edges cover an obstacle point ("Theorem 2"). Similarly, a valid sub-square is a maximal sub-square if and only if, of any two adjacent edges, at least one covers an obstacle point. From this an important theorem follows.

"Theorem 5": Every maximal sub-square is contained in at least one maximal sub-rectangle, and two non-adjacent sides of the maximal sub-square must coincide with edges of the maximal sub-rectangle that contains it.

By "Theorem 5", we only need to enumerate all maximal sub-rectangles and, for each, check whether the maximal sub-squares it contains (all maximal sub-squares contained in one maximal sub-rectangle have the same size) improve the current best. Thus the problem is essentially the same as the maximum sub-rectangle problem discussed earlier, and the same algorithms apply.

Since both Algorithm 1 and Algorithm 2 enumerate all maximal sub-rectangles, either can be used for this problem. Concretely: for each maximal sub-rectangle enumerated, as shown in the figure, if its side lengths are a and b, then the maximal sub-squares it contains have side length min(a, b).
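Applied to Algorithm 2, the min(a, b) rule means that each hanging line contributes min(width, height) instead of width × height. The sketch below illustrates this on a grid of unit cells; as before, the '#' encoding, the half-open column ranges and the name `largest_free_square` are assumptions made for illustration, and trees are treated as occupied cells rather than as points.

```python
def largest_free_square(grid):
    """Side length of the largest square made only of free cells.

    grid is a list of equal-length strings; '#' marks an occupied cell.
    """
    if not grid or not grid[0]:
        return 0
    m = len(grid[0])
    height, left, right = [0] * m, [0] * m, [m] * m
    best = 0
    for row in grid:
        cur_left = 0
        for j in range(m):
            if row[j] == '#':
                height[j], left[j] = 0, 0
                cur_left = j + 1
            else:
                height[j] += 1
                left[j] = max(left[j], cur_left)
        cur_right = m
        for j in range(m - 1, -1, -1):
            if row[j] == '#':
                right[j] = m
                cur_right = j
            else:
                right[j] = min(right[j], cur_right)
                # Sides a = width and b = height give a square of side min(a, b).
                best = max(best, min(right[j] - left[j], height[j]))
    return best


grid = [".#.##",
        ".#...",
        ".....",
        ".##.#"]
print(largest_free_square(grid))  # 2
```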

Depending on the relative sizes of N and T, the two algorithms perform differently. Their merits on this problem are analyzed below.

For the first algorithm the time complexity is O(T^2); for the second it is O(N^2). Since N < T (with N ≤ 1000 and T ≤ 10000, N^2 is at most 10^6 while T^2 can reach 10^8), the second algorithm is better in terms of time complexity. Since the space requirements of both algorithms are acceptable, the second algorithm is the better choice.

Below are the running times of implementations of the two algorithms on the USACO Training Program Gateway. When the data is large, Algorithm 2 is clearly more efficient than Algorithm 1.

Test      Algorithm 1    Algorithm 2

Test 1    0.009375       0.009375
Test 2    0.009375       0.009375
Test 3    0.009375       0.009375
Test 4    0.009375       0.009375
Test 5    0.009375       0.009375
Test 6    0.009375       0.00625
Test 7    0.021875       0.009375
Test 8    0.025          0.009375
Test 9    0.084375       0.0125
Test 10   0.3875         0.021875
Test 11   0.525          0.028125
Test 12   0.5625         0.03125
Test 13   0.690625       0.03125
Test 14   0.71875        0.03125
Test 15   0.75           0.034375

Above, using the idea of maximality and the two algorithms designed earlier, three representative examples were solved by transforming their models. The key to solving such problems is how to use the idea of maximality to transform the model and how to choose the algorithm.

vi. Summary

Algorithm design should start from the basic characteristics of the problem and look for the point from which it can be attacked. This paper introduced two algorithms that suit most maximum sub-rectangle problems and their variants. The starting point of both designs is the idea of maximality, which leads to methods for enumerating maximal sub-rectangles.

In efficiency, the two algorithms differ from case to case. One is designed around the obstacle points, so its complexity depends on the number of obstacle points; the other is designed around the whole rectangle, so its complexity depends on the area of the rectangle. Although the two algorithms look very different, they are the same in essence: both use the idea of maximality and find the answer by enumerating all maximal valid sub-rectangles.

It should be noted that, when solving real problems, it is not enough to apply existing algorithms mechanically; a comprehensive and thorough analysis of the problem is also needed to find the point of attack.

In addition, the complexity of the two algorithms above cannot be reduced any further while still using the idea of maximality, since the number of maximal valid sub-rectangles can itself be O(NM) or O(S^2). With other approaches it may in theory be possible to improve efficiency further and lower the complexity.

vii. Appendices:

1. Original statements of the example problems: see the paper attachment (.doc).

2. Programs for the examples: see the paper attachment (.doc).

Note: all programs were compiled and run with the Free Pascal IDE for DOS, version 0.9.2.
