A discussion of solving the maximum sub-rectangle problem with the idea of maximality

Source: Internet
Author: User

Summary
This paper discusses how the idea of maximality applies to the maximum (or optimal) sub-rectangle problem and its variants, which have appeared frequently in recent times. Two fairly general algorithms are analyzed, and examples illustrate how to choose between them and how to apply them.

Keywords: rectangle, obstacle point, maximal sub-rectangle

Body

I. The problem

Maximum sub-rectangle problem: given a rectangular grid containing some obstacle points, find the largest sub-rectangle that contains no obstacle point in its interior and whose sides are parallel to the coordinate axes.

This problem comes up repeatedly. For example, "Cow's Bath" from Winter Camp 2002 is a maximum sub-rectangle problem.

Winter Camp 2002, Cow's Bath
Brief problem statement (for the original statement, see the paper's attachment):
John wants to build a large bath on his rectangular cattle farm. The bath must not contain any milking point in its interior, although milking points may lie on its border. Both the farm and the bath are rectangles, the bath lies entirely within the farm, and the sides of the bath are parallel to or coincide with the sides of the farm. Find the largest possible area of the bath.
Constraints: the number of milking points S is at most 5000, and the farm size N x M is at most 30000 x 30000.

II. Definitions and notation

First, we clarify some concepts.

1. Valid sub-rectangle: a sub-rectangle whose interior contains no obstacle point and whose sides are parallel to the coordinate axes. As shown in the figure, the first rectangle is a valid sub-rectangle (even though an obstacle point lies on its boundary), while the second is not (its interior contains an obstacle point).
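
The distinction between "on the boundary" and "in the interior" is easy to get wrong, so here is a minimal sketch of the validity test. The function name is_valid and the coordinate convention (a rectangle given by its corners x1 < x2 and y1 < y2) are assumptions made for illustration and do not come from the original paper.

```python
def is_valid(x1, y1, x2, y2, obstacles):
    """Return True if no obstacle lies strictly inside [x1, x2] x [y1, y2].

    Obstacles on the border are allowed, which is why both comparisons
    are strict.
    """
    return not any(x1 < x < x2 and y1 < y < y2 for x, y in obstacles)


obstacles = [(2, 3), (5, 1)]
print(is_valid(0, 0, 2, 4, obstacles))  # True: (2, 3) is on the right edge
print(is_valid(0, 0, 3, 4, obstacles))  # False: (2, 3) is strictly inside
```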


2. Maximal valid sub-rectangle: a valid sub-rectangle such that no valid sub-rectangle both contains it and is larger than it. (For brevity, it is called a maximal sub-rectangle below.)

3. Maximum valid sub-rectangle: the largest of all valid sub-rectangles (there may be more than one). It is called the maximum sub-rectangle below.

III. The idea of maximality

Theorem 1: In a rectangle with obstacle points, the maximum sub-rectangle must be a maximal sub-rectangle.

Proof: Suppose the maximum sub-rectangle A is not a maximal sub-rectangle. Then, by the definition of a maximal sub-rectangle, there exists a valid sub-rectangle that contains A and is larger than A, which contradicts "A is the maximum sub-rectangle". Hence Theorem 1 holds.

IV. Two common algorithms derived from the characteristics of the problem

Theorem 1, though obvious, is very important. It suggests a way to solve the problem:
enumerate all maximal sub-rectangles and take the largest among them. The algorithms below are designed around this idea.
Convention: for convenience, the whole rectangle has size N x M and the number of obstacle points is S.

Algorithm 1

The idea of the algorithm is to find the maximum sub-rectangle by enumerating all maximal sub-rectangles. It follows that whenever the algorithm enumerates a sub-rectangle that is not valid, or not maximal, it is doing useless work, and that is where optimization is needed. To make sure every enumerated sub-rectangle is maximal, we start from the characteristics of maximal sub-rectangles.

Theorem 2: None of the four edges of a maximal sub-rectangle can be extended outward. More precisely, a valid sub-rectangle is a maximal sub-rectangle if and only if each of its edges either covers an obstacle point or coincides with the boundary of the whole rectangle.

Theorem 2 is clearly correct: if some edge of a valid sub-rectangle neither covers an obstacle point nor coincides with the boundary of the whole rectangle, then that edge can be pushed outward, so there is a larger valid sub-rectangle containing it. Theorem 2 gives a first algorithm for enumerating maximal sub-rectangles. For ease of handling, first add the four corners of the whole rectangle to the set of obstacle points. Then enumerate the top, bottom, left and right boundaries of the sub-rectangle (that is, enumerate the obstacle points they cover), and check whether the result is valid (whether its interior contains an obstacle point). Enumerating four boundaries and checking each candidate against all points takes O(S^5) time, which is obviously too high. Since the maximum sub-rectangle cannot contain obstacle points, enumerating all four boundaries clearly produces a large number of invalid sub-rectangles.
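
The brute-force enumeration just described can be sketched as follows. This is only an illustrative reference version, useful for checking faster solutions on small inputs. The function name brute_force_largest is hypothetical, the field is assumed to be [0, width] x [0, height], and instead of adding corner points the sketch adds the field boundaries 0, width and height to the candidate coordinates, which has the same effect.

```python
from itertools import combinations


def brute_force_largest(width, height, obstacles):
    """Try every pair of candidate x and y boundaries: O(S^4) rectangles,
    each checked against all S obstacles, so O(S^5) overall."""
    xs = sorted({0, width, *(x for x, _ in obstacles)})
    ys = sorted({0, height, *(y for _, y in obstacles)})
    best = 0
    for x1, x2 in combinations(xs, 2):        # x1 < x2
        for y1, y2 in combinations(ys, 2):    # y1 < y2
            # Valid means no obstacle strictly inside the rectangle.
            if not any(x1 < x < x2 and y1 < y < y2 for x, y in obstacles):
                best = max(best, (x2 - x1) * (y2 - y1))
    return best


print(brute_force_largest(10, 10, [(5, 5)]))  # 50
```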

Now consider enumerating only the left and right boundaries. Once the left and right boundaries are fixed, sort all the points lying between them from top to bottom; as shown in Figure 1, each strip between two consecutive points is a valid sub-rectangle. This takes O(S^3) time. Because every rectangle examined is guaranteed to be valid, far fewer candidates are enumerated than in the previous algorithm. Note, however, that although the enumerated sub-rectangles are valid, they are not necessarily maximal, so there is still room for optimization. Optimizing this algorithm leads to an efficient one.
Looking back at the algorithm above, the top and bottom edges of every enumerated rectangle already cover an obstacle point or coincide with the boundary of the whole rectangle; the problem lies in the left and right edges. Only those valid sub-rectangles whose left and right edges also cover an obstacle point or coincide with the boundary of the whole rectangle are the maximal sub-rectangles we need to examine, so the previous algorithm does a lot of useless work. The algorithm below (Algorithm 1) reduces this useless work, and it can be applied to many problems of this type.
The idea is to enumerate the left boundary of the maximal sub-rectangle first, then scan the obstacle points from left to right while updating the feasible upper and lower bounds, thereby enumerating all maximal sub-rectangles whose left edge passes through the chosen point. Consider the three points in Figure 2, and suppose we want all maximal sub-rectangles whose left edge passes through point 1. Sort the points to the right of point 1 by x-coordinate, then scan them from left to right while recording the current feasible upper and lower bounds.
Initially, the current upper and lower bounds are the upper and lower boundaries of the whole rectangle. The scan then begins. The first point encountered is point 2; taking point 2 as the right boundary together with the current upper and lower bounds gives a maximal sub-rectangle (Figure 3). Because later rectangles must not contain point 2, and point 2 is below point 1, the current lower bound must be updated: the y-coordinate of point 2 becomes the new lower bound. The second point encountered is point 3; taking the x-coordinate of point 3 as the right boundary again gives a rectangle with the required property (Figure 4), and this time the upper bound is updated accordingly. The scan continues in the same way: if the scanned point is above the current point (the point that determines the left boundary), the upper bound is updated; if it is below, the lower bound is updated; and if it is in the same row, the scan stops (because any rectangle extending past it would have area 0).



Does this enumerate all maximal sub-rectangles? Not yet: it only covers rectangles whose left edge covers an obstacle point, so we also need to handle the case where the left edge coincides with the left edge of the whole rectangle. This case splits into two sub-cases. In the first, the left edge coincides with the left edge of the whole rectangle and the right edge covers an obstacle point; here a similar scan can be used, taking each point as the right boundary and scanning from right to left. In the second, both the left and right edges coincide with the left and right edges of the whole rectangle; this can be handled in preprocessing. Sort all points by y-coordinate; each pair of adjacent points then gives the upper and lower bounds of a rectangle whose left and right edges coincide with those of the whole rectangle. Such a rectangle is clearly also a maximal sub-rectangle, so it must be enumerated. Because the upper-right and lower-right corners of the whole rectangle have been added as obstacle points, maximal sub-rectangles whose right edge coincides with the right edge of the whole rectangle (Figure 5) are not missed. Note also that a scanned point lying outside the current upper and lower bounds needs no processing.


With these two steps, all maximal sub-rectangles are enumerated. The time complexity of Algorithm 1 is O(S^2), which is enough to solve most maximum sub-rectangle problems and their variants.
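
The steps of Algorithm 1 can be sketched as below. This is a minimal sketch under stated assumptions, not the paper's original program: the function names are hypothetical, the field is taken to be [0, width] x [0, height] with obstacles as (x, y) pairs, the right-to-left scan is obtained by mirroring the x-coordinates and reusing the left-to-right scan, and the right edge of the field is handled by an explicit final step rather than by adding corner points.

```python
def _scan_from_left(width, height, obstacles):
    """Best area over rectangles whose left edge passes through an obstacle."""
    pts = sorted(obstacles)                   # by x, then y
    best = 0
    for i, (x0, y0) in enumerate(pts):
        lo, hi = 0, height                    # current feasible y-bounds
        stopped = False
        for x, y in pts[i + 1:]:
            # Points on the left edge or outside the bounds block nothing.
            if x == x0 or not (lo < y < hi):
                continue
            best = max(best, (x - x0) * (hi - lo))
            if y == y0:                       # same row: nothing beyond it
                stopped = True
                break
            if y > y0:
                hi = y                        # tighten the bound above y0
            else:
                lo = y                        # tighten the bound below y0
        if not stopped:                       # extend to the field's right edge
            best = max(best, (width - x0) * (hi - lo))
    return best


def largest_empty_rectangle(width, height, obstacles):
    pts = list(obstacles)
    mirrored = [(width - x, y) for x, y in pts]
    best = max(_scan_from_left(width, height, pts),
               _scan_from_left(width, height, mirrored))
    # Full-width strips between vertically adjacent obstacles.
    ys = sorted({0, height, *(y for _, y in pts)})
    for a, b in zip(ys, ys[1:]):
        best = max(best, width * (b - a))
    return best


print(largest_empty_rectangle(4, 4, [(1, 1), (3, 3)]))  # 9
```

Each of the two scans performs O(S) work per starting point after one O(S log S) sort, which matches the O(S^2) bound stated above.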

Although Algorithm 1 looks efficient, it has its limits. Its complexity depends only on the number of obstacle points. In some problems, however, S can be as large as N x M, and when S is that large the algorithm may be too slow. Can we design an algorithm whose running time depends on N and M instead? That would give us another option when Algorithm 1 does not work. Let us start again from the basics.

Algorithm 2

First, by Theorem 1, the maximum valid sub-rectangle must be a maximal sub-rectangle. Unlike in the previous algorithm, however, we no longer require every enumerated rectangle to be maximal; we only require that every maximal sub-rectangle be enumerated. This may seem worse than the previous algorithm, but it is not, because the previous algorithm is not perfect either: although every rectangle it examines is maximal, it still does a certain amount of useless work. When the obstacle points are very dense, it spends much of its time on useless comparisons. To solve this we must leave the earlier line of thought and look for a new algorithm. Note that the number of maximal sub-rectangles does not exceed the number of unit cells in the rectangle, so an algorithm with time complexity O(NM) is conceivable.
Definition: valid vertical segment: a vertical segment that covers no obstacle point except possibly at its two endpoints.

Hanging line: a valid vertical segment whose upper endpoint covers an obstacle point or reaches the upper boundary of the whole rectangle. As shown in the figure, the three valid vertical segments are all hanging lines.
The upper edge of any maximal sub-rectangle either covers an obstacle point or coincides with the upper boundary of the whole rectangle. So if a maximal sub-rectangle is cut by x-coordinate into many (in fact infinitely many) vertical segments parallel to the y axis, at least one of them must be a hanging line. Conversely, moving a hanging line as far as possible to the left and to the right sweeps out a sub-rectangle (not necessarily maximal, but one that can only be extended downward). This analysis gives an important theorem.

Theorem 3: If the valid sub-rectangle obtained by moving a hanging line as far as possible to the left and to the right is called the sub-rectangle corresponding to that hanging line, then the set of sub-rectangles corresponding to all hanging lines contains the set of all maximal sub-rectangles.

Here "as far as possible" in Theorem 3 means moving until an obstacle point or the boundary of the whole rectangle is reached.
By Theorem 3, enumerating all hanging lines enumerates all maximal sub-rectangles. Each hanging line corresponds one-to-one to its bottom point, so the number of hanging lines is (N-1) x M (every point except those in the top row is the bottom of exactly one hanging line, and none is missed). If each hanging line can be processed in O(1) time, the whole algorithm runs in O(NM). This gives hope of solving the problem.

The question now is how to process each hanging line in O(1) time. Every maximal sub-rectangle can be obtained by moving some hanging line left and right. So for a hanging line with a given bottom point we need three quantities: its top, and how far it can move to the left and to the right. For the hanging line with bottom (i,j), let its height be height[i,j], and let the farthest positions it can reach to the left and right be left[i,j] and right[i,j]. To reuse previously computed information, we define these three functions recursively.

For the hanging line whose bottom is the point (i,j):
If the point (i-1,j) is an obstacle point, then the hanging line with bottom (i,j) has height 1 and can be moved all the way to the left and right edges of the whole rectangle, i.e.

height[i,j] = 1
left[i,j] = 0
right[i,j] = M+1

If the point (i-1,j) is not an obstacle point, then the hanging line with bottom (i,j) is the hanging line with bottom (i-1,j) plus the segment from (i,j) to (i-1,j). So height[i,j] = height[i-1,j] + 1. The left and right limits are more troublesome; consider left[i,j] first. As shown in the figure below, the limit for (i,j) can be derived from the limit for (i-1,j):
left[i,j] = max(left[i-1,j], position of the first obstacle point to the left of (i-1,j))

right[i,j] is obtained in a similar way. Putting these together gives the three recurrences:

height[i,j] = height[i-1,j] + 1
left[i,j] = max(left[i-1,j], position of the first obstacle point to the left of (i-1,j); the boundary 0 also counts as an obstacle point)
right[i,j] = min(right[i-1,j], position of the first obstacle point to the right of (i-1,j); the boundary M+1 also counts as an obstacle point)


These recurrences reuse previously computed information, so each hanging line is processed in O(1) time. The sub-rectangle corresponding to the hanging line with bottom (i,j) has area (right[i,j] - left[i,j]) * height[i,j].
The final answer is therefore:
Result = max{ (right[i,j] - left[i,j]) * height[i,j] } (1 <= i < N, 1 <= j <= M)
The whole algorithm runs in O(NM) time and uses O(NM) space.
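
To make the recurrences concrete, here is a minimal sketch of the hanging-line technique on a cell grid. Note the swapped-in model: obstacles here occupy whole cells (grid[i][j] is truthy for a blocked cell) rather than lattice points as in the text, which is a common variant of the same idea. The function name, the 0-based indices and the half-open column range [left, right) are all assumptions for illustration. The three arrays are reused from row to row, so this version needs only O(M) extra space instead of O(NM).

```python
def largest_free_rectangle(grid):
    """Largest all-free axis-aligned rectangle of cells, in O(N*M) time."""
    if not grid or not grid[0]:
        return 0
    m = len(grid[0])
    height = [0] * m       # length of the hanging line ending at this cell
    left = [0] * m         # leftmost column the line can slide to (inclusive)
    right = [m] * m        # one past the rightmost column (exclusive)
    best = 0
    for row in grid:
        wall = 0           # first free column after the last blocked cell
        for j in range(m):
            if row[j]:
                height[j], left[j] = 0, 0     # reset for the next row
                wall = j + 1
            else:
                height[j] += 1
                left[j] = max(left[j], wall)
        wall = m           # nearest blocked column to the right
        for j in range(m - 1, -1, -1):
            if row[j]:
                right[j] = m                  # reset for the next row
                wall = j
            else:
                right[j] = min(right[j], wall)
                best = max(best, (right[j] - left[j]) * height[j])
    return best


grid = [
    [0, 0, 1],
    [0, 0, 0],
    [1, 0, 0],
]
print(largest_free_rectangle(grid))  # 4
```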

Comparison of the two algorithms:
The two fairly general algorithms above run in O(S^2) and O(NM) time respectively, and each suits a different situation. In terms of time complexity, the first is more efficient when the obstacle points are sparse. The running time of the second does not depend directly on the number of obstacle points, so it suits the case where the obstacle points are dense. (When there are few obstacle points, the area to process can be reduced by discretizing the point coordinates, but this is more cumbersome than simply using the first algorithm.)

V. Examples

1. Winter Camp 2002, Cow's Bath

Analysis:
The mathematical model of the problem is: given a rectangle and some obstacle points inside it, find the maximum valid sub-rectangle. This is exactly the maximum sub-rectangle problem discussed above, so both algorithms apply.

The strengths and weaknesses of the two algorithms on this problem are as follows:
The first algorithm can be applied directly without any modification. Its time complexity is O(S^2), where S is the number of obstacle points, and its space complexity is O(S).
The second algorithm needs some preprocessing first. Its complexity depends on the area of the farm, and the farm is very large (30000 x 30000), so the data must be discretized. After discretization the rectangle shrinks to S x S, so the time complexity is O(S^2) and the space complexity is O(S). Note, however, that to keep the algorithm correct, S extra points must be added during discretization, so the actual time and space requirements are larger and the program is more complex.
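
As a rough illustration of the discretization step, the sketch below maps the distinct coordinates along one axis to consecutive ranks. The helper name compress and the sample coordinates are made up for illustration. The sketch adds only the two field boundaries and does not model the extra points mentioned in the note above. After compression, areas still have to be computed from the original coordinates, which is why the sorted list is returned alongside the rank table.

```python
def compress(values, upper):
    """Map each distinct coordinate, plus the boundaries 0 and upper,
    to its rank in sorted order."""
    coords = sorted({0, upper, *values})
    rank = {v: k for k, v in enumerate(coords)}
    return coords, rank


xs, x_rank = compress([12000, 30, 12000, 29999], 30000)
print(xs)             # [0, 30, 12000, 29999, 30000]
print(x_rank[12000])  # 2
# Real width between ranks 1 and 3: xs[3] - xs[1] == 29969
```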

From the analysis above, in terms of both time and space efficiency and programming complexity, the first algorithm is the better choice for this problem.
