Linear programming (LP) is one of the most classical optimization problems, and the most common method for solving it is the simplex method. This post walks through the simplex algorithm step by step and tries to explain it in the simplest, most detailed language possible.
In the middle-school curriculum we all had a brief encounter with linear programming. Back then we would analyze each constraint, draw a line on the two-dimensional plane, and obtain the feasible region; then we would draw the objective function as a line with fixed slope and translate it across the feasible region, reading the optimal value off the intercept on the y-axis. The optimum is always attained at a vertex of the (convex) feasible region. Take the following example:
\[\begin{equation}\begin{split}\max &\quad x_3+x_4 \\ \text{s.t.} &\quad -x_3+2x_4\leq 2 \\ &\quad 3x_3-2x_4\leq 6 \\ &\quad x_3, x_4\geq 0\end{split}\end{equation}\]
The blue area is the feasible region, the red line is the objective line with fixed slope, and the objective function reaches its maximum when the line is moved up to the point (4, 3).
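As a quick sanity check, this small LP can also be solved numerically. Below is a minimal sketch using scipy.optimize.linprog; any off-the-shelf LP solver would do, and this is only an illustration, not part of the derivation in this post. Since linprog minimizes, the objective is negated.

```python
import numpy as np
from scipy.optimize import linprog

# maximize x3 + x4  <=>  minimize -(x3 + x4)
c = np.array([-1.0, -1.0])
A_ub = np.array([[-1.0,  2.0],   # -x3 + 2*x4 <= 2
                 [ 3.0, -2.0]])  #  3*x3 - 2*x4 <= 6
b_ub = np.array([2.0, 6.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)], method="highs")
print(res.x, -res.fun)  # expected: approximately [4. 3.] and 7.0
```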
We will also prove that the optimal solution must lie at one of the vertices of the feasible region (a convex polytope). Taking this for granted for the moment, the natural strategy is an "improve-and-stop" routine: start from some vertex of the feasible region, move along an edge to an adjacent vertex and check whether the solution improves, and repeat until a stopping condition is met.
This raises several questions: why must the optimal solution be at a vertex? How do we obtain a vertex? How do we move along an edge to the next vertex? When do we stop? We answer these questions next, and once they are answered, the simplex method becomes obvious.
1. The optimal solution of a convex polytope is attained at a vertex
Consider the minimization problem with objective function \(\mathbf{c}^T \mathbf{x}\) and an optimal solution \(\mathbf{x}^{(0)}\) in the feasible region. Because any point of a convex polytope can be represented as a convex combination of its vertices, for the vertices \(\mathbf{x}^{(k)}, k=1,2,\cdots,n\) we have
\[\mathbf{x}^{(0)}=\sum_{k=1}^{n}\lambda_k\mathbf{x}^{(k)},\]
\[\sum_{k=1}^n \lambda_k=1, \quad \lambda_k\geq 0\]
Let \(\mathbf{x}^{(i)}\) be the vertex that minimizes \(\mathbf{c}^T \mathbf{x}\) among all vertices. Then
\[\begin{equation}\begin{split}\mathbf{c}^T\mathbf{x}^{(0)} &= \sum_{k=1}^{n}\lambda_k\mathbf{c}^T\mathbf{x}^{(k)} \\ &\geq \sum_{k=1}^{n}\lambda_k\mathbf{c}^T\mathbf{x}^{(i)} \\ &= \mathbf{c}^T\mathbf{x}^{(i)}\end{split}\end{equation}\]
Therefore there is always a vertex whose objective value is no greater than that of any interior point, i.e. the optimum is attained at a vertex.
2. How to obtain a vertex of the convex polytope?
We rely on the following theorem: for the constraint system \(\mathbf{Ax=b}\) that defines the feasible polytope, every vertex corresponds to a basis of \(\mathbf{A}\), i.e. a set of linearly independent columns, and conversely.
2.1 A vertex corresponds to a basis
The example above, in its slack form, has three original variables; after adding four slack variables the constraints become equations, and the resulting feasible region is shown in the figure. Take the vertex (0, 2, 0) for analysis: substituting into the constraints, we can compute the full solution \(x=(0,2,0,2,2,3,0)\). Keep the columns of the matrix \(\mathbf{A}\) corresponding to the nonzero components of \(x\), i.e. the blue columns in the figure, \(\mathbf{a}_2, \mathbf{a}_4, \mathbf{a}_5, \mathbf{a}_6\). These columns are linearly independent; since \(m<n\) (the number of constraints is less than the number of variables) and there are exactly \(m\) of them, they form a basis. The main thing to explain is why they are linearly independent. Suppose they were linearly dependent: then there would exist \(\lambda\neq\mathbf{0}\) with \(\lambda_2\mathbf{a}_2+\lambda_4\mathbf{a}_4+\lambda_5\mathbf{a}_5+\lambda_6\mathbf{a}_6=0\), which means we could pad it to \(\lambda=[0, \lambda_2, 0, \lambda_4, \lambda_5, \lambda_6, 0]\) with \(\mathbf{A}\lambda=0\). We could then construct two solutions different from \(x\): \(x'=x+\theta\lambda\) and \(x''=x-\theta\lambda\). Both satisfy \(\mathbf{Ax=b}\), and by choosing \(\theta\) to be a sufficiently small positive value, both also satisfy the nonnegativity constraints (the nonzero entries of \(\lambda\) sit at strictly positive entries of \(x\)). So both points lie in the convex polytope and \(x=\frac{1}{2}(x'+x'')\). But this is impossible, because a vertex of a convex polytope cannot be written as a convex combination of other feasible points. Hence these columns are linearly independent and form a basis.
Moreover, for each such solution we can rearrange the columns of \(\mathbf{A}\) (rearranging the solution vector accordingly) into block form \([\mathbf{B}, \mathbf{N}]\). Then \(\mathbf{x_B}=\mathbf{B}^{-1}\mathbf{b}\) and \(\mathbf{c^Tx=c_B^TB^{-1}b}\). These two useful formulas will help in understanding the simplex algorithm later, so we note them here.
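To make the correspondence concrete, here is a minimal NumPy sketch. The matrix A, right-hand side b, and cost c below are hypothetical placeholders (the actual data of the running example lives in the figure): given the indices of a candidate basis, the function checks that those columns are linearly independent and recovers \(x_B = B^{-1}b\) and the objective value \(c_B^T B^{-1} b\).

```python
import numpy as np

def basis_info(A, b, c, basis_idx):
    """Extract the basis columns, verify independence, and compute
    x_B = B^{-1} b and the objective value c_B^T B^{-1} b."""
    B = A[:, basis_idx]                      # columns of A forming the candidate basis
    assert np.linalg.matrix_rank(B) == len(basis_idx), "columns are linearly dependent"
    x_B = np.linalg.solve(B, b)              # basic part of the solution
    obj = c[basis_idx] @ x_B                 # c_B^T B^{-1} b
    return x_B, obj

# hypothetical data with m = 2 constraints and n = 4 variables
A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])
print(basis_info(A, b, c, [2, 3]))   # slack basis: x_B = b, objective 0
```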
2.2 A basis corresponds to a basic feasible solution (vertex)
With the knowledge above, given a basis \(\mathbf{B}\) we can directly construct \(\mathbf{x}=[\mathbf{B^{-1}b}, \mathbf{0}]^T\); it only remains to show that it cannot be represented as a convex combination of two feasible points different from itself. Suppose there were two such points \(\mathbf{x}_1, \mathbf{x}_2\) satisfying \(\mathbf{x}=\lambda_1\mathbf{x}_1+\lambda_2\mathbf{x}_2\) with \(\lambda_1,\lambda_2>0\). Because \(\mathbf{x_N}=0\) and all components of feasible solutions are nonnegative, we must have \(\mathbf{x_1}_{\mathbf{N}}=\mathbf{x_2}_{\mathbf{N}}=0\). And because \(\mathbf{Ax_1=Ax_2=b}\), it follows that \(\mathbf{x_1}_{\mathbf{B}}=\mathbf{x_2}_{\mathbf{B}}=\mathbf{B^{-1}b}\). That is, both points coincide with \(\mathbf{x}\), so \(\mathbf{x}\) is a vertex.
3. How to move from one vertex to another?
To study how to improve a solution, we need to know how to start from one vertex and walk along an edge to another vertex. We already know that a vertex corresponds to a basis, and that a matrix has multiple bases; can we move between vertices via a change of basis? Let's look at an example first.
The red dot in the figure corresponds to the full solution \(\mathbf{x}=[2,0,0,2,0,3,6]\), with corresponding basis \(\mathbf{B}=\{\mathbf{a}_1,\mathbf{a}_4,\mathbf{a}_6,\mathbf{a}_7\}\). Consider the non-basic vector \(\mathbf{a}_3\), the green column in the figure; it can be expressed as:
\[\mathbf{a}_3=0\mathbf{a}_1+1\mathbf{a}_4+1\mathbf{a}_6+1\mathbf{a}_7\]
Moving everything to one side and padding with the remaining columns (with zero coefficients), we get
\[0\mathbf{a}_1+0\mathbf{a}_2-1\mathbf{a}_3+1\mathbf{a}_4+0\mathbf{a}_5+1\mathbf{a}_6+1\mathbf{a}_7=0\]
Writing out the coefficients, \(\lambda=[0,0,-1,1,0,1,1]\), which is exactly the green vector in the figure (pointing in the opposite direction). So moving along this direction with a suitable step size \(\theta\) takes us to the next vertex. That is, the new vertex is related to the old one by:
\[\mathbf{x'}=\mathbf{x}-\theta\mathbf{\lambda} \quad (\theta>0)\]
How large should \(\theta\) be? It is easy to see that \(\mathbf{x'}\) satisfies \(\mathbf{Ax'=b}\) because \(\mathbf{A\lambda=0}\); what remains is to ensure that every component of \(\mathbf{x'}\) is nonnegative. For components with \(\lambda_i\leq0\), subtracting only makes them larger, so there is no problem. For components with \(\lambda_i>0\) we must be careful: to keep them nonnegative after the subtraction, we set
\[\theta=\min\limits_{\mathbf{a}_i\in \mathbf{B},\,\lambda_i>0}\frac{x_i}{\lambda_i}\]
This guarantees \(\mathbf{x'}\geq0\). In the example above \(\theta=2\), the new solution is \(\mathbf{x'}=[2,0,2,0,0,1,4]\), and the corresponding basis is \(\mathbf{B'}=\{\mathbf{a}_1,\mathbf{a}_3,\mathbf{a}_6,\mathbf{a}_7\}\). At this point everything looks perfect: we have found a way to move to the next vertex, namely, pick a non-basic vector, write it as a combination of the basic vectors, extract the coefficients, determine the step size, and obtain a new solution. But notice one small detail: \(\mathbf{B'}\) and \(\mathbf{B}\) differ by exactly one vector, as if \(\mathbf{a}_4\) were swapped out and \(\mathbf{a}_3\) swapped in. We call this process a pivot, and the corresponding part of the later algorithm implementation is called Pivot. So how do we guarantee that after swapping one vector the result is still a basis? Let's prove it:
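The ratio test and the update above can be checked directly with the numbers from this example. A minimal sketch, using only \(x\), \(\lambda\), and the set of basic indices, all taken from the text (indices are 0-based in the code):

```python
import numpy as np

x   = np.array([2.0, 0.0, 0.0, 2.0, 0.0, 3.0, 6.0])    # current vertex
lam = np.array([0.0, 0.0, -1.0, 1.0, 0.0, 1.0, 1.0])   # direction coefficients
basic = [0, 3, 5, 6]                                    # basis {a_1, a_4, a_6, a_7}, 0-based

# ratio test: only basic positions with lambda_i > 0 limit the step size
candidates = [(x[i] / lam[i], i) for i in basic if lam[i] > 0]
theta, leaving = min(candidates)
x_new = x - theta * lam

print(theta)    # 2.0, attained at index 3, i.e. a_4 leaves the basis
print(x_new)    # [2. 0. 2. 0. 0. 1. 4.], matching x' above
```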
Claim: \(\mathbf{B'}=\mathbf{B}-\{\mathbf{a}_l\}+\{\mathbf{a}_e\}\) is still a basis (\(l\) means leave, \(e\) means enter).
Proof: assume \(\mathbf{B'}\) were linearly dependent; then there would exist \((d_1,d_2,\ldots,d_{l-1},d_{l+1},\ldots,d_m, d_e)\neq0\) such that \(\sum_{k}d_k\mathbf{a}_k=0\). Since \(\mathbf{a}_e=\sum_{i=1}^m \lambda_i\mathbf{a}_i\), substituting gives:
\[(d_1+d_e\lambda_1)\mathbf{a}_1+\ldots+(d_e\lambda_l)\mathbf{a}_l+\ldots+(d_m+d_e\lambda_m)\mathbf{a}_m=0\]
Here is the key to the proof. When setting \(\theta\), if the index attaining the minimum of \(\frac{x_i}{\lambda_i}\) is \(p\), then the \(p\)-th component of the new solution must be 0, which is exactly the vector \(\mathbf{a}_p\) being swapped out; in the example above, \(p=4\). Since we only consider basic vectors with \(\lambda_i>0\), the swapped-out \(\mathbf{a}_l\) satisfies \(\lambda_l>0\). The old basis \(\mathbf{a}_1,\ldots,\mathbf{a}_m\) is linearly independent, so every coefficient in the formula above must vanish; in particular \(d_e\lambda_l=0\) with \(\lambda_l>0\) forces \(d_e=0\), and then \(d_1=d_2=\ldots=d_m=0\), contradicting the assumption. Therefore \(\mathbf{B'}\) is also linearly independent.
So far we have answered three of the questions, and the simplex algorithm is basically in sight: it repeatedly performs basis changes (swapping one vector in and one out) to move between vertices until it reaches the vertex attaining the optimal value. But a few details remain to be worked out. For example, how should the entering vector be selected? And when does the improvement process stop?
4. Selecting the entering vector and the stopping criterion
Taking the minimization problem as the standard form, our ultimate goal is to minimize \(\mathbf{c^Tx}\), so a very natural greedy idea is to make each improving step as large as possible: compute the difference between the updated objective value and the old one, and choose the move that decreases it most for the next iteration. How can this difference be computed quantitatively? Only with a quantitative, vectorized computation can we avoid comparing candidates one by one and keep the method efficient.
Suppose \(\mathbf{B}=\{\mathbf{a}_1, \mathbf{a}_2,\ldots, \mathbf{a}_m\}\); then for any non-basic vector \(\mathbf{a}_e\) we have \(\mathbf{a}_e=\lambda_1\mathbf{a}_1+\ldots+\lambda_m\mathbf{a}_m\). Writing \(\lambda\) out in full as \(\lambda=[\lambda_1,\lambda_2,\ldots,\lambda_m,0,0,\ldots,-1,\ldots,0]^T\) (the \(-1\) in position \(e\)), the difference is \(\mathbf{c^Tx'-c^Tx=c^T(-\theta\lambda)}=\theta(c_e-\sum_{\mathbf{a}_i\in \mathbf{B}}\lambda_ic_i)\). So we choose the \(\mathbf{a}_e\) that makes this value negative with the largest absolute value. And when should the search stop? Obviously when, for every \(\mathbf{a}_e\), this difference is greater than or equal to 0, i.e. the objective can no longer decrease. Therefore, each iteration first computes these differences; if some are negative, select the entering vector whose difference has the largest absolute value.
Now consider vectorizing these computations. First, can \(\lambda\) be vectorized? Since \(\mathbf{B}\lambda=\mathbf{a}_e\) (here \(\lambda\) denotes only its basic part), we have \(\lambda=\mathbf{B^{-1}}\mathbf{a}_e\). If we left-multiply the whole matrix \(\mathbf{A}\) by \(\mathbf{B^{-1}}\), something interesting happens: every non-basic column becomes the coefficient vector of its representation in the basis, and every basic column becomes a unit vector \(e_k\), so together the basic columns form an identity block. This is a key step in implementing the simplex algorithm; keep it in mind. Further, taking the basic part \(\mathbf{c_B}\) of \(\mathbf{c}\), the row vector \(\mathbf{c_B^TB^{-1}A}\) is exactly the vectorized form of \(\sum_{\mathbf{a}_i\in \mathbf{B}}\lambda_i\mathbf{c}_i\), computed for all non-basic columns at once. Subtracting it from \(\mathbf{c}\) gives the final form \(\mathbf{c^T-c_B^TB^{-1}A}\), called the test numbers (reduced costs) \(\bar{\mathbf{c}}\). It is easy to verify that the test number of every basic column is 0. Our goal is to iterate until \(\bar{\mathbf{c}}\geq0\); then for any feasible solution \(\mathbf{y}\) (\(\mathbf{Ay=b,y\geq0}\)), since \(\bar{\mathbf{c}}\geq0\) and \(\mathbf{y}\geq0\), we have \(\mathbf{c^Ty}\geq\mathbf{c_B^TB^{-1}Ay=c_B^TB^{-1}b=c_B^Tx_B=c^Tx}\), i.e. \(\mathbf{x}\) is optimal.
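The vectorized test numbers amount to one line of linear algebra. A minimal sketch, reusing the small hypothetical data from the earlier sketch (A, c, and the basic index list are assumptions, not the example in the figure); the entering column is picked here by the most negative test number, a common simplification of the greedy rule described above:

```python
import numpy as np

def reduced_costs(A, c, basic):
    """Compute the test numbers c_bar = c^T - c_B^T B^{-1} A."""
    B = A[:, basic]
    y = np.linalg.solve(B.T, c[basic])   # y solves B^T y = c_B, so y^T = c_B^T B^{-1}
    return c - A.T @ y

# hypothetical standard-form data: minimize c^T x,  A x = b,  x >= 0
A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])
c = np.array([-1.0, -2.0, 0.0, 0.0])
basic = [2, 3]                            # slack basis

c_bar = reduced_costs(A, c, basic)
print(c_bar)                              # entries for the basic columns are 0
if np.all(c_bar >= 0):
    print("current basic feasible solution is optimal")
else:
    e = int(np.argmin(c_bar))             # most negative test number enters the basis
    print("entering column:", e)
```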
5. The core of the simplex algorithm: the simplex tableau
Finally, after this long series of explanations, the simplex algorithm follows naturally. Still, it is not easy to assemble all of these pieces into a short, efficient, workable algorithm. First, the algorithm pseudocode:
And a very elegant simplex tableau:
Now let's analyze the algorithm and match each of its operations to the concepts described above.
- The simplex algorithm first calls InitializeSimplex to find an initial basic feasible solution. In some cases this is very simple: when \(\mathbf{b}\geq0\), the slack variables themselves give an initial basic feasible solution; otherwise the two-phase method, the big-M method, etc. may be needed, but that is not the focus here.
- In the while loop, as long as some \(c_e<0\) is found, the iteration continues. Lines 10 to 16 determine, via \(\theta\), which basic vector leaves.
- The most important and interesting part is the Pivot procedure, which cleverly replaces the complicated operations introduced above with simple Gaussian row transformations. Its vehicle is the simplex tableau: the upper-left corner is the negated objective value \(-z\), the rest of the first row is the test numbers \(\bar{\mathbf{c}}\), the lower-left column is the basic part of the solution (the other components are 0 and are not written), and the lower-right block is the matrix \(\mathbf{A}\) (after left-multiplication by \(\mathbf{B^{-1}}\)). It always keeps this two-part structure, and the basic columns always appear as an identity block. Note that each component of the solution column on the left corresponds strictly to one basic vector, i.e. they are kept in order. (A minimal code sketch of this pivot step appears after this list.)
- Now let's look at the Pivot procedure line by line. The input parameters tell us that the vector with index \(l\) is swapped out, so find its corresponding row, call it row \(l\); the basic index recorded for this row is replaced by \(e\). Why is the updated solution component \(\frac{b_l}{a_{le}}\)? Note that this value is exactly \(\theta\): \(b_l\) is the old \(x_i\) and \(a_{le}\) is \(\lambda_i\) (as explained above, after multiplying by \(\mathbf{B^{-1}}\) each column holds the representation coefficients). And why should the updated value be \(\theta\) itself? Go back to the formula \(\mathbf{x'=x-\theta\lambda}\): the position of \(b_l\) now corresponds to the newly entered vector, which was not a basic vector of \(\mathbf{x}\), so the value of \(\mathbf{x}\) at this position was 0; and we know \(\lambda\) has the value \(-1\) at the position of the entering vector, so \(0-(-1)\theta=\theta\).
- Lines 3 to 4 divide row \(l\) by \(a_{le}\) in order to turn \(a_{le}\) into 1, because the basic columns must always be kept in the form of an identity block.
- Line 8 performs the operation \(\mathbf{x'=x-\theta\lambda}\) to obtain the new solution.
- Line 10 is the Gaussian transformation. You will find that after this operation the entering column becomes exactly what the leaving column used to be; the Gaussian transformation preserves the identity-block structure of the matrix.
- Line 14: we know \(-z=0-\mathbf{c_B^TB^{-1}b}\). Since the test numbers of the old basic columns are 0 and only the \(c_e\) of the newly entered vector is nonzero, the only change comes from the product of \(c_e\) with its corresponding solution component \(b_l\). Similarly, line 16 updates \(\mathbf{c^T-c_B^TB^{-1}A}\): again only \(c_e\) is nonzero, so it is multiplied by the corresponding row \(l\) of \(\mathbf{A}\).
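To tie the line-by-line commentary together, here is a minimal sketch of such a pivot step on the tableau layout described above. This is a condensed version written for illustration, not the exact pseudocode the commentary refers to, and the entering/leaving choices in the demo are made by hand following the rules of sections 3 and 4.

```python
import numpy as np

def pivot(T, r, e):
    """One pivot on a simplex tableau laid out as described above:
    T[0, 0] = -z, T[0, 1:] = test numbers, T[1:, 0] = basic solution values,
    T[1:, 1:] = B^{-1} A.  Row r (>= 1) leaves, column e (>= 1) enters."""
    T = np.array(T, dtype=float)
    T[r] /= T[r, e]                    # scale the pivot row so the pivot entry becomes 1
    for i in range(T.shape[0]):        # eliminate column e from every other row, including
        if i != r:                     # the objective row, which updates -z and the test numbers
            T[i] -= T[i, e] * T[r]
    return T

# Demo on the opening example, rewritten in standard form with slacks s1, s2:
#   minimize -(x3 + x4)  s.t.  -x3 + 2*x4 + s1 = 2,   3*x3 - 2*x4 + s2 = 6
T = np.array([[0.0, -1.0, -1.0, 0.0, 0.0],
              [2.0, -1.0,  2.0, 1.0, 0.0],
              [6.0,  3.0, -2.0, 0.0, 1.0]])
T = pivot(T, r=2, e=1)   # x3 enters, s2 leaves (ratio test done by hand: 6/3 = 2)
T = pivot(T, r=1, e=2)   # x4 enters, s1 leaves (ratio test: 4/(4/3) = 3)
print(T[0, 0])           # ~7.0: the negated minimum, i.e. max x3 + x4 = 7, attained at (4, 3)
```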
This completes the introduction of the simplex algorithm. A few further points deserve attention. The test numbers and the original objective coefficients \(\mathbf{c}\) are completely different concepts; when the original constraints are inequalities and slack variables must be added, the two may happen to be equal, but one should keep the distinction firmly in mind. In that case the initial basis is also very easy to find: the columns of the slack variables form an identity block. If, however, the original constraints are equations, a basis has to be found explicitly, and the test numbers generally differ from the objective coefficients.
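For the common inequality case just mentioned, the initial basis really does come for free. A minimal sketch of the conversion to standard form (assuming the objective is already expressed as a cost vector to be minimized):

```python
import numpy as np

def to_standard_form(A_ub, b_ub, c):
    """Append one slack variable per inequality:  A_ub x <= b  becomes  [A_ub | I][x; s] = b.
    When b_ub >= 0, the slack columns (an identity block) serve directly as the initial basis."""
    m, n = A_ub.shape
    A = np.hstack([A_ub, np.eye(m)])
    c_full = np.concatenate([c, np.zeros(m)])
    basis = list(range(n, n + m))      # indices of the slack columns
    return A, b_ub, c_full, basis
```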
Finally, this article draws on the course slides of my Computer Science (B) teacher at the Chinese Academy of Sciences; I benefited a lot from that course. I also spent quite a long time studying the simplex algorithm myself, so I wrote this article in the hope that it can serve as a learning reference. There may be mistakes; corrections and discussion are welcome.