The Lagrange multiplier method (Lagrange Multiplier) and the KKT conditions are very important for solving constrained optimization problems: the Lagrange multiplier method finds the optimal value for problems with equality constraints, while the KKT conditions can be applied to problems that also have inequality constraints. The results obtained by these two methods are only necessary conditions; only when the problem is convex are they necessary and sufficient. The KKT conditions are a generalization of the Lagrange multiplier method. Before studying this, I only knew how to apply the two methods directly, but not why the Lagrange multiplier method (Lagrange Multiplier) and the KKT conditions work, or why they find the optimal value.
This article first describes what the Lagrange multiplier method (Lagrange Multiplier) and the KKT conditions are, and then explains why they yield the optimal value.
I. The Lagrange multiplier method (Lagrange Multiplier) and the KKT conditions
There are several types of optimization problems that we usually need to solve:
(i) Unconstrained optimization problems, which can be written as:
min f(x)
(ii) Optimization problems with equality constraints, which can be written as:
min f(x)
s.t. h_i(x) = 0, i = 1, ..., n
(iii) Optimization problems with inequality constraints, which can be written as:
min f(x)
s.t. g_i(x) <= 0, i = 1, ..., n
     h_j(x) = 0, j = 1, ..., m
For problems of class (i), the common approach is Fermat's theorem: take the derivative of f(x) and set it to zero, which yields the candidate optimal values; these candidates are then checked. If f(x) is a convex function, the candidate is guaranteed to be the optimal solution.
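To make this concrete, here is a minimal sketch using sympy; the quadratic f(x) = x^2 - 4x + 3 is a hypothetical example of mine, not from the original text:

```python
import sympy as sp

x = sp.symbols('x')
f = x**2 - 4*x + 3                 # example objective; convex, so the candidate is global

# Fermat's theorem: candidate optima are the roots of f'(x) = 0
candidates = sp.solve(sp.diff(f, x), x)
print(candidates)                  # [2]

# f''(x) = 2 > 0, so f is convex and x = 2 is the global minimum
print(f.subs(x, candidates[0]))    # -1
```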
For problems of class (ii), the common approach is the Lagrange multiplier method (Lagrange Multiplier): each equality constraint h_i(x) is multiplied by a coefficient and added to f(x) to form a single expression, called the Lagrangian function, whose coefficients are called Lagrange multipliers. Setting the derivative of the Lagrangian with respect to each variable to zero yields a set of candidate values, which are then checked to find the optimal value.
For problems of class (iii), the KKT conditions are commonly used. Similarly, we write all the equality constraints, the inequality constraints, and f(x) as a single expression, also called the Lagrangian function, with coefficients again called Lagrange multipliers. A certain set of conditions then gives the necessary conditions for the optimal value; this set of conditions is called the KKT conditions.
(a) The Lagrange multiplier method (Lagrange Multiplier)
For an equality constraint, we can combine the constraint and the objective function through a Lagrange multiplier a into a single function L(a, x) = f(x) + a*h(x), where a and h(x) are vectors: a is a row vector and h(x) is a column vector (written inline here only because CSDN makes it hard to typeset mathematical formulas).
Then, by setting the derivatives of L(a, x) with respect to its parameters to zero and solving the resulting system, the candidate optimal values can be obtained. This is taught in advanced mathematics courses, but without explaining why it works; the idea behind it is briefly introduced later in this article.
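As a hedged sketch of the mechanics, assuming the toy problem min x^2 + y^2 subject to x + y - 1 = 0 (my own example, not from the original text), the stationarity system of the Lagrangian can be solved symbolically:

```python
import sympy as sp

x, y, a = sp.symbols('x y a')
f = x**2 + y**2                    # objective
h = x + y - 1                      # equality constraint h(x, y) = 0

# Lagrangian L(a, x) = f(x) + a*h(x); set all partial derivatives to zero
L = f + a*h
stationary = sp.solve([sp.diff(L, v) for v in (x, y, a)], (x, y, a), dict=True)
print(stationary)                  # [{a: -1, x: 1/2, y: 1/2}] -> candidate optimum (1/2, 1/2)
```

Note that differentiating L with respect to a simply reproduces the constraint h(x) = 0, which is why the constraint is automatically enforced at any stationary point.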
(b) The KKT conditions
For an optimization problem with inequality constraints, how do we find the optimal value? The common approach is the KKT conditions. Similarly, we write all the inequality constraints, equality constraints, and the objective function as a single function L(a, b, x) = f(x) + a*g(x) + b*h(x). The KKT conditions state that the optimal value must satisfy the following conditions:
1. The derivative of L(a, b, x) with respect to x is zero;
2. h(x) = 0;
3. a*g(x) = 0.
Solving these three sets of equations yields the candidate optimal values. The third condition is very interesting: since g(x) <= 0, satisfying a*g(x) = 0 requires either a = 0 or g(x) = 0. This is the source of many important properties of SVM, such as the concept of support vectors; a small sketch of the case analysis follows below.
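Here is that case analysis on a toy problem of the same shape (min x^2 + y^2 subject to g(x, y) = 1 - x - y <= 0, a hypothetical example of mine):

```python
import sympy as sp

x, y, a = sp.symbols('x y a', real=True)
f = x**2 + y**2
g = 1 - x - y                      # inequality constraint g(x, y) <= 0
L = f + a*g

grad = [sp.diff(L, v) for v in (x, y)]

# Case a = 0 (constraint inactive): grad f = 0 gives (0, 0),
# but g(0, 0) = 1 > 0 is infeasible, so this case is rejected
print(sp.solve([e.subs(a, 0) for e in grad], (x, y)))   # {x: 0, y: 0}

# Case g = 0 (constraint active): stationarity plus g = 0
sol = sp.solve(grad + [g], (x, y, a), dict=True)[0]
print(sol)                         # {a: 1, x: 1/2, y: 1/2}; a >= 0 holds, so KKT is satisfied
```

In SVM terms, the training points whose multiplier a is strictly positive (active constraints) are exactly the support vectors.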
II. Why do the Lagrange multiplier method (Lagrange Multiplier) and the KKT conditions yield the optimal value?
First, the Lagrange multiplier method. Imagine our objective function z = f(x), where x is a vector. Each value of z corresponds to a level set of the surface, which can be projected onto the x-plane as a contour line. For concreteness, take the objective to be f(x, y) with x and y scalars, and picture its contours as dashed lines. Now suppose our constraint is g(x) = 0, a curve in the plane (or on the surface) formed by x.
If the curve g(x) = 0 crosses a contour line, the crossing points are feasible values satisfying both the equality constraint and the objective function, but they are certainly not optimal: a crossing means there are other contours, inside or outside the current one, whose intersections with the constraint curve give larger or smaller objective values. The optimal value can only be attained where a contour line is tangent to the constraint curve. At such a tangent point, the normal vectors of the contour of f and of the constraint curve point in the same direction, so the optimal value must satisfy: gradient of f(x) = a * gradient of g(x), where a is a constant expressing that the two gradients are parallel. This equation is exactly the result of setting the derivatives of L(a, x) with respect to its parameters to zero. (I am not sure how clearly I have described this; if you are physically close to me, come find me and I will explain it in person. Note: the figure is from the wiki.)
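The tangency condition can be checked directly on the toy problem from earlier (again my own example, with f(x, y) = x^2 + y^2 and constraint curve g(x, y) = x + y - 1 = 0): at the constrained optimum the two gradients are parallel.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2
g = x + y - 1                      # the constraint curve is g(x, y) = 0

grad_f = sp.Matrix([sp.diff(f, v) for v in (x, y)])
grad_g = sp.Matrix([sp.diff(g, v) for v in (x, y)])

# At the tangency point (1/2, 1/2) the gradients point in the same direction
pt = {x: sp.Rational(1, 2), y: sp.Rational(1, 2)}
print(grad_f.subs(pt))             # Matrix([[1], [1]])
print(grad_g.subs(pt))             # Matrix([[1], [1]]) -> grad f = 1 * grad g, so a = 1
```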
The KKT conditions are the necessary conditions for optimality in a problem that satisfies strong duality. They can be understood as follows. We want min f(x), and we form L(a, b, x) = f(x) + a*g(x) + b*h(x) with a >= 0. We can then write f(x) as max_{a,b} L(a, b, x). Why? Because h(x) = 0 and g(x) <= 0, so when maximizing L(a, b, x) over a and b, the term a*g(x) is always <= 0, and L(a, b, x) attains the maximum f(x) only when a*g(x) = 0; otherwise the constraints are not satisfied. Hence max_{a,b} L(a, b, x) equals f(x) whenever the constraints hold, and our objective can be written as min_x max_{a,b} L(a, b, x). Swapping the order gives the dual expression max_{a,b} min_x L(a, b, x). Because our problem satisfies strong duality (strong duality means the optimal value of the dual problem equals the optimal value of the original problem), the optimal point x0 satisfies f(x0) = max_{a,b} min_x L(a, b, x) = min_x max_{a,b} L(a, b, x) = f(x0). Let us look at what happens between the two middle expressions:
f(x0) = max_{a,b} min_x L(a, b, x) = max_{a,b} min_x [f(x) + a*g(x) + b*h(x)] = max_{a,b} [f(x0) + a*g(x0) + b*h(x0)] = f(x0)
The middle step of this chain says that min_x [f(x) + a*g(x) + b*h(x)] attains its minimum at x0. Applying Fermat's theorem to the function f(x) + a*g(x) + b*h(x), its derivative must equal zero at x0, that is:
gradient of f(x) + a * gradient of g(x) + b * gradient of h(x) = 0
This is exactly the first KKT condition: the derivative of L(a, b, x) with respect to x is zero.
As stated earlier, a*g(x) = 0 at the optimum, which is the third KKT condition; and of course the known condition h(x) = 0 (the second condition) must also hold. Putting all of the above together: the optimal value of an optimization problem satisfying strong duality must satisfy the KKT conditions, that is, the three conditions described above. The KKT conditions can therefore be regarded as a generalization of the Lagrange multiplier method.
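To close the loop, a hedged numerical cross-check (assuming scipy is available; same hypothetical toy problem as above): a general-purpose constrained solver should land on the point the KKT conditions predict, and the first condition (stationarity) should hold there.

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem: min x^2 + y^2  s.t.  x + y >= 1  (i.e. g = 1 - x - y <= 0)
res = minimize(lambda v: v[0]**2 + v[1]**2,
               x0=np.array([0.0, 0.0]),
               constraints=[{'type': 'ineq',             # scipy's 'ineq' means fun(v) >= 0
                             'fun': lambda v: v[0] + v[1] - 1}])
print(res.x)                       # ~[0.5, 0.5], matching the KKT case analysis

# First KKT condition: grad f + a*grad g = 0 with a = 1
grad_f = 2 * res.x                 # gradient of x^2 + y^2
grad_g = np.array([-1.0, -1.0])    # gradient of g = 1 - x - y
print(grad_f + 1.0 * grad_g)       # ~[0, 0]
```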