*Transferred from: http://www.cnblogs.com/maybe2030/p/4946256.html*

1. The basic idea of the Lagrange multiplier method

**As an optimization technique, the Lagrange multiplier method is mainly used to solve constrained optimization problems. Its basic idea is to transform a constrained optimization problem with n variables and k constraints into an unconstrained optimization problem with (n+k) variables by introducing Lagrange multipliers. Mathematically, each Lagrange multiplier is the coefficient of the corresponding constraint's gradient in the linear combination that expresses the objective's gradient.**

**How is a constrained optimization problem with n variables and k constraints transformed into an unconstrained optimization problem with (n+k) variables? Starting from this mathematical meaning, the Lagrange multiplier method introduces Lagrange multipliers to set up the extremum conditions: taking the partial derivative with respect to each of the n variables yields n equations, and together with the k constraint conditions (corresponding to the k Lagrange multipliers) these form a system of (n+k) equations in (n+k) unknowns, which can then be solved as an ordinary system of equations.**
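As a minimal sketch of this recipe (using a hypothetical example not from the article: minimize f(x, y) = x + y subject to g(x, y) = x² + y² − 1 = 0), the n partial-derivative equations plus the k constraint equations can be solved in closed form and checked numerically:

```python
import math

# Hypothetical example: min f(x, y) = x + y  s.t.  g(x, y) = x^2 + y^2 - 1 = 0
# Lagrangian: L(x, y, lam) = x + y + lam * (x^2 + y^2 - 1)
# Stationarity: dL/dx = 1 + 2*lam*x = 0,  dL/dy = 1 + 2*lam*y = 0
# Constraint:   x^2 + y^2 = 1
# => x = y = -1/(2*lam), and 2/(4*lam^2) = 1  =>  lam = ±1/sqrt(2)

lam = 1 / math.sqrt(2)          # the root that gives the minimum
x = y = -1 / (2 * lam)          # = -1/sqrt(2)

# Verify all (n + k) = 3 equations hold at the candidate point
assert abs(1 + 2 * lam * x) < 1e-12
assert abs(1 + 2 * lam * y) < 1e-12
assert abs(x**2 + y**2 - 1) < 1e-12

print(x + y)  # minimum value: -sqrt(2) ≈ -1.4142
```

The other root λ = −1/√2 gives the maximum at (1/√2, 1/√2); distinguishing the two requires checking the objective at each candidate, just as the article notes later for the hyperbola example.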

The problem model it solves is the constrained optimization problem:

**min/max a function f(x, y, z), where x, y, z are not independent but satisfy g(x, y, z) = 0.**

**i.e.: min/max f(x, y, z)**

**s.t. g(x, y, z) = 0**

2. Mathematical examples

First, an example from an MIT math course is introduced to motivate the Lagrange multiplier method.

The MIT math course example: find the point on the hyperbola xy = 3 nearest to the origin.

Solution:

First, we formalize the problem described above as a mathematical model, namely:

min f(x, y) = x² + y² (strictly, the Euclidean distance requires a square root, but dropping it does not change the minimizer, so we minimize the squared distance instead)

s.t. xy = 3.

From this formulation we can see that it is a typical constrained optimization problem. In fact, the simplest way to solve it is to use the constraint to express one variable in terms of the other, substitute into the objective, and find the extremum of the resulting single-variable function. But since our goal here is to introduce the Lagrange multiplier method, we solve it with that approach instead.

We draw the family of curves x² + y² = c. As shown in the figure, when a circle in this family is tangent to the curve xy = 3, the tangent point is the point with the shortest distance to the origin. That is, when a contour f(x, y) = c is tangent to the hyperbola g(x, y) = xy = 3, we obtain an extremum of the optimization problem above (note: without further checking, we do not yet know whether it is a maximum or a minimum).

Now the original question can be converted to: for what values of x and y are the contours of f(x, y) and g(x, y) tangent?

If the two curves are tangent, their tangent lines coincide, i.e. their normal vectors are parallel: ∇f ∥ ∇g.

From ∇f ∥ ∇g we obtain ∇f = λ·∇g.

At this point, we have transformed the original constrained optimization problem into an unconstrained system-of-equations problem, as follows:

**Original problem:** min f(x, y) = x² + y², s.t. xy = 3.

**Transformed problem:** from ∇f = λ·∇g,

fx = λ·gx,

fy = λ·gy,

xy = 3.

Thus the constrained optimization problem becomes an unconstrained system-of-equations problem.

By solving this system of equations, we can get the solution of the original problem, i.e.

2x = λ·y

2y = λ·x

xy = 3

Solving these gives λ = 2 or λ = −2. When λ = 2, (x, y) = (√3, √3) or (−√3, −√3); when λ = −2, there is no real solution. So the solutions of the original problem are (x, y) = (√3, √3) and (−√3, −√3).
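A quick numerical check of this worked example (plain Python, no external libraries): substitute both candidate points into the three equations, then brute-force scan the hyperbola to confirm the minimal squared distance.

```python
import math

# Candidate solutions of 2x = λy, 2y = λx, xy = 3, with λ = 2
candidates = [(math.sqrt(3), math.sqrt(3)), (-math.sqrt(3), -math.sqrt(3))]
lam = 2.0

for x, y in candidates:
    assert abs(2 * x - lam * y) < 1e-12   # fx = λ·gx
    assert abs(2 * y - lam * x) < 1e-12   # fy = λ·gy
    assert abs(x * y - 3) < 1e-12         # constraint xy = 3

# Brute-force sanity check: scan points on the hyperbola y = 3/x
best = min((x**2 + (3 / x)**2, x) for x in
           [0.01 * k for k in range(1, 1000)])
print(best)  # squared distance ≈ 6, attained near x ≈ sqrt(3) ≈ 1.73
```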

This simple example illustrates the idea of the Lagrange multiplier method: by introducing a Lagrange multiplier (λ), the original constrained optimization problem is transformed into an unconstrained system-of-equations problem.

3. General form of the Lagrange multiplier method

**To find the conditional extremum of a function, the problem can be transformed into the unconditional extremum problem of an auxiliary function.**

We can draw a figure to help us think.

The green line marks the trajectory of points satisfying the constraint g(x, y) = c. The blue lines are the contours of f(x, y). The arrows represent the gradients, which are parallel to the normals of the contour lines.

From the figure we can see intuitively that at the optimal solution the gradients of f and g are parallel:

∇[f(x, y) + λ(g(x, y) − c)] = 0, λ ≠ 0

Once the value of λ is obtained, it is substituted into the formula below, and the point of the unconstrained extremum and the extremum value are easily found:

F(x, y) = f(x, y) + λ(g(x, y) − c)

The new function F(x, y) equals f(x, y) at the extremum, because at the extremum the constraint holds, so g(x, y) − c is always equal to zero and the added term vanishes.

Setting the gradient of this function to zero, i.e. ∇f(x) + ∑ᵢ λᵢ·∇gᵢ(x) = 0, expresses that the gradients of f(x) and the gᵢ(x) are collinear.

**Topic 1:**

Given the ellipsoid

x²/a² + y²/b² + z²/c² = 1,

find the maximum volume of a rectangular box inscribed in the ellipsoid, with faces parallel to the coordinate planes. This is a conditional extremum problem: with (x, y, z) the box corner in the first octant, we seek the maximum of

V = 8xyz

under the condition x²/a² + y²/b² + z²/c² = 1.

Of course, one variable could be eliminated using the condition, turning this into an unconditional extremum problem. But sometimes that is difficult or even impossible, and then the **Lagrange multiplier method** is needed. By the Lagrange multiplier method, the problem is transformed into finding the stationary points of

F(x, y, z, λ) = 8xyz + λ(x²/a² + y²/b² + z²/c² − 1).

Taking partial derivatives gives

∂F/∂x = 8yz + 2λx/a² = 0,

∂F/∂y = 8xz + 2λy/b² = 0,

∂F/∂z = 8xy + 2λz/c² = 0,

∂F/∂λ = x²/a² + y²/b² + z²/c² − 1 = 0.

Solving the first three equations (multiplying each by its own variable shows x²/a² = y²/b² = z²/c²) and substituting into the fourth yields

x = a/√3, y = b/√3, z = c/√3.

Substituting back, the maximum volume is

V = 8abc/(3√3).
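As a numerical sanity check of this solution (assuming the standard inscribed-box formulation reconstructed above), we can verify all four stationarity equations and the maximum volume for concrete semi-axes:

```python
import math

a, b, c = 2.0, 3.0, 4.0                       # example semi-axes
x, y, z = a / math.sqrt(3), b / math.sqrt(3), c / math.sqrt(3)

# Multiplier recovered from dF/dx = 8yz + 2*lam*x/a^2 = 0
lam = -8 * y * z * a**2 / (2 * x)

# All four stationarity equations of F = 8xyz + lam*(x^2/a^2 + y^2/b^2 + z^2/c^2 - 1)
assert abs(8 * y * z + 2 * lam * x / a**2) < 1e-9
assert abs(8 * x * z + 2 * lam * y / b**2) < 1e-9
assert abs(8 * x * y + 2 * lam * z / c**2) < 1e-9
assert abs(x**2 / a**2 + y**2 / b**2 + z**2 / c**2 - 1) < 1e-9

V = 8 * x * y * z
print(V, 8 * a * b * c / (3 * math.sqrt(3)))  # both ≈ 36.95
```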

The Lagrange multiplier method also applies to conditional extremum problems of general multivariate functions under multiple side conditions.

**Topic 2:**

**Title:** Find the maximum entropy of a discrete distribution.

**Analysis:** The entropy of a discrete distribution p₁, …, pₙ is expressed as

H = −∑ᵢ pᵢ log pᵢ,

and the constraint is

∑ᵢ pᵢ = 1.

We seek the maximum of H under this constraint. According to the **Lagrange multiplier method**, form

L = −∑ᵢ pᵢ log pᵢ + λ(∑ᵢ pᵢ − 1).

Taking the partial derivative with respect to each pᵢ and setting it to zero gives

∂L/∂pᵢ = −log pᵢ − 1 + λ = 0, i.e. pᵢ = e^(λ−1).

This means all the pᵢ are equal, and with the constraint ∑ᵢ pᵢ = 1 the final solution is

pᵢ = 1/n, with maximum entropy H = log n.

Therefore, the maximum entropy is attained by the **uniform distribution**.
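This conclusion is easy to check empirically: compute the entropy of the uniform distribution and compare it against many random distributions on the same number of outcomes.

```python
import math, random

def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

n = 4
uniform = [1 / n] * n
h_max = entropy(uniform)                 # should equal log(n)
assert abs(h_max - math.log(n)) < 1e-12

# Every other distribution on n outcomes has entropy no greater than log(n)
random.seed(0)
for _ in range(1000):
    w = [random.random() for _ in range(n)]
    p = [wi / sum(w) for wi in w]
    assert entropy(p) <= h_max + 1e-12

print(h_max)  # log(4) ≈ 1.3863
```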

4. The Lagrange multiplier method and the KKT conditions

The problems we discussed above are equality-constrained optimization problems, but equality constraints are not enough to describe the problems people actually face; inequality constraints are more common. Most real problems involve constraints such as using no more than a certain amount of time, no more than a certain amount of manpower, no more than a certain cost, and so on. So scientists extended the Lagrange multiplier method: by adding the KKT conditions, the Lagrange multiplier method can also solve optimization problems with inequality constraints.

First, let us introduce what the KKT conditions are.

The **KKT (Karush-Kuhn-Tucker) conditions are the conditions that an optimal solution of a nonlinear programming problem must satisfy, provided certain regularity conditions hold.** They are the result of generalizing the Lagrange multiplier method. For an optimization model written in the standard form given above (minimize f(x) subject to gᵢ(x) ≤ 0 and hⱼ(x) = 0), the Karush-Kuhn-Tucker conditions state that an optimal point x* must satisfy the following:

1). The constraints are satisfied: gᵢ(x*) ≤ 0, i = 1, 2, …, p, and hⱼ(x*) = 0, j = 1, 2, …, q;

2). ∇f(x*) + ∑ᵢ μᵢ·∇gᵢ(x*) + ∑ⱼ λⱼ·∇hⱼ(x*) = 0, where ∇ is the gradient operator;

3). λⱼ ≠ 0, and the inequality multipliers satisfy μᵢ ≥ 0 and μᵢ·gᵢ(x*) = 0, i = 1, 2, …, p.

The first KKT condition says that the optimal point x* must satisfy all equality and inequality constraints, i.e. the optimal point must be a feasible solution, which is naturally indisputable. The second says that at the optimal point x*, ∇f must be a linear combination of the ∇gᵢ and ∇hⱼ; the μᵢ and λⱼ are called Lagrange multipliers. The difference is that inequality constraints have a direction, so each μᵢ must be greater than or equal to zero, while equality constraints have no direction, so the λⱼ have no sign restriction; their signs depend on how the equality constraints are written.
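As a small concrete check of these three conditions (a hypothetical example, not from the article: minimize f(x) = x² subject to g(x) = 1 − x ≤ 0, i.e. x ≥ 1), the candidate x* = 1 with μ = 2 satisfies all of them:

```python
# Hypothetical example: min f(x) = x^2  s.t.  g(x) = 1 - x <= 0
# Gradients (1-D): f'(x) = 2x, g'(x) = -1
x_star, mu = 1.0, 2.0

g = 1 - x_star
assert g <= 0                                  # 1) feasibility: g(x*) <= 0
assert abs(2 * x_star + mu * (-1)) < 1e-12     # 2) stationarity: ∇f + μ·∇g = 0
assert mu >= 0 and abs(mu * g) < 1e-12         # 3) μ >= 0 and μ·g(x*) = 0

# The unconstrained minimizer x = 0 is infeasible here, which is why the
# active constraint carries a strictly positive multiplier μ = 2.
print(x_star, mu)
```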

**To make this easier to understand, let us first use an example to illustrate the origin of the KKT conditions.**

Let L(x, μ) = f(x) + ∑ₖ μₖ·gₖ(x), where μₖ ≥ 0 and gₖ(x) ≤ 0.

∵ μₖ ≥ 0 and gₖ(x) ≤ 0 ⟹ μg(x) ≤ 0

∴ max_μ L(x, μ) = f(x)  (2)

∴ min_x f(x) = min_x max_μ L(x, μ)  (3)

On the other hand,

max_μ min_x L(x, μ) = max_μ [min_x f(x) + min_x μg(x)] = min_x f(x) + max_μ min_x μg(x)

∵ μₖ ≥ 0 and gₖ(x) ≤ 0

∴ max_μ min_x μg(x) = 0, attained when μ = 0 or g(x) = 0.

∴ max_μ min_x L(x, μ) = min_x f(x) + max_μ min_x μg(x) = min_x f(x)  (4), at which point μ = 0 or g(x) = 0.

Combining (3) and (4) we get min_x max_μ L(x, μ) = max_μ min_x L(x, μ), i.e.

min_x max_μ L(x, μ) = max_μ min_x L(x, μ) = min_x f(x)

We call max_μ min_x L(x, μ) the dual problem of the original problem min_x max_μ L(x, μ). The above shows that, when certain conditions are satisfied, the solutions of the original problem and the dual problem, and min_x f(x), are all the same, and that at the optimal solution x* we have μ = 0 or g(x*) = 0. Substituting x* into (2) gives max_μ L(x*, μ) = f(x*), and from (4) max_μ min_x L(x, μ) = f(x*), so L(x*, μ) = min_x L(x, μ), which shows that x* is also an extreme point of L(x, μ), i.e. ∇ₓL(x*, μ) = 0.
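This min-max equality can be illustrated numerically (a hypothetical one-dimensional example, not from the article: f(x) = (x − 3)², g(x) = x − 1 ≤ 0). Evaluating L(x, μ) = f(x) + μ·g(x) over coarse grids shows that the primal minimum over the feasible set and the dual value max_μ min_x L(x, μ) agree with the constrained minimum f(1) = 4:

```python
# Hypothetical example: f(x) = (x-3)^2,  g(x) = x - 1 <= 0  (i.e. x <= 1)
# Lagrangian: L(x, mu) = f(x) + mu * g(x)
f = lambda x: (x - 3) ** 2
g = lambda x: x - 1

xs = [k * 0.005 for k in range(-400, 1001)]    # x grid on [-2, 5]
mus = [k * 0.05 for k in range(0, 201)]        # mu grid on [0, 10]

# Primal optimum: minimize f over the feasible set {x : g(x) <= 0}
primal = min(f(x) for x in xs if g(x) <= 0)

# Dual optimum: maximize over mu the dual function d(mu) = min_x L(x, mu)
dual = max(min(f(x) + mu * g(x) for x in xs) for mu in mus)

print(primal, dual)  # both ≈ 4, attained at x* = 1, mu* = 4
```

Note that at the optimum the constraint is active (g(x*) = 0) with μ* = 4 > 0, matching the "μ = 0 or g(x) = 0" alternative derived above.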

Finally, summarize:

The KKT conditions are the generalization of the Lagrange multiplier method. Putting the equality constraints and the inequality constraints together, for min f(x) s.t. hⱼ(x) = 0 and gₖ(x) ≤ 0, they read:

∇f(x*) + ∑ⱼ λⱼ·∇hⱼ(x*) + ∑ₖ μₖ·∇gₖ(x*) = 0,

hⱼ(x*) = 0, gₖ(x*) ≤ 0,

μₖ ≥ 0, μₖ·gₖ(x*) = 0.

Note: x,λ,μ are vectors.

This expresses that the gradient of f(x) at the extremum point x* is a linear combination of the gradients of the hⱼ(x*) and the gₖ(x*).
