Solving Extremum Problems for Multivariate Functions

Source: Internet
Author: User

Today we study the extremum problem for multivariate functions, as groundwork for finding the parameters of logistic regression with Newton's iterative method.

Recall how we find the extrema of a function of one variable. Take the convex (upward-curving) function $f(x) = x^2$ as an example. First compute the first derivative, $f'(x) = 2x$.

The derivative at an extremum must be zero, but a point where the derivative equals zero is not necessarily an extremum. For example, $f(x) = x^3$ has $f'(0) = 0$ but no extremum at $x = 0$. So a further test is needed.

Differentiating $f(x) = x^2$ again gives the second derivative $f''(x) = 2$. Because $f''(x) > 0$ holds at the stationary point $x = 0$,

the function attains a local minimum at that point. The role of the second derivative here is to determine whether the function is locally convex or concave.
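The one-variable test described above can be sketched numerically. This is only a minimal illustration: the helper names, the finite-difference step sizes, and the tolerance are assumptions made for this sketch, and the three test functions are illustrative choices.

```python
def derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def classify_stationary_point(f, x, tol=1e-6):
    """Apply the second-derivative test at a point where f'(x) is (nearly) zero."""
    if abs(derivative(f, x)) > tol:
        return "not a stationary point"
    d2 = second_derivative(f, x)
    if d2 > tol:
        return "local minimum"
    if d2 < -tol:
        return "local maximum"
    return "inconclusive"  # f''(x) = 0: the test alone cannot decide


print(classify_stationary_point(lambda x: x ** 2, 0.0))   # local minimum
print(classify_stationary_point(lambda x: -x ** 2, 0.0))  # local maximum
print(classify_stationary_point(lambda x: x ** 3, 0.0))   # inconclusive
```

The last case shows why a zero first derivative is not enough: for $x^3$ the second derivative also vanishes at 0, and the point is in fact not an extremum.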

Finding the extrema of a multivariate function works in a similar way, except that convexity is judged with a matrix, called the Hessian matrix.

Suppose a real-valued multivariate function $f(x_1, x_2, \dots, x_n)$ is twice continuously differentiable on its domain. To find its extrema, first take the partial derivative with respect to each variable and set it to zero, i.e.

$$\frac{\partial f}{\partial x_i} = 0, \quad i = 1, 2, \dots, n$$

which gives the system of equations

$$\nabla f(x) = 0$$

Solving this system gives a stationary point $x^*$, which is a vector of length $n$. But so far we only know that it is a stationary point.

There are 3 types of stationary points: local maxima, local minima, and points that are not extrema.

So the next step is to decide which of these 3 types the stationary point is. For this purpose the Hessian matrix is introduced: it is used to judge the convexity of a multivariate function.

The Hessian matrix is a square matrix of the second-order partial derivatives of a multivariate function. It describes the local curvature of the function and is commonly used in Newton's iterative method for solving optimization problems.
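To make the role of the Hessian in Newton's method concrete, here is a minimal sketch of the Newton update $x \leftarrow x - H^{-1} \nabla f$ in two variables. The function $f(x, y) = e^{x+y} + x^2 + y^2$, the starting point, and all helper names are assumptions chosen for illustration; they do not come from the original text.

```python
import math


def grad(x, y):
    """Gradient of the illustrative function f(x, y) = exp(x + y) + x^2 + y^2."""
    e = math.exp(x + y)
    return (e + 2 * x, e + 2 * y)


def hessian(x, y):
    """Hessian of the same function, as a 2x2 nested tuple."""
    e = math.exp(x + y)
    return ((e + 2, e), (e, e + 2))


def newton(x, y, tol=1e-12, max_iter=50):
    """Iterate (x, y) <- (x, y) - H^{-1} grad f until the gradient is tiny."""
    for _ in range(max_iter):
        gx, gy = grad(x, y)
        if math.hypot(gx, gy) < tol:
            break
        (a, b), (c, d) = hessian(x, y)
        det = a * d - b * c
        # Solve H * step = grad with the explicit 2x2 inverse.
        step_x = (d * gx - b * gy) / det
        step_y = (a * gy - c * gx) / det
        x, y = x - step_x, y - step_y
    return x, y


x_star, y_star = newton(0.0, 0.0)
print(x_star, y_star)        # the stationary point; the two coordinates agree by symmetry
print(grad(x_star, y_star))  # both components are close to zero
```

The explicit 2x2 inverse keeps the sketch dependency-free. For larger problems one would normally solve the linear system with a linear-algebra library instead.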

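For example, for the multivariate function $f(x_1, x_2, \dots, x_n)$ above, if all of its second-order partial derivatives exist, then its Hessian matrix is

$$H(f) = \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}$$

If $f$ is twice continuously differentiable on its domain, then its Hessian matrix is symmetric there, because for a function with continuous second-order partial derivatives the order of differentiation makes no difference, i.e.

$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}$$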
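As a rough numerical check of the definition and of the symmetry property, the Hessian can be approximated with central finite differences. This is only a sketch: the function name `numerical_hessian`, the step size `h`, and the example function $x^2 y + y^3$ are assumptions made for illustration.

```python
def numerical_hessian(f, point, h=1e-4):
    """Approximate the Hessian of f at `point` with central differences."""
    n = len(point)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                p = list(point)
                p[i] += di * h
                p[j] += dj * h
                return f(p)
            # d^2 f / dx_i dx_j ~ [f(++) - f(+-) - f(-+) + f(--)] / (4 h^2)
            hess[i][j] = (shifted(1, 1) - shifted(1, -1)
                          - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return hess


def example(p):
    """Illustrative function f(x, y) = x^2 * y + y^3."""
    x, y = p
    return x * x * y + y ** 3


H = numerical_hessian(example, [1.0, 2.0])
for row in H:
    print(row)  # close to the exact Hessian [[2y, 2x], [2x, 6y]] = [[4, 2], [2, 12]]
```

The two off-diagonal entries agree up to rounding, which is the symmetry stated above.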

With the Hessian matrix $H$ evaluated at the critical point, we can tell the 3 cases above apart. The conclusions are as follows

(1) If $H$ is a positive definite matrix, then the critical point is a local minimum

(2) If $H$ is a negative definite matrix, then the critical point is a local maximum

(3) If $H$ is an indefinite matrix, then the critical point is not an extremum

If $H$ is only semidefinite (it has a zero eigenvalue), this test is inconclusive.

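Next we look at how to determine whether a matrix is positive definite, negative definite, or indefinite.

One of the most commonly used methods is the leading principal minor test. A necessary and sufficient condition for a real symmetric matrix to be positive definite is that its leading principal minors of every order are greater than 0. Correspondingly, it is negative definite if and only if its leading principal minors of odd order are less than 0 and those of even order are greater than 0.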
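The leading principal minor test can be sketched in a few lines. The helper names here are hypothetical, the determinant is computed by plain Gaussian elimination, and the negative-definite check uses the standard alternating-sign form of the criterion. Treat it as an illustration for small matrices rather than a robust numerical routine.

```python
def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in m]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def leading_principal_minors(m):
    """Determinants of the top-left k x k blocks, k = 1..n."""
    return [determinant([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]


def is_positive_definite(m):
    return all(d > 0 for d in leading_principal_minors(m))


def is_negative_definite(m):
    # Minors must alternate in sign, starting negative: (-1)^k * D_k > 0.
    return all((-1) ** k * d > 0
               for k, d in enumerate(leading_principal_minors(m), start=1))


A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
print(leading_principal_minors(A))                      # approximately [2, 3, 4]
print(is_positive_definite(A))                          # True
print(is_negative_definite([[-v for v in row] for row in A]))  # True
print(is_positive_definite([[1, 2], [2, 1]]))           # False: second minor is -3
```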

Because this method requires computing determinants, it is rather laborious. There is another method, stated in terms of real quadratic forms, described below.

A necessary and sufficient condition for the real quadratic form $x^T A x$ to be positive definite is that the eigenvalues of the matrix $A$ are all greater than 0. The necessary and sufficient condition for a negative definite quadratic form is that

the eigenvalues of $A$ are all less than 0. If $A$ has both positive and negative eigenvalues, it is indefinite.
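For a 2x2 symmetric matrix the eigenvalues have a closed form, so the eigenvalue test can be illustrated without any library. The sketch below assumes the matrix is $[[a, b], [b, c]]$. The function names and the tolerance are hypothetical, and the extra "semidefinite" label covers the case of a zero eigenvalue.

```python
import math


def eigenvalues_sym2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], smallest first."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean - radius, mean + radius


def classify(a, b, c, tol=1e-12):
    """Classify [[a, b], [b, c]] by the signs of its eigenvalues."""
    lo, hi = eigenvalues_sym2(a, b, c)
    if lo > tol:
        return "positive definite"
    if hi < -tol:
        return "negative definite"
    if lo < -tol and hi > tol:
        return "indefinite"
    return "semidefinite"


# Hessians at the origin of a few illustrative functions:
print(classify(2, 0, 2))    # x^2 + y^2    -> positive definite (local minimum)
print(classify(-2, 0, -2))  # -x^2 - y^2   -> negative definite (local maximum)
print(classify(2, 0, -2))   # x^2 - y^2    -> indefinite (saddle point)
print(classify(2, 0, 0))    # x^2 + y^4    -> semidefinite (test inconclusive)
```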

Lagrange Multiplier Method

The Lagrange multiplier method is used to find conditional extrema. There are two kinds of extremum problems. The first is to find the extremum of a function on a given region, with no other requirement on the independent variables; this is called an unconditional extremum. The second is to find the extremum when the independent variables are subject to some additional constraints; this is called a conditional extremum. For example, given the ellipsoid

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$

find the maximum volume of a rectangular box inscribed in the ellipsoid. This is a conditional extremum problem: under the condition

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \quad x > 0, \ y > 0, \ z > 0$$

find the maximum value of $f(x, y, z) = 8xyz$.

Of course, this problem can also be solved by using the condition to eliminate one variable and then treating it as an unconditional extremum problem. But sometimes this elimination is very difficult or even impossible, and then the Lagrange multiplier method is needed. It is described below.

The conditional extremum of the function $f(x, y, z)$ subject to the condition $\varphi(x, y, z) = 0$ can be transformed into the unconditional extremum problem of the function

$$F(x, y, z, \lambda) = f(x, y, z) + \lambda \varphi(x, y, z)$$

If $(x_0, y_0, z_0, \lambda_0)$ is a stationary point of $F$, then $(x_0, y_0, z_0)$ is a candidate point for the conditional extremum.

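Returning to the problem above, the Lagrange multiplier method turns it into finding the stationary points of

$$F(x, y, z, \lambda) = 8xyz + \lambda \left( \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} - 1 \right)$$

Taking the partial derivatives with respect to $x$, $y$, $z$, $\lambda$ and setting them to zero gives

$$\begin{aligned}
8yz + \frac{2 \lambda x}{a^2} &= 0 \\
8xz + \frac{2 \lambda y}{b^2} &= 0 \\
8xy + \frac{2 \lambda z}{c^2} &= 0 \\
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} - 1 &= 0
\end{aligned}$$

Combining the first three equations gives $bx = ay$ and $az = cx$. Substituting into the fourth equation yields

$$x = \frac{a}{\sqrt{3}}, \quad y = \frac{b}{\sqrt{3}}, \quad z = \frac{c}{\sqrt{3}}$$

Substituting this solution, the maximum volume is

$$V_{\max} = \frac{8abc}{3\sqrt{3}} = \frac{8\sqrt{3}}{9} abc$$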
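The result can be checked numerically. The sketch below assumes the ellipsoid $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$ and a box with faces parallel to the coordinate planes, so that its volume is $8xyz$ for the corner $(x, y, z)$ in the first octant. The semi-axes, the sampling scheme, and the function names are illustrative assumptions.

```python
import math
import random


def box_volume(x, y, z):
    """Volume of the axis-aligned box with one corner at (x, y, z), x, y, z > 0."""
    return 8 * x * y * z


def lagrange_solution(a, b, c):
    """Stationary point x = a/sqrt(3), y = b/sqrt(3), z = c/sqrt(3)."""
    s = math.sqrt(3)
    return a / s, b / s, c / s


def random_point_on_ellipsoid(a, b, c, rng):
    """Random first-octant point satisfying x^2/a^2 + y^2/b^2 + z^2/c^2 = 1."""
    u, v, w = (rng.random() + 1e-9 for _ in range(3))
    norm = math.sqrt(u * u + v * v + w * w)
    return a * u / norm, b * v / norm, c * w / norm


a, b, c = 3.0, 2.0, 1.0  # illustrative semi-axes
v_max = box_volume(*lagrange_solution(a, b, c))
rng = random.Random(0)
best_sampled = max(box_volume(*random_point_on_ellipsoid(a, b, c, rng))
                   for _ in range(10000))

print(v_max, 8 * a * b * c / (3 * math.sqrt(3)))  # the two values agree
print(best_sampled <= v_max + 1e-9)               # True: no sampled box is larger
```

Random sampling does not prove maximality, but it is a quick sanity check that the stationary point found by the multiplier method is not beaten by other feasible points.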

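The Lagrange multiplier method can also be applied to conditional extremum problems of general multivariate functions under several additional conditions. For example

Problem: On the curve where a paraboloid of revolution intersects a plane, find the points nearest to and farthest from the origin.

Analysis: Write the two constraints (the paraboloid and the plane) as $\varphi(x, y, z) = 0$ and $\psi(x, y, z) = 0$, and take the squared distance $x^2 + y^2 + z^2$ to the origin as the objective. Set

$$F(x, y, z, \lambda, \mu) = x^2 + y^2 + z^2 + \lambda \varphi(x, y, z) + \mu \psi(x, y, z)$$

Setting all of its partial derivatives to zero gives a system of five equations in $x$, $y$, $z$, $\lambda$, $\mu$.

Solving this system gives two candidate points.

Comparing the distances from the two candidates to the origin, the candidate with the smaller distance is the closest point to the origin and the other is the farthest point.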

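Problem: Find the discrete distribution with maximum entropy.

Analysis: The entropy of a discrete distribution $p_1, p_2, \dots, p_n$ is (taking $\log$ as the natural logarithm)

$$H(p) = -\sum_{i=1}^{n} p_i \log p_i$$

and the constraint is

$$\sum_{i=1}^{n} p_i = 1$$

We need the maximum value of $H$. According to the Lagrange multiplier method, set

$$F(p_1, \dots, p_n, \lambda) = -\sum_{i=1}^{n} p_i \log p_i + \lambda \left( \sum_{i=1}^{n} p_i - 1 \right)$$

Taking the partial derivative with respect to each $p_i$ and setting it to zero gives

$$-\log p_i - 1 + \lambda = 0, \quad i = 1, 2, \dots, n$$

Taking the difference of any two of these $n$ equations gives

$$\log p_i - \log p_j = 0$$

This means that all the $p_i$ are equal, so the final solution is

$$p_i = \frac{1}{n}, \quad i = 1, 2, \dots, n$$

Therefore, the maximum entropy is attained by the uniform distribution.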
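A quick numerical sanity check of this conclusion: the sketch below compares the entropy of the uniform distribution with that of randomly generated distributions. It assumes the natural logarithm and $n = 5$, and the helper names are hypothetical.

```python
import math
import random


def entropy(p):
    """Shannon entropy -sum(p_i * log p_i), with 0 * log 0 taken as 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def random_distribution(n, rng):
    """A random probability vector of length n (normalised positive weights)."""
    weights = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


n = 5
uniform = [1.0 / n] * n
rng = random.Random(0)
samples = [random_distribution(n, rng) for _ in range(1000)]

print(entropy(uniform), math.log(n))  # equal up to rounding: the maximum is log n
print(all(entropy(p) <= entropy(uniform) + 1e-9 for p in samples))  # True
```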
