"Turn" Minkowski distance

Source: Internet
Author: User

This article describes how the Euclidean distance, the Manhattan distance, and the Chebyshev distance all arise as special cases of the Minkowski distance.

In general, a distance function d(x, y) must satisfy the following conditions:

1) d(x, x) = 0 // the distance from a point to itself is 0
2) d(x, y) >= 0 // distances are non-negative
3) d(x, y) = d(y, x) // symmetry: if the distance from A to B is a, then the distance from B to A is also a
4) d(x, k) + d(k, y) >= d(x, y) // triangle inequality: the sum of two sides is at least as long as the third side
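The four conditions can be spot-checked on a handful of sample points. The following is a minimal Python sketch; the helper name `violated_axioms`, the sample points, and the squared Euclidean counterexample are assumptions chosen for illustration and do not come from the original article. Passing the check on a finite sample does not prove that a function is a metric; it can only reveal violations.

```python
from itertools import product

def violated_axioms(d, points, tol=1e-12):
    """Return the names of the conditions that fail for at least one sample combination."""
    failed = set()
    for x, y, k in product(points, repeat=3):
        if abs(d(x, x)) > tol:
            failed.add("identity")
        if d(x, y) < -tol:
            failed.add("non-negativity")
        if abs(d(x, y) - d(y, x)) > tol:
            failed.add("symmetry")
        if d(x, k) + d(k, y) < d(x, y) - tol:
            failed.add("triangle")
    return failed

def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(s - t) for s, t in zip(a, b))

def squared_euclidean(a, b):
    # Illustrative counterexample: squaring breaks the triangle inequality.
    return sum((s - t) ** 2 for s, t in zip(a, b))

sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
print(violated_axioms(manhattan, sample))          # no violations found on this sample
print(violated_axioms(squared_euclidean, sample))  # reports the triangle condition
```

In the second call, going from (0, 0) to (2, 0) directly costs 4, while the detour through (1, 0) costs 1 + 1 = 2, so condition 4 fails.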

Minkowski Distance:

The Minkowski distance is a very common way to measure the distance between numerical points. Suppose the points P and Q have the following coordinates:

P = (x_1, x_2, ..., x_n) and Q = (y_1, y_2, ..., y_n)

Then the Minkowski distance of order p is defined as:

d(P, Q) = (\sum_{i=1}^{n} |x_i - y_i|^p)^{1/p}

The most common values of p are 2 and 1: the former gives the Euclidean distance and the latter gives the Manhattan distance. Suppose you take a taxi from point P to point Q in Manhattan, where the white blocks are tall buildings and the grey lines are streets.

The green diagonal line indicates the Euclidean distance, a route that is impossible to drive in reality. The other three lines represent the Manhattan distance, and the three polylines all have the same length.
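The two common cases can be computed with a single function. This is a minimal sketch in plain Python; the function name `minkowski` and the sample points are assumptions made for illustration.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two points with the same number of coordinates."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

P = (0.0, 0.0)
Q = (3.0, 4.0)

manhattan = minkowski(P, Q, 1)  # |3| + |4| = 7: the length of any shortest taxi route
euclidean = minkowski(P, Q, 2)  # sqrt(3^2 + 4^2) = 5: the straight diagonal line
print(manhattan, euclidean)
```

The taxi has to cover 7 blocks no matter which of the shortest street routes it takes, while the straight-line distance is only 5.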

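As p approaches infinity, the Minkowski distance converges to the Chebyshev distance, which is the largest absolute difference in any single coordinate:

\lim_{p \to \infty} (\sum_{i=1}^{n} |x_i - y_i|^p)^{1/p} = \max_i |x_i - y_i|

where x_i and y_i are the i-th coordinates of P and Q.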
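The convergence can be observed numerically. The sketch below, with hypothetical sample points and illustrative function names, measures how far the Minkowski distance is from the Chebyshev distance as p grows.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))

P = (1.0, 2.0, 3.0)
Q = (4.0, 0.0, 3.5)  # coordinate differences are 3, 2 and 0.5

orders = (1, 2, 8, 32, 128)
gaps = [abs(minkowski(P, Q, p) - chebyshev(P, Q)) for p in orders]
for p, gap in zip(orders, gaps):
    print(f"p = {p:>3}: gap to Chebyshev distance = {gap:.3e}")
```

The largest difference (here 3) dominates the sum more and more as p increases, so the gap shrinks toward zero. For very large p or very large coordinate differences the powers can overflow floating-point range, so in practice the maximum is computed directly.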

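We know that the set of points in the plane whose Euclidean distance (p = 2) from the origin is 1 forms a circle. What does this set look like when p takes other values? For p = 1 it is a square rotated by 45 degrees, with vertices at (1, 0), (0, 1), (-1, 0), and (0, -1); as p approaches infinity it becomes the axis-aligned square with vertices at (1, 1), (-1, 1), (-1, -1), and (1, -1).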
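One way to see how the unit "circle" changes with p is to follow the point where it crosses the diagonal x = y. A point (t, t) is at distance 1 from the origin when 2 t^p = 1, that is, t = 2^{-1/p}. The sketch below evaluates this for a few orders; the function names are illustrative assumptions.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def diagonal_point(p):
    """Coordinate t such that (t, t) lies at Minkowski distance 1 from the origin."""
    return 2.0 ** (-1.0 / p)

origin = (0.0, 0.0)
for p in (0.5, 1, 2, 4, 100):
    t = diagonal_point(p)
    check = minkowski(origin, (t, t), p)
    print(f"p = {p:>5}: t = {t:.4f}, distance to origin = {check:.4f}")
```

For p = 1 the crossing point is (0.5, 0.5), for p = 2 it is about (0.7071, 0.7071), and as p grows it moves toward the corner (1, 1). For p = 0.5 it is pulled in to (0.25, 0.25), so the shape bends inward toward the axes.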

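Note that when p < 1, the Minkowski distance no longer satisfies the triangle inequality. For example, when p < 1, the distance from (0,0) to (1,1) is (1+1)^{1/p} > 2, whereas the distance from (0,1) to each of these two points is 1. The detour through (0,1) therefore has total length 2, which is shorter than the direct distance.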
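The counterexample above can be reproduced directly. The sketch picks p = 0.5 as one concrete value below 1; the variable names are illustrative assumptions.

```python
def minkowski(a, b, p):
    """Minkowski formula of order p (not a true metric when p < 1)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

A = (0.0, 0.0)
B = (1.0, 1.0)
K = (0.0, 1.0)  # intermediate point
p = 0.5

direct = minkowski(A, B, p)                       # (1 + 1) ** (1 / 0.5) = 4
detour = minkowski(A, K, p) + minkowski(K, B, p)  # 1 + 1 = 2
print(direct, detour, detour < direct)
```

Because the detour (2) is shorter than the direct distance (4), the triangle inequality fails for this choice of p.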

The Minkowski distance is intuitive, but it ignores the distribution of the data, which limits its usefulness. If the values along the x dimension span a much larger range than those along the y dimension, the distance formula overweights the x dimension. So, before calculating distances, we may need to z-transform the data by subtracting the mean and dividing by the standard deviation of each dimension:

z = (x - \mu) / \sigma

where \mu and \sigma are the mean and standard deviation of that dimension.
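A minimal sketch of this standardization using only the Python standard library follows. The sample values are made up for illustration, and the use of the population standard deviation (`pstdev`) is an assumption; the sample standard deviation is an equally common choice.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return the z-scores of a list of numbers: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed here)
    if sigma == 0:
        raise ValueError("cannot standardize a constant dimension")
    return [(v - mu) / sigma for v in values]

# Hypothetical data: x values are 1000 times larger than y values.
xs = [1000.0, 2000.0, 3000.0]
ys = [1.0, 2.0, 3.0]

zx = standardize(xs)
zy = standardize(ys)
print(zx)
print(zy)
```

After the transform both dimensions have mean 0 and standard deviation 1, so neither one dominates a Minkowski distance computed on the standardized coordinates.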

As you can see, this processing begins to reflect the statistical characteristics of the data. The method uses the data distribution to adjust the distances, but it assumes that the dimensions of the data are uncorrelated. If the dimensions are correlated (for example, a taller person is likely to be heavier, so height and weight are related), the Mahalanobis distance should be used instead.

"Turn" Minkowski distance
