1. What is a manifold?
The manifold learning viewpoint: the data we observe is actually generated by mapping a low-dimensional manifold into a high-dimensional space. Because of the intrinsic structure of the data, some of its high-dimensional coordinates are redundant; the data can in fact be represented by fewer dimensions. Intuitively, a manifold is what a d-dimensional space looks like after being twisted inside an m-dimensional space (m > d). Note that a manifold is not a shape but a space. For example, a piece of cloth can be regarded as a two-dimensional plane, i.e. a two-dimensional space; if we now twist it (in three-dimensional space), it becomes a manifold. Of course, it is a manifold even before it is twisted: Euclidean space is a special case of a manifold.
For example, a point on a sphere (that is, a point in three-dimensional Euclidean space) can be represented by a triple of coordinates given by the standard spherical parameterization:

$$x = r\sin\theta\cos\varphi,\qquad y = r\sin\theta\sin\varphi,\qquad z = r\cos\theta$$
In fact, this three-dimensional coordinate is generated by only two variables, θ and φ; in other words, it has two degrees of freedom, which corresponds to a two-dimensional manifold.
A manifold is locally homeomorphic to Euclidean space: locally it has the properties of Euclidean space, so the Euclidean distance can be used to compute distances there. This is a great inspiration for dimensionality reduction: if a low-dimensional manifold is embedded in a high-dimensional space, the distribution of samples in the high-dimensional space may look complex, but locally it still behaves like Euclidean space. We can therefore build a local dimensionality-reducing mapping and then try to extend that local mapping relationship to the whole space. Moreover, when data is reduced to two or three dimensions it can be visualized, so manifold learning can also be used for visualization.
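To make the twisted-cloth picture concrete, here is a minimal sketch, assuming scikit-learn is available (the sample count and noise level are illustrative), that generates the classic Swiss-roll dataset: a two-dimensional manifold embedded in three-dimensional space.

```python
from sklearn.datasets import make_swiss_roll

# 1000 points sampled from a 2-D sheet rolled up inside 3-D space;
# t is an intrinsic coordinate along the roll (one of the two degrees of freedom)
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)
print(X.shape)  # (1000, 3): ambient dimension is 3, intrinsic dimension is 2
```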
2. Isometric Mapping (ISOMAP)
Let's start with the MDS algorithm. The core idea of MDS is to find a low-dimensional space in which the distances between samples are essentially the same as their distances in the high-dimensional space. MDS thus uses the similarity between samples to keep the output after dimensionality reduction consistent with the input before reduction (though its computational cost is large). However, directly computing the straight-line (Euclidean) distance between samples in the high-dimensional space can be very misleading. For example, to compute the distance between the South Pole and the North Pole on Earth, you could compute the straight-line distance between the two points, but that distance is meaningless (you cannot drill a hole from the South Pole to the North Pole). So the geodesic distance is introduced: the geodesic distance is the true distance between two points along the manifold.
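For reference, here is a minimal sketch of classical MDS under the assumption of a precomputed pairwise distance matrix D (the function name classical_mds is an illustrative choice); it shows the double-centering plus eigendecomposition that recovers low-dimensional coordinates:

```python
import numpy as np

def classical_mds(D, d=2):
    """Classical MDS: recover d-dim coordinates whose pairwise
    Euclidean distances approximate the given distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n    # centering matrix
    B = -0.5 * J @ (D ** 2) @ J            # double-centered inner-product matrix
    eigvals, eigvecs = np.linalg.eigh(B)   # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:d]    # keep the top-d eigenpairs
    L = np.sqrt(np.maximum(eigvals[idx], 0.0))
    return eigvecs[:, idx] * L             # n x d embedding
```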
But how do we compute the geodesic distance between two points? After all, there are many routes from the South Pole to the North Pole, and what we want is the shortest geodesic distance between them. Here we can exploit the fact that a manifold is locally Euclidean: for each point, find its nearest neighbors using the Euclidean distance, and build a neighborhood graph from them. The problem of computing the geodesic distance between two points then becomes a shortest-path problem on the neighborhood graph (solvable with Dijkstra's algorithm).
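A minimal sketch of this neighborhood-graph construction, assuming scikit-learn and SciPy are available (k and the function name geodesic_distances are illustrative):

```python
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def geodesic_distances(X, k=10):
    """Approximate geodesic distances via shortest paths on a k-NN graph."""
    # Locally the manifold is Euclidean, so edges are weighted by Euclidean distance
    G = kneighbors_graph(X, n_neighbors=k, mode='distance')
    # Dijkstra shortest paths on the (symmetrized) neighborhood graph
    return shortest_path(G, method='D', directed=False)
```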
So what is the ISOMAP algorithm? It is essentially a variant of MDS. The idea is the same as in MDS, but the distance computed in the high-dimensional space is the geodesic distance rather than the straight Euclidean distance between two points. For the specific algorithm flow, see Zhou Zhihua's *Machine Learning* (the source of the original figure).
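In practice the whole pipeline is available off the shelf; a minimal usage sketch with scikit-learn (the hyperparameters are illustrative):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

X, _ = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)
# Neighborhood graph -> geodesic distances -> MDS, all handled inside Isomap
Z = Isomap(n_neighbors=10, n_components=2).fit_transform(X)
print(Z.shape)  # (1000, 2)
```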
The ISOMAP algorithm is global: it seeks an optimal solution over all samples at once, so when the number of samples is very large or the sample dimension is very high, the computation becomes very expensive. A more commonly used algorithm is therefore LLE (Locally Linear Embedding). LLE gives up global optimality over all samples and reduces dimensionality by guaranteeing only local optimality.
3. Local Linear Embedding (LLE)
The idea of locally linear embedding: only try to preserve the relationships between samples within each neighborhood. After the samples are mapped from the high-dimensional space to the low-dimensional space, the linear relationship among the samples in each neighborhood stays unchanged.
That is, the coordinates of a sample point $x_i$ can be reconstructed from its neighborhood samples $x_j$, $x_k$, $x_l$, with the weight coefficients kept consistent between the high- and low-dimensional spaces:

$$x_i = w_{ij}x_j + w_{ik}x_k + w_{il}x_l$$
The LLE algorithm can be divided into two steps:
The first step is to compute the neighborhood reconstruction coefficients w for all samples based on their neighborhood relationships, i.e., to find the linear relationship between each sample and the samples in its neighborhood (see the sketch after this list).
The second step is to find the coordinates of each sample in the low-dimensional space while keeping the neighborhood reconstruction coefficients unchanged.
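A minimal sketch of the first step, assuming scikit-learn is available (k, reg, and the function name lle_weights are illustrative; the small regularization term is a common stabilizer when the local Gram matrix is singular):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lle_weights(X, k=10, reg=1e-3):
    """Step 1: reconstruct each sample from its k nearest neighbors."""
    n = X.shape[0]
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    W = np.zeros((n, n))
    for i in range(n):
        Ni = idx[i, 1:]                      # neighbors (column 0 is the point itself)
        Z = X[Ni] - X[i]                     # neighbors centered at x_i
        C = Z @ Z.T                          # k x k local Gram matrix
        C += reg * np.trace(C) * np.eye(k)   # regularize for numerical stability
        w = np.linalg.solve(C, np.ones(k))   # solve C w = 1
        W[i, Ni] = w / w.sum()               # enforce sum(w) = 1
    return W
```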
Writing $W$ for the weight matrix and defining $M = (I - W)^\top(I - W)$, the problem can be written as

$$\min_{Z}\ \operatorname{tr}(ZMZ^\top),\qquad \text{s.t. } ZZ^\top = I$$
Therefore, the problem becomes an eigendecomposition of the matrix M: the eigenvectors corresponding to the d' smallest eigenvalues of M form the low-dimensional coordinates Z. For the specific flow of the LLE algorithm, see Zhou Zhihua's *Machine Learning* (the source of the original figure).
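A minimal sketch of this second step, assuming W comes from the lle_weights sketch above (note: the very smallest eigenvector of M is the constant vector and is conventionally discarded, so the next d' eigenvectors are taken):

```python
import numpy as np

def lle_embed(W, d=2):
    """Step 2: low-dimensional coordinates from the reconstruction weights W."""
    n = W.shape[0]
    I = np.eye(n)
    M = (I - W).T @ (I - W)               # sparse in practice; dense here for clarity
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return eigvecs[:, 1:d + 1]            # skip the constant eigenvector, keep d
```

scikit-learn's sklearn.manifold.LocallyLinearEmbedding bundles both steps into a single fit_transform call.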
LLE algorithm summary:
Main advantages:
1) It can learn locally linear low-dimensional manifolds of any dimension.
2) The algorithm reduces to a sparse-matrix eigendecomposition, so its computational complexity is relatively small and it is easy to implement.
3) It can handle nonlinear data and perform nonlinear dimensionality reduction.
Main disadvantages:
1) The manifold learned by the algorithm cannot be closed, and the sample set must be dense.
2) The algorithm is sensitive to the choice of nearest-neighbor samples; different numbers of nearest neighbors have a great influence on the final dimensionality-reduction result.