What are eigenvectors, eigenvalues, and matrix decomposition?

Source: Internet
Author: User

 


Geometric meaning of eigenvectors and eigenvalues

Eigenvectors do have a clear geometric meaning. A matrix (since we are discussing eigenvectors it is of course square; generalized eigenvectors are not discussed here, only ordinary ones) multiplied by a vector gives another vector of the same dimension, so matrix multiplication corresponds to a transformation that maps a vector to another vector of the same dimension. What is the effect of this transformation? That depends on the structure of the square matrix. For example, we can choose a suitable 2x2 matrix whose effect is to rotate every vector in the plane counterclockwise by 30 degrees. We can then ask: is there a vector whose direction is unchanged under this transformation? Think about it: apart from the zero vector, no vector can be rotated by 30 degrees in the plane without changing direction. So the matrix corresponding to this transformation (or the transformation itself) has no eigenvector, at least no real one (note: an eigenvector cannot be the zero vector). An eigenvector of a transformation is therefore a vector that, after this particular transformation, keeps its direction and is only scaled in length. (Look at the defining equation Ax = cx: cx is the result of the matrix A acting on the vector x, and clearly cx points in the same direction as x.) If x is an eigenvector, then ax is also an eigenvector for any nonzero scalar a, so an eigenvector is really a family of vectors rather than a single vector. The eigenvalue only records the scaling factor applied to the eigenvector by the transformation. For a transformation, the direction singled out by the eigenvector is what matters most; the eigenvalue is less important, and although we compute the eigenvalues first when finding these two quantities, the eigenvectors are the more essential objects.
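As a quick check of the rotation example (a NumPy sketch with names of my own choosing; the article itself gives no code), the 30-degree rotation matrix has only complex eigenvalues, confirming that no real nonzero vector keeps its direction:

```python
import numpy as np

# 30-degree counterclockwise rotation of the plane
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

eigenvalues, eigenvectors = np.linalg.eig(R)
print(eigenvalues)  # complex conjugate pair cos(30) +/- i*sin(30): no real eigenvector
```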

For example, consider the transformation of the plane that reflects each vector across the horizontal axis: the abscissa of a vector is unchanged and the ordinate changes sign. This transformation is represented by the matrix [1 0; 0 -1], where the semicolon indicates a new row; clearly [1 0; 0 -1] * [a b]' = [a -b]', where the prime denotes transpose, which is exactly what we want. Now let us guess the eigenvectors of this matrix, that is, the vectors whose direction is unchanged under the transformation. Obviously the vectors on the horizontal axis keep their direction (remember this is a mirror reflection, and the mirror itself, the horizontal axis, of course does not move), so we can guess directly that [a 0]' (a not 0) is an eigenvector. Is there anything else? Yes: a vector on the vertical axis is reversed by the transformation, but it still lies on the same line, so its direction is also regarded as unchanged, and [0 b]' (b not 0) is an eigenvector as well. These are indeed the eigenvectors of the matrix [1 0; 0 -1].
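A small numerical check of this guess (again a NumPy sketch, not part of the original article): the reflection matrix has eigenvalues 1 and -1 with eigenvectors along the two axes.

```python
import numpy as np

# Reflection across the horizontal axis: (a, b) -> (a, -b)
A = np.array([[1, 0],
              [0, -1]])

w, V = np.linalg.eig(A)
print(w)  # [ 1. -1.]
print(V)  # columns are (up to sign and scale) [1, 0] and [0, 1]

x = np.array([3.0, 0.0])          # a vector on the horizontal axis
print(np.allclose(A @ x, 1 * x))  # True: direction unchanged, eigenvalue 1
y = np.array([0.0, 2.0])          # a vector on the vertical axis
print(np.allclose(A @ y, -1 * y)) # True: flipped, eigenvalue -1
```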

The following is reposted from http://blog.sina.com.cn/s/blog_531bb7630100gxid.html

What are eigenvectors, eigenvalues, and matrix decomposition?

Mathematical meaning of eigenvectors:

First, let us look at a linear transformation. For example, the equation of an ellipse in the x-y coordinate system can be written as x^2/a^2 + y^2/b^2 = 1; after the coordinate system is rotated about the origin, the equation of the ellipse changes. We can multiply the (x, y) of the original coordinate system by a matrix M to obtain the new representation (x', y'), written in operator form as (x, y) M = (x', y'). The matrix M here represents a linear transformation: stretching, translation, rotation. So, is there a vector b such that the result of the transformation looks like a number m multiplied by b? In other words, is there a vector b such that the linear transformation A*b is equivalent to simply scaling b, that is, A*b = m*b?
If so, b is an eigenvector of A and m is the corresponding eigenvalue. A matrix can have many eigenvectors. The eigenvalues are found from the characteristic equation, and the eigenvectors are then found from the equations corresponding to each eigenvalue, and vice versa. For example, let A be a 3x3 real symmetric matrix, let a1 = (a, -a, 1)^T be a solution of Ax = 0 and a2 = (a, 1, -a)^T be a solution of (A + E)x = 0, and suppose a < 2; what is the constant a? Since a1 = (a, -a, 1)^T solves Ax = 0, a1 is an eigenvector of A for the eigenvalue 0; since a2 = (a, 1, -a)^T solves (A + E)x = 0, a2 is an eigenvector of A for the eigenvalue -1. Eigenvectors of a real symmetric matrix belonging to different eigenvalues are orthogonal, so a1 . a2 = a^2 - a - a = a(a - 2) = 0, giving a = 0 or a = 2; since a < 2, we get a = 0.
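The orthogonality fact used above is easy to verify numerically; the sketch below (illustrative values, NumPy assumed) checks both the dot product a1 . a2 = a^2 - 2a and the general statement for a random symmetric matrix:

```python
import numpy as np

# Orthogonality condition from the worked problem: a1 . a2 = a^2 - 2a = 0
for a in (0.0, 2.0):
    alpha1 = np.array([a, -a, 1.0])   # eigenvector for eigenvalue 0
    alpha2 = np.array([a, 1.0, -a])   # eigenvector for eigenvalue -1
    print(a, np.dot(alpha1, alpha2))  # 0.0 for both roots of a(a - 2) = 0

# General check: a real symmetric matrix has an orthonormal set of eigenvectors
S = np.random.randn(4, 4)
S = S + S.T                             # symmetrize
w, V = np.linalg.eigh(S)                # eigh is specialized for symmetric matrices
print(np.allclose(V.T @ V, np.eye(4)))  # True: the columns of V are orthonormal
```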

 

This is still too abstract. Concretely, finding the eigenvectors amounts to an orthogonal decomposition of the space represented by the matrix, so that the row vectors of A can be expressed by their projections onto each eigenvector. For example, if A is an m x n matrix with n > m, then there are at most m eigen-directions (because the rank is at most m), and each of the n row vectors is projected onto the eigenvectors e_i with weights v_in; each row vector can then be written as v_n = (e_1*v_1n, e_2*v_2n, ..., e_m*v_mn), and the matrix is represented in square form. If the rank of the matrix is small, this representation can be stored in compressed form. Furthermore, because these projections measure how much of A lies along each direction of the eigen-space, we can use least squares to keep the components with the largest projection energy and drop the rest, preserving as much of the information in the matrix as possible while greatly reducing the number of dimensions that must be stored; this, in short, is the PCA method.
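A minimal PCA sketch along these lines, using the usual covariance-eigenvector route (array shapes and names are my own and only illustrative, not the author's procedure verbatim):

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the k eigenvectors of the covariance
    matrix with the largest eigenvalues (the largest projection energy)."""
    Xc = X - X.mean(axis=0)                 # center the data
    C = np.cov(Xc, rowvar=False)            # d x d covariance matrix
    w, V = np.linalg.eigh(C)                # eigenvalues of the symmetric matrix C
    top = V[:, np.argsort(w)[::-1][:k]]     # k eigenvectors with largest eigenvalues
    return Xc @ top                         # n x k matrix of projection coefficients

X = np.random.randn(100, 10)                # 100 samples, 10 dimensions
Z = pca_project(X, 3)
print(Z.shape)                              # (100, 3): reduced representation
```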

For example, take a point (x, y) in the x-y plane and apply the linear transformation (x, y) * [1, 0; 0, -1], where the semicolon indicates a new row of the matrix. The result is (x, -y); this linear transformation is a mirror reflection about the x axis. We can find two eigenvectors of the matrix [1, 0; 0, -1], namely [1, 0] and [0, 1], that is, the x axis and the y axis. What does this mean? The projection onto the x axis is unchanged by this linear transformation, while the projection onto the y axis is multiplied by the amplitude coefficient -1 without being rotated. The two eigenvectors say that this linear transformation matrix acts as pure scaling with respect to the orthogonal basis formed by the x and y axes. For other linear transformation matrices we can likewise find n axes of symmetry; the transformation does nothing more than scale along these n axes, and these n axes are the n eigenvectors of the linear transformation. This is the physical meaning of eigenvectors. A matrix A is thus equivalent to a linear transformation.

In practical matrix algorithms, the inverse of a matrix is often needed; when the matrix is not square, an ordinary inverse does not exist. This calls for the singular value decomposition, A = PSQ, where P and Q are orthogonal matrices and S is a diagonal matrix of singular values; from it the pseudo-inverse can be obtained. At the same time, A = PSQ can be used to reduce the storage needed for A, as long as P is a tall, thin matrix and Q is a short, wide matrix; for very large matrices the storage can be reduced by several orders of magnitude.
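A short sketch of both uses of the SVD mentioned above, the pseudo-inverse and low-rank storage (NumPy, with illustrative sizes):

```python
import numpy as np

A = np.random.randn(6, 4)              # a non-square matrix: no ordinary inverse

# Singular value decomposition A = P S Q (P, Q orthogonal; S diagonal)
P, s, Q = np.linalg.svd(A, full_matrices=False)

# Pseudo-inverse via the SVD: A+ = Q^T diag(1/s) P^T
A_pinv = Q.T @ np.diag(1.0 / s) @ P.T
print(np.allclose(A_pinv, np.linalg.pinv(A)))   # True (full-rank case)

# Truncated SVD: keep only the r largest singular values to compress storage
r = 2
A_r = P[:, :r] @ np.diag(s[:r]) @ Q[:r, :]      # best rank-r approximation of A
print(A_r.shape)
```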

What is the physical meaning of eigenvectors? For example, for a standing wave on a rope, the displacements of all the points on the rope form an (infinite-dimensional) vector; the eigenvector of this system is the eigenfunction sin(t), called an eigenfunction because it varies with time, and the value of sin(x + t) at each point at a particular instant gives that point's component. Another example: seen from a fixed direction in space, the coordinates of everything on Earth keep changing, but this transformation is symmetric about the Earth's axis of rotation, that is, it is insensitive to translation and stretching along that axis; the Earth's rotation axis is therefore an eigenvector of the spatial transformation given by the Earth's rotation. Google's PageRank works on a corrected adjacency matrix of the WWW link graph: the components of the projection onto the principal eigenvector give the score of each page. What other properties do eigenvalues have?
AB and BA have the same eigenvalues: if x is an eigenvector of AB with eigenvalue c, then (AB)x = cx; multiplying both sides on the left by B gives B(AB)x = (BA)(Bx) = c(Bx), so c is also an eigenvalue of BA, with corresponding eigenvector Bx (provided Bx is not zero). And vice versa.
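A quick numerical confirmation of this AB/BA property (random matrices, NumPy assumed):

```python
import numpy as np

A = np.random.randn(4, 4)
B = np.random.randn(4, 4)

eig_AB = np.sort_complex(np.linalg.eigvals(A @ B))
eig_BA = np.sort_complex(np.linalg.eigvals(B @ A))
print(np.allclose(eig_AB, eig_BA))   # True: AB and BA share the same eigenvalues

# And if (AB)x = c*x, then Bx is an eigenvector of BA for the same c
c, X = np.linalg.eig(A @ B)
x = X[:, 0]
print(np.allclose((B @ A) @ (B @ x), c[0] * (B @ x)))  # True
```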

What are the eigen-matrix and the eigenvalues? Speaking generally, if the spectrum of A is P(A) = (1, 2, 3), these being the three eigenvalues of A, then P(A^2) = (1^2, 2^2, 3^2). P can be viewed as an operator; of course the properties of such operators have to be proved case by case, but once proved they can be used as properties of the whole. What do the eigenvalues mean? They mean the matrix can be decomposed into projections onto n eigenvector directions, with the n eigenvalues giving the scale of each projection direction. Since an n x n matrix A can be projected onto an orthogonal vector space, and any matrix built from n-dimensional eigenvectors can serve as a linear projection transformation matrix (I itself is such a projection matrix), for an eigenvalue m there must be a nonzero vector a with Aa = ma; multiplying by I on both sides gives
Aa = mIa, so (A - mI)a = 0 has a nonzero solution, and therefore |A - mI| = 0. (One can argue by contradiction: if this determinant were not 0, the n column vectors would be linearly independent, and in an n-dimensional space the only solution would be the origin, so there could be no nonzero solution.) Some useful properties follow. For example, for the diagonal matrix A = diag(1/2, 1/3, 1/5), solving |A - mI| = 0 immediately gives the eigenvalues (1/2, 1/3, 1/5). If an n x n matrix has rank 1, its largest linearly independent set has size 1 and there is essentially one eigen-direction with a nonzero eigenvalue; for the identity matrix, every nonzero n-dimensional vector is an eigenvector. The eigenvectors themselves are not fixed once and for all, just as a coordinate system can be rotated; but once the direction of each eigenvector is fixed, the vector of eigenvalues is determined as well. The procedure for finding the eigenvalues is to use the characteristic equation |A - mE| = 0. Also P(1/A) = 1/P(A), that is, the eigenvalues of the inverse are the reciprocals of the eigenvalues, which can be proved. What is the physical meaning of |A - mI| = 0? Remove one dimension from n linearly independent directions and at least two of the vectors become linearly dependent, so the determinant is 0. What is the use of the eigen-matrix? It diagonalizes the matrix by a similarity transformation, A = P^{-1} B P; under this transformation the resulting matrix is diagonal.
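A few of these properties checked numerically (the diagonal example, the eigenvalues of A^2, and the vanishing determinant at an eigenvalue); a NumPy sketch, not from the original text:

```python
import numpy as np

# For a diagonal matrix, the characteristic equation |A - mI| = 0 yields the
# diagonal entries directly as the eigenvalues.
A = np.diag([1/2, 1/3, 1/5])
print(np.linalg.eigvals(A))          # [0.5, 0.333..., 0.2]

# The eigenvalues of A^2 are the squares of the eigenvalues of A.
print(np.linalg.eigvals(A @ A))      # [0.25, 0.111..., 0.04]

# At an eigenvalue m the determinant |A - mI| vanishes.
m = 1/3
print(np.isclose(np.linalg.det(A - m * np.eye(3)), 0.0))   # True
```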

The study of linear algebra treats vectors and matrices as wholes, moving from the properties of their parts to the properties of the whole, and then deriving applications and physical concepts from those overall properties. When the matrix A is handled as a single symbol, its behaviour is very similar to that of a real number. Scientific theorems always seem to recur in this way. As another example, the basic concepts of calculus are the differential, the integral and the derivative, and alongside them one can think of the mean value theorems of differential and integral calculus.

Limitations of linear transformations: PCA, which is a linear transformation, can be used to process images, for example for 2-D face recognition:

1. We regard image A as a matrix, and further as a linear transformation matrix, and find the eigen-matrix of the training images (say the n eigenvectors with the largest energy). Multiplying A by these n eigenvectors gives an n-dimensional vector a, the projection of A in the eigenspace.

2. Later, an image of the same class (for example, a face photo of the same person), regarded as linearly correlated with A, is multiplied by the same eigenvectors to obtain a vector b of n numbers, the projection of B in the eigenspace. The distance between a and b is the criterion for judging whether B belongs to the same class as A (a minimal sketch of this pipeline follows).
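A minimal sketch of this two-step pipeline (random arrays stand in for real face images; the eigenvectors are taken from the covariance matrix of the training set, which is one common reading of the procedure above, not necessarily the author's exact recipe):

```python
import numpy as np

# Training set: m images, each flattened to a d-dimensional row vector
m, d, n = 20, 64, 5                       # n = number of eigenvectors kept
train = np.random.randn(m, d)             # stand-in for real face images

mean = train.mean(axis=0)
C = np.cov(train - mean, rowvar=False)
w, V = np.linalg.eigh(C)
E = V[:, np.argsort(w)[::-1][:n]]         # d x n: the n highest-energy eigenvectors

def project(image):
    """Steps 1-2: project a flattened image into the n-dimensional eigenspace."""
    return (image - mean) @ E

a = project(train[0])                                # projection of training image A
b = project(train[0] + 0.1 * np.random.randn(d))     # a slightly perturbed same-class image B
print(np.linalg.norm(a - b))                         # small distance -> judged the same class
```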

However, PCA has a natural drawback: the linear correlation test between vectors has the advantage of being shift independent, but at the same time it completely ignores the fact that in a two-dimensional image the order of the vector components is meaningful; different orderings can represent completely different information. In addition, image B must be scaled consistently with A (as determined by the eigenvector space) to project well into A's eigenspace, and if B contains any rotation relative to A, PCA can fail outright. Therefore, when the PCA method is used directly for image recognition in practice, the recognition rate is not high: it requires the images to be strictly aligned in orientation and normalized. For this reason PCA is generally used for dimensionality reduction of the feature matrix rather than for extracting features directly. Of course, the result of plain dimensionality reduction is not ideal for classification, so we can further apply the Fisher transform, which separates the classes by between-class distance in the least-squares sense. The Fisher transform, however, introduces a new weakness: it becomes more sensitive to the training data of each class. The price of the improved classification is a loss of generality, and when the number of classes grows sharply, classification performance still drops steeply, though it remains much better than classifying with PCA directly.

The K-L transform is one application of PCA. Suppose image class C has n images. Each image is flattened into a vector, the n vectors form a matrix, and we compute the eigenvectors (column vectors) of that matrix. We then multiply the original n images by these column vectors and take the average to obtain the feature image. The feature image resembles the original images, but some of the deformation information related to stretching and translation has been removed; it gains robustness at the cost of a good deal of accuracy. It is therefore suitable for verifying images within a specific range, that is, for deciding whether an image P belongs to class C. Comparison with neural networks: to put it bluntly, the mapping of a function y = f(x) is replaced by the vector mapping [Y] = [F(X)], with fixed input and output entries. A real nervous system does not draw a clear line between internal processing and external interfaces, so although these models are all called neural networks, they are in essence far apart.

 

Finally, what is a spectrum? We know that music is a dynamic process, yet a musical score on paper is static. For the tools of mathematical analysis, a time-varying function can be studied through the frequency spectrum given by the Fourier transform. For probabilistic problems, although every realization comes out differently, the power spectral density of the probability distribution can still be obtained. As a metaphysical tool, mathematics focuses on the unchanging laws within a changing world.

[5. Can it be used for classification?]

 

The so-called eigen-matrix describes how the original matrix is similar to a diagonal matrix: lambda(i) describes the i-th axis of the n-dimensional linear space onto which the matrix is projected by similarity, and lambda(i) is the scaling ratio along that axis. The order of the lambda(i) is not important, because interchanging coordinate axes is an elementary linear transformation and does not affect the algebraic and topological properties. The eigenvector x_i shows how A is combined linearly onto each coordinate axis. The eigenvectors form a set of orthogonal basis vectors.

 

When an image is regarded as a matrix in the problem domain of image processing, image classification assumes that similar matrices share some identical, or algebraically approximate, "invariant". Obviously, "similar" here is a class defined by subjective assumption, not a class "determined" by computation. This creates a problem: the so-called different classes are a prior of human subjective understanding, not a posterior obtained by computation, and they carry no deterministic information in mathematical logic. If the eigenvectors or the eigen-matrix of a matrix are used as the classification information, there is no evidence that matrices of different "classes" will have eigenvalues that are any further apart. The matrix decomposition methods, such as the minimum within-class distance method (Fisher), rest on an awkward premise: they must ensure that the Euclidean distances within a class are small enough, and Euclidean distance often disagrees with the geometric topology that humans perceive. Because the matrix itself carries no predefined topological information, once the Euclidean distance between images of the same class grows, they can no longer be classified well. At the same time, the more classes the images are divided into, the more severely those subspaces overlap; even if we look for linearly invariant subspaces or factors within each class's subspace, this overlap cannot be eliminated. The Fisher algorithm tries to work around it, but at the cost of depending heavily on the initial data and of losing generality. The PCA algorithm tries to obtain the best classification in the statistical sense, but when the number of classes increases, the previously learned parameters are invalidated and no usable computational procedure remains; since the overlapping subspaces cannot be resolved, classification performance keeps declining. Why?
It is because the classification itself is not derived from the algebraic characteristics of the linear transformation; it is a prior, nonlinear, "intelligent" human judgement. Binary computation, by contrast, is classification over discrete sets and must be carried out through orthogonal partitions of a linear space. This leads to a logically irreconcilable paradox: the nonlinear judgement is continuous, geometric and topological, has infinitely many points and non-separable variables, and cannot be modelled at all, so the problem is undecidable.

 

So, setting aside the ideas of higher algebra, can practical signal processing methods extract local features for classification? This still does not answer the question of the "a priori" classification; it is still an attempt to barely make something work on a bad premise. How do we know that a local region of one matrix actually corresponds to the same local region of another matrix? That is again a subjective, intuitive judgement. A computer is just a variation on paper and pencil; it cannot understand meaning. Even for the result of an operation like 1 + 1 = 2, it cannot decide whether it is right or wrong, and if it asks other computers to decide, how would those computers prove themselves right or wrong? They cannot. One has to wait for a subject, a "person", to observe the result before the result becomes meaningful. So, like Schrödinger's cat, she lazily smiles at me in the sun. However subtle the metaphysical theory, it does not escape the cage of empiricism.

 

Therefore, I no longer need algorithms or philosophy.

The eigenvalues are the roots of the univariate polynomial equation (the characteristic equation) associated with the matrix.
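For instance, for the reflection matrix used earlier the characteristic equation works out as:

```latex
\det(A - \lambda I)
  = \det\begin{pmatrix} 1-\lambda & 0 \\ 0 & -1-\lambda \end{pmatrix}
  = (1-\lambda)(-1-\lambda) = \lambda^2 - 1 = 0
  \quad\Longrightarrow\quad \lambda = \pm 1 .
```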

The eigenvalue indicates how much the matrix stretches or compresses a vector. For example, an eigenvalue of 1 means that after the transformation the vector is not stretched at all; physically this corresponds to a rigid-body motion, in which the overall frame may move while the internal structure is unchanged.

In quantum mechanics, a matrix represents a mechanical observable, the eigenvectors of the matrix represent the stationary-state wave functions, and the eigenvalues of the matrix represent the possible observed values of that observable.

Multiplying a vector (or function) by a matrix represents a linear transformation of that vector. If the transformed vector is the original vector multiplied by a constant, that constant is called the eigenvalue; this is the mathematical meaning of the eigenvalue.

The physical meaning of eigenvalues depends on the specific setting: for example, the natural frequencies in dynamics, the critical loads in stability analysis, and the principal stresses in stress analysis.

To understand the eigenvalues of a matrix we must start from linear transformations: a matrix should be regarded as the representation of a linear transformation under a particular basis. The simplest linear transformation is multiplication by a scalar, and the purpose of finding eigenvalues is to see whether a linear transformation acts on some nonzero vectors as nothing more than scalar multiplication. The eigenvalue is the ratio of that scalar multiplication, and those nonzero vectors are the eigenvectors. In fact we care more about the eigenvectors: we hope to decompose the original linear space into a direct sum of the subspaces associated with the eigenvectors, so that our study can be carried out within these subspaces. This is just like resolving a motion into horizontal and vertical components when studying motion in physics!
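A small sketch of this decomposition idea for a symmetric matrix (NumPy, illustrative values): expressing a vector in the eigenvector basis turns the action of A into plain scalar multiplication along each direction.

```python
import numpy as np

# A symmetric matrix: its eigenvectors give an orthogonal decomposition of the space
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, V = np.linalg.eigh(A)          # eigenvalues w, orthonormal eigenvectors in the columns of V

x = np.array([3.0, -1.0])
coeffs = V.T @ x                  # coordinates of x in the eigenvector basis

# Applying A is just scalar multiplication in each eigen-direction:
Ax_via_eigen = V @ (w * coeffs)
print(np.allclose(A @ x, Ax_via_eigen))   # True
```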

 

Using MATLAB to find the eigenvector of a matrix's largest eigenvalue

Use the function [V, D] = eig(A).

 

The diagonal elements of matrix D hold all the eigenvalues of A, arranged in ascending order (this holds for a symmetric A; for a general matrix it is safer to sort them explicitly). Each column of matrix V holds the corresponding eigenvector. So the last column of V is the eigenvector of the largest eigenvalue.
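For readers without MATLAB, a rough NumPy analogue of these steps (my own translation, with an explicit sort so that the last-column convention always holds):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

w, V = np.linalg.eigh(A)       # symmetric case: eigenvalues returned in ascending order
# (for a general matrix use np.linalg.eig and sort explicitly)
order = np.argsort(w)
w, V = w[order], V[:, order]

v_max = V[:, -1]               # last column: eigenvector of the largest eigenvalue
print(w[-1], v_max)
```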
