Machine learning notes--matrix analysis and application

Source: Internet
Author: User

0.0 The third is still mathematics, because mathematics is the basis for solving all problems; dig deep enough into any question and you end up relying on mathematical knowledge. The foundation determines the superstructure: in an ACM competition, for example, what separates the strongest contestants is not programming skill so much as mathematical knowledge. If you want to go far, the mathematical foundation must be solid. Unfortunately, mathematics crammed before an exam is soon forgotten, so it is time to review it again. Dr. Cheng took two hours to summarize the contents of the two books "Linear Algebra" and "Matrix Theory", combined with other relevant material.


1. A review of linear algebra concepts

Before the lecture I re-read the undergraduate "Linear Algebra" textbook to brush up on its important concepts, in preparation for further study.

Inverse matrix: for an n-order matrix A, if there exists an n-order matrix B such that AB = BA = E, then A is said to be invertible and B is called its inverse matrix. When |A| = 0, A is called a singular matrix. An invertible matrix must be non-singular, because the necessary and sufficient condition for A to be invertible is |A| ≠ 0.
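As a quick check of these definitions, here is a minimal sketch using NumPy (the matrix is illustrative, not from the notes):

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

if abs(np.linalg.det(A)) < 1e-12:          # |A| = 0: A is singular, no inverse
    print("A is singular")
else:
    B = np.linalg.inv(A)                   # the inverse matrix B
    print(np.allclose(A @ B, np.eye(2)))   # True: AB = E
    print(np.allclose(B @ A, np.eye(2)))   # True: BA = E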

Rank of a matrix: the number of nonzero rows in the row echelon form of a matrix A is the rank of A. For an n-order matrix A there is only one n-order minor, namely |A| itself, so R(A) = n when |A| ≠ 0 and R(A) < n when |A| = 0. Thus the rank of an invertible matrix equals its order, and the rank of a non-invertible matrix is less than its order. Does rank have practical significance? It does. In SVD-based noise reduction, if the rank of the data matrix is much smaller than the sample dimension (that is, the number of columns), the samples effectively live in a low-dimensional subspace of the ambient space, which is what makes dimensionality reduction possible. Furthermore, if a matrix is viewed as a linear map, its rank is the dimension of the image space.
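A minimal sketch of the low-rank idea, assuming NumPy (the synthetic data is illustrative): 100 samples in 50 dimensions that are built from only 3 independent directions give a data matrix of rank 3.

import numpy as np

rng = np.random.default_rng(0)
basis = rng.standard_normal((3, 50))    # 3 independent directions in a 50-dimensional space
coeffs = rng.standard_normal((100, 3))  # 100 samples expressed in those directions
X = coeffs @ basis                      # 100 x 50 data matrix

print(np.linalg.matrix_rank(X))         # 3, far below the column count of 50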

Solution of a system of linear equations: the basic application of matrices in mathematics is solving linear systems, a theme that runs through the whole textbook. A complicated linear system can be written as Ax = b, where x and b are vectors. By computing the rank of A we can determine the nature of the solutions: the system has no solution if and only if R(A) < R(A, b); it has a unique solution if and only if R(A) = R(A, b) = n; and it has infinitely many solutions if and only if R(A) = R(A, b) < n.
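A minimal sketch of this rank test, assuming NumPy (the system below is illustrative and happens to fall into the "infinitely many solutions" case):

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
b = np.array([[6.0], [12.0], [2.0]])

rA  = np.linalg.matrix_rank(A)                   # R(A)    = 2
rAb = np.linalg.matrix_rank(np.hstack([A, b]))   # R(A, b) = 2
n   = A.shape[1]                                 # n       = 3

if rA < rAb:
    print("no solution")
elif rA == n:
    print("unique solution")
else:
    print("infinitely many solutions")           # R(A) = R(A, b) = 2 < 3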



Basis of a vector space: let V be a vector space. If V contains r vectors a1, a2, ..., ar that are linearly independent, and every vector in V can be written as a linear combination of a1, a2, ..., ar, then the vector group a1, a2, ..., ar is called a basis of V, r is the dimension of V, and V is called an r-dimensional vector space. Intuitively, if the vector space is viewed as a vector group, a basis is a maximal linearly independent subset: the smallest set of vectors able to represent all the others, and the dimension is the rank of the vector group.

Eigenvalues and eigenvectors: for an n-order matrix A, if a number λ and a nonzero n-dimensional column vector x satisfy the relation Ax = λx, then λ is called an eigenvalue of A, and x is called an eigenvector of A corresponding to λ. So how should we understand eigenvalues and eigenvectors?

We know that multiplying by a matrix corresponds to a transformation: it maps a vector to a new vector whose direction or length is usually different. In this transformation the original vector mainly undergoes rotation and scaling. If the matrix only scales a particular vector (or set of vectors) and does not rotate it, those vectors are called the eigenvectors of the matrix, and the scaling factor is the eigenvalue. This is, in fact, the geometric meaning of the matrix transformation and of the eigenvector (a graphic transformation). The physical meaning is a picture of motion: an eigenvector is stretched or shrunk under the action of the matrix, and the amount of stretching is determined by the eigenvalue. If an eigenvalue is greater than 1, every eigenvector belonging to it is stretched; if it is between 0 and 1, the eigenvectors are compressed; if it is less than 0, the eigenvectors are flipped to the opposite side of the origin.
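A minimal sketch of "scaling without rotation", assuming NumPy (the matrix and vectors are illustrative):

import numpy as np

A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

vals, vecs = np.linalg.eig(A)
v = vecs[:, 0]                          # an eigenvector of A
print(np.allclose(A @ v, vals[0] * v))  # True: Av = lambda*v, only the length changes

u = np.array([1.0, 1.0])                # not an eigenvector
print(A @ u)                            # [4. 2.]: the direction has changed as well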

Similarity matrix: let A and B be n-order matrices. If there exists an invertible matrix P such that P⁻¹AP = B, then B is said to be similar to A.

Diagonal matrix: a square matrix whose entries outside the main diagonal are all zero.

Quadratic form: a homogeneous polynomial of degree two in n variables is called a quadratic form. A quadratic form containing only square terms is called a standard form. The matrix representation of a quadratic form is f = xᵀAx, where A is a symmetric matrix.
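A small worked example (not from the original page): the quadratic form f(x1, x2) = x1² + 4·x1·x2 + 3·x2² can be written as f = xᵀAx with x = (x1, x2)ᵀ and the symmetric matrix A = [[1, 2], [2, 3]], where the cross-term coefficient 4 is split evenly between the two off-diagonal entries.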


2. Basic knowledge of linear algebra (new perspective)
2.1 Matrix

Following the teacher, we look at linear algebra again. First, the matrix. The intuitive picture of a matrix in the "Linear Algebra" textbook comes from the matrix representation of linear systems; here we analyze the matrix from the row view and from the column view. Take, for example, the system 2x − y = 1, x + y = 5. In the row view, each equation is a line in the two-dimensional plane, and the solution of the system is the intersection of the two lines.


In the column view, the same system expresses a relationship between vectors: with column vectors a = (2, 1) and b = (−1, 1), it asks which combination xa + yb equals (1, 5), and the answer is 2a + 3b = (1, 5).
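A minimal sketch of both views for this system, assuming NumPy:

import numpy as np

# Row view: each row of A is a line; the solution is where the two lines intersect.
A = np.array([[2.0, -1.0],
              [1.0,  1.0]])
rhs = np.array([1.0, 5.0])
x, y = np.linalg.solve(A, rhs)
print(x, y)                             # 2.0 3.0

# Column view: the same solution combines the columns of A to give the right-hand side.
a = np.array([2.0, 1.0])
b = np.array([-1.0, 1.0])
print(x * a + y * b)                    # [1. 5.], i.e. 2a + 3b = (1, 5)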

Here, let us briefly discuss dimension. The dimension studied in mathematics and the dimension studied in physics are not the same concept: the mathematical dimension is the number of independent parameters, while the physical dimension refers to the number of independent space-time coordinates. According to Einstein, the space we live in is four-dimensional: three dimensions of space plus time.

2.2 Linear dependence and linear independence

We first look at the definitions of linear dependence and linear independence. Given a group of vectors A: a1, a2, ..., am in a vector space V, if there exist numbers k1, k2, ..., km, not all zero, such that k1a1 + k2a2 + ... + kmam = 0, then the vector group A is said to be linearly dependent; otherwise, i.e. when the equation holds only if k1 = k2 = ... = km = 0, it is linearly independent. Linear independence can be understood geometrically: in a two-dimensional space, no two vectors of the group are collinear; in a three-dimensional space, no three vectors are coplanar, for if they were coplanar, the third vector could be represented by the other two.
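A minimal sketch of this test, assuming NumPy: a group of vectors is linearly independent exactly when the rank of the matrix formed from them equals the number of vectors.

import numpy as np

v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = v1 + 2 * v2                                               # dependent by construction

print(np.linalg.matrix_rank(np.column_stack([v1, v2, v3])))    # 2 < 3: dependent
print(np.linalg.matrix_rank(np.column_stack([v1, v2])))        # 2 = 2: independent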

2.3 Basis of a vector space

Basis: a group of vectors of the vector space V is called a basis of V if (1) the vectors are linearly independent, and (2) every vector in V can be expressed as a linear combination of them.

2.4 The four fundamental subspaces

Column space: the set of all linear combinations of the columns of A. The column vectors are m-dimensional, so C(A) lies in R^m. The pivot columns of A form a basis of the column space, and dim(C(A)) = rank(A) = r: the dimension equals the rank.

Null space: the set of all solutions of Ax = 0. The solutions are n-dimensional vectors, so N(A) lies in R^n; it may contain only the zero vector. A basis is given by the special solutions: r is the number of pivot variables, n − r is the number of free variables, and the dimension of the null space is n − r.

Row space: the set of all linear combinations of the rows of A, i.e. all linear combinations of the columns of Aᵀ (since we are not used to handling row vectors), so C(Aᵀ) lies in R^n. An important property: the row space and the column space have the same dimension, both equal to the rank.

Left null space: N(Aᵀ), the space orthogonal to the column space; its dimension is m − r. The four subspaces can be pictured as follows: the row space and the null space live in R^n and their dimensions add up to n; the column space and the left null space live in R^m and their dimensions add up to m.
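A minimal sketch of the dimension count, assuming NumPy and SciPy (the matrix is illustrative):

import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 0.0, 1.0],
              [2.0, 4.0, 1.0, 3.0]])   # m = 2, n = 4
m, n = A.shape
r = np.linalg.matrix_rank(A)           # r = 2

print(r)                               # dim C(A)   = r     = 2
print(null_space(A).shape[1])          # dim N(A)   = n - r = 2
print(r)                               # dim C(A^T) = r     = 2
print(null_space(A.T).shape[1])        # dim N(A^T) = m - r = 0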


3. Eigendecomposition (an important technique in convex optimization)
3.1 Eigendecomposition
  • Eigendecomposition in linear algebra expresses a matrix as a product built from its eigenvalues and eigenvectors (A = QΛQ⁻¹). It is important to note that only diagonalizable matrices can be decomposed in this way.
  • Properties of the eigendecomposition: for Ax = λx, if all the eigenvalues are distinct, then the corresponding eigenvectors are linearly independent, and A can be diagonalized. But not every square matrix can be diagonalized.
  • Eigendecomposition of symmetric matrices: if the eigenvalues of a symmetric matrix are distinct, the corresponding eigenvectors are orthogonal.
  • Quadratic forms: used to determine whether a matrix is positive definite, positive semi-definite, negative definite, or indefinite. If all the eigenvalues of the matrix are greater than 0, it is positive definite (see the sketch after this list). The original figure showed two quadratic-form surfaces, a convex function on the left and a non-convex function on the right; the optimization of convex functions is discussed in detail later.
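A minimal sketch of the decomposition and of the eigenvalue test for positive definiteness, assuming NumPy (the symmetric matrix is illustrative):

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                           # symmetric

vals, Q = np.linalg.eig(A)                           # eigenvalues 3 and 1
L = np.diag(vals)
print(np.allclose(A, Q @ L @ np.linalg.inv(Q)))      # True: A = Q L Q^{-1}
print(np.allclose(Q.T @ Q, np.eye(2)))               # True: the eigenvectors are orthogonal
print(np.all(vals > 0))                              # True: A is positive definite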

3.2 PCA

PCA is an effective application of eigendecomposition. In image feature extraction, too many feature dimensions often make feature matching complicated and consume system resources, so we resort to feature dimensionality reduction, i.e. representing high-dimensional features with low-dimensional ones. There are two general families of methods: feature selection and feature extraction. Feature selection picks a subset of the high-dimensional features as the new features; feature extraction maps the high-dimensional features to a lower-dimensional space through some function. The most common feature extraction method is PCA (principal component analysis). For the example matrix X from the lecture: 1. compute the covariance matrix; 2. compute the eigenvalues of Cx, which are λ1 = 2, λ2 = 2/5, and the corresponding eigenvectors; 3. reduce dimensionality by multiplying the transpose of the chosen eigenvector with X, which yields a single-row matrix.
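Since the worked matrix from the lecture is not reproduced here, the following is a minimal sketch of the same three steps on an illustrative data matrix, assuming NumPy:

import numpy as np

X = np.array([[2.5, 0.5, 2.2, 1.9, 3.1],
              [2.4, 0.7, 2.9, 2.2, 3.0]])       # 2 features x 5 samples (illustrative)

Xc = X - X.mean(axis=1, keepdims=True)          # center the data
C = Xc @ Xc.T / (X.shape[1] - 1)                # 1. covariance matrix (2 x 2)
vals, vecs = np.linalg.eigh(C)                  # 2. eigenvalues and eigenvectors (ascending)
top = vecs[:, [-1]]                             #    direction of the largest eigenvalue
Y = top.T @ Xc                                  # 3. project: transpose(eigenvector) * X
print(Y.shape)                                  # (1, 5): a single-row matrix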
4. SVD decomposition and applications

There are two ways to implement PCA: eigenvalue decomposition and SVD decomposition. SVD (singular value decomposition) can represent a complicated matrix by a few smaller sub-matrices, each of which captures an important property of the original matrix. SVD: any matrix of rank r can be decomposed as the following formula:
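The formula image did not survive here; in standard notation it would read A = UΣVᵀ = σ1·u1·v1ᵀ + σ2·u2·v2ᵀ + … + σr·ur·vrᵀ, where Σ is a diagonal matrix carrying the singular values σ1 ≥ σ2 ≥ … ≥ σr > 0.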


where U and V are orthogonal matrices. The equation shows the relationship between the SVD and the subspaces. Formula (13) is easy to analyze but not efficient; formula (14) is efficient but sometimes inconvenient to analyze; formula (15) expands naturally and is convenient for computing low-rank approximations. In addition, the SVD provides a fast way to compute orthogonal bases of the four fundamental subspaces.
One application of SVD is image compression. Given an image of 256 × 512 pixels, consider a low-rank approximation that stores only singular vectors. If a single singular vector is kept (k = 1), the compression ratio is roughly (256 × 512) / (256 + 512) ≈ 170. But if k is too small, image quality suffers, so in practice k is not that small. The four pictures below showed the original image and the reconstructions with k = 1, k = 10, and k = 80.

Here is the code snippet for image compression:
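The snippet itself did not survive this page; below is a minimal sketch, assuming NumPy, where img stands in for a 256 × 512 grayscale image loaded as a 2-D array:

import numpy as np

def compress(img, k):
    # Keep the k largest singular values and rebuild a rank-k approximation of the image.
    U, s, Vt = np.linalg.svd(img, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

img = np.random.default_rng(0).random((256, 512))    # stand-in for the real image
for k in (1, 10, 80):
    approx = compress(img, k)
    err = np.linalg.norm(img - approx) / np.linalg.norm(img)
    print(k, round(err, 3))                           # relative error drops as k grows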


Resources:


1. July Algorithmic Machine Learning course
2. "Linear algebra"3. The geometrical meaning of linear algebra4. "Matrix Computing"

