While reading up on sparse coding I ran into a large amount of material and kept confusing it with what I knew about compressed sensing. Having gone through several blog posts, I am writing down here how the two are related.
In fact, compressed sensing only uses sparse representation as a tool for reconstructing signals.
Compressed sensing theory has three main components: sparse signal representation, encoded measurement, and the reconstruction algorithm. Sparse representation means that when the signal is projected onto an orthogonal transform basis it has a concise expression; this is the prior condition for compressed sensing. The encoded measurement must satisfy the restricted isometry (near distance-preserving) constraint. Finally, a reconstruction algorithm is used to recover the original signal.
I. Comparison of sparse coding and compressed sensing
In the compressed sensing model: Y = AX + N (1)
Here X is the original signal, A is the sensing matrix, N is additive noise, and Y is the compressed measurement. In this model, if the original signal X is sufficiently sparse, the sensing matrix A can compress it into a much smaller vector space: the number of rows of Y is far smaller than the length of X. This reflects the core idea of sparse theory: describe a high-dimensional signal with a low-dimensional one.
In the sparse representation model, Y denotes the original signal, A the dictionary, and X the sparse representation of the original signal under dictionary A. In general the dictionary A is overcomplete, i.e. it has redundant columns, which is consistent with the sensing matrix in compressed sensing. Ignoring noise: Y = AX (2)
Taking X as the unknown, the system has more unknowns than equations, so it has either infinitely many solutions (when the rank of the augmented matrix [A, Y] equals the rank of A) or no solution (when the rank of the augmented matrix [A, Y] is greater than the rank of A). We are of course interested in the case with infinitely many solutions. Which of these solutions do we want? The literature tells us to look for the sparsest one, i.e. the solution with the fewest non-zero entries, which requires solving the optimization problem: min J(X) s.t. Y = AX (4)
Or, in unconstrained (penalized) form: argmin ||Y − AX||₂² + λ J(X) (5)
The first term is the data-fitting term and the second is the sparsity constraint. J(X) is typically taken as the l0 norm of X, which counts the number of non-zero elements in X. However, the l0 norm is non-convex, so a globally optimal solution cannot be guaranteed. In many practical applications we do not need the global optimum. For example, in image denoising the l0-norm model works well: although the resulting X is not necessarily globally unique or optimal, it still achieves state-of-the-art denoising. Scholars represented by Michael Elad in Israel model with the l0 norm, e.g. K-SVD [1] and the double sparsity model [2]. Alternatively, one can relax to the l1 norm. Although the l1 norm cannot guarantee a globally unique optimum either, since it is not strictly convex, it does yield sparse solutions and is generally sufficient for practical applications. For example, scholars represented by Julien Mairal, Francis Bach and Jean Ponce in France [3] model with the l1 norm. These two lines of work appear to lead the research on sparse representation, and many people build extensions on top of them.
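To make problem (5) concrete, here is a minimal sketch of iterative soft thresholding (ISTA), one common way to attack the l1 case J(X) = ||X||₁. The dictionary, penalty weight, iteration count and toy signal below are illustrative assumptions, not settings from the papers cited above.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, n_iter=200):
    """Minimize ||y - A x||_2^2 + lam * ||x||_1 by iterative soft thresholding."""
    L = 2 * np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2 * A.T @ (A @ x - y)         # gradient of the data-fitting term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Toy example with a random overcomplete dictionary (illustrative only).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[[3, 17, 42]] = [1.5, -2.0, 0.8]       # a 3-sparse code
y = A @ x_true
x_hat = ista(A, y, lam=0.05)
print("non-zero entries found:", np.flatnonzero(np.abs(x_hat) > 1e-3))
```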
From the sparse representation and compressed sensing models above, we can see that their core problems are the same: given the compressed measurement Y (or the original signal Y) and a pre-defined sensing matrix A (or dictionary A), an l0- or l1-norm model (or a fusion of the two, possibly even with an l2 norm [3]) is used to recover the original sparse signal X (or the sparse representation X). In compressed sensing, however, the sensing matrix A is generally defined in advance: it can be a Gaussian random matrix, or a sparse matrix containing only 0s and 1s (a binary sparse matrix). For example, when Zhilin Zhang applies the BSBL-BO algorithm to fetal ECG signals he uses the latter, with good results. In the sparse representation model, by contrast, the most popular practice is to obtain the dictionary by learning. Such a dictionary adapts to the data better than a pre-defined one, and this idea could well be borrowed by compressed sensing.
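As a rough illustration of these two choices of pre-defined sensing matrix, the sketch below builds a Gaussian random matrix and a simple binary sparse matrix and uses each to compress the same sparse signal. The sizes and the number of ones per column are arbitrary assumptions, not the settings used in the BSBL-BO experiments.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 256, 64                       # signal length and number of measurements (illustrative)

# Gaussian random sensing matrix (entries i.i.d., scaled by 1/sqrt(M)).
Phi_gauss = rng.standard_normal((M, N)) / np.sqrt(M)

# Binary sparse sensing matrix: a few ones at random rows of each column (assumed layout).
ones_per_col = 4
Phi_binary = np.zeros((M, N))
for j in range(N):
    rows = rng.choice(M, size=ones_per_col, replace=False)
    Phi_binary[rows, j] = 1.0

# A K-sparse test signal.
x = np.zeros(N)
x[rng.choice(N, size=8, replace=False)] = rng.standard_normal(8)

y_gauss = Phi_gauss @ x              # compressed measurements, length M << N
y_binary = Phi_binary @ x
print(y_gauss.shape, y_binary.shape)
```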
From the two papers [4][5] we can see that taking the intra-block correlation of the original signal into account allows a higher compression ratio, and the recovered signal is more faithful and more robust; this has been verified on fetal ECG and speech signals. In the sparse representation model, some researchers consider the structure of the sparse coefficient vector X, such as the intra-block and inter-block correlation presented by Zhilin Zhang, while many others consider the structure of the dictionary A. For example, in [2] the dictionary is modeled as A = WD, where D is a sparse matrix with a specified number of non-zero elements per column that is learned, and W is a preset matrix. In this way both D and X are sparse and can be solved for in a unified manner with the algorithm in [2]. Paper [3] describes how to learn a dictionary tailored to a specific task, which appears to be a mainstream topic in sparse representation research.
In the BSBL algorithm proposed by Dr. Zhilin Zhang and its extensions, the original signal X has a block structure, and the sensing matrix A has fewer rows than columns, so the original signal is mapped into a lower-dimensional space, which achieves the compressive-sampling effect; under certain conditions the sampling rate can be far below the Nyquist rate. State-of-the-art recovery is achieved by modeling the correlation of the elements within each block of the original signal X, i.e. within a block the elements are generally not independent. For example, in an image a pixel is usually strongly related to its neighbors, which is also the assumption underlying MRF models. This correlation is handled with a Bayesian model: the correlation (covariance) matrix and mean of each block are estimated in order to recover the original signal. The papers analyze both the case where the block partition is known and the case where it is unknown.
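To give a feel for what a block-sparse signal with intra-block correlation looks like, here is a small sketch that generates one and compresses it with a random matrix. The AR(1)-style within-block covariance, block sizes and measurement count are assumptions made purely for illustration; they are not the generative model or settings used in the BSBL papers.

```python
import numpy as np

rng = np.random.default_rng(2)
N, block_len, active_blocks = 120, 10, 3        # illustrative sizes
n_blocks = N // block_len

# Intra-block correlation: AR(1)-shaped covariance within each active block (assumed model).
r = 0.9
idx = np.arange(block_len)
cov = r ** np.abs(idx[:, None] - idx[None, :])

x = np.zeros(N)
for b in rng.choice(n_blocks, size=active_blocks, replace=False):
    x[b * block_len:(b + 1) * block_len] = rng.multivariate_normal(np.zeros(block_len), cov)

Phi = rng.standard_normal((40, N)) / np.sqrt(40)   # M = 40 random measurements
y = Phi @ x                                        # compressed measurement of a block-sparse signal
print("measurements:", y.shape, "active blocks:", active_blocks)
```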
[1] K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation.
[2] Double Sparsity: Learning Sparse Dictionaries for Sparse Signal Approximation.
[3] Task-Driven Dictionary Learning.
[4] Extension of SBL Algorithms for the Recovery of Block Sparse Signals with Intra-Block Correlation.
[5] From Sparsity to Structured Sparsity: A Bayesian Perspective (in Chinese).
II. Compressed sensing
In short, compressed sensing theory states that as long as a signal is compressible, or sparse in some transform domain, an observation matrix incoherent with the transform basis can be used to project the transformed high-dimensional signal onto a low-dimensional space, and the original signal can then be reconstructed from these few projections with high probability by solving an optimization problem. It can be proved that such projections contain enough information to reconstruct the signal.
Within this theoretical framework the sampling rate no longer depends on the signal bandwidth but largely on two basic principles: sparsity and incoherence, or sparsity and the restricted isometry condition.
Compressed sensing theory consists of three parts:
(1) sparse representation of the signal;
(2) design of the measurement matrix, which must minimize the information loss of the original signal X while reducing its dimension;
(3) design of a signal recovery algorithm that uses the M observations to recover the original signal of length N without distortion.
Theoretical Basis:
(1) the signal X of length N is K-sparse on some orthogonal basis (i.e. it has only K non-zero coefficients);
(2) we can find an observation basis Φ that is incoherent with that sparse basis;
(3) observing the original signal with Φ yields an M-dimensional measurement vector Y, with K < M < N;
(4) then X can be recovered from the observations Y with high probability by optimization.
Mathematical expression:
Let X be a one-dimensional signal of length N with sparsity K (it contains K non-zero values), and let Φ be an M×N matrix (M < N); then Y = ΦX is a one-dimensional measurement vector of length M. The compressed sensing problem is: given the measurements Y and the measurement matrix Φ, recover the original signal X by solving the underdetermined system Y = ΦX. Each row of Φ can be regarded as a sensor; multiplied with the signal, it captures part of the signal's information. This partial information is sufficient to represent the original signal, and an algorithm can be found that recovers the original signal from it with high probability.
A general natural signal X is not itself sparse and needs to be represented on some sparse basis: X = ΨS, where Ψ is the sparse basis matrix and S is the sparse coefficient vector (only K of its entries are non-zero, K < N).
The compressed sensing equation is then Y = ΦX = ΦΨS = ΘS.
From Y and the measurement matrix we obtain an approximation S′ of the sparse coefficients, and the original signal is then recovered as X′ = ΨS′.
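As a quick numerical sanity check of Y = ΦX = ΦΨS = ΘS, the sketch below builds a signal that is sparse in the DCT domain, measures it with a Gaussian Φ, and forms Θ = ΦΨ. The sizes and the choice of DCT basis are illustrative assumptions.

```python
import numpy as np
from scipy.fft import idct

rng = np.random.default_rng(3)
N, M, K = 128, 32, 5                             # illustrative sizes

# Sparse basis Psi: columns are inverse-DCT basis vectors, so X = Psi @ S.
Psi = idct(np.eye(N), norm="ortho", axis=0)

S = np.zeros(N)
S[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)   # K-sparse coefficients
X = Psi @ S                                      # dense in time, sparse in the DCT domain

Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # measurement matrix
Y = Phi @ X                                      # compressed measurements
Theta = Phi @ Psi                                # effective matrix acting on the sparse coefficients

print(np.allclose(Y, Theta @ S))                 # True: Y = Phi X = Phi Psi S = Theta S
```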
1. Sparse representation of signals
The sparsity of a signal simply means that the number of non-zero elements in the signal is small, or that most of its coefficients are zero (or have small absolute value).
Real signals in nature are generally not absolutely sparse, but only approximately sparse in some transform domain, i.e. they are compressible. In theory, any signal can be compressively sampled as long as a corresponding sparse representation space is found. The sparsity (compressibility) of the signal is an important prerequisite and theoretical foundation of compressed sensing.
The meaning of sparse representation: only when the signal is K-sparse (and K < M < N) is it possible to reconstruct the signal of original length N from its K largest coefficients using M observations. In other words, when a signal has a sparse expansion, the small coefficients can be discarded without much distortion.
A signal X of length N can be represented as a linear combination of a set of basis vectors Ψ = [ψ1, ..., ψN]:
X = ΨS, where Ψ is the N×N sparse basis matrix and S is the sparse coefficient vector (an N-dimensional vector). When the signal X has only K < N coefficients that are non-zero (or far greater than zero) on some basis Ψ, we call Ψ the sparse basis of the signal X. What we need to do is choose the sparse basis wisely so that the number of sparse coefficients of the signal is as small as possible.
If the signal X of length N has only K coefficients that are non-zero (or obviously larger than the others) on the basis Ψ, with K < N, we can regard X as sparse in the Ψ domain and call it K-sparse (this is not a strict definition). In that domain, keeping only the K largest coefficients and discarding the rest reduces the space needed to store the signal and achieves (lossy) compression. The original signal X can then be reconstructed from these K coefficients, although in general only an approximation of X is obtained.
We are all familiar with the difference between JPEG and JPEG2000: the core transform of JPEG is the DCT, while that of JPEG2000 is the DWT. In essence, both transform the signal from one domain to another (rotating the coordinate system and projecting the signal onto different bases) to obtain a sparse representation of the signal, i.e. to express the signal with the fewest coefficients; the DWT representation is sparser than the DCT one. Different signals are sparsest on different bases. For example, for one-dimensional signals a wavelet basis may be the sparsest, while for images curvelets and contourlets may be better, and for some signals a combination of several bases may be optimal. Sparse decomposition is about finding the sparsest, most efficient expression of a signal.
The sparsity of a signal under some representation is the theoretical basis for applying compressed sensing. Typical sparsifying transforms include the discrete cosine transform (DCT), the Fourier transform (FFT) and the discrete wavelet transform (DWT).
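As a small illustration of transform-domain sparsity, the sketch below takes a smooth test signal, computes its DCT, keeps only the K largest coefficients and reconstructs an approximation. The test signal and the value of K are arbitrary assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

N, K = 256, 12                                   # illustrative length and number of kept coefficients
t = np.linspace(0, 1, N)
x = np.sin(2 * np.pi * 3 * t) + 0.5 * np.cos(2 * np.pi * 7 * t)   # smooth test signal

c = dct(x, norm="ortho")                         # DCT coefficients: most energy in a few entries
c_k = np.zeros_like(c)
top = np.argsort(np.abs(c))[-K:]                 # indices of the K largest coefficients
c_k[top] = c[top]

x_approx = idct(c_k, norm="ortho")               # lossy reconstruction from K coefficients
err = np.linalg.norm(x - x_approx) / np.linalg.norm(x)
print(f"relative error with {K}/{N} coefficients: {err:.3e}")
```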
In recent years, another hot topic in sparse representation research has been the sparse decomposition of signals over redundant dictionaries. This is a new framework for signal representation: the basis functions are replaced by an overcomplete, redundant library of functions called a redundant dictionary, whose elements are called atoms. Research on sparse representation over redundant dictionaries currently focuses on two issues: first, how to construct a redundant dictionary suited to a given class of signals, and second, how to design fast and effective sparse decomposition algorithms. Common sparse decomposition algorithms fall into two families: matching pursuit and basis pursuit.
2. Signal observation matrix
The observation matrix (also called the measurement matrix) is an M×N matrix (M < N) that maps the N-dimensional original signal to an M-dimensional observation vector y, from which X can then be reconstructed with high probability by optimization. In other words, the original signal X is projected onto this observation matrix (observation basis) to obtain the new representation y.
The goal of observation matrix design is to ensure that from the M sampled observations one can still reconstruct the signal X of length N, or equivalently its sparse coefficient vector.
To guarantee that the signal can be accurately reconstructed from the observations, the product of the observation matrix and the sparse basis must satisfy the RIP (restricted isometry property). This ensures that the observation process does not map two different K-sparse signals to the same measurement vector (i.e. the mapping between the original space and the measurement space is one-to-one), which requires that the submatrices formed by extracting small sets of columns (up to 2K of them) from this product remain full rank.
In the CS measurement model, the sparse signal X is not measured directly; instead, it is projected onto a measurement matrix Φ to obtain the measurements Y. That is, an M×N (M < N) measurement matrix, incoherent with the transform basis, projects the signal X linearly to obtain the linear measurements Y: Y = ΦX.
The measurement Y is an M-dimensional vector, so the measurement reduces the object from N dimensions to M dimensions. The design of the measurement matrix must ensure that in going from X to Y, the M measurements do not destroy the information in the original signal, so that accurate reconstruction remains possible.
Since the signal X is sparse in the basis Ψ (X = ΨS), the formula above can be rewritten as:
Y = ΦX = ΦΨS = ΘS
Here Θ = ΦΨ is an M×N matrix. In the equation above, the number of equations is much smaller than the number of unknowns, so the system has no unique solution in general and the signal cannot be reconstructed directly. However, because the signal is K-sparse, if Θ satisfies the restricted isometry property (RIP), the K coefficients can be accurately reconstructed from the M measurements (an optimal solution can be obtained). A closely related condition for the RIP is that the measurement matrix be incoherent with the sparse basis.
If the sparse basis and the observation basis are incoherent, the RIP holds to a large extent. Candès and Tao proved that an i.i.d. Gaussian random measurement matrix can serve as a universal measurement matrix for compressed sensing, so a random Gaussian matrix is usually used as the observation matrix. Other commonly used measurement matrices include random Bernoulli matrices, partial orthogonal matrices, Toeplitz and circulant matrices, and sparse random matrices.
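A simple way to get a feel for incoherence is to compute the mutual coherence between an observation basis and a sparse basis. The sketch below compares the spike (identity) basis against the DCT basis, and the DCT basis against itself; the size and the choice of DCT are illustrative assumptions.

```python
import numpy as np
from scipy.fft import dct

def mutual_coherence(Phi, Psi):
    """mu(Phi, Psi) = sqrt(N) * max |<phi_k, psi_j>| for two orthonormal N x N bases."""
    N = Phi.shape[0]
    return np.sqrt(N) * np.max(np.abs(Phi.T @ Psi))

N = 64                                           # illustrative size
Psi = dct(np.eye(N), norm="ortho", axis=0)       # DCT basis (orthonormal columns)
Spike = np.eye(N)                                # spike (identity) observation basis

print("spike vs DCT:", mutual_coherence(Spike, Psi))   # about sqrt(2): highly incoherent
print("DCT   vs DCT:", mutual_coherence(Psi, Psi))     # sqrt(N): maximally coherent
```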
3. Signal Reconstruction Algorithm
When the matrix Θ satisfies the RIP criterion, compressed sensing theory says we can first solve for the sparse coefficients S by inverting the equation above, and then recover the K-sparse signal X from its M-dimensional projection Y. The most direct decoding method is to solve the l0-norm optimization problem (the l0 norm being the number of non-zero elements in the coefficient vector): min ||S||₀ s.t. Y = ΘS,
so as to obtain the estimate S′ of the sparse coefficients, from which the original signal is X′ = ΨS′. However, this l0 problem is NP-hard (it cannot be solved in polynomial time, and the quality of a candidate solution cannot even be verified efficiently). Under certain conditions, l1-norm minimization is equivalent to l0-norm minimization and yields the same solution, so the problem above is converted into the minimum-l1-norm optimization problem: min ||S||₁ s.t. Y = ΘS.
l1-norm minimization uses the l1 norm to approximate the l0 norm. The exponent 1 (rather than 1/2, 2/3, or other values) is used because l1 minimization is a convex optimization problem whose solution can be transformed into a linear program. The minimum-l1-norm problem is also known as basis pursuit (BP); common implementations are the interior-point method and gradient projection. The interior-point method is slow but very accurate, while gradient projection is fast but less accurate than the interior-point method.
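Since basis pursuit can be cast as a linear program (split S into non-negative parts, S = u − v, and minimize the sum of u + v subject to Θ(u − v) = Y), here is a rough sketch using scipy.optimize.linprog. The problem sizes and random data are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Theta, y):
    """Solve min ||s||_1 s.t. Theta s = y as an LP with s = u - v, u, v >= 0."""
    M, N = Theta.shape
    c = np.ones(2 * N)                           # objective: sum(u) + sum(v) = ||s||_1
    A_eq = np.hstack([Theta, -Theta])            # Theta u - Theta v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    u, v = res.x[:N], res.x[N:]
    return u - v

# Toy recovery of a sparse vector from a few random measurements (illustrative sizes).
rng = np.random.default_rng(4)
N, M, K = 60, 25, 4
Theta = rng.standard_normal((M, N)) / np.sqrt(M)
s_true = np.zeros(N)
s_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
y = Theta @ s_true

s_hat = basis_pursuit(Theta, y)
print("max reconstruction error:", np.max(np.abs(s_hat - s_true)))
```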
At present, compressed sensing reconstruction algorithms fall mainly into two categories:
(1) Greedy algorithms, which approximate the signal vector by selecting suitable atoms one by one in a series of gradually refined steps. These include matching pursuit, orthogonal matching pursuit, and subspace/complementary-space pursuit variants; a minimal orthogonal matching pursuit sketch follows the comparison below.
(2) Convex optimization algorithms, which relax the l0 norm to the l1 norm and solve the result by linear programming. These include the gradient projection method, basis pursuit, and least angle regression (LARS).
Convex optimization algorithms are more accurate than greedy algorithms but have higher computational complexity.
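For concreteness, here is a minimal sketch of orthogonal matching pursuit, the most common greedy reconstruction algorithm. The stopping rule (a fixed number of iterations equal to an assumed known sparsity) and the toy problem are assumptions for illustration.

```python
import numpy as np

def omp(Theta, y, K):
    """Orthogonal matching pursuit: greedily pick K atoms, least-squares fit on them."""
    residual = y.copy()
    support = []
    for _ in range(K):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(Theta.T @ residual)))
        support.append(j)
        # Least-squares solve on the selected atoms, then update the residual.
        coef, *_ = np.linalg.lstsq(Theta[:, support], y, rcond=None)
        residual = y - Theta[:, support] @ coef
    s = np.zeros(Theta.shape[1])
    s[support] = coef
    return s

# Toy example (illustrative sizes).
rng = np.random.default_rng(5)
N, M, K = 60, 25, 4
Theta = rng.standard_normal((M, N)) / np.sqrt(M)
s_true = np.zeros(N)
s_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
s_hat = omp(Theta, Theta @ s_true, K)
print("support recovered:",
      sorted(np.flatnonzero(s_true)) == sorted(np.flatnonzero(np.abs(s_hat) > 1e-8)))
```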
Mathematically, CS amounts to solving an underdetermined (ill-posed) system of equations under certain conditions: X must be sparse and the measurement matrix must satisfy the RIP. Under these conditions the underdetermined system has a unique solution with high probability.
Reference from: http://blog.csdn.net/caikehe/article/details/8510971
http://blog.csdn.net/zouxy09/article/details/8118329
Comprehensive understanding of compressed sensing