Compressed sensing: two popular science articles


Article posted from: http://www.cvchina.info/2010/06/08/compressed-sensing-2/#more-1173

This is a popular science article that the mathematician Terence Tao wrote on his own blog, discussing one of the hottest topics in applied mathematics in recent years: compressed sensing. The core idea of compressed sensing is to reduce, in principle, the cost of measuring a signal. For example, if a signal contains 1000 data points, then according to traditional signal processing theory at least 1000 measurements are needed to recover it completely; this is equivalent to saying that 1000 equations are needed to solve exactly for 1000 unknowns. The idea of compressed sensing, however, is that if the signal has certain properties (such as being sparse in the wavelet domain, as described in the article), then it can be recovered completely from only about 300 measurements, which amounts to solving for 1000 unknowns from only 300 equations. It is easy to imagine that this contains deep mathematical theory as well as broad application prospects, so over the last three or four years it has attracted a great deal of attention and developed very vigorously. Tao himself is one of the founders of the field (see the article "Terence Tao: the child prodigy who grew up"), so the authority of this piece goes without saying. It is also a rather rare example of a first-class mathematician writing a popular account of his own cutting-edge work. It should be said that although the article was written for readers without a mathematics background, it is not effortless to follow; some science or engineering background will probably make it easier to understand.

"the author Terence Tao; the translator cottage jobless, his more translations in this, that; proofing wood away"

A lot of people have asked me recently what "compressed sensing" means (especially now that the term has acquired a certain fame), how the so-called "single-pixel camera" works, and how it can have an advantage over a traditional camera in some situations. There is already a large literature on the subject, but for such a relatively new field there is no good non-technical introduction yet. So I will attempt one here, in the hope that it will be of help to readers outside mathematics.

In particular, I will focus on the camera application, even though compressed sensing is used as a measurement technique in areas far beyond imaging (for example astronomy, MRI, and statistical model selection), which I will touch on briefly at the end of the post.

The purpose of a camera, naturally, is to record images. To simplify the discussion, let us say an image is a rectangular array of pixels, for example 1024 x 2048 pixels (2 million pixels in total). To set aside the issue of color (a minor point), let us assume we only need a black-and-white image, so each pixel's brightness is measured by an integer grayscale value (for example, an 8-bit integer covers 0 to 255, a 16-bit integer covers 0 to 65535).

Next, according to the most simplified account, a traditional camera measures the brightness of every pixel (2 million measured values in our example), and the resulting image file is fairly large (2MB at 8-bit grayscale, 4MB at 16-bit). Mathematically, we can regard this file as a vector in a very high-dimensional space (roughly 2 million dimensions in this case).
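
As a concrete (and purely illustrative) picture of that representation, here is a tiny NumPy sketch with the dimensions from the example above; the random pixel values are of course a stand-in for a real photograph:

import numpy as np

# A hypothetical 1024 x 2048 grayscale image with 8-bit values in 0..255.
image = np.random.randint(0, 256, size=(1024, 2048), dtype=np.uint8)

# Mathematically, the camera's output is just one very long vector.
x = image.astype(np.float64).ravel()      # about 2 million dimensions
print(x.shape)                            # (2097152,)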

Before I start the new story of "compressed sensing", I must first quickly review the older story of ordinary image compression. (Readers who already understand image compression algorithms can skip the next few paragraphs.)

An image like the one above takes up a lot of storage space on the camera (and likewise disk space on a computer) and wastes time when it is transferred between media. So it is natural for cameras to come with a significant image-compression capability (typically from something like 2MB down to a tenth of that, around 200KB). The key point is that although the space of "all images" has 2MB worth of "degrees of freedom" or "entropy", the space of "meaningful images" is actually much smaller, especially if one is willing to give up a little image quality. (Indeed, if a person used all those degrees of freedom to generate an image at random, he would be very unlikely to get anything meaningful; he would almost surely get the equivalent of random noise, like the static snow on a television screen.)

How does one compress an image? There are many methods, some of them quite advanced, but let me try to describe these techniques in a low-tech (and not very precise) way. Images usually contain large regions with little detail; in a landscape shot, for example, nearly half of the frame may be taken up by a monochrome sky. Suppose we pick out a large square, say 100 x 100 pixels, whose color is exactly uniform, say entirely white. Uncompressed, this block takes 10,000 bytes of storage (at 8-bit grayscale); but instead we can record only the dimensions and coordinates of the block plus the single color that fills it, which takes four or five bytes in total, a considerable saving. In reality the compression is not quite that good, because regions that appear detail-free actually contain slight color variations. So, given a block with little detail, we record its average color value; the region is abstracted into a single monochrome block, leaving only a small residual error. One then continues to pick out further blocks of visibly near-uniform color and abstracts them into monochrome blocks as well. What remains at the end are small fluctuations in brightness (color intensity) that are imperceptible to the naked eye. So one can throw away the remaining detail and simply record the size, position, and brightness of the "visible" color blocks. Later, the operation can be reversed to reconstruct a copy of the image that is slightly lower in quality than the original and takes up much less space.
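
Here is a minimal sketch of the block-averaging idea just described (illustrative only, with a made-up block size; real codecs are far more sophisticated): split the image into squares, record one average gray value per square, and keep whatever small residual is left.

import numpy as np

def block_average(image, block=100):
    """Approximate each block x block square by its mean gray value."""
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape
    approx = np.empty_like(image)
    for i in range(0, h, block):
        for j in range(0, w, block):
            tile = image[i:i + block, j:j + block]
            approx[i:i + block, j:j + block] = tile.mean()
    residual = image - approx      # small, mostly imperceptible detail
    return approx, residual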

In practice, the algorithm above does not cope well with sharp changes in color, so it is not very effective in real applications. A better approach is to use not uniform color blocks but "uneven" ones, for example a block whose average color intensity on the right half is greater than on the left half. This kind of object can be described by the (two-dimensional) Haar wavelet system. It was later found that "smoother" wavelet systems are better at avoiding artifacts, but that is a technical detail we will not go into. All of these systems share the same principle, though: the original image is represented as a linear superposition of different "wavelets" (analogous to the color blocks above); the coefficients of the significant (high-intensity) wavelets are recorded, and the remaining wavelet coefficients are discarded (or zeroed out by a threshold). This "hard thresholding of wavelet coefficients" compression algorithm is not as refined as the algorithms actually in use (such as those defined in the JPEG 2000 standard), but it conveys the general principle of compression.
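
The following toy sketch shows one level of the 2-D Haar transform together with the "hard threshold" step; the even image dimensions and the keep-fraction are assumptions of mine, and production codecs such as JPEG 2000 use smoother, multi-level wavelets.

import numpy as np

def haar2d(img):
    """One level of the 2-D Haar transform (averaging and differencing)."""
    img = np.asarray(img, dtype=np.float64)
    a = (img[:, 0::2] + img[:, 1::2]) / 2.0    # horizontal average
    d = (img[:, 0::2] - img[:, 1::2]) / 2.0    # horizontal detail
    ll = (a[0::2, :] + a[1::2, :]) / 2.0       # coarse version of the image
    lh = (a[0::2, :] - a[1::2, :]) / 2.0       # vertical detail
    hl = (d[0::2, :] + d[1::2, :]) / 2.0       # horizontal detail
    hh = (d[0::2, :] - d[1::2, :]) / 2.0       # diagonal detail
    return ll, lh, hl, hh

def hard_threshold(coeffs, keep_fraction=0.05):
    """Zero out all but the largest wavelet coefficients (the 'hard threshold')."""
    flat = np.concatenate([c.ravel() for c in coeffs])
    cutoff = np.quantile(np.abs(flat), 1.0 - keep_fraction)
    return [np.where(np.abs(c) >= cutoff, c, 0.0) for c in coeffs]

# e.g. kept = hard_threshold(haar2d(image), keep_fraction=0.05)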

Roughly speaking (and very much simplified), the original 1024 x 2048 image may have 2 million degrees of freedom, and someone who wants to represent it with wavelets would, in general, need 2 million different wavelet coefficients for a perfect reconstruction. But a typical meaningful image is very sparse from the point of view of wavelet theory, that is, compressible: perhaps only 100,000 wavelets are needed to capture all of the visible detail of the image, while the remaining 1.9 million wavelets contribute only a tiny amount, mostly "random noise" that is invisible to the vast majority of observers. (This is not always the case: images with a lot of texture, such as pictures of hair or fur, are notoriously hard to compress with wavelet algorithms and remain a challenge for image compression. But that is another story.)

Now, if we (or rather the camera) knew in advance which 100,000 of the 2 million wavelet coefficients are the important ones, we could measure just those 100,000 and ignore everything else. (One feasible way to measure a coefficient is to place a suitable "filter" or "mask" over the image and then measure the total color intensity of the light that comes through.) The camera, however, does not know which coefficients matter, so it has to measure all 2 million pixels, expand the whole image in the wavelet basis, find the 100,000 dominant wavelet coefficients that need to be kept, and delete the rest. (This is of course only a caricature of how real image compression works, but it will serve for the discussion.)

Now, digital cameras are of course very capable these days, so why improve them at all? Indeed, the algorithm above collects a large amount of data but only stores a fraction of it, and for consumer photography that is perfectly fine. Especially as data storage gets cheap, it hardly matters if you shoot a pile of completely uncompressed photos. And although computation consumes power (a familiar complaint), the computation required for compression is still undemanding. In some non-consumer applications, however, this approach to data collection is not feasible, sensor networks in particular. If you plan to collect data with thousands of sensors that must sit in fixed locations for months at a time, then you want sensors that are as cheap and as energy-efficient as possible, which immediately rules out those with strong computational power (although computing power does still matter: at the receiving end, processing the data relies on the lavish computation that modern technology provides). In applications like these, the "dumber" the data collection the better (and the system also needs to be robust, for example able to tolerate the loss of 10% of the sensors, or various kinds of noise and data defects).

This is where compressed sensing comes in. The underlying idea is: if only 100,000 coefficients are needed to reconstruct most of the image, why make all 2 million measurements; would 100,000 or so not be enough? (In practice one leaves a safety margin, say 300,000 measurements, to cope with everything from interference and quantization noise to failures of the recovery algorithm.) That saves roughly an order of magnitude in acquisition cost and energy, which matters little for consumer photography but is a tangible benefit for sensor networks.

However, as I said earlier, the camera does not know in advance which 100,000 of the 2 million wavelet coefficients need to be recorded. What if the camera picks a different 100,000 (or 300,000) and throws away exactly the useful information in the picture?

The solution is simple but unintuitive: make the 300,000 measurements in a way that has nothing to do with wavelets, even though I said earlier that wavelets are the best way to inspect and compress images. In fact, the best measurements turn out to be (pseudo)random ones: generate, say, 300,000 random "filter" images and measure how strongly the real image correlates with each filter. The results of these measurements (the "correlations" between the image and the filters) are then very likely to be small, random-looking numbers. But, and this is the point, each of the 2 million possible wavelets that might make up the image produces its own "signature" against these measurements: it correlates positively with some of the filters, negatively with others, and hardly at all with the rest. And, with overwhelming probability, these 2 million signatures are all distinct; what is more, linear combinations of any 100,000 of them are still distinct (from the point of view of linear algebra, this is because two randomly chosen 100,000-dimensional subspaces of a 300,000-dimensional space are overwhelmingly likely to intersect only at the origin). It is therefore possible, in principle, to recover the image from these 300,000 random measurements (or at least its 100,000 most important details). In short, we are talking about a linear-algebra analogue of a hash function.
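
Here is a toy numerical version of the random-measurement idea, with the dimensions scaled way down and a Gaussian random matrix standing in for the random filters (both choices are mine, for illustration only):

import numpy as np

rng = np.random.default_rng(0)

n, k, m = 2000, 100, 300          # signal length, sparsity, number of measurements (scaled down)

# A signal that is sparse in some basis: only k of its n coefficients are nonzero.
x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = rng.normal(size=k)

# Each "measurement" is the correlation of the signal with one random filter.
A = rng.normal(size=(m, n)) / np.sqrt(m)
y = A @ x                          # 300 small, random-looking numbers that still "know" x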

Two technical problems remain with this approach. The first is noise: a superposition of 100,000 wavelet coefficients does not represent the image exactly, since the other 1.9 million coefficients also contribute a little. Those small contributions may interfere with the signatures of the 100,000 significant wavelets, which is the so-called "distortion" problem. The second problem is how to actually reconstruct the image from the 300,000 measurements obtained.

Let us look at the second problem first. If we knew which 100,000 of the 2 million wavelets were the relevant ones, we could use standard linear algebra (Gaussian elimination, least squares, and so on) to reconstruct the signal. (This is one of the great advantages of linear encodings: they are much easier to invert than nonlinear ones. Most hash functions are practically impossible to invert, which is an advantage in cryptography, but not in signal recovery.) But, as said before, we do not know in advance which wavelets are relevant. How do we find out? A naive least-squares fit gives horrible results involving all 2 million coefficients, and the reconstructed image is full of grainy noise. One could instead run a brute-force search, applying linear algebra to every possible choice of 100,000 significant coefficients, but that would take a hopelessly long time (there are roughly 10^170,000 combinations to consider!), and this kind of brute-force search is in general NP-complete (some special cases reduce to problems such as "subset sum"). Fortunately, there are two feasible ways to recover the data:

• Matching pursuit: find a wavelet whose signature appears to correlate with the data collected; remove all traces of that signature from the data; repeat until all of the collected data can be "explained" in terms of wavelet signatures. (A toy sketch of this greedy loop follows the list below.)

• Basis pursuit (also known as L1 minimization): among all combinations of wavelets that are consistent with the recorded data, find the sparsest one, that is, the one for which the sum of the absolute values of the coefficients is as small as possible. (This kind of minimization tends to force most of the coefficients to vanish.) It can be carried out in a reasonable amount of time using convex programming algorithms such as the simplex method.
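
Below is a toy sketch of the greedy matching-pursuit loop from the first bullet; the stopping rule, the iteration count, and the normalization are my own simplifications.

import numpy as np

def matching_pursuit(A, y, n_iter=300, tol=1e-6):
    """Greedy recovery: repeatedly pick the column most correlated with the residual."""
    m, n = A.shape
    x_hat = np.zeros(n)
    residual = y.copy()
    col_norms = np.sum(A * A, axis=0)
    for _ in range(n_iter):
        corr = A.T @ residual              # correlation of every "wavelet" with the data
        j = np.argmax(np.abs(corr))        # the wavelet that explains the most
        step = corr[j] / col_norms[j]
        x_hat[j] += step                   # record its contribution...
        residual -= step * A[:, j]         # ...and remove its trace from the data
        if np.linalg.norm(residual) < tol:
            break
    return x_hat

# With A, y from the earlier random-measurement sketch: x_hat = matching_pursuit(A, y)
# approximates the sparse x; quality improves with more measurements per nonzero entry.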

It should be noted that this type of image-recovery algorithm still requires a fair amount of computing power (though nothing outrageous), but in applications such as sensor networks this is not a problem, because the recovery is done at the receiving end (which has access to powerful computers), not at the sensor end (which does not).

There are now rigorous results showing that, for original images at various compression ratios or sparsity levels, these two algorithms reconstruct the image perfectly or near-perfectly with very high probability. The matching pursuit method is usually faster, while basis pursuit tends to be more accurate in the presence of noise. Mapping out the exact range of applicability of these algorithms is still a very active research area today. (Regrettably, there is no application here to the P versus NP problem: if a reconstruction problem, for a given measurement matrix, happens to be NP-complete, then it simply cannot be solved by the algorithms above.)

Since compressed sensing is still a fairly new field (especially as far as rigorous mathematical results go), it may be a bit early to expect the technology to appear in practical sensors. There are, however, already proof-of-concept prototypes, the most notable being the single-pixel camera developed at Rice University.

Finally, it must be stressed that compressed sensing is an abstract mathematical idea rather than one specific recipe, so it can be applied to many areas other than imaging. A few examples:

• Magnetic resonance imaging (MRI). In medicine, magnetic resonance works by making many (but still a limited number of) measurements (basically a discrete Radon transform, also called an X-ray transform, of the body being imaged), and then processing the data to produce an image (here, the density distribution of water in the body). Because so many measurements are needed, the procedure takes too long for the patient. Compressed sensing can significantly reduce the number of measurements and therefore speed up imaging (perhaps even to the point of real-time imaging, that is, MRI video rather than static images). Alternatively, one can trade the number of measurements for image quality: with the same number of measurements, a much better resolution can be obtained. (A rough numerical sketch of this kind of subsampled measurement follows the list below.)

• Astronomy. Many astronomical phenomena (pulsars, for example) involve oscillations at various frequencies, which makes them highly sparse, that is, compressible, in the frequency domain. Compressed sensing allows us to measure such phenomena in the time domain (that is, to record telescope data) and reconstruct the original signal accurately even when the underlying data are incomplete or heavily disturbed (whether because of bad weather, insufficient telescope time, or because the Earth's rotation prevents us from obtaining a complete time series).

• Linear coding. Compressed sensing gives multiple transmitters a simple way to combine their signals with error-correcting redundancy, so that the original signal can still be recovered even if a substantial part of the output is lost or corrupted. For example, 1000 bits of information can be encoded into a 3000-bit stream using an arbitrary linear code; even if 300 of those bits are (maliciously) corrupted, the original information can be reconstructed perfectly and without loss. The reason is that compressed sensing can treat the corruption itself as a sparse signal (concentrated in only 300 of the 3000 bits).
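
As a rough illustration of the MRI bullet above, the sketch below models the scan as a random subset of 2-D Fourier coefficients; that modeling choice, the image size, and the sampling rate are my own idealizations for the sketch (the bullet itself speaks of a Radon/X-ray transform).

import numpy as np

rng = np.random.default_rng(1)

image = rng.random((128, 128))          # stand-in for the water-density image being scanned
F = np.fft.fft2(image)                  # the full set of "global" Fourier measurements

# Record only a random 30% of the coefficients: fewer measurements, a shorter scan.
mask = rng.random(F.shape) < 0.3
measurements = F[mask]

# Naive reconstruction that just zero-fills the missing coefficients; a compressed-sensing
# reconstruction would instead run an l1-type recovery exploiting sparsity (e.g. in wavelets).
naive = np.real(np.fft.ifft2(np.where(mask, F, 0.0)))

For the linear-coding bullet, one standard way to write the model (the symbols here are my own, not the article's) is: encode the information vector x as y = Gx with a 3000 x 1000 matrix G; corruption adds an error vector e with at most 300 nonzero entries, so the receiver sees y' = Gx + e. Applying any matrix H with HG = 0 gives Hy' = He, which is precisely a compressed-sensing problem for the sparse error e; once e is found, x follows by ordinary linear algebra.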

Many of these applications are still at the theoretical stage, but the prospect of a single algorithmic idea touching so many areas of measurement and signal processing is encouraging. For the author himself, the greatest satisfaction is to see his own work in pure mathematics (for example, estimates of Fourier-type determinants or singular values) ultimately hold out the prospect of benefiting the real world.

===============================================================================

Compressed sensing is one of the hottest research frontiers of recent years and has drawn attention in a number of applied fields. On this topic, the Science Squirrels Club (a Chinese popular-science translation group) has translated two articles: one by Terence Tao, one of the founding researchers of compressed sensing (link), and one by the mathematician Jordan Ellenberg of the University of Wisconsin (the text of this article). Both are meant for a general audience, but because the authors are professional researchers the wording is still somewhat dense, so I have taken the liberty of adding a somewhat superfluous guide to help more readers appreciate the theoretical and practical significance of this young field.

From its name, compressed sensing sounds like it should mean data compression, but it is in fact quite different. Classical data compression, whether audio compression (such as MP3), image compression (such as JPEG), video compression (MPEG), or general-purpose encoding (Zip), works from the characteristics of the data themselves, finding and removing the redundancy implicit in the data. Such compression has two features: first, it happens after the data have been fully acquired; and second, it requires a complex algorithm to carry out. Decompression, by contrast, is usually computationally simpler; in the audio case, compressing an MP3 file takes far more computation than playing (that is, decompressing) it.

A moment's thought shows that this asymmetry between compression and decompression is exactly the opposite of what we need. In most cases, the devices that acquire and process data are cheap, power-saving, low-compute portable devices such as point-and-shoot cameras, voice recorders, or remote monitors, whereas the process of unpacking (that is, decompressing) the information is carried out on large computers, which have far more computing power and usually no requirements of portability or low power. In other words, we are asking the cheap, energy-constrained devices to handle the complex computational task, and the large, capable machines to handle the relatively simple one. In some situations this contradiction becomes even sharper: in field work or military operations, for example, the data-acquisition equipment is exposed to the natural environment and may lose its power supply or part of its functionality at any time; under those conditions the traditional acquire-compress-transmit-decompress pipeline more or less breaks down.

The idea of compressed sensing was created to resolve exactly this contradiction. If, after acquiring the data, we are going to compress away the redundancy anyway, and if that compression is the relatively hard part, why not "acquire" the compressed data directly? That is a much lighter task and removes the hassle of compressing afterwards. This is what "compressed sensing" means: sensing the compressed information directly.

At first sight this looks impossible. The compressed data are not simply a subset of the raw data: it is not as if a camera sensor has 10 million pixels, we throw away 8 million of them, and the remaining 2 million constitute the compressed image; that would merely capture an incomplete patch of the picture, and some information would be lost forever, beyond recovery. If we want to acquire only a small amount of data and still expect to "decompress" a large amount of information out of it, two things have to hold: first, that this small amount of acquired data contains global information about the original signal; and second, that there exists an algorithm capable of restoring the original information from this small amount of data.
Interestingly, in certain settings the first requirement is satisfied automatically. The most typical examples are medical imaging, such as computed tomography (CT) and magnetic resonance imaging (MRI). Anyone with a passing knowledge of these technologies knows that in both of them the instrument does not acquire image pixels directly but rather data obtained from a global Fourier transform of the image. In other words, every individual measurement contains, to some extent, information about the entire image. In that case, discarding part of the acquired data does not cause any piece of the image information to be lost permanently (it is still contained in the other measurements). This is exactly what we want.
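
To make the point that each measurement carries global information concrete, write the image as $f(x, y)$; then a single 2-D Fourier coefficient is (in one common convention, chosen here purely for illustration)

$$\hat{f}(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-2\pi i \left(\frac{ux}{M} + \frac{vy}{N}\right)},$$

a weighted sum over every pixel of the image, so discarding some of the coefficients does not discard any pixel outright.
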
The second requirement is where the work of Tao and Candès comes in. Their work shows that if the signal in question (an image, a sound, or any other kind of signal) satisfies a certain "sparsity" condition, then it is indeed possible to restore the original, much larger signal from this small amount of measured data, and the computation required is a fairly involved iterative optimization procedure known as "l1-minimization".
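
As a minimal sketch of what an l1-minimization ("basis pursuit") recovery can look like in practice, here it is posed as an ordinary linear program via the split x = u - v; the use of SciPy's linprog and the toy dimensions are my own choices, not anything prescribed by the articles.

import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve min ||x||_1 subject to A x = y by splitting x = u - v with u, v >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)                        # objective: sum(u) + sum(v) = ||x||_1
    A_eq = np.hstack([A, -A])                 # equality constraint: A(u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    u, v = res.x[:n], res.x[n:]
    return u - v

# Toy usage: a sparse vector recovered from far fewer random measurements than unknowns.
rng = np.random.default_rng(2)
n, m, k = 200, 60, 8
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_rec = basis_pursuit(A, A @ x_true)          # close to x_true with high probability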

Putting these two facts together, we can see the advantage of this scheme. It means that at acquisition time we can simply collect a portion of the data ("compressed sensing"), and leave the complicated part to the reconstruction end, which is exactly the division of labor we were hoping for. In medical imaging this scheme is particularly valuable, because the data-acquisition process is often a source of great inconvenience or even physical harm to the patient. Taking X-ray tomography as an example, it is well known that X-ray radiation damages the body, and "compressed sensing" means that the data can be acquired with a much smaller radiation dose than the classical approach requires; the medical significance of that is self-evident.

The idea extends to many other areas. In a great many practical problems we would like to acquire as little data as possible, or we are forced by circumstances to acquire incomplete data. If there is some kind of global transformation linking the data to the information we want to reconstruct, and we know in advance that the information satisfies some sparsity condition, then we can always try, in a similar way, to recover more signal from less data. By now this line of research has expanded to a very wide range of fields.
It should also be noted, however, that in different application areas the two conditions above cannot always be met. Sometimes the first condition (that the measured data contain global information about the signal) fails; in the most traditional photography problem, for example, each sensor element sees only a small patch of the image rather than anything global, and this is dictated by the physics of the camera. To address this, researchers at Rice University in the United States have been developing a new photographic device, the "single-pixel camera", which aims to achieve the highest possible image resolution with as few photosensitive elements as possible. Sometimes the second condition fails, that is, there is no mathematical guarantee that the signal can be restored from incomplete data. In such cases practice runs ahead of theory: people have devised many reconstruction procedures that already work well on data, while the corresponding theoretical analysis remains an open problem left for the mathematicians.

Either way, the basic idea that compressed sensing stands for, namely extracting as much information as possible from as little data as possible, is without doubt an idea of great theoretical and practical promise. It is an extension of classical information theory, yet it goes beyond classical compression theory and has become a brand-new sub-discipline in its own right. Barely five years after its birth, its influence has already swept across a large part of the applied sciences.
