Compressed Sensing: Two Popular-Science Articles

Source: Internet
Author: User

Over the past few days, thanks to happyharry's hard work, many people have become interested in sparse representation and compressed sensing. I am not very familiar with this frontier myself; I have only lightly touched up the two popular-science articles from the Science Squirrels. Both are translations, so many of you may already have read the originals.

The first article was written by Terence Tao.

This is a popular-science piece that the mathematician Terence Tao wrote on his own blog, discussing one of the hottest topics in applied mathematics in recent years: compressed sensing. The core idea of compressed sensing is to reduce, in principle, the cost of measuring a signal. For example, if a signal contains one thousand data points, traditional signal-processing theory requires at least one thousand measurements to recover it completely; this is like needing one thousand equations to solve exactly for one thousand unknowns. The idea of compressed sensing, however, is that if the signal has certain special properties (such as sparse coefficients in the wavelet domain, as described in this article), then it can be recovered completely from only about three hundred measurements (which amounts to recovering one thousand unknowns from only three hundred equations). One can imagine that this contains a great deal of important mathematics as well as broad application prospects, so it has attracted enormous attention and developed very vigorously over the last three or four years. Terence Tao is one of the founders of the field (see also the article "Terence Tao: a child prodigy who grew up"), so the authority of this piece is beyond doubt. It is also a rare example of a popular article written directly by a top mathematician about his own cutting-edge work. Note that although the article is aimed at non-mathematical readers, it is not easy going; readers with some background in science or engineering may find it easier to follow.
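
For readers who prefer symbols, the setup described above can be written as a small underdetermined linear system. This is only a minimal sketch of the standard formulation; the symbols y, A, x and the sizes 300 and 1000 are illustrative and do not appear in the original article:

```latex
% Measurement model: far fewer equations (measurements) than unknowns.
y = A x, \qquad y \in \mathbb{R}^{300}, \quad A \in \mathbb{R}^{300 \times 1000}, \quad x \in \mathbb{R}^{1000}.

% Without extra assumptions the system cannot be solved uniquely.
% Compressed sensing assumes x is sparse (few nonzero entries) and recovers it by
\hat{x} = \arg\min_{z} \|z\|_1 \quad \text{subject to} \quad A z = y .
```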

[Author: Terence Tao; translator: Shanzhai Mangliu, more of whose translations are available elsewhere; proofreader: Muyao]

Recently, many people have asked me what "compressed sensing" means (especially now that the concept has become well known), and how a single-pixel camera works (and how, in some cases, it can outperform a traditional camera). There is a large literature on this topic, but for such a relatively new field there is no good non-technical introduction yet. So I will try to provide one here, for the benefit of non-mathematical readers.

Specifically, I will focus on the camera application, even though compressed sensing is used as a measurement technique in far more fields than imaging (for example astronomy, nuclear magnetic resonance, statistics, and so on); I will say a little about those other fields at the end of the post.

The purpose of a camera is to record images. To simplify the discussion, let us say an image is a rectangular array, for example an array of 1024x2048 pixels (two million pixels in total). To set aside the issue of color (which is a more involved matter), we assume we only need a black-and-white image; then each pixel is measured by a grayscale integer (for example, an 8-bit integer covers 0 to 255, and a 16-bit integer covers 0 to 65535).

Next, in the simplest terms, a traditional camera measures the brightness of every pixel (two million measurements in the example above), and the resulting image file is fairly large (2 MB at 8 bits of grayscale per pixel, 4 MB at 16 bits). Mathematically, this file is described by a vector of very high dimension (about two million dimensions in this example).

Before I start the new story of "compressed sensing", I must first quickly review the older story of "plain old compression". (Readers who already know how image compression algorithms work can skip this section.)

The above images take up a lot of the camera's storage space (and, once uploaded to a computer, disk space as well), and they also waste time when transferred between media. So cameras have the ability to compress images significantly, often to something like a tenth of the original size (say from 2 MB down to a couple of hundred KB). The key point is that although the space of "all images" requires 2 MB of "freedom" or "entropy", the space of "meaningful images" is much smaller, especially if one is willing to tolerate a slight loss of quality. (Indeed, if one used all of those degrees of freedom to generate an image at random, one would be very unlikely to obtain anything meaningful; one would instead get random noise, like the static "snow" on a television screen.)

How does one compress an image? There are many methods, some very sophisticated, but I will try to describe them in a low-tech (and not entirely accurate) way. Images usually have large featureless regions; for example, in a landscape photo nearly half the image may be taken up by a monochrome sky background. Suppose we extract a large square, say 100x100 pixels, that is entirely the same color, say entirely white. Uncompressed, this square takes 10,000 bytes of storage (at 8 bits of grayscale per pixel); but instead we can simply record the dimensions and location of the square and the single color that fills it, which takes only four or five bytes, a considerable saving. In reality the trick does not work quite this well, because even surfaces with no visible detail have slight color variations. So, given a featureless square, we record its average color value and abstract that region of the image into a single monochrome block, leaving only a small residual error. We then continue picking out more visible blocks and abstracting them into monochrome blocks as well. What remains at the end are tiny fluctuations in brightness (color intensity) that are invisible to the naked eye, so we can discard those remaining details. In the end we only need to record the size, position, and brightness of the "visible" blocks; later, the process can be reversed to produce a copy of the image that is slightly lower in quality than the original but occupies far less space.
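
The following is a deliberately crude numpy illustration of the "monochrome block" idea just described, not the algorithm a real camera uses; the block size, threshold, and toy image are all made up for the example:

```python
import numpy as np

def block_compress(image, block=8, keep_threshold=10.0):
    """Toy 'monochrome block' compression: store one mean per block,
    and keep the residual detail only where it is visually significant."""
    h, w = image.shape
    means = np.zeros((h // block, w // block))
    residuals = {}  # (block row, block col) -> residual patch worth keeping
    for i in range(0, h, block):
        for j in range(0, w, block):
            patch = image[i:i + block, j:j + block].astype(float)
            m = patch.mean()
            means[i // block, j // block] = m
            if np.abs(patch - m).max() > keep_threshold:  # visible detail
                residuals[(i // block, j // block)] = patch - m
    return means, residuals

def block_decompress(means, residuals, block=8):
    out = np.kron(means, np.ones((block, block)))  # fill each block with its mean
    for (bi, bj), r in residuals.items():
        out[bi * block:(bi + 1) * block, bj * block:(bj + 1) * block] += r
    return out

# Example: a flat "sky" with one small bright feature compresses to mostly just means.
img = np.full((64, 64), 200.0)
img[30:34, 30:34] = 255.0
means, residuals = block_compress(img)
print(len(residuals), "of", means.size, "blocks needed extra detail")
print("reconstruction exact here:", np.allclose(block_decompress(means, residuals), img))
```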

In fact, the algorithm above does not cope well with sharp color transitions, so it is not very effective in practice. A better approach is to use not uniform blocks but "non-uniform" blocks, for example blocks whose average color intensity on the right half is greater than on the left half. This situation can be described by the (two-dimensional) Haar wavelet system. It was later found that "smoother" wavelet systems avoid certain artifacts, but that is a technical detail we will not go into here. The principle behind all of these systems is the same: the original image is represented as a linear superposition of different "wavelets" (analogous to the color blocks above); the significant (high-intensity) wavelet coefficients are recorded, and the remaining wavelet coefficients are discarded (or suppressed with a threshold). This "hard-thresholding of wavelet coefficients" scheme is not as refined as the compression algorithms actually in use (such as the one defined in the JPEG 2000 standard), but it conveys the general principle of compression.
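
Below is a minimal numerical sketch of "hard thresholding of wavelet coefficients", using a single level of the 2D Haar transform written out by hand; the threshold value and the toy image are made up, and real codecs such as JPEG 2000 use smoother, multi-level wavelets:

```python
import numpy as np

def haar2d(x):
    """One level of the 2D Haar transform: averages/differences along both axes."""
    def step(a):  # pairwise averages and differences along the last axis
        return np.concatenate([(a[..., ::2] + a[..., 1::2]) / 2,
                               (a[..., ::2] - a[..., 1::2]) / 2], axis=-1)
    return step(step(x).swapaxes(0, 1)).swapaxes(0, 1)

def ihaar2d(c):
    def istep(a):
        n = a.shape[-1] // 2
        avg, diff = a[..., :n], a[..., n:]
        out = np.empty_like(a)
        out[..., ::2] = avg + diff
        out[..., 1::2] = avg - diff
        return out
    return istep(istep(c.swapaxes(0, 1)).swapaxes(0, 1))

# Compress: transform, zero out the small coefficients, transform back.
rng = np.random.default_rng(0)
img = np.outer(np.linspace(0, 255, 64), np.ones(64)) + rng.normal(0, 2, (64, 64))
coeffs = haar2d(img)
threshold = 8.0                      # made-up threshold for illustration
kept = np.where(np.abs(coeffs) > threshold, coeffs, 0.0)
approx = ihaar2d(kept)
print("coefficients kept:", np.count_nonzero(kept), "of", kept.size)
print("max pixel error:", np.abs(approx - img).max())
```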

In general (and very much simplified), an original 1024x2048 image may have two million degrees of freedom, and to express it in terms of wavelets one would correspondingly need two million different wavelets for a perfect reconstruction. But a typical meaningful image is sparse, that is, compressible, from the wavelet point of view: perhaps only 100,000 wavelets are needed to capture all the visible detail of the image, while the remaining 1.9 million wavelets contribute only a small amount of "random noise" that most observers essentially cannot see. (This is not always true: images with a lot of texture, such as hair and fur, are particularly hard to compress with wavelet algorithms and remain a major challenge for image compression; but that is another story.)

Now, if we knew in advance which 100,000 of the two million wavelet coefficients are the important ones, we could measure just those 100,000 coefficients and ignore everything else. (One feasible way to measure a coefficient is to apply a suitable "filter" or "mask" to the image and then measure the total color intensity of the filtered result.) But the camera does not know in advance which coefficients matter, so it has to measure all two million pixels, convert the whole image into wavelets, keep the 100,000 dominant ones, and delete the rest. (This is of course only a caricature of real image compression algorithms, but it will do for the purposes of this discussion.)
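
A minimal numpy sketch of "measuring one coefficient with a filter": the measurement is just the pixel-wise correlation (inner product) between the image and a filter image. The particular filter below, a Haar-style left/right pattern, is only an illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
image = rng.random((1024, 2048))           # stand-in for the scene being photographed

# A Haar-style "uneven block" filter: -1 on the left half of a patch, +1 on the right.
filt = np.zeros_like(image)
filt[100:200, 100:150] = -1.0
filt[100:200, 150:200] = +1.0

# One measurement = correlate the image with the filter (sum of pixel-wise products).
coefficient = float(np.sum(image * filt))
print("measured coefficient:", coefficient)
```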

Today's digital cameras are already very good, so why improve on this? In fact the above scheme collects a huge amount of data only to keep a fraction of it, and that is no problem for consumer photography. Data storage has become so cheap that it hardly matters if you take lots of photos without compressing them at all, and the computation required for compression, while it nominally consumes power, is still easy to carry out. But in some non-consumer applications this way of collecting data is not acceptable, especially in sensor networks. If you plan to deploy thousands of sensors to collect data, and those sensors must stay in a fixed location for months at a time, then the sensors need to be as cheap and as energy-efficient as possible, which immediately rules out giving them powerful computing capability (although, and this is also important, we still have a receiver that collects and processes the data, and modern technology gives that receiver luxurious computing power). In such applications, the "dumber" the data-collection method the better (and such systems also need to be robust, for example able to tolerate the loss of 10% of the sensors, or various kinds of noise and data defects).

This is where compressed sensing comes in. The guiding idea is: if only 100,000 coefficients are needed to reconstruct the vast majority of the image, why make all two million measurements? Wouldn't about 100,000 measurements suffice? (In practice one leaves a safety margin, say 300,000 measurements, to cope with everything from interference to quantization noise to failures of the recovery algorithm.) That would be an order-of-magnitude saving in energy, which matters little for consumer photography but brings tangible benefits to sensor networks.

But, as I mentioned earlier, the camera does not know in advance which 100,000 wavelet coefficients need to be recorded. What if the camera picks some other set of 100,000 (or 300,000) measurements and in doing so throws away exactly the useful information in the image? What then?

The solution is simple but not intuitive: make the 300,000 measurements with something other than wavelets, even though I said above that wavelets are the best way to look at and compress images. In fact, the best measurements turn out to be (pseudo-)random ones: generate, say, 300,000 random "filter" images and measure the correlation of the real image with each filter. Now each individual measurement (that is, each correlation) between the image and a random filter is small and random-looking. But, and this is the key, each of the two million possible wavelets that could make up the image produces its own distinctive "signature" under these random filters: it is positively correlated with some of the filters, negatively correlated with others, and uncorrelated with the rest. With overwhelming probability, the two million signatures are all different; what is more, any two linear combinations of 100,000 of them are also different (from the linear-algebra point of view, this is because two randomly chosen 100,000-dimensional subspaces of a 300,000-dimensional space are very likely to intersect only trivially). Therefore it is in principle possible to recover the image (or at least its 100,000 main details) from the 300,000 random measurements. In short, we are talking about a linear-algebra analogue of a hash function.
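
A minimal sketch of the random-measurement step, using a flattened toy image and a Gaussian random measurement matrix; the sizes 300 and 1000 are scaled down from the article's 300,000 and 2,000,000 purely so the example runs quickly:

```python
import numpy as np

rng = np.random.default_rng(2)

n_pixels = 1000            # stand-in for the 2,000,000-pixel image, flattened
n_measurements = 300       # stand-in for the 300,000 random filters

# A sparse "image": only a few significant wavelet-like coefficients are nonzero.
x = np.zeros(n_pixels)
support = rng.choice(n_pixels, size=10, replace=False)
x[support] = rng.normal(0, 1, size=10)

# Each row of Phi is one random "filter"; each measurement is a correlation with it.
Phi = rng.normal(0, 1, size=(n_measurements, n_pixels)) / np.sqrt(n_measurements)
y = Phi @ x                # the 300 numbers the sensor actually records

print("measurements recorded:", y.shape[0], "instead of", n_pixels)
```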

However, this approach still has two technical problems. First, there is the issue of noise: a superposition of 100,000 wavelet coefficients does not represent the image exactly; the other 1.9 million coefficients also contribute a little, and those small contributions can interfere with the signatures of the 100,000 important wavelets, which is the so-called "distortion" problem. The second problem is how to actually reconstruct the image from the 300,000 measurements.

Let us look at the second problem first. If we knew which 100,000 of the two million wavelets are the relevant ones, we could reconstruct the signal with standard linear algebra (Gaussian elimination, least squares, and so on). (This is one of the great advantages of linear encodings: they are much easier to invert than nonlinear ones. Most hash transforms are in fact impossible to invert, which is a major advantage in cryptography but not in signal recovery.) But, as mentioned above, we do not know in advance which wavelets are relevant. How do we find out? A naive least-squares fit gives terrible results involving all two million coefficients, and the resulting image contains a great deal of grainy noise. Alternatively, one could run a brute-force search, doing the linear algebra for every possible set of 100,000 key coefficients, but that takes an extraordinarily long time (there are roughly 10^170,000 combinations to consider!), and in general this kind of brute-force search is NP-complete (some of its special cases are the well-known "subset sum" problem). Fortunately, there are two feasible ways to recover the data (a small code sketch of the idea follows the list below):

• Matching pursuit: find a wavelet whose signature appears to correlate with the data collected; subtract its contribution from the data; repeat until all of the collected data can be "explained" as a combination of wavelet signatures.

• Basis pursuit (also known as L1 minimization): among all combinations of wavelets that match the recorded data, find the "sparsest" one, that is, the one whose coefficients have the smallest possible sum of absolute values. (This kind of minimization tends to force most of the coefficients to vanish.) The minimization can be computed in a reasonable amount of time with convex programming algorithms such as the simplex method.
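
Here is a minimal, self-contained sketch of basis pursuit in the setting above, posing the L1 minimization as a linear program solved with scipy.optimize.linprog (an off-the-shelf solver; the article mentions the simplex method but does not prescribe any particular library, and the problem sizes, random seed, and sparsity level here are purely illustrative):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
n, m, k = 400, 120, 8            # unknowns, random measurements, nonzero coefficients

# Sparse ground-truth coefficient vector and random measurement matrix.
x_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_true[support] = rng.normal(0, 1, size=k)
Phi = rng.normal(0, 1, size=(m, n)) / np.sqrt(m)
y = Phi @ x_true                 # the data the sensor actually records

# Basis pursuit: minimize sum(t) subject to -t <= x <= t and Phi x = y,
# with decision variables [x; t]; sum(t) equals the L1 norm of x at the optimum.
c = np.concatenate([np.zeros(n), np.ones(n)])
A_eq = np.hstack([Phi, np.zeros((m, n))])
A_ub = np.vstack([np.hstack([ np.eye(n), -np.eye(n)]),    #  x - t <= 0
                  np.hstack([-np.eye(n), -np.eye(n)])])   # -x - t <= 0
res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n), A_eq=A_eq, b_eq=y,
              bounds=[(None, None)] * n + [(0, None)] * n, method="highs")
x_hat = res.x[:n]

print("support recovered:", set(np.flatnonzero(np.abs(x_hat) > 1e-4)) == set(support))
print("max coefficient error:", np.abs(x_hat - x_true).max())
```

A matching pursuit variant would instead greedily pick, at each step, the signature most correlated with the current residual and subtract its contribution; it is cheaper, but, as noted above, basis pursuit tends to cope better with noise.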

Note that such image-recovery algorithms still require a fair amount of computing power (though nothing outrageous), but in applications such as sensor networks this is not a problem, because the reconstruction is performed at the receiving end (which can be hooked up to a powerful computer), not at the sensor end (which cannot).

There are now rigorous results showing that, for various compression ratios or sparsity levels of the original image, these two algorithms reconstruct the image perfectly or near-perfectly with very high probability. The matching pursuit method is usually faster, while basis pursuit tends to be more accurate once noise is taken into account. Working out the exact range of applicability of these algorithms is still a very active research area today. (And, sorry, there is no application to P versus NP here: if a particular reconstruction problem (with a given measurement matrix) happens to be NP-complete, it simply cannot be solved by the algorithms above.)

Since compressed sensing is still a fairly new field (especially as far as rigorous mathematical results go), it may be a little early to expect the technology to appear in practical sensors. But there are already proof-of-concept prototypes, the most famous being the single-pixel camera developed at Rice University.

It should also be mentioned that compressed sensing is an abstract mathematical idea rather than one specific recipe, so it can be applied to many fields other than imaging. A few examples:

• Magnetic resonance imaging (MRI). In medicine, magnetic resonance works by making many (but still a limited number of) measurements (basically a discrete Radon transform, also called the X-ray transform, of the body image), and then processing the data to produce an image (here, a map of the distribution of water density inside the body). Because the number of measurements must be large, the procedure takes too long for the patient. Compressed sensing can significantly reduce the number of measurements and speed up imaging (perhaps even to real time, that is, MRI video rather than still images). Alternatively, one can trade the number of measurements against image quality, obtaining much better resolution with the same number of measurements as before.

• Astronomy. Many astronomical phenomena (such as pulsars) oscillate at characteristic frequencies, which makes them highly sparse, and hence compressible, in the frequency domain. Compressed sensing lets us measure these phenomena in the time domain (that is, record telescope data) and reconstruct the original signal accurately even when the raw data are incomplete or badly corrupted (perhaps because of bad weather, insufficient observing time, or because the rotation of the earth prevents us from collecting data continuously).

• Linear coding. Compressed sensing gives a simple way for several senders to combine and transmit their signals with error correction, so that even if a large fraction of the output signal is lost or corrupted, the original signal can still be recovered. For example, 1000 bits of information can be encoded with an arbitrary linear code into a 3000-bit stream in such a way that, even if 300 of those bits are maliciously corrupted, the original information can be recovered completely and without loss. This works because compressed sensing can treat the corruption itself as a sparse signal (it is concentrated on only 300 of the 3000 bits; a rough formulation is sketched after this list).
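
As a rough sketch of why this works (in notation not used in the article itself; the encoding matrix B and the error vector e are introduced here purely for illustration): the receiver looks for the message whose residual is sparsest, which in practice is done by L1 minimization, the same principle as basis pursuit.

```latex
% Encode x (1000 bits) as y = Bx (3000 symbols); the channel adds a sparse error e.
y = B x + e, \qquad B \in \mathbb{R}^{3000 \times 1000}, \qquad \|e\|_0 \le 300 .

% Decoding: among all candidate messages z, pick the one with the sparsest residual,
% in practice by minimizing its L1 norm.
\hat{x} = \arg\min_{z} \|y - B z\|_1 .
```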

Many of these applications are still at the theoretical stage, but the potential of an algorithm that can affect so many areas of measurement and signal processing is truly exciting. What I find most satisfying is seeing my own work in pure mathematics (for example, estimating determinants or singular values of Fourier minors) ultimately benefit the real world.

The second article

Compressed sensing is a hot research frontier of recent years, and it has drawn a great deal of attention in a number of applied fields. The Squirrels have already translated two articles on the subject: one by Terence Tao, one of the founders of compressed sensing, and the other by Jordan Ellenberg, a mathematician at the University of Wisconsin. Both articles are popular expositions, but because their authors are professional researchers the text is still somewhat dense, so I am taking the liberty of adding a perhaps superfluous guide here, in the hope of helping more readers appreciate the theoretical and practical significance of this novel research field.

From its name, compressed sensing sounds as if it is about data compression, but it actually arises from entirely different considerations. Classical data compression, whether audio compression (such as MP3), image compression (such as JPEG), video compression (MPEG), or general-purpose encoding (ZIP), works from the characteristics of the data itself, finding and removing the redundancy hidden in the data in order to achieve compression. Such compression has two features: first, it happens after the data has been fully collected; second, it requires complicated algorithms to carry out. By contrast, the decoding process is usually comparatively simple computationally; in the case of audio, for example, the computation needed to compress an MP3 file is far greater than that needed to play back (decompress) one.

A little thought shows that this asymmetry between compression and decompression is exactly the opposite of what is needed. In most cases, the devices that collect and process the data are cheap, power-saving, low-computing-power portable devices such as point-and-shoot cameras, voice recorders, or remote monitors, whereas the processing (decompression) of the information is usually carried out on large computers, which have far more computing power and usually no requirements of portability or low power. In other words, we are asking the cheap, energy-saving devices to do the complicated computational work, while the big, efficient machines handle the comparatively simple computations. In some situations this contradiction is even sharper: in field work or military applications, for example, the data-collecting devices are exposed to the natural environment and may lose power, or even part of their functionality, at any time. In such circumstances the traditional collect-compress-transmit-decompress model essentially breaks down.

The concept of compressed sensing was born precisely to resolve this conflict. Since the collected data will have its redundancy compressed away anyway, and the compression is the hard part, why not directly "sense" the compressed version of the data? The collection task then becomes much lighter, and the trouble of compressing afterwards is saved. This is what "compressed sensing" means: sensing the compressed information directly.

At first sight this seems impossible, because the compressed data is not simply a subset of the raw data. It is not as if a camera sensor has 10 million pixels, throws away 8 million of them, and the remaining 2 million constitute a compressed image; that would merely be a small, incomplete picture, with some of the information lost forever and unrecoverable. If we want to collect only a small amount of data and still be able to "decompress" a large amount of information out of it, we need two things: first, the small amount of data collected must contain global information about the original signal; second, there must be an algorithm that can recover the original information from that small amount of data.

Interestingly, in some specific situations the first requirement is satisfied automatically. The most typical examples are medical imaging, such as X-ray tomography and MRI. Anyone with a passing familiarity with these technologies knows that in both of them the instrument does not collect image pixels directly, but rather data obtained from the image by a global Fourier transform. In other words, every single measured value already contains, to some extent, information about the full image. In that case, discarding part of the collected data does not cause any part of the image information to be lost permanently (it is still contained in the other data). This is exactly what we want.
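
A tiny numpy illustration of this point, with a random and purely illustrative "image": each 2D Fourier coefficient is a weighted sum over all the pixels, so no single measured value corresponds to any one local patch of the image.

```python
import numpy as np

rng = np.random.default_rng(4)
image = rng.random((64, 64))

# One "measurement" in the Fourier domain: a single 2D DFT coefficient.
u, v = 3, 7                                         # arbitrary frequency pair
rows, cols = np.meshgrid(np.arange(64), np.arange(64), indexing="ij")
weights = np.exp(-2j * np.pi * (u * rows + v * cols) / 64)
measurement = np.sum(image * weights)               # weighted sum over *all* pixels

# It agrees with the corresponding entry of the full FFT, and every pixel contributed.
print(np.allclose(measurement, np.fft.fft2(image)[u, v]))
```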

The second requirement is met thanks to the work of Terence Tao and Emmanuel Candès. Their work shows that if the signal (whether an image, a sound, or some other kind of signal) satisfies a suitable "sparsity" condition, then it is indeed possible to recover the original, much larger signal from this small number of measurements. The recovery computation is a fairly involved iterative optimization procedure, the so-called "L1-minimization" algorithm.

Putting these two facts together, we see the advantage of this scheme: when collecting data we can simply collect a fraction of it ("compressed sensing") and hand the complicated part over to the data-reconstruction end, which is exactly what we were hoping for. In medical imaging this is especially valuable, because the data-collection process is often one that causes great inconvenience or even physical harm to the patient. Take X-ray tomography: everyone knows that X-ray radiation damages the patient's body, and "compressed sensing" means we can collect the data with a much lower radiation dose than the classical approach requires. The significance of that in medicine is self-evident.

This idea can be extended to many fields. In a great many practical problems we would like to collect as little data as possible, or we are forced by circumstances to collect incomplete data. If there is a global transformation relating the data to the information we want to reconstruct, and we know in advance that the information satisfies some sparsity condition, then we can always try, in a similar way, to recover a large signal from a small amount of data. Research along these lines is now very widespread.

However, it must be noted that in different application areas these two conditions are not always both satisfied. Sometimes the first condition (that the measured data contains global information about the signal) fails: the most traditional photography problem is an example, since each photosensitive element sees only a small piece of the image rather than anything global, a limitation imposed by the physics of the camera. To get around this, some scientists at Rice University have been trying to develop a new kind of photographic device (nicknamed the "single-pixel camera") that aims to achieve the highest possible resolution with as few photosensitive elements as possible. Sometimes the second condition (that there is a mathematical guarantee that the signal can be recovered from incomplete data) fails; in such cases practice runs ahead of theory: people can already carry out data reconstruction algorithmically in many situations, but the corresponding theoretical analysis remains an open task on the mathematicians' agenda.

In any case, the basic idea embodied by compressed sensing, namely extracting as much information as possible from as little data as possible, is without doubt an idea of great theoretical and practical promise. It extends traditional information theory, but goes beyond traditional compression theory and has become a new sub-branch in its own right. In the mere five years since its birth, its influence has already swept through more than half of the applied sciences.
