The popularity of digital computers has promoted the study of speech science, enabling people to record, store, exchange, and analyze sound signals quickly and at low cost. However, because a digital computer represents and stores all information as discrete numbers, it cannot express all of the mathematical concepts and methods that humans already have, and consequently it cannot fully express all physical concepts and physical measurements either. For a single sound signal, what we want to measure physically is the change of sound pressure over time, which corresponds mathematically to some continuous function of time. A digital computer cannot directly represent such a continuous signal; it can only represent discrete time series (i.e., discrete signals). In fact it cannot even represent all discrete signals, only those whose values are themselves discrete (digital signals). Therefore, when we use a computer to process any kind of physical signal, the first problem is the digitization of the continuous signal (analog-to-digital conversion). In general, the process of going from a continuous signal to a discrete signal is called sampling, and the discretization of the measured values is called quantization. Here I want to clarify the sampling problem.
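The two steps just named (sampling and quantization) can be sketched numerically. Below is a minimal Python illustration; the 440 Hz tone, the 8 kHz rate, and the 8-bit depth are arbitrary choices of mine, not anything prescribed by the theory:

```python
import math

fs = 8_000     # sampling frequency: samples per second
bits = 8       # quantization depth

# sampling: record the (still exact) value of the signal at t = n/fs
samples = [math.sin(2 * math.pi * 440 * n / fs) for n in range(16)]

# quantization: round each exact value to one of 2**bits discrete levels
scale = 2 ** (bits - 1) - 1          # 127 for 8 bits
digital = [round(s * scale) for s in samples]

print(digital[:4])   # -> [0, 43, 81, 109]
```

Note that the two steps are independent: the sampling theorem discussed below concerns only the first, while the rounding in the second introduces a separate, bounded "quantization error".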
Before using the computer, we must understand what effect the sampling process has on the original continuous signal, so that we can be confident in all subsequent processing and analysis. The famous sampling theorem (the Nyquist–Shannon sampling theorem) is the important guiding theorem that helps us establish such confidence. Unfortunately, it is often misunderstood. I will now try to explain it using as little mathematical language as possible.
The theorem involves several concepts: "sampling", "sampling frequency", "bandwidth", and "complete reconstruction". First, "sampling" here means ideal sampling, that is, directly recording the exact value of the signal at a certain point in time. The sampling theorem therefore applies only to the ideal sampling process that takes a continuous signal to a discrete signal; it says nothing about the quantization of the measured values. Second, "sampling frequency" is the number of sample points taken per unit time. This also implies that the sampling discussed here is a periodic operation; non-periodic sampling is outside the theorem's scope. Third, "bandwidth" is a frequency-domain parameter of a signal, and to define it we have to mention the mathematical method of Fourier analysis. A time-varying signal that satisfies certain mathematical conditions (physical signals in reality mostly do), i.e. a time-domain signal, can be transformed into a signal that varies with frequency (a frequency-domain signal); the relationship between the two is fixed by the Fourier transform and its inverse. In fact, the time-domain signal and the frequency-domain signal are complete descriptions of the same physical quantity seen from different angles. When an ordinary time-domain signal is transformed into the frequency domain, its bandwidth is the extent of the frequency range over which the frequency-domain signal is non-zero. In statements of the theorem, "bandwidth" is sometimes loosely replaced by "the highest frequency of the signal", because for a signal with low-pass character the upper cutoff frequency and the bandwidth coincide. Fortunately, this confusion has little impact on speech processing. Fourth, "complete reconstruction" means that, given the exact sample values obtained under the preceding conditions, the value of the original continuous signal at any point in time can be calculated exactly. In fact, the mathematical formula used to "completely reconstruct" the original signal falls out of the mathematical proof of the theorem (the Whittaker–Shannon interpolation formula). It is worth noting that this formula cannot be implemented exactly on a digital computer, if only because the family of functions it uses is infinitely long in the time domain.
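The interpolation formula can still be approximated on a computer by truncating the infinite sum. Here is a sketch in Python; the 1 Hz tone and 4 Hz rate mirror the example used later in this article, and the accuracy only holds at instants far from the ends of the finite record:

```python
import math

def sinc(u):
    """Normalized sinc function sin(pi*u) / (pi*u)."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, fs, t):
    """Truncated Whittaker-Shannon interpolation:
    x(t) ~= sum_n x[n] * sinc(fs*t - n).
    Exact only with infinitely many samples; a finite record gives an
    approximation that degrades near the record's ends."""
    return sum(s * sinc(fs * t - n) for n, s in enumerate(samples))

fs = 4
samples = [math.cos(2 * math.pi * n / fs) for n in range(400)]  # 1 Hz tone
t = 50.1    # an instant between sample points, far from both ends
err = abs(reconstruct(samples, fs, t) - math.cos(2 * math.pi * t))
print(err)  # small, but not zero: the sum was truncated
```

The slowly decaying sinc tails are exactly why the truncation matters: every sample, no matter how distant, contributes a little to the value at t.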
The sampling theorem was proposed by Nyquist in 1928 and formally proved by Shannon in 1949; it has no direct relationship with computers. However, since digital computers can only process discrete digital signals, and continuous signals must be sampled and quantized before a computer can handle them, the sampling theorem has fundamental guiding significance for computer-based signal processing.
Now let us focus on the meaning of "twice" in the sampling theorem, because I think this is where people are most likely to go wrong by extrapolating from the literal wording. A common misunderstanding is: "if a signal is sampled at sampling frequency fs, the information at frequencies fs/2 and above simply disappears." This is not only wrong but dangerous. What the sampling theorem's proof actually shows is that when a signal is sampled at frequency fs, the frequency components above fs/2 do not disappear; instead, they are mapped (folded) into the band below fs/2 and superimposed on the components originally there. This phenomenon is called "aliasing", and it is the inevitable result of discretizing any continuous signal (provable mathematically by Fourier analysis). The example shown below illustrates the phenomenon.
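The folding rule can be written down directly. A small sketch (the function name `alias_frequency` is my own, not standard terminology):

```python
def alias_frequency(f, fs):
    """Frequency at which a component at f appears after sampling at fs.

    The sampled spectrum repeats with period fs and, within one period,
    is mirrored about fs/2: reduce f modulo fs, then reflect if needed.
    """
    f = f % fs
    return fs - f if f > fs / 2 else f

# a 3 Hz component sampled at 4 Hz folds onto 1 Hz
print(alias_frequency(3, 4))            # -> 1
# 30 kHz electronic noise sampled at 44.1 kHz lands at an audible 14.1 kHz
print(alias_frequency(30_000, 44_100))  # -> 14100
```

Notice that the function is many-to-one: 1 Hz, 3 Hz, 5 Hz, ... all map to 1 Hz at fs = 4 Hz, which is exactly why the folded components cannot be separated again afterwards.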
The blue curve in the upper half of the figure is a section of x(t) = cos(2*pi*t); in the frequency domain it has a single line at f = 1 Hz. When we sample it at fs = 4 Hz, the resulting sample points are shown as the red circles in the upper half of the figure. Because our signal and sampling frequency satisfy the conditions of the sampling theorem, we can completely reconstruct x(t) from these points. The blue curve in the lower half of the figure is a section of y(t) = cos(2*pi*t) + cos(6*pi*t); in the frequency domain it has two lines, at f = 1 Hz and f = 3 Hz. When we sample it at fs = 4 Hz, we obtain the sample points shown as the red circles in the lower half of the figure. Note that each sample value in the lower half is exactly twice the corresponding sample value in the upper half. If we reconstruct a signal from the sample points in the lower half, we get 2*cos(2*pi*t), shown as the green dotted line, rather than the original y(t). The original f = 3 Hz component seems to have disappeared; in fact it has been folded symmetrically about fs/2 = 2 Hz onto f = 1 Hz and superimposed on the component that was originally there. This damage to the components below fs/2 cannot be undone. Thus an important guiding contribution of the sampling theorem is that it gives the minimum condition for avoiding aliasing damage: aliasing is an inevitable effect of sampling, but if the original signal contains nothing above fs/2, the signal is not damaged and can be "completely reconstructed".
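The doubling claim in the figure is easy to check numerically. A minimal sketch of the same setup:

```python
import math

fs = 4  # Hz
ts = [n / fs for n in range(16)]                # the sampling instants

x = [math.cos(2 * math.pi * t) for t in ts]     # upper half: 1 Hz tone
y = [math.cos(2 * math.pi * t) + math.cos(6 * math.pi * t) for t in ts]

# At t = n/4, cos(6*pi*t) = cos(2*pi*n - 2*pi*n/4) = cos(2*pi*n/4),
# i.e. the 3 Hz component takes exactly the same sample values as the
# 1 Hz component, so every y sample is twice the corresponding x sample.
doubled = all(abs(yk - 2 * xk) < 1e-12 for xk, yk in zip(x, y))
print(doubled)  # -> True
```

From the samples alone the two signals are therefore indistinguishable up to a factor of two, which is the aliasing argument in miniature.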
The misunderstanding mentioned above can lead to aliasing distortion being introduced right into the frequency range one wants to observe. Computer electronics (monitors, for example) emit many high-frequency noise signals; these do not vanish just because they lie above fs/2. Instead, sampling folds them down into the low-frequency band. Although the speech signal itself has a low-pass character (roughly -6 dB per octave), its high-frequency content is never exactly zero. This is why any real sampling system must include an anti-aliasing filtering step. The logic here is: sampling inevitably causes aliasing; aliasing that satisfies the conditions of the sampling theorem does not damage the signal (it remains reconstructible); therefore anti-aliasing filtering pre-processes the signal so that it satisfies those conditions. Of course, a real anti-aliasing filter cannot be ideal, and the closer it comes to ideal, the more it costs. In audio processing there is a technique called oversampling: the signal is first filtered by a relatively low-performance (and cheap) filter and then sampled at a frequency far greater than twice the band of interest, so that only components far above the filter's cutoff frequency, where its stop-band attenuation is already relatively good, can alias. These are technical details. We should note, however, that ordinary computer sound cards generally do not publish specifications for their anti-aliasing filters, because sound-card design tends to focus on playback, whereas professional recording equipment such as the CSL provides detailed specifications. Clearly, the price difference between them is not unreasonable.
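The oversample-then-decimate idea can be sketched in pure Python. Everything below (the 64 kHz rate, the length-8 moving average standing in for a cheap analog filter, the 30 kHz "noise" tone) is an illustrative assumption of mine, not a description of any real sound card:

```python
import cmath
import math

FS_HIGH = 64_000             # oversampled rate
DECIM = 8
FS_LOW = FS_HIGH // DECIM    # 8 kHz effective rate after decimation
N = FS_HIGH                  # one second of signal

# wanted tone at 1 kHz plus out-of-band "monitor noise" at 30 kHz
x = [math.sin(2 * math.pi * 1_000 * n / FS_HIGH)
     + math.sin(2 * math.pi * 30_000 * n / FS_HIGH) for n in range(N)]

# a deliberately cheap lowpass: length-8 moving average (a stand-in for
# a low-performance anti-aliasing filter; only ~ -21 dB at 30 kHz)
L = 8
y = [sum(x[n - L + 1:n + 1]) / L for n in range(L - 1, N)]

# decimate: keep every 8th sample, i.e. effectively sample at FS_LOW
z = y[::DECIM]

def amplitude(sig, f, fs):
    """Amplitude of the sinusoidal component at frequency f (one DFT bin)."""
    acc = sum(v * cmath.exp(-2j * math.pi * f * k / fs)
              for k, v in enumerate(sig))
    return 2 * abs(acc) / len(sig)

in_band = amplitude(z, 1_000, FS_LOW)  # the wanted tone, nearly intact
aliased = amplitude(z, 2_000, FS_LOW)  # 30 kHz folds to 2 kHz, attenuated
print(round(in_band, 2), round(aliased, 2))  # -> 0.97 0.09
```

Even this crude filter knocks the aliased component down by an order of magnitude before it folds into the band of interest; a real converter would follow the decimation with a much sharper digital filter.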
So much for the sampling theorem and filtering. One more interesting phenomenon can be explained by the sampling theorem. When we watch a movie, if a shot shows a propeller aircraft starting up, we see the propeller first appear to spin faster and faster, then suddenly seem to slow down or even rotate backwards. This is because a film camera effectively samples the continuously rotating propeller blades at a fixed frequency. Once the rotation rate of the blades exceeds fs/2 and keeps increasing, what we see is the aliased result. But that is a digression.
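The apparent rotation rate is just the aliased frequency taken with its sign. A toy sketch, assuming the standard 24 frames-per-second film rate and a single marked blade (`apparent_rate` is my own name):

```python
FPS = 24  # film frame rate: the propeller is "sampled" 24 times a second

def apparent_rate(f_rot, fps=FPS):
    """Signed rotation rate (revolutions/s) the film appears to show.

    The true rate is folded into [-fps/2, fps/2); a negative result
    means the propeller seems to turn backwards.
    """
    return (f_rot + fps / 2) % fps - fps / 2

# spinning up through 2, 10, 20, 30 rev/s, the on-screen rate becomes:
print([apparent_rate(f) for f in (2, 10, 20, 30)])  # -> [2.0, 10.0, -4.0, 6.0]
```

Below fps/2 = 12 rev/s the motion looks correct; just above it the propeller appears to reverse, and as the true rate keeps climbing the apparent rate sweeps back and forth through zero, exactly the film effect described above.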