AP Series Article--PDM microphone

Last Update:2016-06-12 Source: Internet

Author: User

Introduction

PDM stands for pulse density modulation. A more descriptive name, however, would be "1-bit oversampled audio", because PDM is simply a digital system with a high sample rate and a word length of a single bit. Its defining features are a sample rate many times that of the audio CD and a suitable method of reducing the word length from 16 bits to 1 bit. Together these form the basis of a PDM system.

Most modern digital audio systems use multi-bit PCM (pulse code modulation) to represent signals. PCM is easy to process: signal processing operations such as mixing, filtering, and equalization can be performed directly on the audio stream.

PDM represents audio using only 1 bit per sample, and is simpler than PCM both in concept and in implementation. It is commonly used in mobile phones to carry audio from the microphone to the signal processor. PDM is well suited to this task because it brings a number of benefits, such as low noise, freedom from interference, and low cost.

This article covers the fundamentals of PDM: how it is generated, transmitted, and processed.

Quick Glossary

DAC (digital-to-analog converter): A device that converts a digitally represented signal into an analog quantity.
LSB (least significant bit): The smallest change in a digital quantity. A bit is a binary digit.
MSB (most significant bit): The bit with the highest value. In a fixed-point signed number, it represents the sign.
PCM (pulse code modulation): A system that uses a series of multi-bit words to represent a sampled signal. This technique is used on audio CDs.
PDM (pulse density modulation): A system that uses a single bit per sample to represent a sampled signal.
Sampling rate: The rate at which a signal is sampled to produce a discrete-time representation.
Word length: The number of bits used to represent a sample.
Quantization: The process of representing an arbitrary sample value using a given word length.
Dither: A noise-like signal added before quantization to improve performance.
Linearization: A process used to mitigate the harmful effects of quantization, usually by adding dither.
Noise modulation: Undesirable changes in the background noise of a system caused by the signal content.

PCM

Before we talk about PDM, let's look at PCM, the traditional multi-bit form of digital audio. In PCM, the audio signal is represented by a series of sample values, each with a fixed number of bits. Two factors determine system performance:

Sampling frequency. This determines the bandwidth of the system.
Word length. This determines the signal-to-noise ratio (SNR) of the system.

In particular, the bandwidth is half the sampling frequency, and the signal-to-noise ratio is given by (6.02N + 1.76) dB, where N is the word length in bits.

By this formula, the theoretical signal-to-noise ratio of an undithered 16-bit system is about 98 dB. In practice, dither is used to linearize the system and eliminate noise modulation, which reduces the SNR by about 4 dB. Using the same formula, the signal-to-noise ratio of an undithered 1-bit system is about 8 dB, which is unacceptable for any realistic audio work. Moreover, ideal dither needs 2 LSBs to work. A 1-bit system has only a single bit, and that bit is used for the audio, so there is no room for dither.
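As a quick check of the arithmetic, the formula above can be evaluated for a few word lengths. This is only a sketch of the formula itself; the function name `pcm_snr_db` is made up for illustration.

```python
def pcm_snr_db(word_length_bits: int) -> float:
    """Theoretical SNR of an undithered PCM system: 6.02N + 1.76 dB."""
    return 6.02 * word_length_bits + 1.76

# Evaluate the formula for a 1-bit, a 16-bit, and a 24-bit word length.
for n in (1, 16, 24):
    print(f"{n:2d}-bit word length: {pcm_snr_db(n):7.2f} dB")
```

Each extra bit of word length adds about 6 dB, which is why dropping from 16 bits to 1 bit costs roughly 90 dB.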

Because the system cannot be properly dithered, a 1-bit representation seems unworkable at first glance. The solution lies in understanding noise shaping and oversampling.

Noise Shaping

Consider a typical PCM signal, such as a sine wave represented with 24-bit words. How can this waveform be represented in a system with a 1-bit word length, given that such a system has serious noise and distortion problems?

One method is to discard all bits except the most significant bit, which effectively thresholds the signal at zero. This converts the sine wave into a square wave that switches at the zero crossings, and it introduces a great deal of distortion: over 40%, in fact. The distortion arises because the system is not dithered. Quantization always causes error, but in a dithered system the error takes the form of white background noise that is uncorrelated with the signal. In a system that is undithered or under-dithered, much of the error takes the form of distortion.
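The distortion figure can be estimated numerically. The sketch below keeps only the sign of one period of a sine wave and compares the power outside the fundamental with the power in the fundamental. The sample count `N` and the half-sample phase offset are arbitrary choices made for this illustration.

```python
import math

N = 4096  # samples in exactly one period of the sine wave

# One period of a sine; the half-sample offset avoids samples that are exactly zero.
sine = [math.sin(2 * math.pi * (n + 0.5) / N) for n in range(N)]

# Keep only the sign (the most significant bit): the result is a square wave.
square = [1.0 if s >= 0 else -1.0 for s in sine]

# Amplitude of the fundamental in the square wave (single-bin DFT against the
# original sine; the cosine component is zero by symmetry).
fundamental = 2 / N * sum(q * s for q, s in zip(square, sine))

total_power = sum(q * q for q in square) / N
fundamental_power = fundamental ** 2 / 2

# Everything that is not the fundamental counts as distortion.
thd = math.sqrt((total_power - fundamental_power) / fundamental_power)
print(f"Distortion of the thresholded sine: {100 * thd:.1f}%")
```

The result should land a little under 50%, consistent with the "over 40%" figure quoted above.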

Simply keeping the most significant bit is therefore not the way to reduce the word length to 1 bit. However, there is a familiar example of word length reduction to 1 bit that works well: the halftone, which has been the basis of reproducing images in print since the invention of the newspaper.

In a halftone, a continuous-tone image (such as a grayscale photograph) is converted into a pattern of black and white dots. In other words, the word length is reduced to 1 bit, and the state of that bit corresponds to a black dot or a white dot. Instead of simply thresholding, the error produced by thresholding each point is distributed to adjacent points that have not yet been thresholded. This process is called error diffusion. (There are many other ways to create a halftone, which we don't consider here.) The effect of error diffusion on image quality is enormous.
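A minimal one-dimensional sketch can make the difference between thresholding and error diffusion concrete. Real halftoning spreads the error over several neighbours in two dimensions; here, as a simplifying assumption, all of the error is passed to the next pixel in a single row. The flat 25% gray row is made-up test data.

```python
def threshold(pixels):
    """Plain thresholding: each pixel becomes a black (0) or white (1) dot."""
    return [1 if p >= 0.5 else 0 for p in pixels]

def error_diffuse(pixels):
    """1-D error diffusion: carry each pixel's thresholding error to the next pixel."""
    dots, error = [], 0.0
    for p in pixels:
        value = p + error            # add the error left over from the previous pixel
        dot = 1 if value >= 0.5 else 0
        error = value - dot          # the part of the value the dot failed to represent
        dots.append(dot)
    return dots

gray = [0.25] * 1000                 # a flat row of 25% gray (hypothetical test data)
plain = threshold(gray)
diffused = error_diffuse(gray)

print("dot density after thresholding:   ", sum(plain) / len(plain))
print("dot density after error diffusion:", sum(diffused) / len(diffused))
print("first 12 diffused dots:", diffused[:12])
```

Thresholding alone turns the whole row black, while error diffusion produces a dot density that matches the original gray level.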

Why does diffusing the thresholding error improve the visual quality of the picture so much? The answer is that error diffusion does two things. First, it transforms the distortion caused by simple thresholding into something much closer to background noise. Second, it shapes the noise floor so that the noise is reduced at low spatial frequencies, at the cost of increased noise at high spatial frequencies. High spatial frequencies are filtered out by the limited resolution of the human eye, so once the dots are small enough (or the image is far enough away), most of the high-frequency noise is almost invisible.

As a result, the severe distortion generated by thresholding becomes benign, high-pass noise. "Benign" here means that the effect is acceptable, although it is not true background noise, because the system is not dithered. The noise is still correlated with the signal, and it shows tonal behavior and other visual artifacts. Even so, the visual result is good.

The halftone system is an example of noise shaping. The noise caused by reducing the word length is shaped so that it is high-pass rather than low-pass. In general, noise shaping systems can have any output word length, and their noise transfer functions need not be high-pass. However, most such systems, including PDM systems, have 1-bit outputs and high-pass noise transfer functions.
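The same idea carries over to audio. The sketch below applies a first-order error-feedback quantizer, the one-dimensional analogue of error diffusion, to a sine wave and compares the total error power with the error power that survives a crude low-pass filter (a block average). The block length, signal amplitude, and function names are assumptions chosen for illustration, and practical modulators are of higher order.

```python
import math

def first_order_modulator(samples):
    """Quantize to +1/-1, feeding each sample's quantization error into the next."""
    bits, error = [], 0.0
    for x in samples:
        v = x - error                    # subtract the previous quantization error
        y = 1.0 if v >= 0 else -1.0      # 1-bit quantizer
        error = y - v
        bits.append(y)
    return bits

def power(seq):
    return sum(s * s for s in seq) / len(seq)

BLOCK = 64                               # averaging length (assumed for illustration)
N = BLOCK * 256
x = [0.5 * math.sin(2 * math.pi * 4 * n / N) for n in range(N)]
y = first_order_modulator(x)
err = [b - a for a, b in zip(x, y)]      # error introduced by the 1-bit quantization

# Averaging over blocks acts as a crude low-pass filter on the error.
low = [sum(err[i:i + BLOCK]) / BLOCK for i in range(0, N, BLOCK)]

total_error_power = power(err)
low_freq_error_power = power(low)
ratio_db = 10 * math.log10(total_error_power / low_freq_error_power)
print(f"total error power:         {total_error_power:.4f}")
print(f"low-frequency error power: {low_freq_error_power:.6f}")
print(f"difference:                {ratio_db:.1f} dB")
```

The total error is large, as expected for a 1-bit system, but very little of it remains at low frequencies: the noise has been shaped toward high frequencies.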

Oversampling

The noise caused by reducing the word length is enormous (the noise in a 1-bit system is approximately 90 dB higher than in a 16-bit system). Noise shaping redistributes the noise toward high frequencies, but it does not reduce the overall noise level. In imaging applications, most of the image content is at low frequencies, so pushing the noise to high frequencies (where it may obscure high-frequency detail) is not a problem. In audio, however, the middle and high frequencies are very important and are easily heard. If the word length is reduced to 1 bit, acceptable results cannot be obtained even with noise shaping, because the high-frequency noise is easily audible.

The answer is to use a higher sampling rate. This increases the bandwidth of the system and creates new spectral space above the audible range. Noise shaping can then be used to push the noise into this higher part of the spectrum. In effect, more room is created in which to dump the noise, and because that part of the spectrum lies above the audible range, the noise is inaudible.

The higher sampling frequency can be obtained in the following two ways:

Sample at a higher frequency from the start. This method is used in PDM microphones, where the typical sample rate is about 3 MHz.
Interpolate a signal that has already been sampled at a low rate. This method is used by many DACs, where the typical input sampling frequency is 48 kHz. It is also used in audio systems that represent audio internally as PCM but transmit audio to external devices in PDM format.

Now let's look at these two methods in more detail.

PDM Microphone

A PDM microphone, also known as a digital microphone, contains the following parts:

A microphone element, typically an electret capsule.
An analog pre-amplifier.
A PDM modulator.
Interface logic.

The analog signal from the microphone element is first amplified, then sampled and quantized at a high rate in the PDM modulator. The modulator combines quantization and noise shaping, and its output is a single bit at a high rate. Noise shaping ensures that the noise within the audio band is relatively low, although the noise above the audio band is relatively high. The interface logic accepts a master clock and outputs the sampled bitstream.

The device to which the microphone is connected supplies a master clock to the PDM microphone. The clock frequency sets the sampling frequency of the system and the bit rate on the data line. Although no standard is defined, the typical oversampling ratio is 64. So to obtain a 24 kHz bandwidth (comparable to a PCM system with a 48 kHz sampling frequency), the master clock frequency needs to be 3.072 MHz.
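The clock arithmetic is simple enough to write down directly. In this sketch the function name and the default ratio of 64 are illustrative; as noted above, the ratio is typical rather than standardized.

```python
def pdm_clock_hz(baseband_rate_hz: int, oversampling_ratio: int = 64) -> int:
    """Master clock needed for a given baseband (PCM-equivalent) sample rate."""
    return baseband_rate_hz * oversampling_ratio

baseband_rate = 48_000                  # Hz, the PCM-equivalent sampling frequency
clock = pdm_clock_hz(baseband_rate)
bandwidth = baseband_rate // 2          # audio bandwidth is half the baseband rate

print(f"master clock: {clock / 1e6:.3f} MHz")
print(f"audio bandwidth: {bandwidth / 1e3:.0f} kHz")
```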

The 1-bit data on the data line is valid on either the rising or the falling edge of the master clock. Most PDM microphones support two-channel operation: one microphone's data is valid on the rising edge of the master clock, and the second microphone's data is valid on the falling edge. On its inactive edge, each data output is high impedance, so the data lines of the two microphones can simply be tied together. The PDM receiver separates the two bitstreams.
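On the receiving side, separating the two bitstreams amounts to de-interleaving. The sketch below assumes the receiver has already sampled the shared data line on every clock edge, starting with a rising edge, and that microphone A drives the line on rising edges. Which microphone uses which edge depends on how each part is configured, so that assignment is an assumption, and the function names are hypothetical.

```python
def share_data_line(mic_a_bits, mic_b_bits):
    """Model the shared line: A's bit on each rising edge, B's bit on each falling edge."""
    line = []
    for a, b in zip(mic_a_bits, mic_b_bits):
        line.extend((a, b))
    return line

def split_stereo(edge_samples):
    """Separate samples taken on alternating edges into two bitstreams."""
    rising = edge_samples[0::2]     # samples captured on rising edges
    falling = edge_samples[1::2]    # samples captured on falling edges
    return rising, falling

mic_a = [1, 0, 1, 1]
mic_b = [0, 0, 1, 0]
line = share_data_line(mic_a, mic_b)
recovered_a, recovered_b = split_stereo(line)
print(line, recovered_a, recovered_b)
```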

DACs and PCM-PDM Converters

In many commercial DACs, and in systems that convert PCM to PDM, the process differs slightly from that of the PDM microphone. The signal has already been sampled at a low rate and is in PCM format. To obtain the high sampling frequency that makes noise shaping effective, the signal must first be interpolated. Its word length is then reduced to 1 bit in the noise shaper.

Interpolation is a digital filtering operation that generates additional sample values between the existing samples in order to increase the effective sampling frequency. For PDM applications, the oversampling ratio is typically 64, meaning that 63 new sample values are generated between each pair of input samples.
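The sample counting can be illustrated with the simplest possible interpolator. The sketch below uses straight-line (linear) interpolation purely to show where the 63 new values go; this is an assumption made for brevity, and practical interpolators use low-pass digital filters instead.

```python
def linear_interpolate(samples, ratio=64):
    """Insert ratio - 1 evenly spaced values between each pair of input samples."""
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(ratio):
            out.append(a + (b - a) * k / ratio)   # k = 0 reproduces the input sample
    out.append(samples[-1])
    return out

pcm = [0.0, 1.0, 0.0]                  # three low-rate input samples (made-up data)
oversampled = linear_interpolate(pcm)

print("input samples: ", len(pcm))
print("output samples:", len(oversampled))
print("new values between the first two inputs:", len(oversampled[1:64]))
```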

PDM Modulator

The PDM modulator (in a PDM microphone) or noise shaper (in a PCM-to-PDM converter) generates a 1-bit signal with very low noise in the passband. The complexity of a modulator is described by its order, which equals the number of integrators (summing nodes) it contains. In general, the higher the order, the more aggressively noise is pushed from the passband into the stopband, and the better the noise performance. However, higher-order modulators are more complex to design and build, are more prone to instability under certain operating conditions, and have a lower maximum input level before overload. Although there is no industry standard, the modulator in a typical PDM microphone is fourth-order, which provides a good tradeoff between noise performance and complexity.

Consider the time-domain and frequency-domain views of the PDM modulator output for a sine wave input. In the time domain, the output switches between two levels at a high rate. In the frequency domain, the passband extends along the x-axis from 0 to 0.5 fs (with fs here being the baseband sampling frequency). Above this lies the spectral space created by oversampling, where the steep rise in noise above the passband is easy to see. A small amount of third-harmonic distortion is also visible (the peak near 0.06 fs).

Transmission and Processing of PDM Signals

A PDM bitstream is a logic-level signal with a typical switching frequency near 3 MHz and fast edges, so it must be treated like any other fast signal (e.g. S/PDIF or analog video). It is important to use high-quality coaxial cable and to terminate the signal properly.

For a signal to be heard, it must be converted to analog form. For it to be processed or analyzed by test equipment, it must be converted to PCM. Both conversions are possible starting from a PDM signal.

Converting a PDM signal to analog is very simple. The 1-bit signal already contains the audio signal in the low-frequency part of its spectrum, so all that is needed to recover it is a low-pass filter. In practice, the fast switching edges of the signal require careful design of the analog filter stages, but very high-quality analog signals can be recovered in this way.
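A rough simulation shows the principle. In the sketch below, a first-order error-feedback quantizer stands in for a real PDM source, and a one-pole digital filter stands in for an analog RC low-pass filter. Both are simplifying assumptions, and the filter coefficient `alpha` is an arbitrary illustrative value.

```python
def first_order_modulator(samples):
    """Stand-in 1-bit modulator: quantize to +1/-1 with error feedback."""
    bits, error = [], 0.0
    for x in samples:
        v = x - error
        y = 1.0 if v >= 0 else -1.0
        error = y - v
        bits.append(y)
    return bits

def one_pole_lowpass(bits, alpha=0.01):
    """Discrete-time model of a simple RC low-pass filter."""
    out, state = [], 0.0
    for b in bits:
        state += alpha * (b - state)    # move a small step toward the input
        out.append(state)
    return out

level = 0.25                            # constant input level (made-up test signal)
pdm = first_order_modulator([level] * 4000)
analog = one_pole_lowpass(pdm)

print("first 8 PDM bits:", pdm[:8])
print(f"input level {level}, filtered output after settling {analog[-1]:.3f}")
```

Once the filter has settled, its output sits close to the original input level, with a small ripple left over from the high-frequency switching.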

Converting PDM to PCM is more complex. The sampling frequency must be reduced by the oversampling factor, which is done by a digital filtering process called decimation. Decimation is the counterpart of interpolation: samples are removed from the signal to reduce the sampling frequency. It is important that the noise of the 1-bit signal above the audio band is not aliased into the audio band. The decimation filter is designed to remove this noise while leaving the baseband audio signal intact. The output of the decimator is a PCM audio stream at the baseband rate (with no oversampling). Typically, the filtering process increases the word length from 1 bit to about 20 significant bits.
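The steps above can be sketched with the crudest possible decimator: average each block of 64 bits and keep one value per block. This is an assumption made to keep the example short. A block average rejects out-of-band noise far less effectively than a purpose-designed decimation filter, and the first-order modulator used to create the test bitstream is likewise only a stand-in for a real PDM source.

```python
import math

def first_order_modulator(samples):
    """Stand-in 1-bit modulator: quantize to +1/-1 with error feedback."""
    bits, error = [], 0.0
    for x in samples:
        v = x - error
        y = 1.0 if v >= 0 else -1.0
        error = y - v
        bits.append(y)
    return bits

def decimate(values, ratio=64):
    """Average each block of `ratio` values and keep one output per block."""
    return [sum(values[i:i + ratio]) / ratio
            for i in range(0, len(values) - ratio + 1, ratio)]

RATIO = 64
N = RATIO * 48
x = [0.5 * math.sin(2 * math.pi * n / N) for n in range(N)]   # oversampled test tone

pdm = first_order_modulator(x)          # 1-bit stream at the high rate
pcm = decimate(pdm, RATIO)              # multi-bit samples at the baseband rate
reference = decimate(x, RATIO)          # the same filter applied to the clean input

worst = max(abs(p - r) for p, r in zip(pcm, reference))
print(f"{len(pdm)} PDM bits -> {len(pcm)} PCM samples")
print(f"largest deviation from the reference: {worst:.4f}")
```

Note that each output sample is an average of 64 one-bit values, so it takes more than 1 bit to represent. This is the word length growth mentioned above.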

Performance

1-bit systems are very mature. Although a 1-bit system has inherent problems, especially the inability to add enough dither to fully linearize the system and eliminate noise modulation, it can be designed to deliver excellent audio performance.

PDM modulators are usually proprietary, so their performance depends on the particular design. The PDM interface option for APx uses a fourth-order modulator together with six-stage interpolation/decimation filters that provide 120 dB of image/alias rejection. The resulting system has the following specifications:

Maximum input level before overload: -6 dBFS
SNR @ 1 kHz, -6 dBFS, 20 Hz to 20 kHz, unweighted: 109 dB
THD+N @ 1 kHz, -6 dBFS, 20 Hz to 20 kHz, unweighted: -107 dB
Third-harmonic distortion @ 1 kHz, -6 dBFS: -116 dB
Flatness, 20 Hz to 20 kHz: better than ±0.001 dB

All high-order PDM modulators have a maximum input level below full scale. Exceeding this level overloads the modulator and degrades its noise performance. The APx user interface indicates when the modulator is overloaded.
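To put the -6 dBFS overload level in perspective, levels in dBFS can be converted to a fraction of full-scale amplitude. The helper names below are made up for this sketch.

```python
import math

def dbfs_to_linear(level_dbfs: float) -> float:
    """Convert a level in dBFS to amplitude as a fraction of full scale."""
    return 10 ** (level_dbfs / 20)

def linear_to_dbfs(amplitude: float) -> float:
    """Convert a fraction of full-scale amplitude to a level in dBFS."""
    return 20 * math.log10(amplitude)

max_input = dbfs_to_linear(-6.0)
print(f"-6 dBFS is {100 * max_input:.1f}% of full-scale amplitude")
print(f"half of full scale is {linear_to_dbfs(0.5):.2f} dBFS")
```

In other words, a modulator with this limit overloads at roughly half of the full-scale amplitude.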

The THD+N performance of the system is determined by the noise floor of the modulator. A very small amount of third-harmonic distortion is present, because the system is not dithered.

Conclusion

PDM is a cost-effective way to transport audio digitally, in mono or in two channels, over a single clock/data pair. Although the 1-bit format has inherent limitations, it can deliver extremely high audio performance when carefully designed. The APx PDM interface option can generate and analyze PDM signals, greatly simplifying all aspects of the design and troubleshooting of a PDM signal chain.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
