Audio design in smart phones

Source: Internet
Author: User


As mobile phones keep integrating functions such as photography, games, data, and video, they have become a playback platform for multimedia applications; it can be said that they are developing into fully featured portable mini-computers. In terms of positioning, such a phone is different from a voice phone or a feature phone with a few added functions: it is a smart phone.

In addition to strong data editing and management capabilities, smart phones provide audio, video, game, and other multimedia application services, and can handle multiple tasks at the same time. Their functions span communication, information, and multimedia, namely:

1. Communication functions: voice, messaging, authentication, billing, and other communication processing functions;

2. Information functions: email, calendar, information management, sync, security, and other information processing functions;

3. Multimedia functions: video, photography, games, TV, streaming, music, DRM, and other multimedia application functions.

Apart from the information functions, audio is a necessary processing task for both communication and multimedia applications. In the past, mobile phones only needed to process plain voice call signals. The audio workload in today's smart phones is much heavier: besides polyphonic ringtones and MP3 music, there may also be FM broadcast and game sound effects, and not just in mono. What is needed now is stereo sound with a sense of presence.

In the past, the world of digital audio was completely divided into two parts: the world of hi-fi and the world of speech. Generally speaking, hi-fi refers to 16-bit stereo audio sampled at 44.1 kHz, that is, CD-quality music. Telephone voice is low-quality audio (8-bit, 8 kHz, mono). However, in the era of smart phones, the two audio worlds have begun to collide. How to integrate the audio subsystem with the application and communication processing platforms has become a key challenge for portable device engineers developing new products.
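To make the gap between the two worlds concrete, the raw data rate of each format can be worked out from the figures quoted above. This is only a back-of-the-envelope sketch; the function name `pcm_bit_rate` is made up for illustration, and the calculation assumes plain uncompressed PCM.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw (uncompressed) PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# Figures taken from the text: CD-style hi-fi versus telephone voice.
hifi = pcm_bit_rate(44_100, 16, 2)   # 16-bit stereo at 44.1 kHz
voice = pcm_bit_rate(8_000, 8, 1)    # 8-bit mono at 8 kHz

print(f"hi-fi: {hifi} bit/s, voice: {voice} bit/s, ratio: {hifi / voice:.2f}")
```

The hi-fi stream carries roughly twenty times more data than the voice stream, which is one reason the two have traditionally been handled by separate hardware.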

Audio encoding formats and interfaces

Before discussing the system architecture, let's look at the current state of audio encoding. There are many audio encoding formats, including PCM, ADPCM, DM, PWM, WMA, Ogg, AMR, AAC, mp3PRO, and MP3; for human speech there are LPC, CELP, and ACELP; and there are also audio and video program encoding formats such as MPEG-2, MPEG-4, H.264, and VC-1. For the market trend of mobile multimedia formats, see Figure 1.

Figure 1 mobile multimedia format application market trend

The following describes three common audio formats:

AMR format

AMR is the adaptive multi-rate speech codec. The initial version was the speech codec standard set by the European Telecommunications Standards Institute (ETSI) for the GSM system. By bandwidth, it is divided into two types: AMR-NB (AMR narrowband) and AMR-WB (AMR wideband). Most mobile phones from Nokia, the biggest brand in the market, support audio files in both formats.

MP3 format

MP3 is short for MPEG Audio Layer 3. It is an audio compression technology with a high compression ratio that keeps the low-frequency part essentially undistorted; the file size is reduced at the cost of the high-frequency components in the 12 kHz to 16 kHz range. An .mp3 file is generally only about 10% of the size of the equivalent .WAV file. In addition, one of the major reasons why MP3 is popular is that it is not a technology protected by copyright, so anyone can use it.

MP3 music can be compressed at many bit rates. You can encode at 64 kbps or lower to save space, or use a higher bit rate to achieve very high sound quality. MP3 encoding is divided into "CBR" (constant bit rate) and "VBR" (variable bit rate). When a mobile phone cannot play downloaded music, the reason is often that it does not support MP3 files in the "VBR" format.
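For constant-bit-rate encoding, the file size follows directly from the bit rate and the duration, which makes it easy to check the "about 10% of a WAV file" rule of thumb. The sketch below is illustrative only: the 128 kbps rate and the four-minute track length are assumptions that do not come from the text, and header and tag overhead is ignored.

```python
def cbr_size_bytes(bit_rate_kbps: float, duration_s: float) -> float:
    """Approximate size of a constant-bit-rate stream, ignoring headers and tags."""
    return bit_rate_kbps * 1000 * duration_s / 8


def wav_size_bytes(duration_s: float, sample_rate_hz: int = 44_100,
                   bits: int = 16, channels: int = 2) -> float:
    """Approximate size of uncompressed PCM audio (CD-style defaults)."""
    return sample_rate_hz * bits * channels * duration_s / 8


duration = 240  # assumed track length: four minutes

wav = wav_size_bytes(duration)
mp3_64 = cbr_size_bytes(64, duration)    # the low rate mentioned in the text
mp3_128 = cbr_size_bytes(128, duration)  # assumed "typical" rate, not from the text

for label, size in (("64 kbps", mp3_64), ("128 kbps", mp3_128)):
    print(f"{label}: {size / 1e6:.2f} MB, {100 * size / wav:.1f}% of WAV")
```

A VBR file has no single bit rate, so this kind of estimate only applies to the CBR case.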

AAC format

AAC (Advanced Audio Coding) works differently from MP3. AAC supports up to 48 audio channels, 15 low-frequency channels, more sampling rates and bit rates, multi-language compatibility, and higher decoding efficiency. In short, AAC can provide better sound quality at a file size about 30% smaller than MP3, with fidelity closer to the original sound; the mobile phone industry therefore regards it as the best audio encoding format. AAC is a large family, divided into nine specifications to meet the needs of different occasions:

(1) MPEG-2 AAC LC: low complexity specification

(2) MPEG-2 AAC Main: main specification

(3) MPEG-2 AAC SSR: scalable sample rate specification

(4) MPEG-4 AAC LC: low complexity specification; the audio part of the MP4 files now common on mobile phones includes audio in this specification

(5) MPEG-4 AAC Main: main specification

(6) MPEG-4 AAC SSR: scalable sample rate specification

(7) MPEG-4 AAC LTP: long term prediction specification

(8) MPEG-4 AAC LD: low delay specification

(9) MPEG-4 AAC HE: high efficiency specification

Among the above specifications, the main specification includes all functions other than gain control and has the best sound quality, while the low-complexity specification (LC) is simpler: it has no gain control, but its encoding efficiency is improved. The SSR specification is roughly the same as LC, with the gain control function added. LTP, LD, and HE are all used for encoding at low bit rates; among them, HE is supported by the Nero AAC encoder and has recently become a common choice for low-bit-rate encoding. However, the main and LC specifications differ little in sound quality. Therefore, considering that mobile phone memory is still limited, the most frequently used AAC specification is LC.

Audio interface

Audio interfaces are an important topic for smart phone designers. Digital speech generally uses the PCM (pulse code modulation) interface, while hi-fi stereo uses the serial I2S (Inter-IC Sound) interface or the AC'97 interface. I2S is a bus standard developed by Philips for transmitting audio data between digital audio devices, and it is a common interface in consumer audio products. AC'97 is Intel's specification for improving the sound of personal computers and reducing noise. As it was developed in 1997, it was called AC'97.

Computer audio requirements are basically similar to those of the consumer market, but music files with different sampling rates (8 kHz, 44.1 kHz, 48 kHz) must be played, so a more efficient and inexpensive solution is needed, and AC'97 has this feature. In the broad handheld device market, each of the three interfaces has its own advocates: CD, MD, and MP3 players use I2S; mobile phones use PCM; and PDAs with audio functions use the same AC'97 interface as the PC.
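One practical consequence of supporting several sampling rates on a serial interface such as I2S is that the bit clock changes with the rate, since the bit clock is the sample rate times the slot width times the number of channels. The sketch below assumes 16-bit slots and two channels; real devices often use wider slots, and the function name is hypothetical.

```python
def i2s_bit_clock_hz(sample_rate_hz: int, bits_per_slot: int = 16,
                     channels: int = 2) -> int:
    """Serial bit clock needed to shift out one frame per sample period."""
    return sample_rate_hz * bits_per_slot * channels


# The three sampling rates mentioned in the text.
rates = [8_000, 44_100, 48_000]
bclk = {fs: i2s_bit_clock_hz(fs) for fs in rates}

for fs, clk in bclk.items():
    print(f"{fs} Hz sample rate -> {clk} Hz bit clock")
```

Because 44.1 kHz and 48 kHz are not simple multiples of each other, a single fixed clock cannot serve both without some form of rate conversion or clock synthesis.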

Audio system integration strategy

In earlier systems, the telephone and PDA circuits were usually placed side by side in the device housing: the PCM voice codec was controlled by the communication processor, and the hi-fi stereo codec (AC'97 or I2S) was connected to a separate application processor. In this architecture, the level of integration between the two audio subsystems is very low. Besides occupying space, the separate hardware needs additional peripheral components and switching circuits for signal exchange and mixing, which can also cause problems such as harmonic distortion. See Figure 2.

Figure 2 audio architecture processed by PCM and stereo respectively

Therefore, an integrated solution tailored to the specific application is ideal. Following the SoC technology trend, some vendors have integrated the stereo digital-to-analog converter (DAC) or codec into ICs that serve other specific functions. However, some functions are suitable for integration, while integrating others can be counterproductive.

For example, when manufacturers integrate power management and audio processing functions, they usually have to compromise on sound quality, because noise from the power regulator interferes with the nearby audio path. It is also difficult to integrate the audio function into a digital IC: hi-fi components need a 0.35 µm process to optimize mixed-signal performance, whereas digital logic is now moving to processes below 0.18 µm. With either integration strategy, making two different kinds of circuit coexist on one chip may result in a final chip size too large to be acceptable.

In addition, the loudspeaker amplifier is particularly difficult to integrate. The heat it generates must be dissipated, so an independent speaker driver IC is often needed. Another common integration issue is the need to keep the number of analog input and output connections as small as possible.

A dedicated audio IC can avoid these problems, and there are several ways to achieve audio integration. Sharing the ADC and DAC reduces hardware cost, but then two audio streams cannot be played or recorded at the same time. Dedicated converters for each function solve this problem, but increase chip cost. A compromise is to share only the ADC while providing independent DACs. In this way, other audio can be played while a call is in progress (such as the ringtone of another incoming call, or music). However, a second audio stream cannot be recorded during a call. The power consumption of the ADC can be kept down by running at a lower sampling rate or by turning off unused functions. See Figure 3 and Figure 4.

Figure 3 exclusive audio processing system concept

Figure 4 architecture of smart phone audio processing
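The trade-off between shared converters, dedicated converters, and the shared-ADC compromise can be pictured as a simple resource check. The model below is a deliberately simplified sketch: the class, the use-case names, and the converter counts are all assumptions made for illustration (a call is taken to need one ADC and one DAC, playback one DAC, recording one ADC), not a description of any particular chip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodecLayout:
    """Hypothetical converter budget of an audio IC."""
    name: str
    adcs: int
    dacs: int


# Assumed converter needs per use case: (ADCs, DACs).
NEEDS = {
    "voice_call": (1, 1),
    "music_playback": (0, 1),
    "recording": (1, 0),
}


def can_run(layout: CodecLayout, use_cases: list) -> bool:
    """True if the layout has enough converters for all use cases at once."""
    adcs = sum(NEEDS[u][0] for u in use_cases)
    dacs = sum(NEEDS[u][1] for u in use_cases)
    return adcs <= layout.adcs and dacs <= layout.dacs


shared = CodecLayout("shared ADC and DAC", adcs=1, dacs=1)
dedicated = CodecLayout("dedicated converters", adcs=2, dacs=2)
compromise = CodecLayout("shared ADC, independent DACs", adcs=1, dacs=2)

for layout in (shared, dedicated, compromise):
    print(layout.name,
          "| call + music:", can_run(layout, ["voice_call", "music_playback"]),
          "| call + recording:", can_run(layout, ["voice_call", "recording"]))
```

Under these assumptions the compromise layout allows playback during a call but not a second recording path, which is the price paid for saving one ADC.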

The audio system can also be partitioned in other ways. When the voice codec is integrated into the communication chipset, it is appropriate to add a separate hi-fi codec with additional analog inputs, outputs, and internal mixing. In other cases, a dual codec with a dedicated PCM interface for directly connecting wireless headsets (such as Bluetooth) also has its advantages. See Figure 5.

Figure 5 dual codec audio architecture with wireless earphone functions (such as Bluetooth)

The following is a planning analysis of several important components in the audio system:

Clock and interface

Although sharing internal circuits between the communication and application subsystems is feasible, this is not the case for the interfaces, because the different audio applications have to run from their own clocks in independent clock domains. As long as this remains so, a codec for integrated smart phones must provide a PCM interface and an independent I2S or AC'97 interface at the same time.

In non-mobile devices (such as PCs), the audio clock is usually generated by a crystal oscillator. In smart phone design, however, to avoid the extra power consumption, board space, and cost of a clock chip, designers prefer to derive the clocks required by hi-fi audio from clocks already present in the system. Because a low-power, low-noise phase-locked loop (PLL) can be integrated into a mixed-signal chip at relatively low cost, today's chip manufacturers are integrating one or two PLLs into their smart phone codecs.
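To see why a PLL is needed rather than a simple divider, it helps to compute the exact ratio between an existing system clock and an audio master clock. The values below are assumptions for illustration only: a 13 MHz reference and a master clock of 256 times the sample rate are common choices, but neither figure comes from the text.

```python
from fractions import Fraction

REF_CLOCK_HZ = 13_000_000   # assumed existing system clock (hypothetical value)
MCLK_MULTIPLE = 256         # assumed audio master clock = 256 x sample rate


def pll_ratio(sample_rate_hz: int, ref_hz: int = REF_CLOCK_HZ,
              multiple: int = MCLK_MULTIPLE) -> Fraction:
    """Exact multiply/divide ratio needed to turn ref_hz into the audio master clock."""
    return Fraction(sample_rate_hz * multiple, ref_hz)


for fs in (44_100, 48_000):
    ratio = pll_ratio(fs)
    print(f"{fs} Hz: multiply by {ratio.numerator}, divide by {ratio.denominator}, "
          f"master clock = {float(ratio * REF_CLOCK_HZ):.0f} Hz")
```

The ratios are awkward fractions rather than small integers, so a fractional PLL, or a PLL with large integer dividers, is the natural way to generate these clocks from a shared reference.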


Microphone

The most difficult design issues in smart phones are often related to microphones. Generally, there are at least two microphones to consider: the built-in internal microphone and the external microphone on a plug-in headset. There may also be additional internal microphones for noise cancellation or stereo recording, as well as another external microphone for the in-vehicle hands-free feature. In addition to voice calls, these microphones can also be controlled by the application processor to record sound for voice messages or short videos.

If the audio codec chip is to cover the various switching functions, its circuitry needs to be designed accordingly. In addition to recording, the codec should provide a sidetone function so that headset users can hear their own voice. Insertion detection enables seamless switching: when the headset is plugged in or pulled out, the system automatically switches between the internal and external microphone and earphone.
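Sidetone is essentially a small amount of the user's own microphone signal mixed into the earpiece path. In a codec this is normally done in hardware; the software sketch below only illustrates the idea. The function name, the default gain of -18 dB, and the normalised sample range of -1.0 to 1.0 are all assumptions.

```python
def mix_sidetone(downlink, mic, sidetone_gain_db=-18.0):
    """Add an attenuated copy of the microphone signal to the earpiece signal.

    Samples are floats in the range -1.0..1.0; the result is clipped to that range.
    """
    gain = 10 ** (sidetone_gain_db / 20)   # convert dB to a linear factor
    mixed = []
    for far_end, own_voice in zip(downlink, mic):
        sample = far_end + gain * own_voice
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


# Tiny example: silence from the far end, the user speaking at full scale.
print(mix_sidetone([0.0, 0.0], [1.0, -1.0], sidetone_gain_db=-20.0))
```

The gain is kept well below unity so that users hear themselves naturally without the sidetone competing with the far-end voice.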

Acoustic noise cancellation is another issue. It uses two microphones: one picks up both the speech and the background noise, and the other picks up only the background noise. Analog methods are often insufficient, so they need to be enhanced through digital signal processing, which means the audio codec has to digitize both microphone signals.
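The text does not say which digital algorithm is used. One standard technique for this two-microphone arrangement is a least-mean-squares (LMS) adaptive filter, shown here purely as an illustrative stand-in: the filter learns how the noise-only reference appears in the primary microphone and subtracts that estimate. The signals, the tap count, and the step size are all synthetic assumptions.

```python
import math
import random


def lms_cancel(primary, reference, taps=4, mu=0.05):
    """Subtract an adaptively filtered copy of the reference from the primary signal."""
    weights = [0.0] * taps
    history = [0.0] * taps          # most recent reference samples, newest first
    cleaned = []
    for p, r in zip(primary, reference):
        history = [r] + history[:-1]
        noise_estimate = sum(w * x for w, x in zip(weights, history))
        error = p - noise_estimate  # what remains is the speech estimate
        weights = [w + mu * error * x for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def mean_square(xs):
    return sum(x * x for x in xs) / len(xs)


# Synthetic test signals (assumed): a 300 Hz tone as "speech", white noise as background.
random.seed(0)
n = 4000
speech = [0.3 * math.sin(2 * math.pi * 300 * i / 8000) for i in range(n)]
noise = [random.uniform(-1.0, 1.0) for _ in range(n)]
primary = [s + 0.8 * v for s, v in zip(speech, noise)]   # speech microphone
cleaned = lms_cancel(primary, noise)                     # noise microphone as reference

half = n // 2   # measure after the filter has had time to adapt
before = mean_square([p - s for p, s in zip(primary[half:], speech[half:])])
after = mean_square([c - s for c, s in zip(cleaned[half:], speech[half:])])
print(f"residual noise power: before={before:.4f} after={after:.4f}")
```

In a real phone the noise reaching the two microphones differs in level and delay, which is exactly what the adaptive weights are meant to track; the single-path mixing used here is the simplest possible case.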

Another problem is outdoor wind noise. Its frequency is usually below 200 Hz, so it can be removed with a high-pass filter; in indoor recording, however, the same filter removes the low-frequency part of the sound. The filter should therefore be optional for dual-purpose microphones, but many audio ADCs have this high-pass filter built in. Mobile phone manufacturers should therefore select a solution that suits their needs.
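The effect of such a filter can be illustrated with a first-order digital high-pass filter, the discrete form of a simple RC network. This is a minimal sketch, not the filter any particular ADC implements; the 8 kHz sample rate and the test tones at 50 Hz and 1 kHz are assumptions chosen for the demonstration.

```python
import math


def high_pass(samples, cutoff_hz=200.0, sample_rate_hz=8000.0):
    """First-order RC-style high-pass filter."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = []
    prev_in = prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


def rms(xs):
    return math.sqrt(sum(x * x for x in xs) / len(xs))


def tone(freq_hz, n=8000, sample_rate_hz=8000.0):
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]


wind = tone(50.0)      # stands in for low-frequency wind noise
voice = tone(1000.0)   # stands in for speech content

# Compare levels over the second half, after the start-up transient has died away.
wind_gain = rms(high_pass(wind)[4000:]) / rms(wind[4000:])
voice_gain = rms(high_pass(voice)[4000:]) / rms(voice[4000:])
print(f"gain at 50 Hz: {wind_gain:.2f}, gain at 1 kHz: {voice_gain:.2f}")
```

Making `cutoff_hz` configurable, or bypassing the filter altogether, corresponds to the optional behaviour the text asks for when the same microphone is used indoors.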

External Earphone

Using an external headset (headset/headphone) with the phone also requires special analog circuitry so that, when the headset is plugged in, the audio output signal is routed to it. A jack with an integrated mechanical switch can meet this requirement, but it is too large and expensive. In addition, the speaker volume setting may not be suitable for the headset. This problem can be solved by providing independent volume control for internal and external audio, which also allows a simpler jack to be used. Whether the external headset has a microphone also needs to be detected, which can be done by sensing the microphone bias current: if no current flows, no microphone is connected. This current sensing should be built into the smart phone's audio codec so that audio input and output can be handled according to the situation.
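The detection and routing logic described here can be sketched as a small decision function. Everything specific in it is hypothetical: the 50 µA threshold, the function names, and the route labels are placeholders, and a real codec would expose this through its own registers and interrupts.

```python
def headset_has_microphone(bias_current_ua: float, threshold_ua: float = 50.0) -> bool:
    """A microphone draws bias current; no current means no microphone.

    The threshold value is a placeholder, not a figure from any datasheet.
    """
    return bias_current_ua > threshold_ua


def select_route(jack_inserted: bool, bias_current_ua: float) -> dict:
    """Choose audio output and input paths from the jack and bias-current state."""
    if not jack_inserted:
        return {"output": "internal_speaker", "input": "internal_mic"}
    if headset_has_microphone(bias_current_ua):
        return {"output": "headset", "input": "headset_mic"}
    # Plain headphones: play through them but keep using the built-in microphone.
    return {"output": "headset", "input": "internal_mic"}


print(select_route(jack_inserted=True, bias_current_ua=300.0))
print(select_route(jack_inserted=True, bias_current_ua=0.0))
print(select_route(jack_inserted=False, bias_current_ua=0.0))
```

Keeping a separate volume setting per output route, as the text suggests, would fit naturally alongside the route returned here.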


Speaker

With the addition of polyphonic ringtones, MP3 playback, and FM broadcast functions, the sound output system of smart phones is also evolving towards stereo speakers. In mobile phone speaker design, the main considerations are the configuration, output power, and power consumption. To support stereo sound, the phone needs two speakers. However, because the phone is so small, the two speakers cannot be placed far apart, so the stereo effect is hard to perceive and special 3D effects are required. To support the hands-free (speakerphone) function, another, larger speaker has to be connected. It is best to provide a dedicated analog output for each speaker, but power management must then be adapted accordingly.
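The text does not describe how the 3D effect is produced. One simple and widely used idea is mid/side widening, in which the difference between the left and right channels is boosted relative to their sum. The sketch below shows only that basic idea as an illustrative stand-in; commercial 3D enhancement is considerably more elaborate, and the `width` parameter is an assumption.

```python
def widen(left, right, width=1.5):
    """Mid/side stereo widening: scale the left-right difference by `width`.

    width = 1.0 leaves the signal unchanged; larger values exaggerate the
    stereo image. Output may exceed the input range and is not clipped here.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2            # content common to both channels
        side = (l - r) / 2 * width   # content that differs between channels
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right


# A sound panned fully left becomes "wider than left" after widening.
print(widen([1.0], [0.0], width=2.0))
# A centred (mono) sound is unaffected.
print(widen([0.5], [0.5], width=2.0))
```

Because widening raises peak levels, a practical implementation would follow it with gain reduction or limiting before the speaker amplifier.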

Since speaker power amplifiers draw a lot of power, it is important to turn them off when they are not in use. The smart phone's audio codec can provide power management functions that switch the individual speaker outputs on and off, avoiding unnecessary power consumption. In addition, the voltage regulators in the system power management solution usually cannot supply the power a speaker needs to reach maximum volume. Therefore, codec chip vendors integrate the speaker amplifier into the chip and power it directly from the battery. Although this does not necessarily reduce power consumption, it does remove the need for an additional voltage regulator.


Ringtones and MIDI

In recent years, mobile phone ringtones have become more and more complex, evolving from simple monophonic tones to polyphonic ringtones and on to arbitrary sounds in stereo WAV and MP3 formats. MIDI has become the standard format for polyphonic ringtones, and many manufacturers have released dedicated low-power MIDI chips for this application. To integrate a MIDI chip into the audio subsystem, an additional analog input is required on the codec.

These additional inputs are also useful for connecting an FM radio IC, and they provide additional functions for multimedia applications. MIDI audio can also be generated by the audio codec itself, but the current market trend is to store ringtone files and play them through the existing hi-fi DAC, so codec chip vendors that lack a MIDI software library are unlikely to pursue this actively.


What is the next step for smart phones? Hi-fi sound is already a necessary system function. As for I2S and AC'97 in the mobile audio system, the competition will continue. Some people prefer the I2S interface, while others prefer AC'97 for its low pin count and the ease of running it at different sampling rates. Most low-power processors for smart phones currently support both, so it seems the two will coexist. However, it is difficult for codec vendors to support both at the same time, because the variable rate audio function of AC'97 needs a different clocking architecture from I2S and also requires many additional digital circuits.

Will smart phones move from stereo to multi-channel surround sound formats, as PCs have (Intel's Azalia)? This seems unlikely in the near future: although multi-channel effects are impressive, the chip cost and power consumption are still too high to be acceptable in the mobile phone world. However, many things can change in the electronics world, and no one can say for certain that today's negative answer will still hold tomorrow.

