HD Speech Technology (WBS) and its implementation in mobile phones and Bluetooth headsets

Source: Internet
Author: User

HD Voice, also known as wideband voice, is an audio technology that delivers high-definition, natural-sounding speech to cellular phones, mobile devices, and wireless headsets. Compared with traditional narrowband telephony, HD Voice substantially improves voice quality and reduces listening effort.

Every network and device along the communication chain must support HD Voice for its benefits to reach the user. By June 2011, 20 cellular networks in 18 countries and 33 leading mobile phone brands supported HD Voice. HD Voice has been introduced in GSM, WCDMA (UMTS), and LTE cellular networks by deploying the Adaptive Multi-Rate Wideband (AMR-WB) speech codec. In addition, Bluetooth wireless headsets are beginning to support HD Voice through the modified subband coding (mSBC) speech codec, combining hands-free calling with high voice quality.

Some of the advantages of HD Voice can also be brought to existing networks. While narrowband networks and devices transition to HD Voice, a speech processing technique called bandwidth extension (BWE) can be applied on the receiving device to approximate HD Voice quality, providing a compromise for devices and links that do not support HD Voice.

From narrowband to HD voice

The bandwidth of traditional telephony systems is limited to roughly 300 Hz to 3.4 kHz of the audio frequency range (Chart 1); this range is usually called narrowband speech. Although today's telephone systems are digital, they inherit the same bandwidth as the traditional analog system. In terms of voice quality, narrowband speech lacks natural fidelity and is often described as thin and muffled. Even so, the intelligibility of complete sentences within the narrowband range is about 99%.

HD Voice is sampled at 16 kHz and has an audio bandwidth of approximately 50 Hz to 7 kHz, so it carries a clearer speech signal than narrowband speech. Although wideband speech does not greatly improve sentence intelligibility, the 3.4 kHz to 7 kHz band that lies outside the narrowband range makes fricatives such as f, s, and th easier to distinguish. Wideband voice sounds more natural and lifelike, and its subjective audio quality is significantly better than that of narrowband speech. Extending the low end from 300 Hz down to 50 Hz reduces the characteristic thin sound of narrowband speech, while extending the high end improves articulation.
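The bandwidth figures above can be checked with a few lines of arithmetic. This is only a minimal sketch: the function names are illustrative, and the 8 kHz narrowband sampling rate is the usual telephony value rather than something stated in this section.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a signal sampled at this rate can represent."""
    return sample_rate_hz / 2


def band_width_hz(low_hz, high_hz):
    """Width of an audio band in Hz."""
    return high_hz - low_hz


NARROWBAND = (300, 3400)  # Hz, assumed to be sampled at 8 kHz
WIDEBAND = (50, 7000)     # Hz, sampled at 16 kHz

nb_width = band_width_hz(*NARROWBAND)  # 3100 Hz
wb_width = band_width_hz(*WIDEBAND)    # 6950 Hz

# Both bands fit below the Nyquist limit of their sampling rate.
nb_fits = NARROWBAND[1] < nyquist_hz(8000)
wb_fits = WIDEBAND[1] < nyquist_hz(16000)
```

With these numbers the wideband range is a little more than twice as wide as the narrowband range.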

In subjective listening tests of speech quality, wideband voice achieves a mean opinion score (MOS) of 4.5, while narrowband voice scores 3.2 (1 is poor and 5 is excellent). The higher quality of wideband voice reduces listening effort and listener fatigue, especially in noisy environments. The mobile network operator Orange publishes audio samples on its website that demonstrate the benefits of HD Voice. A survey conducted by Orange in June 2010 further demonstrated the value of HD Voice to end users:

* 96% of customers are satisfied with HD voice calls;

* 86% of testers said that HD Voice compatibility will be a selection criterion the next time they buy a mobile phone;

* 76% of testers are willing to change their phones to get HD voice features.

In addition, a user trial conducted by Ericsson and T-Mobile in 2006 confirmed the benefits of HD Voice. Of the 150 users sampled, more than 70% said that call quality was better with HD Voice phones and that conversations improved in noisy environments.

Using HD Voice requires every part of the voice communication system to support the wideband frequency range. The key is to deploy the AMR-WB codec in both the cellular network and the handsets. As a wideband speech codec, AMR-WB has about twice the effective audio bandwidth of the narrowband codec AMR-NB. To complete an HD Voice call, the base stations and the handsets must carry AMR-WB encoded speech end to end, with no modification or transcoding of the speech between the two terminals. If an HD Voice connection cannot be established, the system falls back to narrowband AMR-NB coding.
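The fallback rule described above can be sketched as a simple check over the elements of the call path. The function name and the dictionary keys are hypothetical; real networks negotiate the codec through signalling that this sketch does not model.

```python
def select_speech_codec(amr_wb_support):
    """Pick the call codec from per-element AMR-WB support flags.

    amr_wb_support maps a (hypothetical) element name to True/False.
    HD Voice is only possible if every element supports AMR-WB;
    otherwise the call falls back to narrowband AMR-NB.
    """
    if amr_wb_support and all(amr_wb_support.values()):
        return "AMR-WB"
    return "AMR-NB"


hd_call = select_speech_codec(
    {"caller_handset": True, "network": True, "callee_handset": True}
)
fallback_call = select_speech_codec(
    {"caller_handset": True, "network": True, "callee_handset": False}
)
```

A single element without AMR-WB support is enough to force the whole call down to AMR-NB.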

Extended Voice Bandwidth

While HD Voice is being introduced, some parts of the communication system will not support it and will convert the speech to narrowband, which lowers voice quality and increases listening effort. Artificial bandwidth extension (BWE) compensates for the high-frequency and low-frequency content lost during transmission by adding synthesized speech content to the narrowband signal at the terminal end of the communication chain. In this way, BWE brings some of the benefits of HD Voice to narrowband systems and to transitional mixed-bandwidth systems.

The BWE algorithm uses the source-filter model of speech production to estimate and generate speech content in the extended frequency range. In this model, speech is produced by an excitation source (for example, the vocal cords) followed by a filter that models the vocal tract. The BWE algorithm estimates a wideband source-filter model from the narrowband speech and then uses the parameters of that model to estimate the missing wideband frequency content. In practice, BWE is independent of the source coding and routing, so it can coexist with traditional narrowband and mixed-bandwidth telephony networks.

BWE is mainly used in Bluetooth headsets and hands-free devices. At the receiving end of these devices, the narrowband CVSD-encoded speech signal is first decoded and then processed by BWE to produce an extended-bandwidth speech signal for the listener. BWE can also be applied on HD Voice telephony networks, extending the speech signal to the super-wideband (SWB) range with a bandwidth of 14 kHz.
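To make the idea of regenerating missing high-band content concrete, here is a deliberately simple stand-in. It is not the source-filter estimation described in this section: it uses spectral folding (upsampling by zero insertion), a basic technique that mirrors the 0 to 4 kHz spectrum into 4 to 8 kHz. All names are illustrative.

```python
import cmath
import math


def spectral_fold(narrowband):
    """Upsample by 2 with zero insertion.

    Interpreted at twice the sampling rate, the output contains the
    original band plus its mirror image in the new upper band.
    """
    wideband = []
    for sample in narrowband:
        wideband.append(sample)
        wideband.append(0.0)
    return wideband


def tone_magnitude(signal, freq_hz, sample_rate_hz):
    """Normalized magnitude of a single DFT component at freq_hz."""
    acc = sum(
        s * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate_hz)
        for n, s in enumerate(signal)
    )
    return abs(acc) / len(signal)


# 100 ms of a 1 kHz tone at the narrowband rate of 8 kHz
nb = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(800)]
wb = spectral_fold(nb)  # now interpreted at 16 kHz

low_band = tone_magnitude(wb, 1000, 16000)   # original tone
high_band = tone_magnitude(wb, 7000, 16000)  # mirrored copy at 8 kHz - 1 kHz
```

The mirrored copy has the same level as the original, which sounds harsh on real speech. A practical BWE would shape the new band with an estimated spectral envelope, which is where a model such as the one described above comes in.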

HD Voice and sound enhancements

Combining HD Voice with voice enhancement processing, such as noise suppression (NS) and acoustic echo cancellation (AEC), improves speech intelligibility in noisy environments and raises overall call quality. Noise suppression analyzes the noisy speech and removes the noise to make the speech easier to understand. The noise suppression algorithm estimates the noise power spectral density over a large number of frequency bins and then subtracts the estimated noise from the speech. Compared with narrowband processing, wideband noise suppression uses more frequency bins when computing the noise spectrum so that it can also suppress noise in the extended frequency range. Echo cancellation, in turn, removes the echo caused by acoustic coupling between the loudspeaker and the microphone. It works by subtracting a filtered, delayed copy of the loudspeaker signal from the signal picked up by the microphone. For wideband speech, the echo canceller computes its adaptive filter coefficients on the wideband signal.
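The echo cancellation step, subtracting a filtered and delayed copy of the loudspeaker signal from the microphone signal, can be sketched with a normalized LMS (NLMS) adaptive filter. NLMS is a common choice for this job, but the text does not say which algorithm is actually used, so treat the algorithm, the parameter values, and all names below as assumptions.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Remove the far-end echo from the microphone signal with NLMS.

    far_end: samples sent to the loudspeaker
    mic:     samples picked up by the microphone (echo only, in this sketch)
    Returns the echo-cancelled signal and the learned filter weights.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate  # what remains after subtracting the echo
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


# Simulated echo path: the loudspeaker signal arrives 2 samples late at half level.
rng = random.Random(0)
far = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [0.0, 0.0] + [0.5 * s for s in far[:-2]]

residual, weights = nlms_echo_cancel(far, mic)
```

After adaptation the filter weight at delay 2 approaches 0.5 and the residual echo approaches zero. A real canceller also has to cope with near-end speech and background noise, which this sketch leaves out.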

HD Voice in a Bluetooth headset

Because Bluetooth headsets have become a popular accessory for hands-free mobile calls, it is important that they support HD Voice. The Bluetooth mSBC voice codec makes this possible.

The Bluetooth Advanced Audio Distribution Profile (A2DP) specifies subband coding (SBC) as its mandatory audio codec to ensure interoperability between handsets and headsets. SBC is a low-complexity codec with a moderate compression ratio. It supports sampling rates of 16 kHz, 32 kHz, 44.1 kHz, and 48 kHz, which made it the natural choice for Bluetooth HD Voice. For 16 kHz wideband voice, SBC compresses the signal 4:1 to a data rate of 64 kbps. However, when SBC-encoded frames are transmitted over Bluetooth, they may not line up with the underlying Bluetooth packets. The mSBC codec was therefore developed to match SBC frames to Bluetooth packets, and in May 2011 it was defined as the mandatory codec for wideband speech in Bluetooth Hands-Free Profile 1.6.
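The 64 kbps figure follows from the PCM bit rate and the 4:1 compression ratio. A minimal sketch of that arithmetic, assuming 16-bit mono PCM (the bit depth is not stated in this paragraph) and using illustrative function names:

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed PCM bit rate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def compressed_bitrate_kbps(pcm_kbps, compression_ratio):
    """Bit rate after compressing by the given ratio."""
    return pcm_kbps / compression_ratio


pcm_kbps = pcm_bitrate_kbps(16000, 16)            # 256.0 kbps of raw PCM
sbc_kbps = compressed_bitrate_kbps(pcm_kbps, 4)   # 64.0 kbps after 4:1 coding
```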

In terms of codec performance, mSBC is comparable to ITU-T G.722, a wideband speech codec that is often used as a reference when evaluating the quality of new codecs. In general, mSBC achieves a higher objective audio quality score than G.722 on error-free speech signals. mSBC also maintains a higher average voice quality than G.722 across multiple encode/decode passes.

Summary

Compared with traditional narrowband voice transmission, HD Voice provides excellent voice quality and reduces listening effort in noisy environments. It has shown significant advantages in both listening tests and user trials. HD Voice is achieved by deploying the AMR-WB speech codec in cellular networks and handsets, and the mSBC speech codec in Bluetooth headsets. In addition, voice processing algorithms such as noise suppression and echo cancellation in handsets and headsets improve the HD Voice experience further. While network operators and device manufacturers gradually bring HD Voice to the consumer market, bandwidth extension on Bluetooth headsets can deliver some of its benefits to users of narrowband and mixed-bandwidth cellular networks.

Appendix:

The characteristics of narrowband and wideband audio are as follows:

CVSD: PCM input at 8 kHz, 16 bits, 1 channel.

Compression ratio: 16 (encoding done in the controller)

Interpolation ratio: 8

PCM data rate = 16 kB/s = 8 kHz × 16 bits / 8

CVSD data rate = 8 kB/s = 16 kB/s × 8 / 16

Over-the-air data: CVSD

mSBC: PCM input at 16 kHz, 16 bits, 1 channel.

Compression ratio: 4 (encoding done in the host: 240 -> 60 bytes)

PCM data rate = 32 kB/s

mSBC data rate = 8 kB/s = 32 kB/s / 4

Over-the-air data: transparent data (mSBC)
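The appendix figures can be reproduced with integer arithmetic. This sketch reads the rates as kilobytes per second and assumes 16-bit mono PCM and a 240-byte PCM block per 60-byte mSBC frame, which is how the numbers above fit together; the function and variable names are illustrative.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample=16, channels=1):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


# CVSD: 8 kHz PCM is interpolated by 8 to 64 kHz, then coded at 1 bit per sample.
cvsd_pcm_rate = pcm_bytes_per_second(8000)   # 16000 B/s
cvsd_air_rate = 8000 * 8 * 1 // 8            # 8000 B/s
cvsd_ratio = 16 * 8 // 8                     # 16 bits in, 8 one-bit samples out

# mSBC: every 240-byte PCM block is encoded into a 60-byte frame.
msbc_pcm_rate = pcm_bytes_per_second(16000)        # 32000 B/s
msbc_air_rate = msbc_pcm_rate * 60 // 240          # 8000 B/s
msbc_frame_ms = 240 * 1000 / msbc_pcm_rate         # 7.5 ms of audio per frame
```

Both codecs end up at the same 8 kB/s over the air, so mSBC carries twice the audio bandwidth in the same link capacity that CVSD uses for narrowband speech.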

Reference Documentation:

1 http://blog.chinaunix.net/xmlrpc.php?r=blog/article&uid=21411227&id=5748646
