Optimizing the architecture of a speech recognition chip with a C-language design flow

Last Update:2017-02-27 Source: Internet

Author: User


Demand for voice-control applications is predicted to grow dramatically, driven by the telephone market, where handsets will increasingly be controlled by voice commands. Other application areas include toys, handheld devices such as calculators, voice-controlled security systems, home appliances and in-vehicle equipment (stereo, windows, climate control, headlights and navigation). This article describes the design of a speech recognition chip from the standpoint of reusability and chip-area optimization, which should help in developing a family of further speech recognition chips.

Singapore Columns Company was an early entrant in portable voice-control products. One of its products is a "voice-controlled European currency converter" that converts between the euro and other European currencies. The design requirements for the euro converter were: 1. Low power consumption, with a battery life of at least 1 year; 2. Low cost, with a retail price of no more than 9 U.S. dollars; 3. High flexibility, with accurate recognition and synthesis of speaker-dependent speech in multiple languages; 4. A reusable speech-control core.

This article covers the whole process of developing the euro converter ASIC with Frontier Design's design tools. Implementing complex DSP algorithms in an ASIC is usually very demanding, but Frontier's architecture synthesis tool, A|RT Designer, can quickly produce optimized RTL descriptions and allows free selection among alternative architectures to optimize the design for the application.

With a C-language design flow, new features can be added and the hardware optimized during the architecture design phase, which can reduce chip area by 50%. Because hardware can be prototyped from C more quickly, the performance of the design can be pushed further to meet the product specifications the user requires.

Algorithm Research

The effectiveness of the euro converter depends to a large extent on how well it compares voice commands against its stored database and executes them. Developing an algorithm that meets the requirements of the final product is critical to the success of the design: nobody wants a voice-control device that fails to recognize commands consistently, and users expect a recognition accuracy above 98%. The challenges therefore include detecting and removing background noise, distinguishing genuine command words from other sounds (breathing, low-level static and microphone noise), determining the start and end of a command word, and comparing the input against the stored "voice spectrum" database to recognize the command word (Figure 1).
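
As an illustration of the start-and-end problem described above, the following is a minimal sketch of a level-threshold boundary detector. It is not the algorithm used in the euro converter; the frame length `FRAME_LEN`, the function names and the single fixed threshold are all assumptions made for the example. A real detector would adapt the threshold to the estimated noise level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame length: 80 samples = 10 ms at an 8 kHz sample rate. */
#define FRAME_LEN 80

/* Mean absolute amplitude of one frame of 16-bit samples. */
static uint32_t frame_level(const int16_t *x, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t s = x[i];
        sum += (uint32_t)(s < 0 ? -s : s);
    }
    return sum / (uint32_t)n;
}

/*
 * Find the first and last frame whose level exceeds `threshold`.
 * Returns 1 and writes the frame indices if any frame qualifies,
 * otherwise returns 0 and leaves *first and *last untouched.
 */
int find_word_bounds(const int16_t *x, size_t n_samples, uint32_t threshold,
                     size_t *first, size_t *last)
{
    assert(x != NULL && first != NULL && last != NULL);
    size_t n_frames = n_samples / FRAME_LEN;  /* ignore a trailing partial frame */
    int found = 0;
    for (size_t f = 0; f < n_frames; f++) {
        if (frame_level(x + f * FRAME_LEN, FRAME_LEN) > threshold) {
            if (!found) {
                *first = f;
                found = 1;
            }
            *last = f;
        }
    }
    return found;
}
```

A single threshold like this is easily fooled by breathing or clicks, which is why the article lists separate noise estimation and boundary detection steps.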

The following advanced, computationally intensive DSP algorithms are suited to solving these problems:
1. The Mel-frequency cepstral coefficient (MFCC) algorithm, which consists of a fast Fourier transform (FFT) power spectrum, Mel scaling and a log2 step;
2. The inverse discrete cosine transform (IDCT);
3. A continuous noise-level estimation routine, in which background sound and speech noise are continuously identified and estimated using multiple estimation and selection algorithms;
4. Coarse and fine command-word boundary detection algorithms, which analyze sound levels in detail during and around the period in which the command word is present;
5. A dynamic time warping (DTW) algorithm, which compares sequences of vectors of unequal length while allowing the time axis of one sequence to be continuously stretched or compressed (warped) against the other.
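
To make the last item concrete, here is a minimal floating-point sketch of the textbook dynamic time warping recurrence. It is an illustration under stated assumptions, not the article's implementation: the feature dimension `DIM`, the squared Euclidean frame distance, the unconstrained warping path and the function names are all chosen for the example.

```c
#include <assert.h>
#include <float.h>
#include <stdlib.h>

/* Hypothetical feature dimension (e.g. number of cepstral coefficients). */
#define DIM 2

/* Squared Euclidean distance between two DIM-element feature vectors. */
static double frame_dist(const double *a, const double *b)
{
    double d = 0.0;
    for (int k = 0; k < DIM; k++) {
        double diff = a[k] - b[k];
        d += diff * diff;
    }
    return d;
}

static double min3(double a, double b, double c)
{
    double m = a < b ? a : b;
    return m < c ? m : c;
}

/*
 * Accumulated DTW cost between x (n frames) and y (m frames).
 * Both sequences are stored flat, frame after frame, DIM values per frame.
 * Returns a negative value if memory allocation fails.
 */
double dtw_cost(const double *x, int n, const double *y, int m)
{
    assert(x != NULL && y != NULL && n > 0 && m > 0);
    int w = m + 1;  /* row width of the (n+1) x (m+1) cost table */
    double *D = malloc((size_t)(n + 1) * (size_t)w * sizeof *D);
    if (D == NULL)
        return -1.0;

    /* Row 0 and column 0 form an unreachable border, except the origin. */
    for (int i = 0; i <= n; i++)
        for (int j = 0; j <= m; j++)
            D[i * w + j] = DBL_MAX;
    D[0] = 0.0;

    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= m; j++) {
            /* Cheapest way to arrive: from above, from the left, or diagonally. */
            double best = min3(D[(i - 1) * w + j],
                               D[i * w + (j - 1)],
                               D[(i - 1) * w + (j - 1)]);
            D[i * w + j] = frame_dist(x + (i - 1) * DIM, y + (j - 1) * DIM) + best;
        }
    }

    double total = D[n * w + m];
    free(D);
    return total;
}
```

A recognizer built this way would compute `dtw_cost` between the spoken word and each stored template and pick the template with the lowest cost. On a chip, the full table would normally be reduced to two rows to save memory.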

To tune and optimize the parameters, the floating-point C code must compile and simulate fast enough to verify the performance of the algorithm. The C code must also be able to run on a conventional PC, so that the speech recognition and synthesis algorithms can be tested in a real-world environment. The final speech recognition algorithm was tested on a 450 MHz Pentium machine, and achieved 99% recognition accuracy when tested with the company's in-house voice recordings.

Converting the floating-point algorithm to fixed point

Implementing the algorithm on a chip requires converting it from floating point to fixed point, while preserving dynamic range and precision and preventing values from exceeding the dynamic range. If the range of an ordinary fixed-point operand is badly chosen, the operand may wrap around (for example, max + 1 becomes min), or be clipped severely, and produce errors. Fixed-point precision is just as important, especially in iterative signal processing. When precision is insufficient, iterative algorithms propagate and accumulate errors, and the final signal may gradually degenerate into white noise, which is a disastrous failure for a speech-control product.
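
The difference between wrap-around and clipping can be shown with a small sketch in plain C. The 16-bit word length and the function names are assumptions chosen for illustration; they are not taken from the euro converter design.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit add that wraps around on overflow, as a plain hardware adder would. */
int16_t add16_wrap(int16_t a, int16_t b)
{
    /* Sum in unsigned arithmetic, where wrap-around is well defined in C. */
    uint16_t u = (uint16_t)((uint16_t)a + (uint16_t)b);
    /* Map the bit pattern back to a signed value without relying on
       implementation-defined conversion. */
    return (u >= 0x8000u) ? (int16_t)((int32_t)u - 0x10000) : (int16_t)u;
}

/* 16-bit add that saturates (clips) to the representable range. */
int16_t add16_sat(int16_t a, int16_t b)
{
    int32_t s = (int32_t)a + (int32_t)b;  /* exact in 32 bits */
    if (s > INT16_MAX) return INT16_MAX;
    if (s < INT16_MIN) return INT16_MIN;
    return (int16_t)s;
}

/* The case from the text: max + 1 wraps to min, but saturates to max. */
void overflow_demo(void)
{
    assert(add16_wrap(INT16_MAX, 1) == INT16_MIN);
    assert(add16_sat(INT16_MAX, 1) == INT16_MAX);
}
```

Saturation still distorts the signal, but the error is bounded; a wrapped value flips sign and is off by nearly the full range, which is why choosing the operand range matters.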

Frontier's tools include a C++ class library called A|RT Library, which is used to analyze the fixed-point behavior of C code. The class library supports multiple fixed-point data types, provides bit-true modeling of several overflow behaviors (such as saturation and wrap-around), and offers several quantization models, such as truncation and rounding. The original 32-bit floating-point speech recognition algorithm takes input data sampled at 8 kHz, its typical signal word length is 32 bits, it requires several thousand bytes of memory, and the output of a typical voice user interface is measured in bytes per second.
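
The two quantization models named above can be sketched in plain C as follows. This does not use or imitate the A|RT Library API, which is not shown in this article; the Q31-to-Q15 word lengths and all function names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* floor(v / 65536), written without shifting a negative value,
   since right-shifting negative integers is implementation-defined in C. */
static int32_t floor_div_65536(int64_t v)
{
    int64_t q = v / 65536;  /* C division truncates toward zero */
    if (v % 65536 < 0)
        q -= 1;             /* adjust so the result rounds toward -infinity */
    return (int32_t)q;
}

/* Truncation: discard the 16 low-order bits of a Q31 value. */
int16_t q31_to_q15_trunc(int32_t x)
{
    return (int16_t)floor_div_65536(x);
}

/* Rounding: add half of a Q15 LSB before discarding the low bits,
   saturating if the round-up overflows the 16-bit range. */
int16_t q31_to_q15_round(int32_t x)
{
    int32_t r = floor_div_65536((int64_t)x + 32768);
    return r > INT16_MAX ? INT16_MAX : (int16_t)r;
}

/* A value of 1.5 Q15 LSBs truncates down to 1 but rounds up to 2. */
void quantize_demo(void)
{
    assert(q31_to_q15_trunc(0x00018000) == 1);
    assert(q31_to_q15_round(0x00018000) == 2);
}
```

Truncation always errs in the same direction, so its error accumulates as a bias in iterative computations; rounding costs an extra adder but keeps the average error near zero.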

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
