Optimizing speech recognition chip architecture with a C-language-based design flow

Source: Internet
Author: User

Demand for voice-control applications is predicted to grow dramatically, driven by the telephone market, where more and more handsets will be controlled by voice commands. Other application areas include toys, handheld devices such as calculators, voice-controlled security systems, home appliances, and in-vehicle equipment (stereo, windows, climate controls, headlights, and navigation). This paper describes the design of a speech recognition chip from the perspective of reusability and optimized chip area, an approach that also helps in developing a family of further speech recognition chips.

Singapore Columns Company was an early entrant in portable voice-control products. One of its products is a "voice-controlled European currency converter" that converts between the euro and other European currencies. The design requirements for the Euro converter include: 1. Low power consumption, with a battery life of at least 1 year; 2. Low cost, with a retail price of no more than 9 U.S. dollars; 3. High flexibility, so that it can accurately recognize and synthesize speaker-dependent speech in multiple languages; 4. Reusability of the entire voice-control core.

This paper describes the whole process of developing the Euro converter ASIC using Frontier Design's design tools. Implementing complex DSP algorithms in an ASIC is usually very demanding, but Frontier's architectural synthesis tool, A|RT Designer, can quickly generate optimized RTL descriptions and also allows free selection among alternative architectures to optimize the design for the application.

With a C-language design flow, new features can be added and the hardware optimized during the architectural design phase, which can reduce die size by 50%. By speeding up the move from the C-language prototype to hardware, the performance of the design can be extended further to meet the product specifications that users require.

Algorithm Research

The effectiveness of the Euro converter depends to a certain extent on how well it compares voice commands against the stored database and on its ability to execute those commands. Developing an algorithm that meets the requirements of the final product is critical to the success of the design: no one wants a voice-control device that fails to recognize commands consistently, so the algorithm needs to achieve a recognition accuracy above 98%. The challenges therefore include detecting and removing background noise, distinguishing genuine command words from other sounds (breathing, faint static, and microphone noise), determining the start and end of a command word, and comparing the input with the stored "voice spectrum" database for subsequent command-word recognition (Figure 1).
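To make the "start and end of a command word" step concrete, here is a minimal C sketch of energy-based boundary detection. It is not the converter's actual algorithm: the function names, the assumption that the first few frames contain only background noise, and the fixed threshold ratio are all illustrative choices.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mean energy of one frame of 16-bit samples. */
double frame_energy(const short *samples, size_t len)
{
    double sum = 0.0;
    for (size_t i = 0; i < len; i++)
        sum += (double)samples[i] * (double)samples[i];
    return len ? sum / (double)len : 0.0;
}

/* Hypothetical helper: locate a command word in a sequence of frame energies.
 * Assumption: the first noise_frames frames hold background noise only, so
 * their mean energy serves as the noise floor. A frame counts as speech when
 * its energy exceeds ratio * noise floor.
 * Returns 1 and writes the first and last speech frame indices to *start and
 * *end, or returns 0 if no frame exceeds the threshold. */
int find_word_bounds(const double *energy, size_t n_frames,
                     size_t noise_frames, double ratio,
                     size_t *start, size_t *end)
{
    assert(noise_frames > 0 && noise_frames <= n_frames);

    double noise = 0.0;
    for (size_t i = 0; i < noise_frames; i++)
        noise += energy[i];
    noise /= (double)noise_frames;

    double threshold = noise * ratio;
    int found = 0;
    for (size_t i = noise_frames; i < n_frames; i++) {
        if (energy[i] > threshold) {
            if (!found) {
                *start = i;
                found = 1;
            }
            *end = i;   /* keep extending to the last loud frame */
        }
    }
    return found;
}
```

A real detector would keep updating the noise estimate during operation and would refine the coarse boundaries with a second pass; this sketch only shows the basic thresholding idea.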

The following advanced, computationally intensive DSP algorithms are suitable for solving these problems: 1. The mel-frequency cepstral coefficient (MFCC) algorithm, which consists of a fast Fourier transform (FFT) power spectrum, mel scaling, and a logarithm (log2) step; 2. The inverse discrete cosine transform (IDCT); 3. A continuous noise-level estimation routine, in which background sound and speech noise are continuously identified and estimated using multiple estimation and selection algorithms; 4. Coarse and fine command-word boundary detection algorithms, which analyze sound levels in detail during and near the duration of the command word; 5. A dynamic time warping (DTW) algorithm, which compares sequences of vectors of unequal length by continuously stretching or compressing (warping) their time axes.
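The dynamic time warping step can be sketched in a few lines of floating-point C. This is a textbook formulation under simplifying assumptions, not the code used in the Euro converter: each frame is a single scalar rather than an MFCC vector, the local distance is an absolute difference, and the names dtw_distance, local_dist, and min3 are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <stdlib.h>

/* Local distance between two frames. Real MFCC frames would be vectors;
 * scalars keep the sketch short. */
double local_dist(double a, double b)
{
    return a > b ? a - b : b - a;
}

double min3(double a, double b, double c)
{
    double m = a < b ? a : b;
    return m < c ? m : c;
}

/* Accumulated DTW cost between x[0..n-1] and y[0..m-1].
 * Uses two rolling rows instead of a full n-by-m matrix.
 * Returns -1.0 if memory allocation fails. */
double dtw_distance(const double *x, size_t n, const double *y, size_t m)
{
    assert(n > 0 && m > 0);

    double *prev = malloc((m + 1) * sizeof *prev);
    double *curr = malloc((m + 1) * sizeof *curr);
    if (!prev || !curr) {
        free(prev);
        free(curr);
        return -1.0;
    }

    /* Row 0: only the origin is reachable at zero cost. */
    prev[0] = 0.0;
    for (size_t j = 1; j <= m; j++)
        prev[j] = DBL_MAX;

    for (size_t i = 1; i <= n; i++) {
        curr[0] = DBL_MAX;
        for (size_t j = 1; j <= m; j++) {
            /* Best predecessor: step in x only, step in y only, or both. */
            double best = min3(prev[j], curr[j - 1], prev[j - 1]);
            curr[j] = local_dist(x[i - 1], y[j - 1]) + best;
        }
        double *tmp = prev;
        prev = curr;
        curr = tmp;
    }

    double result = prev[m];
    free(prev);
    free(curr);
    return result;
}
```

For example, comparing {1, 2, 3} with {1, 2, 2, 3} yields a cost of 0, because the repeated 2 is absorbed by warping. A recognizer would compute this cost against every stored template and pick the smallest.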

To tune and optimize the parameters, the floating-point C code must compile and simulate fast enough to verify the performance of the algorithm. The C code must also run on a conventional PC so that the speech recognition and synthesis algorithms can be tested in a real-world environment. The final speech recognition algorithm was tested on a 450 MHz Pentium machine and, when tested with the company's internal voice recorder, achieved 99% recognition accuracy.

Converting the floating-point algorithm to fixed point

The chip implementation requires converting the floating-point algorithm to a fixed-point algorithm while preserving dynamic range and precision and preventing values from exceeding the dynamic range limits. A poorly chosen range for ordinary fixed-point operands can cause an operand to wrap around (for example, max + 1 becomes min), which leads to severe clipping and errors. Fixed-point precision is equally important, especially in repetitive signal processing operations. When precision is insufficient, a repetitive signal processing algorithm propagates and accumulates errors, and the final signal may gradually degenerate into white noise, which is a disastrous failure for a voice-control product.
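The difference between wrap-around and saturation can be illustrated with 16-bit addition. The sketch below is plain C written for illustration; the function names are hypothetical and it does not use any Frontier tool. The sum is formed in 32 bits so that the C code itself never overflows, and the wrap is modeled explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit add that wraps the way two's-complement hardware does:
 * INT16_MAX + 1 folds back to INT16_MIN. */
int16_t add_wrap(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX)
        sum -= 65536;
    else if (sum < INT16_MIN)
        sum += 65536;
    return (int16_t)sum;
}

/* 16-bit add that saturates: out-of-range results are clamped to the
 * nearest representable value instead of changing sign. */
int16_t add_sat(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX)
        return INT16_MAX;
    if (sum < INT16_MIN)
        return INT16_MIN;
    return (int16_t)sum;
}
```

With these definitions, add_wrap(32767, 1) gives -32768, a full-scale sign flip, while add_sat(32767, 1) stays at 32767. For in-range operands both functions return the same result.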

Frontier's tools include a C++ class library called A|RT Library, which is used to analyze the fixed-point performance of C code. The class library supports multiple fixed-point data types, provides bit-true modeling of several overflow behaviors (such as saturation and wrap-around), and provides several quantization models such as truncation and rounding. The original 32-bit floating-point speech recognition algorithm takes 8 kHz input data. Its typical signal word width is 32 bits, it requires thousands of bytes of memory, and the output of a typical voice user interface is measured in a few bytes per second.
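The two quantization models mentioned above can be compared with a small plain-C sketch. It does not use A|RT Library or reproduce its API; the Q15 format (1 sign bit, 15 fractional bits), the function names, and the restriction of inputs to [-2, 2] are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Largest integer not greater than v (valid for |v| well inside int32 range). */
int32_t floor_to_int(double v)
{
    int32_t t = (int32_t)v;          /* C conversion truncates toward zero */
    return ((double)t > v) ? t - 1 : t;
}

/* Clamp a 32-bit value into the 16-bit range (saturation on overflow). */
int16_t clamp16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* Quantize x to Q15 by truncation: the low-order bits are simply dropped,
 * which in two's complement always rounds toward minus infinity. */
int16_t q15_truncate(double x)
{
    assert(x >= -2.0 && x <= 2.0);
    return clamp16(floor_to_int(x * 32768.0));
}

/* Quantize x to Q15 by rounding to nearest: add half an LSB, then drop
 * the low-order bits. */
int16_t q15_round(double x)
{
    assert(x >= -2.0 && x <= 2.0);
    return clamp16(floor_to_int(x * 32768.0 + 0.5));
}

/* Convert a Q15 value back to floating point to inspect the error. */
double q15_to_double(int16_t q)
{
    return (double)q / 32768.0;
}
```

Truncation always errs in the same direction, so its error builds up as a bias over many repeated operations, whereas rounding errors tend to cancel. For instance, -0.00004 truncates to -2 but rounds to -1 in Q15.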
