Introduction to PC-side Speech Recognition

Source: Internet
Author: User
Speech recognition:

Speech recognition is the technology that lets a machine convert a voice signal into the corresponding text or command through a process of recognition and understanding. It rests on three components: feature extraction, pattern-matching criteria, and model training. Speech recognition is not the same as voiceprint recognition: the latter tries to identify or verify the speaker, not the words contained in the speech. For more on the development of speech recognition technology, see http://baike.baidu.com/view/652891.htm.
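To make the three components above concrete, here is a minimal sketch in Python. It is not how any product in this article works. It assumes a toy feature (log energy per frame), dynamic time warping (DTW) as the pattern-matching criterion, and "training" that merely stores one template per word. All function names are hypothetical.

```python
import math


def frame_energies(samples, frame_size=4):
    """Toy feature extraction: one log-energy value per non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energy = sum(x * x for x in frame) / frame_size
        feats.append(math.log(energy + 1e-9))  # small offset avoids log(0)
    return feats


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def train(examples):
    """Simplified 'model training': store one feature template per label."""
    return {label: frame_energies(samples) for label, samples in examples.items()}


def recognize(templates, samples):
    """Pattern matching: return the label whose template is closest under DTW."""
    feats = frame_energies(samples)
    return min(templates, key=lambda label: dtw_distance(feats, templates[label]))


# Synthetic "utterances" (assumed data): loud-then-quiet vs. quiet-then-loud.
templates = train({
    "open": [0.9, -0.9] * 8 + [0.1, -0.1] * 8,
    "close": [0.1, -0.1] * 8 + [0.9, -0.9] * 8,
})
# A slower, slightly quieter variant of the first pattern.
print(recognize(templates, [0.8, -0.8] * 12 + [0.12, -0.12] * 10))
```

DTW is used here only because it tolerates differences in speaking speed with very little code. Real recognizers use far richer features and statistical models.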

The following describes PC-side speech recognition products. There are two types of products: cloud speech recognition and offline speech recognition.

Cloud speech recognition:

1. HTML5 voice input tag. The tag supports voice input directly in the browser and anticipates a future speech recognition standard. Because it relies on Google's speech library, the recognition rate is low. It also needs browser support: at present Chrome 11 and later support it fairly well, while IE and Firefox do not support it. Example: WebQQ (http://web.qq.com/) uses this speech recognition when opened in Chrome.

2. Google's voice input method for the PC. It presumably works like Chrome and calls Google's cloud speech database. Google handles English better, and because Google's services are blocked by the firewall in China, the recognition results are only average.

3. iFlytek (xunfei) Voice Cloud. It offers Chinese speech synthesis and recognition and is currently the more mature Chinese speech recognition service in China. It has mainly targeted mobile devices and has recently started moving onto the PC, with Java, Windows, and Linux versions plus a semi-official Flash version. The first three are released directly on the official website. The Flash version was, as of January 7, released on the official forum through a Xunlei (Thunder) network disk link that expires, so it may be re-uploaded frequently. Web development requires either Flash support or a custom browser plug-in.

(1) I studied the Flash control. It still has many problems for now, including various call errors. For example, xunfei + Flash reports "Speech Recognition Error: Error #10202" and "Error #120106", a communication sandbox security error that appears when connecting to the socket after not clicking for a long time. I posted a help request on the official forum and have not received a reply yet. The demo provided on the official website does not work properly either: the microphone indicator moves while you speak, but no result is displayed after recognition and no error is reported. It should improve over time. iFlytek Flash version test page: http://open.voicecloud.cn/iat.php

 

(2) The ActiveX plug-in, built on the Windows package, only supports IE. I looked into it, but I have rarely worked with ActiveX and did not really understand it, so I later ran the Swing GUI example from the Java version instead and used that to test the recognition quality. I have not yet seen a Chrome or Firefox plug-in.

Offline speech recognition:

The main offline product is IBM ViaVoice, which has the best overall reviews. I could only download version 9.0 for Windows XP (you can search for and download it on Sina iAsk). Version 10 is said to support Windows 7, but no cracked copy is in circulation, so it has to be bought online.

I tried version 9 on XP. My impression is that IBM built it on exactly the "feature extraction, pattern matching criteria, and model training" approach: after installation you keep training the software through voice input and settings so that it recognizes the operator's voice more accurately, and you can apply custom and other advanced settings. Unlike the cloud-based products, ViaVoice provides not only speech dictation (converting speech into Chinese characters) but also voice control of the computer. For example, saying "surfing the Internet" opens the IE browser, and similar commands open programs and folders. The recognition rate is fairly high even without special training. If you tune it to your own usage and habits, the recognition rate is said to improve further: the software learns, and continued use makes its internal model more accurate and easier to work with.

The disadvantages are that setup is complex and the product is geared toward personal computers, so it is not suitable for shared machines such as library computers. Its non-voice prompts may be hard to understand, and there are many advanced functions to configure. In addition, the vocabulary in version 9.0 is dated: it suits traditional words but does not recognize newer words well. Version 10 may ship an updated vocabulary.
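The difference between dictation and voice commands described above can be sketched as a small dispatcher. This is only an illustration under assumptions, not ViaVoice's actual design: the class name, the phrase table, and the placeholder action are all hypothetical, and the recognized text is assumed to arrive as a plain string.

```python
def normalize(text):
    """Lower-case and collapse whitespace so small variations still match."""
    return " ".join(text.lower().split())


class CommandDispatcher:
    """Hypothetical router: known phrases run an action, the rest is dictation."""

    def __init__(self):
        self._commands = {}

    def register(self, phrase, action):
        """Associate a spoken phrase with a zero-argument callable."""
        self._commands[normalize(phrase)] = action

    def handle(self, recognized_text):
        """Return ("command", result) for a known phrase, else ("dictation", text)."""
        action = self._commands.get(normalize(recognized_text))
        if action is None:
            return ("dictation", recognized_text)
        return ("command", action())


dispatcher = CommandDispatcher()
# Placeholder action; a real one might call webbrowser.open(...) from the
# standard library instead of returning a string.
dispatcher.register("surfing the Internet", lambda: "launch browser")

print(dispatcher.handle("Surfing the internet"))
print(dispatcher.handle("today's meeting notes"))
```

Passing actions in as callables keeps the phrase table separate from what each command does, so commands for opening programs or folders could be registered the same way.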

In addition, xunfei is said to have a paid offline speech recognition product. I could not try it because I do not have the software. Following a reply on the official forum, I emailed msp_support@iflytek.com to ask about it, but got no reply.

Conclusion: Current speech recognition software still cannot reach precise recognition, so I feel it is not yet ready for commercial use as is. It would need custom development, or further study of how to apply it to your actual needs. The above is only the result of my own brief study and is for reference only.

 
