Hollywood movies often show off powerful speech recognition systems, and every time I see one I feel a mix of envy and frustration, because we can't afford anything like it. Do you want your computer to listen to you? When you are tired, you just say "I am tired" and the computer plays soft music to help you relax. Or maybe you want to catch the latest NBA scores .... Everything is that comfortable.
Don't be discouraged, though. We can build one ourselves.
What is speech recognition? I suspect many readers react to it in one of two ways: curiosity or avoidance.
In fact, you do not need much programming skill, let alone any knowledge of natural language processing. Although this article implements voice control, it is not as complicated as you might think. If we treat speech recognition as an already-implemented interface, the rest of the logic is nothing more than simple IF-ELSE branches.
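To make the IF-ELSE idea concrete, here is a minimal sketch that assumes a recognizer has already turned speech into a text string. The function name `react` and the phrases are hypothetical, chosen only for illustration; no speech library is involved.

```python
def react(phrase):
    """Map recognized text to an action name using plain if/elif/else."""
    if phrase == "I am tired":
        return "play music"
    elif phrase == "latest NBA score":
        return "open sports page"
    else:
        return "ignore"


# Pretend these strings came back from a speech recognizer.
for heard in ["I am tired", "hello"]:
    print("%s -> %s" % (heard, react(heard)))
```

Everything specific to speech stays outside this function, which is why the rest of the article can focus on wiring phrases to actions.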
How Voice Control Works
Voice control consists of two parts: speech recognition and speech synthesis (reading text aloud).
Both parts rely on natural language processing and a series of extremely complex algorithms. This article skips all of that. If you are only interested in the algorithms and the linguistics, you can stop reading here, because nothing below covers them.
As early as the 1990s, IBM launched ViaVoice, an extremely powerful speech recognition system, and an endless stream of related products has followed since. We will use SAPI to implement the speech module.
What is SAPI?
SAPI is the Microsoft Speech API, the speech interface provided by Microsoft. Attentive users will have noticed that speech recognition has been built into Windows since Windows XP, yet it is rarely used. It offers few customization options, and the built-in voice commands, though they seem quite cool, are the only ones available. The task of this article is to use SAPI for personalized speech recognition.
In the preparation phase, you must install at least the following tools:
Python2.7 http://www.python.org/
We strongly recommend Python 2.7. So far, Python 2.7 has the widest tool and application support in the Python series, and it is relatively stable.
Win32Com http://starship.python.net/~skippy/win32/Downloads.html
The Python Win32 extensions let Python call Win32 COM interfaces, which makes Python extremely powerful on Windows.
speech.py http://pypi.python.org/pypi/speech/
This is a very compact wrapper module. It is optional, but we do not recommend reinventing the wheel. It officially supports only Python 2.6, but do not be discouraged: Python 2.6 code is compatible with Python 2.7, so it works there as well.
Install the tools in the order listed above.
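For a sense of what such a wrapper does under the hood, the sketch below drives SAPI's text-to-speech voice directly through COM. It assumes Windows with the Win32 extensions installed. `SAPI.SpVoice` and its `Speak` method belong to SAPI's automation interface, while the function name `say_with_sapi` and the `dispatch` parameter are hypothetical, added so the function can be tried without a real voice.

```python
def say_with_sapi(text, dispatch=None):
    """Speak `text` aloud through SAPI's SpVoice COM object."""
    if dispatch is None:
        # Imported lazily: win32com exists only on Windows with the
        # Win32 extensions installed.
        from win32com.client import Dispatch as dispatch
    voice = dispatch("SAPI.SpVoice")  # SAPI text-to-speech automation object
    voice.Speak(text)
    return voice
```

On a Windows machine, calling `say_with_sapi("Hello")` should read the phrase aloud. Passing a stand-in for `dispatch` makes it possible to check the call sequence on a machine without SAPI.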
Development Phase
After installing the tools above, you can start developing.
First, run a simple test of the environment:
The code is as follows:
import speech

while True:
    phrase = speech.input()             # block until something is recognized
    speech.say("You said %s" % phrase)  # read the recognized text back
    if phrase == "turn off":
        break
The code above starts the speech recognizer and repeats back whatever you say. When you say "turn off", the loop exits and recognition stops.
If the test passes, we can start extending it.
1. Define the phrase library
The code is as follows:
import os
import sys
import webbrowser
import speech

# Phrases the program should react to
closeMainSystem = "disable human-computer interaction"
openEclipse = "I want to write a program"
listenMusic = "I'm tired"
blog = "View blog"
php = "php"
java = "JAVA"
2. Define the logic for each phrase
The code is as follows:
def callback(phrase, listener):
    # Called by the speech module each time a phrase is recognized
    print(": %s" % phrase)
    if phrase == closeMainSystem:
        speech.say("Goodbye. Human-computer interaction is about to be disabled. Thank you for using it.")
        listener.stoplistening()
        sys.exit()
    elif phrase == openEclipse:
        speech.say("Do you want to write PYTHON or JAVA programs?")
        speech.listenforanything(callback)
    elif phrase == listenMusic:
        speech.say("will be started soon")
        webbrowser.open_new("http://douban.fm/")
    elif phrase == blog:
        speech.say("About to enter dreamforce.me")
        webbrowser.open_new("http://dreamforce.me/")
    elif phrase == php:
        speech.say("start PHP writer")
        # Raw string so the backslashes in the Windows path stay literal
        os.popen(r"E:\IDE\php_eclipse\eclipse\eclipse.exe")
    elif phrase == java:
        speech.say("start JAVA writer")
        os.popen(r"E:\IDE\php_eclipse\eclipse\eclipse.exe")
Here, os.popen starts the program asynchronously: it neither opens a separate shell window nor blocks the current process.
speech.say() calls SAPI to read its argument aloud.
webbrowser.open_new() opens a web page.
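As the number of commands grows, the if/elif chain can be swapped for a lookup table. This is an alternative design, not the approach used above. The names `make_dispatcher`, `actions`, and `log` are hypothetical, and the actions here only record what they would do instead of calling speech, os, or webbrowser.

```python
def make_dispatcher(actions, default=None):
    """Return a callback that looks up the phrase in `actions` and runs it."""
    def callback(phrase, listener=None):
        action = actions.get(phrase, default)
        if action is None:
            return None  # unknown phrase: do nothing
        return action()
    return callback


log = []
actions = {
    "I'm tired": lambda: log.append("open music site"),
    "View blog": lambda: log.append("open blog"),
}
callback = make_dispatcher(actions)
callback("View blog")
callback("something else")  # not in the table, so it is ignored
print(log)
```

The returned function keeps the (phrase, listener) signature used in this article, so adding a command means adding one dictionary entry rather than another branch.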
3. Set up the main program
The code is as follows:
listener = speech.listenforanything(callback)
while listener.islistening():
    text = raw_input()  # Python 2: read a typed line as a plain string
    if text == "do not speak":
        listener.stoplistening()
        sys.exit()
    else:
        speech.say(text)
This section is the body of the program. The main idea is to start voice listening while also accepting input typed in the terminal. If you lose your voice, you can simply type instead. Haha~