With a computer in front of you, have you ever wished it would simply obey you? When you are tired, you just say "I am tired" and the computer plays some soft, elegant music to help you relax. Or, in the middle of a busy day, you have it read out the latest NBA scores. Everything is that easy.
Don't lose heart: we really can build this.
Build speech recognition? I suspect readers fall into two camps: those who are curious, and those who want to stay a thousand miles away.
In fact, you do not need much programming skill, and you do not need to understand natural language processing at all. This article implements voice control, but it is not as complex as you might think. If you treat speech recognition as an interface that has already been implemented, the rest of the logic is just simple constructs such as if-else.
The principle of voice control
Voice control has two parts: speech recognition and speech synthesis (reading text aloud).
Both parts would normally require knowledge of natural language processing and a series of extremely complex algorithms, but this article skips all of that. If you are only interested in the algorithms and the linguistics, you can move on; none of that is covered here.
As early as the 1990s, IBM launched a very powerful speech recognition system, ViaVoice, and related products have appeared in an endless stream and kept evolving ever since. Here we will use SAPI to implement the voice module.
What is SAPI?
SAPI is the Microsoft Speech API, the speech interface introduced by Microsoft. Attentive users will have noticed that Windows has shipped with speech recognition since WinXP, but the feature is quite limited: it offers no user-friendly way to customize it, and the few built-in voice commands are of little use. The task of this article is to use SAPI to build personalized speech recognition.
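To give a feel for what SAPI exposes, here is a minimal sketch of calling its text-to-speech voice directly through win32com, without any wrapper module. It assumes Windows with the Win32 extensions installed. SAPI.SpVoice and its Speak method belong to SAPI's automation interface; make_voice and speak are helper names made up for this illustration.

```python
def make_voice():
    """Create the SAPI text-to-speech COM object (Windows with pywin32 only)."""
    import win32com.client  # imported lazily so this file also loads elsewhere
    return win32com.client.Dispatch("SAPI.SpVoice")


def speak(text, voice=None):
    """Read `text` aloud; `voice` is any object with a Speak(text) method."""
    if voice is None:
        voice = make_voice()
    voice.Speak(text)
    return text
```

On a Windows machine, calling speak("Hello from SAPI") should read the sentence aloud. Passing the voice in as a parameter is only a convenience so the function can be tried with a stand-in object.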
In the preparation phase, you need to install at least the following tools:
Python2.7 http://www.python.org/
Python 2.7 is strongly recommended: to date it has the broadest tool and application support of the Python series, and it is relatively stable.
Win32com http://starship.python.net/~skippy/win32/Downloads.html
The Python Win32 extensions let Python call COM interfaces through win32com, which makes Python remarkably powerful.
speech.py http://pypi.python.org/pypi/speech/
This is a very thin wrapper module. It is optional, but I do not recommend reinventing the wheel, so download it anyway. It officially supports only Python 2.6 at the moment, but don't be discouraged: Python 2.6 code is compatible with Python 2.7, so it should work without errors.
Install the tools in the order listed above.
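Before moving on, it can help to confirm that the modules are importable. The following is a small sketch, not part of the original setup; missing_modules and REQUIRED are hypothetical names, and the module names win32com and speech are the ones installed by the tools listed above.

```python
def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


# Modules this article relies on: win32com comes with the Win32 extensions,
# speech is speech.py.
REQUIRED = ["win32com", "speech"]
```

Running print(missing_modules(REQUIRED)) should print an empty list once everything is installed; any name it does print still needs to be installed.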
Development phase
Once you have installed the relevant tools, you are ready to develop:
Start with a simple test of the environment:
import speech

while True:
    # Block until a phrase is recognized, then echo it back through SAPI.
    phrase = speech.input()
    speech.say("You said %s" % phrase)
    if phrase == "turn off":
        break
The code above starts the speech recognizer and makes the system repeat whatever you say. When it hears "turn off", it leaves the loop and stops recognizing.
If this test works, we can start extending it.
1. Define the phrase library
closemainsystem = "Turn off human interaction"
openeclipse = "I want to write a program"
listenmusic = "I'm so tired."
blog = "Read Blog"
php = "PHP"
java = "Java"
2. Define the logic for each phrase
import os
import sys
import webbrowser

import speech

def callback(phrase, listener):
    # Called by speech.py each time a phrase is recognized.
    print(": %s" % phrase)
    if phrase == closemainsystem:
        speech.say("Goodbye. Human-computer interaction will now close. Thank you for using it.")
        listener.stoplistening()
        sys.exit()
    elif phrase == openeclipse:
        speech.say("Would you like to write a Python or Java program?")
        speech.listenforanything(callback)
    elif phrase == listenmusic:
        speech.say("Launching Douban FM for you")
        webbrowser.open_new("http://douban.fm/")
    elif phrase == blog:
        speech.say("Opening Dreamforce.me")
        webbrowser.open_new("http://dreamforce.me/")
    elif phrase == php:
        speech.say("Starting the PHP editor")
        # Raw string so the backslashes in the Windows path are kept literally.
        os.popen(r"E:\IDE\php_eclipse\eclipse\eclipse.exe")
    elif phrase == java:
        speech.say("Starting the Java editor")
        os.popen(r"E:\IDE\php_eclipse\eclipse\eclipse.exe")
Here os.popen launches the program asynchronously: it does not open a separate shell window, and it does not block the current process.
speech.say() calls SAPI to read its argument aloud.
webbrowser.open_new() opens the given web page in a new browser window.
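The non-blocking launch described above can also be done with subprocess.Popen from the standard library, which returns as soon as the child process has started. This is a sketch of that alternative, not the code this article uses; launch and demo are hypothetical helper names, and the child Python process in demo is only a stand-in for an IDE.

```python
import subprocess
import sys


def launch(command):
    """Start `command` (a list: program, then arguments) without waiting for it."""
    return subprocess.Popen(command)


def demo():
    """Launch a short-lived child Python process as a stand-in for an IDE."""
    child = launch([sys.executable, "-c", "import time; time.sleep(0.5)"])
    # poll() returns None while the child is still running, which shows that
    # launch() came back before the child finished.
    running_right_after_launch = child.poll() is None
    # Only the demo waits for the child; a voice application would carry on.
    exit_code = child.wait()
    return running_right_after_launch, exit_code
```

In the callback, the equivalent call would be launch([r"E:\IDE\php_eclipse\eclipse\eclipse.exe"]), using the same path as above.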
3. Build the main program
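listener = speech.listenforanything(callback)
while listener.islistening():
    # raw_input() returns the typed line as a string in Python 2;
    # input() would try to evaluate it as an expression.
    text = raw_input()
    if text == "Don't Voice Anymore":
        listener.stoplistening()
        sys.exit()
    else:
        speech.say(text)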
This section is the main body of the program: it turns on voice monitoring while also accepting input typed at the terminal. If your voice is hoarse, you can type instead, haha ~~
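As the phrase list grows, the if/elif chain in the callback from step 2 gets long. One possible alternative is a lookup table that maps each phrase to a handler function. The sketch below is an illustration under assumptions, not the article's implementation: make_dispatcher and the stand-in handlers are hypothetical, and matching phrases case-insensitively is an added assumption about what the recognizer might return.

```python
def make_dispatcher(handlers, fallback=None):
    """Build a callback(phrase, listener) that looks the phrase up in `handlers`."""
    # Assumption: compare case-insensitively and ignore surrounding whitespace.
    table = dict((key.strip().lower(), fn) for key, fn in handlers.items())

    def callback(phrase, listener=None):
        handler = table.get(phrase.strip().lower(), fallback)
        if handler is None:
            return None  # unknown phrase and no fallback: do nothing
        return handler(phrase, listener)

    return callback


# Example wiring with stand-in handlers that only record what they would do.
actions = []
callback = make_dispatcher(
    {
        "I'm so tired.": lambda phrase, listener: actions.append("play music"),
        "Read Blog": lambda phrase, listener: actions.append("open blog"),
    },
    fallback=lambda phrase, listener: actions.append("unknown: " + phrase),
)
callback("read blog", None)
callback("hello", None)
# actions is now ["open blog", "unknown: hello"]
```

The returned callback has the same (phrase, listener) shape that speech.listenforanything expects, so real handlers calling speech.say or webbrowser.open_new could be dropped into the table in place of the stand-ins.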