Sitting in front of your computer, have you ever wished it would simply obey you? When you are tired, you just say "I am tired" and the computer plays some elegant light music to help you relax. Or perhaps, in the middle of a busy schedule, you would like the computer to read out the latest NBA scores. Everything becomes so comfortable.
Don't lose heart: we really can build something like this.
Build speech recognition? I suspect readers will react in one of two ways: some will be curious, and the rest will want to run a thousand miles away.
In fact, you don't need much programming skill, and you don't need to understand natural language processing at all. Although this article implements voice control, it is not as complex as you might imagine. If you treat speech recognition as an interface that has already been implemented, the rest of the logic comes down to simple elements such as if-else.
How voice control works
Voice control has two parts: speech recognition and reading text aloud (speech synthesis).
Both parts would normally require natural language processing skills and a series of extremely complex algorithms, but this article skips all of that. If you are only interested in the algorithms and the linguistics, feel free to move on, because nothing below covers them.
As early as the 1990s, IBM launched a very powerful speech recognition system, ViaVoice, and related products have kept appearing and evolving ever since. Here we will use SAPI to implement the speech module.
What is SAPI?
SAPI is the Microsoft Speech API, the speech interface introduced by Microsoft. Observant readers will have noticed that Windows has shipped with speech recognition since Windows XP, yet very few people use it: it offers little in the way of user-friendly customization, and the handful of built-in voice commands are of little practical value. So the task of this article is to use SAPI to build personalized speech recognition.
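To get a feel for what a wrapper module does underneath, here is a minimal sketch that asks SAPI to speak one sentence through its COM automation object, SAPI.SpVoice. It assumes Windows with the Python Win32 extensions installed. The helper name make_say and the print fallback for machines without SAPI are my own additions for illustration, not part of SAPI itself.

```python
from __future__ import print_function


def make_say():
    """Return a function that speaks text aloud, or prints it if SAPI is unavailable."""
    try:
        import win32com.client  # provided by the Python Win32 extensions
        voice = win32com.client.Dispatch("SAPI.SpVoice")  # SAPI text-to-speech object
    except Exception:
        # Assumption: without SAPI we simply print, so the sketch still runs anywhere.
        def say(text):
            print("[say] %s" % text)
        return say

    def say(text):
        voice.Speak(text)  # with default flags this waits until speaking has finished
    return say


if __name__ == "__main__":
    make_say()("Hello, I can talk.")
```

On a machine without SAPI the call simply prints the sentence, which is enough to check that the rest of a script hangs together.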
In the preparation phase, you need to install at least the following tools:
Python 2.7 http://www.python.org/
Python 2.7 is strongly recommended: so far it has the largest number of supporting tools and applications in the Python series, and it is relatively stable.
Win32com http://starship.python.net/~skippy/win32/Downloads.html
The Python Win32 extensions let Python call Win32 COM interfaces, which makes Python incredibly powerful.
speech.py http://pypi.python.org/pypi/speech/
This is a very lightweight wrapper module. It is optional, but I do not recommend reinventing the wheel, so download it anyway. It currently states support for Python 2.6 only, but don't lose heart: Python 2.6 and Python 2.7 code is compatible, so it will not cause any errors.
Install the tools in the order listed above.
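Before moving on, it can help to confirm that the packages can actually be imported. The following is a small sketch using only the standard library. The name check_modules is a hypothetical helper, and win32com.client and speech are assumed to be the import names of the two packages listed above.

```python
from __future__ import print_function
import importlib


def check_modules(names):
    """Map each module name to True if it can be imported, False otherwise."""
    result = {}
    for name in names:
        try:
            importlib.import_module(name)
            result[name] = True
        except Exception:  # ImportError, or any error raised while the module loads
            result[name] = False
    return result


if __name__ == "__main__":
    report = check_modules(["win32com.client", "speech"])
    for name, ok in sorted(report.items()):
        print("%-16s %s" % (name, "OK" if ok else "missing"))
```

If either line reports "missing", revisit the corresponding installation step before continuing.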
Development phase
Once you have installed the relevant tools, you can begin development.
Start with a simple debugging environment:
The code is as follows:
import speech

while True:
    phrase = speech.input()              # wait until something is heard
    speech.say("You said %s" % phrase)   # read it back aloud
    if phrase == "Turn off":
        break
The code above starts the speech recognizer, and the system repeats back whatever you say. When it hears "Turn off", the loop ends and recognition stops.
If the test passes, we can start extending it.
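If no microphone is at hand, the control flow of this loop can still be exercised. The sketch below is a stand-in under stated assumptions: echo_loop is a hypothetical helper that takes the listening and speaking functions as parameters, so that plain Python functions can replace speech.input and speech.say during a dry run.

```python
def echo_loop(listen, say, stop_phrase="Turn off"):
    """Repeat every heard phrase until stop_phrase is heard; return the phrases heard."""
    heard = []
    while True:
        phrase = listen()
        heard.append(phrase)
        say("You said %s" % phrase)
        if phrase == stop_phrase:
            break
    return heard


if __name__ == "__main__":
    # Dry run: scripted phrases stand in for the microphone, a list for the speaker.
    script = iter(["hello", "play music", "Turn off"])
    spoken = []
    echo_loop(lambda: next(script), spoken.append)
    print(spoken)
```

With the real module installed, the same helper could be called as echo_loop(speech.input, speech.say).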
1. Define the semantic library
The code is as follows:
# Each variable holds the spoken phrase that triggers one command.
closemainsystem = "Turn off human-computer interaction"
openeclipse = "I want to write a program"
listenmusic = "I'm so tired"
blog = "Read Blog"
php = "PHP"
java = "Java"
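Because every command is a fixed phrase, one possible refinement is to hand the recognizer the list of known phrases instead of listening for free-form dictation. The sketch below assumes the listenfor(phraselist, callback) function of speech.py. The dictionary name commands and the two helper names are my own, and the import sits inside the function so the rest of the file loads even where the module is missing.

```python
# Hypothetical alternative layout: one dictionary instead of separate variables.
commands = {
    "closemainsystem": "Turn off human-computer interaction",
    "openeclipse": "I want to write a program",
    "listenmusic": "I'm so tired",
    "blog": "Read Blog",
    "php": "PHP",
    "java": "Java",
}


def phrase_list(library):
    """Return the distinct phrases of the library in a stable (sorted) order."""
    return sorted(set(library.values()))


def listen_for_commands(library, callback):
    """Start a listener limited to the library's phrases (needs speech.py on Windows)."""
    import speech  # assumption: speech.py is installed
    return speech.listenfor(phrase_list(library), callback)
```

Restricting the recognizer to a short list should make misrecognitions less likely than open dictation, at the cost of ignoring anything outside the list.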
2. Define the logic for each phrase
The code is as follows:
import os
import sys
import webbrowser
import speech

def callback(phrase, listener):
    print(": %s" % phrase)
    if phrase == closemainsystem:
        speech.say("Goodbye. Human-computer interaction is about to close. Thank you for using it.")
        listener.stoplistening()
        sys.exit()
    elif phrase == openeclipse:
        speech.say("Would you like to write a PHP or Java program?")
        speech.listenforanything(callback)
    elif phrase == listenmusic:
        speech.say("Starting Douban FM for you")
        webbrowser.open_new("http://douban.fm/")
    elif phrase == blog:
        speech.say("Opening Dreamforce.me")
        webbrowser.open_new("http://dreamforce.me/")
    elif phrase == php:
        speech.say("Starting the PHP editor")
        # Raw string so the backslashes in the Windows path are taken literally.
        os.popen(r"E:\IDE\php_eclipse\eclipse\eclipse.exe")
    elif phrase == java:
        speech.say("Starting the Java editor")
        # Adjust this path to your own Java IDE.
        os.popen(r"E:\IDE\php_eclipse\eclipse\eclipse.exe")
Here os.popen launches the program asynchronously: it neither opens a separate shell window nor blocks the current process.
speech.say() calls SAPI to read its argument aloud.
webbrowser.open_new() opens a web page in a new browser window.
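As the list of commands grows, the if/elif chain can be swapped for a lookup table that maps each phrase to a function. This is an alternative sketch, not the method used in this article: make_dispatcher and the sample actions are hypothetical, and the actions here only record what they would do, so the example runs without SAPI.

```python
def make_dispatcher(actions, fallback=None):
    """Build a callback(phrase, listener) that runs the action registered for phrase."""
    def callback(phrase, listener=None):
        action = actions.get(phrase, fallback)
        if action is None:
            return False  # unknown phrase, nothing to do
        action(listener)
        return True
    return callback


if __name__ == "__main__":
    log = []
    actions = {
        "I'm so tired": lambda listener: log.append("open http://douban.fm/"),
        "Read Blog": lambda listener: log.append("open http://dreamforce.me/"),
    }
    callback = make_dispatcher(actions)
    callback("Read Blog")
    callback("something else")  # ignored: no action is registered for it
    print(log)
```

In a real setup each action would call speech.say() and webbrowser.open_new(), and adding a command would mean adding one dictionary entry instead of another elif branch.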
3. Build the main body of the program
The code is as follows:
listener = speech.listenforanything(callback)
while listener.islistening():
    # raw_input() returns the typed text as-is; in Python 2, input() would evaluate it.
    text = raw_input()
    if text == "Do not Voice":
        listener.stoplistening()
        sys.exit()
    else:
        speech.say(text)
This section is the main body of the program: it turns on voice listening while also accepting typed input from the terminal. If your voice is hoarse, you can type instead, haha~
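One detail of the typed-input side is worth a sketch. In Python 2, input() evaluates what is typed as an expression, while raw_input() returns the text unchanged; in Python 3, input() returns the text. The hypothetical helper below picks the right function and handles one typed line. The say and stop parameters are stand-ins for speech.say and listener.stoplistening, so the logic can be tried without the speech module.

```python
try:
    read_line = raw_input  # Python 2: returns the typed text unchanged
except NameError:
    read_line = input      # Python 3: input() already returns the text


def handle_typed(text, say, stop, stop_command="Do not Voice"):
    """Stop listening on stop_command, otherwise speak the typed text.

    Returns True while the surrounding loop should keep running.
    """
    if text.strip() == stop_command:
        stop()
        return False
    say(text)
    return True
```

The main loop could then be written as: while handle_typed(read_line(), speech.say, listener.stoplistening): pass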