ROS Exploration Summary (10): Speech Control


Nowadays, voice recognition is booming on PCs and smartphones. ROS, sitting at the forefront of robotics software, does not miss out on such an attractive technology. Building on the CMU Sphinx and Festival open-source projects, ROS provides an independent speech recognition package that converts recognized speech into text, which the robot can then process intelligently.


I. Speech Recognition Package
1. Installation. First install the dependency libraries directly with the Ubuntu package manager:

$ sudo apt-get install gstreamer0.10-pocketsphinx
$ sudo apt-get install ros-fuerte-audio-common
$ sudo apt-get install libasound2

Then install the ROS package:

$ svn checkout http://albany-ros-pkg.googlecode.com/svn/trunk/rharmony
$ rosmake pocketsphinx

The core file is recognizer.py in the nodes folder. This node collects voice input from a microphone, calls the speech recognition library to convert it into text, and publishes the text on the /recognizer/output topic; other nodes can subscribe to that topic and process the messages accordingly.
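As a quick illustration, here is a minimal subscriber sketch (the node name, file name, and log text are mine, not part of the package):

#!/usr/bin/env python
# speech_listener.py: a minimal sketch of a node that consumes the
# recognizer output. The node and file names here are illustrative.
import roslib; roslib.load_manifest('pocketsphinx')
import rospy
from std_msgs.msg import String

def callback(msg):
    # msg.data holds the recognized phrase as plain text
    rospy.loginfo("Heard: %s" % msg.data)

if __name__ == '__main__':
    rospy.init_node('speech_listener')
    rospy.Subscriber('/recognizer/output', String, callback)
    rospy.spin()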
2. Testing. After the installation is complete, we can run a test.
First, plug in your microphone and check in the system settings that it picks up voice input.
Then, run the test program in the package:

$ roslaunch pocketsphinx robocup.launch

At this point, a large amount of information scrolls by in the terminal. Try saying some simple sentences (they must be in English), such as "bring me the glass" or "come with me", and see whether they are recognized.
The tests reported in ROS By Example are quite accurate, but in my own test the recognition was rather poor; maybe my English is just too bad. We can also view the final recognition results directly from the topic:

$ rostopic echo /recognizer/output



II. Voice Database
1. Viewing the voice database. This speech recognition method works offline: some commonly used words and phrases are put into a file that serves as the recognition vocabulary, the speech signal is recognized segment by segment, and finally the corresponding text is looked up in this database. To view the text in the speech recognition database, use the following commands:

$ roscd pocketsphinx/demo
$ more robocup.corpus
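To make the closed-vocabulary idea concrete, here is a toy sketch (this is NOT how pocketsphinx works internally, just the concept; the file path is assumed):

# vocabulary_check.py: a toy illustration of the closed-vocabulary idea.
# The recognizer can only return phrases built from the corpus vocabulary.
corpus_path = 'robocup.corpus'  # assumed copied to the current directory

with open(corpus_path) as f:
    phrases = [line.strip().lower() for line in f if line.strip()]

recognized = "bring me the glass"
print(recognized in phrases)  # True only if the phrase appears in the corpus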

2. Adding a voice database. We can add our own text to be recognized as a new voice database. ROS By Example comes with a voice recognition routine and also provides an example of adding a voice database.
First, let's take a look at the text information to be added in the example:

$ roscd rbx1_speech/config
$ more nav_commands.txt

This is the text to be added; you can modify some of it and change it to whatever you need.
Then we need to generate the language model and dictionary files for this text online. This step requires visiting a website (the CMU Sphinx online lmtool) and uploading the file.
Decompress all the downloaded files and place them in the config folder of the rbx1_speech package. We can then rename these files:

$ roscd rbx1_speech/config
$ rename -f 's/3026/nav_commands/' *
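If the rename utility is unavailable, a small Python equivalent might look like this (assuming the downloaded files share the numeric prefix 3026, as in the example above):

# rename_lm_files.py: a Python equivalent of the rename command above,
# assuming the downloaded files share the numeric prefix 3026.
import glob
import os

for path in glob.glob('3026.*'):
    new_path = path.replace('3026', 'nav_commands', 1)
    os.rename(path, new_path)
    print('%s -> %s' % (path, new_path))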

In the rbx1_speech/launch folder, check the voice_nav_commands.launch file:

<launch>
  <node name="recognizer" pkg="pocketsphinx" type="recognizer.py" output="screen">
    <param name="lm" value="$(find rbx1_speech)/config/nav_commands.lm"/>
    <param name="dict" value="$(find rbx1_speech)/config/nav_commands.dic"/>
  </node>
</launch>

We can see that this launch file passes the generated language model and dictionary as parameters when running the recognizer.py node, so speech recognition now uses our own vocabulary.
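For reference, private parameters like these are read inside a node with rospy.get_param; a sketch (the variable names are mine, not necessarily those used in recognizer.py):

# A sketch of how a node reads the private ~lm and ~dict parameters
# set in the launch file (variable names are illustrative):
import rospy

rospy.init_node('recognizer')
lm_path = rospy.get_param('~lm')      # path to the .lm language model
dict_path = rospy.get_param('~dict')  # path to the .dic pronunciation dictionary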
Run the following command to test the effect:

$ roslaunch rbx1_speech voice_nav_commands.launch
$ rostopic echo /recognizer/output

III. Speech Control
With speech recognition in place, we can build many neat applications. First, let's try using speech to control robot motion.
1. The robot control node. The recognizer.py node mentioned earlier publishes the final recognized text as messages, so we can write a robot control node that subscribes to those messages and performs the corresponding control.
The pocketsphinx package contains a routine, voice_cmd_vel.py, that publishes Twist messages from voice commands. The rbx1_speech package provides a simplified modification of it; in its nodes folder you can view the voice_nav.py file:

#!/usr/bin/env python

"""
  voice_nav.py

  Allows controlling a mobile base using simple speech commands.

  Based on the voice_cmd_vel.py script by Michael Ferguson in
  the pocketsphinx ROS package.

  See http://www.ros.org/wiki/pocketsphinx
"""

import roslib; roslib.load_manifest('rbx1_speech')
import rospy
from geometry_msgs.msg import Twist
from std_msgs.msg import String
from math import copysign

class VoiceNav:
    def __init__(self):
        rospy.init_node('voice_nav')

        rospy.on_shutdown(self.cleanup)

        # Set a number of parameters affecting the robot's speed
        self.max_speed = rospy.get_param("~max_speed", 0.4)
        self.max_angular_speed = rospy.get_param("~max_angular_speed", 1.5)
        self.speed = rospy.get_param("~start_speed", 0.1)
        self.angular_speed = rospy.get_param("~start_angular_speed", 0.5)
        self.linear_increment = rospy.get_param("~linear_increment", 0.05)
        self.angular_increment = rospy.get_param("~angular_increment", 0.4)

        # We don't have to run the script very fast
        self.rate = rospy.get_param("~rate", 5)
        r = rospy.Rate(self.rate)

        # A flag to determine whether or not voice control is paused
        self.paused = False

        # Initialize the Twist message we will publish.
        self.cmd_vel = Twist()

        # Publish the Twist message to the cmd_vel topic
        self.cmd_vel_pub = rospy.Publisher('cmd_vel', Twist)

        # Subscribe to the /recognizer/output topic to receive voice commands.
        rospy.Subscriber('/recognizer/output', String, self.speech_callback)

        # A mapping from keywords or phrases to commands
        self.keywords_to_command = {'stop': ['stop', 'halt', 'abort', 'kill', 'panic', 'off', 'freeze', 'shut down', 'turn off', 'help', 'help me'],
                                    'slower': ['slow down', 'slower'],
                                    'faster': ['speed up', 'faster'],
                                    'forward': ['forward', 'ahead', 'straight'],
                                    'backward': ['back', 'backward', 'back up'],
                                    'rotate left': ['rotate left'],
                                    'rotate right': ['rotate right'],
                                    'turn left': ['turn left'],
                                    'turn right': ['turn right'],
                                    'quarter': ['quarter speed'],
                                    'half': ['half speed'],
                                    'full': ['full speed'],
                                    'pause': ['pause speech'],
                                    'continue': ['continue speech']}

        rospy.loginfo("Ready to receive voice commands")

        # We have to keep publishing the cmd_vel message if we want the robot to keep moving.
        while not rospy.is_shutdown():
            self.cmd_vel_pub.publish(self.cmd_vel)
            r.sleep()

    def get_command(self, data):
        # Attempt to match the recognized word or phrase to the
        # keywords_to_command dictionary and return the appropriate
        # command
        for (command, keywords) in self.keywords_to_command.iteritems():
            for word in keywords:
                if data.find(word) > -1:
                    return command

    def speech_callback(self, msg):
        # Get the motion command from the recognized phrase
        command = self.get_command(msg.data)

        # Log the command to the screen
        rospy.loginfo("Command: " + str(command))

        # If the user has asked to pause/continue voice control,
        # set the flag accordingly
        if command == 'pause':
            self.paused = True
        elif command == 'continue':
            self.paused = False

        # If voice control is paused, simply return without
        # performing any action
        if self.paused:
            return

        # The list of if-then statements should be fairly
        # self-explanatory
        if command == 'forward':
            self.cmd_vel.linear.x = self.speed
            self.cmd_vel.angular.z = 0

        elif command == 'rotate left':
            self.cmd_vel.linear.x = 0
            self.cmd_vel.angular.z = self.angular_speed

        elif command == 'rotate right':
            self.cmd_vel.linear.x = 0
            self.cmd_vel.angular.z = -self.angular_speed

        elif command == 'turn left':
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.angular.z += self.angular_increment
            else:
                self.cmd_vel.angular.z = self.angular_speed

        elif command == 'turn right':
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.angular.z -= self.angular_increment
            else:
                self.cmd_vel.angular.z = -self.angular_speed

        elif command == 'backward':
            self.cmd_vel.linear.x = -self.speed
            self.cmd_vel.angular.z = 0

        elif command == 'stop':
            # Stop the robot!  Publish a Twist message consisting of all zeros.
            self.cmd_vel = Twist()

        elif command == 'faster':
            self.speed += self.linear_increment
            self.angular_speed += self.angular_increment
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.linear.x += copysign(self.linear_increment, self.cmd_vel.linear.x)
            if self.cmd_vel.angular.z != 0:
                self.cmd_vel.angular.z += copysign(self.angular_increment, self.cmd_vel.angular.z)

        elif command == 'slower':
            self.speed -= self.linear_increment
            self.angular_speed -= self.angular_increment
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.linear.x -= copysign(self.linear_increment, self.cmd_vel.linear.x)
            if self.cmd_vel.angular.z != 0:
                self.cmd_vel.angular.z -= copysign(self.angular_increment, self.cmd_vel.angular.z)

        elif command in ['quarter', 'half', 'full']:
            if command == 'quarter':
                self.speed = copysign(self.max_speed / 4, self.speed)

            elif command == 'half':
                self.speed = copysign(self.max_speed / 2, self.speed)

            elif command == 'full':
                self.speed = copysign(self.max_speed, self.speed)

            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.linear.x = copysign(self.speed, self.cmd_vel.linear.x)
            if self.cmd_vel.angular.z != 0:
                self.cmd_vel.angular.z = copysign(self.angular_speed, self.cmd_vel.angular.z)

        else:
            return

        self.cmd_vel.linear.x = min(self.max_speed, max(-self.max_speed, self.cmd_vel.linear.x))
        self.cmd_vel.angular.z = min(self.max_angular_speed, max(-self.max_angular_speed, self.cmd_vel.angular.z))

    def cleanup(self):
        # When shutting down be sure to stop the robot!
        twist = Twist()
        self.cmd_vel_pub.publish(twist)
        rospy.sleep(1)

if __name__=="__main__":
    try:
        VoiceNav()
        rospy.spin()
    except rospy.ROSInterruptException:
        rospy.loginfo("Voice navigation terminated.")

As you can see, the code defines the control policy for each recognized command.
2. Simulation test. As before, we run the simulation test in rviz.
First, run a robot model:

$ roslaunch rbx1_bringup fake_turtlebot.launch

Then open rviz:

$ rosrun rviz rviz -d `rospack find rbx1_nav`/sim_fuerte.vcg

If you prefer not to watch the output in the terminal, you can open the GUI console:

$ rxconsole

Then open the speech recognition node:

$ roslaunch rbx1_speech voice_nav_commands.launch

Finally, run the robot's voice control node:

$ roslaunch rbx1_speech turtlebot_voice_nav.launch

OK. Once all of the above nodes are running, you can start issuing the voice commands defined in nav_commands.txt.

In my test the recognition accuracy was still rather poor.

IV. Playing Voice
Now the robot can follow what we say. It would be even better if the robot could also talk to us. ROS already integrates such a package, so let's try it.
Run the following commands:

$ rosrun sound_play soundplay_node.py
$ rosrun sound_play say.py "Greetings Humans. Take me to your leader."

Did you hear the sound? ROS lets the robot read out the input text. The default voice is kal_diphone. If you don't like it, you can have the text read by a different voice:

$ sudo apt-get install festvox-don
$ rosrun sound_play say.py "Welcome to the future" voice_don_diphone

Haha, now it is a different voice. But that is not the main point here.
In the rbx1_speech/nodes folder there is a node, talkback.py, that makes the robot speak back:
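The same text-to-speech call can also be made from Python through the sound_play client library; a minimal sketch (the phrase is just an example, and soundplay_node.py must already be running):

#!/usr/bin/env python
# say_hello.py: a minimal sketch of calling the sound_play client directly.
# Requires soundplay_node.py to be running; the phrase is just an example.
import roslib; roslib.load_manifest('sound_play')
import rospy
from sound_play.libsoundplay import SoundClient

if __name__ == '__main__':
    rospy.init_node('say_hello')
    soundhandle = SoundClient()
    rospy.sleep(1)  # give the client time to connect to the sound_play server
    soundhandle.say("Welcome to the future", "voice_don_diphone")
    rospy.sleep(2)  # keep the node alive while the audio plays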

#!/usr/bin/env python

"""
    talkback.py - Version 0.1 2012-01-10

    Use the sound_play client to say back what is heard by the pocketsphinx recognizer.

    Created for the Pi Robot Project: http://www.pirobot.org
    Copyright (c) 2012 Patrick Goebel.  All rights reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details at:

    http://www.gnu.org/licenses/gpl.html
"""

import roslib; roslib.load_manifest('rbx1_speech')
import rospy
from std_msgs.msg import String
from sound_play.libsoundplay import SoundClient
import sys

class TalkBack:
    def __init__(self, script_path):
        rospy.init_node('talkback')

        rospy.on_shutdown(self.cleanup)

        # Set the default TTS voice to use
        self.voice = rospy.get_param("~voice", "voice_don_diphone")

        # Set the wave file path if used
        self.wavepath = rospy.get_param("~wavepath", script_path + "/../sounds")

        # Create the sound client object
        self.soundhandle = SoundClient()

        # Wait a moment to let the client connect to the
        # sound_play server
        rospy.sleep(1)

        # Make sure any lingering sound_play processes are stopped.
        self.soundhandle.stopAll()

        # Announce that we are ready for input
        self.soundhandle.playWave(self.wavepath + "/R2D2a.wav")
        rospy.sleep(1)
        self.soundhandle.say("Ready", self.voice)

        rospy.loginfo("Say one of the navigation commands...")

        # Subscribe to the recognizer output and set the callback function
        rospy.Subscriber('/recognizer/output', String, self.talkback)

    def talkback(self, msg):
        # Print the recognized words on the screen
        rospy.loginfo(msg.data)

        # Speak the recognized words in the selected voice
        self.soundhandle.say(msg.data, self.voice)

        # Uncomment to play one of the built-in sounds
        #rospy.sleep(2)
        #self.soundhandle.play(5)

        # Uncomment to play a wave file
        #rospy.sleep(2)
        #self.soundhandle.playWave(self.wavepath + "/R2D2a.wav")

    def cleanup(self):
        self.soundhandle.stopAll()
        rospy.loginfo("Shutting down talkback node...")

if __name__=="__main__":
    try:
        TalkBack(sys.path[0])
        rospy.spin()
    except rospy.ROSInterruptException:
        rospy.loginfo("Talkback node terminated.")

Let's take a look at the effect:

$ roslaunch rbx1_speech talkback.launch

Then I spoke again, and ROS not only recognized the text but also read it back. Of course, without artificial intelligence algorithms the machine cannot speak intelligently on its own, but this is the foundation; it will be even better if you find time to study AI.

----------------------------------------------------------------

You are welcome to repost my article.

When reposting, please note that it is transferred from ancient-month:

http://blog.csdn.net/hcx25909

Please continue to follow my blog.
