Tools List
1. **Raspberry Pi** (any model will do; I used a 3B)
2. **USB microphone** (available online; I won't advertise a specific seller) for recording
3. **Speaker** (also available online) for playback
These are the tools you need.
Building the dialogue robot is divided into 5 steps.
1. Step one: "Recording". For recording I use *arecord*.
Install arecord (it ships with the alsa-utils package): `sudo apt-get install alsa-utils`
Record with arecord: `arecord -D "plughw:1" -f S16_LE -r 16000 -d 3 /home/pi/Desktop/voice.wav`
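As a side note, here is a minimal sketch of wrapping that same recording command in a small Python helper; the device name "plughw:1", format, rate, and output path are the values used above, and the integration script in step five shells out to arecord in the same way:

```python
# coding: utf-8
import os

def record(path="/home/pi/Desktop/voice.wav", seconds=3):
    # shell out to arecord exactly like the command above: card 1, 16-bit little-endian, 16 kHz
    os.system('arecord -D "plughw:1" -f S16_LE -r 16000 -d %d %s' % (seconds, path))
    return path
```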
2. Step two: "Speech recognition". For speech recognition I feel Baidu's works quite well, and the recognition rate is also very high. Below is the speech recognition code.

```python
# coding: utf-8
import sys
import json
import urllib2
import base64
import requests

reload(sys)
sys.setdefaultencoding("utf-8")

def get_access_token():
    # exchange the API key/secret for a Baidu access token
    url = "https://openapi.baidu.com/oauth/2.0/token"
    body = {
        "grant_type": "client_credentials",
        "client_id": "Fill in your client_id here",
        "client_secret": "Fill in your client_secret here",
    }
    r = requests.post(url, data=body, verify=True)
    respond = json.loads(r.text)
    return respond["access_token"]

def yuyinshibie_api(audio_data, token):
    # speech recognition: POST the base64-encoded WAV data to Baidu's ASR endpoint
    speech_data = base64.b64encode(audio_data).decode("utf-8")
    speech_length = len(audio_data)
    post_data = {
        "format": "wav",
        "rate": 16000,
        "channel": 1,
        "cuid": "b8-27-eb-ba-24-14",
        "token": token,
        "speech": speech_data,
        "len": speech_length
    }
    url = "http://vop.baidu.com/server_api"
    json_data = json.dumps(post_data).encode("utf-8")
    json_length = len(json_data)
    #print(json_data)
    req = urllib2.Request(url, data=json_data)
    req.add_header("Content-Type", "application/json")
    req.add_header("Content-Length", json_length)
    #print("asr start request\n")
    resp = urllib2.urlopen(req)
    #print("asr finish request\n")
    resp = resp.read()
    resp_data = json.loads(resp.decode("utf-8"))
    if resp_data["err_no"] == 0:
        return resp_data["result"]
    else:
        print(resp_data)
        return None

def asr_main(filename, tok):
    # read the recorded WAV file and return the first recognition result
    try:
        f = open(filename, "rb")
        audio_data = f.read()
        f.close()
        resp = yuyinshibie_api(audio_data, tok)
        return resp[0]
    except Exception, e:
        print "E:", e
        return "recognition failed".encode("utf-8")
```
After recognition is complete we're ready to start the third step. We want to talk to the robot, so it has to reply to us. To make it smart I used the Turing (Tuling) robot interface. Turing is really very useful: it can check the weather, tell stories, tell jokes, and so on. The code for the third step is attached below.
3. Step three: "Turing reply"
```python
# coding: utf-8
import requests
import json
import sys

reload(sys)
sys.setdefaultencoding("utf-8")

def tuling(words):
    # send the recognized text to the Turing robot API and return its reply text
    tuling_api_key = "fill in your own Turing KEY"
    body = {
        "key": tuling_api_key,
        "info": words.encode("utf-8"),
    }
    url = "http://www.tuling123.com/openapi/api"
    r = requests.post(url, data=body, verify=True)
    if r:
        date = json.loads(r.text)
        print date["text"]
        return date["text"]
    else:
        return None
```
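A minimal usage sketch, assuming the code above is saved as turling.py (the module name the integration script in step five imports):

```python
# coding: utf-8
import turling

reply = turling.tuling(u"tell me a joke")  # returns the Turing reply text, or None on failure
if reply:
    print reply
```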
4. Step four: "Speech synthesis". After Turing replies we have to play the reply aloud, for which I use Baidu's speech synthesis.
```python
# coding: utf-8
import sys
import urllib2
import json
import os
import yuyinshibie

reload(sys)
sys.setdefaultencoding("utf-8")

def yuyinhecheng_api(tok, tex):
    # build the Baidu text2audio URL; mpg123 can stream and play it directly
    cuid = "xx-xx-xx-xx-xx-xx"
    spd = "4"
    url = "http://tsn.baidu.com/text2audio?tex=" + tex + "&lan=zh&cuid=" + cuid + "&ctp=1&tok=" + tok + "&per=3"
    #print url
    #response = requests.get(url)
    #date = response.read()
    return url

def tts_main(filename, words, tok):
    voice_date = yuyinhecheng_api(tok, words)
    f = open(filename, "wb")
    f.write(voice_date)
    f.close()
```
After speech synthesis we need to play it, and for that I use mpg123. Why this one? Because it can stream and play audio directly from a URL very well.
Install mpg123: `sudo apt-get install mpg123`
Once it's installed, we'll use it later in the integration step.
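To check that synthesis and playback work together, here is a minimal sketch (assuming the modules above are saved as yuyinhecheng.py and yuyinshibie.py): it builds the text2audio URL and lets mpg123 stream it straight from Baidu.

```python
# coding: utf-8
import os
import yuyinshibie
import yuyinhecheng

tok = yuyinshibie.get_access_token()
url = yuyinhecheng.yuyinhecheng_api(tok, "hello")  # build the Baidu text2audio URL
os.system('mpg123 "%s"' % url)                     # mpg123 streams and plays the URL directly
```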
Now the tool code for recording, speech recognition, speech synthesis, and playback is all ready.
5. Step five: "Integration"
Here is the code first; I'll explain it afterwards.
```python
# coding: utf-8
import os
import time
import yuyinhecheng
import turling
import yuyinshibie

tok = yuyinshibie.get_access_token()
switch = True
while switch:
    # record a 3-second clip and recognize it
    os.system('sudo arecord -D "plughw:1" -f S16_LE -r 16000 -d 3 /home/pi/Desktop/voice.wav')
    time.sleep(0.5)
    info = yuyinshibie.asr_main("/home/pi/Desktop/voice.wav", tok)
    if "off".encode("utf-8") in info:
        # keyword "off": keep listening in an inner loop until the keyword "open" is heard
        while True:
            os.system('sudo arecord -D "plughw:1" -f S16_LE -r 16000 -d 10 /home/pi/Desktop/voice.wav')
            time.sleep(0.5)
            info = yuyinshibie.asr_main("/home/pi/Desktop/voice.wav", tok)
            if "open".encode("utf-8") in info:
                break
        url = "http://tsn.baidu.com/text2audio?tex=open success&lan=zh&cuid=b8-27-eb-ba-24-14&ctp=1&tok=" + tok + "&per=3"
        os.system('mpg123 "%s"' % url)
    elif "pause".encode("utf-8") in info:
        # keyword "pause": announce the pause, wait, then announce again
        url = "http://tsn.baidu.com/text2audio?tex=begin to pause&lan=zh&cuid=b8-27-eb-ba-24-14&ctp=1&tok=" + tok + "&per=3"
        os.system('mpg123 "%s"' % url)
        time.sleep(10)
        # fill in your own announcement text for tex here
        url = "http://tsn.baidu.com/text2audio?tex=&lan=zh&cuid=b8-27-eb-ba-24-14&ctp=1&tok=" + tok + "&per=3"
        os.system('mpg123 "%s"' % url)
        continue
    else:
        # no keyword: send the text to Turing, synthesize the reply, and play it
        tex = turling.tuling(info)
        url = yuyinhecheng.yuyinhecheng_api(tok, tex)
        os.system('mpg123 "%s"' % url)
        time.sleep(0.5)
```
First I use the recording tool to record a clip, then use speech recognition to recognize what I said, and then use an if to check whether what I said contains a keyword. If there is a keyword such as "off", the program enters an inner loop that keeps recording and recognizing me until it recognizes the keyword "open", and then it breaks out back to the main loop. Why did I add this feature? Because we can't have the robot sitting there all the time, always recognizing what we say and constantly replying to us, right? I thought about this for a long time; it's not an ideal approach, since it is still just a loop, but I have no better way. If you have a better way, contact me at qq:1281248141. Otherwise the program keeps recognizing my words, Turing replies with a message, and that reply text is submitted to speech synthesis, which turns the text into audio; then mpg123, mentioned above, plays this audio. The usage of mpg123 is `mpg123 "url"`. That is how the dialogue is achieved.
Raspberry Pi dialogue robot in Python (repost)