Python owes much of its power to its huge collection of third-party libraries, and audio libraries are no exception.
For audio there is PyAudio. This library can record from the microphone, play audio files, and more. For now we ignore the other features and only look at how it records.
First, install PyAudio with pip:
pip install pyaudio
One. Recording from the microphone with PyAudio
Create a .py file and copy in the following code:
import pyaudio
import wave

CHUNK = 1024            # frames read per buffer
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 16000            # sample rate in Hz
RECORD_SECONDS = 2
WAVE_OUTPUT_FILENAME = "Oldboy.wav"

p = pyaudio.PyAudio()

stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)

print("start recording, please speak ...")

frames = []

for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)

print("the recording is over, please shut up!")

stream.stop_stream()
stream.close()
p.terminate()

# Write the captured frames to a WAV file
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
Try it: an Oldboy.wav file appears in the current directory. Play it back and the recording is quite clear.
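The file can also be checked without listening to it. The sketch below uses only the standard-library wave module; the helper names wav_info and expected_frames are hypothetical and not part of the original script. With the constants above, the loop reads int(16000 / 1024 * 2) = 31 whole chunks, that is 31 * 1024 = 31744 frames, so the file is about 1.98 seconds long rather than exactly 2.

```python
import wave


def wav_info(path):
    """Return (channels, sample_width_bytes, frame_rate, duration_seconds)."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return wf.getnchannels(), wf.getsampwidth(), rate, frames / rate


def expected_frames(rate=16000, chunk=1024, seconds=2):
    """Frames captured by the recording loop: whole chunks only."""
    return int(rate / chunk * seconds) * chunk
```

Calling wav_info("Oldboy.wav") after a recording should report 2 channels, a sample width of 2 bytes and a rate of 16000, assuming the recording script ran with the settings shown.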
Next, we wrap the recording code in a function, so that we can simply call it whenever we want to record.
Create a file pyrec.py and put the recording code into a function:
# pyrec.py file contents
import pyaudio
import wave

CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 16000
RECORD_SECONDS = 2


def rec(file_name):
    p = pyaudio.PyAudio()

    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)

    print("start recording, please speak ...")

    frames = []

    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)

    print("the recording is over, please shut up!")

    stream.stop_stream()
    stream.close()
    p.terminate()

    # Write the captured frames to the requested file
    wf = wave.open(file_name, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()
The rec function is the recording function we will call: give it a file name and it writes the recorded sound to that file.
Two. Automatic audio format conversion and calling speech recognition
With recording solved, let's hook it up to Baidu speech recognition right away:
No matter how clear your recording is, you will find that Baidu always returns:
{'err_msg': 'speech quality error.', 'err_no': ..., 'sn': '6397933501529645284'}  # the sound is not clear
In fact the problem is not that the audio is unclear. The culprit is the audio format: Baidu expects PCM.
So we need to convert the recorded WAV audio file to a PCM file.
Write a file wav2pcm.py whose function exists solely to convert WAV files for us.
It uses the os.system() method from the os module, which executes a system command. On Windows these are the commands you would type in cmd, such as dir and cd.
# wav2pcm.py file contents
import os


def wav_to_pcm(wav_file):
    # Suppose wav_file = "audio.wav"
    # wav_file.split(".") gives ["audio", "wav"]; take the first element "audio"
    # and join it with ".pcm" to get "audio.pcm"
    pcm_file = "%s.pcm" % (wav_file.split(".")[0])

    # This is the command we typed in the cmd window earlier; here Python runs it for us
    os.system("ffmpeg -y -i %s -acodec pcm_s16le -f s16le -ac 1 -ar 16000 %s" % (wav_file, pcm_file))

    return pcm_file
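One caveat: wav_file.split(".")[0] cuts at the first dot, so a path such as ./rec/1.wav or a name such as my.voice.wav yields the wrong output name, and a file name containing spaces breaks the command string. Here is a minimal sketch of a more defensive variant, assuming the same ffmpeg options as above. The helper name ffmpeg_args is hypothetical, and subprocess.run is swapped in for os.system.

```python
import os
import subprocess


def ffmpeg_args(wav_file):
    """Build the ffmpeg argument list and the target .pcm file name."""
    # splitext strips only the final extension, so "./rec/my.voice.wav" works
    pcm_file = os.path.splitext(wav_file)[0] + ".pcm"
    args = ["ffmpeg", "-y", "-i", wav_file,
            "-acodec", "pcm_s16le", "-f", "s16le",
            "-ac", "1", "-ar", "16000", pcm_file]
    return args, pcm_file


def wav_to_pcm(wav_file):
    """Convert a WAV file to raw 16 kHz mono PCM; needs ffmpeg on PATH."""
    args, pcm_file = ffmpeg_args(wav_file)
    # An argument list avoids shell quoting problems; check=True raises
    # CalledProcessError if ffmpeg exits with a non-zero status
    subprocess.run(args, check=True)
    return pcm_file
```

Keeping the command construction in its own function makes it possible to inspect the arguments without actually running ffmpeg.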
Now we have a function that converts WAV to PCM, so let's rework our code once more.
This time the return value is much more satisfying:
{'corpus_no': '6569869134617218414', 'err_msg': 'success.', 'err_no': 0, 'result': ['Old boy Education'], 'sn': '8116162981529666859'}
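Since the service returns a dictionary in both the failure case and the success case, it is worth checking err_no before reading result. Below is a small sketch that relies only on the fields visible in the responses above; the function name extract_text is hypothetical.

```python
def extract_text(res):
    """Return the recognized string, or None if recognition failed."""
    # err_no is 0 in the successful response shown above
    if res.get("err_no") != 0:
        print("recognition failed:", res.get("err_msg"))
        return None
    results = res.get("result") or []
    return results[0] if results else None


ok = {'corpus_no': '6569869134617218414', 'err_msg': 'success.',
      'err_no': 0, 'result': ['Old boy Education'],
      'sn': '8116162981529666859'}
print(extract_text(ok))  # prints: Old boy Education
```

Returning None on failure lets the caller decide whether to record again or give up, instead of crashing on a missing result key.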
Now that we have the recognized string, we feed it to speech synthesis so that the program repeats what we say.
Three. Speech synthesis and playing the MP3 file with FFmpeg
With the string in hand, call the synthesis method directly to synthesize it.
With this extra piece of code added, we get the Synth.mp3 audio file, and it really does repeat what we said.
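A minimal sketch of the saving step follows. It assumes, as Baidu's SDK documentation describes, that client.synthesis returns the MP3 bytes on success and a dictionary describing the error on failure. The helper name save_synthesis is hypothetical.

```python
def save_synthesis(result, synth_file="Synth.mp3"):
    """Write synthesized audio to disk; return the file name or None."""
    # Assumption from the SDK documentation: a failed call hands back a
    # dict instead of audio bytes, so it must not be written to the file
    if isinstance(result, dict):
        print("synthesis failed:", result)
        return None
    with open(synth_file, "wb") as f:
        f.write(result)
    return synth_file
```

The value passed in would be whatever client.synthesis(...) returned. Checking the type first avoids producing an MP3 file that no player can open.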
The next step is to have the program play Synth.mp3 automatically. PyAudio can play audio too, but it is a bit complicated to use.
So we solve the complicated problem the simple way, crude as it may be. Remember ffmpeg?
FFmpeg ships with a tool called ffplay for opening and playing audio files. Usage is roughly: ffplay audio_file.mp3
Create a playmp3.py file and write a play_mp3 function to play the synthesized speech:
# playmp3.py file contents
import os


def play_mp3(file_name):
    os.system("ffplay %s" % (file_name))
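One practical detail: by default ffplay opens a display window and keeps running after the file ends, which can leave the script waiting. The sketch below adds ffplay's -nodisp and -autoexit options and passes the arguments as a list. The helper name ffplay_args is hypothetical, and ffplay is assumed to be on PATH.

```python
import subprocess


def ffplay_args(file_name):
    """Argument list for playing one audio file and then exiting."""
    # -nodisp: no display window; -autoexit: quit when playback finishes
    return ["ffplay", "-nodisp", "-autoexit", file_name]


def play_mp3(file_name):
    # Blocks until playback is done; raises CalledProcessError on failure
    subprocess.run(ffplay_args(file_name), check=True)
```

Because the file name is a separate list element, names containing spaces need no extra quoting.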
Back in the main file, call the play_mp3 function from playmp3.py.
Run the code, and when you see: start recording, please speak ...
Say loudly: to learn IT, look for old boy education
Then you will hear a pleasant voice repeating what you said.
Four. Simple questions and answers
First we need to reorganize the code:
Move the speech synthesis and speech recognition code into functions of their own in the baidu_ai.py file.
# baidu_ai.py file contents
from aip import AipSpeech

# These three values correspond to the three parameters of the application created on the Baidu speech platform
APP_ID = "xxxxx"
API_KEY = "xxxxxxx"
SECRET_KEY = "xxxxxxxx"

client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)


def audio_to_text(pcm_file):
    # Read the PCM file
    with open(pcm_file, 'rb') as fp:
        file_context = fp.read()

    # Recognize the local file
    res = client.asr(file_context, 'pcm', 16000, {
        'dev_pid': 1536,
    })

    # Take the first element of the "result" list in the returned dictionary;
    # this is the recognized string, e.g. "Old boy education"
    res_str = res.get("result")[0]

    return res_str


def text_to_audio(res_str):
    synth_file = "Synth.mp3"
    synth_context = client.synthesis(res_str, "zh", 1, {
        "vol": 5,
        "spd": 4,
        "pit": 8,
        "per": 4
    })

    with open(synth_file, "wb") as f:
        f.write(synth_context)

    return synth_file
Then modify the main file accordingly:
import pyrec      # recording function file
import wav2pcm    # WAV-to-PCM conversion function file
import baidu_ai   # speech synthesis and speech recognition function file
import playmp3    # MP3 playback function file

pyrec.rec("1.wav")  # record and generate a WAV file; pass in the file name

pcm_file = wav2pcm.wav_to_pcm("1.wav")  # convert the WAV file to PCM; returns the PCM file name

res_str = baidu_ai.audio_to_text(pcm_file)  # recognize the converted PCM audio as text res_str

synth_file = baidu_ai.text_to_audio(res_str)  # synthesize res_str into speech; returns the file name synth_file

playmp3.play_mp3(synth_file)  # play synth_file
Now it is time to use your imagination:
res_str is a string, so if it equals "What's your name", we can give an answer: "My name is old boy education".
Create a new faq.py file and define a function faq:
# faq.py file contents
def faq(q):
    if q == "What's your name?":  # question
        return "My name is old boy education"  # answer

    return "I don't know what you're talking about"  # returned when there is no answer for the question
Import this function in the main file and pass the recognized string to it.
Now try asking: "What's your name", "How old are you?"
That's right, you can now add more questions to the faq.py file.
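As the list of questions grows, one possible way to keep faq.py manageable is a lookup table instead of a chain of if statements. This is only a sketch under assumptions: the ANSWERS table and DEFAULT_ANSWER name are hypothetical, and stripping trailing punctuation is a guess at how to tolerate small differences in the recognized string.

```python
# Hypothetical question/answer table; extend it with more entries as needed
ANSWERS = {
    "What's your name": "My name is old boy education",
}

DEFAULT_ANSWER = "I don't know what you're talking about"


def faq(q):
    """Look up the answer for a recognized question string."""
    # Drop surrounding whitespace and trailing punctuation before matching,
    # since the recognizer may or may not include it
    key = q.strip().rstrip("?.!")
    return ANSWERS.get(key, DEFAULT_ANSWER)
```

With this layout, adding a question means adding one dictionary entry. Exact string matching remains a limitation, because a slightly different wording still falls through to the default answer.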
As always: play with it, but don't break it.
Study Questions
1. How can the program keep taking questions instead of stopping after a single one?
2. With so many possible questions, do we really have to write out every one of them?
3. If I ask "Who are you", do we have to write the answer "My name is old boy education" all over again?
Python AI Road, Part Three: recording with PyAudio and an automated interactive FAQ