Author: Gui
Time: 2017-05-03 12:18:46
Links: http://www.cnblogs.com/xingshansi/p/6799994.html
Objective
This article records the audio operations commonly used in Python, taking .wav files as the example. Many ready-made audio toolkits are available online; if you only need to call existing functions, a toolkit is more convenient.
1. Batch-reading .wav file names:
    import os

    filepath = "./data/"  # data directory
    filename = os.listdir(filepath)  # names of all files in the folder
    for file in filename:
        print(filepath + file)
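The path here is a string. Python has several kinds of string:
1. Ordinary strings (str).
2. Raw strings, which start with an uppercase R or a lowercase r (r'') and do not escape special characters.
3. Unicode strings (u''). In Python 2, str and unicode are both subclasses of basestring.
For example:
    path = './file/n'
    path = r'.\file\n'
    path = '.\\file\\n'
The last two are the same string, and on Windows all three refer to the same path. The backslash \ is the escape character; a string literal prefixed with r is a raw string and is taken literally, without escaping.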
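A quick check of how these spellings compare, as a minimal sketch (the variable names are only illustrative and are not part of the original post):

```python
# Compare the three ways of spelling the path used above.
forward = './file/n'        # forward slashes, nothing to escape
raw = r'.\file\n'           # raw string: backslashes are kept literally
escaped = '.\\file\\n'      # ordinary string: each backslash is escaped

# The raw and the escaped spellings produce exactly the same string.
same_text = (raw == escaped)

# Without r or doubling, '\f' is a form feed and '\n' a newline.
unescaped = '.\file\n'

print(same_text)                   # True
print(len(raw), len(unescaped))    # 8 6
```

The shorter length of the last string shows that two backslash sequences were each turned into a single control character.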
Common ways to get help:
    >>> help(str)
    >>> dir(str)
    >>> help(str.replace)
2. Reading .wav files
wave.open usage:
    wave.open(file, mode)
mode can be:
'rb', read the file;
'wb', write the file;
Simultaneous read/write is not supported.
Wave_read.getparams usage:
    f = wave.open(file, 'rb')
    params = f.getparams()
    nchannels, sampwidth, framerate, nframes = params[:4]
The last line extracts the most commonly used audio parameters:
nchannels: number of channels
sampwidth: sample width (quantization depth), in bytes
framerate: sampling frequency
nframes: number of sample frames
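To see these four values on a concrete file, the following sketch first writes a tiny mono file and then reads its header back. The temporary path, the file name demo.wav and the sample values are assumptions made only for illustration:

```python
import os
import struct
import tempfile
import wave

# Write a tiny mono file first so the example does not depend on ./data/.
path = os.path.join(tempfile.mkdtemp(), "demo.wav")  # hypothetical file name
with wave.open(path, "wb") as out:
    out.setnchannels(1)      # mono
    out.setsampwidth(2)      # 2 bytes = 16 bits per sample
    out.setframerate(8000)   # 8 kHz
    out.writeframes(struct.pack("<4h", 0, 1000, -1000, 0))  # 4 frames

# Read the header back.
with wave.open(path, "rb") as f:
    params = f.getparams()
    nchannels, sampwidth, framerate, nframes = params[:4]

duration = nframes / framerate  # length in seconds
print(nchannels, sampwidth, framerate, nframes, duration)
```

Dividing nframes by framerate gives the duration, which is the same relation used to build the time axis when plotting.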
Corresponding Code:
    import wave
    import matplotlib.pyplot as plt
    import numpy as np
    import os

    filepath = "./data/"  # data directory
    filename = os.listdir(filepath)  # names of all files in the folder
    f = wave.open(filepath + filename[1], 'rb')
    params = f.getparams()
    nchannels, sampwidth, framerate, nframes = params[:4]
    strData = f.readframes(nframes)  # read the audio as bytes
    f.close()
    waveData = np.frombuffer(strData, dtype=np.int16)  # bytes -> int16 samples
    waveData = waveData * 1.0 / np.max(np.abs(waveData))  # normalize the amplitude
    # plot the wave
    time = np.arange(0, nframes) * (1.0 / framerate)
    plt.plot(time, waveData)
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.title("Single channel wavedata")
    plt.grid(True)  # show grid lines
    plt.show()
Result diagram:
Multi-channel reading: here the file has 3 channels. The only extra step is np.reshape; everything else is the same as in the single-channel case. Corresponding code:
    # -*- coding: utf-8 -*-
    """
    Created on Wed May 3 12:15:34 2017
    @author: Nobleding
    """
    import wave
    import matplotlib.pyplot as plt
    import numpy as np
    import os

    filepath = "./data/"  # data directory
    filename = os.listdir(filepath)  # names of all files in the folder
    f = wave.open(filepath + filename[0], 'rb')
    params = f.getparams()
    nchannels, sampwidth, framerate, nframes = params[:4]
    strData = f.readframes(nframes)  # read the audio as bytes
    waveData = np.frombuffer(strData, dtype=np.int16)  # bytes -> int16 samples
    waveData = waveData * 1.0 / np.max(np.abs(waveData))  # normalize the amplitude
    waveData = np.reshape(waveData, [nframes, nchannels])  # one column per channel
    f.close()
    # plot the wave
    time = np.arange(0, nframes) * (1.0 / framerate)
    plt.figure()
    plt.subplot(5, 1, 1)
    plt.plot(time, waveData[:, 0])
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.title("Ch-1 wavedata")
    plt.grid(True)  # show grid lines
    plt.subplot(5, 1, 3)
    plt.plot(time, waveData[:, 1])
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.title("Ch-2 wavedata")
    plt.grid(True)
    plt.subplot(5, 1, 5)
    plt.plot(time, waveData[:, 2])
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.title("Ch-3 wavedata")
    plt.grid(True)
    plt.show()
Single-channel is a special case of multi-channel, so the multi-channel reading method works for WAV files with any number of channels. Note that after reshape the data structure of waveData differs from before: it is now a two-dimensional array of shape [nframes, nchannels]. For a single-channel file, waveData[:, 0] (or waveData[0] after a transpose) is equivalent to waveData before the reshape. This does not affect plotting, but it has to be taken into account when analyzing the spectrum.
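The effect of the reshape can be checked on a small hand-made array. This is only a sketch: the sample values are invented so that each channel is easy to recognize.

```python
import numpy as np

nchannels, nframes = 3, 4
# Interleaved samples as stored in a WAV file:
# frame 0 = (ch1, ch2, ch3), frame 1 = (ch1, ch2, ch3), ...
interleaved = np.array([10, 20, 30,
                        11, 21, 31,
                        12, 22, 32,
                        13, 23, 33], dtype=np.int16)

frames = np.reshape(interleaved, [nframes, nchannels])  # shape (4, 3)
ch1 = frames[:, 0]        # column 0 holds channel 1
by_channel = frames.T     # shape (3, 4): one row per channel

print(ch1.tolist())             # [10, 11, 12, 13]
print(by_channel[0].tolist())   # [10, 11, 12, 13]
```

Indexing a column of the reshaped array and indexing a row of its transpose return the same channel, which is why both forms appear in code that handles multi-channel data.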
3. Writing .wav files
Three main steps are involved:
- Setting the parameters:
    nchannels = 1  # single channel as the example
    sampwidth = 2
    fs = 8000
    data_size = len(outData)
    framerate = int(fs)
    nframes = data_size
    comptype = "NONE"
    compname = "not compressed"
    outwave.setparams((nchannels, sampwidth, framerate, nframes, comptype, compname))
- The storage path and file name of the WAV file to be written:
    outfile = filepath + 'out1.wav'
    outwave = wave.open(outfile, 'wb')  # define the storage path and file name
- Writing the data:
    for v in outData:
        # outData is scaled to 16 bits, -32767~32767; take care not to overflow
        outwave.writeframes(struct.pack('h', int(v * 64000 / 2)))
Single-channel data write:
    import wave
    # import matplotlib.pyplot as plt
    import numpy as np
    import os
    import struct

    # read the wav file
    filepath = "./data/"  # data directory
    filename = os.listdir(filepath)  # names of all files in the folder
    f = wave.open(filepath + filename[1], 'rb')
    params = f.getparams()
    nchannels, sampwidth, framerate, nframes = params[:4]
    strData = f.readframes(nframes)  # read the audio as bytes
    waveData = np.frombuffer(strData, dtype=np.int16)  # bytes -> int16 samples
    waveData = waveData * 1.0 / np.max(np.abs(waveData))  # normalize the amplitude
    f.close()

    # write the wav file
    outData = waveData  # data to be written; waveData is reused here
    outfile = filepath + 'out1.wav'
    outwave = wave.open(outfile, 'wb')  # define the storage path and file name
    nchannels = 1
    sampwidth = 2
    fs = 8000
    data_size = len(outData)
    framerate = int(fs)
    nframes = data_size
    comptype = "NONE"
    compname = "not compressed"
    outwave.setparams((nchannels, sampwidth, framerate, nframes, comptype, compname))
    for v in outData:
        # outData is scaled to 16 bits, -32767~32767; be careful not to overflow
        outwave.writeframes(struct.pack('h', int(v * 64000 / 2)))
    outwave.close()
Multi-channel data write:
Multi-channel writing mirrors multi-channel reading: reading reshapes the one-dimensional data into two dimensions, and writing reshapes the two-dimensional data back into one dimension. It is simply the reverse process:
    import wave
    # import matplotlib.pyplot as plt
    import numpy as np
    import os
    import struct

    # read the wav file
    filepath = "./data/"  # data directory
    filename = os.listdir(filepath)  # names of all files in the folder
    f = wave.open(filepath + filename[0], 'rb')
    params = f.getparams()
    nchannels, sampwidth, framerate, nframes = params[:4]
    strData = f.readframes(nframes)  # read the audio as bytes
    waveData = np.frombuffer(strData, dtype=np.int16)  # bytes -> int16 samples
    waveData = waveData * 1.0 / np.max(np.abs(waveData))  # normalize the amplitude
    waveData = np.reshape(waveData, [nframes, nchannels])
    f.close()

    # write the wav file
    outData = waveData  # data to be written; waveData is reused here
    outData = np.reshape(outData, nframes * nchannels)  # back to 1-D, interleaved
    outfile = filepath + 'out2.wav'
    outwave = wave.open(outfile, 'wb')  # define the storage path and file name
    nchannels = 3
    sampwidth = 2
    fs = 8000
    data_size = len(outData)
    framerate = int(fs)
    nframes = data_size // nchannels  # frames = total samples / channels
    comptype = "NONE"
    compname = "not compressed"
    outwave.setparams((nchannels, sampwidth, framerate, nframes, comptype, compname))
    for v in outData:
        # outData is scaled to 16 bits, -32767~32767; be careful not to overflow
        outwave.writeframes(struct.pack('h', int(v * 64000 / 2)))
    outwave.close()
struct.pack(...) performs the binary conversion here: the format 'h' packs each value as a signed 16-bit integer.
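As a small illustration of that conversion, the sketch below packs a few values and shows what happens on overflow. The '<' prefix, which forces little-endian byte order, is an addition here; the code in the post uses plain 'h' (native byte order).

```python
import struct

# 'h' is a signed 16-bit integer; '<' forces little-endian, as used in WAV data.
one = struct.pack('<h', 1)          # two bytes: b'\x01\x00'
peak = struct.pack('<h', 32000)     # 1.0 * 64000 / 2, the largest scaled value

# Values outside -32768..32767 do not fit and raise struct.error.
try:
    struct.pack('<h', 40000)
    overflowed = False
except struct.error:
    overflowed = True

# Unpacking reverses the conversion.
value = struct.unpack('<h', peak)[0]
print(one, value, overflowed)
```

This is why the normalized data is scaled by 64000/2 = 32000 rather than by a larger factor: the result always stays inside the 16-bit range.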
4. Audio playback
Playing a WAV file requires PyAudio. Download the installation package (.whl), put it in the \Scripts folder, open CMD, switch to that directory, and run:
    pip install PyAudio-0.2.9-cp35-none-win_amd64.whl
The PyAudio installation is then complete.
Main parameters of the open() method of the PyAudio object:
- rate: sampling rate
- channels: number of channels
- format: sample format of the quantized values; it can be paFloat32, paInt32, paInt24, paInt16, paInt8, and so on. In the example below, get_format_from_width() converts the sampwidth value 2 to paInt16.
- input: input stream flag; True opens an input stream
- output: output stream flag
The corresponding code:
    import wave
    import pyaudio
    import os

    # read the wav file
    filepath = "./data/"  # data directory
    filename = os.listdir(filepath)  # names of all files in the folder
    f = wave.open(filepath + filename[0], 'rb')
    params = f.getparams()
    nchannels, sampwidth, framerate, nframes = params[:4]
    # instantiate PyAudio
    p = pyaudio.PyAudio()
    # define stream chunk
    chunk = 1024
    # open the audio output stream
    stream = p.open(format=p.get_format_from_width(sampwidth),
                    channels=nchannels,
                    rate=framerate,
                    output=True)
    # write the audio to the sound card for playback, one chunk at a time
    while True:
        data = f.readframes(chunk)
        if data == b'':
            break
        stream.write(data)
    f.close()
    # stop stream
    stream.stop_stream()
    stream.close()
    # close PyAudio
    p.terminate()
Because this is Python 3.5, the b prefix in the test if data == b'': break cannot be omitted: readframes returns bytes, and a bytes object never compares equal to the str ''.
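The chunked-reading loop and the bytes comparison can be tried without a sound card. The sketch below builds a small WAV file in memory (the ten sample values and the chunk size of 4 are arbitrary assumptions) and reads it back chunk by chunk:

```python
import io
import struct
import wave

# Build a small in-memory WAV file (10 frames, mono, 16-bit) to read from.
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(struct.pack("<10h", *range(10)))
buf.seek(0)

chunk = 4          # frames per read (tiny on purpose)
sizes = []         # number of bytes returned by each read
with wave.open(buf, "rb") as f:
    while True:
        data = f.readframes(chunk)
        if data == b'':          # bytes literal: '' (str) would never match
            break
        sizes.append(len(data))

print(sizes)               # [8, 8, 4]
print(b'' == '')           # False
```

With the str literal '' the loop condition would never be true, so the loop would not terminate at the end of the file.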
5. Windowing the signal
A signal is usually truncated and split into frames, and each frame needs to be windowed: truncation causes energy leakage in the frequency domain, and a window function reduces the effect of truncation.
Window functions are in the scipy.signal signal-processing toolbox, for example the Hanning (Hann) window:
    import scipy.signal as signal
    window = signal.windows.hann(512)  # signal.hanning(512) in older SciPy releases
Using the above function, draw the Hanning window:
    import pylab as pl
    import scipy.signal as signal

    pl.figure(figsize=(6, 2))
    pl.plot(signal.windows.hann(512))  # signal.hanning(512) in older SciPy releases
    pl.show()
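For reference, the symmetric Hann window has a simple closed form, w[n] = 0.5 - 0.5*cos(2*pi*n/(N-1)). A minimal sketch that computes it with the standard library only (the helper name hann is chosen here for illustration):

```python
import math

def hann(n_points):
    """Symmetric Hann window: w[n] = 0.5 - 0.5*cos(2*pi*n/(N-1))."""
    if n_points == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

w = hann(5)
print([round(v, 3) for v in w])   # [0.0, 0.5, 1.0, 0.5, 0.0]
```

The window tapers to zero at both ends, which is what softens the abrupt edges introduced by truncation.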
6. Framing the signal
The theoretical basis of framing, where x is the speech signal and w is the window function: each frame is the product of x and a shifted copy of w.
Truncating with a window is similar to sampling. To keep adjacent frames from differing too much, consecutive frames usually overlap, with a frame shift smaller than the frame length; in effect this acts as interpolation smoothing.
[Figure: schematic of framing with frame shift]
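One possible way to frame a signal with overlap is sketched below. The function name split_frames, the frame length of 4 and the frame shift (hop) of 2 are assumptions for illustration, not values from the post; a Hann window is used and a trailing partial frame is simply dropped.

```python
import math

def split_frames(x, frame_len, hop):
    """Cut x into overlapping frames and apply a Hann window to each.

    frame_len and hop are in samples; a trailing partial frame is dropped.
    """
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        segment = x[start:start + frame_len]
        frames.append([s * w for s, w in zip(segment, window)])
    return frames

x = [1.0] * 10                      # stand-in for a speech signal
frames = split_frames(x, frame_len=4, hop=2)
print(len(frames))                  # 4 frames, starting at samples 0, 2, 4, 6
```

Because the hop is half the frame length, every sample away from the edges falls into two consecutive frames, which is the overlap described above.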
7. Spectrogram
Framing the signal, transforming each frame to the frequency domain, and taking the magnitude yields the spectrogram. If you only want to look at it, matplotlib.pyplot provides the specgram function:
    import wave
    import matplotlib.pyplot as plt
    import numpy as np
    import os

    filepath = "./data/"  # data directory
    filename = os.listdir(filepath)  # names of all files in the folder
    f = wave.open(filepath + filename[0], 'rb')
    params = f.getparams()
    nchannels, sampwidth, framerate, nframes = params[:4]
    strData = f.readframes(nframes)  # read the audio as bytes
    waveData = np.frombuffer(strData, dtype=np.int16)  # bytes -> int16 samples
    waveData = waveData * 1.0 / np.max(np.abs(waveData))  # normalize the amplitude
    waveData = np.reshape(waveData, [nframes, nchannels]).T  # one row per channel
    f.close()
    # plot the spectrogram of the first channel
    plt.specgram(waveData[0], Fs=framerate, scale_by_freq=True, sides='default')
    plt.ylabel('Frequency (Hz)')
    plt.xlabel('Time (s)')
    plt.show()
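The frame, transform and magnitude steps can also be written out by hand. The following is a rough sketch rather than a reimplementation of specgram, which applies its own defaults and scaling; the function name, the 1 kHz test tone, the frame length of 256 and the hop of 128 are all assumptions for illustration.

```python
import numpy as np

def magnitude_spectrogram(x, frame_len, hop):
    """Return an array of shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

fs = 8000                                  # sampling rate in Hz (assumed)
t = np.arange(fs) / fs                     # one second of samples
x = np.sin(2 * np.pi * 1000 * t)           # 1 kHz test tone
spec = magnitude_spectrogram(x, frame_len=256, hop=128)

peak_bin = int(np.argmax(spec[0]))         # strongest bin in the first frame
peak_hz = peak_bin * fs / 256              # bin index -> frequency in Hz
print(spec.shape, peak_hz)
```

Each row of spec is the magnitude spectrum of one windowed frame, so plotting the array with time on one axis and frequency on the other gives the spectrogram.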