MyVoix2.0.js source code analysis: Web Speech and Web Audio
Prologue
With the advent of the mobile Internet era, all kinds of mobile devices have entered our lives. Whether it is the phone in everyone's pocket, the smart wristbands beloved by night runners, or Google Glass with its air of science fiction, these devices are gradually changing our living habits and the way we interact with machines. Touch screens have replaced physical buttons, Siri has begun to free our hands, and hardware such as Leap Motion lets us control devices with gestures without ever touching them. Against this backdrop, front-end interaction will draw on more and more disciplines. Just like those who watched CSS be born more than a decade ago, we are witnessing a revolution that will drive the whole industry, and society, forward.
Put down IE6, just as you once put down table layouts
If your daily work still means strict compatibility with IE6, it is time to look up at the scenery around you. HTML5 is far more than gradients and rounded corners. In this post I will introduce two sets of APIs, Web Speech and Web Audio, which form the core of the MyVoix.js framework. Both have been implemented in earlier versions of Chrome and can be called through the webkit prefix.
Web Speech
The Web Speech API is published by the Speech API Community Group. Its main job is to convert voice input into text; add a server for semantic analysis and you essentially have a Siri of your own. You can also think of it simply as a programmable voice input box (see the code below).
<input type="text" x-webkit-speech lang="zh-CN" />
Back when Siri had just been born, the line above was a secret weapon for front-end developers: the boss never knew that a single attribute was all it took to turn an input into a voice input box (if you ever quietly added x-webkit-speech to an input, give this post a like). However, a single input tag obviously cannot satisfy a programmer's burning desire to control things in code, and so Web Speech came into being.
Back to the code. To use the Web Speech API, we first need to create a window.webkitSpeechRecognition object.
var _rec = new window.webkitSpeechRecognition();
_rec.continuous = true;
_rec.interimResults = false;
_rec.lang = 'en-US';
_rec.maxAlternatives = 1;
To make things easier to read, I have slightly reformatted the MyVoix source code here. As you can see, several parameters need to be configured after the SpeechRecognition instance is created.
continuous: if set to false, the recognizer returns as soon as it starts whenever there is no input or the input cannot be recognized. Here we want it to keep listening until a meaningful input arrives, so it is set to true.
interimResults: if set to true, interim words produced during analysis are continuously returned through onresult. We only need the final result here, so it is set to false.
lang: the recognition language; this one should need no explanation.
maxAlternatives: sets the maximum number of SpeechRecognitionAlternative entries returned for each result (a short sketch of reading these alternatives follows this list).
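When maxAlternatives is raised above 1, each final result carries several alternatives that can be ranked by confidence. As a minimal sketch (not taken from the MyVoix source), reading them could look like this:

// Illustrative sketch, not from MyVoix: inspecting alternatives
// when maxAlternatives is greater than 1.
var rec = new window.webkitSpeechRecognition();
rec.maxAlternatives = 3;
rec.onresult = function (eve) {
    var result = eve.results[eve.results.length - 1];
    if (result.isFinal) {
        // Each SpeechRecognitionResult is indexable; every alternative
        // exposes a transcript and a confidence score between 0 and 1.
        for (var j = 0; j < result.length; j += 1) {
            console.log(result[j].transcript, result[j].confidence);
        }
    }
};
rec.start();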
Next we call the instance's start method to turn on SpeechRecognition listening. Before doing so, however, we also need to assign the instance's onresult handler; once the speech analysis is complete, the recognized words are passed into this handler.
_rec.onresult = function (eve) {
    var len = eve.results.length,
        i = eve.resultIndex,
        j = 0,
        listeners,
        command;
    for (i; i < len; i += 1) {
        if (eve.results[i].isFinal) {
            // get words
            command = eve.results[i][0].transcript.replace(/^\s+|\s+$/g, '').toLowerCase();
            if (console.log) {
                console.log(eve.results[i][0].transcript);
            }
            // your code here....
        }
    }
};
_rec.start();
In MyVoix, binding handlers to recognized words has its own architecture, which will be covered in detail in later posts.
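As a rough illustration only, a word-to-callback registry could look like the hypothetical sketch below; this is an assumption for explanation's sake, not MyVoix's actual architecture.

// Hypothetical sketch of a word-to-callback registry; MyVoix's real
// word-event architecture is covered in later posts.
var wordListeners = {};

function bindWord(word, callback) {
    word = word.toLowerCase();
    (wordListeners[word] = wordListeners[word] || []).push(callback);
}

function fireWord(command) {
    (wordListeners[command] || []).forEach(function (fn) {
        fn(command);
    });
}

// Inside onresult, after extracting `command`:
//     fireWord(command);
bindWord('hello', function () {
    console.log('hello was recognized');
});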
Web Audio
Once speech recognition is in place, we need the microphone's input signal in order to implement MyVoix's waveform drawing. If you have ever called the camera from JavaScript, you will have used navigator.webkitGetUserMedia; in Web Audio we use it to obtain the microphone's audio source data. Let's look at the source code in MyVoix:
navigator.webkitGetUserMedia({audio: true}, function (e) {
    var context = new webkitAudioContext(),
        javascriptNode = context.createScriptProcessor(2048, 1, 1),
        audioInput = context.createMediaStreamSource(e),
        analyser = context.createAnalyser(),
        splitter = context.createChannelSplitter();
    analyser.smoothingTimeConstant = 0.3;
    analyser.fftSize = 1024;
    audioInput.connect(splitter);
    splitter.connect(analyser, 0, 0);
    analyser.connect(javascriptNode);
    javascriptNode.connect(context.destination);

    javascriptNode.onaudioprocess = function (e) {
        var array = new Uint8Array(analyser.frequencyBinCount);
        analyser.getByteFrequencyData(array);
        var average = me.getAverageVolume(e.inputBuffer.getChannelData(0));
        if (average > 0) {
            me.changeNoise(average);
            me.changeFrequence(average);
        }
    }
}, function () {});
At first glance this looks a bit like WebGL: you have to wire a bunch of things together. Let's break the code down:
navigator.webkitGetUserMedia({audio: true}, function (e) {
    // success callback
    // ...
}, function () {
    // error callback
    // ...
});
The first step is to call webkitGetUserMedia to access the local microphone; the main logic lives in the success callback.
var context = new webkitAudioContext(),
    audioInput = context.createMediaStreamSource(e);
Next we create a webkitAudioContext instance, through which many useful nodes can be created. The createMediaStreamSource method, together with the stream passed into the getUserMedia success callback, creates an input source node. By connecting node to node, the signal eventually reaches the output, context.destination. Let's look at the nodes used in MyVoix:
analyser: used to analyze the audio source; it is typically used for sound visualization.
splitter: used for channel routing. In MyVoix we use it to split the audio source into its left and right channels.
javascriptNode: used for JavaScript-level processing. Every time a block of samples has been collected, its onaudioprocess function fires; there the waveform drawing routines are called, and the node is connected to the output end. (A sketch of an average-volume helper such as the getAverageVolume used above follows this list.)
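The getAverageVolume, changeNoise and changeFrequence calls in the code above are MyVoix's own methods and are not listed in this post. Purely to illustrate what an average-volume helper might look like, here is a minimal sketch under that assumption (it is not the MyVoix source):

// Hypothetical average-volume helper; MyVoix's actual getAverageVolume
// implementation is not shown in this post.
function getAverageVolume(channelData) {
    var sum = 0;
    for (var i = 0; i < channelData.length; i += 1) {
        // channelData values are floats in the range [-1, 1]
        sum += Math.abs(channelData[i]);
    }
    return sum / channelData.length;
}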
MyVoix 2.0 only uses the nodes created from the AudioContext above. By adding other nodes in a similar way, Web Audio can also do things such as positioning and playing sound in 3D space.
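As a taste of that, the following minimal sketch (not part of MyVoix) plays a short test tone through a PannerNode placed one unit to the listener's right; the unprefixed AudioContext is used where the webkit prefix is not available:

// Minimal sketch, not from MyVoix: positioning a sound in 3D space
// with a PannerNode.
var AudioCtx = window.AudioContext || window.webkitAudioContext,
    ctx = new AudioCtx(),
    osc = ctx.createOscillator(),
    panner = ctx.createPanner();

osc.frequency.value = 440;      // a 440 Hz test tone
panner.setPosition(1, 0, 0);    // one unit to the listener's right

osc.connect(panner);
panner.connect(ctx.destination);

osc.start(0);
osc.stop(ctx.currentTime + 2);  // stop after two seconds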
Epilogue
This post has introduced the two core technologies used in MyVoix. I hope it gives you a general idea of how speech technology can be implemented in HTML. Those of us writing here in the blog garden do not need to fight with the pen the way Mr. Lu Xun did, but we do look forward to technology pushing this era forward.
MyVoix source code address
Please credit the source when reposting: http://www.cnblogs.com/Arthus/p/3889457.html