I recently looked into how to get WebRTC voice running on iOS. WebRTC's voice_engine already implements iOS-specific classes, but putting it to use in a real application raised a series of problems. After several days of hard work we finally solved them one by one and got recording and local-loopback playback working in the simulator.
While building the test program, we planned to use the libjingle library as the outer wrapper around WebRTC, to handle the voice-transport control interfaces and leave room for future expansion. In the end it turned out that libjingle did not really seem to have iOS in mind, and so the first step of a long journey began.
1. Analyze libjingle and extract the classes needed to compile an iOS version
This part of the work is not particularly difficult: pick out the classes that are actually used and compile them into a static library. Fortunately, the platform-dependent code is well separated. Still, I ran into a few problems along the way. They are not related to voice (voice_engine) but to video; I wanted to get some of the video code compiling first anyway, to make later video work easier.
First, libjingle does not implement the internal video capture (include_internal_video_capture) and rendering (include_internal_video_render) classes for iOS. Fine, block them: turn the corresponding switches off in the gyp configuration, roughly as sketched below.
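A minimal sketch of what that looks like in a gyp include file. The file name and surrounding structure depend on the WebRTC revision being built, so treat this as illustrative rather than the actual diff; only the two variable names come from the build system itself:

```
# hypothetical supplement.gypi fragment: keep the internal iOS video
# capture/render code out of the build, since there is no iOS backend for it
{
  'variables': {
    'include_internal_video_capture': 0,
    'include_internal_video_render': 0,
  },
}
```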
The second problem is yasm, the tool used to build the assembly parts. The yasm project generated by gyp cannot be compiled. Why? Because iOS does not support building a target that is a command-line program. But yasm is needed to build libjpeg-turbo and libvpx, so it cannot simply be dropped. The only option seemed to be a detour. After some research we found prebuilt libjpeg-turbo binaries online (for both the iOS simulator and the device), so we downloaded them, added an OS check to the gyp configuration, and, when the target OS is iOS, removed the original libjpeg-turbo target and referenced the prebuilt library instead, roughly as shown below.
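Roughly what the gyp condition looks like; the dependency name and library path here are made up for illustration and will differ in a real tree:

```
# hypothetical fragment: when generating an iOS project, drop the in-tree
# libjpeg-turbo target (which needs yasm) and link a prebuilt static library
{
  'conditions': [
    ['OS=="ios"', {
      'dependencies!': [ '<(libjpeg_gyp_path):libjpeg' ],
      'link_settings': {
        'libraries': [ 'third_party/prebuilt/ios/libjpeg-turbo.a' ],
      },
    }],
  ],
}
```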
The third problem is libvpx, which I have not completely solved. It also needs yasm to build, and the project gyp generates for it cannot be compiled. After digging further, we found that in the Mac project generated by gyp, the static library built for libvpx targets the i386 (x86) architecture. Without much to go on, I guessed: could that i386 build be used in the iOS simulator? Reference it in, and the compilation succeeded. Whether it actually runs, I do not know. In any case I only use audio and no video for now; the question left open is whether such a static library can run on an actual iPhone.
With that, the test program is finally written, compiled, and ready to test.
2. A painful test path
Why is there no sound? How can there be no sound? Check everything carefully from start to end: build with _DEBUG defined, register a trace callback with WebRTC's voice_engine (roughly as sketched below), and print all the log output. Everything looks right: every function call succeeds, we can see the captured voice data being sent out through the libjingle interface, and the voice data received from the interface is handed to voice_engine correctly. But there is no sound!
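For reference, registering a trace callback with the voice engine of that era looked roughly like the sketch below. The header paths and exact method signatures varied between WebRTC revisions, so take this as an approximation rather than the exact code used:

```cpp
// Sketch only: names follow the old VoiceEngine API and may differ by revision.
#include <cstdio>
#include "common_types.h"  // webrtc::TraceCallback, webrtc::TraceLevel (assumed path)
#include "voe_base.h"      // webrtc::VoiceEngine (assumed path)

class StdoutTrace : public webrtc::TraceCallback {
 public:
  // Called by the engine for every trace line it emits.
  virtual void Print(webrtc::TraceLevel level, const char* message, int length) {
    std::printf("[voe %d] %.*s\n", static_cast<int>(level), length, message);
  }
};

static StdoutTrace g_trace;

void EnableVoiceEngineTracing() {
  webrtc::VoiceEngine::SetTraceFilter(webrtc::kTraceAll);  // log everything
  webrtc::VoiceEngine::SetTraceCallback(&g_trace);
}
```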
It feels like hitting a bottleneck; the work is stuck.
Three days later, the problem persists.
The data is being captured and it is being handed to the engine, so why? Is the problem in the test program, in voice_engine, or in libjingle?
Work backwards: is there a problem with the last step, playback? Find the place where the test program receives the voice data and set a breakpoint there. Why is the voice data always 0? All of it? Is there a problem with transmission? Never mind, overwrite the data by hand with random numbers (see the sketch below) and test again. Finally there is sound. Noise, nothing but noise, but it is sound after all, tears... Noise, you have strengthened my determination to keep going.
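The hack amounts to something like the following: fill the received PCM buffer with random 16-bit samples just to prove the playback side works. The hook name and buffer layout are hypothetical, not WebRTC's:

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical receive hook: overwrite whatever arrived with random samples.
// If noise comes out of the speaker, the playback path is fine and the
// silence must originate on the capture/transport side.
void OnVoiceDataReceived(int16_t* samples, size_t num_samples) {
  for (size_t i = 0; i < num_samples; ++i) {
    samples[i] = static_cast<int16_t>((std::rand() % 65536) - 32768);
  }
}
```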
So the problem seems to be on the capture side.
Nothing for it but the most primitive method: read and analyze the code. Find the iOS voice capture code (audio_device_ios.cc) and read it carefully.
From beginning to end it looks fine. Then why is the captured data all zeros?
It's really painful. Initialization, buffer management, capture, and sending of voice all look correct, right?! Code checked out from Google's SVN should have been tested; surely there is no problem there? In the spirit of doubting everything, start verifying carefully.
Download the AudioUnit sample (aurioTouch) from the Apple developer site, compile it, and run it. Yes, it records and it plays. Good. Compare it with the WebRTC code line by line: initialization, capture, and buffer management are much the same; the sample rate and a few other parameters differ, but that should not matter, and it turns out it indeed does not.
The componentSubType field of the AudioComponentDescription is a bit different, though: the sample sets kAudioUnitSubType_RemoteIO, while WebRTC uses kAudioUnitSubType_VoiceProcessingIO. Could that be the problem? It turns out the problem is exactly here! Change the WebRTC parameter to kAudioUnitSubType_RemoteIO, and... I finally heard the local loopback. Happy! (A minimal sketch of the change is below.)
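For context, this is the description in question; a minimal sketch of creating the I/O unit with the RemoteIO subtype, with variable and function names of my own rather than WebRTC's:

```cpp
#include <AudioUnit/AudioUnit.h>

// Minimal sketch: create the I/O audio unit with the RemoteIO subtype.
// WebRTC's audio_device_ios.cc asked for kAudioUnitSubType_VoiceProcessingIO;
// switching to kAudioUnitSubType_RemoteIO is the one-line difference that made
// capture deliver real samples in this experiment.
AudioUnit CreateRemoteIOUnit() {
  AudioComponentDescription desc = {0};
  desc.componentType         = kAudioUnitType_Output;
  desc.componentSubType      = kAudioUnitSubType_RemoteIO;  // was VoiceProcessingIO
  desc.componentManufacturer = kAudioUnitManufacturer_Apple;

  AudioComponent comp = AudioComponentFindNext(NULL, &desc);
  AudioUnit io_unit = NULL;
  OSStatus status = AudioComponentInstanceNew(comp, &io_unit);
  (void)status;  // should be noErr; enabling input on bus 1 and installing the
                 // render callbacks follow, as in aurioTouch and audio_device_ios.cc
  return io_unit;
}
```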
What exactly is the difference between these two subtypes? When should each be used? Why does WebRTC use kAudioUnitSubType_VoiceProcessingIO? I don't know!
Can someone answer this question?