Android MP3 recording implementation
Android's built-in recording supports the AMR and AAC formats, but neither works well across platforms. MP3 is clearly the best cross-platform choice. A recent project required exactly this, and the resulting code is hosted on GitHub: GavinCT/AndroidMP3Recorder. See the README.md for usage; it recommends downloading libs.zip directly and integrating it into your own project. (Tip: to download just that one zip file from GitHub, you can use the Chrome extension GitHub Mate.)

Implementation overview

Before analyzing the code, we need to settle a few questions.

1. How to generate MP3 files. The mature solution is LAME; on Android we need JNI to call LAME's C code to do the audio format conversion.
2. How to obtain the raw audio data. The AudioRecord class can give us the raw data directly.
3. How to convert. The code commonly found online records first and then converts to MP3, which is inefficient: the longer the recording, the longer the conversion, and the longer the user waits while the recording is saved. The Samsung Developers sample code also records first and converts afterwards, so that approach is clearly not what we want. What we need is conversion while recording, so that stopping and saving does not take long.

Recording

Since we are implementing recording, we need the AudioRecord class mentioned above. Let's start with its constructor:

public AudioRecord(int audioSource, int sampleRateInHz, int channelConfig, int audioFormat, int bufferSizeInBytes)

The constructor takes quite a few parameters:

- audioSource: the sound source; generally MediaRecorder.AudioSource.MIC, i.e. the microphone.
- sampleRateInHz: the sample rate. The official documentation states that only 44100 Hz is guaranteed to work on all devices.
Other rates such as 22050, 16000, and 11025 Hz work only on certain devices.
- channelConfig: either stereo (CHANNEL_IN_STEREO) or mono (CHANNEL_IN_MONO); only mono is guaranteed to work on all devices.
- audioFormat: the audio encoding, either ENCODING_PCM_16BIT or ENCODING_PCM_8BIT; again, officially only ENCODING_PCM_16BIT is guaranteed on all devices.
- bufferSizeInBytes: the size, in bytes, of the buffer that audio data is written to during recording.

From the explanations above we can see that although there are many parameters, to work on all devices we can simply fill in the common values and not worry about most of them; the only parameter that needs real thought is bufferSizeInBytes. Before going into what bufferSizeInBytes should be, let's set it aside for a moment and talk about reading and converting the recorded data.

Reading the recording is continuous, so we need to read in a loop, which means a dedicated thread for reading so we do not block the main thread. Much like UDP, if the data is not read in time and the buffer overflows, recording data is lost. As mentioned above, what we want is conversion while recording. Here a problem arises: if the reading thread also passes the data to LAME for MP3 encoding, and LAME's encoding time is unpredictable, could data be lost? Certainly it could, so we must not program by coincidence. We need another thread, a data-encoding thread, dedicated to MP3 encoding; the recording thread is then only responsible for reading the PCM data. With two threads, we have to decide: when should the encoding thread start processing data?
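The reader/encoder division of labor described above can be sketched in plain Java with a blocking queue. This is a simplified model, not the library's actual implementation (which is driven by AudioRecord's position-update listener); the Task class and runDemo method here are hypothetical names for illustration:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class TwoThreadSketch {

    /** Hypothetical task object: one chunk of PCM samples plus its valid length. */
    static class Task {
        final short[] data;
        final int readSize;
        Task(short[] data, int readSize) { this.data = data; this.readSize = readSize; }
    }

    /**
     * Simulates the reader/encoder pair: "reads" 3 chunks of 160 samples,
     * hands them to an encoder thread, and returns the total number of
     * samples the encoder saw.
     */
    static int runDemo() {
        final BlockingQueue<Task> queue = new LinkedBlockingQueue<>();
        final int[] total = {0};

        // Encoder thread: take() blocks until data arrives, so no polling loop.
        Thread encoder = new Thread(() -> {
            try {
                while (true) {
                    Task t = queue.take();
                    if (t.readSize < 0) break;   // "poison pill": end of recording
                    total[0] += t.readSize;      // stand-in for the LAME encode call
                }
            } catch (InterruptedException ignored) { }
        });
        encoder.start();

        try {
            // Reader role: hand over PCM chunks as they are "read".
            for (int i = 0; i < 3; i++) {
                queue.put(new Task(new short[160], 160));
            }
            queue.put(new Task(new short[0], -1));   // signal the encoder to stop
            encoder.join();                          // join() makes total[0] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints 480
    }
}
```

In the real library the encoder is woken by the recorder's notification rather than by a queue's blocking take, but the effect is the same: the encoding thread only runs when there is data, with no sleep-and-poll loop.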
The traditional approach is for the encoding thread to start processing whenever there is data: the thread repeatedly checks whether data is waiting; if so, it processes it, and if not, it sleeps for a few milliseconds (we could also skip the sleep, but that costs far too much CPU). This method is clearly inefficient, because however long we let the thread rest, the interval can be judged unreasonable: we simply do not know the exact timing. Is there another way? Obviously, the recording side knows when there is data to process and when to rest. "Don't call me, I'll call you": we should register a listener and let the recorder notify the encoding thread to start working. AudioRecord provides exactly such a method:

public int setPositionNotificationPeriod(int periodInFrames)

Added in API level 3. "Sets the period at which the listener is called, if set with setRecordPositionUpdateListener(OnRecordPositionUpdateListener) or setRecordPositionUpdateListener(OnRecordPositionUpdateListener, Handler). It is possible for notifications to be lost if the period is too small."

This sets the notification period, in units of frames. Here we can also come back to explain the value passed in for bufferSizeInBytes. AudioRecord actually provides a convenient method, getMinBufferSize, to obtain the minimum buffer size:

public static int getMinBufferSize(int sampleRateInHz, int channelConfig, int audioFormat)

Its three parameters are familiar from the constructor, so passing them in is no problem. But the key point is the notification period we set above: what if the buffer size we obtain is not an integer multiple of the period? If it is not an integer multiple, it will certainly cause data loss, just as we guessed. So we also need to correct the value to ensure that the buffer size is an integer multiple:

```java
mBufferSize = AudioRecord.getMinBufferSize(DEFAULT_SAMPLING_RATE,
        DEFAULT_CHANNEL_CONFIG, DEFAULT_AUDIO_FORMAT.getAudioFormat());
int bytesPerFrame = DEFAULT_AUDIO_FORMAT.getBytesPerFrame();
/* Round the frame count up to a multiple of FRAME_COUNT so the buffer
 * size divides evenly by the periodic notification unit. */
int frameSize = mBufferSize / bytesPerFrame;
if (frameSize % FRAME_COUNT != 0) {
    frameSize += (FRAME_COUNT - frameSize % FRAME_COUNT);
    mBufferSize = frameSize * bytesPerFrame;
}
```

With the data-reading thread and the encoding thread covered, let's take a closer look at the hero that actually does the MP3 encoding for us: LAME.

Obtaining and compiling LAME

Download LAME, then:

1. Extract libmp3lame into the jni directory, and copy lame.h (from the include directory) alongside it.
2. Create Android.mk:

```makefile
LOCAL_PATH := $(call my-dir)
include $(CLEAR_VARS)
LOCAL_MODULE := mp3lame
LOCAL_SRC_FILES := bitstream.c fft.c id3tag.c mpglib_interface.c presets.c \
    quantize.c reservoir.c tables.c util.c VbrTag.c encoder.c gain_analysis.c \
    lame.c newmdct.c psymodel.c quantize_pvt.c set_get.c takehiro.c \
    vbrquantize.c version.c
include $(BUILD_SHARED_LIBRARY)
```

3. Delete everything that is not a .c/.h file: the GNU autotools files, Makefile.am, Makefile.in, libmp3lame_vc8.vcproj, logoe.ico, depcomp, the i386 folder, and other unneeded files.
4. Edit jni/util.h and replace

extern ieee754_float32_t fast_log2(ieee754_float32_t x);

with

extern float fast_log2(float x);

If you forget this replacement, compilation fails with the following error:

```
[armeabi] Compile thumb  : mp3lame <= bitstream.c
In file included from jni/bitstream.c:36:0:
jni/util.h:574:5: error: unknown type name 'ieee754_float32_t'
jni/util.h:574:40: error: unknown type name 'ieee754_float32_t'
make.exe: *** [obj/local/armeabi/objs/mp3lame/bitstream.o] Error 1
```

5. Compile the library. A warning may be reported; it can be ignored.
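Returning for a moment to the buffer-size correction above: the rounding is plain integer arithmetic and can be checked independently of Android. The numeric values below are made up for illustration; the parameter names mirror the library's constants:

```java
public class BufferRounding {
    /**
     * Rounds a buffer size (in bytes) up so that its frame count becomes a
     * multiple of frameCount, matching the notification period.
     */
    static int roundUpBufferSize(int minBufferSize, int bytesPerFrame, int frameCount) {
        int frameSize = minBufferSize / bytesPerFrame;
        if (frameSize % frameCount != 0) {
            frameSize += frameCount - frameSize % frameCount;
        }
        return frameSize * bytesPerFrame;
    }

    public static void main(String[] args) {
        // Suppose getMinBufferSize() returned 3584 bytes. With 16-bit mono PCM
        // (2 bytes per frame) that is 1792 frames; a notification period of
        // 160 frames rounds it up to 1920 frames = 3840 bytes.
        System.out.println(roundUpBufferSize(3584, 2, 160)); // prints 3840
    }
}
```

When the minimum size is already a multiple of the period, the value passes through unchanged, so the correction never shrinks the buffer below the minimum the platform requires.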
LAME methods exposed to Java

init: initialization.
- inSamplerate: input sample rate in Hz
- inChannel: number of input channels
- outSamplerate: output sample rate in Hz
- outBitrate: encoded bit rate in kbps
- quality: MP3 quality, 0 to 9, where 0 is best (very slow) and 9 is worst. Recommended: 2 is near-best quality, not too slow; 5 is good quality, fast; 7 is OK quality, really fast.

```java
private static final int DEFAULT_LAME_MP3_QUALITY = 7;
/** Tied to DEFAULT_CHANNEL_CONFIG: since recording is mono, this is 1. */
private static final int DEFAULT_LAME_IN_CHANNEL = 1;
/** Encoded bit rate: the MP3 file will be encoded at 32 kbps. */
private static final int DEFAULT_LAME_MP3_BIT_RATE = 32;

/*
 * Initialize the LAME buffer.
 * The MP3 sample rate is the same as the recorded PCM sample rate;
 * the bit rate is 32 kbps.
 */
LameUtil.init(DEFAULT_SAMPLING_RATE, DEFAULT_LAME_IN_CHANNEL,
        DEFAULT_SAMPLING_RATE, DEFAULT_LAME_MP3_BIT_RATE,
        DEFAULT_LAME_MP3_QUALITY);
```

encode:
- bufferLeft: left-channel data
- bufferRight: right-channel data
- samples: input data size per channel
- mp3buf: buffer receiving the converted data; its size should be 7200 + (1.25 * buffer_l.length)

A few things here need explaining:

```java
Task task = mTasks.remove(0);
short[] buffer = task.getData();
int readSize = task.getReadSize();
int encodedSize = LameUtil.encode(buffer, buffer, readSize, mMp3Buffer);
```

- Left and right channel: we record in mono, so the same buffer is passed in on both sides.
- Input data size: the recording thread does not necessarily fill its buffer on every read, so the read method returns the current size; only the first readSize entries are valid audio data, and what follows is leftover data from earlier reads. This size must also be passed to the LAME encoder.
- MP3 buffer: the formula for the MP3 buffer size is 7200 + (1.25 * buffer_l.length), as documented in lame.h.

flush: writes the MP3 end-of-stream information into the buffer. Its parameter mp3buf must be at least 7200 bytes.
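The mp3buf sizing rule quoted from lame.h is easy to check with a small helper. This is only an illustration of the 7200 + 1.25 * samples formula, not part of the library:

```java
public class Mp3BufferSize {
    /**
     * Worst-case MP3 output buffer size for a given number of PCM input
     * samples, per the formula documented in lame.h: 7200 + 1.25 * samples.
     */
    static int mp3BufSize(int pcmSamples) {
        return (int) (7200 + 1.25 * pcmSamples);
    }

    public static void main(String[] args) {
        // For a 1920-sample PCM buffer: 7200 + 2400 = 9600 bytes.
        System.out.println(mp3BufSize(1920)); // prints 9600
    }
}
```

Sizing the buffer once at this worst case lets the same array be reused for every encode and flush call, which is exactly why the flush step can reuse the buffer (the constant 7200 alone covers flush's minimum).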
Here we still pass in the previously defined mp3buf, to avoid creating too many arrays.

close: releases LAME.

With that, the core conversion code is complete. Now for some icing on the cake. When recording we usually also want the volume, so that an animation can be driven by it and make recording feel more lively. Of course, this library provides that too. How is the volume calculated? I referred to Samsung's volume calculation:

```java
/**
 * This calculation method comes from the Samsung development example.
 *
 * @param buffer   PCM data read from AudioRecord
 * @param readSize number of valid samples in the buffer
 */
private void calculateRealVolume(short[] buffer, int readSize) {
    int sum = 0;
    for (int i = 0; i < readSize; i++) {
        sum += buffer[i] * buffer[i];
    }
    if (readSize > 0) {
        double amplitude = sum / readSize;
        mVolume = (int) Math.sqrt(amplitude);
    }
}
```

As for the maximum volume: Samsung's code uses 4000, but in actual tests I found that the volume calculated by this formula is generally below 1500, so in my recording library I set the maximum volume to 2000. Comments and suggestions are welcome.
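The formula above is simply the root mean square of the samples. A standalone plain-Java version, testable without Android (the mVolume field is dropped in favor of a return value, and a long accumulator is added to avoid overflow on large buffers):

```java
public class VolumeCalc {
    /**
     * Root-mean-square amplitude of the first readSize samples, following
     * the Samsung-derived formula quoted above.
     */
    static int calculateRealVolume(short[] buffer, int readSize) {
        long sum = 0;                            // long: sum of squares can exceed int range
        for (int i = 0; i < readSize; i++) {
            sum += (long) buffer[i] * buffer[i]; // square each sample
        }
        if (readSize > 0) {
            return (int) Math.sqrt((double) sum / readSize);
        }
        return 0;
    }

    public static void main(String[] args) {
        // A square wave of amplitude 100 has an RMS of exactly 100.
        short[] samples = {100, -100, 100, -100};
        System.out.println(calculateRealVolume(samples, samples.length)); // prints 100
    }
}
```

Note that only the first readSize entries are used, matching the encode step: anything beyond readSize in the buffer is stale data from an earlier read.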