Android 4.1 Audio System Changes

Android 4.1 is abbreviated JB, for Jelly Bean. (To Chinese eyes, the letters JB also carry a cruder meaning. Google revs Android so frequently that it has finally shipped a version whose name invites the joke, so from now on my articles will use JB for the version number; whether a given "JB" also expresses what Chairman Mao called "strategic contempt" you will have to judge from context.) Today I will give a brief introduction to the earth-shaking changes JB 4.1 makes to the audio system.

A few words up front. Just as people born in the 80s complain about arriving a few years too late, many coders will immediately complain that they came to Android too late, because the JB audio system is harder than 4.0, 2.3 and 2.2. One example: the 4.1 audio system has a class called NBAIO. Don't let the basketball reflex fool you into reading NBA; it stands for non-blocking audio I/O. Ask yourself: how many people have a deep understanding of non-blocking I/O? It is next to impossible to understand the JB audio system without following how it evolved, so those who have never looked at the audio system's history should now carefully study (previously I only recommended "take a look"; the requirement has been upgraded to "carefully study") the audio chapters of "An In-Depth Understanding of Android, Volume I". BTW, that book specifically reminds readers to study the various I/O models; I wonder how many people paid attention.

This article is written in several parts with no draft beforehand, so it will be a bit messy. Let's start with the Java-layer AudioTrack class.

I. AudioTrack Java class changes
  • Channel count: where there used to be only mono and stereo, output now extends to eight channels (7.1 hi-fi, ah). The parameter name is CHANNEL_OUT_7POINT1_SURROUND; my jaw dropped when I first saw it, and for a good while I had no idea what it meant. Note that the final output is still two channels: when more than two channels are used, a downmixer processes them (downmix conversion; interested readers can search the term, and a toy sketch follows this list).
  • There are other changes, but nothing big; I only pick out the eye-catching ones here. Rest assured, I will not subject you to any close-ups you did not ask for.
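To make downmixing concrete, here is a minimal sketch of a 7.1-to-stereo fold-down. This is not AudioFlinger's actual downmixer (that lives in the native mixer/effect code); the interleaved channel order and the gains here are my assumptions, chosen only to show the idea.

    #include <cstddef>

    // Toy 7.1 -> stereo downmix. Assumed interleaved channel order:
    // L, R, C, LFE, BL, BR, SL, SR. Gains are illustrative, not Android's.
    void downmix71ToStereo(const float* in, float* out, size_t frames) {
        for (size_t i = 0; i < frames; i++) {
            const float* f = in + i * 8;
            // Fold center and the four surround channels into left/right.
            float l = f[0] + 0.707f * f[2] + 0.5f * (f[4] + f[6]);
            float r = f[1] + 0.707f * f[2] + 0.5f * (f[5] + f[7]);
            out[2 * i]     = 0.5f * l;  // crude attenuation to avoid clipping
            out[2 * i + 1] = 0.5f * r;
        }
    }

A real implementation additionally has to decide what to do with the LFE channel, handle fixed-point formats, and clamp properly.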
II. AudioTrack JNI-layer and native changes

This section covers both the JNI layer and the native AudioTrack itself.
  • The JNI layer does not change much.
  • The core native audio code has moved to frameworks/av. Yes, you read that right: av. This is one of the big changes in JB audio: all of the core native audio code now lives under the frameworks/av directory.
  • AudioTrack adds a variable to control the scheduling priority of the process using it (it is indeed the nice value that is set here; a previous article of mine got this wrong). If playback is in progress, the process's scheduling priority is set to ANDROID_PRIORITY_AUDIO (a sketch of what this amounts to follows this list). A few words on this: on a single-core CPU, setting such a priority is reckless. ANDROID_PRIORITY_AUDIO is -16, an extremely high priority; on a single core, a monster that high leaves you wondering how any other app gets to run. (If you don't follow what I'm saying, read this article first: http://blog.csdn.net/innost/article/details/6940136.) But dual-core and quad-core machines are now quite common, so there is real room to play with scheduling here. The true test for us ordinary (diaosi) coders: multi-core parallel programming and Linux OS principles must be mastered; audio will no longer let itself be pushed around so easily. Also, please do not port 4.1 to low-end phones; this really is not something low-end hardware can play.
  • AudioTrack has been promoted to a parent class. JB defines a somewhat mysterious TimedAudioTrack subclass for it, used in the aah_rtp codec directory (I don't know what AAH stands for; it may be short for Android@Home). From the annotations, this class is an audio output interface with timestamps (and with timestamps, you can synchronize). A detailed understanding requires analyzing the concrete usage scenarios (mainly RTP). Students working on codecs and encoding/decoding should hurry up and dig in!
  • Another change with far-reaching consequences: audio now defines several output flags (see the audio_output_flags_t enumeration in audio.h). According to the comments, this value serves two purposes. One is that callers can use it to specify what kind of output they want; the other is that device manufacturers can use it to declare the outputs they support (it appears the values are read from configuration during device initialization). From the enumeration definition alone, though, I still cannot see the relationship with concrete hardware. The defined values are:
    typedef enum {
        AUDIO_OUTPUT_FLAG_NONE = 0x0,        // no attributes
        AUDIO_OUTPUT_FLAG_DIRECT = 0x1,      // this output directly connects a track
                                             // to one output stream: no software mixer
        AUDIO_OUTPUT_FLAG_PRIMARY = 0x2,     // this output is the primary output of
                                             // the device. It is unique and must be
                                             // present. It is opened by default and
                                             // receives routing, audio mode and volume
                                             // controls related to voice calls.
        AUDIO_OUTPUT_FLAG_FAST = 0x4,        // output supports "fast tracks",
                                             // defined elsewhere
                                             // (what is a fast track? hard to say yet!)
        AUDIO_OUTPUT_FLAG_DEEP_BUFFER = 0x8  // use deep audio buffers
                                             // (and what is a deep buffer? this mosaic
                                             // is too big; I can't see through it yet!)
    } audio_output_flags_t;

Currently, the Java-layer AudioTrack only uses the first flag.
  • Other changes in AudioTrack are not significant. AudioTrack.cpp is only about 1600 lines in total, so easy!
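For readers curious what the priority bump mentioned above boils down to, here is a minimal Linux sketch. It is an assumption-laden illustration: the real code path goes through Android's libutils helpers rather than a raw setpriority call, and the constant's value (-16) is simply the figure quoted above.

    #include <sys/resource.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    // Value quoted in the article for ANDROID_PRIORITY_AUDIO.
    static const int kAudioPriority = -16;

    // Raise the calling thread to audio priority (illustration only).
    void raiseToAudioPriority() {
        pid_t tid = static_cast<pid_t>(syscall(SYS_gettid)); // this thread's id
        // On Linux, PRIO_PROCESS with a thread id adjusts that thread's nice value.
        setpriority(PRIO_PROCESS, tid, kAudioPriority);
    }

A nice value of -16 outranks nearly everything else on the system, which is exactly why it is dangerous on a single core.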
All right, there are several mosaics above; fine for watching certain Japanese films, but no good for analyzing audio. Let's pin our hopes of de-mosaicing on the AudioFlinger analysis next!

III. AudioFlinger changes

We will walk through the changes following AF's main workflow:
  • AF creation, including its onFirstRef function
  • The openOutput function and creation of the MixerThread object
  • AudioTrack calling the createTrack function
  • AudioTrack calling the start function
  • AF mixing and output
3.1 AF creation and onFirstRef

Well, no big changes here. Three points:
  • Volume control of the primary device is now more fine-grained. For example, some devices can set a master volume and some cannot, so a master_volume_support enumeration is defined (AudioFlinger.h) to describe the volume-control capability of the primary device.
  • The standby time used during playback (for power saving) was previously hard-coded; it can now be controlled through the property ro.audio.flinger_standbytime_ms, with a default of 3 seconds when the property is absent (a sketch of this lookup follows this list). AF also adds other control variables. For example, a gScreenState variable indicates whether the screen is on or off; it can be changed through AudioSystem::setParameters. An mBtNrecIsOff variable related to Bluetooth SCO is also defined; it controls disabling the AEC and NS audio effects when a Bluetooth headset does its own NREC (noise reduction and echo cancellation, a Bluetooth term used during recording; look it up if it is new to you). See AudioParameter.cpp.
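The property lookup presumably looks something like the following sketch, using the cutils property API. This is my reconstruction, not AF's exact code, and it only builds inside an Android source tree:

    #include <cutils/properties.h>
    #include <stdlib.h>

    static const int kDefaultStandbyMs = 3000;  // article: default is 3 seconds

    int getStandbyTimeMs() {
        char value[PROPERTY_VALUE_MAX];
        // ro.audio.flinger_standbytime_ms overrides the default when present.
        if (property_get("ro.audio.flinger_standbytime_ms", value, NULL) > 0) {
            int ms = atoi(value);
            if (ms > 0) return ms;
        }
        return kDefaultStandbyMs;
    }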
3.2 The openOutput function

The openOutput function is critical; our old friends MixerThread and AudioStreamOutput both appear in it. The overall flow includes loading the audio HAL shared libraries; that part appeared in 4.0 and has not changed much. The old friends themselves, however, have changed dramatically. Let's first look at the MixerThread family (Figure 1: the PlaybackThread family). A few notes on Figure 1:
  • ThreadBase derives from Thread, so it runs in its own thread (programmers who do not understand multithreaded programming must study it carefully). It defines an enumeration type_t to identify the subclass type, including MIXER, DIRECT, RECORD, DUPLICATING and so on. This should be easy to understand.
  • ThreadBase's inner class TrackBase derives from ExtendedAudioBufferProvider, which appears to be newly added. You can think of TrackBase as a buffer container.
  • ThreadBase's inner class PMDeathRecipient listens for the death of PowerManagerService. This design puzzles me: PMS runs inside system_server, so PMS only dies if system_server crashes, and when system_server crashes the init.rc rules cause mediaserver to be killed too, taking AudioFlinger with it. Since everyone dies together, and quickly, what exactly is the point of PMDeathRecipient?
Now consider PlaybackThread, an important subclass of ThreadBase. This is one big class:
  • It defines an enumeration mixer_state to reflect the current mixing status, with values such as MIXER_IDLE, MIXER_READY and MIXER_ENABLED.
  • Several virtual functions are defined for subclasses to implement, including threadLoop_mix and prepareTracks_l. The abstract framework still holds up; it is the sheer scale of the changes elsewhere that is hard to fence in.
  • The Track class now also derives from VolumeProvider, which is used for volume control. As noted earlier, volume management in JB is more fine-grained than before.
  • TimedTrack is added. Its role relates to the RTP/AAH material mentioned above. Once you finish this article, you will be equipped to launch your own campaign of research on it!
Now see Figure 2 (MixerThread and its brethren). Brief notes on Figure 2:
  • MixerThread derives from PlaybackThread. This relationship has never changed, and I believe it never will.
  • The biggest changes in MT are several important member variables. You certainly know AudioMixer, which performs the mixing.
  • A Soaker object is added (enabled by a compile-time macro); it is a thread. For the root of this word, soak, the most fitting definition in Webster's (those who survived the GRE years know what Webster's is) is "to cause to pay an exorbitant amount". Why? Look at the code: Soaker turns out to be a thread dedicated to punishing the CPU. It grinds through computations nonstop to drive CPU usage up; its existence is presumably to test the efficiency of the new AF framework on multi-core CPUs. Yet another reason to stop trying to run JB on low-end smartphones.
  • More proof that low-end devices cannot play JB: a FastMixer has been added to MT, and it too is a thread. See what this means? In JB, on a multi-core device, mixing can be done in the FastMixer thread, with correspondingly better speed and efficiency.
  • The FastMixer workflow is complicated and involves multi-thread synchronization, so a FastMixerStateQueue is defined, which is just typedef StateQueue<FastMixerState>. First, it is a StateQueue (think of it, crudely, as an array); its element type is FastMixerState. One StateQueue holds four FastMixerState members through its mStates variable. (A toy sketch of the state-publication idea follows this list.)
  • FastMixerState resembles a state machine, with an enum Command controlling the state. FastMixerState contains an eight-element FastTracks array; FastTrack is the unit of work FastMixer operates on.
  • Each FastTrack has an mBufferProvider member of type SourceAudioBufferProvider.
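To make the StateQueue idea concrete, here is a deliberately naive sketch of single-writer/single-reader state publication. All names here are hypothetical: the real StateQueue in frameworks/av recycles a small fixed pool of states (four, per the above) and uses an acknowledgement protocol so the writer never overwrites a state the reader is still using, which this toy version does not attempt.

    #include <atomic>

    struct ToyState {
        int command;  // stands in for FastMixerState's Command
    };

    // Naive single-writer/single-reader state hand-off. The writer
    // (MixerThread's role) fills a slot and publishes it atomically; the
    // reader (FastMixer's role) polls for the latest published state.
    class ToyStateQueue {
    public:
        void push(const ToyState& s) {
            int next = 1 - mCurrent;      // alternate between two slots
            mSlots[next] = s;
            mCurrent = next;
            mShared.store(&mSlots[next], std::memory_order_release);
        }
        const ToyState* poll() {          // never blocks
            return mShared.load(std::memory_order_acquire);
        }
    private:
        ToyState mSlots[2] {};
        int mCurrent = 0;
        std::atomic<const ToyState*> mShared { nullptr };
    };

With only two slots the writer can clobber a state mid-read; avoiding that without ever blocking is exactly the problem the real four-slot StateQueue solves.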
The content above is already complex. Next, other things encountered while creating the MixerThread object.

3.3 Creating the MixerThread

From Figures 1 and 2 we now have a picture of AF's main members. Unfortunately, MixerThread also has an mOutputSink member. Didn't spot it? It is closely tied to the NBAIO (non-blocking audio I/O) mentioned earlier. NBAIO exists to support non-blocking audio input and output operations. Here is the class's own annotation:

    // This header file has the abstract interfaces only. Concrete implementation
    // classes are declared elsewhere. Implementations _should_ be non-blocking
    // for all methods, especially read() and write(), but this is not enforced.
    // In general, implementations do not need to be multi-thread safe, and any
    // exceptions are noted in the particular implementation.

So NBAIO only defines interfaces; concrete implementation classes are required elsewhere. It asks that read/write be non-blocking, but whether an implementation actually blocks is up to the implementer. Personally, I feel this part of the framework is not yet fully mature, but the introduction of non-blocking I/O deserves everyone's attention; it is relatively difficult material. Figure 3 shows some NBAIO content (Figure 3: NBAIO-related classes). Notes on Figure 3:
  • NBAIO centers on three main classes. The first is NBAIO_Port, which represents an I/O endpoint. It defines a negotiate function for parameter negotiation between the caller and the endpoint. Note: negotiation, not setting. I/O endpoints are often tied to hardware, and hardware parameters cannot be changed at will the way software parameters can. For example, if the hardware supports at most a 44.1 kHz sample rate while the caller passes in 48 kHz, a process of negotiation and matching is required. This function is not easy to use, mainly because there are many rules; interested readers should consult the comments in the code.
  • NBAIO_Sink corresponds to the output endpoint and defines write and writeVia functions. writeVia takes a callback function, via, which it calls internally to fetch the data. This mirrors the two classic data-transfer models, push and pull.
  • NBAIO_Source corresponds to the input endpoint and defines read and readVia functions, analogous to NBAIO_Sink.
  • MonoPipe and MonoPipeReader are defined. Pipe here has nothing to do with Linux IPC pipes; only the concept and idea of a pipeline are borrowed. MonoPipe supports only a single reader (in AF, a MonoPipeReader). These two represent an audio output endpoint and an input endpoint respectively.
  • In MT, mOutputSink points to an AudioStreamOutSink, which derives from NBAIO_Sink and is used for normal mixer output. mPipeSink points to a MonoPipe and is intended for FastMixer. In addition there is a variable mNormalSink, which points to either mPipeSink or mOutputSink depending on FastMixer (the selection logic appears after the sketch below).
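Before looking at that selection logic, here is a minimal single-writer/single-reader non-blocking FIFO in the spirit of MonoPipe/MonoPipeReader. It is an illustration only, with hypothetical names; the real classes in frameworks/av deal in frames, carry out negotiation, and track timestamps.

    #include <algorithm>
    #include <atomic>
    #include <cstdint>
    #include <vector>

    // Toy MonoPipe: one writer, one reader, no blocking on either side.
    class MiniMonoPipe {
    public:
        explicit MiniMonoPipe(size_t capacity) : mBuf(capacity) {}

        // Non-blocking write: copies what fits, returns the count actually
        // written (possibly less than requested); the caller just moves on.
        size_t write(const int16_t* src, size_t count) {
            size_t wr = mWrite.load(std::memory_order_relaxed);
            size_t rd = mRead.load(std::memory_order_acquire);
            size_t n = std::min(count, mBuf.size() - (wr - rd));
            for (size_t i = 0; i < n; i++)
                mBuf[(wr + i) % mBuf.size()] = src[i];
            mWrite.store(wr + n, std::memory_order_release);
            return n;
        }

        // Non-blocking read for the single reader (FastMixer's role in AF).
        size_t read(int16_t* dst, size_t count) {
            size_t rd = mRead.load(std::memory_order_relaxed);
            size_t wr = mWrite.load(std::memory_order_acquire);
            size_t n = std::min(count, wr - rd);
            for (size_t i = 0; i < n; i++)
                dst[i] = mBuf[(rd + i) % mBuf.size()];
            mRead.store(rd + n, std::memory_order_release);
            return n;
        }

    private:
        std::vector<int16_t> mBuf;
        std::atomic<size_t> mWrite { 0 };
        std::atomic<size_t> mRead { 0 };
    };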
The selection logic, from the MixerThread code:

    switch (kUseFastMixer) {        // kUseFastMixer controls the use of FastMixer.
                                    // It has four possible values:
    case FastMixer_Never:           // never use FastMixer; this option exists for
                                    // debugging, i.e. switching FastMixer off
    case FastMixer_Dynamic:         // choose dynamically as conditions demand;
                                    // per the comments, not fully implemented yet
        mNormalSink = mOutputSink;
        break;
    case FastMixer_Always:          // always use FastMixer
        mNormalSink = mPipeSink;
        break;
    case FastMixer_Static:          // static: the default, but whether mPipeSink
                                    // is used is further controlled by initFastMixer
        mNormalSink = initFastMixer ? mPipeSink : mOutputSink;
        break;
    }

As the comments say, kUseFastMixer defaults to FastMixer_Static, but whether mNormalSink ends up pointing at mPipeSink is still governed by initFastMixer. That variable is determined by comparing mFrameCount and mNormalFrameCount: initFastMixer is true only when mFrameCount is smaller than mNormalFrameCount. Dizzy yet? Both frame counts are computed in PlaybackThread's readOutputParameters. I leave that code to the reader; it is simple arithmetic, best understood by plugging in real parameter values and computing the results.

That concludes the creation of MixerThread. Do study this code; know what each of these sibling objects is doing.

3.4 Changes in createTrack and start

The biggest change in createTrack is the addition of the MediaSyncEvent synchronization mechanism. Its purpose is simple; its Java API documentation reads:

    startRecording(MediaSyncEvent) is used to start capture only when the playback
    on a particular audio session is complete. The audio session ID is retrieved
    from a player (e.g. MediaPlayer, AudioTrack or ToneGenerator) by use of the
    getAudioSessionId() method.

Simply put: wait for the previous player to finish before starting the next playback or recording. This mechanism addresses Android's long-standing problem of sounds running into each other (the current disgusting but effective workaround is to add a sleep to stagger the unsynchronized players). Note that this problem does not occur on the iPhone. A further potential benefit is the liberation of those working on AudioPolicy and audio routing: it seems (I personally believe it can solve this) they no longer need to agonize over how long to sleep. Within AF, the MediaSyncEvent mechanism is represented by SyncEvent; take a look for yourself.

The start function does not change much; SyncEvent handling is added to it.

createTrack also involves FastMixer and TimedTrack handling. The core is in PlaybackThread's createTrack_l and the Track constructor, especially the relationship with FastMixer. As Figure 2 shows, FM (short for FastMixer) uses FastTrack as its internal data structure, while MT uses Track, so there is a one-to-one correspondence between them. FM keeps its FastTracks in an array, so a Track that uses FM points to its FastTrack through mFastIndex. With the FastTrack/Track relationship clear, the remaining data-flow details are best discussed together with MixerThread's workflow, which is the most important part of all!

3.5 MixerThread workflow

The difficulty here lies in how FastMixer works.
A word in advance, though: this feature is not finished yet; the code is littered with FIXMEs. But don't relax too soon; odds are the very next version will firm it up. Looking at the immature version now eases the psychological pressure of facing the mature one later. MT is a thread, and its work happens mainly in threadLoop, which is defined by its base class PlaybackThread. The changes, roughly:
  • PlaybackThread's threadLoop defines the general flow of audio processing, with the details implemented by subclasses through several virtual functions (such as prepareTracks_l, threadLoop_mix and threadLoop_write).
  • The first major change in MT is prepareTracks_l. Its first step handles FastMixer-type tracks, checking whether a track has the TRACK_FAST flag set (at present this flag is not yet used in JB). This bookkeeping is complicated: FastMixer maintains a state machine, and because FastMixer runs in its own thread, thread synchronization is required; the state is what drives FastMixer's workflow. With multiple threads in play, handling the audio underrun and overrun conditions (don't know what those are? See the reference books mentioned above!) also becomes a thorny problem. In addition, each MT object carries an AudioMixer object, which performs the genuinely hard digital-audio work of mixing, downmixing and so on. In other words, for mixing, the early prepare work is still done on the MT thread, since that allows unified management (not every track needs FastMixer; but think about it: everyone wants processing to be faster and better, and on a multi-core CPU handing the mixing work to several threads is a textbook way of exploiting CPU resources. That should be the direction of Android's future evolution, which is why I suspect this JB has not finished growing up). If FastMixer interests you, prepareTracks_l is the function to study carefully.
  • The next important MT function is threadLoop_mix. Because the TimedTrack class exists, AudioMixer's process function now carries a timestamp, the PTS (presentation timestamp). From the codec perspective there is also a DTS (decode timestamp), and the difference is worth spelling out: DTS is the decoding time, but a frame may be encoded with reference to a future frame, so the decoder sometimes has to parse a later frame first and only then decode the current one. Playback, by contrast, cannot show future frames early: frames are presented strictly in playing order, with the future frame shown after the current one even though it was decoded first. For PTS/DTS, study the material on I/B/P frames (a toy example follows this list). Back in MT, this PTS is obtained from the hardware HAL object, so it should be a HAL-maintained timestamp, and in principle an accurate one.
  • After mixing comes effects processing (similar to the previous version), then threadLoop_write. The output endpoint of MT's threadLoop_write is the notorious mNormalSink; if it is non-null, its write function is called, which amounts to calling an NBAIO_Sink non-blocking write. From the analysis around Figure 2, that sink is either the MonoPipe or the AudioStreamOutSink (the latter wraps our old friend AudioStreamOutput). But MonoPipe's write just fills an internal buffer; it has no connection to any real audio HAL output. So what gives?? (Bold hypothesis, careful verification: that buffer must be fetched by FastMixer and then written out to the real audio HAL. Indeed, in the MixerThread constructor, mOutputSink is saved away for the FastTrack/FastMixer side, and that is its link to AudioStreamOutput.)
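To make the decode-order-versus-presentation-order point concrete, here is a toy example with made-up numbers (nothing here comes from Android's code):

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    struct Frame { char type; int dts; int pts; };

    int main() {
        // Frames in arrival/decode (DTS) order. The P frame is decoded before
        // the two B frames that reference it, yet it is presented after them.
        std::vector<Frame> frames = {
            {'I', 0, 0}, {'P', 1, 3}, {'B', 2, 1}, {'B', 3, 2},
        };
        // Presentation must follow PTS order, not decode order.
        std::stable_sort(frames.begin(), frames.end(),
                         [](const Frame& a, const Frame& b) { return a.pts < b.pts; });
        for (const Frame& f : frames)
            std::printf("present %c (dts=%d, pts=%d)\n", f.type, f.dts, f.pts);
        return 0;
    }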
In addition, DuplicatingThread and DirectOutputThread do not change much.

IV. How FastMixer works, briefly

I used to think that the mixing work was shared between the FastMixer thread and the MixerThread thread, but that output remained MixerThread's job. From the MonoPipe analysis above, that judgment may be wrong. It may be that output, too, is done by FastMixer, while MixerThread only performs part of the mixing and then hands the data to the FastMixer thread through the MonoPipe. The FastMixer thread then mixes its own FastTracks' results together with MT's result, and FastMixer writes the final data out.

FM is defined in FastMixer.cpp, and its core is a threadLoop. Since the MT thread does the preparation for all of AF's tracks, FM's threadLoop mostly just acts according to the current state. Synchronization here uses Linux's low-level futex (fast userspace mutex; a bare-bones sketch follows the list below). Futex is the foundation on which POSIX mutexes are implemented. I don't know why whoever wrote this code didn't simply use a mutex (presumably efficiency again; would a mutex really have been that much worse? Code is written for people to read; this is really too B4, that is, contemptuous of, the rest of us...). One can only admire the multithreading acrobatics!
  • FastMixer likewise uses an AudioMixer of its own for mixing.
  • Then it writes the result out...
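For reference, here is what raw futex calls look like on Linux. This is a generic illustration of the primitive, not AF's code; FastMixer's actual protocol around it is considerably more involved.

    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <atomic>
    #include <cstdint>

    // Sleep only if *addr still holds 'expected'; otherwise return at once.
    static long futexWait(std::atomic<int32_t>* addr, int32_t expected) {
        return syscall(SYS_futex, reinterpret_cast<int32_t*>(addr),
                       FUTEX_WAIT_PRIVATE, expected, nullptr, nullptr, 0);
    }

    // Wake up to 'count' threads waiting on addr.
    static long futexWake(std::atomic<int32_t>* addr, int count) {
        return syscall(SYS_futex, reinterpret_cast<int32_t*>(addr),
                       FUTEX_WAKE_PRIVATE, count, nullptr, nullptr, 0);
    }

A pthread mutex boils down to exactly these calls on the contended path; the uncontended path is a pure user-space atomic operation, which is where the speed comes from.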
So much for a first description of FM. As for the details: I haven't gotten hold of a real device, so I can't pin everything down... anyone willing to flash a 4.1 image and lend me the machine for research is most welcome. (It is not that this is too hard; I just cannot verify it without hardware, though given time it can all be figured out.) For today, knowing the general workflow of FM and MT is enough.

V. Other changes

Other changes include:
  • Debugging is clearly taken seriously: a large number of XXXDump classes have been added. Evidently Google hit plenty of problems during its own development; who else would bother making a simple function dumpable?
  • An AudioWatchdog class has been added to monitor AF's performance, such as CPU usage.
VI. Summary

I remember that when I studied AF in 2.2, AudioFlinger was 3,000-odd lines; in JB it is over 9,000, and that is without counting the auxiliary classes. Overall, the trend of the JB changes is:
  • Make full use of multi-core resources. The appearance of FastMixer is the inevitable result, and the NBAIO interface comes with it. I suspect writing a HAL is about to get considerably more challenging.
  • TimedTrack and SyncEvent are added, which promise a much better user experience for RTP and for multiple players.
  • Interfaces for notifying the Java layer from the native layer have been added.
There is more besides, but that is enough for today. The test this poses for us diaosi coders:
  1. You must be proficient in Linux OS-level programming and POSIX (multithreaded) programming.
  2. You must improve your ability to analyze complex code as quickly as possible; otherwise none of this can be understood.
