Using the built-in AAC encoder on iOS

Source: Internet
Author: User

AAC (Advanced Audio Coding) appeared in 1997 as an audio coding technology based on MPEG-2. It was developed jointly by Fraunhofer IIS, Dolby Laboratories, AT&T, Sony and other companies to replace the MP3 format. In 2000, after the MPEG-4 standard appeared, AAC was revised to integrate its features, adding SBR and PS technology. To distinguish it from the traditional MPEG-2 AAC, this version is also known as MPEG-4 AAC.


The iOS platform supports AAC encoding, primarily through the AudioConverter API in AudioToolbox. I needed an AAC encoder while implementing an HLS feature: HLS requires TS files in which the video is encoded with H.264 and the audio with AAC. H.264 can use a hardware or a software encoder, as described earlier. AAC can likewise be encoded in hardware or in software, and iOS supports both.


First, create a converter (the AAC encoder) using the following interface:

    extern OSStatus
    AudioConverterNew(const AudioStreamBasicDescription *inSourceFormat,
                      const AudioStreamBasicDescription *inDestinationFormat,
                      AudioConverterRef *outAudioConverter)
        __OSX_AVAILABLE_STARTING(__MAC_10_1, __IPHONE_2_0);

The input parameters are the source and destination data formats, respectively.

For AAC encoding, the source format is the captured PCM data and the destination format is AAC.

    // Source format: 16-bit signed mono PCM at 44.1 kHz.
    // (Filled in by hand here; the FillOutASBDForLPCM() helper is an alternative.)
    AudioStreamBasicDescription inAudioStreamBasicDescription;
    inAudioStreamBasicDescription.mFormatID = kAudioFormatLinearPCM;
    inAudioStreamBasicDescription.mSampleRate = 44100;
    inAudioStreamBasicDescription.mBitsPerChannel = 16;
    inAudioStreamBasicDescription.mFramesPerPacket = 1;
    inAudioStreamBasicDescription.mBytesPerFrame = 2;
    inAudioStreamBasicDescription.mBytesPerPacket =
        inAudioStreamBasicDescription.mBytesPerFrame *
        inAudioStreamBasicDescription.mFramesPerPacket;
    inAudioStreamBasicDescription.mChannelsPerFrame = 1;
    inAudioStreamBasicDescription.mFormatFlags =
        kLinearPCMFormatFlagIsPacked |
        kLinearPCMFormatFlagIsSignedInteger |
        kLinearPCMFormatFlagIsNonInterleaved;
    inAudioStreamBasicDescription.mReserved = 0;

    // Destination format: AAC.
    // Always initialize the fields of a new audio stream basic description
    // structure to zero, as shown here.
    AudioStreamBasicDescription outAudioStreamBasicDescription = {0};
    outAudioStreamBasicDescription.mChannelsPerFrame = 1;
    outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC;

    // Let the system fill in the remaining fields for this format.
    UInt32 size = sizeof(outAudioStreamBasicDescription);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL,
                           &size, &outAudioStreamBasicDescription);

    OSStatus status = AudioConverterNew(&inAudioStreamBasicDescription,
                                        &outAudioStreamBasicDescription,
                                        &_audioConverter);
    if (status != 0) {
        NSLog(@"setup converter failed: %d", (int)status);
    }

This creates the AAC encoder. By default a hardware encoder is created; if the hardware is not available, a software encoder is created instead.

In my testing, the hardware AAC encoder has a very high encoding latency: about 2 seconds of data must accumulate in the buffer before encoding starts. The latency of the software encoder is normal: it starts encoding as soon as it has been fed 1024 samples.

So how do you request a software encoder when creating the converter? First look up the encoder's class description with the following helper:

    - (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                               fromManufacturer:(UInt32)manufacturer
    {
        static AudioClassDescription desc;

        UInt32 encoderSpecifier = type;
        OSStatus st;

        // Ask how many bytes the list of matching encoders occupies.
        UInt32 size;
        st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                        sizeof(encoderSpecifier),
                                        &encoderSpecifier,
                                        &size);
        if (st) {
            NSLog(@"error getting audio format property info: %d", (int)(st));
            return nil;
        }

        // Fetch the list of encoders for this format.
        unsigned int count = size / sizeof(AudioClassDescription);
        AudioClassDescription descriptions[count];
        st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier),
                                    &encoderSpecifier,
                                    &size,
                                    descriptions);
        if (st) {
            NSLog(@"error getting audio format property: %d", (int)(st));
            return nil;
        }

        // Return the first encoder matching both the type and the manufacturer.
        for (unsigned int i = 0; i < count; i++) {
            if ((type == descriptions[i].mSubType) &&
                (manufacturer == descriptions[i].mManufacturer)) {
                memcpy(&desc, &(descriptions[i]), sizeof(desc));
                return &desc;
            }
        }

        return nil;
    }


    AudioClassDescription *desc =
        [self getAudioClassDescriptionWithType:kAudioFormatMPEG4AAC
                              fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
    OSStatus status = AudioConverterNewSpecific(&inAudioStreamBasicDescription,
                                                &outAudioStreamBasicDescription,
                                                1, desc,
                                                &_audioConverter);

For encoding to work correctly, the bitrate must be set. Otherwise, encoding returns error code 560226676 ('!dat').

    UInt32 ulBitRate = 64000;
    UInt32 ulSize = sizeof(ulBitRate);
    status = AudioConverterSetProperty(_audioConverter,
                                       kAudioConverterEncodeBitRate,
                                       ulSize, &ulBitRate);
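Core Audio status codes such as the 560226676 mentioned above are often four-character codes packed into a 32-bit integer, so printing them as text makes them easier to look up. The following is a minimal sketch in plain C; `status_to_string` is a hypothetical helper name, not part of AudioToolbox, and it takes an `int32_t` where the real code would pass an `OSStatus`.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

/* Render a 32-bit status code as its four-character form when every byte is
 * printable; otherwise fall back to the decimal value.
 * `out` must have room for at least 12 bytes ("-2147483648" plus NUL). */
static void status_to_string(int32_t status, char out[12])
{
    uint32_t v = (uint32_t)status;
    char cc[5] = {
        (char)((v >> 24) & 0xFF),   /* most significant byte comes first */
        (char)((v >> 16) & 0xFF),
        (char)((v >> 8) & 0xFF),
        (char)(v & 0xFF),
        '\0'
    };

    if (isprint((unsigned char)cc[0]) && isprint((unsigned char)cc[1]) &&
        isprint((unsigned char)cc[2]) && isprint((unsigned char)cc[3])) {
        snprintf(out, 12, "%s", cc);
    } else {
        snprintf(out, 12, "%d", (int)status);
    }
}
```

Passing 560226676 (0x21646174) yields the text `!dat`, while a plain numeric status such as -1 is printed in decimal.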

Note that AAC does not support arbitrary bitrates. For example, if the PCM sample rate is 44100 Hz, the bitrate can be set to 64000 bps; if it is 16 kHz, the bitrate can be set to 32000 bps.


After creating the converter and setting the bitrate, you can query the maximum size of an encoded output packet, which is used later.

    UInt32 value = 0;
    size = sizeof(value);
    AudioConverterGetProperty(_audioConverter,
                              kAudioConverterPropertyMaximumOutputPacketSize,
                              &size, &value);

The value obtained represents the maximum output packet size of the encoder.

Then call AudioConverterFillComplexBuffer to encode:

    AudioBufferList outAudioBufferList = {0};
    outAudioBufferList.mNumberBuffers = 1;
    outAudioBufferList.mBuffers[0].mNumberChannels = 1;
    outAudioBufferList.mBuffers[0].mDataByteSize = value; // value is the size queried above
    // Allocated with new[] (Objective-C++); release with delete[] when done.
    outAudioBufferList.mBuffers[0].mData = new int8_t[value];

    UInt32 ioOutputDataPacketSize = 1;
    status = AudioConverterFillComplexBuffer(_audioConverter,
                                             inInputDataProc,
                                             (__bridge void *)(self),
                                             &ioOutputDataPacketSize,
                                             &outAudioBufferList,
                                             NULL);


In this call, inInputDataProc is the callback that supplies the input data, feeding PCM data to the converter. Setting ioOutputDataPacketSize to 1 means that the call returns once one packet (frame) of encoded data has been produced. outAudioBufferList stores the encoded data.

The processing in inInputDataProc is as follows:

    static OSStatus inInputDataProc(AudioConverterRef inAudioConverter,
                                    UInt32 *ioNumberDataPackets,
                                    AudioBufferList *ioData,
                                    AudioStreamPacketDescription **outDataPacketDescription,
                                    void *inUserData)
    {
        AACEncoder *encoder = (__bridge AACEncoder *)(inUserData);
        UInt32 requestedPackets = *ioNumberDataPackets;

        // Each PCM packet is one 16-bit mono sample, i.e. 2 bytes.
        uint8_t *buffer;
        uint32_t bufferLength = requestedPackets * 2;
        uint32_t bufferRead;
        bufferRead = [encoder.pcmPool readBuffer:&buffer withLength:bufferLength];
        if (bufferRead == 0) {
            // No data available: report zero packets and a non-zero status.
            *ioNumberDataPackets = 0;
            return -1;
        }

        ioData->mBuffers[0].mData = buffer;
        ioData->mBuffers[0].mDataByteSize = bufferRead;
        ioData->mNumberBuffers = 1;
        ioData->mBuffers[0].mNumberChannels = 1;

        // Convert the number of bytes read back into a packet count.
        *ioNumberDataPackets = bufferRead >> 1;
        return noErr;
    }

pcmPool is a ring buffer that stores the PCM data.

Because the capture input does not necessarily deliver 1024 samples at a time, cache the data and call the encoder only once 1024 samples are available.
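The caching step can be sketched in plain C as a small ring buffer that accepts capture chunks of any size and hands out complete 1024-sample frames. This is only an illustration under stated assumptions: `PcmPool`, `pool_write`, `pool_read_frame` and the capacity of eight frames are hypothetical and are not the article's pcmPool class, which exposes a different interface (readBuffer:withLength:). The sketch also assumes 16-bit mono samples, as in the formats above.

```c
#include <stddef.h>
#include <stdint.h>

#define AAC_FRAME_SAMPLES 1024                    /* samples needed per encode call */
#define POOL_CAPACITY (AAC_FRAME_SAMPLES * 8)     /* hypothetical pool size */

typedef struct {
    int16_t data[POOL_CAPACITY];
    size_t head;    /* index of the oldest buffered sample */
    size_t count;   /* number of samples currently buffered */
} PcmPool;

static void pool_init(PcmPool *p)
{
    p->head = 0;
    p->count = 0;
}

/* Append up to n samples; returns how many were stored (fewer than n if full). */
static size_t pool_write(PcmPool *p, const int16_t *src, size_t n)
{
    size_t space = POOL_CAPACITY - p->count;
    if (n > space) {
        n = space;
    }
    for (size_t i = 0; i < n; i++) {
        p->data[(p->head + p->count + i) % POOL_CAPACITY] = src[i];
    }
    p->count += n;
    return n;
}

/* Copy one full frame into dst and consume it.
 * Returns 1 on success, 0 if fewer than 1024 samples are buffered. */
static int pool_read_frame(PcmPool *p, int16_t dst[AAC_FRAME_SAMPLES])
{
    if (p->count < AAC_FRAME_SAMPLES) {
        return 0;
    }
    for (size_t i = 0; i < AAC_FRAME_SAMPLES; i++) {
        dst[i] = p->data[(p->head + i) % POOL_CAPACITY];
    }
    p->head = (p->head + AAC_FRAME_SAMPLES) % POOL_CAPACITY;
    p->count -= AAC_FRAME_SAMPLES;
    return 1;
}
```

In use, the capture callback would call `pool_write` with whatever it receives, and the encoding loop would call the encoder each time `pool_read_frame` returns 1. If capture and encoding run on different threads, the pool additionally needs locking, which this sketch omits.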


In addition, for TS files, each AAC packet needs an ADTS header prepended. The ADTS header is 7 bytes long and carries the encoding parameters of the AAC data, so that a decoder can decode it.

The ADTS header is computed as follows:

    - (NSData *)adtsDataForPacketLength:(NSUInteger)packetLength {
        int adtsLength = 7;
        char *packet = (char *)malloc(sizeof(char) * adtsLength);
        // Variables Recycled by addADTStoPacket
        int profile = 2;  // AAC LC
        // 39 = MediaCodecInfo.CodecProfileLevel.AACObjectELD;
        int freqIdx = 8;  // 16KHz
        int chanCfg = 1;  // MPEG-4 Audio Channel Configuration. 1 Channel front-center
        NSUInteger fullLength = adtsLength + packetLength;
        // Fill in ADTS data
        packet[0] = (char)0xFF;  // 11111111 = syncword
        packet[1] = (char)0xF9;  // 1111 1 00 1 = syncword MPEG-2 Layer CRC
        packet[2] = (char)(((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
        packet[3] = (char)(((chanCfg & 3) << 6) + (fullLength >> 11));
        packet[4] = (char)((fullLength & 0x7FF) >> 3);
        packet[5] = (char)(((fullLength & 7) << 5) + 0x1F);
        packet[6] = (char)0xFC;
        NSData *data = [NSData dataWithBytesNoCopy:packet
                                            length:adtsLength
                                      freeWhenDone:YES];
        return data;
    }

Computing the ADTS header requires several parameters: profile, sampling frequency, channel configuration and packet length. For details, refer to http://wiki.multimedia.cx/index.php?title=ADTS
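The listing above hard-codes the frequency index for 16 kHz, so the value has to be changed by hand when the encoder runs at another sample rate such as 44100 Hz. As a sketch of how the same bit packing could take its parameters as arguments, the plain C version below looks the index up from the sample rate. The function names are hypothetical, and the lookup table is the standard MPEG-4 sampling frequency index table as I understand it, so check it against the ADTS reference before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

#define ADTS_HEADER_SIZE 7

/* Map a sample rate in Hz to its sampling frequency index; -1 if not listed. */
static int adts_freq_index(int sample_rate)
{
    static const int rates[] = { 96000, 88200, 64000, 48000, 44100, 32000,
                                 24000, 22050, 16000, 12000, 11025, 8000, 7350 };
    for (int i = 0; i < (int)(sizeof(rates) / sizeof(rates[0])); i++) {
        if (rates[i] == sample_rate) {
            return i;
        }
    }
    return -1;
}

/* Write a 7-byte ADTS header for one AAC packet of payload_len bytes.
 * profile is the audio object type (2 = AAC LC), chan_cfg the channel
 * configuration (1 = mono). Returns 0 on success, -1 on invalid input. */
static int adts_write_header(uint8_t out[ADTS_HEADER_SIZE], int profile,
                             int sample_rate, int chan_cfg, size_t payload_len)
{
    int freq_idx = adts_freq_index(sample_rate);
    size_t full_len = ADTS_HEADER_SIZE + payload_len;

    if (freq_idx < 0 || full_len > 0x1FFF) {        /* frame length has 13 bits */
        return -1;
    }
    if (profile < 1 || profile > 4 || chan_cfg < 1 || chan_cfg > 7) {
        return -1;
    }

    out[0] = 0xFF;                                   /* syncword, high 8 bits */
    out[1] = 0xF9;                                   /* same value as the listing above */
    out[2] = (uint8_t)(((profile - 1) << 6) | (freq_idx << 2) | (chan_cfg >> 2));
    out[3] = (uint8_t)(((chan_cfg & 3) << 6) | (full_len >> 11));
    out[4] = (uint8_t)((full_len >> 3) & 0xFF);
    out[5] = (uint8_t)(((full_len & 7) << 5) | 0x1F);
    out[6] = 0xFC;
    return 0;
}

/* Read the 13-bit frame length (header plus payload) back from a header. */
static size_t adts_frame_length(const uint8_t h[ADTS_HEADER_SIZE])
{
    return ((size_t)(h[3] & 0x03) << 11) | ((size_t)h[4] << 3) | (size_t)(h[5] >> 5);
}
```

For a 100-byte packet at 16000 Hz, AAC LC, mono, this produces the bytes FF F9 60 40 0D 7F FC, and `adts_frame_length` reads back 107. Reading the length back is a cheap sanity check before the header and packet are written into the TS stream.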


Reference documents:

Audio Converter Services Reference




Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
