Transmitting voice over the Internet

Using the Audio Compression Manager (ACM)

Author: Peter Morris. Translation: Chen

A few years ago, IP telephony was heavily hyped, but many people considered the technology immature: speech quality was poor, calls broke up intermittently, and latency was frequent. As a result, it was not widely adopted. With the growing popularity of Internet applications and the rapid development of related technologies, however, sound quality has steadily improved. Making phone calls over the Internet is no longer a dream; it has become part of our lives. Today I can make a long-distance call on an IP phone for only 3 cents a minute, and with OICQ voice chat it feels as if my friends are right beside me. With enough bandwidth, we can even watch live QuickTime video on the Internet, listen to radio broadcasts with RealAudio, or play MP3 songs on demand. For programmers, the question is how to work with these media streams easily. Let's start with codecs.

Codecs

What is a codec? It is a coder/decoder, in this case for audio compression, and in some ways it resembles an ActiveX control. ActiveX controls let programmers call functionality that others have already implemented instead of starting from scratch. Codecs offer the same convenience, but they focus on converting between media formats. For example, to write a CD-to-MP3 application, you only need to do the following:

1. Read the data from the CD audio track.
2. Generate a valid MP3 file header.
3. Call the corresponding codec to encode the audio track data as MP3.

Windows ships with many codecs. Some of them are described below:

Name                    Description
GSM                     Apparently used by some mobile phone networks.
DSP Group TrueSpeech    Produces a format suited to voice calls; the sound is very clear.
Fraunhofer IIS MP3      Used to generate MP3 data.
PCM                     The standard Windows sound format. Most codecs can convert to and from it.

The complete list of currently installed codecs can be viewed in the Multimedia section of the Control Panel.

ACM API

ACM stands for Audio Compression Manager. It is the function library Microsoft provides for calling codecs. Its declarations should have been included in the MMSystem.pas unit, but Borland omitted them for some reason, so the first thing we need is an API declaration unit, MSACM.pas. Thanks to François Piette, who shared his converted unit file, we can download it from www.Delphi-Jedi.org.

You can use ACM to convert a media format by taking the following steps:

1. Specify the input and output formats. These are described by TWaveFormatEx records, but that structure is too small to hold the extra information most codecs require. To get around this, we use a custom TACMFormat record that overlays TWaveFormatEx on a buffer of roughly 128 bytes.

2. Open an ACM stream. Call acmStreamOpen, passing the input and output formats as parameters. ACM returns either a valid handle or an error code (such as ACMERR_NOTPOSSIBLE) indicating that the requested conversion cannot be performed.

3. Determine the size of the output buffer. Call acmStreamSize, telling ACM how many bytes of source data will be passed in each time; the function returns the required output buffer size (we should always overestimate the size to make sure the buffer is large enough).

4. Prepare a conversion header. Call acmStreamPrepareHeader, passing the stream handle returned by acmStreamOpen. The conversion header tells ACM the addresses of the source and destination buffers. ACM does not allocate memory automatically; we must allocate both buffers ourselves.

5. Convert the data. With the preparation complete, this is very simple: call acmStreamConvert, passing the stream handle and the conversion header. The function sets cbDstLengthUsed in the conversion header to the number of bytes actually written to the destination buffer.

6. Release all resources once the ACM session is finished. Release the conversion header with acmStreamUnprepareHeader, and close the stream with acmStreamClose.

Selecting a format

As mentioned above, the input and output formats must be set before conversion starts. The TWaveFormatEx record (declared in the MMSystem.pas unit) only specifies the bit rate, frequency, and so on. That is not enough unless we only intend to convert between different PCM formats. The following record is used instead:

    TACMWaveFormat = packed record
      case Integer of
        0: (Format: TWaveFormatEx);
        1: (RawData: array[0..128] of Byte);
    end;

This variant record still lets us read the TWaveFormatEx fields, while RawData provides enough space for the additional information that other codecs require.
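The layout of that record can be checked with a little arithmetic. The following is only a sketch in Python, not part of the original Delphi code: it models the record as one 129-byte buffer (array[0..128] holds 129 bytes) whose first 18 bytes are read as the byte-packed WAVEFORMATEX fields. The names `WAVEFORMATEX`, `RAW_SIZE`, and `WaveFormatEx` are chosen here for illustration.

```python
import struct
from collections import namedtuple

# Field order of the Win32 WAVEFORMATEX structure.
WaveFormatEx = namedtuple(
    "WaveFormatEx",
    "wFormatTag nChannels nSamplesPerSec nAvgBytesPerSec "
    "nBlockAlign wBitsPerSample cbSize",
)

# "<" = little-endian, no padding: WORD, WORD, DWORD, DWORD, WORD, WORD, WORD.
WAVEFORMATEX = struct.Struct("<HHIIHHH")

# The RawData variant: array[0..128] of Byte is 129 bytes long.
RAW_SIZE = 129

# One shared buffer plays the role of the variant record.
raw = bytearray(RAW_SIZE)

# Write an 8 kHz, 8-bit, mono PCM format through the "Format" view.
# wFormatTag 1 is WAVE_FORMAT_PCM; cbSize 0 means no codec-specific extra data.
pcm = WaveFormatEx(1, 1, 8000, 8000, 1, 8, 0)
WAVEFORMATEX.pack_into(raw, 0, *pcm)

# Read the same bytes back through the header view.
header = WaveFormatEx(*WAVEFORMATEX.unpack_from(raw, 0))

# Space left after the fixed header for codec-specific data.
extra_room = RAW_SIZE - WAVEFORMATEX.size
print(WAVEFORMATEX.size, extra_room)
```

The fixed header takes 18 bytes, so the buffer leaves 111 bytes for codec-specific data; the cbSize field says how many of those bytes a given format actually uses.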

Although we do not know the size of the additional information in advance, we can use the acmFormatChoose function to obtain it.

The acmFormatChoose function takes a single TACMFormatChooseA parameter. This is a simple structure containing, among others, the following members:


 

Member      Description
pwfx        Pointer to a TWaveFormatEx structure that receives the result (here we actually pass a TACMFormat).
cbwfx       Size of the buffer that receives the result.
cbStruct    Size of the structure.

 

Another member worth mentioning is fdwStyle, which holds flags that control the format selection. One flag in particular is useful:

ACMFORMATCHOOSE_STYLEF_INITTOWFXSTRUCT

This flag indicates that the buffer pwfx points to already contains a valid format. When acmFormatChoose is called, it displays a format selection dialog box with that format preselected as the default.

 

When is conversion not possible?

 

One reason is that a codec installed on one machine may not exist on another. A further complication is that a machine may be able to read a sound format without being able to generate it. The Fraunhofer IIS MP3 codec has this problem: on Windows 9x and Windows NT we can generate MP3 files, but this capability was removed in Windows 2000. We can still listen to MP3 files in Windows, but we cannot generate them unless we pay for the privilege.

 

Another reason is that not every ACM format can be converted directly into every other. For example, the following conversion is not possible:

GSM 8-bit mono -> MP3 8-bit mono

Although the direct conversion fails, we can convert indirectly through an intermediate format. The intermediate format is usually PCM, because most codecs support conversion to and from PCM. The conversion path becomes:

GSM 8-bit mono -> PCM 8-bit mono -> MP3 8-bit mono

A further step may also be needed, such as converting 8-bit PCM to 16-bit PCM.
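Finding such an indirect route is a small graph search. The sketch below is illustrative only: the `DIRECT` table is a made-up example of which direct conversions succeed, whereas a real program would find out by asking ACM (for instance, by checking whether opening a stream for a given pair of formats fails). The function name `find_path` is also chosen here for illustration.

```python
from collections import deque

# Hypothetical table of direct conversions that succeed on some machine.
# Each key lists the formats it can be converted to in a single step.
DIRECT = {
    "GSM": ["PCM 8-bit"],
    "PCM 8-bit": ["GSM", "PCM 16-bit", "MP3"],
    "PCM 16-bit": ["PCM 8-bit"],
    "MP3": ["PCM 8-bit"],
}

def find_path(src, dst, direct=DIRECT):
    """Breadth-first search for the shortest chain of direct conversions.

    Returns the list of formats from src to dst, or None if no chain exists.
    """
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in direct.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(find_path("GSM", "MP3"))
```

With this table, the search routes GSM to MP3 through 8-bit PCM, and it returns None when no chain of conversions exists.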

 

ACM's potential

 

As mentioned above, ACM does only one thing: it converts one media format to another. But once you consider what it would take to write your own codec for Internet audio streams, such as an MP3 compressor, you will find that ACM is a very good deal.

 

Imagine, too, how simple it is to implement an Internet phone with ACM. First, we capture the audio input from the microphone and compress it into a suitable low-bandwidth stream format. Then we send it to the target computer over TCP/IP. The target computer receives the compressed data, decompresses it, and plays it through the speaker.
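The flow just described can be sketched end to end. This is a simplified illustration under several assumptions, not the article's implementation: `zlib` stands in for an ACM codec such as GSM, a local `socket.socketpair()` stands in for the TCP/IP connection, and the byte strings stand in for microphone blocks. Because TCP delivers a byte stream rather than messages, the sketch also assumes a 4-byte length prefix in front of each compressed block so the receiver knows where one block ends.

```python
import socket
import struct
import zlib

def encode(pcm_block):
    """Stand-in for compressing a block with an ACM codec."""
    return zlib.compress(pcm_block)

def decode(packet):
    """Stand-in for decompressing a block with an ACM codec."""
    return zlib.decompress(packet)

def send_block(sock, pcm_block):
    """Compress one block and send it with a 4-byte length prefix."""
    payload = encode(pcm_block)
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, count):
    """Read exactly count bytes; a stream socket may return fewer per call."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        data += chunk
    return data

def recv_block(sock):
    """Receive one length-prefixed block and decompress it."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return decode(recv_exact(sock, length))

# Local demonstration: three fake "microphone" blocks of 320 bytes each.
sender, receiver = socket.socketpair()
blocks = [bytes([i]) * 320 for i in range(3)]
for block in blocks:
    send_block(sender, block)
received = [recv_block(receiver) for _ in blocks]
sender.close()
receiver.close()
print(received == blocks)
```

In a real program, the received blocks would be handed to the playback device instead of being collected in a list.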

 

The real power of ACM is that a considerable number of ACM codecs can be mapped to wave devices, which means their formats can be used with the standard wave devices to play or record audio in real time. Real-time performance is one of the most important requirements of VoIP.

 

For example, we can simply open a sound input source in GSM format. The data we receive from the wave input device is then already compressed and can be transmitted immediately. Likewise, as soon as data arrives on the TCP/IP socket, we can play it through the wave output device.

 

Note that standard PCM data is too large for real-time voice transmission over a modem. GSM 6.10, by contrast, achieves real-time transmission at only about 1.5 KB per second, and 16-bit mono MP3 needs only about 2 KB per second of bandwidth.
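These figures can be checked with simple arithmetic. The sketch below makes a few assumptions that are not in the text: an 8 kHz sample rate, the Microsoft GSM 6.10 block layout of 65 bytes per 320 samples, and a 28.8 kbit/s modem as the link; protocol overhead is ignored.

```python
def pcm_bytes_per_second(sample_rate, bits_per_sample, channels):
    """Data rate of uncompressed PCM audio."""
    return sample_rate * bits_per_sample // 8 * channels

def block_codec_bytes_per_second(sample_rate, samples_per_block, bytes_per_block):
    """Data rate of a codec that emits fixed-size blocks."""
    return sample_rate * bytes_per_block / samples_per_block

def modem_bytes_per_second(bits_per_second):
    """Raw modem throughput, ignoring protocol overhead."""
    return bits_per_second / 8

pcm_rate = pcm_bytes_per_second(8000, 16, 1)               # 16-bit mono PCM
gsm_rate = block_codec_bytes_per_second(8000, 320, 65)     # GSM 6.10 blocks
modem_rate = modem_bytes_per_second(28800)                 # assumed modem speed

print(pcm_rate, gsm_rate, modem_rate)
print("PCM fits:", pcm_rate <= modem_rate)
print("GSM fits:", gsm_rate <= modem_rate)
```

Under these assumptions PCM needs 16000 bytes per second against a modem capacity of 3600, while GSM 6.10 needs 1625 bytes per second, close to the figure quoted above.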

 

MP3 can be used as a playback format, but it is not suitable as an input format, because only the MP3 codec on the Windows NT platform supports MP3 encoding. We usually have to generate MP3 files by hand with other programs, so the format is not suited to Internet phone applications. It is, however, very well suited to on-demand streaming, because its compression ratio is high and the loss in sound quality is small.

 

The principle is as simple as described above, but many things are easier said than done. The following sections therefore introduce several controls and programs. Because the code is long, it is not listed in full here; only a brief description is given.

 

Controls

 

- TACMConverter: This control has two functions. First, it converts data between two different media formats. Second, it can be used to specify the input and output formats of ACM streams. (Right-clicking the control in the editor calls acmFormatChoose to display the format selection dialog box.)

- TACMIn: Receives data from the microphone. Data is recorded in standard PCM format or in any other format the wave input device supports.

- TACMOut: Plays back sound. The NumBuffers property specifies how many buffers must be filled before playback starts. This matters little for real-time audio transmission, but it is very convenient for audio broadcasts over the Internet, because the extra buffered audio absorbs fluctuations in the connection's data rate.
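The effect of pre-buffering can be shown with a small simulation. The class below is a hypothetical model written for illustration, not the source of the playback control: playback starts only once `num_buffers` blocks have been queued, and the queued blocks then cover short gaps in the incoming data.

```python
from collections import deque

class PrebufferedPlayer:
    """Toy model of playback that waits for num_buffers blocks before starting."""

    def __init__(self, num_buffers):
        self.num_buffers = num_buffers
        self.queue = deque()
        self.started = False

    def on_block_received(self, block):
        """Queue an incoming block; start playback once enough are waiting."""
        self.queue.append(block)
        if not self.started and len(self.queue) >= self.num_buffers:
            self.started = True

    def on_playback_tick(self):
        """Called once per block duration; returns the block played, or None."""
        if not self.started:
            return None          # still pre-buffering
        if not self.queue:
            return None          # underrun: the listener hears a gap
        return self.queue.popleft()

player = PrebufferedPlayer(num_buffers=3)
player.on_block_received("a")
player.on_block_received("b")
before_start = player.on_playback_tick()   # fewer than 3 blocks queued
player.on_block_received("c")
# The network now stalls: no new blocks arrive for the next four ticks.
during_stall = [player.on_playback_tick() for _ in range(4)]
print(before_start, during_stall)
```

In this run the three queued blocks keep playback going through three ticks of the stall, and only the fourth tick produces a gap. The price is added delay, which is why this suits broadcasts better than live conversation.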

 

Demo

 

In the first example, we use the TACMConvertor control to specify the input and output formats, and then open an ACMIn and an ACMOut control. Sound picked up by the microphone is played back immediately, although a slight latency produces a faint echo.

 
