Abbreviations
SID Silence Descriptor (Comfort Noise Frame)
1 Introduction to AMR coding
AMR (Adaptive Multi-Rate) coding adjusts the coding mode, the bit rate and the amount of error-correction coding to the actual condition of the transmission channel, balancing data compression against error tolerance in order to preserve voice quality. In general, the higher the voice quality of a mode, the weaker its resistance to interference. In a GSM network, the base station and the base station controller can dynamically adjust the speech coding mode according to network and signal quality, which improves voice quality under varying network conditions. Mobile terminals now generally support AMR: Nokia has provided AMR-capable terminals since 2004, and all new terminals support AMR.
AMR algorithm
Reference documents
(1) 3GPP TS 26.190, AMR wideband speech codec; Transcoding functions (Release 5).
(2) 3GPP TS 26.194, Voice Activity Detection (VAD).
(3) 3GPP TS 26.174, AMR wideband speech codec; Test sequences.
(4) 3GPP TS 26.201, AMR wideband speech codec; Frame structure.
2 AMR payload format in the IP domain
RFC 3267 and its successor RFC 4867 describe the payload format of AMR in RTP, that is, the form AMR speech takes in the IP domain.
Each RTP session uses one of two AMR payload modes, bandwidth-efficient mode or octet-aligned mode, and the choice is made by signaling negotiation. Only octet-aligned mode supports the features that make voice transmission more robust: robust sorting, frame interleaving and CRC checking.
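As an illustration of the negotiation step, the sketch below reads an SDP `a=fmtp` line and reports which payload mode it implies. It assumes the `octet-align` media type parameter defined in RFC 4867, where a value of 1 selects octet-aligned mode and an absent parameter or 0 selects bandwidth-efficient mode. The function name and the sample lines are hypothetical, and a real SDP parser would need to handle more cases.

```python
def amr_payload_mode(fmtp_line):
    """Return the AMR payload mode implied by one SDP a=fmtp line."""
    # Drop the "a=fmtp:<payload type>" prefix and keep the parameter list.
    _, _, params = fmtp_line.partition(" ")
    values = {}
    for item in params.split(";"):
        key, _, value = item.strip().partition("=")
        if key:
            values[key.lower()] = value.strip()
    # Assumption (RFC 4867): a missing octet-align parameter counts as 0.
    if values.get("octet-align", "0") == "1":
        return "octet-aligned"
    return "bandwidth-efficient"


print(amr_payload_mode("a=fmtp:97 octet-align=1; mode-set=0,2,5,7"))
print(amr_payload_mode("a=fmtp:97 mode-set=0,2,5,7"))
```

The first call reports octet-aligned mode and the second falls back to bandwidth-efficient mode.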
The protocol describes three usage scenarios and the characteristics of each: (1) a session between two IP-domain terminals; (2) two non-IP-domain terminals communicating through gateways; (3) an IP-domain terminal communicating with a non-IP-domain terminal.
AMR and AMR-WB payload format
The two formats differ in: a. the frame types; b. the sampling frequency, 8 kHz for AMR and 16 kHz for AMR-WB; c. the set of modes.
The payload consists of a payload header, a table of contents and the speech data:
+----------------+-------------------+----------------
| Payload Header | Table of Contents | Speech data ...
+----------------+-------------------+----------------
Payloads containing more than one speech frame-block are called compound payloads.
Bandwidth-efficient mode
A. Payload header format:
 0 1 2 3
+-+-+-+-+
|  CMR  |
+-+-+-+-+
CMR (codec mode request): the sender of the payload uses this field to ask the receiver to switch its own encoder to the given rate mode for the speech it sends from then on. The value is a frame type index: 0-7 for AMR (8 rate modes) and 0-8 for AMR-WB (9 rate modes). A value of 15 means that no mode is being requested.
Selecting a different mode does not change the packetization. The sampling frequency is constant, so each frame still covers the same duration of speech (20 ms), and only the payload size changes with the rate. For example, in AMR mode 0 the rate is 4.75 kbit/s and the payload carries 95 bits of speech data per frame.
For the index table of the 8 AMR rates, see [1].
For the index table of the 9 AMR-WB rates, see [2].
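The 95-bit figure follows directly from the rate and the 20 ms frame duration. A minimal sketch of that arithmetic is shown below; the rate lists are the standard AMR and AMR-WB mode rates in kbit/s, indexed by mode, and the names `AMR_RATES`, `AMR_WB_RATES` and `bits_per_frame` are chosen here for illustration only.

```python
FRAME_MS = 20  # duration of one AMR / AMR-WB speech frame in milliseconds

# Rates in kbit/s, indexed by mode (frame type index).
AMR_RATES = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]
AMR_WB_RATES = [6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85]


def bits_per_frame(rate_kbps):
    """Speech bits carried by one 20 ms frame at the given rate."""
    # kbit/s times ms gives bits; round() absorbs floating-point error.
    return round(rate_kbps * FRAME_MS)


for mode, rate in enumerate(AMR_RATES):
    print(f"AMR mode {mode}: {rate} kbit/s, {bits_per_frame(rate)} bits per frame")
```

Mode 0 gives 4.75 x 20 = 95 bits, matching the example in the text.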
B. Payload table of contents (ToC); each entry describes one speech frame
 0 1 2 3 4 5
+-+-+-+-+-+-+
|F|  FT   |Q|
+-+-+-+-+-+-+
F: marks whether this is the last entry. 1 means another entry follows; 0 means this is the last one. A payload carrying several frames has several ToC entries; otherwise it has exactly one.
FT: frame type index, giving the speech coding mode or the comfort noise (SID) mode of the frame. The speech mode values are the same as those used for CMR. FT=14 (SPEECH_LOST, AMR-WB only) and FT=15 (NO_DATA) mark frames without data; FT=15 means the payload carries nothing for the current frame. Values 10-13 are not valid, and a frame marked with one of them is discarded.
Q: frame quality indicator. 0 means the corresponding frame is damaged, 1 means it is not. A damaged frame can simply be discarded.
The ToC thus describes the format of the frame data that follows it.
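To make the bit layout concrete, here is a small sketch that splits one 6-bit ToC entry into its F, FT and Q fields. It assumes the entry is passed as an integer from 0 to 63 with F as the most significant bit, as in the diagram; `TocEntry` and `decode_toc_entry` are hypothetical names.

```python
from typing import NamedTuple


class TocEntry(NamedTuple):
    more_follow: bool  # F: True if another ToC entry follows
    frame_type: int    # FT: 4-bit frame type index
    quality_ok: bool   # Q: True if the frame is not damaged


def decode_toc_entry(bits):
    """Decode one bandwidth-efficient ToC entry given as an int 0..63."""
    if not 0 <= bits <= 0x3F:
        raise ValueError("a bandwidth-efficient ToC entry is 6 bits wide")
    return TocEntry(
        more_follow=bool(bits >> 5),
        frame_type=(bits >> 1) & 0x0F,
        quality_ok=bool(bits & 1),
    )


# F=0, FT=4, Q=1
print(decode_toc_entry(0b001001))
```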
C. Speech data
The speech data consists of the actual speech frames or comfort noise frames. Each frame corresponds to one ToC entry, in the same order, and its length is determined by the mode given in that entry's FT field.
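Putting the three parts together, the following sketch walks a bandwidth-efficient payload: it reads the CMR, then ToC entries until F=0, then one block of speech bits per entry. The bit counts per frame type are the standard 20 ms frame sizes (39 and 40 bits for the AMR and AMR-WB SID frames); the function name, the tuple layout of the result and the choice to reject unknown FT values with an exception are assumptions of this sketch, not something the protocol text prescribes.

```python
# Speech bits per frame, keyed by FT. FT 8 (AMR) and 9 (AMR-WB) are SID frames.
AMR_BITS = {0: 95, 1: 103, 2: 118, 3: 134, 4: 148, 5: 159, 6: 204, 7: 244,
            8: 39, 15: 0}
AMR_WB_BITS = {0: 132, 1: 177, 2: 253, 3: 285, 4: 317, 5: 365, 6: 397,
               7: 461, 8: 477, 9: 40, 14: 0, 15: 0}


def parse_bandwidth_efficient(payload, frame_bits=AMR_BITS):
    """Return (cmr, [(ft, q, speech_bits_as_int), ...]) for one payload."""
    total_bits = len(payload) * 8
    value = int.from_bytes(payload, "big")
    pos = 0

    def take(n):
        # Read the next n bits, most significant bit first.
        nonlocal pos
        if pos + n > total_bits:
            raise ValueError("payload is truncated")
        shift = total_bits - pos - n
        pos += n
        return (value >> shift) & ((1 << n) - 1)

    cmr = take(4)
    toc = []
    while True:
        f, ft, q = take(1), take(4), take(1)
        if ft not in frame_bits:
            raise ValueError(f"invalid frame type {ft}")
        toc.append((ft, q))
        if f == 0:  # F=0 marks the last ToC entry
            break
    # Speech data follows the ToC, one block per entry, in ToC order.
    frames = [(ft, q, take(frame_bits[ft])) for ft, q in toc]
    return cmr, frames


# CMR=15, one entry with F=0, FT=4, Q=1, then 148 zero bits and 2 padding bits.
example = bytes([0xF2, 0x40]) + bytes(18)
print(parse_bandwidth_efficient(example))
```

Passing `AMR_WB_BITS` as the second argument parses AMR-WB payloads the same way. Any bits left after the last frame are padding and are ignored here.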
Example:
Single-channel single-frame
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| CMR=15|0| FT=4  |1|d(0)                                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                     d(147)|P|P|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
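The sizes in this diagram can be checked with a little arithmetic: a 4-bit CMR, one 6-bit ToC entry and the 148 speech bits d(0)..d(147) make 158 bits, so 2 padding bits (P) are needed to reach a whole number of octets. The sketch below does this for any list of frame sizes; `payload_layout` is a hypothetical helper name.

```python
def payload_layout(frame_bits):
    """Return (used_bits, padding_bits, octets) for a bandwidth-efficient payload.

    frame_bits lists the number of speech bits of each frame in the payload.
    """
    # 4-bit CMR + one 6-bit ToC entry per frame + the speech bits themselves.
    used = 4 + 6 * len(frame_bits) + sum(frame_bits)
    padding = -used % 8  # bits needed to reach the next octet boundary
    return used, padding, (used + padding) // 8


# Single FT=4 frame (148 bits), as in the diagram above.
print(payload_layout([148]))  # (158, 2, 20)
```

The result of 20 octets matches the five 32-bit rows of the diagram.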
Single-channel multi-frame
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| CMR=1 |1| FT=0  |1|1| FT=9  |1|1| FT=15 |1|0| FT=1  |1|d(0)   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                         d(131)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|g(0)                                                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          g(39)|h(0)                                           |