SIP (Session Initiation Protocol) is an application-layer control protocol used to establish, modify, and terminate multimedia sessions. It borrows heavily from the mature HTTP protocol (for example, the text-based message encoding and the method in request messages). SIP messages are text encoded in UTF-8 and can be carried over UDP or TCP (UDP preferred). Like the Diameter protocol, SIP consists of a base protocol and many extensions; the base protocol is defined in RFC 3261. This article summarizes the key points of the base protocol.
1. Basic concepts:
·Session: In short, a session is a call. All the SIP messages exchanged between call setup and hang-up belong to one session and carry the same Call-ID.
·Dialog: A peer-to-peer relationship between the user agents at the two ends that lasts for a period of time. A dialog is identified by a dialog ID made up of three parts: the Call-ID, the tag in the From header, and the tag in the To header. Only 2xx and 101-199 responses to an INVITE request can create a dialog. 100 Trying cannot create one because its To header normally carries no tag.
·Transaction: A transaction consists of one request message and one or more provisional or final responses to it. If the final response to an INVITE is 200 OK, the ACK that follows is treated as a separate transaction.
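To make the three-part dialog ID concrete, here is a minimal Python sketch that pulls the Call-ID and the From/To tags out of headers that have already been split into name/value pairs. The helper names `extract_tag` and `dialog_id`, the dict-of-headers input, and the From tag value 76341 are assumptions made for illustration; they do not come from RFC 3261 or from any SIP library, and real header parsing has more cases to handle.

```python
import re
from typing import Dict, Optional, Tuple

# Matches a ";tag=value" parameter in a From or To header value.
_TAG_RE = re.compile(r";\s*tag\s*=\s*([^;\s>]+)", re.IGNORECASE)


def extract_tag(header_value: str) -> Optional[str]:
    """Return the tag parameter of a From/To header value, or None."""
    # Only look at the parameters after the closing ">" of the URI, so that
    # parameters inside the URI itself are not mistaken for the header tag.
    params = header_value.rsplit(">", 1)[-1]
    match = _TAG_RE.search(params)
    return match.group(1) if match else None


def dialog_id(headers: Dict[str, str]) -> Optional[Tuple[str, str, str]]:
    """Return (Call-ID, From tag, To tag), or None if any part is missing.

    Hypothetical helper: `headers` maps header names to their raw values.
    """
    call_id = headers.get("Call-ID")
    from_tag = extract_tag(headers.get("From", ""))
    to_tag = extract_tag(headers.get("To", ""))
    if not (call_id and from_tag and to_tag):
        return None  # for example, a 100 Trying whose To header has no tag
    return (call_id, from_tag, to_tag)
```

Under these assumptions, a response whose To header has no tag yields `None`, which mirrors the rule that such a response cannot create a dialog.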
2. SIP network entities:
·UA (User Agent): A SIP device that interacts directly with the user. It can be a hardware SIP phone or software with SIP phone functionality running on a computer. The UA that sends a SIP request is the UAC (User Agent Client); the UA that receives the request is the UAS (User Agent Server). A UA generally acts as both UAC and UAS.
·Proxy Server: Forwards messages to the end user or to another proxy server.
·Redirect Server: Like the redirect server in Diameter, it does not forward the message. Instead, it returns one or more addresses to the sender, which is then expected to send the message to those addresses itself. An address returned by the redirect server is not necessarily the end user's address; it may be the address of another proxy server.
·Registrar: Because SIP needs to support user mobility, a user who moves to a new location must register the terminal there. The registrar accepts SIP user registrations so that the network knows where to find the user at any moment. A registrar is usually co-located with a SIP server (proxy server or redirect server).
·Location Server: This is not a SIP entity, because it does not need to run a SIP protocol stack. The SIP server talks to the location server over a protocol other than SIP (for example, LDAP). The location server stores the location data (IP address or hostname) of SIP users. For example, when a SIP user registers with the SIP server, the SIP server uploads the user's location information to the location server. Later, when the SIP server receives a message that must be delivered to that user, it asks the location server for the user's location (IP address or hostname).
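The register-then-lookup interplay between the SIP server and the location server can be pictured with a small Python sketch. The `LocationStore` class is a hypothetical in-memory stand-in for illustration only; a real deployment would talk to a separate service (for example over LDAP), and would also deal with expiry and multiple contacts per user, which are omitted here.

```python
from typing import Dict, Optional


class LocationStore:
    """Hypothetical in-memory stand-in for a location server."""

    def __init__(self) -> None:
        # Maps a user's public SIP URI to the contact where the user
        # can currently be reached.
        self._bindings: Dict[str, str] = {}

    def register(self, public_uri: str, contact: str) -> None:
        """Called by the SIP server when a user registers (or re-registers)."""
        self._bindings[public_uri] = contact

    def lookup(self, public_uri: str) -> Optional[str]:
        """Called by the SIP server when a message must reach the user."""
        return self._bindings.get(public_uri)


store = LocationStore()
store.register("sip:Laura.Brown@university.com",
               "sip:Laura@workstation1000.university.com")
current_contact = store.lookup("sip:Laura.Brown@university.com")
```

A later registration from a new location simply replaces the stored contact, which is how this sketch models user mobility.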
3. SIP Message classification:
There are two types of SIP messages:
· Request message: a message sent from the UAC to the UAS. The base protocol defines INVITE, ACK, BYE, CANCEL, OPTIONS, and REGISTER.
· Response message: a message the UAS sends back to the UAC. Responses fall into the classes 1xx, 2xx, 3xx, 4xx, 5xx, and 6xx, with the following meanings:
Class | Meaning | Response type
1xx | Progress | Provisional response
2xx | Successful | Final response
3xx | Redirection | Final response
4xx | Client error | Final response
5xx | Server error | Final response
6xx | Global failure | Final response
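The response classes can be expressed as a small lookup on the first digit of the status code. The following Python sketch is only an illustration; the function name `classify_response` and the returned tuple layout are assumptions made for this example.

```python
from typing import Tuple

# Class names keyed by the first digit of the status code.
_CLASS_NAMES = {
    1: "Progress",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
    6: "Global failure",
}


def classify_response(status_code: int) -> Tuple[str, bool]:
    """Return (class name, is_final) for a SIP response status code."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    first_digit = status_code // 100
    # Only 1xx responses are provisional; every other class is final.
    return _CLASS_NAMES[first_digit], first_digit != 1
```

For instance, 180 Ringing falls in the provisional Progress class, while 200 OK is a final Successful response.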
4. SIP Message format:
Because SIP uses text encoding, the message format is very simple: a message header and an optional message body. The first line is the start line (a request line or a status line). The header fields begin on the second line; each line has the form "name: value" and describes one attribute. Many header fields exist: some are defined in the base protocol, and the extensions define their own. If the message has a body, the header and the body are separated by a blank line. A message with a body usually carries the Content-Type and Content-Length header fields, which describe the body, for example:
Content-Type: application/sdp
Content-Length: 212
A SIP message can also carry multiple message bodies, such as SDP information together with a photo of the caller, so that the caller's picture can be displayed to the called party.
When a SIP message passes through the proxy, the proxy only cares about the message header and does not check the message body. Therefore, the message body is transparent to the proxy.
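The layout described in this section (start line, header lines, blank line, optional body) can be split apart with a few lines of Python. This is a simplified sketch: the function name `parse_sip_message` is an assumption, and the code ignores details such as folded header lines and compact header names that a real parser has to handle.

```python
from typing import List, Tuple


def parse_sip_message(raw: str) -> Tuple[str, List[Tuple[str, str]], str]:
    """Split a SIP message into (start line, header fields, body)."""
    # The header and the body are separated by a blank line (CRLF CRLF).
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        # Split on the first colon only; values such as
        # "SIP/2.0/UDP host:5060" contain colons themselves.
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


raw_message = (
    "INVITE sip:Bob.Johnson@company.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP workstation1000.university.com:5060\r\n"
    "Content-Type: application/sdp\r\n"
    "Content-Length: 5\r\n"
    "\r\n"
    "v=0\r\n"
)
start_line, headers, body = parse_sip_message(raw_message)
```

A proxy that follows the behavior described above would read `start_line` and `headers` and pass `body` along untouched.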
5. SDP (Session Description Protocol):
The most common message body in SIP is SDP, so here is a brief overview. SDP was defined in RFC 2327 (since revised by RFC 4566 and RFC 8866). It carries the information a user needs to join a multimedia session, such as the IP address, the port number, and the date and time of the session. This resembles a TV program guide: with the guide, we can tune to the right channel at the right time to watch the program we want. SDP is specified separately and does not depend on SIP. An SDP description can be delivered through many channels, such as email or a web page; SIP is just one of the ways to transport it.
1) SDP Syntax:
SDP is also text-based. An SDP description can contain many lines, each in the following format:
type=value
The type is always a single letter. An SDP description usually consists of one session-level section and several media-level sections. The session-level section describes the session as a whole, and each media-level section describes one media stream. The session-level section starts with "v=0", and each media-level section starts with "m=<media type> <port number> <transport protocol> <media formats>". The following example SDP description contains three media-level sections:
v=0
o=Bob 2890844526 2890842807 IN IP4 131.160.1.112
s=SIP Seminar
i=A Seminar on the Session Initiation Protocol
u=http://www.cs.columbia.edu/sip
e=bob@university.edu
c=IN IP4 224.2.17.12/127
t=2873397496 2873404696
a=recvonly
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 51372 RTP/AVP 31
a=rtpmap:31 H261/90000
m=video 53000 RTP/AVP 32
a=rtpmap:32 MPV/90000
In this example, o identifies the originator of the session (Bob) and his IP address, s gives the name of the session, and i gives general information about it. u points to a URL where more information about the session can be found, and e is the email address of the session contact. c and t describe where (the multicast address) and when the session can be received. Each m line gives the port number, transport protocol, and media formats of one media stream. a lines can be used to extend SDP. For example, if the two parties want to negotiate the audio volume, they could use the following SDP description:
m=audio 49170 RTP/AVP 0
a=volume:8
This only works if both parties understand what volume means. A party that does not understand the volume attribute does not raise an error; it simply ignores the line.
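The split into one session-level section and several media-level sections can be reproduced with a short Python sketch. The function name `parse_sdp` and the shape of the returned dictionary are assumptions made for this illustration. The sketch also simplifies the m line (it expects a plain integer port) and performs no validation of individual values.

```python
from typing import Dict, List, Tuple


def parse_sdp(text: str) -> Dict[str, list]:
    """Group SDP lines into a session-level part and media-level parts."""
    session: List[Tuple[str, str]] = []  # lines before the first m= line
    media: List[dict] = []               # one entry per m= line
    current = session
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        line_type, sep, value = line.partition("=")
        if not sep or len(line_type) != 1:
            raise ValueError(f"not a type=value line: {line!r}")
        if line_type == "m":
            # m=<media type> <port number> <transport protocol> <media formats>
            media_type, port, transport, *formats = value.split()
            section = {
                "media": media_type,
                "port": int(port),
                "transport": transport,
                "formats": formats,
                "lines": [],
            }
            media.append(section)
            current = section["lines"]  # following lines belong to this stream
        else:
            current.append((line_type, value))
    return {"session": session, "media": media}
```

Attributes the caller does not recognize, such as the volume example above, simply stay in the `lines` list, so a receiver written this way can skip them without raising an error.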
2) Common attributes in an SDP description:
Type | Meaning
v | Protocol version
o | Owner of the session and session identifier
s | Name of the session
i | Information about the session
u | URL containing a description of the session
e | E-mail address to obtain information about the session
p | Phone number to obtain information about the session
c | Connection information
b | Bandwidth information
z | Time zone adjustments
k | Encryption key
a | Attribute lines
t | Time when the session is active
r | Times when the session will be repeated
m | Media line
i | Information about a media line
6. Analysis of SIP call process instances:
This section walks through a complete SIP call message flow and concentrates on the flow itself. The next article will give a slightly more complex example that focuses on SIP message routing and the meaning of the common SIP header fields.
Laura wants to talk to Bob. She calls Bob's public URI, sip:Bob.Johnson@company.com, which sends an INVITE message toward Bob. The SDP carried in the INVITE indicates that Laura expects to receive RTP packets containing PCM-encoded voice on UDP port 20002. On receiving the INVITE, the proxy forwards it to Bob and sends 100 Trying back to Laura (100 Trying is hop-by-hop and is not forwarded). When Bob's terminal receives the INVITE, it starts ringing and returns 180 Ringing. When the 180 Ringing response reaches Laura, she hears the ringback tone.
INVITE sip:Bob.Johnson@company.com SIP/2.0
Via: SIP/2.0/UDP workstation1000.university.com:5060
From: Laura Brown <sip:Laura.Brown@university.com>
To: Bob Johnson <sip:Bob.Johnson@company.com>
Call-ID: 12345678@workstation1000.university.com
CSeq: 1 INVITE
Contact: Laura Brown <sip:Laura@workstation1000.university.com>
Content-Type: application/sdp
Content-Length: 154

v=0
o=Laura 2891234526 2891234526 IN IP4 workstation1000.university.com
s=Let us talk for a while
c=IN IP4 138.85.27.10
t=0 0
m=audio 20002 RTP/AVP 0
After Bob picks up, his terminal returns a 200 OK final response to Laura. The response carries an SDP description indicating that Bob can receive packets on UDP port 41000. After receiving the 200 OK, Laura sends Bob an ACK to confirm that the 200 OK has arrived. At this point the two parties are in the call.
SIP/2.0 200 OK
Via: SIP/2.0/UDP 131.160.1.110
Via: SIP/2.0/UDP workstation1000.university.com:5060
From: Laura Brown <sip:Laura.Brown@university.com>
To: Bob Johnson <sip:Bob.Johnson@company.com>;tag=314159
Call-ID: 12345678@workstation1000.university.com
CSeq: 1 INVITE
Contact: Bob Johnson <sip:Bob@131.160.1.112>
Content-Type: application/sdp
Content-Length: 154

v=0
o=Bob 2891234321 2891234321 IN IP4 131.160.1.112
s=Let us talk for a while
c=IN IP4 131.160.1.112
t=0 0
m=audio 41000 RTP/AVP 0
When the conversation is over, Bob sends a BYE message to Laura. Laura answers Bob with 200 OK, and the call ends.
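The practical outcome of the INVITE/200 OK exchange in this example is that each side learns where to send its RTP audio. The Python sketch below reads that address and port out of the two SDP bodies from the call above. The function name `audio_endpoint` is an assumption made for illustration, and the sketch only handles the simple unicast case shown in this example (one c line with a plain address and one audio m line with an integer port).

```python
from typing import Optional, Tuple


def audio_endpoint(sdp: str) -> Tuple[str, int]:
    """Return (address, port) where the sender of this SDP expects audio."""
    address: Optional[str] = None
    port: Optional[int] = None
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("c=") and address is None:
            # "c=IN IP4 138.85.27.10": the address is the last field.
            address = line.split()[-1]
        elif line.startswith("m=audio ") and port is None:
            # "m=audio 20002 RTP/AVP 0": the port is the second field.
            port = int(line.split()[1])
    if address is None or port is None:
        raise ValueError("SDP lacks a c= line or an audio m= line")
    return address, port


# SDP bodies taken from the INVITE and the 200 OK in the example above.
laura_offer = (
    "v=0\r\n"
    "o=Laura 2891234526 2891234526 IN IP4 workstation1000.university.com\r\n"
    "s=Let us talk for a while\r\n"
    "c=IN IP4 138.85.27.10\r\n"
    "t=0 0\r\n"
    "m=audio 20002 RTP/AVP 0\r\n"
)
bob_answer = (
    "v=0\r\n"
    "o=Bob 2891234321 2891234321 IN IP4 131.160.1.112\r\n"
    "s=Let us talk for a while\r\n"
    "c=IN IP4 131.160.1.112\r\n"
    "t=0 0\r\n"
    "m=audio 41000 RTP/AVP 0\r\n"
)
bob_sends_to = audio_endpoint(laura_offer)    # where Bob sends RTP
laura_sends_to = audio_endpoint(bob_answer)   # where Laura sends RTP
```

With these inputs, Bob sends his audio to 138.85.27.10 port 20002 and Laura sends hers to 131.160.1.112 port 41000, matching the ports named in the walkthrough.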
References:
1. RFC 3261, sections 4, 12, 17, and 24
2. "SIP Demystified", chapters 4 and 5