Discussion on the principle of socket communication (C++ as an example)

Source: Internet
Author: User

1. How to communicate between processes in a network

There are many ways to do local interprocess communication (IPC), but they can be summed up in the following four categories:

1) Message passing (pipes, FIFOs, message queues)
2) Synchronization (mutexes, condition variables, read-write locks, file and record locks, semaphores)
3) Shared memory (anonymous and named)
4) Remote procedure calls (Solaris doors and Sun RPC)

But these are not the subject of this article. What we want to discuss is how processes communicate across a network. The first problem to solve is how to uniquely identify a process; without that, communication is impossible. Locally, the PID uniquely identifies a process, but this does not work across a network. The TCP/IP protocol family already solves this problem: the network-layer IP address uniquely identifies a host in the network, and the transport-layer "protocol + port" uniquely identifies an application (process) on that host. A process on the network can therefore be identified by the triple (IP address, protocol, port), and processes on the network can use this identifier to communicate with each other.

Applications that use the TCP/IP protocols usually communicate through one of two application programming interfaces: UNIX BSD sockets or UNIX System V TLI (now obsolete). Today almost all applications use sockets. This is the network era, communication between network processes is everywhere, and that is why I say "everything is a socket."

2. What is a socket

We now know that network processes communicate through sockets, so what is a socket? Sockets originated in UNIX, and one of the basic philosophies of UNIX/Linux is "everything is a file": everything can be operated on with the "open -> read/write -> close" pattern. My understanding is that a socket is one implementation of this pattern: a socket is a special kind of file, and the socket functions are operations on it (read/write I/O, open, close). These functions are introduced later.

The origin of the word socket: its first use in networking is found in IETF RFC 33, published on February 12, 1970, by Stephen Carr, Steve Crocker, and Vint Cerf. According to the Computer History Museum, Crocker writes: "The elements of the name space are called sockets. A socket forms one end of a connection, and a connection is fully specified by a pair of sockets." The Computer History Museum adds that this is about 12 years earlier than the BSD socket interface definition.

3. Basic socket operations

Since a socket is an implementation of the "open -> read/write -> close" pattern, the socket API provides function interfaces for these operations. The following uses TCP as an example to introduce the basic socket interface functions.

3.1. socket() function (create a socket)

int socket(int domain, int type, int protocol);

The socket() function corresponds to the open operation on an ordinary file. Opening an ordinary file returns a file descriptor, while socket() creates a socket descriptor that uniquely identifies a socket. Like a file descriptor, the socket descriptor is passed as an argument to subsequent operations, and reads and writes go through it.

Just as you can pass different arguments to fopen to open different files, you can pass different arguments to socket() to create different kinds of sockets. The three parameters of socket() are:

domain: the protocol domain, also called the protocol family. Common protocol families are AF_INET, AF_INET6, AF_LOCAL (also called AF_UNIX, the UNIX domain socket), AF_ROUTE, and so on. The protocol family determines the socket's address type, and the corresponding address must be used in communication: AF_INET uses a combination of an IPv4 address (32-bit) and a port number (16-bit), while AF_UNIX uses an absolute path name as the address.

type: the socket type. Common socket types are SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, SOCK_PACKET, SOCK_SEQPACKET, and so on.

protocol: as the name suggests, the protocol to use. Common protocols are IPPROTO_TCP, IPPROTO_UDP, IPPROTO_SCTP, and IPPROTO_TIPC, which correspond to the TCP, UDP, SCTP, and TIPC transport protocols respectively (I will discuss TIPC separately).

Note: type and protocol cannot be combined arbitrarily; for example, SOCK_STREAM cannot be combined with IPPROTO_UDP. When protocol is 0, the default protocol for the given type is selected automatically.
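The parameter rules above can be tried out with a small sketch. The helper name can_create_socket is hypothetical (it is not part of the socket API), and the behavior for the invalid combination is what I would expect on a typical Linux system, where socket() fails with EPROTONOSUPPORT.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: returns true if socket(domain, type, protocol)
// succeeds. The descriptor is closed again so nothing leaks.
bool can_create_socket(int domain, int type, int protocol) {
    int fd = socket(domain, type, protocol);
    if (fd < 0) {
        return false;  // errno holds the reason, e.g. EPROTONOSUPPORT
    }
    close(fd);
    return true;
}
```

Calling can_create_socket(AF_INET, SOCK_STREAM, 0) lets the system pick TCP as the default protocol for a stream socket, while can_create_socket(AF_INET, SOCK_STREAM, IPPROTO_UDP) should return false because the type and protocol do not match.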

When we call socket() to create a socket, the returned socket descriptor exists in a protocol family (address family, AF_XXX) space but has no specific address. To assign an address to it, you must call bind(); otherwise the system automatically assigns a random port when you call connect() or listen().

3.2. bind() function

As mentioned above, the bind() function assigns a specific address from an address family to the socket. For AF_INET and AF_INET6, for example, it assigns a combination of an IPv4 or IPv6 address and a port number to the socket.

int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);

The three parameters of the function are:

sockfd: the socket descriptor, created by socket(), which uniquely identifies a socket. bind() binds a name to this descriptor.

addr: a const struct sockaddr * pointer to the protocol address to bind to sockfd. The address structure depends on the address family used when the socket was created. For IPv4 it is:

struct sockaddr_in {
    sa_family_t    sin_family; /* address family: AF_INET */
    in_port_t      sin_port;   /* port in network byte order */
    struct in_addr sin_addr;   /* Internet address */
};

/* Internet address. */
struct in_addr {
    uint32_t       s_addr;     /* address in network byte order */
};

IPv6 corresponds to:

struct sockaddr_in6 {
    sa_family_t     sin6_family;   /* AF_INET6 */
    in_port_t       sin6_port;     /* port number */
    uint32_t        sin6_flowinfo; /* IPv6 flow information */
    struct in6_addr sin6_addr;     /* IPv6 address */
    uint32_t        sin6_scope_id; /* Scope ID (new in 2.4) */
};

struct in6_addr {
    unsigned char   s6_addr[16];   /* IPv6 address */
};

addrlen: the length of the address structure.
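The three parameters can be put together as in the following minimal sketch for IPv4. The helper names bind_loopback and bound_port are hypothetical, and binding to the loopback address with port 0 (so the kernel picks a free port) is an assumption made only to keep the example from colliding with a real service.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: creates a TCP socket and binds it to 127.0.0.1:port.
// Returns the descriptor, or -1 on failure. Port 0 lets the kernel choose.
int bind_loopback(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;                      // must match the socket's domain
    addr.sin_port = htons(port);                    // port in network byte order
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  // 127.0.0.1 in network byte order

    // addr is passed as a generic sockaddr pointer; addrlen is its real size.
    if (bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Returns the port (in host byte order) the socket is bound to, or 0 on error.
uint16_t bound_port(int fd) {
    sockaddr_in addr;
    socklen_t len = sizeof(addr);
    if (getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) return 0;
    return ntohs(addr.sin_port);
}
```

A server would normally pass its well-known port instead of 0; getsockname() is used here only to show which address the kernel actually assigned.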

Typically, a server binds a well-known address (such as an IP address + port number) when it starts and uses it to provide its service, so clients can connect to the server through that address. A client does not need to specify one: the system automatically assigns a port number and combines it with the client's own IP address. This is why a server usually calls bind() before listen(), while a client does not call it and instead lets the system pick a random port at connect().

3.3. Network byte order and host byte order

Host byte order is what we usually call big-endian and little-endian mode: different CPUs have different byte orders, that is, different orders in which the bytes of an integer are stored in memory. This is called host byte order. The standard definitions of big-endian and little-endian are as follows:

a) Little-endian: the low-order byte is stored at the lower memory address, and the high-order byte at the higher memory address.

b) Big-endian: the high-order byte is stored at the lower memory address, and the low-order byte at the higher memory address.
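The two layouts can be observed directly with a short sketch. The helper names host_is_little_endian and network_bytes are hypothetical; htonl() is the standard function that converts a 32-bit value from host order to the big-endian order used on the network.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <arpa/inet.h>

// Returns true if this host stores the low-order byte at the lowest address.
bool host_is_little_endian() {
    const uint32_t value = 0x01020304;
    unsigned char bytes[4];
    std::memcpy(bytes, &value, sizeof(value));
    return bytes[0] == 0x04;
}

// Copies the in-memory bytes of htonl(value) into out[0..3].
// Whatever the host order is, out[0] ends up holding the high-order byte.
void network_bytes(uint32_t value, unsigned char out[4]) {
    const uint32_t net = htonl(value);
    std::memcpy(out, &net, sizeof(net));
}
```

For 16-bit values such as port numbers, htons() and ntohs() play the same role as htonl() and ntohl().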

Network byte order: a 4-byte (32-bit) value is transmitted in the following order: bits 0~7 first, then bits 8~15, then bits 16~23, and finally bits 24~31, where bit 0 is the most significant bit, as in TCP/IP header diagrams. This transmission order is big-endian. Because all binary integers in TCP/IP headers must be in this order when they are transmitted across the network, it is also called network byte order. Byte order, as the name implies, is the order in which data larger than one byte is stored in memory; a single byte has no ordering problem.

So: when binding an address to a socket, first convert the values from host byte order to network byte order, and never assume that the host byte order is the same big-endian order as the network. This problem has caused real damage: in our company's project code it led to many baffling problems. So remember: make no assumptions about the host byte order, and always convert to network byte order before assigning a value to the socket address.

3.4. listen() and connect() functions

A server calls listen() after socket() and bind() to listen on the socket. If a client then calls connect() to issue a connection request, the server receives the request.

int listen(int sockfd, int backlog);

int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);

The first parameter of listen() is the descriptor of the socket to listen on, and the second is the maximum number of connections that can be queued for that socket. A socket created by socket() is an active socket by default; listen() turns it into a passive socket that waits for connection requests from clients.
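The switch from active to passive can be checked with the SO_ACCEPTCONN socket option, as in this sketch. The helper names make_listener and is_listening are hypothetical, and the loopback address with a kernel-chosen port is an assumption made for the example.

```cpp
#include <cassert>
#include <cstring>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: returns a TCP socket listening on 127.0.0.1 with a
// kernel-chosen port, or -1 on failure.
int make_listener(int backlog) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {  // backlog bounds the queue of pending connections
        close(fd);
        return -1;
    }
    return fd;
}

// Returns true if the socket is in the passive (listening) state.
bool is_listening(int fd) {
    int value = 0;
    socklen_t len = sizeof(value);
    return getsockopt(fd, SOL_SOCKET, SO_ACCEPTCONN, &value, &len) == 0 && value != 0;
}
```

A freshly created socket reports false from is_listening(); the descriptor returned by make_listener() reports true.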

The first parameter of connect() is the client's socket descriptor, the second is the server's socket address, and the third is the length of that address. The client establishes a connection to the TCP server by calling connect().

3.5. accept() function

After the TCP server calls socket(), bind(), and listen() in turn, it listens on the specified socket address. After the TCP client calls socket() and connect() in turn, it sends a connection request to the TCP server. When the TCP server detects this request, it calls accept() to take the request, and the connection is established. Network I/O can then begin, that is, read and write operations similar to those on ordinary files.

int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

The first parameter of accept() is the server's socket descriptor. The second is a struct sockaddr * pointer through which the client's protocol address is returned, and the third is a pointer to the length of that address. If accept() succeeds, its return value is a brand-new descriptor generated automatically by the kernel, representing the TCP connection to the client.
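The whole sequence can be sketched in a single process that plays both roles over the loopback interface. The LoopbackConn type and the helpers close_all and loopback_connect are hypothetical names. Running server and client in one thread is an assumption that works because the kernel queues the completed connection before accept() is called.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Descriptors and ports produced by one loopback connection (hypothetical type).
struct LoopbackConn {
    int listener = -1;         // listening socket descriptor
    int client = -1;           // client socket, created by socket()
    int accepted = -1;         // connected socket descriptor from accept()
    uint16_t client_port = 0;  // port the kernel assigned to the client
    uint16_t peer_port = 0;    // client port as reported by accept()
};

// Closes every descriptor that was opened.
void close_all(LoopbackConn& c) {
    if (c.accepted >= 0) close(c.accepted);
    if (c.client >= 0) close(c.client);
    if (c.listener >= 0) close(c.listener);
    c.accepted = c.client = c.listener = -1;
}

// Server and client in one process: listen on 127.0.0.1, connect, accept.
bool loopback_connect(LoopbackConn& c) {
    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);  // let the kernel choose the server port
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    socklen_t len = sizeof(addr);

    c.listener = socket(AF_INET, SOCK_STREAM, 0);
    if (c.listener < 0 ||
        bind(c.listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        listen(c.listener, 1) < 0 ||
        getsockname(c.listener, reinterpret_cast<sockaddr*>(&addr), &len) < 0) {
        close_all(c);
        return false;
    }

    // addr now holds the server's real address; the client connects to it.
    c.client = socket(AF_INET, SOCK_STREAM, 0);
    if (c.client < 0 ||
        connect(c.client, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        close_all(c);
        return false;
    }

    // accept() fills in the client's address and returns a new descriptor.
    sockaddr_in peer;
    socklen_t peer_len = sizeof(peer);
    c.accepted = accept(c.listener, reinterpret_cast<sockaddr*>(&peer), &peer_len);

    sockaddr_in local;
    socklen_t local_len = sizeof(local);
    if (c.accepted < 0 ||
        getsockname(c.client, reinterpret_cast<sockaddr*>(&local), &local_len) < 0) {
        close_all(c);
        return false;
    }
    c.peer_port = ntohs(peer.sin_port);
    c.client_port = ntohs(local.sin_port);
    return true;
}
```

After a successful call, accepted is a different descriptor from listener, and the port reported through accept()'s address argument matches the port the kernel assigned to the client.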

Note: The first parameter of accept() is the server's socket descriptor, created by the server's call to socket() and called the listening socket descriptor; accept() returns the connected socket descriptor. A server usually creates only one listening socket descriptor, which exists for the server's whole lifetime. The kernel creates a connected socket descriptor for each client connection accepted by the server process, and that connected socket descriptor is closed when the server finishes serving the client.

3.6. read(), write(), and other functions

Everything is now in place: the server and the client have established a connection. Network I/O calls can now be used to read and write, which is how different processes communicate over the network. Network I/O operations come in the following groups:

read()/write()
recv()/send()
readv()/writev()
recvmsg()/sendmsg()
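As a minimal sketch of the first pair, the helpers below loop until the whole buffer has been transferred, since read() and write() on a stream socket may move fewer bytes than requested. The names write_all and read_exact are hypothetical. To try them without a network, a connected pair from socketpair(AF_UNIX, SOCK_STREAM, 0, sv) can stand in for a TCP connection; that substitution is my assumption, not something the article uses.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Writes exactly len bytes to fd, retrying after short writes and EINTR.
bool write_all(int fd, const char* buf, size_t len) {
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal, retry
            return false;
        }
        done += static_cast<size_t>(n);
    }
    return true;
}

// Reads exactly len bytes from fd. Returns false on error or if the peer
// closes the connection before len bytes have arrived.
bool read_exact(int fd, char* buf, size_t len) {
    size_t done = 0;
    while (done < len) {
        ssize_t n = read(fd, buf + done, len - done);
        if (n == 0) return false;  // end of stream: peer closed
        if (n < 0) {
            if (errno == EINTR) continue;
            return false;
        }
        done += static_cast<size_t>(n);
    }
    return true;
}
```

The same loops apply to send()/recv(), which take an extra flags argument.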

The read and write functions may differ from one programming language to another, but in every case network I/O comes down to writing the message you want to send to the socket as a byte stream, or reading bytes from the socket.

3.7. close() function

After the server and client establish a connection, they perform some read and write operations. When these are done, the corresponding socket descriptor should be closed, just as fclose is called to close a file after working with it.

#include <unistd.h>

int close(int fd);

The default behavior of close() on a TCP socket is to mark the socket as closed and return to the calling process immediately. The descriptor can no longer be used by the calling process; that is, it can no longer be passed as the first argument to read or write.
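The effect of close() on the descriptor, and on the connection behind it, can be observed with the sketch below. The helper names is_open and peer_closed are hypothetical. MSG_DONTWAIT is not available on every platform, and the suggested setup (a socketpair() with one end duplicated by dup()) is an assumption standing in for a TCP connection shared by several descriptors.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Returns true if fd still names an open descriptor in this process.
// After close(fd), fcntl() fails with EBADF, and so would read() or write().
bool is_open(int fd) {
    return fcntl(fd, F_GETFD) != -1;
}

// Returns true if the other end of the connection has been fully closed,
// i.e. a read on fd would report end of stream (return 0).
// MSG_PEEK leaves any pending data in place; MSG_DONTWAIT avoids blocking.
bool peer_closed(int fd) {
    char byte;
    return recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT) == 0;
}
```

If one end of a socket pair is duplicated with dup() and only the original descriptor is closed, is_open() reports false for it, yet peer_closed() on the other end stays false until the duplicate is closed too, because the connection itself is released only when its last descriptor goes away.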

Note: close() only decrements the reference count of the socket descriptor by 1. Only when the reference count reaches 0 does the TCP client send a connection termination request to the server.

4. The TCP three-way handshake in sockets, in detail

SYN requests that a connection be established,

FIN indicates that the connection is being closed,

ACK indicates an acknowledgment,

PSH indicates data transfer (push),

RST indicates a connection reset.

We know that TCP establishes a connection with a "three-way handshake", that is, an exchange of three segments. The general process is as follows:

The client sends SYN J to the server.
The server responds to the client with SYN K and acknowledges SYN J with ACK J+1.
The client then acknowledges SYN K to the server with ACK K+1.
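The numbering in these three steps can be written down as a small model. This is only an illustration of the arithmetic, not real TCP code: the Segment type and the three functions are hypothetical, and the only fields modeled are the SYN and ACK flags, the sequence number, and the acknowledgment number.

```cpp
#include <cassert>
#include <cstdint>

// Simplified view of a TCP segment header (hypothetical type).
struct Segment {
    bool syn;         // SYN flag
    bool ack;         // ACK flag
    uint32_t seq;     // sequence number
    uint32_t ack_no;  // acknowledgment number, meaningful only if ack is set
};

// Step 1: the client sends SYN J.
Segment client_syn(uint32_t j) {
    return {true, false, j, 0};
}

// Step 2: the server replies with SYN K and ACK J+1.
Segment server_syn_ack(const Segment& syn, uint32_t k) {
    return {true, true, k, syn.seq + 1};
}

// Step 3: the client replies with ACK K+1. Its own sequence number is now
// J+1, because the SYN consumed one sequence number.
Segment client_ack(const Segment& syn_ack) {
    return {false, true, syn_ack.ack_no, syn_ack.seq + 1};
}
```

With J = 100 and K = 300, the second segment carries ACK 101 and the third carries ACK 301. The unsigned arithmetic wraps around at 2^32, as TCP sequence numbers do.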

Only three segments are exchanged, but the three-way handshake takes place inside several socket functions. See the figure below:

Figure 1. The TCP three-way handshake in socket calls

As the figure shows, when the client calls connect(), it triggers the connection request and sends SYN J to the server, and connect() enters the blocked state. The server detects the connection request, that is, it receives SYN J; it calls accept() to receive the request and sends SYN K, ACK J+1 to the client, and accept() enters the blocked state. When the client receives the server's SYN K, ACK J+1, connect() returns and the client acknowledges SYN K. When the server receives ACK K+1, accept() returns. The three-way handshake is now complete and the connection is established.

5. The TCP four-way handshake (connection release) in sockets, in detail

The above describes how the TCP three-way handshake is carried out in sockets and the socket functions involved. Now we introduce the four-way handshake that releases a connection in sockets; see the following figure:
