"Everything is a socket!"
Although this is a bit of an exaggeration, the fact is that almost all network programming today uses sockets.
-- Inspired by hands-on programming and by studying open-source projects.
We all appreciate the value of exchanging information. So how do processes communicate over a network? For example, when we open a browser every day, how does the browser process talk to the web server? When you chat on QQ, how does the QQ process communicate with the server, or with the QQ process on your friend's machine? All of this relies on sockets. So what is a socket? What types of socket are there? And what are the basic socket functions? This article covers each of these questions. The main contents are as follows:
1. How do processes communicate with each other in the network?
2. What is Socket?
3. Basic socket operations
3.1. socket() function
3.2. bind() function
3.3. listen() and connect() functions
3.4. accept() function
3.5. read(), write(), and other functions
3.6. close() function
4. Details of the TCP three-way handshake in sockets
5. Details of the TCP four-way handshake that releases a connection in sockets
6. An example (practice)
7. Hands-on: a question for you. Replies are welcome!
1. How do processes communicate with each other in the network?
There are many ways to do local inter-process communication (IPC), but they fall into four categories:
Message passing (pipes, FIFOs, message queues)
Synchronization (mutexes, condition variables, read/write locks, file and record locks, semaphores)
Shared memory (anonymous and named)
Remote procedure calls (Solaris doors and Sun RPC)
These are not the topic of this article! What we want to discuss is how processes communicate across a network. The first problem to solve is how to uniquely identify a process; without that, communication is impossible. Locally, a process can be uniquely identified by its PID, but that does not work across a network. In fact, the TCP/IP protocol family has already solved this problem for us: the IP address at the network layer uniquely identifies a host on the network, and the "protocol + port" at the transport layer uniquely identifies an application (process) on that host. So the triple (IP address, protocol, port) identifies a process on the network, and processes on the network can use this identifier to interact with each other.
Applications that use TCP/IP usually communicate through one of two application programming interfaces: the sockets API from BSD UNIX, or TLI from UNIX System V (now obsolete). Today almost all applications use sockets, and in the Internet era, communication between processes on the network is everywhere. That is why I say "everything is a socket".
2. What is Socket?
We already know that processes on a network communicate through sockets. So what is a socket? Sockets originated in Unix, and one of the basic philosophies of Unix/Linux is "everything is a file": everything can be operated on with the "open -> read/write -> close" pattern. My understanding is that a socket is one implementation of this pattern. A socket is a special kind of file, and some of the socket functions are simply operations on it (read/write I/O, open, and close). We will introduce these functions later.
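As a small illustration of the "open -> read/write -> close" idea, the sketch below pushes a message through a pair of connected sockets using nothing but the ordinary file calls write(), read(), and close(). The helper name echo_through_socketpair is made up for this example, and it uses socketpair() with AF_UNIX only to get two connected descriptors without any network setup.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes msg into one end of a connected socket pair,
 * reads it back from the other end, then closes both descriptors.
 * Returns the number of bytes read, or -1 on error. */
ssize_t echo_through_socketpair(const char *msg, char *out, size_t outlen)
{
    int fds[2];
    size_t len = strlen(msg);
    ssize_t n = -1;

    if (len == 0 || outlen == 0)
        return -1;

    /* "open": create two connected stream sockets */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
        return -1;

    /* "write" and "read", exactly as with an ordinary file descriptor */
    if (write(fds[0], msg, len) == (ssize_t)len) {
        n = read(fds[1], out, outlen - 1);
        if (n >= 0)
            out[n] = '\0';
    }

    /* "close" */
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling echo_through_socketpair("hello", buf, sizeof(buf)) with a 32-byte buf should return 5 and leave "hello" in buf.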
Origin of the socket
The term was first used in networking in IETF RFC 33, published on February 12, 1970 and written by Stephen Carr, Steve Crocker, and Vint Cerf. According to the Computer History Museum, Crocker wrote: "The elements of the name space are called sockets. A socket forms one end of a connection, and a connection is fully specified by a pair of sockets." The Computer History Museum adds: "This is about 12 years earlier than the BSD socket interface definition."
3. Basic socket operations
Since a socket is an implementation of the "open -> read/write -> close" pattern, the sockets API provides a function for each of these operations. The following uses TCP as an example to introduce the basic socket interface functions.
3.1. socket() function
int socket(int domain, int type, int protocol);
The socket function corresponds to the open operation on an ordinary file. Opening an ordinary file returns a file descriptor; likewise, socket() creates a socket descriptor that uniquely identifies a socket. Like a file descriptor, the socket descriptor is passed as an argument to later operations, such as reads and writes.
Just as you can pass different arguments to fopen to open different files, you can pass different arguments to socket() to create different kinds of socket. The three parameters of the socket function are:
domain: the protocol domain, also called the protocol family. Common protocol families include AF_INET, AF_INET6, AF_LOCAL (also called AF_UNIX, the Unix domain socket), and AF_ROUTE. The protocol family determines the address type of the socket, and communication must use addresses of that type. For example, AF_INET uses a combination of an IPv4 address (32 bits) and a port number (16 bits), while AF_UNIX uses an absolute path name as the address.
type: the socket type. Common socket types include SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, SOCK_PACKET, and SOCK_SEQPACKET.
protocol: the protocol. Common protocols include IPPROTO_TCP, IPPROTO_UDP, IPPROTO_SCTP, and IPPROTO_TIPC, which correspond to the TCP, UDP, SCTP, and TIPC transport protocols respectively.
Note: type and protocol cannot be combined arbitrarily. For example, SOCK_STREAM cannot be combined with IPPROTO_UDP. When protocol is 0, the default protocol for the given type is selected automatically.
When we call socket() to create a socket, the returned descriptor exists in an address family (AF_XXX) name space but has no specific address. To assign an address to it, you must call bind(); otherwise, the system automatically assigns a random port when you call connect() or listen().
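The rule about combining type and protocol can be checked directly. The sketch below is only a probe: socket_combo_ok is a hypothetical helper name, and the expectation that SOCK_STREAM with IPPROTO_UDP is rejected is based on typical Linux behavior, so treat it as an assumption on other systems.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if socket(domain, type, protocol) succeeds,
 * 0 otherwise. The descriptor is closed immediately; this only probes
 * whether the combination is accepted. */
int socket_combo_ok(int domain, int type, int protocol)
{
    int fd = socket(domain, type, protocol);
    if (fd == -1)
        return 0;
    close(fd);
    return 1;
}
```

For example, socket_combo_ok(AF_INET, SOCK_STREAM, 0) and socket_combo_ok(AF_INET, SOCK_DGRAM, IPPROTO_UDP) should both succeed, while socket_combo_ok(AF_INET, SOCK_STREAM, IPPROTO_UDP) should fail.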
3.2. bind() function
As mentioned above, bind() assigns a specific address from an address family to the socket. For example, for AF_INET and AF_INET6 it assigns a combination of an IPv4 or IPv6 address and a port number to the socket.
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
The three parameters of the function are:
sockfd: the socket descriptor, created by socket(), that uniquely identifies a socket. bind() binds a name to this descriptor.
addr: a const struct sockaddr * pointer to the protocol address to bind to sockfd. The address structure depends on the address family used when the socket was created. For example, IPv4 uses:
struct sockaddr_in {
    sa_family_t    sin_family; /* address family: AF_INET */
    in_port_t      sin_port;   /* port in network byte order */
    struct in_addr sin_addr;   /* internet address */
};

/* Internet address. */
struct in_addr {
    uint32_t s_addr; /* address in network byte order */
};
IPv6 uses:
struct sockaddr_in6 {
    sa_family_t     sin6_family;   /* AF_INET6 */
    in_port_t       sin6_port;     /* port number */
    uint32_t        sin6_flowinfo; /* IPv6 flow information */
    struct in6_addr sin6_addr;     /* IPv6 address */
    uint32_t        sin6_scope_id; /* Scope ID (new in 2.4) */
};

struct in6_addr {
    unsigned char s6_addr[16]; /* IPv6 address */
};
The Unix domain uses:
#define UNIX_PATH_MAX 108

struct sockaddr_un {
    sa_family_t sun_family;              /* AF_UNIX */
    char        sun_path[UNIX_PATH_MAX]; /* pathname */
};
addrlen: the length of the address structure.
Generally, a server binds to a well-known address (such as an IP address plus port number) when it starts, and clients use that address to connect to it. A client does not need to specify an address; the system automatically assigns a port number combined with the client's own IP address. This is why a server usually calls bind() before listen(), while a client does not call it at all and lets the system pick a port at connect() time.
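A minimal sketch of bind() in use follows. The helper bind_loopback is a hypothetical name; it binds a TCP socket to the loopback address and then calls getsockname() to read back which port was actually assigned, which is useful when port 0 is passed so that the kernel picks a free one.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: creates a TCP socket bound to 127.0.0.1:port.
 * Pass port 0 to let the kernel choose a free port. On success returns the
 * descriptor and stores the bound port (host byte order) in *bound_port;
 * on failure returns -1. */
int bind_loopback(unsigned short port, unsigned short *bound_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* 127.0.0.1 */
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        close(fd);
        return -1;
    }

    *bound_port = ntohs(addr.sin_port);
    return fd;
}
```

A server would normally pass its well-known port here instead of 0; the caller is responsible for closing the returned descriptor.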
Network byte order and host byte order
Host byte order is what we usually call big-endian or little-endian mode: different CPUs store multi-byte integers in memory in different orders, and this order is called the host byte order. The standard definitions of big-endian and little-endian are:
a) Little-endian stores the low-order byte at the low memory address and the high-order byte at the high memory address.
b) Big-endian stores the high-order byte at the low memory address and the low-order byte at the high memory address.
Network byte order: a 4-byte, 32-bit value is transmitted in the following order: bits 0-7 first, then bits 8-15, then bits 16-23, and finally bits 24-31 (where bit 0 is the most significant bit). This transmission order is big-endian byte order. Because all binary integers in TCP/IP headers must be transmitted in this order, it is also called network byte order. Byte order, as the name implies, is the order in which the bytes of a multi-byte value are stored in memory; a single-byte value has no byte order.
Therefore, when binding an address to a socket, first convert from host byte order to network byte order; do not assume that the host uses the same big-endian byte order as the network. This mistake has caused real disasters! It existed in one of our company's project code bases and led to many baffling bugs. So remember: make no assumptions about host byte order, and always convert values to network byte order before assigning them to the socket address.
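The conversion can be made concrete with a short sketch. Both function names below are invented for illustration. The first reports the host's byte order; the second shows the bytes that htons() actually places in memory, which are the same on every host because network byte order is big-endian.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if this host is little-endian, 0 if it is big-endian. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1); /* look at the byte at the lowest address */
    return first == 1;
}

/* Copies the in-memory bytes of htons(port) into out[0] and out[1].
 * out[0] is always the high-order byte, whatever the host byte order. */
void port_wire_bytes(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);

    memcpy(out, &net, sizeof(net));
}
```

For port 6666 (0x1A0A), port_wire_bytes yields 0x1A followed by 0x0A on any host, and ntohs(htons(6666)) gives back 6666.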
3.3. listen() and connect() functions
On the server side, after socket() and bind(), the server calls listen() to listen on the socket. When a client then calls connect() to send a connection request, the server receives the request.
int listen(int sockfd, int backlog);
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
The first parameter of listen is the socket descriptor to listen on, and the second is the maximum number of connections that can be queued for that socket. A socket created by socket() is an active socket by default; listen() turns it into a passive socket that waits for clients' connection requests.
The first parameter of connect is the client's socket descriptor, the second is the server's socket address, and the third is the length of that address. The client calls connect to establish a connection with the TCP server.
3.4. accept() function
After the TCP server calls socket(), bind(), and listen() in turn, it is listening on the specified socket address. After the TCP client calls socket() and connect() in turn, it sends a connection request to the TCP server. Once the server sees this request, it calls accept() to accept it, and the connection is established. Network I/O can then begin, much like ordinary file read/write I/O.
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
The first parameter of accept is the server's socket descriptor. The second is a pointer to a struct sockaddr that receives the client's protocol address, and the third points to the length of that address. If accept succeeds, its return value is a brand-new descriptor generated by the kernel, representing the TCP connection to the client.
Note: the first parameter of accept is the server's socket descriptor, created when the server first calls socket(); it is called the listening socket descriptor. The value accept returns is the connected socket descriptor. A server usually creates only one listening socket descriptor, which exists for the server's whole lifetime. The kernel creates one connected socket descriptor for each client connection the server process accepts, and when the server finishes serving a client, the corresponding connected socket descriptor is closed.
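To see the listening descriptor and the connected descriptor side by side, here is a rough single-process sketch. The function loopback_connect is a hypothetical helper: it listens on the loopback address at a kernel-chosen port, connects a client socket to it, and accepts the connection. It relies on the kernel queuing the completed connection until accept() is called, which is the usual behavior on Linux but should be treated as an assumption elsewhere.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: stores the listening, client, and connected descriptors
 * in the three out parameters. Returns 0 on success, -1 on error. */
int loopback_connect(int *listenfd, int *clientfd, int *connfd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    *listenfd = *clientfd = *connfd = -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0); /* let the kernel choose the port */

    /* Server side: socket() -> bind() -> listen() */
    if ((*listenfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) goto fail;
    if (bind(*listenfd, (struct sockaddr *)&addr, sizeof(addr)) == -1) goto fail;
    if (listen(*listenfd, 10) == -1) goto fail;
    if (getsockname(*listenfd, (struct sockaddr *)&addr, &len) == -1) goto fail;

    /* Client side: socket() -> connect() to the address found above */
    if ((*clientfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) goto fail;
    if (connect(*clientfd, (struct sockaddr *)&addr, sizeof(addr)) == -1) goto fail;

    /* accept() returns a new descriptor; the listening one stays open */
    if ((*connfd = accept(*listenfd, NULL, NULL)) == -1) goto fail;
    return 0;

fail:
    if (*listenfd != -1) close(*listenfd);
    if (*clientfd != -1) close(*clientfd);
    *listenfd = *clientfd = *connfd = -1;
    return -1;
}
```

After a successful call, data sent on clientfd arrives on connfd, and listenfd can go on accepting further connections.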
3.5. read(), write(), and other functions
Everything is now ready: the server and the client have established a connection. You can call the network I/O functions to read and write, which means communication between different processes on the network has been achieved! The network I/O operations come in the following groups:
read()/write()
recv()/send()
readv()/writev()
recvmsg()/sendmsg()
recvfrom()/sendto()
I recommend the recvmsg()/sendmsg() pair. These are the most general I/O functions; in fact, every other function above can be replaced by these two. The declarations are as follows:
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t count);
ssize_t write(int fd, const void *buf, size_t count);
#include <sys/types.h>
#include <sys/socket.h>
ssize_t send(int sockfd, const void *buf, size_t len, int flags);
ssize_t recv(int sockfd, void *buf, size_t len, int flags);
ssize_t sendto(int sockfd, const void *buf, size_t len, int flags,
               const struct sockaddr *dest_addr, socklen_t addrlen);
ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,
                 struct sockaddr *src_addr, socklen_t *addrlen);
ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);
ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);
The read function reads from fd. On success, read returns the number of bytes actually read. A return value of 0 means end of file has been reached (for a socket, the peer has closed the connection); a return value less than 0 means an error occurred. If the error is EINTR, the read was interrupted by a signal; if the error is ECONNRESET, there is a problem with the network connection.
The write function writes count bytes from buf to the file descriptor fd. On success it returns the number of bytes written; on failure it returns -1 and sets errno. In network programs, there are two possibilities when writing to a socket descriptor. 1) The return value is greater than 0, meaning some or all of the data was written. 2) The return value is less than 0, meaning an error occurred, which must be handled according to its type. If the error is EINTR, the write was interrupted by a signal. If it is EPIPE, there is a problem with the network connection (the peer has closed the connection).
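Because read and write may transfer fewer bytes than requested and may fail with EINTR, code often wraps them in a loop. The sketch below follows that common pattern; write_all and read_full are names chosen for this example, not standard functions.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes exactly len bytes, retrying after short writes and EINTR.
 * Returns len on success, or -1 on error with errno set. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0) {
            if (n < 0 && errno == EINTR)
                continue; /* interrupted by a signal: try again */
            return -1;    /* real error, for example EPIPE */
        }
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)len;
}

/* Reads until len bytes have arrived or end of file is reached.
 * Returns the number of bytes read (less than len at EOF), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = read(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal: try again */
            return -1;
        }
        if (n == 0)
            break;        /* end of file: the peer closed the connection */
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)(len - left);
}
```

Note that by default, writing to a socket whose peer has closed raises SIGPIPE, which terminates the process; a program that wants to see EPIPE as a return value has to ignore or handle that signal first.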
I will not introduce the other pairs of I/O functions one by one. For details, see the man pages, or search Baidu or Google. The example below uses send/recv.
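Since sendmsg()/recvmsg() are described as the most general pair, a brief sketch of how they are called may help. The two helpers below are hypothetical names. send_two_parts gathers two separate buffers into a single send through an iovec array, and recv_into receives into one buffer; only the msg_iov and msg_iovlen fields of struct msghdr are used, and everything else is zeroed.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: sends head followed by body with one sendmsg() call.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t send_two_parts(int fd, const char *head, const char *body)
{
    struct iovec iov[2];
    struct msghdr msg;

    iov[0].iov_base = (void *)head;
    iov[0].iov_len = strlen(head);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len = strlen(body);

    memset(&msg, 0, sizeof(msg)); /* no address, no control data */
    msg.msg_iov = iov;
    msg.msg_iovlen = 2;
    return sendmsg(fd, &msg, 0);
}

/* Hypothetical helper: receives up to buflen bytes with recvmsg().
 * Returns the number of bytes received, 0 at end of file, or -1 on error. */
ssize_t recv_into(int fd, char *buf, size_t buflen)
{
    struct iovec iov;
    struct msghdr msg;

    iov.iov_base = buf;
    iov.iov_len = buflen;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    return recvmsg(fd, &msg, 0);
}
```

With a connected socket, send_two_parts(fd, "Client ", "Query") transmits the 12 bytes of "Client Query" without first copying the two parts into one buffer.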
3.6. close() function
After the server and client have established a connection, they perform some read and write operations. When these are finished, the corresponding socket descriptor should be closed, just as you call fclose to close an open file once you are done with it.
#include <unistd.h>
int close(int fd);
The default behavior of close on a TCP socket is to mark the socket as closed and return to the calling process immediately. The descriptor can no longer be used by the calling process; that is, it can no longer be passed as the first argument to read or write.
Note: close only decrements the reference count of the socket descriptor by 1. Only when the reference count reaches 0 does the TCP client send a connection termination request to the server.
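The reference-count behavior can be observed with dup(), which creates a second descriptor for the same socket. The sketch below is an illustration under assumptions: peer_closed is a hypothetical helper, it uses the MSG_DONTWAIT flag (available on Linux and the BSDs), and the usage shown after it relies on a Unix domain socketpair rather than a real TCP connection, although the counting works the same way.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking check of a connected socket.
 * Returns 1 if the peer has closed the connection (end of file),
 * 0 if the connection is still open, and -1 on any other error. */
int peer_closed(int fd)
{
    char c;
    ssize_t n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);

    if (n == 0)
        return 1; /* end of file: the last reference on the peer is gone */
    if (n > 0)
        return 0; /* data is pending, so the connection is still open */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0; /* nothing to read yet, connection still open */
    return -1;
}
```

If one end of a socket pair is duplicated with dup() and the original descriptor is closed, peer_closed() on the other end still returns 0; it returns 1 only after the duplicate is closed as well.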
4. Details of the TCP three-way handshake in sockets
We know that TCP requires a three-way handshake to establish a connection, that is, three segments are exchanged. The general process is:
The client sends SYN J to the server.
The server responds to the client with SYN K and acknowledges SYN J with ACK J+1.
The client acknowledges the server with ACK K+1.
That completes the three-way handshake. But in which socket functions does it take place? See the figure below:
Figure 1. The TCP three-way handshake in socket calls
As the figure shows, when the client calls connect, it triggers the connection request and sends SYN J to the server; connect then blocks. The server, which is listening, receives SYN J and replies to the client with SYN K, ACK J+1, while its call to accept remains blocked. When the client receives SYN K, ACK J+1 from the server, connect returns and the client acknowledges SYN K. When the server receives ACK K+1, accept returns. At this point the three-way handshake is complete and the connection is established.
Conclusion: the client's connect returns after the second segment of the three-way handshake, while the server's accept returns after the third.
5. Details of the TCP four-way handshake that releases a connection in sockets
The previous section described how the TCP three-way handshake is established in sockets and which socket functions are involved. Now we look at the four-way handshake that releases a connection. See the figure below:
Figure 2. The TCP four-way handshake in socket calls
The process in the figure is as follows:
One application process calls close first to actively close the connection, and its TCP sends FIN M;
After receiving FIN M, the other end performs a passive close and acknowledges the FIN. The receipt of the FIN is also passed to the application process as an end-of-file, because receiving a FIN means the application process will receive no more data on that connection;
Some time later, the application process that received the end-of-file calls close to close its socket. This causes its TCP to send FIN N as well;
The TCP on the original sending end, which receives this FIN, acknowledges it.
In this way, there is one FIN and one ACK in each direction.
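The point that a FIN reaches the application as an end-of-file, while the other direction stays usable, can be sketched as follows. Two things here are substitutions made for the sake of a short example and are not from the text above: the code uses shutdown() with SHUT_WR instead of close() so that the active side can still read the reply, and it uses a Unix domain socketpair instead of a TCP connection. The name half_close_demo is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: fds[0] plays the side that closes first, fds[1] the
 * passive side. Stores the passive side's final message in reply.
 * Returns 0 if everything behaved as described, -1 otherwise. */
int half_close_demo(char *reply, size_t replylen)
{
    const char *msg = "bye";
    int fds[2];
    int ok = -1;
    char c;
    ssize_t n;

    if (replylen == 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
        return -1;

    /* Active side: done sending (for TCP this is what sends the first FIN) */
    if (shutdown(fds[0], SHUT_WR) == -1)
        goto out;

    /* Passive side: the close arrives as end of file, so recv() returns 0 */
    if (recv(fds[1], &c, 1, 0) != 0)
        goto out;

    /* Passive side may still send before closing its own direction */
    if (send(fds[1], msg, strlen(msg), 0) != (ssize_t)strlen(msg))
        goto out;

    /* Active side can still read, because only its write side was shut down */
    n = recv(fds[0], reply, replylen - 1, 0);
    if (n < 0)
        goto out;
    reply[n] = '\0';
    ok = 0;

out:
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

A successful call returns 0 and leaves "bye" in reply, showing that each direction of the connection is closed independently.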
6. An example (practice)
After all that talk, let's put it into practice. Below is a simple server and client (using TCP): the server listens on port 6666 of the local machine; when a connection request arrives, it accepts the request and receives the message sent by the client. The client connects to the server and sends one message.
Server code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define MAXLINE 4096

int main(void)
{
    int listenfd, connfd;
    struct sockaddr_in servaddr;
    char buff[MAXLINE];
    int n;

    if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
        printf("create socket error: %s(errno: %d)\n", strerror(errno), errno);
        exit(1);
    }

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY); /* any local interface */
    servaddr.sin_port = htons(6666);              /* network byte order */

    if (bind(listenfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) == -1) {
        printf("bind socket error: %s(errno: %d)\n", strerror(errno), errno);
        exit(1);
    }

    if (listen(listenfd, 10) == -1) {
        printf("listen socket error: %s(errno: %d)\n", strerror(errno), errno);
        exit(1);
    }

    printf("======waiting for client's request======\n");
    while (1) {
        if ((connfd = accept(listenfd, (struct sockaddr *)NULL, NULL)) == -1) {
            printf("accept socket error: %s(errno: %d)\n", strerror(errno), errno);
            continue;
        }
        /* leave room for the terminating '\0' */
        n = recv(connfd, buff, MAXLINE - 1, 0);
        if (n < 0) {
            printf("recv error: %s(errno: %d)\n", strerror(errno), errno);
            close(connfd);
            continue;
        }
        buff[n] = '\0';
        printf("recv msg from client: %s\n", buff);
        close(connfd);
    }

    close(listenfd);
}
Client code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define MAXLINE 4096

int main(int argc, char **argv)
{
    int sockfd;
    char sendline[MAXLINE];
    struct sockaddr_in servaddr;

    if (argc != 2) {
        printf("usage: ./client <ipaddress>\n");
        exit(1);
    }

    if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        printf("create socket error: %s(errno: %d)\n", strerror(errno), errno);
        exit(1);
    }

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(6666);
    /* convert the dotted-decimal string to a binary address */
    if (inet_pton(AF_INET, argv[1], &servaddr.sin_addr) <= 0) {
        printf("inet_pton error for %s\n", argv[1]);
        exit(1);
    }

    if (connect(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0) {
        printf("connect error: %s(errno: %d)\n", strerror(errno), errno);
        exit(1);
    }

    printf("send msg to server: \n");
    if (fgets(sendline, MAXLINE, stdin) == NULL) {
        close(sockfd);
        exit(1);
    }
    if (send(sockfd, sendline, strlen(sendline), 0) < 0) {
        printf("send msg error: %s(errno: %d)\n", strerror(errno), errno);
        exit(1);
    }

    close(sockfd);
    exit(0);
}
Of course, the code above is very simple and has many shortcomings; it is only a demonstration of the basic socket functions. In fact, however complex a network program is, it uses these same basic functions. The server above is iterative: it handles the next client's request only after it has finished the current one, so its processing capacity is very weak. Real servers need to handle requests concurrently! To do so, the server can fork() a new process, or create a new thread, to handle each request.
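One way the fork() approach might look is sketched below. This is not the server above rewritten, only an outline under assumptions: the names listen_loopback, echo_client, and serve_forked are invented, the child simply echoes data back, the server stops after max_conns connections so the example terminates, and SIGCHLD is ignored so that finished children are reaped automatically (Linux behavior).

```c
#include <errno.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns a TCP socket listening on 127.0.0.1 at a
 * kernel-chosen port and stores that port (host byte order) in *port. */
int listen_loopback(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(fd, 10) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Runs in the child: echo data back until the client closes or send fails. */
static void echo_client(int connfd)
{
    char buf[4096];
    ssize_t n;

    while ((n = recv(connfd, buf, sizeof(buf), 0)) > 0) {
        if (send(connfd, buf, (size_t)n, 0) < 0)
            break;
    }
}

/* Accepts max_conns connections, forking one child per connection.
 * Returns 0 on success, -1 on error. */
int serve_forked(int listenfd, int max_conns)
{
    int served = 0;

    signal(SIGCHLD, SIG_IGN); /* let the kernel reap exited children */
    while (served < max_conns) {
        int connfd = accept(listenfd, NULL, NULL);
        pid_t pid;

        if (connfd == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        pid = fork();
        if (pid == -1) {
            close(connfd);
            return -1;
        }
        if (pid == 0) {       /* child: serve this one client, then exit */
            close(listenfd);
            echo_client(connfd);
            close(connfd);
            _exit(0);
        }
        close(connfd);        /* parent: the child holds its own reference */
        served++;
    }
    return 0;
}
```

The parent closes connfd right after fork(); because close only decrements the reference count, the connection stays open in the child, and the parent is free to go straight back to accept().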
7. Hands-on
Here is a question for you; replies are welcome! Are you familiar with network programming on Linux? If so, write a program that does the following:
Server:
Receives messages from the client at address 192.168.100.2. If a message is "Client Query", prints "Receive Query".
Client:
Sends the messages "Client Query test", "Client Query", and "Client Query Quit" to the server at address 192.168.100.168, and then exits.
The IP addresses in the question can be adjusted to suit your actual environment.
The author's "w397090770 column"