Preface: The Sockets API is the standard API for developing network applications. Although the API is simple, new developers can still run into some common problems. This article identifies the most common hazards and shows you how to avoid them.

First introduced in the 4.2BSD UNIX operating system, the Sockets API is now a standard feature of virtually every operating system. In fact, it is difficult to find a modern language that does not support the Sockets API.

Hazard 1. Ignoring return status

The first hazard is obvious, but it is the easiest mistake for new developers to make. If you ignore the return status of functions, you may not notice when they fail or only partially succeed. This in turn lets errors propagate, which makes the source of the problem hard to locate. Capture and check every return status rather than ignoring it. Consider the example in Listing 1, a socket send function.

Listing 1. Ignoring the status returned by an API function
    /* Needs <errno.h>, <stdio.h>, <string.h>, and <sys/socket.h> */
    int status, sock;

    /* Create a new stream (TCP) socket */
    sock = socket(AF_INET, SOCK_STREAM, 0);

    ...

    status = send(sock, buffer, buflen, MSG_DONTWAIT);

    if (status == -1) {
        /* Send failed */
        printf("Send failed: %s\n", strerror(errno));
    } else {
        /* Send succeeded -- or did it? */
    }
Listing 1 shows a fragment that performs a socket send (sends data through the socket). The error status is captured and tested, but the example ignores a feature of send in non-blocking mode (enabled by the MSG_DONTWAIT flag). A call to send has three possible outcomes:
- If all of the data is queued for transmission, send returns the number of bytes requested (buflen).
- If the call fails, send returns -1 (check the errno variable to learn the cause of the failure).
- If only part of the data can be queued during the call, send returns the number of bytes actually queued, which is less than buflen.
Because the MSG_DONTWAIT flag makes send non-blocking, the call can return after queuing all of the data, some of it, or none of it (in the last case it returns -1 with errno set to EAGAIN or EWOULDBLOCK). Ignoring the return status therefore leads to incomplete sends and subsequent data loss.
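One way to act on the return value is to loop until every byte has been queued. The following is a minimal sketch, not part of the original listing: the helper name send_all is hypothetical, and waiting with poll() when the queue is full is one assumed policy among several (an event loop would instead return and retry later).

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: keep calling send() until all len bytes are queued.
 * Returns 0 on success, -1 on error (errno is left set by the failing call). */
int send_all(int sock, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = send(sock, p, len, MSG_DONTWAIT);

        if (n >= 0) {
            /* Complete or partial send: skip past the bytes that were queued. */
            p += n;
            len -= (size_t)n;
        } else if (errno == EINTR) {
            continue;   /* interrupted by a signal, simply retry */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Nothing could be queued: wait until the socket is writable. */
            struct pollfd pfd = { .fd = sock, .events = POLLOUT };

            if (poll(&pfd, 1, -1) == -1 && errno != EINTR)
                return -1;
        } else {
            return -1;  /* genuine failure */
        }
    }
    return 0;
}
```

A caller would replace the single send in Listing 1 with send_all(sock, buffer, buflen) and still check the result for -1.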
Hazard 2. Peer socket closure

One interesting aspect of UNIX is that you can treat almost everything as a file. Files, directories, pipes, devices, and sockets are all treated as files. This is a novel abstraction, and it means that a single set of APIs can be used across a wide range of device types. Consider the read API function, which reads some number of bytes from a file. The read function returns the number of bytes read (up to the maximum you specify); or -1, indicating an error; or 0 if the end of the file has been reached. If you perform a read on a socket and get a return value of 0, it means that the peer at the remote end of the socket has called the close API function. The indication is the same as for a file: no more data can be read through the descriptor (see Listing 2).

Listing 2. Properly handling the return value of the read API function
    int sock, status;

    sock = socket(AF_INET, SOCK_STREAM, 0);

    ...

    status = read(sock, buffer, buflen);

    if (status > 0) {
        /* Data read from the socket */
    } else if (status == -1) {
        /* Error, check errno, take action... */
    } else if (status == 0) {
        /* Peer closed the socket, finish the close */
        close(sock);
        /* Further processing... */
    }
Similarly, you can use the write API function to detect that the peer socket has been closed. In this case, the process receives the SIGPIPE signal or, if that signal is blocked or ignored, write returns -1 with errno set to EPIPE.
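The write-side check can be sketched as follows. This is an illustration under assumptions: the helper name write_or_detect_close is hypothetical, and ignoring SIGPIPE inside the helper is done only to keep the example self-contained (a real program would normally do it once at startup). Note that on a TCP connection the first write after the peer closes may still succeed, and the closure may only be reported on a later write.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the write was accepted, 1 if the peer
 * has closed the connection (EPIPE), and -1 on any other error. */
int write_or_detect_close(int sock, const void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);

    /* Ignore SIGPIPE so a closed peer is reported as EPIPE instead of
     * terminating the process with a signal. */
    signal(SIGPIPE, SIG_IGN);

    n = write(sock, buf, len);
    if (n >= 0)
        return 0;   /* accepted; n may still be less than len */

    return (errno == EPIPE) ? 1 : -1;
}
```

When the helper returns 1, the caller should close its own descriptor, just as Listing 2 does when read returns 0.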
Hazard 3. Address in use error (EADDRINUSE)

You can use the bind API function to bind an address (an interface and a port) to a socket endpoint. In a server, you can use this function to restrict the interfaces on which incoming connections are accepted. In a client, you can use it to restrict the interface used for outgoing connections. The most common use of bind is to associate a port number with a server using the wildcard address (INADDR_ANY), which allows incoming connections on any interface.

The common problem with bind is an attempt to bind a port that is already in use. The trap is usually caused by the TCP socket state TIME_WAIT: no active socket exists, yet binding the port is still refused (bind fails with EADDRINUSE). The state is retained for about 2 to 4 minutes after the socket is closed. Once the TIME_WAIT state ends, the socket is removed and the address can be bound again.

Waiting for TIME_WAIT to end can be annoying, especially when you are developing a socket server and need to stop it, make some changes, and restart it. Fortunately, there is a way around the TIME_WAIT state. You can apply the SO_REUSEADDR socket option to the socket so that the port can be reused immediately.

Consider the example in Listing 3. Before binding the address, I call setsockopt with the SO_REUSEADDR option. To allow address reuse, I set the integer argument (on) to 1 (it can be set to 0 to prohibit address reuse).

Listing 3. Using the SO_REUSEADDR socket option to avoid the address in use error
    int sock, ret, on;
    struct sockaddr_in servaddr;

    /* Create a new stream (TCP) socket */
    sock = socket(AF_INET, SOCK_STREAM, 0);

    /* Enable address reuse */
    on = 1;
    ret = setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    /* Allow connections to port 45000 from any available interface */
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(45000);

    /* Bind to the address (interface/port) */
    ret = bind(sock, (struct sockaddr *)&servaddr, sizeof(servaddr));
With the SO_REUSEADDR option applied, the bind API function allows the address to be reused immediately.
Hazard 4. Sending structured data

Sockets are a perfect tool for sending unstructured binary byte streams or ASCII data streams (such as web pages over HTTP, or e-mail over SMTP). However, if you try to send structured binary data over a socket, things become more complicated. For example, if you send an integer, can you be sure that the receiver will interpret it the same way? Applications running on the same architecture can rely on their common platform to interpret this kind of data identically. But what happens if a client running on a big-endian IBM PowerPC sends a 32-bit integer to a little-endian Intel x86? The difference in byte order leads to an incorrect interpretation.

What about sending a C structure through a socket? This causes trouble as well, because not all compilers lay out the elements of a structure in the same way. The structure may also be packed to minimize wasted space, which further changes how the elements are arranged.

Fortunately, there are solutions to this problem that ensure the data is interpreted consistently at both ends. In the past, the Remote Procedure Call (RPC) toolkit provided the External Data Representation (XDR). XDR defines a standard representation for data to support communication between heterogeneous network applications.

There are now two newer protocols that provide similar functionality. Extensible Markup Language/Remote Procedure Call (XML-RPC) encodes procedure calls in XML and carries them over HTTP. Data and metadata are encoded in XML and transmitted as strings, which separates values from their physical representation on the host architecture. SOAP followed XML-RPC and extended its ideas with more features and functionality.
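For simple records, a lighter alternative to the protocols above is to serialize each field explicitly in network byte order. The sketch below is an illustration only and is not XDR: the struct sample type, its fields, and the pack/unpack helper names are all hypothetical. Copying field by field avoids depending on the compiler's padding or on the host's byte order.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record used only for this illustration. */
struct sample {
    uint32_t id;
    uint16_t flags;
};

/* Wire format: 4 bytes of id followed by 2 bytes of flags, no padding. */
#define SAMPLE_WIRE_SIZE 6

/* Serialize field by field in network (big-endian) byte order. */
void sample_pack(const struct sample *s, unsigned char *out)
{
    uint32_t id;
    uint16_t flags;

    assert(s != NULL && out != NULL);
    id = htonl(s->id);
    flags = htons(s->flags);
    memcpy(out, &id, sizeof id);
    memcpy(out + sizeof id, &flags, sizeof flags);
}

/* Rebuild the record from the wire format on the receiving host. */
void sample_unpack(const unsigned char *in, struct sample *s)
{
    uint32_t id;
    uint16_t flags;

    assert(in != NULL && s != NULL);
    memcpy(&id, in, sizeof id);
    memcpy(&flags, in + sizeof id, sizeof flags);
    s->id = ntohl(id);
    s->flags = ntohs(flags);
}
```

The sender transmits the SAMPLE_WIRE_SIZE bytes produced by sample_pack rather than the raw structure, and the receiver calls sample_unpack on the bytes it reads.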
Hazard 5. Assuming message framing in TCP

TCP does not provide message framing, which makes it perfect for byte-stream-oriented protocols. This is an important difference between TCP and UDP. UDP is a message-oriented protocol that preserves message boundaries between the sender and receiver. TCP is a stream-oriented protocol that assumes the data being communicated is unstructured, as shown in Figure 1.

Figure 1. Framing capability of UDP and lack of framing in TCP

The upper part of Figure 1 shows a UDP client and server. The peer on the left performs two socket writes of 100 bytes each. The UDP layer of the protocol stack keeps track of each write and ensures that when the receiver on the right reads from the socket, the data arrives in units of the same size. In other words, the message boundaries provided by the writer are preserved for the reader.

The lower part of Figure 1 shows writes of the same granularity at the TCP layer. Two independent writes (of 100 bytes each) are made to the stream socket, but in this example the reader of the stream socket gets 200 bytes. The TCP layer of the protocol stack has aggregated the two writes. This aggregation can happen in the TCP/IP protocol stack of either the sender or the receiver. It is important to note that the aggregation might not happen at all: TCP guarantees only that the data is delivered in order.

This trap confuses most developers: you want the reliability of TCP together with the framing of UDP. Unless you use a different transport protocol, such as the Stream Control Transmission Protocol (SCTP), it is up to the application-layer developer to implement buffering and framing.
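A common way to add framing at the application layer is to prefix each message with its length. The sketch below assumes a 4-byte big-endian length header, which is one possible convention rather than anything the article prescribes; the helper names send_msg, recv_msg, write_full, and read_full are hypothetical, and blocking sockets are assumed.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on partial writes. 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes. 0 on success, -1 on error or if the peer closes early. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);

        if (n == 0)
            return -1;  /* peer closed before the full message arrived */
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one message: a 4-byte length in network byte order, then the payload. */
int send_msg(int sock, const void *payload, uint32_t len)
{
    uint32_t hdr = htonl(len);

    assert(payload != NULL || len == 0);
    if (write_full(sock, &hdr, sizeof hdr) == -1)
        return -1;
    return write_full(sock, payload, len);
}

/* Receive one message into buf (capacity cap); its size is stored in *len_out.
 * Returns 0 on success, -1 on error, early close, or an oversized message. */
int recv_msg(int sock, void *buf, uint32_t cap, uint32_t *len_out)
{
    uint32_t hdr, len;

    assert(buf != NULL && len_out != NULL);
    if (read_full(sock, &hdr, sizeof hdr) == -1)
        return -1;
    len = ntohl(hdr);
    if (len > cap)
        return -1;
    if (read_full(sock, buf, len) == -1)
        return -1;
    *len_out = len;
    return 0;
}
```

With this scheme, two send_msg calls of 100 bytes each are read back by two recv_msg calls of 100 bytes each, regardless of how TCP aggregates or splits the underlying stream.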
Tools for debugging socket applications

GNU/Linux provides several tools that help you find problems in socket applications. Using these tools is also instructive, because they help explain the behavior of applications and of the TCP/IP protocol stack. Here you will find an overview of several of them.

Viewing details of the network subsystem

The netstat tool lets you look inside the GNU/Linux network subsystem. With netstat, you can view the currently active connections (per protocol), view connections in a specific state (such as server sockets in the listening state), and much more. Listing 4 shows some of the options that netstat provides and the features they enable.
Listing 4. Usage modes of the netstat utility

    View all TCP sockets currently active
    $ netstat --tcp

    View all UDP sockets
    $ netstat --udp

    View all sockets in the listening state
    $ netstat --listening

    View the multicast group membership information
    $ netstat --groups

    Display the list of masqueraded connections
    $ netstat --masquerade

    View statistics for each protocol
    $ netstat --statistics
Although many other utilities exist, netstat is full-featured and covers the functionality of route, ifconfig, and other standard GNU/Linux tools.

Monitoring traffic

Several GNU/Linux tools let you inspect the low-level traffic on the network. tcpdump is an older tool that sniffs packets from the network and prints them to stdout or records them in a file. This lets you view both the traffic generated by your application and the low-level flow-control traffic generated by TCP. A newer tool called tcpflow complements tcpdump by providing protocol flow analysis and proper reconstruction of the data streams, regardless of packet order or retransmissions. Listing 5 shows several usage modes of tcpdump.
Listing 5. Usage modes of tcpdump

    Display all traffic on the eth0 interface for the local host
    $ tcpdump -l -i eth0

    Show all traffic on the network coming from or going to host plato
    $ tcpdump host plato

    Show all HTTP traffic for host camus
    $ tcpdump 'host camus and (port http)'

    View traffic coming from or going to TCP port 45000 on the local host
    $ tcpdump tcp port 45000
The tcpdump and tcpflow tools have a large number of options, including the ability to build complex filter expressions.

Both tcpdump and tcpflow are text-based command-line tools. If you prefer a graphical user interface (GUI), the open source tool Ethereal (since renamed Wireshark) may suit your needs. Ethereal is a professional protocol analyzer that can help you debug application-layer protocols. Its plug-in architecture can dissect protocols such as HTTP and almost any other protocol you can think of (637 protocols at the time of writing).