Five risks in Linux socket programming

Source: Internet
Author: User
Preface: The Sockets API is the standard API for network application development. Although the API is simple, new developers may run into some common problems. This article identifies the most common risks and shows you how to avoid them.

The Sockets API was first introduced in the 4.2BSD UNIX operating system and is now a standard feature of virtually every operating system. In fact, it is difficult to find a modern language that does not support the Sockets API. The API is quite simple, but new developers still encounter some common risks.

Hidden danger 1. Ignoring return status

The first hidden danger is obvious, but it is the mistake new developers make most easily. If you ignore the return status of functions, you may not notice when they fail or only partially succeed. This, in turn, can propagate errors and make the source of the problem hard to locate. Capture and check every return status rather than ignoring it. Consider the example shown in Listing 1, a socket send function.

Listing 1. Ignoring the status returned by an API function
    int status, sock, mode;

    /* Create a new stream (TCP) socket */
    sock = socket( AF_INET, SOCK_STREAM, 0 );

    ...

    status = send( sock, buffer, buflen, MSG_DONTWAIT );

    if (status == -1) {
      /* send failed */
      printf( "send failed: %s\n", strerror(errno) );
    } else {
      /* send succeeded -- or did it? */
    }

Listing 1 shows a function snippet that performs a socket send operation (sending data through a socket). The error status of the function is captured and tested, but the example ignores a feature of send in non-blocking mode (enabled by the MSG_DONTWAIT flag). The send API function has three possible return values:
  • If all the data is successfully queued for transmission, the return value equals the number of bytes requested (buflen).
  • If the queuing fails, -1 is returned (use the errno variable to learn the cause; with MSG_DONTWAIT, EAGAIN or EWOULDBLOCK means that nothing could be queued without blocking).
  • If not all the bytes can be queued at the time of the call, the return value is the number of bytes actually queued, which is less than the number requested.
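
The three cases above can be handled together in a loop that keeps sending until every byte has been queued. The following is a minimal sketch rather than a definitive implementation: send_all is a hypothetical helper name, and waiting with poll when nothing can be queued is one possible design choice among several.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of the Sockets API): calls send() until
 * all len bytes are queued. Returns 0 on success, or -1 on error with
 * errno set by the failing call. */
int send_all(int sock, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = send(sock, p, len, MSG_DONTWAIT);

        if (n >= 0) {
            /* Full or partial send: skip the bytes that were queued. */
            p += n;
            len -= (size_t)n;
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Nothing could be queued: wait until the socket is writable. */
            struct pollfd pfd = { .fd = sock, .events = POLLOUT };

            if (poll(&pfd, 1, -1) == -1 && errno != EINTR)
                return -1;
        } else if (errno != EINTR) {
            return -1;   /* real failure, errno describes the cause */
        }
    }
    return 0;
}
```

A caller would replace the bare send in Listing 1 with send_all( sock, buffer, buflen ) and still test the result against -1.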
Because the MSG_DONTWAIT variant of send is non-blocking, the call returns after queuing all of the data, some of the data, or none of it. Ignoring the return status here leads to incomplete sends and subsequent data loss.

Hidden danger 2. Peer socket closure

One interesting aspect of UNIX is that you can treat almost everything as a file. Files, directories, pipes, devices, and sockets are all treated as files. This is a powerful abstraction, because a single set of APIs can be used across a wide range of device types.

Consider the read API function, which reads some number of bytes from a file. The read function returns the number of bytes read (up to the maximum you specify), or -1 to indicate an error, or 0 if the end of the file has been reached.

If you perform a read on a socket and get a return value of 0, the peer at the remote end of the socket has closed its end of the connection (for example, by calling the close API function). The indication is the same as for reading a file: no more data can be read through the descriptor (see Listing 2).

Listing 2. Properly handling the return value of the read API function
    int sock, status;

    sock = socket( AF_INET, SOCK_STREAM, 0 );

    ...

    status = read( sock, buffer, buflen );

    if (status > 0) {
      /* Data read from the socket */
    } else if (status == -1) {
      /* Error, check errno, take action... */
    } else if (status == 0) {
      /* Peer closed the socket, finish the close */
      close( sock );
      /* Further processing... */
    }

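
The checks in Listing 2 can be wrapped in a small helper, and the same idea applies on the write side, where a closed peer is reported as EPIPE once SIGPIPE is ignored. The sketch below is only an illustration under stated assumptions: the peer_state type and the read_peer_state and write_peer_state helpers are hypothetical names, and it assumes that ignoring SIGPIPE for the whole process is acceptable.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical classification of a socket I/O result. */
enum peer_state { PEER_OPEN, PEER_CLOSED, PEER_ERROR };

/* One read() call, classified as in Listing 2. len must be non-zero,
 * because a zero-length read also returns 0. */
enum peer_state read_peer_state(int sock, void *buf, size_t len,
                                ssize_t *nread)
{
    assert(buf != NULL && nread != NULL && len > 0);

    *nread = read(sock, buf, len);
    if (*nread > 0)
        return PEER_OPEN;      /* data was read */
    if (*nread == 0)
        return PEER_CLOSED;    /* end of stream: peer closed */
    return PEER_ERROR;         /* -1: inspect errno */
}

/* One write() call. SIGPIPE is ignored so that a closed peer shows up
 * as -1/EPIPE instead of terminating the process. The setting is
 * process-wide; a real program would usually apply it once at startup. */
enum peer_state write_peer_state(int sock, const void *buf, size_t len)
{
    signal(SIGPIPE, SIG_IGN);

    if (write(sock, buf, len) >= 0)
        return PEER_OPEN;
    return errno == EPIPE ? PEER_CLOSED : PEER_ERROR;
}
```

On a TCP connection, the first write after the peer closes may still succeed, because the local stack learns about the closure only when the peer answers with a reset; a later write then fails with EPIPE.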
Similarly, you can use the write API function to detect the closure of a peer socket. In this case, the process receives the SIGPIPE signal; if that signal is ignored, blocked, or handled, the write function returns -1 and sets errno to EPIPE.

Hidden danger 3. Address in use error (EADDRINUSE)

You can use the bind API function to bind an address (an interface and a port) to a socket endpoint. You can use this function in a server setting to restrict the interfaces on which connections may arrive. You can also use it in a client setting to restrict the interface used for outgoing connections. The most common use of bind is to associate a port number with a server and use the wildcard address (INADDR_ANY), which allows any interface to be used for incoming connections.

The common problem with bind is attempting to bind a port that is already in use. The trap is that no active socket may exist, yet binding to the port is still refused (bind returns EADDRINUSE). This is caused by the TCP socket state TIME_WAIT. The state persists for some time after the socket is closed: up to about 4 minutes depending on the implementation (60 seconds on Linux). Once the TIME_WAIT state expires, the socket is removed and the address can be bound again.

Waiting for TIME_WAIT to end can be annoying, especially if you are developing a socket server and need to stop it, make some changes, and restart it. Fortunately, there is a way around the TIME_WAIT state. You can apply the SO_REUSEADDR socket option to the socket so that the port can be reused immediately.

Consider the example in Listing 3. Before binding an address, I call setsockopt with the SO_REUSEADDR option. To allow address reuse, I set the integer argument (on) to 1 (it can be set to 0 to disallow address reuse).

Listing 3. Using the SO_REUSEADDR socket option to avoid the address in use error
    int sock, ret, on;
    struct sockaddr_in servaddr;

    /* Create a new stream (TCP) socket */
    sock = socket( AF_INET, SOCK_STREAM, 0 );

    /* Enable address reuse */
    on = 1;
    ret = setsockopt( sock, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on) );

    /* Allow connections to port 45000 from any available interface */
    memset( &servaddr, 0, sizeof(servaddr) );
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl( INADDR_ANY );
    servaddr.sin_port = htons( 45000 );

    /* Bind to the address (interface/port) */
    ret = bind( sock, (struct sockaddr *)&servaddr, sizeof(servaddr) );

After the SO_REUSEADDR option is applied, the bind API function allows immediate reuse of the address.

Hidden danger 4. Sending structured data

Sockets are a perfect vehicle for sending unstructured binary byte streams or ASCII data streams (such as HTTP pages over HTTP, or e-mail over SMTP). However, if you try to send structured binary data over a socket, things become more complicated.

For example, if you want to send an integer, can you be sure that the receiver will interpret the integer in the same way? Applications running on the same architecture can rely on their common platform to interpret this kind of data identically. But what happens if a client running on a big-endian IBM PowerPC sends a 32-bit integer to a little-endian Intel x86? The byte ordering will cause an incorrect interpretation.

What about sending a C structure through a socket? Here, too, you run into trouble, because not all compilers lay out the elements of a structure in the same way. Structures may also be packed to minimize wasted space, which further changes where elements sit within the structure.
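
For simple records, one manual way to avoid both the byte-order and the structure-layout problem is to write each field into a byte buffer in network (big-endian) byte order instead of sending the in-memory structure. The following is a minimal sketch under stated assumptions: struct sample, SAMPLE_WIRE_SIZE, sample_encode, and sample_decode are hypothetical names chosen for illustration, and the 6-byte wire layout is an arbitrary choice.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record used only for illustration. */
struct sample {
    uint16_t id;
    uint32_t value;
};

/* Wire layout (an assumption of this sketch): 2 bytes id, then 4 bytes
 * value, both big-endian, with no padding between them. */
#define SAMPLE_WIRE_SIZE 6

/* Encode field by field, so compiler padding never reaches the wire. */
void sample_encode(const struct sample *s, unsigned char out[SAMPLE_WIRE_SIZE])
{
    uint16_t id;
    uint32_t value;

    assert(s != NULL && out != NULL);

    id = htons(s->id);          /* host to network byte order, 16 bits */
    value = htonl(s->value);    /* host to network byte order, 32 bits */
    memcpy(out, &id, sizeof id);
    memcpy(out + 2, &value, sizeof value);
}

/* Decode the same layout back into host byte order. */
void sample_decode(const unsigned char in[SAMPLE_WIRE_SIZE], struct sample *s)
{
    uint16_t id;
    uint32_t value;

    assert(in != NULL && s != NULL);

    memcpy(&id, in, sizeof id);
    memcpy(&value, in + 2, sizeof value);
    s->id = ntohs(id);
    s->value = ntohl(value);
}
```

The encoded buffer has the same bytes on a big-endian and a little-endian host, so the receiver recovers the same values regardless of its architecture or compiler.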
Fortunately, there are solutions to this problem that ensure consistent interpretation of data at both ends. In the past, the Remote Procedure Call (RPC) toolkit provided the so-called External Data Representation (XDR). XDR defines a standard representation for data to support the development of communicating applications on heterogeneous networks.

Today, two newer protocols provide similar functionality. Extensible Markup Language/Remote Procedure Call (XML-RPC) arranges procedure calls over HTTP in an XML format. Data and metadata are encoded in XML and transmitted as strings, which separates the values from their physical representation on the host architecture. SOAP followed XML-RPC and extended its ideas with better features and functionality. For more information about each protocol, see the references section.

Hidden danger 5. Framing assumptions in TCP

TCP does not provide framing, which makes it a perfect fit for byte stream-oriented protocols. This is an important difference from the User Datagram Protocol (UDP). UDP is a message-oriented protocol that preserves message boundaries between sender and receiver. TCP is a stream-oriented protocol that assumes the data being communicated is unstructured, as shown in Figure 1.

Figure 1. Framing capabilities of UDP and the lack of framing in TCP

The upper part of Figure 1 shows a UDP client and server. The peer on the left performs two socket writes of 100 bytes each. The UDP layer of the protocol stack keeps track of the number of writes and ensures that when the receiver on the right reads from the socket, the data arrives in the same units. In other words, the message boundaries provided by the writer are preserved for the reader.

The bottom of Figure 1 shows write operations of the same granularity for the TCP layer. Two independent writes (100 bytes each) are made to a stream socket. In this example, however, the reader of the stream socket gets 200 bytes: the TCP layer of the protocol stack has aggregated the two writes. This aggregation can occur at either the sender or the receiver of the TCP/IP protocol stack. It is important to note that aggregation may also not occur, because TCP guarantees only the ordered delivery of data.

This trap confuses many developers. You want the reliability of TCP together with the framing of UDP. Unless you use another transport protocol, such as the Stream Control Transmission Protocol (SCTP), the application-layer developer must implement the buffering and segmentation.

Tools for debugging socket applications

GNU/Linux provides several tools that help you find problems in socket applications. Using these tools is also instructive and helps explain the behavior of applications and of the TCP/IP protocol stack. Here you will see an overview of several tools. See the references for more information.

View details of the network subsystem

The netstat tool provides a view into the GNU/Linux network subsystem. With netstat, you can view the currently active connections (per protocol), connections in a particular state (such as server sockets in the listening state), and much other information. Listing 4 shows some of the options netstat provides and the features they enable.

Listing 4. Usage patterns for the netstat utility

    View all TCP sockets currently active
    $ netstat --tcp

    View all UDP sockets
    $ netstat --udp

    View all sockets in the listening state
    $ netstat --listening

    View the multicast group membership information
    $ netstat --groups

    Display the list of masqueraded connections
    $ netstat --masquerade

    View statistics for each protocol
    $ netstat --statistics

Although many other utilities exist, netstat is full-featured and covers the functionality of route, ifconfig, and other standard GNU/Linux tools.

Monitor traffic

You can use several GNU/Linux tools to inspect low-level traffic on a network. tcpdump is an older tool that "sniffs" packets from the network and prints them to stdout or logs them to a file. It lets you see the traffic generated by your application as well as the low-level flow-control mechanisms generated by TCP. A newer tool called tcpflow complements tcpdump. It provides protocol flow analysis and a means of reconstructing data streams properly, regardless of packet order or retransmission. Listing 5 shows several usage patterns for tcpdump.

Listing 5. Usage patterns for tcpdump
    Display all traffic on the eth0 interface for the local host
    $ tcpdump -l -i eth0

    Show all traffic on the network coming from or going to host plato
    $ tcpdump host plato

    Show all HTTP traffic for host camus
    $ tcpdump host camus and \(port http\)

    View traffic coming from or going to TCP port 45000 on the local host
    $ tcpdump tcp port 45000

The tcpdump and tcpflow tools have a large number of options, including the ability to create complex filter expressions. See the references for more information about these tools.

tcpdump and tcpflow are text-based command-line tools. If you prefer a graphical user interface (GUI), the open source tool Ethereal (since renamed Wireshark) may suit your needs. Ethereal is a professional protocol analyzer that can help you debug application-layer protocols. Its plug-in architecture can dissect protocols such as HTTP and almost any other protocol you can think of (637 protocols at the time of writing).

Reference announcement: http://blog.csdn.net/hairetz/archive/2009/05/29/4223234.aspx
