[UNIX Network Programming] Basic TCP Socket Programming

The concurrent server described in this chapter uses a one-process-per-client model implemented with fork.

The socket functions used by a basic TCP client/server pair, shown as a timeline of typical events:

[Figure: socket functions for a basic TCP client and server]

TCP state transition diagram:

[Figure: TCP state transition diagram]

1. socket function:

#include <sys/socket.h>
int socket(int family, int type, int protocol); // returns a non-negative descriptor, or -1 on error

family: the protocol family (AF_INET, AF_INET6, AF_LOCAL, AF_ROUTE, AF_KEY)

type: the socket type (SOCK_STREAM, SOCK_DGRAM, SOCK_SEQPACKET, SOCK_RAW)

protocol: the protocol (IPPROTO_TCP, IPPROTO_UDP, IPPROTO_SCTP), or 0 to select the system default for the given combination of family and type

AF_xxx and PF_xxx nominally denote the address family and the protocol family respectively, but in practice the two sets of constants are equal.
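As a minimal sketch of the call, the two helpers below (their names are made up for this example) create an IPv4 TCP socket with protocol 0 and then read the socket type back with getsockopt and SO_TYPE, which confirms that the kernel filled in the default for AF_INET plus SOCK_STREAM.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create an IPv4 stream socket and let the kernel
 * pick the default protocol (TCP) for AF_INET + SOCK_STREAM. */
int open_tcp_socket(void)
{
    return socket(AF_INET, SOCK_STREAM, 0);   /* -1 on error, errno set */
}

/* Hypothetical helper: report the socket type of an open descriptor,
 * or -1 if the descriptor is not a socket. */
int socket_type_of(int fd)
{
    int type = -1;
    socklen_t len = sizeof(type);

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}
```

A caller would check the descriptor for -1 before using it and close it when done.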

 

2. connect function:

#include <sys/socket.h>
int connect(int sockfd, const struct sockaddr *servaddr, socklen_t addrlen); // returns 0 on success, -1 on error

sockfd: the socket descriptor returned by the socket function

servaddr: pointer to the socket address structure of the server to connect to; it usually has to be cast from a protocol-specific structure (sockaddr is the generic socket address structure)

addrlen: the size of the address structure, for example sizeof(servaddr)

The client does not have to call bind before calling connect. If it does not, the kernel chooses the source IP address and an ephemeral port as the source port.

Calling connect on a TCP socket triggers TCP's three-way handshake. The function returns only when the connection is established or an error occurs.

1) If the TCP client receives no response to its SYN segment, ETIMEDOUT is returned. (The error is returned only after the SYN has been retransmitted and the total timeout has expired without a response.)

2) If the response to the SYN is an RST, no process is waiting for connections on the server host at the specified port (the server process is probably not running). As soon as the client receives the RST, ECONNREFUSED is returned.

An RST is generated in three cases: a SYN arrives for a port that has no listening server; TCP wants to abort an existing connection; TCP receives a segment for a connection that does not exist.

3) If the SYN elicits an ICMP "destination unreachable" error from an intermediate router, it is treated as a soft error. The client kernel saves the message and keeps retransmitting the SYN at the same intervals as in the first case. If no response arrives within the specified time, the saved ICMP error is returned to the process as EHOSTUNREACH or ENETUNREACH. (These errors can also occur when the local system's forwarding table has no route to the remote system, or when connect returns without waiting at all.)

The following are my own tests (ip: 192.168.1.100):

The first example reports "No route to host": the IP address is not reachable on the Internet.

The second example reports "Connection timed out" (after a long wait): a route exists, but no host has that IP address.

The third example connects normally; the server program is then shut down.

The last example reports "Connection refused": the host exists but nothing is listening on the port, so the server host immediately responds with an RST segment.

connect moves the socket from the CLOSED state to SYN_SENT and, if it succeeds, to ESTABLISHED.

If connect fails, the socket is no longer usable and must be closed. When calling connect in a loop, therefore, close the current descriptor after each failure and call socket again.
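The close-and-recreate rule can be sketched as follows. The helper name connect_retry, the one-second pause between attempts, and the IPv4-only address handling are assumptions made for this example, not part of the original text.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: try to connect to ip:port up to `attempts` times.
 * Returns a connected descriptor, or -1 with errno set by the last failure. */
int connect_retry(const char *ip, unsigned short port, int attempts)
{
    struct sockaddr_in servaddr;
    int saved = 0;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &servaddr.sin_addr) != 1) {
        errno = EINVAL;
        return -1;
    }

    for (int i = 0; i < attempts; i++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);   /* fresh socket each time */
        if (fd < 0)
            return -1;
        if (connect(fd, (struct sockaddr *)&servaddr, sizeof(servaddr)) == 0)
            return fd;                  /* socket is now ESTABLISHED */
        saved = errno;                  /* e.g. ECONNREFUSED or ETIMEDOUT */
        close(fd);                      /* a failed socket cannot be reused */
        if (i + 1 < attempts)
            sleep(1);                   /* assumed pause before retrying */
    }
    errno = saved;
    return -1;
}
```

Calling connect_retry("127.0.0.1", port, 1) against a port with no listener should fail with ECONNREFUSED, matching the "Connection refused" case above.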

 

3. bind function:

#include <sys/socket.h>
int bind(int sockfd, const struct sockaddr *myaddr, socklen_t addrlen); // returns 0 on success, -1 on error

The second parameter is a pointer to a protocol-specific address structure. The structure can specify an IP address, a port, both, or neither. (The IP address must belong to one of the host's network interfaces.)

If the IP address is the wildcard address, the kernel chooses the IP address; if the port is 0, the kernel chooses an ephemeral port.

When the kernel chooses the ephemeral port, bind cannot return it because the second parameter is const. To obtain the port, call getsockname, which returns the protocol address.

Hosts that serve Web sites for multiple organizations typically bind a non-wildcard IP address.

A common error from bind is EADDRINUSE ("Address already in use").
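A small sketch of both points: binding to port 0 and reading the chosen port back with getsockname, and the EADDRINUSE error when a second socket asks for the same address and port. The helper name bind_loopback and the use of the loopback address are assumptions for illustration.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: bind a new TCP socket to 127.0.0.1:port.
 * With port 0 the kernel picks an ephemeral port. *bound_port receives
 * the port actually in use. Returns the descriptor, or -1 on error. */
int bind_loopback(unsigned short port, unsigned short *bound_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    /* bind takes a const pointer, so the chosen port must be fetched
     * separately with getsockname. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    *bound_port = ntohs(addr.sin_port);
    return fd;
}
```

Calling bind_loopback(0, &p) and then bind_loopback(p, &q) while the first socket is still open should make the second call fail with EADDRINUSE.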

 

4. listen function:

#include <sys/socket.h>
int listen(int sockfd, int backlog); // returns 0 on success, -1 on error

listen is called only by a TCP server and does two things:

1) A socket created by the socket function is an active socket by default, that is, a client socket on which connect will be called. listen converts an unconnected socket into a passive socket, telling the kernel to accept incoming connection requests directed to it. (active/client -> passive/server)

2) The backlog parameter specifies the maximum number of connections the kernel should queue for the socket.

listen moves the socket from the CLOSED state to LISTEN.

It is called after socket and bind and before accept.

The kernel maintains two queues for any given listening socket:

1) Incomplete connection queue: one entry for each client SYN whose three-way handshake has not yet completed. These sockets are in the SYN_RCVD state.

2) Completed connection queue: one entry for each client whose handshake has completed. These sockets are in the ESTABLISHED state.

The two queues together hold at most backlog entries (on some systems the limit is not backlog itself but a value derived from it). When the three-way handshake completes, the entry moves from the incomplete queue to the end of the completed queue. Each call to accept returns the entry at the head of the completed queue; if that queue is empty, the process sleeps until TCP places an entry on it and wakes the process up.

Systems map backlog to the actual number of queued connections in different ways, but do not set it to 0. If you do not want any client to connect to the listening socket, close it.

To change backlog without recompiling the server, allow the default to be overridden with a command-line option or an environment variable:

void
Listen(int fd, int backlog)
{
    char *ptr;

    /* can override 2nd argument with environment variable */
    if ((ptr = getenv("LISTENQ")) != NULL)
        backlog = atoi(ptr);

    if (listen(fd, backlog) < 0)
        err_sys("listen error");
}

When a client SYN arrives and the queues are full, TCP ignores the segment and does not send an RST. The condition is considered temporary: the client TCP retransmits its SYN and will, with luck, soon find room in the queues. If the server sent an RST instead, the client's connect would fail immediately, and the client could not distinguish "no server is listening on this port" from "a server is listening on this port but its queues are full".
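One observable consequence of the completed connection queue is that a client's connect can succeed before the server ever calls accept. The sketch below plays both roles in one process over the loopback interface; the function name and the backlog value of 5 are assumptions chosen for the example.

```c
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical demo: returns 1 if connect() succeeds while the server has
 * called listen() but not yet accept(), 0 if it does not, -1 on setup errors. */
int connect_completes_before_accept(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int listenfd, clientfd, connfd, ok = 0;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                          /* kernel picks the port */

    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0)
        return -1;
    clientfd = socket(AF_INET, SOCK_STREAM, 0);
    if (clientfd < 0 ||
        bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listenfd, 5) < 0 ||
        getsockname(listenfd, (struct sockaddr *)&addr, &len) < 0) {
        if (clientfd >= 0)
            close(clientfd);
        close(listenfd);
        return -1;
    }

    /* The kernel completes the three-way handshake on its own and places
     * the connection on the completed queue. */
    if (connect(clientfd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        /* accept() only takes the entry off the head of that queue. */
        connfd = accept(listenfd, NULL, NULL);
        if (connfd >= 0) {
            ok = 1;
            close(connfd);
        }
    }

    close(clientfd);
    close(listenfd);
    return ok;
}
```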

 

5. accept function:

#include <sys/socket.h>
int accept(int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen); // returns a non-negative descriptor on success, -1 on error

The first parameter is the listening socket descriptor (the one created by socket and then passed to bind and listen).

The second parameter returns the protocol address of the connected peer process (the client).

The third parameter is a value-result parameter. Before the call it holds the size of the socket address structure that cliaddr points to; on return it holds the number of bytes the kernel actually stored in that structure.

The return value is the connected socket descriptor. Distinguish the connected socket from the listening socket: the listening socket exists for the lifetime of the server, while the kernel creates one connected socket for each client connection the server accepts. When the server finishes serving a client, it closes the corresponding connected socket.

If you are not interested in the client's protocol address, set both the second and third parameters to null pointers.

#include "unp.h"
#include <time.h>

int
main(int argc, char **argv)
{
    int listenfd, connfd;
    socklen_t len;
    struct sockaddr_in servaddr, cliaddr;
    char buff[MAXLINE];
    time_t ticks;

    listenfd = Socket(AF_INET, SOCK_STREAM, 0);

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family      = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port        = htons(13); /* daytime server */

    Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));

    Listen(listenfd, LISTENQ);

    for ( ; ; ) {
        len = sizeof(cliaddr);
        connfd = Accept(listenfd, (SA *) &cliaddr, &len);
        printf("connection from %s, port %d\n",
               Inet_ntop(AF_INET, &cliaddr.sin_addr, buff, sizeof(buff)),
               ntohs(cliaddr.sin_port));

        ticks = time(NULL);
        snprintf(buff, sizeof(buff), "%.24s\r\n", ctime(&ticks));
        Write(connfd, buff, strlen(buff));

        Close(connfd);
    }
}

For every connection, the server prints the client's IP address and port.

The output shows that a client that does not call bind is assigned an ephemeral port for the connection.
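The same accept logic can be written without the unp.h wrapper functions. The sketch below uses only standard calls; the helper name accept_with_peer is an assumption for this example, and it handles IPv4 only.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: accept one IPv4 connection and report the peer.
 * ip must have room for INET_ADDRSTRLEN bytes. Returns the connected
 * descriptor, or -1 on error. */
int accept_with_peer(int listenfd, char *ip, socklen_t iplen,
                     unsigned short *port)
{
    struct sockaddr_in cliaddr;
    socklen_t len = sizeof(cliaddr);     /* value: size of our buffer */
    int connfd;

    connfd = accept(listenfd, (struct sockaddr *)&cliaddr, &len);
    if (connfd < 0)
        return -1;

    /* result: len now holds the number of bytes the kernel stored */
    if (len != sizeof(cliaddr) ||
        inet_ntop(AF_INET, &cliaddr.sin_addr, ip, iplen) == NULL) {
        close(connfd);
        return -1;
    }
    *port = ntohs(cliaddr.sin_port);     /* the client's ephemeral port */
    return connfd;
}
```

The port reported here should equal the port the client itself sees from getsockname on its own socket.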

 

----- Concurrent Server -----

 

6. fork function:

#include <unistd.h>
pid_t fork(void); // returns twice: 0 in the child, the child's process ID in the parent; -1 on error

fork is the only way to create a new process in Unix.

Calling fork creates a new process. On success it returns once in the parent, with the child's process ID, and once in the child, with 0.

The return value therefore tells the code whether it is running in the parent or in the child.

All descriptors open in the parent before the call to fork are shared with the child after fork returns. A network server typically calls accept and then fork: the child reads and writes the connected socket while the parent closes it, which is how the server achieves concurrency.

Typical uses of fork:

1) A process makes a copy of itself so that one copy can handle one operation while the other copy does something else. This is the typical use in network servers.

2) A process wants to execute another program. The parent creates a copy of itself (the child), and the child calls exec to replace itself with the new program. This is the typical use in programs such as shells.
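The two return values can be seen in a short sketch. The helper name fork_and_wait is made up for this example; the child does nothing but exit with the requested status, and the parent collects it with waitpid.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`. The parent
 * waits for it and returns the exit status it observed, or -1 on error. */
int fork_and_wait(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed, no child was created */

    if (pid == 0)
        _exit(code);               /* child: fork returned 0 */

    /* parent: fork returned the child's process ID */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```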

 

7. exec functions:

#include <unistd.h>
int execl(const char *pathname, const char *arg0, ... /* (char *) 0 */);
int execv(const char *pathname, char *const argv[]);
int execle(const char *pathname, const char *arg0, ... /* (char *) 0, char *const envp[] */);
int execve(const char *pathname, char *const argv[], char *const envp[]);
int execlp(const char *filename, const char *arg0, ... /* (char *) 0 */);
int execvp(const char *filename, char *const argv[]);
// none of these return on success; all return -1 on error

The differences between these functions are:

1) whether the program file to execute is specified by a filename or by a pathname;

2) whether the arguments to the new program are listed one by one or referenced through an array of pointers;

3) whether the environment of the calling process is passed to the new program or a new environment is specified.

On failure these functions return -1 to the caller. On success they do not return: control passes to the start of the new program.

Typically only execve is a system call in the kernel; the other five are library functions that call execve.

execl, execlp and execle take each argument string of the new program as a separate argument to exec, with a null pointer terminating the variable-length list. execv, execvp and execve take an argv array of pointers to the argument strings; the array must end with a null pointer.

execlp and execvp take a filename argument, which exec converts to a pathname by searching the directories in the current PATH environment variable. If the filename contains a slash (/), PATH is not searched and the name is used as a pathname directly. The other four functions take a pathname argument that is used as given, without a PATH search.

execl, execv, execlp and execvp do not take an explicit environment pointer; they pass the current value of the external variable environ to the new program. execle and execve take an explicit envp pointer array, which must end with a null pointer.

Descriptors open in the process before the call to exec normally remain open across exec. This default can be turned off for a descriptor by using fcntl to set the FD_CLOEXEC descriptor flag. (More on this later.)
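The fork-then-exec pattern, with a null-terminated argv array passed to execv, might look like the sketch below. The helper name spawn_and_wait and the convention of exiting with status 127 when exec fails are assumptions for this example (127 is the status shells conventionally report for a command that could not be run).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` with the given argv in a
 * child process and return its exit status. Returns 127 if exec fails in
 * the child, -1 if fork or waitpid fails. argv must end with NULL. */
int spawn_and_wait(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        execv(path, argv);         /* on success this never returns */
        _exit(127);                /* reached only if execv returned -1 */
    }

    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because execv takes a pathname, the caller passes "/bin/sh" rather than "sh"; execvp would accept the bare filename and search PATH.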

 

----- Split line -----

 

Concurrent Server:

In Unix, the simplest way to write a concurrent server is to fork a child process to serve each client.

// Outline of a concurrent server
pid_t pid;
int listenfd, connfd;

listenfd = socket(...);
bind(listenfd, ...);
listen(listenfd, LISTENQ);

for ( ; ; ) {
    connfd = accept(listenfd, ...);

    if ((pid = fork()) == 0) {
        // child: close the listening socket and handle the client
        close(listenfd);
        doit(connfd);
        close(connfd);
        exit(0);
    }

    // parent: close the connected socket and go back to accept
    close(connfd);
}

When the parent calls close(connfd), the child may still be inside doit(connfd). TCP nevertheless does not send a FIN and terminate the connection with the client, because:

Every file or socket has a reference count: the number of descriptors currently open that refer to it.

When the parent closes connfd, the reference count merely drops from 2 to 1. The socket is actually cleaned up and its resources released only when the count reaches 0, that is, when the child also closes connfd.
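The reference-count behavior can be checked with a small experiment, sketched here under several assumptions: the function name is made up, one process plays both the client and the parent server over the loopback interface, the message is assumed to fit in a single read, and error paths skip descriptor cleanup for brevity. The parent closes connfd right after fork, yet the child can still echo the client's message.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical demo: serve one client from a forked child that echoes a
 * single short message. Returns the number of bytes echoed into reply,
 * or -1 on error (error paths omit cleanup for brevity). */
ssize_t echo_via_child(const char *msg, char *reply, size_t cap)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int listenfd, clientfd, connfd;
    ssize_t n;
    pid_t pid;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    clientfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0 || clientfd < 0 ||
        bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listenfd, 5) < 0 ||
        getsockname(listenfd, (struct sockaddr *)&addr, &len) < 0 ||
        connect(clientfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        (connfd = accept(listenfd, NULL, NULL)) < 0)
        return -1;

    if ((pid = fork()) < 0)
        return -1;

    if (pid == 0) {                      /* child: serves the client */
        char buf[256];
        close(listenfd);
        close(clientfd);                 /* inherited, not needed here */
        n = read(connfd, buf, sizeof(buf));
        if (n > 0 && write(connfd, buf, (size_t)n) != n)
            _exit(1);
        close(connfd);                   /* count reaches 0: FIN is sent */
        _exit(0);
    }

    close(connfd);                       /* parent: count 2 -> 1, no FIN */

    n = -1;
    if (write(clientfd, msg, strlen(msg)) == (ssize_t)strlen(msg))
        n = read(clientfd, reply, cap);  /* the echo still arrives */
    close(clientfd);
    close(listenfd);
    waitpid(pid, NULL, 0);
    return n;
}
```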

 

8. close function:

#include <unistd.h>
int close(int sockfd); // returns 0 on success, -1 on error

close is also used to close a socket and terminate a TCP connection. By default, close on a TCP socket marks the socket as closed and returns to the process immediately. The descriptor is no longer usable by the process: it cannot be passed as the first argument to read or write. TCP still tries to send any data already queued to go to the peer, and after that the normal TCP connection termination sequence takes place. Later sections describe the SO_LINGER socket option, which changes this default, and what an application must do to be sure the peer has received all outstanding data.

If the reference count is still greater than 0 after close, TCP's four-segment termination sequence is not started. To make TCP send a FIN regardless, call shutdown instead of close.

Important: if the parent does not call close on every connected socket returned by accept, two things happen. The parent eventually runs out of descriptors, and no client connection is ever terminated: when the child exits, the reference count drops from 2 to 1 and stays there because the parent never closes the connected socket, so no FIN is sent.
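The difference between close and shutdown on a socket with more than one reference can be sketched as follows. The function name and the 200 ms wait are assumptions for this example, and dup stands in for the second reference that a forked child would normally hold.

```c
#include <poll.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical demo: open a loopback TCP connection, give the client end a
 * second descriptor with dup() so its reference count is 2, then either
 * shutdown(SHUT_WR) or close() one descriptor. Returns 1 if the server end
 * sees end-of-file within 200 ms, 0 if it does not, -1 on error. */
int peer_sees_eof(int use_shutdown)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    struct pollfd pfd;
    char c;
    int listenfd, clientfd, extrafd = -1, connfd = -1, r, result = -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    clientfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0 || clientfd < 0 ||
        bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listenfd, 5) < 0 ||
        getsockname(listenfd, (struct sockaddr *)&addr, &len) < 0 ||
        connect(clientfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        (connfd = accept(listenfd, NULL, NULL)) < 0 ||
        (extrafd = dup(clientfd)) < 0)
        goto out;

    if (use_shutdown) {
        shutdown(clientfd, SHUT_WR);     /* FIN sent regardless of count */
    } else {
        close(clientfd);                 /* count 2 -> 1: no FIN */
        clientfd = -1;
    }

    pfd.fd = connfd;
    pfd.events = POLLIN;
    r = poll(&pfd, 1, 200);
    if (r > 0)
        result = (read(connfd, &c, 1) == 0);   /* 0 bytes means EOF (FIN) */
    else if (r == 0)
        result = 0;                            /* timed out: no FIN arrived */

out:
    if (extrafd >= 0) close(extrafd);
    if (connfd >= 0) close(connfd);
    if (clientfd >= 0) close(clientfd);
    if (listenfd >= 0) close(listenfd);
    return result;
}
```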

 

9. getsockname and getpeername functions:

#include <sys/socket.h>
int getsockname(int sockfd, struct sockaddr *localaddr, socklen_t *addrlen);
int getpeername(int sockfd, struct sockaddr *peeraddr, socklen_t *addrlen);
// both return 0 on success, -1 on error

Note: both functions return an IP address and port, not a domain name.

These two functions are needed for the following reasons:

1) In a TCP client that does not call bind, getsockname returns the local IP address and local port number the kernel assigned to the connection, once connect has returned successfully.

2) After calling bind with port 0, getsockname returns the local port number the kernel assigned.

3) getsockname can be used to obtain the address family of a socket, as shown in the following code:

int
sockfd_to_family(int sockfd)
{
    struct sockaddr_storage ss;
    socklen_t len;

    len = sizeof(ss);
    if (getsockname(sockfd, (SA *) &ss, &len) < 0)
        return(-1);

    return(ss.ss_family);
}

4) In a TCP server that binds the wildcard IP address, once a connection is established (accept has returned successfully), getsockname returns the local IP address the kernel assigned to the connection. (sockfd must be the connected socket descriptor, not the listening one.)

5) When a server is started via exec by the process that called accept, the only way it can learn the client's identity is to call getpeername. (After fork, both parent and child can use the address structure filled in by accept. After exec, however, the child's memory image is replaced by the new program and that structure is lost. The connected socket descriptor stays open, but the new program has to be told which descriptor it is, so only getpeername can be used.)

How the newly exec'd program obtains the connected socket descriptor: 1) the process that calls exec formats the descriptor number as a string and passes it as a command-line argument to the new program; 2) by convention, a fixed descriptor is always set to the connected socket before exec is called. inetd uses the second method.
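Use 4) and getpeername can be illustrated together. In the sketch below (the function name is made up, and one process plays both client and server over loopback), the server binds the wildcard address, so getsockname on the listening socket reports 0.0.0.0, while getsockname on the connected socket reports the address the client actually connected to.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical demo: writes the local IP of the listening socket and of the
 * connected socket (each buffer needs INET_ADDRSTRLEN bytes). Returns 1 if
 * getpeername on the connected socket reports the client's own port,
 * 0 if not, -1 on error. */
int wildcard_local_addr(char *listen_ip, char *conn_ip)
{
    struct sockaddr_in srv, dst, local, peer, cli;
    socklen_t len = sizeof(srv);
    int listenfd, clientfd, connfd = -1, result = -1;

    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_ANY);       /* wildcard IP, port 0 */

    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    clientfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0 || clientfd < 0 ||
        bind(listenfd, (struct sockaddr *)&srv, sizeof(srv)) < 0 ||
        listen(listenfd, 5) < 0 ||
        getsockname(listenfd, (struct sockaddr *)&srv, &len) < 0)
        goto out;

    dst = srv;                                     /* same port ... */
    dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* ... via 127.0.0.1 */
    if (connect(clientfd, (struct sockaddr *)&dst, sizeof(dst)) < 0 ||
        (connfd = accept(listenfd, NULL, NULL)) < 0)
        goto out;

    len = sizeof(local);
    if (getsockname(connfd, (struct sockaddr *)&local, &len) < 0)
        goto out;
    len = sizeof(peer);
    if (getpeername(connfd, (struct sockaddr *)&peer, &len) < 0)
        goto out;
    len = sizeof(cli);
    if (getsockname(clientfd, (struct sockaddr *)&cli, &len) < 0)
        goto out;

    if (inet_ntop(AF_INET, &srv.sin_addr, listen_ip, INET_ADDRSTRLEN) == NULL ||
        inet_ntop(AF_INET, &local.sin_addr, conn_ip, INET_ADDRSTRLEN) == NULL)
        goto out;
    result = (peer.sin_port == cli.sin_port);

out:
    if (connfd >= 0) close(connfd);
    if (clientfd >= 0) close(clientfd);
    if (listenfd >= 0) close(listenfd);
    return result;
}
```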

 

Summary:

Client: socket -> connect -> close

Server: socket -> bind -> listen -> accept -> close

Most TCP servers are concurrent, and most UDP servers are iterative.
