Network programming
The purpose of network programming is to communicate with other computers, directly or indirectly, through network protocols. Network programming has two main problems: how to locate one or more hosts on the network accurately, and, once a host has been located, how to transfer data reliably and efficiently. In the TCP/IP protocol suite, the IP layer is mainly responsible for locating hosts and routing data; an IP address uniquely identifies a host on the Internet. The transport layer (TCP and UDP) provides application-oriented reliable or unreliable data transmission mechanisms. It is the main object of network programming, and you generally do not need to care about how the IP layer handles the data.
At present, the most popular network programming model is the client/server (C/S) structure: one party acts as the server, waiting for clients to make requests and responding to them, and a client contacts the server when it needs a service. A server generally runs as a daemon that is always running and listening on a network port. Once a client request arrives, it starts a service process to respond to that client while continuing to listen on the service port, so that later clients can also be served promptly.
On the Internet, IP addresses and host names are mapped to each other, and domain name resolution obtains a machine's IP address from its host name. Because host names are closer to natural language and easier to remember, they are used more widely than IP addresses, but to the machine only the IP address is a valid identifier.
Typically, many processes on a single host need network resources at the same time. Strictly speaking, the object of network communication is not a host but a process running on that host. A host name or IP address alone is clearly not enough to identify so many processes. The port number is the means by which a single host offers multiple network endpoints, and it is a mechanism provided by the transport layer. Only the combination of host name or IP address and port number uniquely identifies the object of network communication: a process.
Sockets
A socket describes an IP address and a port, and serves as a handle to a communication link. Applications usually make requests to the network, or answer network requests, through a socket.
Sockets can be classified by the nature of the communication, and this property is visible to the user. Applications generally communicate only between sockets of the same type, although sockets of different types can still communicate as long as the underlying communication protocol allows it. There are two commonly used types of sockets: stream sockets and datagram sockets.
The explanation below is fairly abstract and can be skipped.
Sockets are the cornerstone of communication and the basic operating unit of network communication over TCP/IP. You can treat a socket as an endpoint of two-way communication between processes on different hosts; it forms the programming interface between a single host and the whole network. Sockets exist in communication domains, an abstract concept introduced to handle processes that communicate through sockets. Sockets typically exchange data with sockets in the same domain (data exchange may also cross domain boundaries, but then some kind of translation program must be run). Processes in the same domain communicate with each other using the Internet protocol suite.
How Sockets work
To communicate over the Internet, you need at least one pair of sockets: one running on the client side, which we call ClientSocket, and one running on the server side, which we call ServerSocket.
Depending on how the connection is started and the destination to which the local socket connects, establishing a connection between sockets can be divided into three steps: server listening, client request, and connection acknowledgement.
Server listening means that the server-side socket does not target a specific client socket; instead it stays in a waiting-for-connection state and monitors the network in real time.
A client request is a connection request made by the client's socket, whose target is the server-side socket. To do this, the client's socket must first describe the server socket it wants to connect to, giving the server-side socket's address and port number, and then send a connection request to the server-side socket.
Connection acknowledgement means that when the server-side socket hears, or receives, a connection request from a client socket, it responds to the request by creating a new thread and sending a description of the server-side socket to the client. Once the client confirms the description, the connection is established. The server-side socket remains in the listening state and continues to receive connection requests from other client sockets.
Socket address structure
    struct in_addr {
        in_addr_t s_addr;             /* 32-bit IPv4 address, network byte ordered */
    };

    struct sockaddr_in {
        sa_family_t    sin_family;    /* AF_INET */
        in_port_t      sin_port;      /* 16-bit TCP or UDP port number, network byte ordered */
        struct in_addr sin_addr;      /* 32-bit IPv4 address, network byte ordered */
        char           sin_zero[8];   /* unused */
    };

sockaddr_in is the IPv4 socket address structure. It is 16 bytes in size and is defined in the <netinet/in.h> header file. Programs generally work with this structure directly, but when it is passed as an argument to a socket function, the pointer to it must be cast to a pointer to struct sockaddr. Note that the port and address members of the structure are in network byte order (big-endian).
    struct sockaddr {
        sa_family_t sa_family;        /* address family: AF_xxx value */
        char        sa_data[14];      /* protocol-specific address */
    };

sockaddr is the generic socket address structure. When passed as an argument to a socket function such as bind, accept, or connect, a socket address structure is always passed by pointer.
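The following is a minimal sketch of how the two structures are typically used together. The helper name make_ipv4_addr is hypothetical (it is not part of any library); it fills a sockaddr_in from a dotted-decimal string and a host-order port, leaving the members in network byte order.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: fill *out for the given dotted-decimal IPv4 string
 * and host-order port. Returns 0 on success, -1 if ip is not valid. */
int make_ipv4_addr(const char *ip, unsigned short port, struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));       /* also clears sin_zero */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);        /* port must be in network byte order */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;                      /* not a valid IPv4 address string */
    return 0;
}
```

When the filled structure is handed to a socket function, it is passed as a generic pointer plus its size, for example connect(fd, (struct sockaddr *)&addr, sizeof(addr)).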
htons, ntohs, htonl, and ntohl functions
    #include <netinet/in.h>
    uint16_t htons(uint16_t host16bitvalue);
    uint32_t htonl(uint32_t host32bitvalue);
    uint16_t ntohs(uint16_t net16bitvalue);
    uint32_t ntohl(uint32_t net32bitvalue);

Linux provides four functions to convert between host byte order and network byte order. In the function names, h stands for host, n for network, s for short (16 bits), and l for long (32 bits). When using these functions, you do not need to care whether the host byte order is actually big-endian or little-endian; just call the appropriate function to convert a value between host and network byte order.
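As a small illustration, the sketch below inspects the first byte in memory of a 16-bit value after htons. The function name first_byte_on_wire is made up for this example. Because network byte order is big-endian, the most significant byte comes first regardless of the host's own byte order.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helper: return the first byte in memory of a 16-bit value
 * after conversion to network byte order. */
unsigned char first_byte_on_wire(uint16_t host_value)
{
    uint16_t net = htons(host_value);
    unsigned char bytes[2];

    memcpy(bytes, &net, sizeof(net));   /* look at the raw memory layout */
    return bytes[0];                    /* most significant byte comes first */
}
```

For example, first_byte_on_wire(0x1234) yields 0x12 on both little-endian and big-endian hosts, and ntohs(htons(x)) always gives back x.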
inet_aton, inet_addr, and inet_ntoa functions
    #include <arpa/inet.h>
    int inet_aton(const char *strptr, struct in_addr *addrptr);
        /* Returns: 1 if the string is valid, otherwise 0 */
    in_addr_t inet_addr(const char *strptr);
        /* Returns: the 32-bit binary network byte-order address if the string is valid, otherwise INADDR_NONE */
    char *inet_ntoa(struct in_addr inaddr);
        /* Returns: pointer to a dotted-decimal string */

inet_aton, inet_addr, and inet_ntoa convert IPv4 addresses between a dotted-decimal string (such as "192.168.1.1") and the corresponding 32-bit binary value in network byte order. When calling inet_ntoa, note that its argument is the IPv4 address as a struct in_addr (a 32-bit unsigned value) and that it returns a pointer to the dotted-decimal string. The memory holding that string is allocated inside the function and the caller does not need to free it: it is a static buffer, which means that the result of the first call is overwritten when the function is called a second time.
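Because of the static buffer, a common precaution is to copy the result of inet_ntoa before calling it again. The sketch below shows one way to do that; the function name roundtrip_two and its parameters are hypothetical, chosen only for this example.

```c
#include <stdio.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: convert two dotted-decimal strings to binary and
 * back. Each inet_ntoa result is copied out immediately, because the next
 * call overwrites the same static buffer.
 * Returns 0 on success, -1 if either input string is invalid. */
int roundtrip_two(const char *ip1, const char *ip2,
                  char out1[INET_ADDRSTRLEN], char out2[INET_ADDRSTRLEN])
{
    struct in_addr a, b;

    if (inet_aton(ip1, &a) == 0 || inet_aton(ip2, &b) == 0)
        return -1;

    snprintf(out1, INET_ADDRSTRLEN, "%s", inet_ntoa(a));  /* copy before next call */
    snprintf(out2, INET_ADDRSTRLEN, "%s", inet_ntoa(b));
    return 0;
}
```

Keeping the two returned pointers instead of copying would leave both pointing at the same buffer, which holds only the second address.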
inet_pton and inet_ntop functions
    #include <arpa/inet.h>
    int inet_pton(int family, const char *strptr, void *addrptr);
        /* Returns: 1 on success, 0 if the input is not a valid presentation format, -1 on error */
    const char *inet_ntop(int family, const void *addrptr, char *strptr, socklen_t len);
        /* Returns: pointer to the result on success, NULL on error */

Both functions work with IPv4 and IPv6. In the names, p stands for presentation and n for numeric. inet_pton converts the string pointed to by strptr and stores the binary result through the pointer addrptr. It returns 1 on success and 0 if the input is not a valid presentation format for the specified family.
inet_ntop does the reverse conversion. If len is too small to hold the result, it returns a null pointer and sets errno to ENOSPC. The strptr argument of inet_ntop cannot be a null pointer: the caller must allocate memory for the destination and specify its size in len. On success, this pointer is the function's return value.
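A minimal sketch of the two functions working together, for either address family, might look like this. The helper normalize_address is hypothetical; it parses a string with inet_pton and formats it back with inet_ntop into a caller-supplied buffer.

```c
#include <string.h>
#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: parse text as an address of the given family
 * (AF_INET or AF_INET6) and write its canonical form into out.
 * Returns 1 on success, 0 if text is not valid for that family,
 * -1 on error (errno is set, e.g. ENOSPC if out is too small). */
int normalize_address(int family, const char *text, char *out, socklen_t outlen)
{
    unsigned char binary[sizeof(struct in6_addr)];  /* large enough for both families */
    int rc = inet_pton(family, text, binary);

    if (rc != 1)
        return rc;                      /* 0: bad format, -1: bad family */
    if (inet_ntop(family, binary, out, outlen) == NULL)
        return -1;                      /* caller's buffer too small */
    return 1;
}
```

A buffer of INET6_ADDRSTRLEN bytes is large enough for the presentation form of either family.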
Socket function
To perform network I/O, the first thing a process must do is call the socket function, specifying the desired communication protocol (such as TCP over IPv4, UDP over IPv6, or a UNIX domain stream protocol) and the socket type (byte stream, datagram, or raw socket).

    #include <sys/socket.h>
    int socket(int family, int type, int protocol);
        /* Returns: non-negative descriptor on success, -1 on error */

family specifies the protocol family, type specifies the socket type, and protocol specifies a protocol type constant, or is set to 0 to select the system default for the given combination of family and type.
The values of family are:
- AF_INET: IPv4 protocol
- AF_INET6: IPv6 protocol
- AF_LOCAL: UNIX domain protocol
- AF_ROUTE: routing sockets
- AF_KEY: key sockets
The values of type are:
- SOCK_STREAM: byte stream socket
- SOCK_DGRAM: datagram socket
- SOCK_SEQPACKET: sequenced packet socket
- SOCK_RAW: raw socket
The values of protocol are:
- IPPROTO_TCP: TCP transport protocol
- IPPROTO_UDP: UDP transport protocol
- IPPROTO_SCTP: SCTP transport protocol
On success, the socket function returns a small non-negative integer, similar to a file descriptor, called the socket descriptor. To obtain this descriptor you only need to specify the protocol family and socket type; neither the local protocol address nor the remote protocol address is specified yet.
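The sketch below creates a socket for a given combination of family, type, and protocol, and asks the kernel which type it actually created. The helper name created_socket_type is hypothetical, and the use of getsockopt with SO_TYPE is an addition for illustration only. A combination that does not make sense, such as SOCK_STREAM with IPPROTO_UDP, is expected to make socket fail.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: create a socket, query its type, and close it.
 * Returns the socket type (e.g. SOCK_STREAM) or -1 on error. */
int created_socket_type(int family, int type, int protocol)
{
    int actual = 0;
    socklen_t len = sizeof(actual);
    int rc;
    int fd = socket(family, type, protocol);

    if (fd < 0)
        return -1;                      /* unsupported combination or no resources */

    rc = getsockopt(fd, SOL_SOCKET, SO_TYPE, &actual, &len);
    close(fd);                          /* the descriptor is no longer needed */
    return rc == 0 ? actual : -1;
}
```

For instance, created_socket_type(AF_INET, SOCK_STREAM, 0) lets the system pick TCP as the default protocol for an IPv4 byte stream socket.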
Connect function
    #include <sys/socket.h>
    int connect(int sockfd, const struct sockaddr *servaddr, socklen_t addrlen);
        /* Returns: 0 on success, -1 on error */

A TCP client uses the connect function to establish a connection with a TCP server. sockfd is the socket descriptor returned by the socket function; the second and third parameters are a pointer to a socket address structure and the size of that structure. The socket address structure must contain the server's IP address and port number. Note: if connect fails, the socket is no longer usable; you must close the descriptor and call socket again. The client does not have to call bind before calling connect (UDP client programs, for example, generally do not call bind); the kernel determines the source IP address and selects an ephemeral port as the source port.
For a TCP socket, calling connect triggers TCP's three-way handshake, and the function returns only when the connection is established or an error occurs. Note: connect returns when the server's SYN+ACK is received, that is, after the second segment of the three-way handshake.
UDP sockets can also call connect, but the effect is quite different from TCP: there is no three-way handshake. The kernel simply checks for any immediately known error (such as an unreachable destination address), records the peer's IP address and port number, and returns to the calling process at once. A UDP program that has called connect does not have to use sendto; it can use write and read directly.
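To illustrate the UDP case, here is a minimal loopback sketch: one UDP socket is bound to 127.0.0.1 on a kernel-chosen port, a second one calls connect to that address and then sends with plain write. The function name udp_connected_roundtrip is hypothetical, and the sketch assumes the loopback interface is available.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if a datagram written on a connected UDP
 * socket arrives at the receiving socket, -1 otherwise. */
int udp_connected_roundtrip(void)
{
    int result = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);    /* receiver */
    int tx = socket(AF_INET, SOCK_DGRAM, 0);    /* sender */
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char buf[8];

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                          /* let the kernel pick a port */

    if (rx < 0 || tx < 0)
        goto out;
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(rx, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    /* No handshake happens here: the kernel only records the peer address. */
    if (connect(tx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    /* With a recorded peer, write can be used instead of sendto. */
    if (write(tx, "ping", 4) != 4)
        goto out;
    if (read(rx, buf, sizeof(buf)) == 4 && memcmp(buf, "ping", 4) == 0)
        result = 0;
out:
    if (rx >= 0) close(rx);
    if (tx >= 0) close(tx);
    return result;
}
```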
Bind function
    #include <sys/socket.h>
    int bind(int sockfd, const struct sockaddr *myaddr, socklen_t addrlen);
        /* Returns: 0 on success, -1 on error */

The bind function assigns a local protocol address to a socket; what the protocol address means depends on the protocol itself. The second parameter is a pointer to a protocol-specific address structure, and the third parameter is the length of that structure. For TCP, a call to bind can specify a port number, an IP address, both, or neither.
An IP address passed to bind must belong to one of the host's network interfaces. Servers bind their well-known port at startup. If a TCP client or server does not call bind to bind a port, the kernel selects an ephemeral port for the socket when connect or listen is called. Letting the kernel choose an ephemeral port is normal for a TCP client but rare for a TCP server, because servers are known by their well-known ports.
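The sketch below binds a TCP socket to the loopback address. Passing port 0 asks the kernel to choose an ephemeral port, which can then be read back with getsockname. The helper name bind_loopback_tcp is hypothetical, and the loopback address is used here only so the example does not depend on any particular network interface.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to 127.0.0.1 and the given
 * host-order port (0 means "kernel chooses"). The port actually bound is
 * stored in *bound_port. Returns the descriptor, or -1 on error. */
int bind_loopback_tcp(unsigned short port, unsigned short *bound_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *bound_port = ntohs(addr.sin_port);   /* report the port in host byte order */
    return fd;
}
```

A server would normally pass its well-known port instead of 0, and often INADDR_ANY instead of the loopback address so that it accepts connections on any interface.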
Listen function
    #include <sys/socket.h>
    int listen(int sockfd, int backlog);
        /* Returns: 0 on success, -1 on error */

When a socket is created, it is assumed to be an active socket, that is, a client socket that will call connect to initiate a connection. The listen function converts an unconnected socket into a passive socket, telling the kernel that it should accept connection requests directed to that socket; calling listen moves the socket from the CLOSED state to the LISTEN state. The second parameter specifies the maximum number of connections that the kernel should queue for the socket. For a given listening socket, the kernel maintains two queues:
- incomplete connection queue: contains an entry for each SYN segment that has arrived from a client and for which the server is waiting for the TCP three-way handshake to complete. These sockets are in the SYN_RCVD state.
- completed connection queue: contains an entry for each client with which the TCP three-way handshake has completed. These sockets are in the ESTABLISHED state.
Image from UNIX Network Programming, Volume 1
The backlog parameter is interpreted differently on different systems, although the interpretations are broadly similar. UNP (3rd edition) gives this definition: the backlog of listen() should specify the maximum number of completed connections that the kernel queues for a given socket.
When a client SYN arrives and these queues are full, TCP ignores the segment and does not send an RST. The condition is considered temporary: the client TCP will retransmit the SYN, hoping to find room in the queue soon. If the server responded with an RST instead, the client's connect would return an error immediately rather than letting the retransmission mechanism take effect, and the client could not tell whether the RST in response to its SYN meant "no server is listening on that port" or "a server is listening, but its queue is full".
After the three-way handshake completes, data that arrives before the server calls accept is queued by the server's TCP, up to the size of the connected socket's receive buffer.
In TCP server programming, after listen has been called and before accept is called, a client can still establish a connection successfully; the connection is placed on the completed connection queue and is taken off it when accept is called.
Accept function
    #include <sys/socket.h>
    int accept(int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen);
        /* Returns: non-negative connected descriptor on success, -1 on error */

The accept function is called by a TCP server to return the next completed connection from the head of the completed connection queue. If that queue is empty, the process is put to sleep (assuming the socket is blocking). If accept succeeds, its return value is a new socket created automatically by the kernel, representing the TCP connection to the returned client. The first parameter of the function is the listening socket, and the return value is the connected socket.
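The relationship between listen, connect, and accept can be sketched on the loopback interface within a single process. The function name connect_before_accept is hypothetical. Because the kernel completes the handshake for a listening socket on its own, the client's connect is expected to succeed before the server has called accept, and accept then returns a descriptor distinct from the listening one.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if a loopback client connects before accept
 * is called and accept then yields a separate connected descriptor. */
int connect_before_accept(void)
{
    int result = -1;
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    int client = socket(AF_INET, SOCK_STREAM, 0);
    int conn = -1;
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* kernel picks a free port */

    if (listener < 0 || client < 0)
        goto out;
    if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listener, 5) < 0 ||
        getsockname(listener, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    /* The handshake is completed by the kernel; accept has not been called. */
    if (connect(client, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    /* Take the completed connection off the queue. */
    conn = accept(listener, NULL, NULL);
    if (conn >= 0 && conn != listener)
        result = 0;
out:
    if (conn >= 0) close(conn);
    if (client >= 0) close(client);
    if (listener >= 0) close(listener);
    return result;
}
```

Passing NULL for the last two arguments of accept is allowed when the caller is not interested in the client's address.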
Close function
    #include <unistd.h>
    int close(int sockfd);
        /* Returns: 0 on success, -1 on error */

The default behavior of close on a TCP socket is to mark the socket as closed and return to the calling process immediately. Note that close really just decrements the socket's reference count by 1; if the count is still greater than 0, the socket is not actually closed.
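The reference counting can be observed with a short sketch. It is an assumption of this example that a UNIX domain socketpair and dup stand in for a TCP connection shared between descriptors (for instance after fork); the descriptor reference counting works the same way. The function name close_decrements_refcount is hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if the socket stays usable after the first
 * close (a second reference exists) and the peer sees end-of-file only
 * after the last reference is closed; -1 otherwise. */
int close_decrements_refcount(void)
{
    int sv[2];
    int extra, still_open, peer_saw_eof;
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    extra = dup(sv[0]);                 /* second reference to the same socket */
    if (extra < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    close(sv[0]);                       /* count goes 2 -> 1: socket stays open */
    still_open = (write(extra, "x", 1) == 1 &&
                  read(sv[1], &c, 1) == 1 && c == 'x');

    close(extra);                       /* count goes 1 -> 0: really closed now */
    peer_saw_eof = (read(sv[1], &c, 1) == 0);

    close(sv[1]);
    return (still_open && peer_saw_eof) ? 0 : -1;
}
```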
Server, client interaction flowchart
TCP state transition diagram
getsockname and getpeername functions
    #include <sys/socket.h>
    int getsockname(int sockfd, struct sockaddr *localaddr, socklen_t *addrlen);
    int getpeername(int sockfd, struct sockaddr *peeraddr, socklen_t *addrlen);
        /* Both return: 0 on success, -1 on error */

getsockname obtains the local socket address associated with sockfd and stores it in the memory pointed to by localaddr; the length of the address is stored in the variable that addrlen points to. getpeername obtains the remote (peer) socket address in the same way.
UDP clients can also use getpeername if they have called connect.
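As an illustration, the sketch below sets up a loopback TCP connection and checks that each side's peer address matches the other side's local address. The function name peer_is_other_local is hypothetical, and the single-process loopback setup is an assumption made to keep the example self-contained.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if the addresses reported by getsockname
 * and getpeername on the two ends of a connection mirror each other. */
int peer_is_other_local(void)
{
    int result = -1;
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    int client = socket(AF_INET, SOCK_STREAM, 0);
    int conn = -1;
    struct sockaddr_in srv, cli_local, cli_peer, conn_peer;
    socklen_t len;

    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = 0;                       /* kernel picks a free port */

    if (listener < 0 || client < 0)
        goto out;
    len = sizeof(srv);
    if (bind(listener, (struct sockaddr *)&srv, sizeof(srv)) < 0 ||
        listen(listener, 1) < 0 ||
        getsockname(listener, (struct sockaddr *)&srv, &len) < 0)
        goto out;
    if (connect(client, (struct sockaddr *)&srv, sizeof(srv)) < 0)
        goto out;
    conn = accept(listener, NULL, NULL);
    if (conn < 0)
        goto out;

    len = sizeof(cli_local);                /* reset the length before each call */
    if (getsockname(client, (struct sockaddr *)&cli_local, &len) < 0)
        goto out;
    len = sizeof(cli_peer);
    if (getpeername(client, (struct sockaddr *)&cli_peer, &len) < 0)
        goto out;
    len = sizeof(conn_peer);
    if (getpeername(conn, (struct sockaddr *)&conn_peer, &len) < 0)
        goto out;

    if (cli_peer.sin_port == srv.sin_port &&
        conn_peer.sin_port == cli_local.sin_port &&
        conn_peer.sin_addr.s_addr == cli_local.sin_addr.s_addr)
        result = 0;
out:
    if (conn >= 0) close(conn);
    if (client >= 0) close(client);
    if (listener >= 0) close(listener);
    return result;
}
```

Note that addrlen is a value-result argument: it must hold the buffer size before the call and holds the actual address length afterwards.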
Recv and send functions
    #include <sys/socket.h>
    ssize_t recv(int sockfd, void *buff, size_t nbytes, int flags);
    ssize_t send(int sockfd, const void *buff, size_t nbytes, int flags);
        /* Both return: number of bytes read or written on success, -1 on error */

These functions read and write data on a TCP stream. The flags values include:
- MSG_OOB: for send, indicates that out-of-band data is being sent (only one byte can be sent as out-of-band data on a TCP connection); for recv, indicates that out-of-band data is to be read instead of normal data.
- MSG_PEEK: applies to recv and recvfrom; it lets us look at data that is available to be read without the system discarding it after recv or recvfrom returns.
Note that the flags parameter affects only the current call to send or recv; some properties of a socket can be changed permanently with the setsockopt system call.
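The effect of MSG_PEEK can be shown with a short sketch. As an assumption of this example, a UNIX domain stream socketpair is used in place of a TCP connection so that no network setup is needed; the function name peek_then_read is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if data seen with MSG_PEEK is still
 * delivered by the following ordinary recv, -1 otherwise. */
int peek_then_read(void)
{
    int sv[2];
    int result = -1;
    char first[8] = {0};
    char second[8] = {0};

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (send(sv[0], "hello", 5, 0) == 5 &&
        recv(sv[1], first, 5, MSG_PEEK) == 5 &&   /* look, but leave the data queued */
        recv(sv[1], second, 5, 0) == 5 &&         /* now actually consume it */
        memcmp(first, "hello", 5) == 0 &&
        memcmp(second, "hello", 5) == 0)
        result = 0;

    close(sv[0]);
    close(sv[1]);
    return result;
}
```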