Linux network programming -- socket
-- By firo 2011.5.2
Cherish the memory of the master -- W. Richard Stevens
Reference
man pages
Google
Understanding the Linux Kernel, 3E [ulk 3E]
Professional Linux Kernel Architecture [plka]
Computer Systems: A Programmer's Perspective, 2E [cs:app 2E]
Advanced Programming in the UNIX Environment, Second Edition [apue 2E]
UNIX Network Programming, Volume 1: The Sockets Networking API, Third Edition [unpv1 3E]
The Linux Programming Interface: A Linux and UNIX System Programming Handbook [tlpi]
Preface
Sockets are a method of IPC that allow data to be exchanged between applications, either on the same host (Computer) or on different hosts connected by a network.
-- By Michael Kerrisk, author of tlpi
What is a socket?
[1] A socket is an interface between an application process and the transport layer.
[2] In UNIX jargon, a socket is a file descriptor: an integer associated with an open file.
[3] A socket is one endpoint of a two-way communication link between two programs running on the network. A socket is bound to a port number so that the TCP layer can identify the application that the data is destined for.
[4] In computer networking, an Internet socket or network socket is an endpoint of a bidirectional inter-process communication flow across an Internet Protocol-based computer network, such as the Internet.
int socket(int domain, int type, int protocol);
socket() parameters
[1] domain: the address family. It specifies the local address format used by the socket and therefore the range it can communicate over: AF_UNIX is used for local data transmission on the same host, while AF_INET and AF_INET6 transmit data over the Internet. (PF_PACKET is a different case: it gives access to raw link-layer packets.)
[2] type: the socket type, such as SOCK_STREAM or SOCK_DGRAM. The type indicates how this socket communicates: as a reliable byte stream or as datagrams.
[3] protocol: the protocol and rules for data transmission, such as IPPROTO_TCP and IPPROTO_UDP.
For details, see man 2 socket.
socket() creates a socket data structure in the kernel by calling sys_socket. This structure holds the parameters required for data transmission between hosts or processes. The members that matter for programming are:
[1] An inode in sockfs, the kernel's socket pseudo-filesystem, created by the kernel. Through the file descriptor associated with it we can operate on the socket with read() and write().
[2] The address of the local host, in the format specified by the domain parameter. It is set to a specific value by the bind() system call.
[3] The socket type, initialized from the type parameter.
[4] The data transmission protocol, specified by the protocol parameter.
[5] The address family, specified by domain.
Generic socket address structure -- struct sockaddr
Different domains use different socket address formats, yet the system calls must offer one uniform interface to every application, and these two requirements conflict. The generic address structure resolves the conflict: it hides the differences between the concrete socket address structures, so the system calls are not tied to any one of them. To use these system calls, first cast a pointer to your own address structure to a pointer to the generic struct sockaddr ~~ You may say: I could do this with a void * pointer, so why define another structure? However, in 1982 the void * of ANSI C was not born yet ~~
Byte order
You need to remember that big endian, network byte order, and the way humans write multi-byte numbers all use the same order: the most significant byte comes first, that is, at the lowest address ~~ The opposite order is little endian, which is what Linux uses on x86 ~

POSIX specifies that the destination host address and the local address in the protocol headers of our TCP segments or UDP datagrams are stored in network byte order, i.e. big endian ~~ The port that identifies the process, which sits in the same address structure, must also be converted to network byte order so that the TCP/IP protocol stack can interpret it correctly.

While we are on byte order and its operations: how do we transmit binary data over the network between systems whose byte order may differ? There are two solutions: [1] convert the data to a text string format [2] state the byte order of the transmitted data explicitly. Text does not have this problem because ASCII uses a single byte per character, so there is naturally no ordering issue between multiple bytes. UTF-8 is likewise defined as a sequence of single bytes, while an encoding such as UTF-16 uses multi-byte units and states the byte order of the data explicitly with a byte order mark ~~~
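To make the second solution concrete, here is a minimal sketch that always puts a 32-bit value on the wire in network byte order, whatever the host order is. The three function names are invented for this example; htonl() and ntohl() are the standard conversion functions.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Returns 1 if this host stores the least significant byte first. */
int host_is_little_endian(void)
{
    uint16_t probe = 0x0102;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* byte at the lowest address */
    return first == 0x02;
}

/* Store value into out[] in network byte order (big endian). */
void put_u32_network(uint32_t value, unsigned char out[4])
{
    uint32_t wire = htonl(value);   /* host order -> network order */
    memcpy(out, &wire, sizeof(wire));
}

/* Read a 32-bit value that was stored in network byte order. */
uint32_t get_u32_network(const unsigned char in[4])
{
    uint32_t wire;
    memcpy(&wire, in, sizeof(wire));
    return ntohl(wire);             /* network order -> host order */
}
```

The same rule applies to the port and address fields of struct sockaddr_in: fill them with htons() and htonl(), and read them back with ntohs() and ntohl().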
Why can't I connect() again after connect() fails? This is really a problem that needs solving.
Google's answer is that the state of the socket is unspecified after connect() fails, and calling connect() on it again may produce an error. You need to close the socket and create a new one before connecting again ~~ As for why it fails, that leads into the kernel, which is hard, hard enough! I will talk about it later ~~
What is a TCP/UDP port?
A port lives in the TCP/IP protocol stack and can be viewed as a data structure plus a buffer. It identifies a type of service. As long as your application provides that service, it can ask the kernel for the port by calling the bind() system call. For example, the most familiar service is the Web (HTTP), and its port number is 80; web server software such as Tomcat and Apache can use this port. Remember:
[1] Only one application (process or thread) can listen on a port at a time. On Unix (BSD) you can set the socket option SO_REUSEPORT to let multiple processes or threads listen on the same port at the same time, but this causes confusion: the TCP/IP protocol stack cannot tell which process the data should be delivered to. To avoid this problem, Linux does not support the SO_REUSEPORT socket option; the constant is not even defined on my Ubuntu 10.10. (Linux later added SO_REUSEPORT in kernel 3.9.)
[2] A parent and child process, or multiple threads, can use the same port at the same time without confusion, because each connection is identified by its socket address pair, which is unique! unpv2 sections 2.9 and 2.10 have a wonderful discussion of this second point. If you really cannot understand it, go and read unpv2.
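Point [1] can be checked with a short experiment: one socket listens on a port, and a second socket then tries to bind the same address and port. This is a sketch assuming IPv4 loopback and no address-reuse options; the function name second_bind_errno() is made up here.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical experiment: returns the errno of the second bind(),
 * 0 if it unexpectedly succeeded, or -1 if the setup itself failed. */
int second_bind_errno(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int result = -1;

    int first = socket(AF_INET, SOCK_STREAM, 0);
    int second = socket(AF_INET, SOCK_STREAM, 0);
    if (first == -1 || second == -1)
        goto out;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);          /* let the kernel pick a free port */

    if (bind(first, (struct sockaddr *) &addr, sizeof(addr)) == -1 ||
        listen(first, 5) == -1 ||
        getsockname(first, (struct sockaddr *) &addr, &len) == -1)
        goto out;                      /* addr now holds the chosen port */

    /* Same address, same port, while the first socket is listening. */
    if (bind(second, (struct sockaddr *) &addr, sizeof(addr)) == -1)
        result = errno;                /* EADDRINUSE is the expected error */
    else
        result = 0;

out:
    if (first != -1)
        close(first);
    if (second != -1)
        close(second);
    return result;
}
```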
UNIX domain socket
Note the following when you bind a file system path name to a Unix domain socket as its address:
[1] You cannot bind an existing path name to a Unix domain socket.
[2] It is best to bind an absolute path name, to avoid any trouble ~
[3] The bound path name and the socket map one-to-one: a socket is bound to one path name, and a path name belongs to one socket.
[4] open() cannot be used on a Unix domain socket.
[5] When you are done, call unlink() or remove() to delete the created file.
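The rules above can be sketched as follows. The helper name bind_unix_path() is an assumption for this example, and the caller is expected to pass an absolute path that does not exist yet.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: create a Unix domain stream socket and bind it to
 * path. Returns the descriptor, or -1 with errno set. */
int bind_unix_path(const char *path)
{
    struct sockaddr_un addr;

    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;          /* would not fit in sun_path */
        return -1;
    }

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    /* bind() creates the socket file; it fails with EADDRINUSE if the
     * path name already exists (rule [1]). */
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

When the socket is no longer needed, close() the descriptor and then unlink() the path, as rule [5] says; closing alone leaves the file behind.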
The Linux abstract socket namespace [lasn]
lasn provides a new mechanism for creating the address of a Unix domain socket. First, the advantages of lasn: [1] There is no need to worry about collisions with files that already exist ~ [2] The created name does not need to be deleted explicitly. When the socket is closed, the name is removed automatically ~ [3] You can still use a Unix domain socket when it is not convenient to write to the file system ~~ It is very easy to use: put a null byte in front of the actual name, and everything else remains unchanged. This is the code from tlpi:
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    /* errExit() is the error-exit helper from the tlpi source code */
    struct sockaddr_un addr;
    int sockfd;

    memset(&addr, 0, sizeof(struct sockaddr_un));   /* Clear address structure */
    addr.sun_family = AF_UNIX;                      /* UNIX domain address */

    /* addr.sun_path[0] has already been set to 0 by memset() */
    strncpy(&addr.sun_path[1], "xyz", sizeof(addr.sun_path) - 2);
    /* Abstract name is "xyz" followed by null bytes */

    sockfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sockfd == -1)
        errExit("socket");

    if (bind(sockfd, (struct sockaddr *) &addr, sizeof(struct sockaddr_un)) == -1)
        errExit("bind");
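The same idea as the tlpi fragment, wrapped into a self-contained function so that the automatic clean-up can be observed. The name bind_abstract() is invented for this sketch, and it follows the fragment in passing the full structure size to bind(), so the abstract name is the given string padded with null bytes.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: bind a Unix domain stream socket to a name in the
 * Linux abstract namespace. Returns the descriptor, or -1 with errno set. */
int bind_abstract(const char *name)
{
    struct sockaddr_un addr;

    if (strlen(name) > sizeof(addr.sun_path) - 2) {
        errno = ENAMETOOLONG;
        return -1;
    }

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));    /* also sets sun_path[0] to 0 */
    addr.sun_family = AF_UNIX;
    strncpy(&addr.sun_path[1], name, sizeof(addr.sun_path) - 2);

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;                         /* no file was created on disk */
}
```

While the first socket is open, a second bind_abstract() with the same name fails with EADDRINUSE; after close() the name is free again, with no unlink() needed.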
Internet Domain socket