When processes communicate over a network, each side must know which process on which computer it is talking to; otherwise communication is impossible. Locally, a process can be uniquely identified by its PID, but a PID means nothing across a network. The TCP/IP protocol family solves this problem: at the network layer, the "IP address" uniquely identifies a host in the network, and at the transport layer, "protocol + port" uniquely identifies an application (process) on that host. A networked process can therefore be identified by the triple (IP address, protocol, port), and processes use this triple to interact with one another.
What is a socket?
Sockets are the interprocess communication mechanism of 4BSD UNIX. A socket describes an IP address and a port, and the program handles it as a file descriptor for a communication connection. By analogy, a socket is like a service window in a bank. Each window has a label, and the bank designates which services each numbered window provides (the window label is like the IP address recorded in the socket, and the service is like the socket's port number). A customer can then go to the right window, based on its label, to request the service needed.
How does one socket connect to another?
As emphasized above, two processes communicating over a network must each know who the other is (through "IP address + port"); otherwise they cannot communicate. Imagine you write a letter to your friend Xiao Ming but forget to write the sender and the return address, and then you mail it. You start waiting for his reply, but even if Xiao Ming receives the letter, he does not know who sent it or where to send his answer. Will you ever get a reply?
Yet when you write, say, a TCP client, you may notice that you never bind an address to the socket and it can still communicate with the server. Why? Because the operating system does it for you: when it sees connect or listen called on a socket that has not been bound to an IP address and port, it assigns an IP address and port automatically. To continue the letter example, suppose you ask a good friend to mail the letter for you. The friend spots your careless mistake and fills in your name and return address in the sender fields, so Xiao Ming can correspond with you normally.
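The implicit binding described above can be observed with getsockname. The following is a minimal sketch, not a definitive implementation: the function name implicit_client_port is made up for illustration, and the throwaway loopback listener exists only so the client has something to connect to.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connects a TCP client to a throwaway loopback
 * listener WITHOUT calling bind() on the client, then returns the local
 * port the kernel chose for the client (host byte order), or -1 on error. */
int implicit_client_port(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int port = -1;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);

    if (lfd < 0 || cfd < 0)
        goto out;

    /* The listener is bound explicitly; port 0 lets the kernel pick one. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    /* No bind() on cfd: connect() makes the kernel assign its address. */
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    /* Ask the kernel which local address it picked for the client. */
    len = sizeof(addr);
    if (getsockname(cfd, (struct sockaddr *)&addr, &len) == 0)
        port = ntohs(addr.sin_port);
out:
    if (cfd >= 0)
        close(cfd);
    if (lfd >= 0)
        close(lfd);
    return port;
}
```

Calling implicit_client_port() should return some nonzero port number chosen by the system; the exact value differs from run to run.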
Then why does the server go to the trouble of binding an address to its listening socket? Take the letter example again. Suppose you are very rich and own three houses (that is, the server has three IP addresses), and you still leave out the sender and the return address. Your friend mails the letter for you, but he is not sure which house you live in, so he fills in one of the three addresses. Unfortunately, you have not lived in that house for a long time, so even if Xiao Ming replies, you may never receive his letter. That is why the server must bind an address to its listening socket!
What is endianness (big-endian and little-endian)?
Computer memory is addressed in bytes (eight bits each). Storing a value that occupies more than one byte therefore raises a question of ordering: does the low-order byte go at the lower memory address, or does the high-order byte? The answer depends on the specific CPU architecture, not on the operating system. First, the difference between the big-endian and little-endian modes:
- Little-endian mode: the low-order byte is stored at the lower memory address and the high-order byte at the higher memory address. (Low byte at the low address, high byte at the high address.)
- Big-endian mode: the high-order byte is stored at the lower memory address and the low-order byte at the higher memory address. (The bytes appear in memory in the same order in which the number is written.)
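That may be hard to picture, so take an example. If the 16-bit integer 0x1234 is stored in a short variable, its layout in memory depends on the mode: in big-endian mode the byte at the lower address is 0x12 and the byte at the higher address is 0x34; in little-endian mode the byte at the lower address is 0x34 and the byte at the higher address is 0x12.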
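The layout of 0x1234 can be checked directly by copying the variable's bytes out of memory. This is a small sketch; the helper names host_is_little_endian and bytes_of_u16 are invented for illustration, while htons is the standard conversion function from <arpa/inet.h>.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Copies the in-memory bytes of a 16-bit value into out[0] (lower
 * address) and out[1] (higher address). */
void bytes_of_u16(uint16_t value, unsigned char out[2])
{
    assert(out != NULL);
    memcpy(out, &value, sizeof(value));
}

/* Returns 1 if this host is little-endian, 0 if it is big-endian. */
int host_is_little_endian(void)
{
    unsigned char bytes[2];

    bytes_of_u16(0x1234, bytes);
    return bytes[0] == 0x34;   /* low-order byte at the lower address */
}
```

On any host, bytes_of_u16(htons(0x1234), b) yields b[0] == 0x12 and b[1] == 0x34, because htons converts to big-endian network byte order regardless of the host's own mode.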
Approximate flow of TCP programming with sockets
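As a rough illustration of the flow, the sketch below runs both roles over the loopback interface inside one process: the server side calls socket, bind, listen and accept; the client side calls socket and connect; the two then exchange data with send and recv and finally call close. The function name tcp_flow_demo is hypothetical, the message is assumed to be short (under 64 bytes), and a real program would put the two sides in separate processes and loop on send and recv.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: sends msg from a client socket to a server socket,
 * has the server echo it back, and stores the echo in reply (capacity
 * cap). Returns the number of bytes in reply, or -1 on error. */
ssize_t tcp_flow_demo(const char *msg, char *reply, size_t cap)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char buf[64];
    ssize_t n, result = -1;
    int lfd = -1, cfd = -1, sfd = -1;

    assert(msg != NULL && reply != NULL);

    /* Server side: socket() -> bind() -> listen() */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        goto out;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);              /* 0 = any free port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    /* Client side: socket() -> connect() */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    /* Server side: accept() returns a new descriptor for this connection. */
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0)
        goto out;

    /* Client sends a request; server receives it and sends it back. */
    if (send(cfd, msg, strlen(msg), 0) < 0)
        goto out;
    n = recv(sfd, buf, sizeof(buf), 0);
    if (n <= 0 || send(sfd, buf, (size_t)n, 0) < 0)
        goto out;

    /* Client receives the reply. */
    result = recv(cfd, reply, cap, 0);
out:
    /* Both sides: close() */
    if (sfd >= 0)
        close(sfd);
    if (cfd >= 0)
        close(cfd);
    if (lfd >= 0)
        close(lfd);
    return result;
}
```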
Points to note
- Sending and receiving large data. "Large" here means data larger than the receive buffer or the send buffer. For such data, you need to call send or recv (or similar functions) more than once to transfer all of it.
- TCP is stream-based. Data sent at very short intervals may be received by the server in a single read, so the receiver must handle message boundaries itself.
- If a TCP connection is kept open for a long time, check whether it is still alive. A common approach is to send heartbeat packets.
- Endianness. When transmitting binary data of two bytes or more over the network, convert it with the htonX functions (htons, htonl) before sending and with the ntohX functions (ntohs, ntohl) after receiving, because the endpoint platforms may differ in endianness. Single-byte character data needs no conversion, however long the string is. The TCP/IP protocols define the byte order used on the network (network byte order) as big-endian.
- Implicit binding. When the system binds an address and port to a socket on your behalf, the port is assigned arbitrarily and a local IP address is bound to the socket. In the Internet domain, if the specified IP address is INADDR_ANY, the socket endpoint is bound to all of the system's network interfaces, so it can receive packets from any network card installed in the system. If connect or listen is called on a socket that has no address bound, the system selects an address and binds it to the socket.
Address structures
IPv4: So that addresses of different formats can be passed to the socket functions, an address is cast to the generic address structure sockaddr. In the IPv4 Internet domain (AF_INET), a socket address is represented by the structure sockaddr_in:
struct in_addr{ in_addr_t s_addr;}; struct sockaddr_in{ sa_family_t sin_family; in_port_t sin_port; struct in_addr sin_addr;};
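Filling in a sockaddr_in typically looks like the sketch below. The helper name make_ipv4_addr is hypothetical; inet_pton and htons are the standard conversion functions.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Builds an AF_INET socket address from dotted-decimal text and a port
 * given in host byte order. Returns 0 on success, -1 if ip is not a
 * valid IPv4 address. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    assert(ip != NULL && out != NULL);

    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);       /* port goes in network byte order */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;                     /* text was not a valid address */
    return 0;
}
```

The result is passed to functions such as bind or connect by casting its address to (struct sockaddr *) and giving sizeof(struct sockaddr_in) as the length.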
IPv6: In the IPv6 Internet domain (AF_INET6), a socket address is represented by the structure sockaddr_in6:
struct in6_addr{ uint8_t s6_addr[16];}; struct sockaddr_in6{ sa_family_t sin6_family; in_port_t sin6_port; uint32_t sin6_flowinfo; struct in6_addr sin6_addr; uint32_t sin6_scope_id;};
Address Query
Host information is obtained by calling gethostent. It returns a pointer to a hostent structure, which may point into a static data buffer that is overwritten on every call to gethostent. The addresses returned are in network byte order. Related functions:
- gethostent
- sethostent
- endhostent
Related structure:
struct hostent{ char *h_name; char **h_aliases; int h_addrtype; int h_length; char **h_addr_list; ...};
Protocol names and protocol numbers are mapped to each other with the following functions.
Related functions:
- getprotobyname
- getprotobynumber
- getprotoent
- setprotoent
- endprotoent
Related structure:
struct protoent{ char *p_name; char **p_aliases; int p_proto;};
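A lookup with getprotobyname might look like the following sketch. The wrapper name protocol_number is made up for illustration, and the result depends on the system's protocols database, so the function reports -1 when the name is not found.

```c
#include <assert.h>
#include <netdb.h>
#include <stddef.h>

/* Returns the protocol number for a protocol name such as "tcp",
 * or -1 if the system's protocols database has no such entry. */
int protocol_number(const char *name)
{
    struct protoent *p;

    assert(name != NULL);
    p = getprotobyname(name);   /* may return a pointer to static data */
    return p != NULL ? p->p_proto : -1;
}
```

On a typical system, protocol_number("tcp") is expected to give 6, the same value as IPPROTO_TCP.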
Service Query
A service is represented by the port number portion of an address. Each service is offered on a unique, well-known port number. Use getservbyname to map a service name to a port number, getservbyport to map a port number to a service name, or getservent to scan the services database sequentially. Related functions:
- getservbyname
- getservbyport
- getservent
- setservent
- endservent
Related structure:
struct servent{ char *s_name; char **s_aliases; int s_port; char *s_proto;};
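The sketch below maps a service name to its port with getservbyname. The wrapper name service_port is hypothetical. Note that s_port is stored in network byte order, so it is converted with ntohs before use; the lookup depends on the system's services database and returns -1 here when no entry exists.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the port (host byte order) of a service such as "http" for
 * the given protocol ("tcp" or "udp"), or -1 if it is not listed. */
int service_port(const char *service, const char *proto)
{
    struct servent *s;

    assert(service != NULL);
    s = getservbyname(service, proto);
    if (s == NULL)
        return -1;
    return ntohs((uint16_t)s->s_port);  /* s_port is in network byte order */
}
```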
Common socket APIs
- Create a socket: socket()
A SOCK_RAW socket provides a datagram interface directly to the underlying network layer (IP in the Internet domain). An application using this interface is responsible for building its own protocol headers, because the transport protocols (TCP, UDP, and so on) are bypassed. Creating a raw socket requires superuser privileges, which prevents malicious programs from bypassing the built-in security mechanisms to craft packets.
- Bind a socket to an address and port: bind()
Restrictions on bind:
- The address must be valid on the machine where the process is running; it cannot be an address that belongs to another machine.
- The address must match the format supported by the address family that was used to create the socket.
- The port number must not be less than 1024 unless the process has the appropriate privileges.
- Usually only one socket endpoint can be bound to a given address, although some protocols allow duplicate bindings.
For example, if the server's socket is bound to the address 123.255.1.3, clients can only request connections through the address 123.255.1.3.
- Listen on a socket: listen()
- Connect a socket: connect()
If the socket descriptor is in non-blocking mode, connect returns -1 when the connection cannot be established immediately and sets errno to the special error code EINPROGRESS. The application can use select or poll to determine when the file descriptor becomes writable; at that point the connection attempt has completed. connect can also be used with connectionless network services (SOCK_DGRAM). If you call connect on a SOCK_DGRAM socket, the destination address of every message sent is the one set in the connect call, so no address needs to be supplied on each transmission, and only messages from that address are received.
- Accept a connection request: accept()
The file descriptor returned by accept is a socket descriptor connected to the client that called connect. This new socket descriptor has the same socket type and address family as the original socket (sockfd). The original socket passed to accept is not associated with the connection; it remains available to accept further connection requests. If no connection request is pending, accept blocks until one arrives. If sockfd is in non-blocking mode, accept returns -1 and sets errno to EAGAIN or EWOULDBLOCK.
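The non-blocking behavior of accept can be sketched like this. The helper names set_nonblocking and try_accept, and the -2 return value used to signal "no connection pending", are conventions invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>

/* Switches fd to non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Tries once to accept a connection on a non-blocking listening socket.
 * Returns the new connected descriptor, -2 if no connection request is
 * pending, or -1 on any other error. */
int try_accept(int listen_fd)
{
    int fd;

    assert(listen_fd >= 0);
    fd = accept(listen_fd, NULL, NULL);
    if (fd >= 0)
        return fd;                     /* listen_fd stays usable */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return -2;                     /* nothing to accept right now */
    return -1;
}
```

A server loop would normally wait with select or poll until the listening descriptor is readable and only then call try_accept.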
- Data transfer:
- send() / recv()
- write() / read()
- sendto() / recvfrom()
- Close a socket: close()
- Resolve host names and addresses: gethostbyname() / gethostbyaddr()
- getpeername() gets the address of the peer connected to the socket
- getsockname() gets the address bound to the socket
- Query/set socket options: getsockopt() / setsockopt()
Information
- Wikipedia: http://zh.wikipedia.org/wiki/Socket
- Linux Common C functions: http://net.pku.edu.cn/~yhf/linux_c/
Resources
- Linux Socket Programming: http://www.cnblogs.com/skynet/archive/2010/12/12/1903949.html
- Linux socket programming, TCP and UDP: http://www.cnblogs.com/magicbox/archive/2012/02/15/2363875.html
- Understanding endianness (big-endian and little-endian): http://blog.csdn.net/luckyabcd/article/details/4341873
- Small problems in socket programming: http://blog.csdn.net/piaojun_pj/article/details/6098438