Three main parameters distinguish one network communication or connection from another across different application processes: the destination IP address, the transport-layer protocol in use (TCP or UDP), and the port number.
The word "socket" originally means a plug socket. By combining these three parameters and binding them to a socket, the application layer can talk to the transport layer through the socket interface, tell apart the traffic that belongs to different application processes or network connections, and so serve data transfers concurrently.
The port number of the socket generated by accept()
To write a network program you must use sockets; every programmer knows this. In interviews we also ask candidates whether they can do socket programming. Most people answer that basic socket programming amounts to listen, accept, send, write, and a few other basic operations. True enough: as with ordinary file operations, anyone who has written such code knows them.
When we talk about network programming we almost always mean TCP/IP, as if no other network protocols existed any more. Within TCP/IP we know TCP and UDP: the former guarantees that data arrives correctly and reliably, while the latter tolerates data loss. Finally, we know that each side must know the other's IP address and port number before a connection can be established. Ordinary programmers rarely know much beyond this, and often this much is enough. At most, when writing a server program, they use multithreading to handle concurrent access.
We also know the following facts:
1. A given port number cannot be shared by multiple applications. For example, if IIS occupies port 80, Apache cannot use port 80.
2. Many firewalls let through only packets addressed to specific destination ports.
3. When a server program listens on a port and accepts a connection request, a new socket is created to handle that request.
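The first fact can be checked with a minimal Python sketch. The function name try_second_bind is hypothetical, and the sketch assumes default socket options on the loopback interface; options such as SO_REUSEPORT, which some systems offer, deliberately relax this rule and are not used here.

```python
import errno
import socket

def try_second_bind():
    """Bind a listening socket, then try to bind a second socket to the same port."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
            return None                    # unexpected: the second bind succeeded
        except OSError as exc:
            return exc.errno               # normally "address already in use"
    finally:
        second.close()
        first.close()

if __name__ == "__main__":
    print(try_second_bind() == errno.EADDRINUSE)
```

The second bind is refused while the first socket holds the port, which is the behaviour behind the IIS and Apache example.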
So a question arose that puzzled me for a long time: if a socket is created and bound to port 80, does that mean the socket occupies port 80?
If so, when it accepts a request, which port does the new socket get? (I always assumed the system assigned it a free port number by default.)
If it is a free port, it certainly is not port 80, so later TCP packets would not carry destination port 80, and the firewall would surely block them.
In practice, firewalls do not block such connections, and this is the most common way of requesting and handling connections. What I did not understand was why the firewall lets such a connection through. How does it determine that the connection was created by connecting to port 80? Is there a special flag in the TCP packet, or does the firewall remember something?
Later I studied the principles of the TCP/IP protocol stack carefully and came to understand many concepts more deeply. For example, TCP and UDP both belong to the transport layer and sit side by side on top of the IP layer (the network layer). The IP layer is mainly responsible for delivering packets between nodes (host to host), where a node is a network device such as a computer. Because the IP layer only delivers data to a node and cannot distinguish between the applications running on it, TCP and UDP add port information on top of it; a port identifies an application on a node. Apart from adding port information, UDP does almost no further processing of the IP-layer data. TCP adds more elaborate transmission control, such as a sliding window for sending data and an acknowledgement and retransmission mechanism, to achieve reliable data transfer. However stable the TCP data stream looks to the application layer, what travels underneath is a series of IP packets, which TCP has to reassemble.
So I have reason to believe that, apart from IP addresses and port numbers, a TCP packet gives the firewall little else to base its decision on. We also see that ports exist to distinguish applications, so that arriving IP packets can be forwarded to the right one.
TCP/IP is only a protocol stack. Like the inner workings of an operating system, it has to be implemented concretely, and it has to expose an interface for others to use. Just as an operating system provides standard programming interfaces such as Win32, TCP/IP must provide a programming interface, and that interface is the socket programming interface.
The designers of the socket programming interface introduced one very important concept: the socket. A socket is much like a file handle; in BSD systems it is in fact stored in the same per-process handle table as file handles. A socket is really just an ordinal number giving its position in that table. We have seen many things like this, such as file handles and window handles. These handles stand for particular objects in the system and are passed as parameters to various functions in order to operate on a specific object. This is really a matter of the C language; in C++ such a handle would in effect be an object pointer.
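The handle idea can be observed from Python with a small sketch. It assumes a POSIX-style system, where sockets and files share one descriptor table; on Windows the socket handle comes from a different table. The function name handle_numbers is hypothetical.

```python
import os
import socket

def handle_numbers():
    """Return the descriptor numbers of one open file and one socket."""
    file_fd = os.open(os.devnull, os.O_RDONLY)    # an ordinary file descriptor
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Both values are plain non-negative integers identifying kernel objects.
        return file_fd, sock.fileno()
    finally:
        sock.close()
        os.close(file_fd)

if __name__ == "__main__":
    file_fd, sock_fd = handle_numbers()
    print("file descriptor:", file_fd, "socket descriptor:", sock_fd)
```

Both numbers are just indices; the object each one refers to lives inside the operating system.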
Now we know that there is no necessary connection between sockets and TCP/IP. The socket programming interface was designed in the hope that it could accommodate other network protocols as well. Sockets therefore only make the TCP/IP protocol stack more convenient to use: they abstract TCP/IP into a few basic function interfaces, such as create, listen, accept, connect, read, and write.
Now we understand that when a program creates a socket and has it listen on port 80, it is really declaring to the TCP/IP protocol stack that it owns port 80. From then on, every TCP packet whose destination port is 80 is delivered to that program (because the program uses the socket programming interface, the packet is handled by the socket layer first). The accept function is really an abstraction of the TCP connection establishment process. The new socket returned by accept refers to the connection just created, and a connection is identified by two pieces of information: the source address and source port, and the destination address and destination port. Accept can therefore return any number of new sockets whose local port is 80 in every case. The firewall's rules for handling IP packets are then clear-cut as well, and none of the complicated scenarios imagined earlier arise.
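A short Python sketch can confirm what accept returns. It uses the loopback address and a port picked by the operating system instead of port 80, so it runs without special privileges; the function name accepted_socket_ports is hypothetical.

```python
import socket

def accepted_socket_ports():
    """Accept one loopback connection and report the ports on each socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))            # port 0: OS picks a free port
    listener.listen(1)
    listen_port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", listen_port))
    conn, _ = listener.accept()                # new socket for this connection
    try:
        return {
            "listen_port": listen_port,
            "accepted_local_port": conn.getsockname()[1],
            "accepted_remote_port": conn.getpeername()[1],
            "client_local_port": client.getsockname()[1],
        }
    finally:
        conn.close()
        client.close()
        listener.close()

if __name__ == "__main__":
    for name, port in accepted_socket_ports().items():
        print(name, port)
```

The accepted socket reports the same local port as the listening socket, and its remote port is the port the client connected from.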
It is important to understand that a socket is an abstraction of how the TCP/IP protocol stack is operated, not a simple one-to-one mapping onto it.
Yesterday I chatted with friends about network programming and sockets; here are some of my personal understandings:
The sockets a program can create fall into two types: ordinary sockets and raw sockets.
One: An ordinary socket is the programming interface (an API) through which a program operates on the transport layer of the TCP/IP protocol stack.
There is the connection-oriented stream socket (SOCK_STREAM), used by applications in TCP mode;
There is the connectionless datagram socket (SOCK_DGRAM), used by applications in UDP mode.
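To illustrate the connectionless case, here is a minimal Python sketch in which a SOCK_DGRAM socket sends a datagram without any listen, accept, or connect step. The function name udp_round_trip and the payload are hypothetical, and the two-second timeout is only a safety net for the demonstration.

```python
import socket

def udp_round_trip(payload=b"ping"):
    """Send one datagram over loopback without establishing a connection."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))        # port 0: OS picks a free port
        receiver.settimeout(2.0)               # do not block forever if lost
        sender.sendto(payload, receiver.getsockname())
        data, (_, sender_port) = receiver.recvfrom(1024)
        # The datagram carries the sender's port even though no connection exists.
        return data, sender_port == sender.getsockname()[1]
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    print(udp_round_trip())
```

A SOCK_STREAM socket, by contrast, has to complete connect on one side and accept on the other before any data can flow.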
For ordinary sockets there was one point I was vague about. In a multithreaded server that listens on a port (say 8080), every accepted client connection produces a new socket. So what port do these newly created sockets use? The program certainly does not specify one, so there are two possibilities: (1) a random port is generated, or (2) the port is still 8080. The first is implausible, because firewalls would very likely block packets on such random ports. So the second must hold: the server-side port is still 8080. But this overturned my earlier understanding that "once a port is occupied by a program, other programs cannot use it". The most likely explanation, I think, is a difference in scope: two programs cannot use the same port, but different sockets within one program can.
So, for packets sent by clients to different threads (that is, different socket connections) on the same server port (8080) to be told apart and reassembled, there must be some feature that marks packets as belonging to different connections. That feature is the source port in the transport-layer header, that is, the client-side port of the socket connection, combined with the client's IP address. To sum up, in this case each socket sees a different source (client) port in the transport-layer header, while the destination (server) port is the same.
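The conclusion can be sketched in Python with several clients connecting to one listening port. The sketch runs in a single thread on the loopback address with a port chosen by the operating system, not 8080, and the function name connection_port_pairs is hypothetical.

```python
import socket

def connection_port_pairs(n_clients=3):
    """Return (server_port, client_port) for each accepted loopback connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))            # port 0: OS picks a free port
    listener.listen(n_clients)
    port = listener.getsockname()[1]
    clients, accepted = [], []
    try:
        for _ in range(n_clients):
            clients.append(socket.create_connection(("127.0.0.1", port)))
            conn, _ = listener.accept()
            accepted.append(conn)
        # Local port is the server side; peer port is the client's source port.
        return [(c.getsockname()[1], c.getpeername()[1]) for c in accepted]
    finally:
        for s in accepted + clients + [listener]:
            s.close()

if __name__ == "__main__":
    for server_port, client_port in connection_port_pairs():
        print("server port", server_port, "client port", client_port)
```

Every accepted socket shares the server port, and the connections differ only in the client-side port, since all clients here share one IP address.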
Two: A raw socket is built on the network layer, so with it we can build our own protocol at the transport layer.
If you write a sniffer, the packets it captures are the ordinary-socket packets (TCP or UDP) on the same network segment, so the program has to define the data structures itself (the IP header and the TCP or UDP header) and map the captured data onto them.
If you write both the client and the server with raw sockets, you control the protocol yourself; as some network applications do (MSN, Skype, and so on), you can rewrite the protocol on top of the network layer.
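For the sniffer case mentioned above, the following Python sketch shows what defining the header structure yourself can look like. It only parses the fixed 20-byte IPv4 header from a byte string; opening a real raw socket usually needs administrator privileges and is left out. The function name parse_ipv4_header and the sample addresses are hypothetical.

```python
import socket
import struct

def parse_ipv4_header(raw):
    """Decode the fixed 20-byte part of an IPv4 header from raw bytes."""
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": version_ihl >> 4,                 # high 4 bits
        "header_length": (version_ihl & 0x0F) * 4,   # low 4 bits, in 32-bit words
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,                        # 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

if __name__ == "__main__":
    # A hand-built sample header; the checksum is left at 0 for simplicity.
    sample = struct.pack(
        "!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, socket.IPPROTO_TCP, 0,
        socket.inet_aton("192.168.0.1"), socket.inet_aton("192.168.0.2"),
    )
    print(parse_ipv4_header(sample))
```

The TCP or UDP header that follows the IP header can be decoded the same way, starting at the offset given by header_length.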