"VS development" socket programming principle

Source: Internet
Author: User
Tags data structures error code ftp reserved semaphore socket file transfer protocol port number

Socket Programming Principle

1. Introduction to the Problem

1) The ordinary I/O process:

The I/O commands of UNIX systems evolved from those of Multics and earlier systems and follow an open-read/write-close pattern. When a user process performs I/O, it first calls "open" to obtain the right to use the specified file or device; the call returns an integer, called the file descriptor, that identifies the open file or device in the process's subsequent I/O operations. The user process then calls "read/write" as many times as needed to transfer data. When all transfers are complete, the user process calls "close" to notify the operating system that it has finished using the object.
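The open-read/write-close pattern can be sketched in C as follows. This is a minimal illustration that assumes a POSIX system; the function name roundtrip_file and the scratch file path passed to it are hypothetical and do not come from the original text.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg to the file at path, read it back into out, and return the
 * number of bytes read, or -1 on failure. The file is removed afterwards. */
ssize_t roundtrip_file(const char *path, const char *msg,
                       char *out, size_t out_size)
{
    size_t len = strlen(msg);
    ssize_t n = -1;

    /* "open": obtain a file descriptor for the object */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* "write", then rewind and "read" through the same descriptor */
    if (write(fd, msg, len) == (ssize_t)len
        && lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, out, out_size);

    /* "close": tell the operating system we are done with the object */
    close(fd);
    unlink(path);
    return n;
}
```

Every call after open() refers to the file only through the integer descriptor, which is the same role a socket descriptor plays later in this chapter.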

2) The TCP/IP protocol is integrated into the UNIX kernel

When TCP/IP was integrated into the UNIX kernel, it effectively introduced a new type of I/O operation to UNIX systems. The interaction of a UNIX user process with network protocols is much more complex than its interaction with traditional I/O devices. First, the two processes involved in a network operation run on different machines, so how is a connection between them established? Second, there are many network protocols, so how can a general mechanism support all of them? These are the problems that a network application programming interface has to solve.

3) A general network programming interface is required: one that is independent of specific protocols

UNIX systems have two kinds of network application programming interface: BSD UNIX sockets (socket) and UNIX System V TLI. Because BSD UNIX, which supported TCP/IP, was widely adopted, TCP/IP applications developed rapidly, and its network application programming interface, the socket, came into wide use in network software and was later introduced to the DOS and Windows operating systems. The socket is a powerful tool for developing network applications, and this chapter discusses it in detail.

2. Socket Programming Basic Concepts

Before you begin programming with sockets, you must first understand the following concepts.

2.1 Inter-network process communication

The concept of process communication originated on stand-alone systems. Because each process runs within its own address space, the operating system provides facilities for process communication so that two communicating processes can cooperate without interfering with each other.

BSD UNIX provides pipes (pipe), named pipes (named pipe), and software interrupt signals (signal).

UNIX System V provides messages (message), shared memory (shared memory), and semaphores (semaphore).

These are all limited to communication between processes on the local machine. Inter-network process communication addresses communication between processes on different hosts (communication between processes on the same machine can be regarded as a special case). The first problem to solve is identifying processes across the network. On a single host, a process is uniquely identified by its process ID. In a network environment, however, the process IDs that each host assigns independently cannot uniquely identify a process. For example, host A may assign process ID 5 to a process while a process 5 also exists on host B, so the phrase "process 5" alone is meaningless. Second, the operating system supports many network protocols; different protocols work in different ways and use different address formats. Inter-network process communication therefore also has to solve the problem of identifying multiple protocols.

To solve these problems, TCP/IP introduces the following concepts.

1) Port

A port is a communication endpoint that can be named and addressed in the network; it is a resource that the operating system can allocate.

As described in the OSI seven-layer model, the most significant difference between the transport layer and the network layer is that the transport layer provides process communication. In this sense, the final address in network communication is not just the host address but also includes some identifier that designates the process. For this purpose, TCP/IP introduces the concept of the protocol port (protocol port, or simply port) to identify the communicating process.

A port is an abstract software structure (comprising some data structures and I/O buffers). After an application (that is, a process) binds to a port through a system call, data that the transport layer delivers to the port is received by that process, and data that the process sends to the transport layer goes out through that port. In TCP/IP implementations, port operations resemble ordinary I/O operations: acquiring a port is like acquiring a locally unique I/O file that can be accessed with the ordinary read and write primitives. Like a file descriptor, each port has an integer identifier, called the port number, that distinguishes it from other ports.

Because the two TCP/IP transport-layer protocols, TCP and UDP, are completely independent software modules, their port numbers are also independent of each other. For example, TCP can have a port 255 and UDP can also have a port 255 without conflict.
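The independence of the TCP and UDP port spaces can be checked with a small sketch. It assumes a POSIX system with BSD sockets rather than Winsock; the helper name bind_loopback is hypothetical. Passing port 0 asks the operating system to choose a free port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a socket of the given type (SOCK_STREAM or SOCK_DGRAM) to
 * 127.0.0.1:port. Port 0 lets the OS pick a free port. Returns the
 * descriptor, or -1 on failure; *bound_port receives the port in use. */
int bind_loopback(int type, unsigned short port, unsigned short *bound_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, type, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0
        || getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *bound_port = ntohs(addr.sin_port);
    return fd;
}
```

Calling bind_loopback(SOCK_STREAM, 0, &p) and then bind_loopback(SOCK_DGRAM, p, &q) normally succeeds with q equal to p: the TCP binding does not occupy the UDP port of the same number. The second call can still fail if some other program already holds that UDP port.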

Allocating port numbers is an important issue. There are two basic methods. The first is global allocation, a centralized approach in which a recognized central authority assigns ports according to user needs and publishes the results. The second is local allocation, also called dynamic binding: when a process needs transport-layer service, it asks the local operating system, which returns a locally unique port number, and the process then binds itself to that port number through the appropriate system call. TCP/IP port allocation combines the two methods. It divides the port numbers into two parts: a small number are reserved ports, assigned globally to server processes, so each standard server has a globally recognized port (a well-known port) whose number is the same on every machine. The remaining ports are free ports and are allocated locally. Both TCP and UDP specify that port numbers below 256 are reserved ports; current IANA assignments extend the well-known range to ports 0 through 1023.

2) Address

The two communicating processes are on different machines. In an internetwork, the two machines may be on different networks joined by interconnection devices (gateways, bridges, routers, and so on). Three levels of addressing are therefore required:

1. A host may be connected to several networks, so a specific network address must be specified;

2. Each host on a network must have a unique address;

3. Each process on each host should have a unique identifier on that host.

Typically, a host address consists of a network ID and a host ID and is represented in TCP/IP (IPv4) by a 32-bit integer; TCP and UDP use a 16-bit port number to identify the user process.
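In the BSD socket interface, the 32-bit host address and the 16-bit port number are carried together in a struct sockaddr_in. The sketch below assumes a POSIX system; the helper name make_ipv4_addr and the sample address are hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill *addr from a dotted-quad IPv4 string and a host-order port.
 * Returns 0 on success, -1 if ip is not a valid IPv4 address. */
int make_ipv4_addr(const char *ip, unsigned short port,
                   struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);                      /* 16-bit port */
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1)  /* 32-bit address */
        return -1;
    return 0;
}
```

For example, make_ipv4_addr("192.168.1.10", 21, &a) stores the address as the 32-bit value 0xC0A8010A and the port as 21, both in network byte order.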

3) Network byte order

Different computers store multibyte values in different byte orders: some store the low-order byte at the starting address (little-endian), others the high-order byte (big-endian). To ensure data is interpreted correctly, a network protocol must specify a network byte order. TCP/IP transmits 16-bit and 32-bit integers high-order byte first (big-endian); both kinds of integer appear in protocol headers. For details, see http://blog.csdn.net/hguisu/article/details/7449955#t1
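The standard conversion functions htons() and htonl() produce network byte order regardless of the host's own order. The sketch below assumes a POSIX system; the helper names encode_u16 and encode_u32 are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Copy the network-order encoding of a 16-bit value into out[0..1]. */
void encode_u16(uint16_t host_value, unsigned char out[2])
{
    uint16_t net = htons(host_value);
    memcpy(out, &net, sizeof net);
}

/* Copy the network-order encoding of a 32-bit value into out[0..3]. */
void encode_u32(uint32_t host_value, unsigned char out[4])
{
    uint32_t net = htonl(host_value);
    memcpy(out, &net, sizeof net);
}
```

On any host, encode_u16(0x1234, b) leaves b[0] == 0x12 and b[1] == 0x34, that is, the high-order byte first. The matching functions ntohs() and ntohl() convert back to host order on receipt.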

4) Connection

A communication link between two processes is called a connection. Internally, a connection takes the form of buffers and a set of protocol mechanisms; externally, it offers higher reliability than connectionless communication.

5) Half-association

To summarize, a triple can globally and uniquely identify a process on the network:

(protocol, local address, local port number)

Such a triple is called a half-association (half-association); it specifies one half of a connection.

6) Full association

A complete inter-network process communication involves two processes, and both must use the same high-level protocol. It is not possible for one end to use TCP while the other uses UDP. A complete inter-network communication is therefore identified by a five-tuple:

(protocol, local address, local port number, remote address, remote port number)

Such a five-tuple is called an association (association): two half-associations that use the same protocol combine into a valid association, which fully specifies a connection.
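The five-tuple of a live TCP connection can be read back with getsockname() and getpeername(). The following sketch assumes a POSIX system; struct association and both function names are hypothetical, introduced only for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical record for the five-tuple described above. */
struct association {
    int protocol;                 /* IPPROTO_TCP in this sketch */
    struct sockaddr_in local;     /* local address and local port */
    struct sockaddr_in remote;    /* remote address and remote port */
};

/* Fill *a from a connected TCP socket. Returns 0 on success, -1 on error. */
int get_association(int fd, struct association *a)
{
    socklen_t len = sizeof a->local;
    a->protocol = IPPROTO_TCP;
    if (getsockname(fd, (struct sockaddr *)&a->local, &len) < 0)
        return -1;
    len = sizeof a->remote;
    if (getpeername(fd, (struct sockaddr *)&a->remote, &len) < 0)
        return -1;
    return 0;
}

/* Create a connected TCP pair over loopback. Returns 0 on success. */
int loopback_pair(int *client_fd, int *server_fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);
    int sfd = -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* port 0: OS chooses */

    if (lfd >= 0 && cfd >= 0
        && bind(lfd, (struct sockaddr *)&addr, sizeof addr) == 0
        && listen(lfd, 1) == 0
        && getsockname(lfd, (struct sockaddr *)&addr, &len) == 0
        && connect(cfd, (struct sockaddr *)&addr, sizeof addr) == 0)
        sfd = accept(lfd, NULL, NULL);

    if (lfd >= 0)
        close(lfd);
    if (sfd < 0) {
        if (cfd >= 0)
            close(cfd);
        return -1;
    }
    *client_fd = cfd;
    *server_fd = sfd;
    return 0;
}
```

Calling get_association() on both descriptors returned by loopback_pair() shows the two half-associations mirrored: the client's local address and port are the server's remote address and port, and vice versa.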

2.2 Service Mode

In a layered network architecture, each layer depends strictly on the one below it, and the division of labor and cooperation between layers is embodied in the interfaces between them. A service is an abstraction that describes the relationship between adjacent layers: the set of operations that a layer provides to the layer above it. The lower layer is the service provider, and the upper layer is the user that requests the service. A service is expressed as primitives (primitive), such as system calls or library functions. A system call is a service primitive that the operating system kernel provides to network applications or higher-level protocols. Layer n always provides layer n+1 with a more complete service than layer n-1 does; otherwise layer n would have no value. In OSI terminology, the network layer and the layers below it are also called the communication subnet; they provide only point-to-point communication and have no notion of programs or processes. The transport layer provides "end-to-end" communication and introduces the concept of inter-network process communication. It also has to deal with error control, flow control, message sequencing, connection management, and so on, and it offers different service modes for this purpose:

1) Connection-oriented (virtual circuit) or connectionless

Connection-oriented service (TCP): modeled on the telephone system. Every complete data transfer goes through three phases: establishing the connection, using the connection, and terminating the connection. In this model, individual packets do not carry the full destination address during the transfer; a connection identifier (connect ID) is used instead. In essence, a connection is a pipe: data is received in the same order, and with the same content, as it was sent. TCP provides connection-oriented virtual circuits.

Connectionless service (UDP): modeled on the postal system. Every packet carries the complete destination address and travels through the system independently. A connectionless service does not guarantee packet ordering, does not recover from or retransmit packets in error, and does not guarantee reliable delivery. UDP provides a connectionless datagram service.

Here are the types of these two services and examples of their applications:

2) Order

In network transmission, two consecutive messages may travel by different paths, so they may arrive at the destination in an order different from the order in which they were sent. Ordering means that data is received in the same order in which it was sent. TCP provides this service.

3) Error control

A mechanism that ensures the data an application receives is free of errors. Errors are generally detected with a checksum (Checksum), and error-free delivery is ensured with acknowledgments. TCP provides this service.

4) Flow control

A mechanism that controls the data rate during transmission so that data is not lost. TCP provides this service.
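The checksum mentioned under error control in item 3) can be sketched as the 16-bit one's-complement sum described in RFC 1071. This is a simplified illustration: the real TCP and UDP checksums also cover a pseudo-header, which is omitted here, and the function name inet_checksum is hypothetical. The 32-bit accumulator is adequate for the short buffers used in an example like this.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement checksum over len bytes (RFC 1071 style). */
uint16_t inet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {                  /* add 16-bit big-endian words */
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)                      /* pad a trailing odd byte with zero */
        sum += (uint32_t)data[0] << 8;

    while (sum >> 16)                  /* fold carries back into 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

The sender stores the result in the header. The receiver runs the same sum over the data plus the stored checksum; a result of zero means no error was detected, and any other value means the data was damaged in transit.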

5) Byte stream

A byte stream treats the data in transit purely as a sequence of bytes and provides no boundaries within the data stream. TCP provides a byte-stream service.

6) Message

The receiver preserves the sender's message boundaries. UDP provides a message service.

7) Full duplex/half duplex

Data between the two ends is transmitted in both directions at once (full duplex) or in one direction at a time (half duplex).

8) Buffering/out-of-band data

In a byte-stream service there are no message boundaries, so a user process can read or write any number of bytes at a time. Buffering is needed when the protocol guarantees correct delivery or uses flow control. For some special needs, however, such as interactive applications, this buffering must be turned off. During data transfer, an application may also want certain information delivered to the user for prompt handling outside the regular data stream, such as the UNIX interrupt key (Delete or Control-c) or terminal flow-control characters (Control-s and Control-q); this is called out-of-band data. Logically, it is as if the user process used a separate channel to carry this data, with one such channel associated with each pair of connected streams. Because the implementation of out-of-band data in the Berkeley Software Distribution is inconsistent with the Host Requirements specified in RFC 1122, application writers should avoid out-of-band data, to minimize interoperability problems, unless it is needed to interoperate with existing services.
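The difference between the byte-stream service of item 5) and the message service of item 6) can be observed with a local socket pair. This sketch assumes a POSIX system and uses AF_UNIX socketpair() as a stand-in for TCP and UDP, because it needs no addresses or ports; the function name first_recv_size is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send "ab" and then "cd" on a socket pair of the given type
 * (SOCK_STREAM or SOCK_DGRAM) and return how many bytes the first
 * recv() delivers when asked for up to 16 bytes, or -1 on error. */
ssize_t first_recv_size(int type)
{
    int sv[2];
    char buf[16];
    ssize_t n = -1;

    if (socketpair(AF_UNIX, type, 0, sv) < 0)
        return -1;

    if (send(sv[0], "ab", 2, 0) == 2 && send(sv[0], "cd", 2, 0) == 2)
        n = recv(sv[1], buf, sizeof buf, 0);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With SOCK_DGRAM the first recv() returns exactly 2 bytes, because each send() is delivered as its own message. With SOCK_STREAM the boundary between the two sends is not preserved, so the first recv() may return both sends merged together (typically all 4 bytes). Both descriptors of the pair can send and receive, which corresponds to the full-duplex case of item 7).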
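One common way an interactive application reduces send-side buffering on a TCP socket is the TCP_NODELAY option, which disables Nagle's algorithm. Treating this as the kind of "turning off buffering" the text has in mind is an assumption, not something the original states. The sketch assumes a POSIX system; the function name set_nodelay is hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable (on = 1) or re-enable (on = 0) Nagle's algorithm on a TCP
 * socket. Returns the setting read back via getsockopt (1 or 0), or -1. */
int set_nodelay(int fd, int on)
{
    int value = on ? 1 : 0;
    socklen_t len = sizeof value;

    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, sizeof value) < 0)
        return -1;
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) < 0)
        return -1;
    return value != 0;
}
```

With the option set, small writes are sent without waiting to be combined with later data, which favors latency over throughput.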

2.3 Client/server mode

In TCP/IP network applications, the main mode of interaction between two communicating processes is the client/server model (Client/Server model): the client sends a service request to the server, and the server receives the request and provides the corresponding service. The model rests on two observations. First, networks exist because hardware and software resources, computing power, and information are unevenly distributed and need to be shared; this produces an asymmetric relationship in which hosts with many resources provide services and clients with fewer resources request them. Second, inter-network process communication is completely asynchronous: the communicating processes have no parent-child relationship and share no memory buffers, so a mechanism is needed to establish contact between processes that wish to communicate and to synchronize their data exchange. This is the role of client/server interaction in TCP/IP. In operation, the client/server model works by active request:

Server side:

The server must be started first and provides the corresponding service on request:

1. Open a communication channel and inform the local host that it is willing to receive client requests on a well-known address (a well-known port, such as 21 for FTP);

2. Wait for a client request to arrive at the port;

3. On receiving an iterative service request, process the request and send a reply. On receiving a concurrent service request, spawn a new process to handle the client request (for example with fork and exec on UNIX systems). The new process handles this one client request and does not need to respond to other requests. When the service is complete, the new process closes its communication link with the client and terminates.

4. Return to step 2 and wait for the next client request.

5. Close the server.

Client side:

1. Open a communication channel and connect to a specific port on the host where the server resides;

2. Send a service request message to the server, wait and receive an answer, continue to make a request ...

3. Close the communication channel and terminate after the request has ended.

The process described above shows that:

1. The roles of the client and server processes are asymmetric, so their code differs.

2. The server process is generally started before any client request arrives. As long as the system is running, the server process persists until it terminates normally or is forced to terminate.
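The server and client steps listed above can be sketched as a small concurrent echo service. This assumes a POSIX system with BSD sockets and fork(), not Winsock; the function names open_listener, serve_clients, and send_request are hypothetical, and the echo behavior is only a placeholder for a real service.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Server step 1: open a listening TCP socket on 127.0.0.1. Port 0 lets
 * the OS choose; the chosen port is stored in *port. Returns fd or -1. */
int open_listener(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0
        || listen(fd, 5) < 0
        || getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Server steps 2 to 4: accept `count` clients; for each one, fork a
 * child that echoes a single message back and then terminates. */
int serve_clients(int listen_fd, int count)
{
    for (int i = 0; i < count; i++) {
        int conn = accept(listen_fd, NULL, NULL);   /* step 2: wait */
        if (conn < 0)
            return -1;

        pid_t pid = fork();                         /* step 3: new process */
        if (pid == 0) {
            char buf[256];
            ssize_t n = read(conn, buf, sizeof buf);
            int ok = n > 0 && write(conn, buf, (size_t)n) == n;
            close(listen_fd);
            close(conn);                            /* close link, terminate */
            _exit(ok ? 0 : 1);
        }
        close(conn);                                /* parent: back to step 2 */
        if (pid < 0)
            return -1;
    }
    while (wait(NULL) > 0)                          /* reap finished children */
        ;
    return 0;
}

/* Client steps 1 to 3: connect, send msg, read the reply, close.
 * Returns the number of reply bytes, or -1 on error. */
ssize_t send_request(unsigned short port, const char *msg,
                     char *reply, size_t size)
{
    struct sockaddr_in addr;
    size_t len = strlen(msg);
    ssize_t n = -1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0
        && write(fd, msg, len) == (ssize_t)len)
        n = read(fd, reply, size);

    close(fd);
    return n;
}
```

A server process would call open_listener() and then serve_clients(); a separate client process would call send_request() with the server's port. The asymmetry noted in point 1 is visible in the code: only the server binds, listens, and accepts, and only the client connects.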

2.4 Socket Types

TCP/IP sockets come in the following three types.

Stream sockets (SOCK_STREAM):

Provide a connection-oriented, reliable data transfer service: data is delivered without errors or duplication and is received in the order sent. Flow control is built in, and the data is treated as a byte stream with no length limit. File Transfer Protocol (FTP) uses stream sockets.

Datagram sockets (SOCK_DGRAM):

Provide a connectionless service (UDP). Datagrams are sent as independent packets with no guarantee of error-free delivery; data may be lost or duplicated and may arrive out of order. The Network File System (NFS) uses datagram sockets.

Raw sockets (SOCK_RAW):

This interface allows direct access to lower-level protocols such as IP and ICMP. It is often used to test new protocol implementations or to access new devices configured in an existing service.
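The three socket types correspond to the second argument of socket(). The sketch below assumes a POSIX system; the function name reported_socket_type is hypothetical. On most systems creating a raw socket requires elevated privileges, so that case may simply fail.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create an AF_INET socket of the requested type and return the type the
 * kernel reports for it via SO_TYPE, or -1 if it could not be created. */
int reported_socket_type(int type, int protocol)
{
    int actual = -1;
    socklen_t len = sizeof actual;
    int fd = socket(AF_INET, type, protocol);

    if (fd < 0)
        return -1;   /* e.g. SOCK_RAW without sufficient privileges */
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &actual, &len) < 0)
        actual = -1;
    close(fd);
    return actual;
}
```

For example, reported_socket_type(SOCK_STREAM, 0) returns SOCK_STREAM and reported_socket_type(SOCK_DGRAM, 0) returns SOCK_DGRAM, while reported_socket_type(SOCK_RAW, IPPROTO_ICMP) returns SOCK_RAW only when the process is allowed to open raw sockets.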

2.5 Typical socket call sequence

As mentioned earlier, TCP/IP applications generally use the client/server model, so in practice there are two processes, client and server, and the server is started first. The system call sequences are as follows. The socket system calls for a connection-oriented protocol (such as TCP) are shown in Figure 2.1:

The server must be started first and must be listening (it normally then blocks in accept()) before it can receive client requests. If the client starts before this, connect() returns an error code and the connection fails.
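The effect of starting the client too early can be reproduced with a short sketch. It assumes a POSIX system with BSD sockets; the function names bind_only and try_connect are hypothetical. On typical BSD-derived stacks the kernel begins queuing connections as soon as the server has called listen(), so connect() can succeed slightly before the server reaches accept().

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a TCP socket to a free loopback port WITHOUT calling listen().
 * Stores the port in *port and returns the descriptor, or -1. */
int bind_only(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0
        || getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Try to connect to 127.0.0.1:port. Returns 0 on success, otherwise the
 * errno value reported by socket() or connect(). */
int try_connect(unsigned short port)
{
    struct sockaddr_in addr;
    int err = 0;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0)
        err = errno;
    close(fd);
    return err;
}
```

If the server has only called bind_only(), try_connect() on that port returns ECONNREFUSED. After the server calls listen() on the same descriptor, try_connect() returns 0, and the server's next accept() picks up the queued connection.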

The socket calls for a connectionless protocol (UDP) are shown in Figure 2.2:

A connectionless server must also be started first; otherwise client requests cannot reach the server process. A connectionless client does not call connect(). Before any data is sent, therefore, the client and the server have not established a full association; each has only established a half-association through socket() and bind(). When sending data, the sender must specify the receiver's socket address in addition to its own local socket, so the full association is established dynamically as data is sent and received.
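The connectionless call sequence can be sketched with sendto() and recvfrom(). This assumes a POSIX system with BSD sockets; the function names udp_bound_socket and udp_echo_once are hypothetical, and the echo reply stands in for a real service.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* socket() + bind(): create a UDP socket on a free loopback port, which
 * establishes a half-association. The bound address is stored in *local.
 * Returns the descriptor, or -1 on failure. */
int udp_bound_socket(struct sockaddr_in *local)
{
    socklen_t len = sizeof *local;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(local, 0, sizeof *local);
    local->sin_family = AF_INET;
    local->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    local->sin_port = htons(0);          /* port 0: OS chooses */

    if (bind(fd, (struct sockaddr *)local, sizeof *local) < 0
        || getsockname(fd, (struct sockaddr *)local, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Receive one datagram and send it back to whoever sent it. The peer
 * address is learned from recvfrom(), not from a prior connection.
 * Returns the number of bytes echoed, or -1 on error. */
ssize_t udp_echo_once(int fd)
{
    char buf[512];
    struct sockaddr_in peer;
    socklen_t len = sizeof peer;

    ssize_t n = recvfrom(fd, buf, sizeof buf, 0,
                         (struct sockaddr *)&peer, &len);
    if (n < 0)
        return -1;
    return sendto(fd, buf, (size_t)n, 0, (struct sockaddr *)&peer, len);
}
```

A client creates its own socket with udp_bound_socket() and passes the server's address to sendto() with each datagram. The remote half of the association exists only for the duration of each sendto() or recvfrom() call.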

Example

This example uses the client/server model with a connection-oriented protocol, as shown in Figure 2.3:

Server-side program:

    /* File name: streams.c */
    #include <winsock.h>
    #include <stdio.h>
    #include <stdlib.h>   /* exit() */

    #define TRUE 1

    /*
     * This program creates a socket and then enters an infinite loop. Each
     * time through the loop it accepts a connection and prints a message.
     * When the connection is broken, or a termination message is received,
     * the connection ends and the program accepts a new connection.
     * Command line: streams
     */
    int main(void)
    {
        int sock, length;
        struct sockaddr_in server;
        struct sockaddr tcpaddr;
        int msgsock;
        char buf[1024];
        int rval, len;

        /* Create the socket. With Winsock, WSAStartup() must have been
           called successfully before socket() can succeed. */
        sock = socket(AF_INET, SOCK_STREAM, 0);
        if (sock < 0) {
            perror("Opening stream socket");
            exit(1);
        }

        /* ... (the listing is truncated at this point in the source) ... */
    }
