TCP/IP Illustrated, Volume 2: The Implementation -- Socket Layer

Source: Internet
Author: User
Tags sendmsg

The main function of the socket layer is to map the protocol-independent requests issued by a process to the protocol-specific implementation that was selected when the socket was created. The original figure, not reproduced here, shows the layering between the socket interface and the protocol implementations in the kernel.



1. Socket Structure

A socket represents one end of a communication link and holds, or points to, all the information associated with the link. This information includes the protocol in use, the protocol's state information (including source and destination addresses), the queue of arriving connections, the data buffers, and option flags. The definitions of the socket and its associated buffers appeared in a figure that is not reproduced here.


so_type is specified by the process that creates the socket and identifies the communication semantics supported by the socket and its associated protocol. For UDP it is SOCK_DGRAM; for TCP it is SOCK_STREAM.

so_options is a set of flags that modify the behavior of the socket. The table of flags is not reproduced here.


A process can examine and change every socket option through the getsockopt and setsockopt system calls except SO_ACCEPTCONN, which the kernel sets when the listen system call is issued on the socket.

so_linger is the time, in clock ticks, that the socket waits for pending data to drain when a connection is closed.

so_state holds the internal state of the socket along with some additional flags. The table of possible so_state values is not reproduced here.


A process can modify SS_ASYNC and SS_NBIO directly through the fcntl and ioctl system calls.

If SS_NBIO is set and a requested resource is not available during an I/O operation on the socket, the kernel does not block the process; it returns EWOULDBLOCK instead.
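
The non-blocking behavior can be observed from user space. The following is a minimal sketch, assuming a POSIX system: it uses the standard O_NONBLOCK flag with fcntl, which is the user-level way of requesting what the text calls SS_NBIO. The function name recv_would_block is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a UDP socket, mark it non-blocking, and try to read from it.
 * Returns 1 if recv() failed with EWOULDBLOCK (or EAGAIN, which many
 * systems define to the same value), 0 if it did not, -1 on setup failure. */
int recv_would_block(void)
{
    char buf[16];
    int would_block = 0;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    int flags = fcntl(fd, F_GETFL, 0);
    if (flags >= 0 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0) {
        /* Nothing has been sent to this socket, so a blocking recv() would
         * sleep here. With O_NONBLOCK the call fails immediately instead. */
        if (recv(fd, buf, sizeof buf, 0) < 0 &&
            (errno == EWOULDBLOCK || errno == EAGAIN))
            would_block = 1;
    }
    close(fd);
    return would_block;
}
```

Calling recv_would_block() from a test program should return 1 on systems where an idle datagram socket has no data queued.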

If SS_ASYNC is set, the kernel sends the SIGIO signal to the process or process group identified by so_pgid whenever the state of the socket changes for one of the following reasons:

  • A connection request has completed.
  • A disconnect request has been initiated.
  • A disconnect request has completed.
  • Half of a connection has been shut down.
  • Data has arrived on the socket.
  • Data has been sent.
  • An asynchronous error has occurred on a UDP or TCP socket.

so_pcb points to the protocol control block, which contains protocol-specific state information and parameters for the socket. Each protocol defines its own control block structure, so so_pcb is declared as a generic pointer. The table of control block structures is not reproduced here.

so_proto points to the protosw structure of the protocol selected by the process in the socket system call.

A socket with the SO_ACCEPTCONN flag set maintains two connection queues. Connections that are not yet fully established (for example, the TCP three-way handshake has not completed) are placed on the queue so_q0. Connections that are established and ready to be accepted (the TCP three-way handshake has completed) are placed on the queue so_q. The lengths of the queues are kept in so_q0len and so_qlen. Each queued connection is represented by its own socket. In a queued socket, so_head points to the original socket that has SO_ACCEPTCONN set. The number of connections that can be queued on the socket is controlled by so_qlimit, which a process sets through the listen system call. The original figure, not reproduced here, shows the queues holding three connections that are ready to be accepted and one connection that is still being established.

so_timeo is used as a wait channel during accept, connect, and close processing. so_error holds an error code until it can be reported to the process during the next system call that references the socket. so_oobmark identifies the point in the input data stream at which the most recent out-of-band data was received.

Each socket contains two data buffers, so_rcv and so_snd, which buffer incoming and outgoing data respectively. so_tpcb is not used in Net/3. so_upcall and so_upcallarg are used only by the NFS software in Net/3.


2. System Calls

A process interacts with the kernel through a well-defined set of functions called system calls. How a process crosses into the protected environment of the kernel depends on the machine and the implementation. The discussion below uses the Net/3 implementation on the 386 to illustrate the operations involved.

In the BSD kernel, each system call is numbered. When a process executes a system call, the hardware is configured to transfer control to a single kernel function, and the integer identifying the system call is passed to that function as an argument. In the 386 implementation this kernel function is syscall. Using the system call number, syscall looks up the sysent structure for the requested system call in a table. Each entry in the table is a sysent structure:

    struct sysent {
        int sy_narg;        /* number of arguments */
        int (*sy_call)();   /* implementing function */
    };                      /* system call table entry */

The table has the following form:

    struct sysent sysent[] = {
        /* ... */
        { 3, recvmsg },       /* 27 = recvmsg */
        { 3, sendmsg },       /* 28 = sendmsg */
        { 6, recvfrom },      /* 29 = recvfrom */
        { 3, accept },        /* 30 = accept */
        { 3, getpeername },   /* 31 = getpeername */
        { 3, getsockname },   /* 32 = getsockname */
        /* ... */
    };

syscall copies the arguments from the calling process into the kernel and allocates an array to hold the results of the system call, which it returns to the process when the system call completes. syscall then transfers control to the kernel function that corresponds to the system call. In the 386 implementation the call looks something like this:

    struct sysent *callp;
    error = (*callp->sy_call)(p, args, rval);

Here callp points to the relevant sysent structure, p points to the process table entry of the process that made the system call, args is the array of 32-bit words that holds the arguments to the system call, and rval is an array of two 32-bit words that holds the return values of the system call. When we use the term system call, we mean the function in the kernel called by syscall, not the function in the process called by the application.

syscall expects the system call function to return 0 if no error occurred and a nonzero error code otherwise. If no error occurs, the kernel passes the values in rval back to the process as the return value of the system call. If an error occurs, syscall ignores the values in rval and returns the error code to the process in a machine-dependent way so that the process can read it from the external variable errno. The function called by the application returns -1 or a null pointer to indicate that errno should be examined.
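
The dispatch and error conventions described above can be sketched as a small user-space simulation. This is not kernel code: the two table entries (sys_add, sys_fail), the variable sim_errno, and the function sim_syscall are hypothetical stand-ins chosen for illustration, and the out-of-range check returning ENOSYS is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct proc { int p_pid; };   /* stand-in for the kernel's proc structure */

struct sysent {
    int sy_narg;                                   /* number of arguments */
    int (*sy_call)(struct proc *, int *, int *);   /* implementing function */
};

/* Hypothetical call 0: add two integers and leave the result in rval[0]. */
static int sys_add(struct proc *p, int *args, int *rval)
{
    (void)p;
    rval[0] = args[0] + args[1];
    return 0;                   /* 0 means "no error" */
}

/* Hypothetical call 1: always fails with EINVAL. */
static int sys_fail(struct proc *p, int *args, int *rval)
{
    (void)p;
    (void)args;
    rval[0] = 12345;            /* ignored by the dispatcher on error */
    return EINVAL;
}

static struct sysent sysent_table[] = {
    { 2, sys_add },             /* 0 = add  */
    { 0, sys_fail },            /* 1 = fail */
};

int sim_errno;                  /* plays the role of the process's errno */

/* Dispatch by number, then translate the kernel convention (0 or an error
 * code) into the user-level convention (a value, or -1 with errno set). */
int sim_syscall(struct proc *p, int code, int *args)
{
    int rval[2] = { 0, 0 };
    size_t n = sizeof sysent_table / sizeof sysent_table[0];

    if (code < 0 || (size_t)code >= n) {
        sim_errno = ENOSYS;
        return -1;
    }
    int error = (*sysent_table[code].sy_call)(p, args, rval);
    if (error != 0) {
        sim_errno = error;      /* rval is ignored on error */
        return -1;
    }
    return rval[0];
}
```

In this sketch, dispatching call 0 with the arguments 2 and 3 yields 5, while call 1 yields -1 and leaves EINVAL in sim_errno, mirroring how an application sees -1 and then consults errno.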


2.1. Example

The prototype of the socket system call is:

    int socket(int domain, int type, int protocol);

The prototype of the kernel function that implements the socket system call is:

    struct socket_args {
        int domain;
        int type;
        int protocol;
    };
    socket(struct proc *p, struct socket_args *uap, int *retval);

When an application calls socket, the process passes three separate integers to the kernel through the system call mechanism. syscall copies the arguments into an array of 32-bit values and passes a pointer to the array as the second argument to the kernel version of socket. The kernel version of socket treats the second argument as a pointer to a socket_args structure. The original figure illustrating this arrangement is not reproduced here.

As with socket, every kernel function that implements a system call declares args not as a pointer to an array of 32-bit words but as a pointer to a structure specific to that system call.
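
The reinterpretation of the argument array as a call-specific structure can be sketched in portable C. The kernel simply casts the pointer; this sketch uses memcpy instead, which is an assumption made here to stay within standard C aliasing rules. The function name view_socket_args is hypothetical, and the fields are declared as int32_t so that the structure layout matches three consecutive 32-bit words.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct socket_args {
    int32_t domain;
    int32_t type;
    int32_t protocol;
};

/* View an array of three 32-bit argument words as a socket_args structure.
 * The three fields line up with args[0], args[1], and args[2]. */
struct socket_args view_socket_args(const int32_t *args)
{
    struct socket_args uap;

    memcpy(&uap, args, sizeof uap);
    return uap;
}
```

Given the words 2, 1, and 6, the resulting structure has domain 2, type 1, and protocol 6, which is all the kernel function needs in order to address its arguments by name.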

3. Network System Calls

The original figure showing the flow of the network system calls is not reproduced here.


4. Processes, Descriptors, and Sockets

This section describes the data structures that tie together processes, descriptors, and sockets. The figure showing these structures and their relevant members is not reproduced here.

The first argument to a function that implements a system call is always p, a pointer to the proc structure of the calling process. The kernel uses the proc structure to record information about the process.

Within the proc structure, p_fd points to a filedesc structure, whose main job is to manage the descriptor table pointed to by fd_ofiles. The descriptor table is dynamically sized and consists of an array of pointers to file structures. Each file structure describes a single open file and can be shared by multiple processes.

Two members of the file structure are of interest here: f_ops and f_data. The implementation of an I/O system call differs according to the type of I/O object associated with the descriptor.

f_ops points to a fileops structure, which contains a table of function pointers that implement the read, write, ioctl, select, and close system calls.

f_data points to the private data of the associated I/O object. For a socket, f_data points to the socket structure associated with the descriptor. Finally, so_proto in the socket structure points to the protosw structure of the protocol selected when the socket was created.


5. Socket System Call

The socket system call creates a new socket and associates it with the protocol that the process specifies through the domain, type, and protocol arguments. The function allocates a new descriptor, which identifies the socket in subsequent system calls, and returns the descriptor to the process.

The declaration of the system call is as follows:

    int socket(int domain, int type, int protocol);

The processing performed by the function is roughly as follows:

1. falloc allocates a new file structure and an element in the fd_ofiles array. The file structure is marked readable and writable, and its type is set to socket.

2. socreate is called to allocate and initialize a socket structure.


5.1. socreate Function

Most socket system calls are split into at least two functions, in the same way as socket and socreate. The first function retrieves the required data from the process, calls the second function, soxxx, to do the work, and then returns any results to the process. The split allows the second function to be called directly by kernel-based network protocols.

The processing performed by socreate is roughly as follows:

1. Find the protocol switch table entry. Using the function arguments, look up the protosw structure that matches the protocol. The lookup yields a pointer to the structure or a null pointer.

2. Allocate and initialize the socket structure. A new socket structure is allocated and its fields are initialized.

3. Issue the PRU_ATTACH request. Every protocol provides a function that handles requests from the socket layer. so->so_proto->pr_usrreq is a pointer to the user-request function of the protocol associated with the socket so. Its prototype is:

    int pr_usrreq(struct socket *so, int req, struct mbuf *m0, *m1, *m2);

   req is a constant that identifies the request. The meaning of the last three arguments varies with the request. The table of requests handled by pr_usrreq is not reproduced here.

4. Exit processing. The new socket is returned.
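
The four steps above can be sketched as a toy model. Everything here is simplified and hypothetical: the structures carry only the fields needed for the illustration, pr_usrreq takes two arguments instead of five, the request constants have arbitrary values, and toy_usrreq, toy_protosw, and toy_socreate are names invented for this sketch. The model only aims to show how the socket layer reaches the protocol through a function pointer in protosw.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct socket;

enum { PRU_ATTACH, PRU_DETACH, PRU_BIND, PRU_LISTEN };  /* illustrative values */

struct protosw {
    int pr_type;                                    /* socket type served */
    int (*pr_usrreq)(struct socket *so, int req);   /* user-request function */
};

struct socket {
    int so_type;
    struct protosw *so_proto;   /* protocol selected at creation */
    void *so_pcb;               /* protocol control block, set on attach */
};

static int toy_pcb;             /* stand-in for a real control block */

/* A toy protocol that understands only attach and detach. */
static int toy_usrreq(struct socket *so, int req)
{
    switch (req) {
    case PRU_ATTACH:
        so->so_pcb = &toy_pcb;
        return 0;
    case PRU_DETACH:
        so->so_pcb = NULL;
        return 0;
    default:
        return EOPNOTSUPP;
    }
}

static struct protosw toy_protosw[] = { { 1, toy_usrreq } };

/* socreate-like helper: find the protocol, initialize the socket,
 * then issue PRU_ATTACH through the protocol's pr_usrreq pointer. */
int toy_socreate(int type, struct socket *so)
{
    struct protosw *prp = NULL;

    for (size_t i = 0; i < sizeof toy_protosw / sizeof toy_protosw[0]; i++)
        if (toy_protosw[i].pr_type == type)
            prp = &toy_protosw[i];
    if (prp == NULL)
        return EPROTONOSUPPORT;     /* step 1 found no matching protocol */

    so->so_type = type;             /* step 2: initialize the socket */
    so->so_proto = prp;
    so->so_pcb = NULL;
    return (*so->so_proto->pr_usrreq)(so, PRU_ATTACH);   /* step 3 */
}
```

The design point the sketch tries to capture is that the socket layer never names a protocol function directly; it always goes through so_proto, so the same socreate code serves every protocol in the table.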


6. getsock and sockargs Functions

These two functions appear repeatedly in the socket system calls.

getsock maps a descriptor to its file table entry.

sockargs copies arguments passed by the process into a newly allocated mbuf in the kernel.
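
A getsock-style lookup can be sketched with cut-down versions of the structures from section 4. This is a simplified model, not the kernel source: the structures keep only the fields used here, the names toy_getsock and toy_getsock_demo are hypothetical, and the error codes EBADF and ENOTSOCK are the ones such a lookup would conventionally report.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum { DTYPE_VNODE = 1, DTYPE_SOCKET = 2 };   /* descriptor object types */

struct socket   { int so_type; };
struct file     { int f_type; void *f_data; };
struct filedesc { struct file **fd_ofiles; int fd_nfiles; };

/* Map descriptor fd to its file structure, but only if it is a socket. */
int toy_getsock(struct filedesc *fdp, int fd, struct file **fpp)
{
    struct file *fp;

    if (fd < 0 || fd >= fdp->fd_nfiles || (fp = fdp->fd_ofiles[fd]) == NULL)
        return EBADF;           /* not an open descriptor */
    if (fp->f_type != DTYPE_SOCKET)
        return ENOTSOCK;        /* open, but not a socket */
    *fpp = fp;
    return 0;
}

/* Demo table: descriptor 0 is a regular file, 1 is a socket, 2 is closed. */
int toy_getsock_demo(int fd)
{
    static struct socket so = { 1 };
    static struct file regular = { DTYPE_VNODE, NULL };
    static struct file sock = { DTYPE_SOCKET, &so };
    static struct file *table[] = { &regular, &sock, NULL };
    struct filedesc fdesc = { table, 3 };
    struct file *fp = NULL;

    return toy_getsock(&fdesc, fd, &fp);
}
```

With the demo table, descriptor 1 resolves successfully, descriptor 0 is rejected as not being a socket, and descriptors 2, 7, and -1 are rejected as bad descriptors.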


7. Bind System Call

The bind system call associates a local network transport address with a socket. A process acting as a client usually does not care what its local address is. In that case the process does not need to call bind before communicating; the kernel selects a local address for it automatically.

A server process almost always needs to bind to a well-known address, so it must call bind before accepting connections or receiving datagrams, because client processes need to connect or send datagrams to that well-known address.

The declaration of the bind system call is not reproduced here.

The processing performed by bind is roughly as follows:

1. getsock returns the file structure for the descriptor.

2. sockargs copies the local address into an mbuf in the kernel.

3. The socket referenced by the file structure and the mbuf are passed to sobind.
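
From user space, the effect of bind can be observed with getsockname. The following is a minimal sketch, assuming a POSIX system with IPv4 available; bind_any is a hypothetical helper name. Passing port 0 asks the kernel to choose a free port, which is the user-visible form of the kernel selecting a local address on the process's behalf.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a UDP socket to the wildcard address and the given port, then
 * return the port (host byte order) the socket ended up with, or -1 on
 * failure. With port 0 the kernel picks a free port. */
int bind_any(unsigned short port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int result = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        getsockname(fd, (struct sockaddr *)&addr, &len) == 0)
        result = ntohs(addr.sin_port);

    close(fd);
    return result;
}
```

A server would pass its well-known port instead of 0, so that clients know where to send their requests.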


7.1. sobind Function

sobind is a wrapper that issues the PRU_BIND request to the protocol associated with the socket by calling so->so_proto->pr_usrreq.


8. Listen System Call

The listen system call notifies a protocol that the process is prepared to accept incoming connection requests on the socket. It also specifies a limit on the number of connections that can be queued on the socket. Once the limit is exceeded, the socket layer refuses to queue additional connection requests; when this happens, TCP ignores the incoming connection request. The process obtains queued connections by calling accept.

The declaration of the listen system call is not reproduced here.

The processing performed by listen is as follows:

1. getsock is called to return the file structure for the descriptor.

2. solisten is called to pass the request to the protocol layer.


8.1. solisten Function

solisten issues the PRU_LISTEN request by calling so->so_proto->pr_usrreq and prepares the socket to receive connections.


9. tsleep and wakeup Functions

When a process executing in the kernel cannot continue because a kernel resource is unavailable, it calls tsleep to wait. The prototype of tsleep is:

    int tsleep(caddr_t chan, int pri, char *mesg, int timeo);

chan is called the wait channel; it identifies the specific resource or event that the process is waiting for. Many processes can sleep on the same wait channel at the same time. When the resource becomes available or the event occurs, the kernel calls wakeup with the wait channel as its only argument. The prototype of wakeup is:

    void wakeup(caddr_t chan);

All processes waiting on the channel are awakened and made runnable. When each process resumes execution, the kernel arranges for tsleep to return. The table of tsleep return values is not reproduced here.

Because wakeup awakens every process waiting on the same wait channel, tsleep is always called in a loop. Each awakened process must check whether the resource it was waiting for is available before proceeding, because another awakened process may have claimed the resource first. If the resource is still unavailable, the process calls tsleep again.
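
The reason for re-checking the resource after every wakeup can be illustrated with a single-threaded simulation. This is a toy model, not a scheduler: processes are just state records, and toy_wakeup, toy_release, and toy_sleepers_left are hypothetical names for this sketch. It shows that wakeup makes every sleeper on the channel runnable, yet only as many as there are resources can proceed; the rest go back to sleep.

```c
#include <assert.h>

enum { NPROC = 3 };
enum state { SLEEPING, RUNNABLE, DONE };

struct toy_proc {
    enum state st;
    const void *wchan;      /* wait channel the process sleeps on */
};

/* wakeup: every process sleeping on chan becomes runnable. */
static void toy_wakeup(struct toy_proc *procs, int n, const void *chan)
{
    for (int i = 0; i < n; i++)
        if (procs[i].st == SLEEPING && procs[i].wchan == chan)
            procs[i].st = RUNNABLE;
}

/* Make `units` resources available, wake all sleepers, and let each
 * runnable process re-check the resource. Returns how many proceeded. */
int toy_release(struct toy_proc *procs, int n, int *resource, int units)
{
    int satisfied = 0;

    *resource += units;
    toy_wakeup(procs, n, resource);
    for (int i = 0; i < n; i++) {
        if (procs[i].st != RUNNABLE)
            continue;
        if (*resource > 0) {            /* re-check after waking up */
            (*resource)--;
            procs[i].st = DONE;
            satisfied++;
        } else {
            procs[i].st = SLEEPING;     /* lost the race: sleep again */
        }
    }
    return satisfied;
}

/* Three sleepers, `units` resources released: how many are still asleep? */
int toy_sleepers_left(int units)
{
    int resource = 0;
    int left = 0;
    struct toy_proc procs[NPROC];

    for (int i = 0; i < NPROC; i++) {
        procs[i].st = SLEEPING;
        procs[i].wchan = &resource;
    }
    toy_release(procs, NPROC, &resource, units);
    for (int i = 0; i < NPROC; i++)
        if (procs[i].st == SLEEPING)
            left++;
    return left;
}
```

With three sleepers and one unit released, all three are awakened but two find the resource already taken and return to sleep, which is exactly the situation the loop around tsleep guards against.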


10. Accept System Call

After calling listen, a process calls accept to wait for connection requests. accept returns a new descriptor that refers to a new socket connected to the client. The original socket remains unconnected and ready to receive further connections. If name points to a valid buffer, accept returns the address of the peer in it.

The details of connection processing are handled by the protocol associated with the socket. For TCP, the socket layer is notified when a connection has been established, that is, when the three-way handshake has completed.

The declaration of the accept system call is not reproduced here.

The processing performed by the function is roughly as follows:

1. Validate the arguments.

2. Wait for a connection. A while loop calls tsleep and exits when a connection arrives, an error occurs, or the socket can no longer receive data. Inside the loop the process waits in tsleep, which returns 0 when a connection arrives. If tsleep is interrupted by a signal, accept returns EINTR; if the socket is non-blocking and no connection is ready, accept returns EWOULDBLOCK.

3. Handle asynchronous errors. If an error occurred on the socket while the process slept, the error code stored in the socket becomes the return code of accept; the error code in the socket is cleared and accept returns. This is why the socket's error code must be checked after every wakeup.

4. Associate the socket with a descriptor. falloc allocates a descriptor for the new connection. The socket is removed from the accept queue and attached to the file structure of the descriptor.

5. Protocol processing. soaccept is called to complete the protocol processing. Finally, copyout copies the peer address to the process's address space.


10.1. soaccept Function

soaccept makes sure the socket is associated with a descriptor and issues the PRU_ACCEPT request to the protocol. When pr_usrreq returns, the name buffer contains the name of the foreign socket.


11. sonewconn and soisconnected Functions

accept waits for the protocol layer to process incoming connection requests and place them on so_q. TCP is used here to illustrate the process; the original figure is not reproduced.

accept calls tsleep to wait for an incoming connection. When a TCP SYN arrives, tcp_input calls sonewconn to create a socket for the new connection. sonewconn places the new socket on the so_q0 queue, because the three-way handshake has not yet completed.

When the final ACK of the TCP handshake arrives, tcp_input calls soisconnected, which updates the socket, moves it from so_q0 to so_q, and wakes up every process that called accept and is waiting for an incoming connection.

When tsleep returns, accept takes a connection from so_q and, through soaccept, issues the PRU_ACCEPT request. The socket is associated with a new file descriptor, which accept returns to the calling process.
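
The movement of a connection through the two queues can be sketched with counters alone. This is a toy model under stated assumptions: real queue entries are sockets, not counts, and all the toy_ names are hypothetical. The admission test (drop the request once so_qlen + so_q0len exceeds 3 * so_qlimit / 2) is an assumption based on the commonly described Net/3 rule; the text above says only that so_qlimit controls the number of queued connections.

```c
#include <assert.h>

struct toy_head {
    int so_q0len;    /* connections still completing the handshake */
    int so_qlen;     /* completed connections waiting for accept */
    int so_qlimit;   /* backlog set through listen */
};

/* A SYN arrives: queue a new connection on so_q0 unless the queues are
 * full. Returns 1 if queued, 0 if the request is dropped. The 3/2 factor
 * is an assumption of this sketch (see the lead-in). */
int toy_sonewconn(struct toy_head *h)
{
    if (h->so_qlen + h->so_q0len > 3 * h->so_qlimit / 2)
        return 0;
    h->so_q0len++;
    return 1;
}

/* The final ACK arrives: move one connection from so_q0 to so_q.
 * The kernel would also wake up processes sleeping in accept here. */
int toy_soisconnected(struct toy_head *h)
{
    if (h->so_q0len == 0)
        return 0;
    h->so_q0len--;
    h->so_qlen++;
    return 1;
}

/* accept: take one completed connection off so_q. Returns 0 when the
 * queue is empty, which is where a blocking accept would call tsleep. */
int toy_accept(struct toy_head *h)
{
    if (h->so_qlen == 0)
        return 0;
    h->so_qlen--;
    return 1;
}

/* How many of `syns` back-to-back requests are queued for a given backlog? */
int toy_queue_demo(int backlog, int syns)
{
    struct toy_head h = { 0, 0, backlog };
    int queued = 0;

    for (int i = 0; i < syns; i++)
        queued += toy_sonewconn(&h);
    return queued;
}
```

Under the assumed rule, a backlog of 2 admits 4 pending connections and a backlog of 5 admits 8, so the value passed to listen is a threshold rather than an exact queue length.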


12. Connect System Call

The server process calls listen and accept to wait for a remote process to initiate a connection. A process that wants to initiate a connection itself, that is, a client, calls connect.

For connection-oriented protocols such as TCP, connect establishes a connection to the specified foreign address. If the process has not called bind, the kernel selects a local address and binds it to the socket implicitly.

For connectionless protocols such as UDP or ICMP, connect records the foreign address for use in sending future datagrams. Any previous foreign address is replaced by the new one.

The original figure, not reproduced here, shows the functions involved when connect is called on a UDP or TCP socket.

The left side of the figure shows how connect handles connectionless protocols such as UDP. In this case the protocol layer calls soisconnected and the connect system call returns immediately.

The right side shows how connect handles connection-oriented protocols such as TCP. In this case the protocol layer begins connection establishment and calls soisconnecting to indicate that the connection will complete at some point in the future. Unless the socket is non-blocking, soconnect calls tsleep to wait for the connection to complete. For TCP, when the three-way handshake completes, the protocol layer calls soisconnected to mark the socket as connected and then calls wakeup to awaken the waiting process, which completes the connect system call.

The declaration of the connect system call is not reproduced here.

The processing performed by the connect system call is roughly as follows:

1. Start connection processing. The connection attempt begins with the call to soconnect.

2. Wait for the connection to be established. The while loop does not exit until the connection is established or an error occurs.

3. Clear the "connecting" flag, because the connection has either completed or failed. Release the mbuf that holds the foreign address.


12.1. soconnect Function

soconnect makes sure the socket is in a valid state for a connection request. If the socket is not connected and has no connection pending, the request is always valid. If the socket is already connected or a connection is pending, a connection-oriented protocol (such as TCP) rejects the new request. For connectionless protocols such as UDP, multiple connect calls are allowed, but the foreign address in each new request replaces the previous one.

soconnect issues the PRU_CONNECT request to start the protocol processing that establishes the connection or association.
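
The state check described above can be sketched as a small decision function. This is a simplified model: the fields connection_oriented and peer are hypothetical stand-ins for the protocol's properties and the foreign address, the flag values are illustrative, and returning EISCONN for a rejected request is an assumption of this sketch rather than something stated in the text.

```c
#include <assert.h>
#include <errno.h>

enum { SS_ISCONNECTED = 0x002, SS_ISCONNECTING = 0x004 };  /* illustrative */

struct toy_socket {
    int so_state;
    int connection_oriented;   /* 1 for a TCP-like protocol, 0 for UDP-like */
    int peer;                  /* stand-in for the foreign address */
};

/* Accept or reject a connect request based on the socket's current state. */
int toy_soconnect(struct toy_socket *so, int peer)
{
    if (so->so_state & (SS_ISCONNECTED | SS_ISCONNECTING)) {
        if (so->connection_oriented)
            return EISCONN;    /* already connected or connecting: reject */
        /* Connectionless: drop the old association and carry on. */
        so->so_state &= ~(SS_ISCONNECTED | SS_ISCONNECTING);
    }
    so->peer = peer;           /* the new address replaces any old one */
    so->so_state |= so->connection_oriented ? SS_ISCONNECTING
                                            : SS_ISCONNECTED;
    return 0;
}

/* Issue two connect requests in a row and return the second result. */
int toy_connect_twice(int connection_oriented)
{
    struct toy_socket so = { 0, connection_oriented, 0 };

    if (toy_soconnect(&so, 1) != 0)
        return -1;
    return toy_soconnect(&so, 2);
}
```

In the model, a second connect on the TCP-like socket fails while the first is still pending, whereas the UDP-like socket accepts it and simply records the new peer.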


13. Shutdown System Call

The shutdown system call closes the read half, the write half, or both halves of a connection. For the read half, shutdown discards any data the process has not yet read and any data that arrives after the call to shutdown. For the write half, shutdown lets the protocol respond appropriately. For TCP, all remaining data is sent, followed by a FIN. This is TCP's half-close feature.

To destroy the socket and release the descriptor, close must be called. close can be called directly without first calling shutdown. As with all descriptors, the kernel calls close for any sockets still open when the process terminates.

The declaration of the shutdown system call is not reproduced here.

The argument how takes the value 0, 1, or 2. Incrementing it (how++) yields 1, 2, or 3, which correspond to the FREAD bit, the FWRITE bit, and both bits, so that the read half, the write half, or both halves are closed respectively.

shutdown is a wrapper for soshutdown. getsock returns the socket associated with the descriptor; shutdown then calls soshutdown and returns its value.

The read half of the connection is closed by the socket layer, which calls sorflush; the write half is closed by the PRU_SHUTDOWN request at the protocol layer.
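
Half-close can be observed from user space without a network. The following is a minimal sketch, assuming a POSIX system; it uses a UNIX-domain stream socket pair rather than TCP, so it demonstrates only the socket-level semantics (queued data is delivered, then end-of-file, while the reverse direction stays open) and not the TCP FIN itself. The function name half_close_works is hypothetical.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if, after shutting down the write half of one end, the peer
 * still receives the queued data followed by end-of-file and can still
 * send data back in the other direction. Returns 0 otherwise. */
int half_close_works(void)
{
    int sv[2];
    char buf[8];
    int ok = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return 0;

    if (write(sv[0], "hi", 2) == 2 &&
        shutdown(sv[0], SHUT_WR) == 0 &&         /* close the write half only */
        read(sv[1], buf, sizeof buf) == 2 &&     /* queued data is delivered */
        read(sv[1], buf, sizeof buf) == 0 &&     /* then end-of-file */
        write(sv[1], "ok", 2) == 2 &&            /* reverse direction is open */
        read(sv[0], buf, sizeof buf) == 2)
        ok = 1;

    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

Note that both descriptors are still closed explicitly at the end; as the text says, shutdown does not release the descriptor.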


14. Close System Call

The close system call works on any type of descriptor. When fd is the last descriptor that references the object, the object-specific close function is called:

    error = (*fp->f_ops->fo_close)(fp, p);

For a socket, fp->f_ops->fo_close is the soo_close function.

soo_close is a wrapper around soclose. soclose aborts any connections pending on the socket (that is, connections that have not yet been accepted by a process), waits for data to be transmitted to the foreign system, and releases the data structures that are no longer needed.

The processing performed by soclose is roughly as follows:

1. Discard pending connections. soclose traverses the so_q0 and so_q connection queues and calls soabort for each pending connection. soabort issues the PRU_ABORT request to the protocol and returns the result.

2. Break the established connection or association. If the socket is connected to a foreign address, it must be disconnected from the peer.

3. Release the data structures. If the socket is still attached to a protocol, the PRU_DETACH request is issued to break the association. Finally, the socket is marked as not associated with any descriptor and sofree is called to release it.
