1. Differences between TCP and UDP:
(1) TCP is connection-oriented: a connection must be established (via the three-way handshake) before communication. UDP is connectionless: no connection is set up before sending.
(2) TCP provides reliable delivery (in order, error-free, no loss, no duplication); UDP provides unreliable, best-effort delivery.
(3) TCP is byte-stream oriented, so it can split data into segments and reassemble them at the receiving end; UDP is datagram-oriented and sends each application message whole, without the overhead of splitting and reassembly.
(4) TCP provides congestion control and flow control; UDP provides neither.
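As a minimal sketch of difference (1), the C snippet below (assuming a hypothetical listener on 127.0.0.1:9000; error handling trimmed) sends the same two bytes both ways; only the TCP path needs connect() to establish a connection first:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int tcp_fd = socket(AF_INET, SOCK_STREAM, 0);  /* TCP: byte stream */
    int udp_fd = socket(AF_INET, SOCK_DGRAM, 0);   /* UDP: datagrams  */

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(9000);                   /* hypothetical port */
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

    /* TCP must complete the three-way handshake before data can flow. */
    if (connect(tcp_fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        send(tcp_fd, "hi", 2, 0);

    /* UDP needs no setup: each datagram is addressed individually. */
    sendto(udp_fd, "hi", 2, 0, (struct sockaddr *)&addr, sizeof(addr));

    close(tcp_fd);
    close(udp_fd);
    return 0;
}
```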
2. How flow control and congestion control are implemented:
(1) TCP implements flow control with a variable-size sliding window, where the window size is measured in bytes. The value written into the window field of a TCP segment's header is the current upper limit on how much data the peer may send.
During data transfer, this sliding-window flow control lets the receiver throttle the amount of data the sender transmits according to its receiving capacity (buffer space).
(2) The sliding window mechanism can also be used for congestion control of the network: the number of packets in the network (carrying TCP segments as their payload) is kept below a certain level, because once the load exceeds that level, performance deteriorates dramatically. Transport-layer congestion control comprises four algorithms: slow start, congestion avoidance, fast retransmit, and fast recovery.
Congestion: a large number of datagrams pour into the same switching node (such as a router), exhausting the node's resources and forcing it to discard data that arrives later; this is congestion.
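To make the four algorithms concrete, here is a simplified, Reno-style sketch (not a real TCP stack; the MSS and initial ssthresh are illustrative values) of how a sender's congestion window might be updated:

```c
/* Simplified sketch of TCP congestion-window bookkeeping:
 * slow start grows cwnd by one MSS per ACK; past ssthresh, congestion
 * avoidance grows it by roughly one MSS per RTT; loss halves ssthresh. */
#define MSS 1460                      /* typical maximum segment size, bytes */

static unsigned cwnd = MSS;           /* congestion window */
static unsigned ssthresh = 64 * 1024; /* slow-start threshold */

void on_ack(void) {
    if (cwnd < ssthresh)
        cwnd += MSS;                  /* slow start: exponential per RTT */
    else
        cwnd += MSS * MSS / cwnd;     /* congestion avoidance: ~1 MSS per RTT */
}

void on_loss(void) {                  /* e.g. triple duplicate ACK */
    ssthresh = cwnd / 2;              /* multiplicative decrease */
    cwnd = ssthresh + 3 * MSS;        /* fast recovery (Reno-style) */
}
```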
3. Retransmission mechanism:
Each time TCP sends a segment it starts a timer; if the timer expires before an acknowledgement arrives, the segment is retransmitted.
The TCP environment: packet round-trip times vary widely.
If hosts A and B are on the same LAN, the round-trip delay is very small; if A and C communicate across the Internet, the round-trip delay is very large.
Therefore A cannot pick one fixed timer value that suits communication with both B and C.
TCP instead uses an adaptive algorithm: it records the time each segment is sent and the time the corresponding acknowledgement arrives, the difference being that segment's round-trip delay. A weighted average of these per-segment samples yields the smoothed round-trip time T, from which the retransmission timeout is derived.
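A sketch of such an adaptive estimator, using the classic exponentially weighted moving average (the smoothing constants 1/8 and 1/4 follow the commonly used RFC 6298 values; this is an illustration, not TCP's exact code):

```c
/* Adaptive RTT estimation: each measured round-trip sample updates a
 * smoothed mean (srtt) and a variance estimate (rttvar); the timeout
 * is then set well above the smoothed mean. */
static double srtt = 0.0;    /* smoothed round-trip time, seconds */
static double rttvar = 0.0;  /* round-trip time variance estimate */
static double rto = 1.0;     /* retransmission timeout, seconds */

void on_rtt_sample(double rtt) {
    if (srtt == 0.0) {                     /* first measurement */
        srtt = rtt;
        rttvar = rtt / 2.0;
    } else {
        double err = rtt - srtt;
        srtt += 0.125 * err;               /* srtt = 7/8*srtt + 1/8*rtt */
        rttvar += 0.25 * ((err < 0 ? -err : err) - rttvar);
    }
    rto = srtt + 4.0 * rttvar;             /* timeout = srtt + 4*rttvar */
}
```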
4. Sliding window mechanism:
TCP uses a variable-size sliding window for flow control; the window size is measured in bytes.
The value written into the window field of the TCP header is the upper limit currently imposed on the peer's send window. The send window is agreed by both sides when the connection is established, but during communication the receiver may, at any time and according to its own resource situation, dynamically adjust the maximum send window it grants (up or down).
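A toy model of the sender-side bookkeeping this implies: the sender may transmit only as much as the advertised window minus what is already in flight. The field names echo conventional TCP variable names, but the struct itself is made up for illustration:

```c
#include <stdint.h>
#include <stdio.h>

struct send_state {
    uint32_t snd_una;  /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;  /* next sequence number to send */
    uint32_t rcv_wnd;  /* window last advertised by the receiver */
};

/* How many more bytes may be sent right now. */
uint32_t usable_window(const struct send_state *s) {
    uint32_t in_flight = s->snd_nxt - s->snd_una;
    return in_flight >= s->rcv_wnd ? 0 : s->rcv_wnd - in_flight;
}

int main(void) {
    struct send_state s = { .snd_una = 100, .snd_nxt = 1100, .rcv_wnd = 4096 };
    printf("may still send %u bytes\n", usable_window(&s)); /* prints 3096 */
    return 0;
}
```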
5. How to synchronize multiple threads:
Four mechanisms: critical section, mutex, event, semaphore.
Differences between critical sections, mutexes, semaphores, and events:
(1) Critical section: serializes access by multiple threads to a shared resource or a section of code. It is fast and suitable for controlling access to data. Only one thread may access the shared resource at any moment; if several threads try, then once one has entered, the others are suspended until the thread inside leaves and the critical section is released, after which another thread may claim it.
(2) Mutex: uses a mutual-exclusion object. Only the thread that owns the mutex has access to the shared resource, and since there is only one mutex, the resource cannot be accessed by several threads at once. A mutex is more complex than a critical section, because it enables safe sharing of resources not only among threads of the same application but also among threads of different applications.
(3) Semaphore: allows multiple threads to access the same resource simultaneously, while limiting the maximum number of threads that may do so at once. A semaphore object synchronizes threads in the same way as the P/V operations in an operating system; its value indicates the maximum number of threads allowed to access the shared resource at the same time.
The concepts of the semaphore and the P/V operations were proposed by the Dutch scientist E. W. Dijkstra. A semaphore S is an integer: when S >= 0 it represents the number of resource instances available to concurrent processes, and when S < 0 its absolute value is the number of processes waiting to use the shared resource.
P operation (request a resource):
(1) decrement S by 1;
(2) if S is still greater than or equal to 0 after the decrement, the process continues to execute;
(3) if S is less than 0 after the decrement, the process is blocked and placed on the queue associated with the semaphore, and control transfers to the process scheduler.
V operation (release a resource):
(1) increment S by 1;
(2) if the result is greater than 0, the process continues to execute;
(3) if the result is less than or equal to 0, one waiting process is woken from the semaphore's wait queue, and then either the original process resumes execution or control transfers to the process scheduler.
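The P/V operations above map directly onto POSIX counting semaphores, where sem_wait() plays P and sem_post() plays V. A small sketch (the thread count and sleep are arbitrary) capping concurrent access at 3:

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

static sem_t slots;

void *worker(void *arg) {
    long id = (long)arg;
    sem_wait(&slots);                 /* P: S--, block if S < 0 */
    printf("thread %ld is using the resource\n", id);
    sleep(1);                         /* hold the resource briefly */
    sem_post(&slots);                 /* V: S++, wake one waiter */
    return NULL;
}

int main(void) {
    pthread_t t[5];
    sem_init(&slots, 0, 3);           /* at most 3 concurrent users */
    for (long i = 0; i < 5; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < 5; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&slots);
    return 0;
}
```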
(4) Event: keeps threads synchronized by means of notification, and also makes it convenient to implement operations such as comparing the priorities of multiple threads.
Summary:
(1) A mutex is very similar in function to a critical section, but a mutex can be named, i.e. it can be used across processes. Creating a mutex therefore consumes more resources, so when synchronization is needed only inside a single process, using a critical section is faster and takes fewer resources. Because a mutex works across processes, once created it can be opened by name.
(2) Mutexes, semaphores, and events can all be used across processes to synchronize operations on data. Moreover, a process or thread object is non-signaled while running and becomes signaled after it exits, so WaitForSingleObject can be used to wait for a process or thread to exit.
(3) A mutex grants exclusive use of a resource, but some situations cannot be handled with a mutex alone. For example, suppose a user has purchased a database system licensed for three concurrent accesses: the number of threads/processes allowed to operate on the database at the same time is determined by the number of licenses purchased. A mutex cannot express this requirement, whereas a semaphore can; a semaphore object can be regarded as a resource counter.
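The three-license example in point (3) could look like the following Win32 sketch, where the semaphore's initial and maximum counts are both set to the number of purchased licenses (the thread count of 5 is arbitrary):

```c
#include <windows.h>
#include <stdio.h>

static HANDLE g_licenses;

DWORD WINAPI db_worker(LPVOID arg) {
    WaitForSingleObject(g_licenses, INFINITE); /* acquire one license */
    printf("worker %u holds a license\n", (unsigned)(UINT_PTR)arg);
    Sleep(1000);                               /* do "database work" */
    ReleaseSemaphore(g_licenses, 1, NULL);     /* return the license */
    return 0;
}

int main(void) {
    HANDLE threads[5];
    /* initial count 3, maximum count 3: at most 3 concurrent users */
    g_licenses = CreateSemaphore(NULL, 3, 3, NULL);
    for (int i = 0; i < 5; i++)
        threads[i] = CreateThread(NULL, 0, db_worker,
                                  (LPVOID)(UINT_PTR)i, 0, NULL);
    WaitForMultipleObjects(5, threads, TRUE, INFINITE);
    CloseHandle(g_licenses);
    return 0;
}
```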
6. Differences between processes and threads:
A: A thread is an execution unit inside a process and the entity that is scheduled within the process. Differences from a process:
(1) Scheduling: the thread is the basic unit of scheduling and dispatching, while the process is the basic unit of resource ownership.
(2) Concurrency: not only can different processes execute concurrently, but multiple threads within the same process can also execute concurrently.
(3) Resource ownership: a process is an independent unit that owns resources; a thread owns no system resources of its own, but it can access the resources of the process it belongs to.
(4) Overhead: when a process is created or destroyed, the system must allocate or reclaim resources for it, so the overhead is significantly greater than the cost of creating or destroying a thread.
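A quick way to see the resource-ownership difference is that fork() gives the child its own copy of memory while pthread_create() shares the caller's; a minimal sketch:

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

static int shared = 0;

void *thread_body(void *arg) {
    shared++;                 /* visible to the creator: memory is shared */
    return NULL;
}

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {           /* child process */
        shared++;             /* touches a private copy; parent still sees 0 */
        exit(0);
    }
    waitpid(pid, NULL, 0);
    printf("after fork:   shared = %d\n", shared);  /* 0 */

    pthread_t t;
    pthread_create(&t, NULL, thread_body, NULL);
    pthread_join(t, NULL);
    printf("after thread: shared = %d\n", shared);  /* 1 */
    return 0;
}
```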
7. What interprocess communication mechanisms exist, and what are their advantages and disadvantages:
(1) Pipe: a pipe is half-duplex; data flows in only one direction, and it can be used only between related processes. "Related" usually means a parent-child relationship. (A pipe sketch follows this list.)
(2) Named pipe (FIFO): also half-duplex, but it allows communication between unrelated processes; data passes through the pipe first-in, first-out.
(3) Semaphore: a counter that can be used to control access by multiple processes to a shared resource. It is often used as a locking mechanism, preventing other processes from accessing a shared resource while one process is using it; it therefore serves mainly as a means of synchronization between processes, and between threads within the same process.
(4) Message queue: a linked list of messages, stored in the kernel and identified by a message queue identifier. Message queues overcome the drawbacks that signals carry little information, that pipes carry only unformatted byte streams, and that pipe buffers are limited in size.
(5) Signal: a relatively sophisticated communication mechanism, used to notify the receiving process that some event has occurred.
(6) Shared memory: a region of memory created by one process and mapped so that other processes can access it. Shared memory is the fastest form of IPC; it was designed precisely because the other interprocess communication methods are inefficient. It is often used together with other mechanisms, such as semaphores, to achieve both synchronization and communication between processes.
(7) Socket: also an interprocess communication mechanism; unlike the others, it can be used for communication between processes on different machines.
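As referenced in item (1), a minimal parent-child pipe sketch (the message text is arbitrary):

```c
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fds[2];               /* fds[0] = read end, fds[1] = write end */
    char buf[32];

    if (pipe(fds) < 0) return 1;

    if (fork() == 0) {        /* child: write one message, then exit */
        close(fds[0]);
        write(fds[1], "hello parent", 12);
        close(fds[1]);
        _exit(0);
    }

    close(fds[1]);            /* parent keeps only the read end */
    ssize_t n = read(fds[0], buf, sizeof(buf) - 1);
    buf[n > 0 ? n : 0] = '\0';
    printf("parent read: %s\n", buf);
    close(fds[0]);
    wait(NULL);
    return 0;
}
```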
8. The exact steps of the TCP three-way handshake when establishing a connection, and the reason for each step:
(1) Step one: TCP on source host A sends a connection-request segment to host B with the SYN flag in its header set to 1, indicating the wish to communicate with host B, and carries an initial sequence number x (for example seq = 100) for synchronization, meaning the first byte of transferred data will have sequence number x+1 (i.e. 101). The SYN segment announces the port used by the client and the initial sequence number of the TCP connection.
(2) Step two: TCP on host B receives the connection-request segment and, if it agrees, sends back an acknowledgement. Both the ACK and SYN bits are set to 1 in this segment, indicating that the client's request is accepted. The acknowledgement number is x+1 (101 in the example), and B also chooses an initial sequence number y of its own.
(3) Step three: TCP on host A receives B's acknowledgement and acknowledges it in turn, with ACK set to 1, acknowledgement number y+1, and its own sequence number x+1. The TCP standard stipulates that a segment with SYN set to 1 consumes one sequence number.
TCP on host A then notifies the client's upper application process that the connection has been established. When host A sends its first data segment to host B, the sequence number is still x+1, because the preceding acknowledgement segment consumes no sequence number.
When TCP on host B, where the server process runs, receives host A's acknowledgement, it likewise notifies its upper application process that the connection has been established. A full-duplex connection is now in place.
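In the sockets API the whole handshake is hidden inside two calls: the client's connect() sends the SYN and completes steps 1 and 3, while the server side finishes once the SYN+ACK has been acknowledged. A minimal sketch (port 9000 on loopback is a made-up example, and error checks are omitted):

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(9000);             /* hypothetical port */

    bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
    listen(lfd, 16);                         /* ready to receive SYNs */

    int cfd = socket(AF_INET, SOCK_STREAM, 0);
    connect(cfd, (struct sockaddr *)&addr, sizeof(addr)); /* SYN ... ACK */

    int sfd = accept(lfd, NULL, NULL);       /* handshake already done */

    close(sfd); close(cfd); close(lfd);
    return 0;
}
```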
9. The exact steps of TCP connection teardown, and why each step is performed:
(1) Step one: the application process on source host A first issues a connection-release request to its TCP and stops sending data. TCP notifies the other side that the connection in the A-to-B direction is to be released: the FIN bit in the header of the segment sent to host B is set to 1, and its sequence number x equals the sequence number of the last byte of previously transmitted data, plus 1.
(2) Step two: on receiving the release notification, TCP on host B sends an acknowledgement with sequence number y and acknowledgement number x+1, and notifies its higher-level application process. The connection from A to B is thereby released and the connection enters a half-closed state; it is as if host A has said to host B: "I have no more data to send, but I will still receive anything you send." From then on host B receives no more data from A; however, if host B still has data to send to host A, it may continue sending, and host A keeps acknowledging the data it receives correctly.
(3) Step three: when host B has no more data to send to host A, its application process notifies TCP to release the connection. The release segment sent by host B must have both the FIN and ACK bits set to 1; its sequence number is still y, and it must also repeat the previously sent acknowledgement number ack = x+1.
(4) Step four: host A must send an acknowledgement with ACK set to 1, ack = y+1, and its own sequence number x+1. This releases the connection in the opposite, B-to-A direction. A's TCP then reports to its application process that the entire connection has been released.
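The half-closed state of step two corresponds to shutdown(fd, SHUT_WR) in the sockets API: the caller emits its FIN yet keeps reading. A sketch (assuming fd is an already-connected TCP socket):

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Finish our sending direction but keep receiving from the peer. */
void finish_sending(int fd) {
    char buf[256];
    ssize_t n;

    shutdown(fd, SHUT_WR);    /* our FIN goes out: A-to-B direction closed */

    /* Half-closed: the peer may still send; drain its data until its
     * own FIN arrives, signalled by read() returning 0. */
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        printf("still receiving %zd bytes from peer\n", n);

    close(fd);                /* both directions now released */
}
```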
10. The state transitions a TCP connection goes through when it is established and released:
Client: active open -> SYN_SENT -> ESTABLISHED -> active close -> FIN_WAIT_1 -> FIN_WAIT_2 -> TIME_WAIT
Server: LISTEN (passive open) -> SYN_RCVD -> ESTABLISHED -> CLOSE_WAIT (passive close) -> LAST_ACK -> CLOSED
11. The difference between epoll and select:
The problem arises when more than two I/O descriptors must be read. With blocking I/O, the process may block for a long time on one descriptor while another descriptor has data that goes unread, so real-time requirements are not met. The rough solutions are as follows:
1. Use multiple processes or multiple threads. This complicates the program, and creating and maintaining processes and threads also costs a great deal of overhead. (The Apache server takes the subprocess approach; a benefit is that it isolates users from one another.)
2. Use a single process, but read with non-blocking I/O: when a descriptor is unreadable, return immediately and check the next one. Looping like this is polling, which wastes CPU time, because most of the time nothing is readable, yet the read system call is executed again and again.
3. Asynchronous I/O: the kernel notifies the process with a signal when a descriptor is ready. But because the number of signals is limited, this does not scale to many descriptors.
4. A better way is I/O multiplexing: construct a list of descriptors (for epoll, a queue in the kernel) and call a function that returns only when one of these descriptors is ready, telling the process which I/O is ready. Both select and epoll are implementations of this multiplexing mechanism; select is in the POSIX standard, while epoll is specific to Linux.
The differences (epoll's advantages over select) are mainly three:
1. select limits the number of handles. The Linux header linux/posix_types.h contains the declaration #define __FD_SETSIZE 1024, meaning select can monitor at most 1024 fds at the same time. epoll has no such limit; it is bounded only by the maximum number of open file handles.
2. epoll's biggest advantage is that its efficiency does not fall as the number of fds grows. select processes its fds by polling a structure similar to an array, while epoll maintains a ready queue and only needs to check whether that queue is empty. epoll operates only on "active" sockets, because in the kernel epoll is implemented with a callback function registered on each fd: only an active socket invokes its callback (placing that handle on the ready queue), while idle handles do not. In this sense epoll implements a "pseudo" AIO. But if most of the I/O is "active" and every I/O port is heavily used, epoll is not necessarily more efficient than select (perhaps because of the complexity of maintaining the queue).
3. mmap is used to accelerate message passing between kernel and user space. Whether with select, poll, or epoll, the kernel must deliver fd notifications to user space, so avoiding unnecessary memory copies matters; here epoll has the kernel and user space mmap the same region of memory.
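A minimal epoll event loop illustrating approach 4 (level-triggered, error handling trimmed; listen_fd is assumed to be a bound, listening TCP socket created elsewhere):

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_EVENTS 64

void serve(int listen_fd) {
    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[MAX_EVENTS];
    char buf[4096];

    for (;;) {
        /* Blocks until at least one registered fd is ready. */
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == listen_fd) {               /* new connection */
                int conn = accept(listen_fd, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN,
                                           .data.fd = conn };
                epoll_ctl(epfd, EPOLL_CTL_ADD, conn, &cev);
            } else {                             /* data or hangup */
                ssize_t r = read(fd, buf, sizeof(buf));
                if (r <= 0) {                    /* peer closed or error */
                    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                    close(fd);
                }
            }
        }
    }
}
```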
12. The difference between ET and LT in epoll, and the principle behind them:
epoll has two working modes: LT and ET.
LT (level-triggered) is the default working mode and supports both blocking and non-blocking sockets. In this mode the kernel tells you whenever a file descriptor is ready, and you can then perform I/O on the ready fd. If you do nothing, the kernel will keep notifying you, so this mode is less prone to programming mistakes. The traditional select/poll are representatives of this model.
ET (edge-triggered) is the high-speed working mode and supports only non-blocking sockets. In this mode, the kernel tells you through epoll when a descriptor goes from not ready to ready. It then assumes you know the file descriptor is ready and sends no further readiness notifications for it, until you do something that causes the descriptor to become not ready again (for example, you send or receive until a call returns an EWOULDBLOCK error, or you transfer less data than requested). Note that if no I/O is performed on the fd (so it never becomes not-ready again), the kernel still will not send more notifications (it notifies only once). In the TCP protocol, whether ET mode actually brings a speedup still needs more benchmark confirmation.
epoll requires only three system calls: epoll_create, epoll_ctl, and epoll_wait.
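The practical consequence of ET mode is the drain-until-EAGAIN rule: after one notification, the non-blocking fd must be read empty, or the remaining data will never trigger another event. A sketch:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Register fd with epoll in edge-triggered mode; ET requires that the
 * fd be non-blocking. */
void register_et(int epfd, int fd) {
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Called when epoll_wait reports fd readable in ET mode. */
void drain(int fd) {
    char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0)
            continue;                 /* consume everything available */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;                    /* fully drained: wait for next edge */
        close(fd);                    /* 0 = peer closed, else real error */
        break;
    }
}
```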
13. What issues should be paid attention to when writing a server program:
14. ThreadLocal compared with other synchronization mechanisms:
ThreadLocal and all other synchronization mechanisms address the same problem: multiple threads accessing the same variable. With an ordinary synchronization mechanism, safe access by multiple threads to a shared variable is achieved by locking the object; the variable is shared among the threads, and using such a mechanism requires careful analysis of when the variable is read and written, when the object must be locked, when its lock must be released, and so on, all of which stems from multiple threads sharing the resource. ThreadLocal attacks concurrent access from another angle: it maintains, for each thread, a copy of the variable bound to that thread, thereby isolating the threads from one another. Since every thread has its own copy, no synchronization on the variable is needed. ThreadLocal thus provides thread-safe shared objects: when writing multithreaded code, an unsafe variable can be encapsulated in a ThreadLocal.
Summary: ThreadLocal cannot, of course, replace synchronization mechanisms; the two address different problem domains. A synchronization mechanism coordinates concurrent access by multiple threads to the same resource and is an effective way for threads to communicate; ThreadLocal isolates data sharing between threads, so that fundamentally no resource (variable) is shared and there is naturally nothing to synchronize. So if multiple threads need to communicate, use a synchronization mechanism; if you need to isolate sharing conflicts between threads, use ThreadLocal. This greatly simplifies the program, making it easier to read and more concise.
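The per-thread-copy idea is not Java-specific; in C the analogue is thread-local storage. A sketch using C11's _Thread_local (the loop bound is arbitrary):

```c
#include <pthread.h>
#include <stdio.h>

static _Thread_local int counter = 0;   /* one private copy per thread */

void *worker(void *arg) {
    for (int i = 0; i < 1000; i++)
        counter++;                      /* no race: this copy is ours alone */
    printf("thread %ld: counter = %d\n", (long)arg, counter);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, worker, (void *)1L);
    pthread_create(&b, NULL, worker, (void *)2L);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    /* Each thread prints 1000: the copies never interfered, with no lock. */
    return 0;
}
```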
15. Memory pool, process pool, thread pool:
The idea of a custom memory pool is revealed by the word "pool": the application uses the system's memory-allocation calls to request a suitably large block of memory in advance as its pool; afterwards, the application's own allocations and releases are served from this pool. The system's memory-allocation functions need to be invoked only when the pool has to be dynamically enlarged; the rest of the time, all memory operations stay under the application's control. Custom memory pools come in different types depending on the applicable scenario. From the thread-safety point of view, memory pools divide into single-threaded and multithreaded pools: a single-threaded pool is used by only one thread throughout its lifetime, so mutual-exclusion issues need not be considered; a multithreaded pool may be shared by several threads, so a lock must be taken on every allocation and release. By comparison, single-threaded pools have higher performance, while multithreaded pools are more widely applicable.
By the size of the memory units allocated from the pool, pools divide into fixed-size and variable-size pools. A fixed-size pool hands out memory units of one predetermined size each time; a variable-size pool can change the size of each allocated unit on demand, giving it a wider range of application but lower performance than a fixed-size pool.
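A minimal single-threaded, fixed-size pool of the kind described above: one block is obtained from the system up front and carved into equal cells chained on a free list, so allocation and release become constant-time pointer operations (the cell size and count are illustrative):

```c
#include <stdlib.h>

#define CELL_SIZE  64      /* size of each allocation unit, bytes */
#define CELL_COUNT 1024    /* cells reserved when the pool is created */

typedef struct pool {
    void *free_list;       /* head of the chain of free cells */
    char *storage;         /* the single block obtained from the system */
} pool_t;

pool_t *pool_create(void) {
    pool_t *p = malloc(sizeof(*p));
    p->storage = malloc(CELL_SIZE * CELL_COUNT);
    p->free_list = NULL;
    for (int i = 0; i < CELL_COUNT; i++) {       /* chain all cells */
        void *cell = p->storage + i * CELL_SIZE;
        *(void **)cell = p->free_list;
        p->free_list = cell;
    }
    return p;
}

void *pool_alloc(pool_t *p) {
    void *cell = p->free_list;
    if (cell)
        p->free_list = *(void **)cell;           /* pop one free cell */
    return cell;                                  /* NULL if exhausted */
}

void pool_free(pool_t *p, void *cell) {
    *(void **)cell = p->free_list;               /* push the cell back */
    p->free_list = cell;
}

int main(void) {
    pool_t *p = pool_create();
    void *a = pool_alloc(p);                     /* from the pool, no syscall */
    pool_free(p, a);
    free(p->storage);
    free(p);
    return 0;
}
```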