A cainiao (beginner) guide: Linux kernel concepts you can remember after hearing them once

Source: Internet
Author: User


I plan to share this with our department. We found that a lot of our low-level knowledge is stuck at the level of a single sentence. For example, we have heard that the JVM flag -XX:-UseBiasedLocking turns off biased locking to improve performance, because biased locking only pays off when there is little multi-threaded contention; or that -XX:AutoBoxCacheMax=20000 widens the Integer autoboxing cache beyond the default -128 to 127 range to improve performance. But we have no systematic picture of the JVM, the Linux kernel, or the operating system, and we often have no idea what to do when we hit a practical problem. So my internal sharing covers Linux, the JVM, and Redis; this article is the Linux part. The main goal is to build a working mental model of the concepts. I am also a cainiao ~~ just a thick-skinned one who is not afraid of others finding out how much of a noob I am.

Let me explain why I wanted to learn about the Linux kernel. At my previous company I was responsible for the company's search engine. Once, a new instance was set up on a new virtual machine, and under a load of 8000 an NIO exception was thrown: too many open files. I found the machine was in poor shape and shared by many services, with memory nearly full, so the problem went away after moving to a better machine. But what exactly is an exhausted file handle? To answer that, let's go through some basic concepts of the Linux kernel.

Let's take a look at the unix architecture.

A simple explanation: every computer system includes a basic set of programs that controls the hardware resources and provides the environment in which other programs run; this set is called the operating system. Within it, the most important program is the kernel, which is loaded at system startup, is relatively small, and sits at the core of the environment. The interface the kernel exposes to programs is called the system call interface. The public function libraries are built on top of system calls, and applications can use both the library functions and the system calls directly. The shell is a special application that provides an interface for running other applications.

Some operating systems allow user programs to interact directly with the hardware, MS-DOS for example. Unix-like operating systems, by contrast, hide all the low-level details of the machine's physical organization from user applications. When a program wants to use a hardware resource, it must send a request to the operating system; the kernel evaluates the request and, if the resource is granted, interacts with the hardware on behalf of the application. To enforce this, modern operating systems rely on hardware features that prevent user programs from talking to the hardware directly or from accessing arbitrary physical addresses. The hardware provides at least two execution modes for the CPU: a non-privileged mode for user programs and a privileged mode for the kernel. Unix calls them User Mode and Kernel Mode.
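To make the two modes concrete, here is a tiny C sketch of my own (not from the original article): printf() is an ordinary library function that formats and buffers in user mode, while write() is a thin wrapper around the write system call, which traps into kernel mode so the kernel can do the I/O on the program's behalf.

```c
#include <stdio.h>      /* printf: C library, runs and buffers in user mode */
#include <unistd.h>     /* write: thin wrapper around the write(2) system call */

int main(void) {
    /* Library call: formatting and buffering happen in user space;
       the kernel is only entered when the buffer is eventually flushed. */
    printf("hello from the C library\n");

    /* System call wrapper: control transfers to kernel mode, the kernel
       performs the I/O on our behalf, then returns to user mode. */
    write(STDOUT_FILENO, "hello from a system call\n", 25);
    return 0;
}
```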

Some of the Linux commands we use every day map directly onto kernel facilities. For example, in `cat xxx | grep 'x'` the two commands are connected with `|`; this is called a pipe. A standard definition first: a pipe is a widely used means of inter-process communication, used to pass data between related processes. "Related" means sharing a common ancestor: parent and child, siblings, or more distant relatives. As long as the common ancestor called the pipe function, the open pipe is shared by its descendants after fork. In essence, the kernel maintains a buffer associated with the pipe file, and operations on the pipe file are turned by the kernel into operations on that buffer. Pipes come in two kinds: anonymous pipes and named pipes (FIFOs).

A few concepts first. Everyone should already be clear on what a process is: a running instance of a program is called a process. A UNIX system guarantees that every process has a unique numeric identifier, the process ID, which is a non-negative integer, and many Linux commands display it. Three functions are central to process control: fork, exec, and waitpid. The fork function creates a new process that is a copy of the calling process, called the child process. fork returns the ID of the new child (a positive integer) to the parent, and returns 0 in the child. Because fork creates a new process, it is "called once but returns twice."
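A minimal sketch of "called once, returns twice", assuming a POSIX system (my own example, not from the article):

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();              /* one call ... */
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {                  /* ... returns 0 in the child */
        printf("child:  pid=%d\n", getpid());
    } else {                         /* ... returns the child's pid in the parent */
        printf("parent: pid=%d, child=%d\n", getpid(), pid);
        waitpid(pid, NULL, 0);       /* parent waits for the child to exit */
    }
    return 0;
}
```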

All threads in a process share the same address space, file descriptors, and other process-wide attributes (each thread does have its own stack). Because they can access the same memory, threads must synchronize when touching shared data to avoid inconsistency. This already gives us an intuition for two common statements: why creating a process is more expensive than creating a thread, and why threads bring locks into the picture.

An anonymous pipe is an unnamed, one-way pipe that transmits data between a parent process and a child process (or other related processes). It can only connect processes on the same machine, not across the network. The `|` in everyday shell command lines is exactly this kind of pipe.
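Here is a small illustrative sketch of an anonymous pipe shared across fork (the message text and buffer size are arbitrary choices of mine):

```c
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fd[2];                         /* fd[0] = read end, fd[1] = write end */
    char buf[64];

    if (pipe(fd) == -1) { perror("pipe"); return 1; }

    if (fork() == 0) {                 /* child inherits both ends after fork */
        close(fd[1]);                  /* child only reads */
        ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
        if (n > 0) { buf[n] = '\0'; printf("child read: %s\n", buf); }
        close(fd[0]);
        return 0;
    }

    close(fd[0]);                      /* parent only writes */
    const char *msg = "hello through the kernel buffer";
    write(fd[1], msg, strlen(msg));
    close(fd[1]);
    wait(NULL);
    return 0;
}
```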

A named pipe (FIFO) is a one-way or two-way pipe between processes. It is given a name when it is created, and any process that knows the name can open one end of it, even if the processes are completely unrelated. (On Windows, named pipes can even be used across the network; a Linux FIFO is limited to the local machine.)
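A minimal named-pipe writer sketch (the FIFO path /tmp/demo_fifo is my own choice); a completely unrelated process, for example `cat /tmp/demo_fifo` in another terminal, can read what it sends:

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    const char *path = "/tmp/demo_fifo";
    mkfifo(path, 0666);                    /* create the pipe file (ignore EEXIST) */

    int fd = open(path, O_WRONLY);         /* blocks until a reader opens the FIFO */
    if (fd == -1) { perror("open"); return 1; }

    const char *msg = "hello over a named pipe\n";
    write(fd, msg, strlen(msg));           /* data goes through a kernel buffer, not disk */
    close(fd);
    return 0;
}
```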

The original article showed a jvisualvm debugging screenshot here; the part highlighted in the blue box is effectively a named pipe.

 

Okay. Now we can tackle a classic question: what mechanisms can be used for inter-process communication?

The anonymous and named pipes just mentioned count as one mechanism. Besides pipes there are signals, message queues, shared memory, semaphores, and sockets. Don't panic about the list; by the end you may well feel suddenly enlightened, because things you already know will finally line up in one place.

Signal: short for soft-interrupt signal, used to notify a process of an asynchronous event. At the software level it simulates the interrupt mechanism: in principle, a process receiving a signal is like a processor receiving an interrupt request. Signals are the only asynchronous mechanism among the IPC facilities: a process does not have to perform any operation to wait for a signal to arrive.

A process that receives a signal can handle it in one of three ways:

1> Much like installing an interrupt handler, the process can register a handler function for the signal.

2> Ignore the signal and do nothing with it.

3> Keep the system default action for the signal. For most signals the default action is to terminate the process. A process calls signal to specify how it handles a given signal, as sketched below.
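A small sketch of my own showing the three choices, using the classic signal() call mentioned above (sigaction() is the more robust modern API, but signal() matches the text):

```c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Like an interrupt handler: called asynchronously when SIGINT arrives. */
static void on_sigint(int signo) {
    (void)signo;
    write(STDOUT_FILENO, "caught SIGINT\n", 14);   /* write() is async-signal-safe */
}

int main(void) {
    signal(SIGINT, on_sigint);   /* 1> install our own handler for Ctrl-C            */
    signal(SIGQUIT, SIG_IGN);    /* 2> ignore a signal: Ctrl-\ now does nothing      */
    /* 3> every signal we do not touch keeps the system default action,
          which for most signals is to terminate the process.            */

    for (;;)
        pause();                 /* sleep until a signal is delivered                */
}
```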

Below is the signal list.

On Linux it is produced with the `kill -l` command:

 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL
 5) SIGTRAP      6) SIGABRT      7) SIGBUS       8) SIGFPE
 9) SIGKILL     10) SIGUSR1     11) SIGSEGV     12) SIGUSR2
13) SIGPIPE     14) SIGALRM     15) SIGTERM     17) SIGCHLD
18) SIGCONT     19) SIGSTOP     20) SIGTSTP     21) SIGTTIN
22) SIGTTOU     23) SIGURG      24) SIGXCPU     25) SIGXFSZ
26) SIGVTALRM   27) SIGPROF     28) SIGWINCH    29) SIGIO
30) SIGPWR      31) SIGSYS      34) SIGRTMIN    35) SIGRTMIN+1
36) SIGRTMIN+2  37) SIGRTMIN+3  38) SIGRTMIN+4  39) SIGRTMIN+5
40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8  43) SIGRTMIN+9
44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13
52) SIGRTMAX-12 53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9
56) SIGRTMAX-8  57) SIGRTMAX-7  58) SIGRTMAX-6  59) SIGRTMAX-5
60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2  63) SIGRTMAX-1
64) SIGRTMAX

I often see these signals when running and debugging C programs with gdb.

Now let's look at message queues. A message queue provides a way to send a block of data from one process to another. Each block carries a type, and the receiving process can receive blocks of different types independently. Sending messages this way avoids the synchronization and blocking problems of named pipes; however, like a named pipe, each block has a maximum length limit.
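For illustration, here is a System V message queue sketch in a single process; in practice the sender and receiver would be two unrelated processes agreeing only on the key. The struct name demo_msg and the key derivation are my own choices:

```c
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct demo_msg {
    long mtype;              /* every block carries a type ...    */
    char mtext[64];          /* ... and a bounded payload          */
};

int main(void) {
    key_t key = ftok("/tmp", 'q');                 /* both sides derive the same key */
    int qid = msgget(key, IPC_CREAT | 0666);       /* create or open the queue */
    if (qid == -1) { perror("msgget"); return 1; }

    struct demo_msg m = { .mtype = 1 };
    strcpy(m.mtext, "hello via message queue");
    msgsnd(qid, &m, sizeof(m.mtext), 0);           /* enqueue one block */

    struct demo_msg r;
    msgrcv(qid, &r, sizeof(r.mtext), 1, 0);        /* receive blocks of type 1 only */
    printf("received: %s\n", r.mtext);

    msgctl(qid, IPC_RMID, NULL);                   /* remove the queue when done */
    return 0;
}
```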

Shared memory allows two unrelated processes to access the same logical memory; it is a very efficient way to share and pass data between two running processes. The memory shared by the different processes is usually the same piece of physical memory. A process can attach the shared segment into its own address space, and every attached process can then access the addresses inside it.
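A short System V shared-memory sketch (my own example): the parent creates a segment, the child attaches it into its own address space and writes, and the parent sees the data through its own mapping of the same physical memory.

```c
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    /* One 4 KiB segment of physical memory, attachable by several processes. */
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0666);
    if (shmid == -1) { perror("shmget"); return 1; }

    if (fork() == 0) {                       /* child */
        char *p = shmat(shmid, NULL, 0);     /* attach to the child's address space */
        strcpy(p, "written by the child");
        shmdt(p);
        return 0;
    }

    wait(NULL);                              /* let the child write first */
    char *p = shmat(shmid, NULL, 0);         /* attach to the parent's address space */
    printf("parent sees: %s\n", p);
    shmdt(p);
    shmctl(shmid, IPC_RMID, NULL);           /* mark the segment for removal */
    return 0;
}
```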

Semaphores: to avoid the problems caused by several programs accessing a shared resource at the same time, we need a way to grant access by issuing and consuming tokens, so that at any moment only one thread of execution is inside the critical section of the code. The critical section is the code that must run exclusively, such as code that updates shared data. A semaphore provides exactly this access mechanism: it allows only one thread into the critical section at a time, that is, it coordinates access to the shared resource.
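A minimal POSIX semaphore sketch, assuming Linux and compilation with -pthread; the semaphore is initialized to 1 (a single token), so only one thread at a time can be in the critical section. Removing the sem_wait/sem_post pair would let the two increments race and lose updates.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t sem;            /* the "token": initialised to 1, so one holder at a time */
static long counter = 0;     /* the shared resource */

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        sem_wait(&sem);      /* take the token before entering the critical section */
        counter++;           /* critical section: exclusive update of shared data   */
        sem_post(&sem);      /* return the token                                    */
    }
    return NULL;
}

int main(void) {
    pthread_t a, b;
    sem_init(&sem, 0, 1);                    /* 0 = shared between threads, value 1 */
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld\n", counter);      /* 200000 with the semaphore in place */
    sem_destroy(&sem);
    return 0;
}
```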

Socket: this communication mechanism lets client/server development happen either on a single local machine or across the network. A socket is characterized by a domain, a type, and a protocol. Put simply, a source IP address and port combined with a destination IP address and port identify a socket connection.

The following describes the communication flow, which involves some C functions. Don't worry about memorizing them, just get familiar with them; if you have studied NIO you will find they look very familiar.

For processes on different hosts to communicate you need sockets, created with the socket() function. In the client/server model, the server binds its socket to an address and port with bind(). After that it calls listen() to listen on the port, and when another program connects, the server calls accept() to accept the connection and serve it. The client calls connect() to establish a connection with the server; this is where the three-way handshake sets up the data link. Once connected, server and client can exchange data with read()/write(), send()/recv(), or sendto()/recvfrom(); the different function pairs differ in their roles and where they apply. When the transfer is done, either side calls close() to shut the connection down.
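The flow above, written out as a minimal TCP server sketch of my own (port 8080 and the one-shot echo behaviour are arbitrary choices); the client side would simply call socket() followed by connect() and then read()/write(). You can test it with something like `nc 127.0.0.1 8080`.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int srv = socket(AF_INET, SOCK_STREAM, 0);       /* domain, type, protocol */

    struct sockaddr_in addr = {0};
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(8080);              /* example port */

    if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        perror("bind");
        return 1;
    }
    listen(srv, 16);                                  /* start listening on the port */

    int cli = accept(srv, NULL, NULL);                /* blocks until a client connects */
    char buf[256];
    ssize_t n = read(cli, buf, sizeof(buf));          /* read whatever the client sent */
    if (n > 0)
        write(cli, buf, (size_t)n);                   /* echo it back once */

    close(cli);                                       /* tear down the connection */
    close(srv);
    return 0;
}
```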

  

That is the main content of this article: inter-process communication in the Linux kernel, which is the foundation for learning the NIO part of any higher-level language. The rest of the article introduces a few concepts that help round out the picture.

File handle: in file I/O, to read data from a file, an application must first ask the operating system to open the file, passing the file name or a path to it. The call returns a sequence number, the file handle, which uniquely identifies the opened file. A handle stands in for a file, device, socket, or pipe, so the program does not have to keep track of the underlying object itself, and it hides the complexity of the caches beneath it. To put it bluntly, it works like a file pointer.

File descriptor: the kernel accesses files through file descriptors. When an existing file is opened or a new file is created, the kernel returns a file descriptor, and reading or writing the file also goes through that descriptor. A file descriptor is a non-negative integer; in fact it is an index into the table the kernel maintains of the files opened by the process. Low-level programming is often organized around file descriptors, and the term is mainly used on UNIX, Linux, and similar operating systems. By convention, standard input is descriptor 0, standard output is 1, and standard error is 2.
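A tiny sketch of descriptors in action (the path /etc/hostname is only an example): since 0, 1, and 2 are already taken, open() typically returns 3, and every later read or write names the file through that integer.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* 0, 1 and 2 are already taken by stdin, stdout and stderr,
       so the kernel hands back the lowest free descriptor: usually 3. */
    int fd = open("/etc/hostname", O_RDONLY);   /* example path */
    if (fd == -1) { perror("open"); return 1; }
    printf("got file descriptor %d\n", fd);

    char buf[128];
    ssize_t n = read(fd, buf, sizeof(buf));     /* all reads go through the descriptor */
    if (n > 0)
        write(STDOUT_FILENO, buf, (size_t)n);   /* STDOUT_FILENO is descriptor 1 */

    close(fd);
    return 0;
}
```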

`/letv/apps/jdk/bin/java -DappPort=4 $JAVA_OPTS -cp $PHOME/conf:$PHOME/lib/* com.letv.mms.transmission.http.VideoFullServerBootstrap $1 $3 > /dev/null 2>&1 &`

If you have ever deployed a Java background service, the command above should look familiar. In `> /dev/null 2>&1`, the 2 is a file descriptor: standard output is redirected to /dev/null, and descriptor 2 (standard error) is then redirected to the same place.

The two concepts are close enough that you can treat them as one thing without splitting hairs. "Open files" includes file handles but is not limited to them: because almost everything in Linux exists in the form of a file, shared memory, semaphores, message queues, and memory-mapped files all count as open files even though they are not ordinary file handles. The Linux command for viewing the maximum number of open files allowed per process is `ulimit -n`.
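Here is a small sketch, my own addition, that reads (and tries to raise) the same limit that `ulimit -n` shows, via getrlimit()/setrlimit() with RLIMIT_NOFILE:

```c
#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    /* The same limit that `ulimit -n` reports: the per-process
       cap on open file descriptors. */
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) == -1) { perror("getrlimit"); return 1; }
    printf("soft limit: %llu\n", (unsigned long long)rl.rlim_cur);
    printf("hard limit: %llu\n", (unsigned long long)rl.rlim_max);

    /* A process may raise its soft limit up to the hard limit, which is
       one way to mitigate "too many open files". */
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rl) == -1) perror("setrlimit");
    return 0;
}
```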

 

All right, today's concepts have been introduced, so let's come back to the original problem: too many open files. That machine was in poor shape and its memory was nearly full, while the search engine works on index files and does a lot of I/O; the shared memory and memory-mapped files it needed could not be provided, so the error was thrown. A question that had nagged me for two years finally makes sense.

 

Question time:

Whenever I sneeze, I wonder who is missing me, even though I know the real reason is that I just walked into a dusty room or that catkins were floating in the air. Why is that?
