Beyond Your Code: Three Directions for Tracing Latency in Service Systems (Part 1)

Source: Internet
Author: User

Service systems generally have strict timeouts. Helping business teams troubleshoot those latency spikes and slow responses is one of the services offered by the infrastructure department's resident experts.

Sometimes, even when your code is doing its best, it still responds slowly, because it's a tough world out there. This article gives some examples from three directions.

The first direction is mainly a warm-up; for the two more interesting ones, see Part 2.

The first direction: the operating system

Background reading: the swap and page cache sections of "Reviewing Efficient File Reading and Writing from Apache Kafka".

1. Disable Swap

Linux has a strange habit: when memory runs short, then depending on its mood there is a good chance it will not reclaim the page cache used for IO caching, but will instead page cold application memory out to disk (see the background reading for the exact algorithm). When that memory is accessed again, the process stalls while it is brought back into memory (a so-called major page fault). A slowly growing old generation or an off-heap memory pool may well be treated as cold memory. Use cat /proc/[pid]/status to see the size of VmSwap, then use dstat to monitor when page-ins occur.
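As a rough illustration of the VmSwap check above, here is a minimal Python sketch that pulls the VmSwap figure out of the text of /proc/[pid]/status. The helper names and the sample status text are made up for this example; only the VmSwap field and the file path come from the article.

```python
import re


def parse_vmswap_kb(status_text):
    """Return the VmSwap value in kB from /proc/[pid]/status text, or None if the field is absent."""
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vmswap_kb(pid):
    """Read VmSwap for a live process (Linux only; hypothetical helper)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmswap_kb(f.read())


# A trimmed, made-up sample of /proc/[pid]/status, for illustration only.
SAMPLE_STATUS = "Name:\tjava\nVmRSS:\t 8123456 kB\nVmSwap:\t  204800 kB\nThreads:\t312\n"

if __name__ == "__main__":
    print(parse_vmswap_kb(SAMPLE_STATUS))
```

A non-zero value means part of the process is sitting in swap; whether that actually causes stalls still has to be confirmed by watching page-in activity.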

Put the following line in /etc/sysctl.conf and swapping is basically eliminated. Setting it to 0 can lead to OOM (the original post links to a case); some people set it to 1, which is fine if you prefer it.

vm.swappiness = 10

2. Speed up the page cache flush frequency

This is another odd Linux default. The Linux page cache mechanism is a long story (again, see the background reading). Simply put, by default IO does not write straight to disk but into page cache memory. Only once an inode has been dirty for 30 seconds, or dirty data reaches 10% of available memory (free + page cache - mmap), does the flusher thread start writing to disk.

Our production machines have at least 20 GB of memory. Imagine how long an ordinary hard disk at the 100 MB/s level takes to write out 2 GB.... Fortunately, that condition is generally not reached. Usually a few log files trigger the flush at the 30-second mark, writing a few hundred MB at a time, which takes about three seconds.
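The arithmetic behind those numbers can be sketched in a few lines of Python. This assumes, as an upper bound, that all 20 GB counts as available memory and that the disk sustains a flat 100 MB/s; both are simplifications, and the function name is made up for this example.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def flush_seconds(dirty_bytes, disk_bytes_per_sec):
    """Time to write out a given amount of dirty data at a constant disk throughput."""
    return dirty_bytes / disk_bytes_per_sec


# Assumption: the whole 20 GB is "available", so the 10% threshold is 2 GB.
threshold_bytes = 20 * GIB * 10 // 100
disk_speed = 100 * MIB  # assumed sustained sequential write speed

worst_case = flush_seconds(threshold_bytes, disk_speed)  # hitting the 10% threshold
typical = flush_seconds(300 * MIB, disk_speed)           # "a few hundred MB" taken as 300 MB

if __name__ == "__main__":
    print(f"worst case: {worst_case:.2f} s, typical: {typical:.2f} s")
```

Under these assumptions the worst case is on the order of 20 seconds of continuous writing, against roughly 3 seconds for the typical 30-second trigger.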

The article says that the background flusher thread does not block the application's write(2). However, the application's write path is:
lock inode -> lock page -> write page -> unlock page -> unlock inode -> lock inode page -> write inode page -> unlock inode page

And the flusher's path is:
lock page -> put the page into the IO queue and wait for the IO scheduler to finish writing it to disk -> unlock page

As you can see, locks are still involved, and the IO scheduler is not absolutely fair, so when IO is busy the application's writes can still block.

Our approach is to use an absolute value of 100 MB as the threshold instead of a percentage of available memory.

Add this to /etc/sysctl.conf:

vm.dirty_background_bytes = 104857600
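For reference, 104857600 is simply 100 x 1024 x 1024 bytes. A small Python sketch for producing the line for other sizes, and for reading back the value currently in effect, might look like the following. The function names are made up for this example; /proc/sys/vm/dirty_background_bytes is the standard procfs view of this sysctl.

```python
def dirty_background_line(megabytes):
    """Build the sysctl.conf line for a threshold given in MB (1 MB = 1024 * 1024 bytes here)."""
    return f"vm.dirty_background_bytes = {megabytes * 1024 * 1024}"


def current_dirty_background_bytes(path="/proc/sys/vm/dirty_background_bytes"):
    """Read the value currently in effect (Linux only). 0 means the ratio-based setting is in use."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    print(dirty_background_line(100))
```

Note that the byte-based and percentage-based background thresholds are alternatives: once the byte value is set, the kernel reports the ratio as 0, and vice versa.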

The JVM pause examples in Part 2 are all IO-related. Even without tuning the JVM, simply lowering this threshold alleviates them considerably.

Of course, which value is optimal has to be worked out from the machine configuration and the characteristics of the application.

3. Network Parameters

There are too many configurable places here. You can refer to the Aliyun team's good article summarizing the Linux TCP queue parameters. But as always, do not start changing settings just because you read an article; base them on your own situation.

For example, we set the CPU affinity of the network card's soft interrupt queues:

Normally the network card's interrupts may be handled by only one core, and under heavy traffic that core gets saturated.
Running irqbalance, it still used only one CPU, that is, 12 cores.
In the end we mapped the 24 network card interrupt queues to the 24 cores ourselves, which gave the best result... but your situation is not necessarily the same.
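A one-queue-per-core mapping like this is typically expressed as a hexadecimal CPU bitmask written to /proc/irq/<irq>/smp_affinity. The Python sketch below only computes the masks; the IRQ numbers 64 to 87 are hypothetical placeholders (the real ones are listed in /proc/interrupts), and the function names are made up for this example.

```python
def affinity_mask(core):
    """Hex bitmask selecting a single core, as written to /proc/irq/<irq>/smp_affinity.

    Sufficient for cores 0-31; larger systems use comma-separated 32-bit groups,
    which this sketch does not handle.
    """
    return format(1 << core, "x")


def plan_affinity(irqs):
    """Map the i-th interrupt queue to core i, one queue per core."""
    return {irq: affinity_mask(core) for core, irq in enumerate(irqs)}


# Hypothetical IRQ numbers for the 24 queues; look up the real ones in /proc/interrupts.
HYPOTHETICAL_IRQS = list(range(64, 88))

if __name__ == "__main__":
    for irq, mask in plan_affinity(HYPOTHETICAL_IRQS).items():
        # Applying a mask means writing it to the file below as root.
        print(f"/proc/irq/{irq}/smp_affinity <- {mask}")
```

If irqbalance is left running it may overwrite manual settings, so a setup like this usually goes together with stopping it or excluding these IRQs.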
