Abstract: In day-to-day development you run into bottlenecks of many kinds, from application performance to system limits. Have you ever cataloged them? This article classifies these system bottlenecks.
In Zen and the Art of Scaling - A Koan and Epigram Approach, Russell Sullivan offers an interesting conjecture: there are 20 classic system bottlenecks in software development. That sounds a lot like the idea that there are only 20 basic story plots. Depending on how you group things it may be true, but only in practice do we learn how much trouble these bottlenecks bring.
One day, Aurelien Broszniowski emailed me his list of bottlenecks. In the conversation that followed, I copied the list to Russell, who organized it.
"I wish I had seen this list when I was younger," said Russell. As you gain experience, take on more projects, solve more kinds of problems, and keep distilling the lessons, you will add more items to this list. So when you read it, you are looking back at fragments of those stories.
Database
- The working set exceeds the available RAM
- Long- and short-running queries
- Write conflicts
- Large joins consuming memory
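One way to picture the write-conflict item is optimistic version checking. The sketch below is a minimal in-memory illustration, not a real database API: `Row`, `WriteConflict`, and `update` are hypothetical names, and a simple version counter stands in for whatever conflict detection a given database actually uses.

```python
from dataclasses import dataclass


class WriteConflict(Exception):
    """Raised when another writer changed the row first (hypothetical name)."""


@dataclass
class Row:
    value: int
    version: int = 0  # incremented on every successful write


def update(row: Row, new_value: int, expected_version: int) -> None:
    # Reject the write if the row changed since the caller read it.
    if row.version != expected_version:
        raise WriteConflict(
            f"expected version {expected_version}, found {row.version}"
        )
    row.value = new_value
    row.version += 1
```

A caller that hits `WriteConflict` would typically re-read the row and retry, which is where the cost shows up under heavy contention.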
Virtualization
- Sharing an HDD (disk seek death)
- Network I/O fluctuations in the cloud
Programming
- Threads: deadlocks, debugging, non-linear scaling, etc.
- Event-driven programming: callback complexity, how to store state across function calls, etc.
- Lack of profiling, tracing, logging, etc.
- One component that cannot scale: single point of failure (SPOF), no horizontal scaling, etc.
- Stateful applications
- Bad design: the application runs fine only on the developer's own machine, or only when a handful of people test it (it was never stress tested).
- Algorithm complexity
- Dependent services, such as DNS lookups and anything else you may block on
- Stack space
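The deadlock item in the list above can be made concrete with a small sketch. It assumes a common mitigation, acquiring locks in a fixed global order, which the original list does not prescribe; `Account` and `transfer` are hypothetical names used only for illustration.

```python
import threading


class Account:
    def __init__(self, account_id: int, balance: int) -> None:
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src: Account, dst: Account, amount: int) -> None:
    if src is dst:
        return  # nothing to move, and locking the same Lock twice would hang
    # Always lock the account with the smaller id first, so two opposite
    # transfers can never each hold one lock while waiting for the other.
    first, second = sorted((src, dst), key=lambda a: a.account_id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount
```

Without the ordering step, `transfer(a, b, ...)` and `transfer(b, a, ...)` running at the same time could each take one lock and wait forever for the other.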
Disk
- Local disk access
- Random disk I/O (disk seeks)
- Disk fragmentation
- SSD performance drops once the data written exceeds the SSD's capacity.
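To illustrate the random disk I/O item, here is a minimal sketch that visits requested offsets in ascending order so the reads move forward through the file instead of jumping back and forth. `read_records` is a hypothetical helper; how much this helps depends on the device, the filesystem layout, and the OS page cache.

```python
def read_records(path, offsets, record_size):
    """Read fixed-size records at the given byte offsets.

    Offsets are visited in ascending order (each distinct offset once),
    but results are returned in the order the caller asked for them.
    """
    by_offset = {}
    with open(path, "rb") as f:
        for offset in sorted(set(offsets)):
            f.seek(offset)
            by_offset[offset] = f.read(record_size)
    return [by_offset[offset] for offset in offsets]
```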
OS
- Fsync flushing, Linux buffer cache filling up
- The TCP buffer is too small.
- File descriptor limits
- Power budget
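Two of the OS items above, file descriptor limits and TCP buffer sizes, can be inspected from a program. The sketch below assumes a Unix-like system (the standard `resource` module is not available on Windows); `fd_limits` and `default_tcp_buffers` are hypothetical helper names.

```python
import resource
import socket


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def default_tcp_buffers():
    """Return the (send, receive) buffer sizes in bytes of a fresh TCP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        send_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        recv_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return send_size, recv_size
```

Comparing the soft limit with the number of connections a server expects to hold open is a quick first check when connections start failing under load.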
Cache
- Not using memcached (the database gets hammered)
- HTTP: headers, ETags, and gzip compression not used
- Not making full use of browser cache
- Bytecode cache (such as PHP)
- L1/L2 caches: this is a major headache. Keep important, frequently accessed data in L1/L2. This touches many things: Snappy for network I/O, column databases that run algorithms directly on compressed data, and techniques that keep you from thrashing the TLB. The most important idea is to have a firm grasp of computer architecture: multi-core CPUs, L1/L2 caches, shared L3, NUMA RAM, the bandwidth and latency of data transfer from DRAM to the chip, DRAM caching disk pages and dirty pages, and TCP packets traveling through CPU <-> DRAM <-> NIC.
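The memcached item above comes down to not sending every read to the database. As a rough illustration, the sketch below shows the cache-aside pattern with a plain dictionary standing in for memcached; `CacheAside` and `load_from_db` are hypothetical names, and a real deployment would also need expiry and invalidation, which are left out here.

```python
class CacheAside:
    """Check the cache first; fall back to the database only on a miss."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # callable: key -> value
        self._cache = {}                   # stand-in for memcached

    def get(self, key):
        if key in self._cache:
            return self._cache[key]        # hit: no database round trip
        value = self._load_from_db(key)    # miss: one database query
        self._cache[key] = value
        return value
```

With this in place, repeated reads of a hot key cost one database query instead of one per request.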
CPU
- CPU overload
- Context switching: too many threads on a single core, bad luck with the Linux scheduler, too many system calls
- I/O wait: all CPUs wait at the same speed
- CPU caches: caching data is a fine-grained process. You need to find the right balance between letting multiple instances hold different values for the same data and the heavy synchronization needed to keep cached data consistent.
- Backplane throughput
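For the context-switching item above, one common mitigation is to cap the number of worker threads rather than start one per task. The sketch below assumes a pool sized to the CPU count as a starting point; `run_bounded` is a hypothetical helper, and in CPython the point is limiting scheduler pressure, not gaining CPU parallelism.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_bounded(tasks, max_workers=None):
    """Run zero-argument callables on a fixed-size pool, preserving order."""
    # Default to one worker per CPU; os.cpu_count() may return None.
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda task: task(), tasks))
```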
Network
- NIC maxed out, IRQ saturation, soft interrupts taking 100% of the CPU
- DNS lookups
- Packet loss
- Unexpected routes within the network
- Network disk access
- Shared SANs
- Server failure: no response from the service
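When a server stops answering, as in the last network item above, callers usually need a bounded retry rather than an indefinite wait. The following is a minimal sketch assuming exponential backoff and assuming that failures surface as `OSError` (which covers `ConnectionError` and `TimeoutError`); `call_with_retry` and its parameters are hypothetical.

```python
import time


def call_with_retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on OSError, wait and retry with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.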
Process
- Testing time
- Development time
- Team size
- Budget
- Code debt
Memory
- Out of memory: processes get killed, or the system goes into swap and grinds to a halt
- Disk thrashing caused by insufficient memory (related to swap)
- Memory library overhead
- Memory fragmentation (in Java, it leads to garbage collection pauses; in C, malloc calls become slow)
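One way to reduce allocator churn, which feeds the fragmentation and library-overhead items above, is to reuse a single preallocated buffer instead of allocating a new object for every read. The sketch below is only an illustration of that idea; `byte_sum` is a hypothetical function, and the actual benefit depends on the runtime's allocator.

```python
def byte_sum(stream, buffer_size=64 * 1024):
    """Sum all bytes of a binary stream using one reusable buffer."""
    buffer = bytearray(buffer_size)   # allocated once, reused for every read
    view = memoryview(buffer)         # slicing a memoryview does not copy
    total = 0
    while True:
        count = stream.readinto(buffer)
        if not count:                 # 0 (or None) means no more data
            break
        total += sum(view[:count])
    return total
```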