Linux server-based performance analysis and optimization (2)
I. How several typical applications use system resources
1.1 Web applications based on static content
The main feature of such applications is a large number of small files and frequent read operations. The web server is generally Apache or Nginx, because both HTTP servers process static resources quickly and efficiently. A single web server cannot support a large number of client accesses when concurrency is high, so a load-balanced cluster of multiple web servers is required. To make access even more efficient, you can also place a cache server at the front end, which caches static resource files in operating system memory so that they can be read directly. Because reading data from memory is much faster than reading it from a hard disk, a cache server in front of the web servers can greatly improve concurrent access performance. Common cache software includes Squid and Varnish.
Although a cache server improves access performance, it requires a large amount of memory. When system memory is sufficient, the random-read pressure on the disk is relieved. When memory is insufficient, the system falls back on virtual memory (swap), which increases disk I/O, and the increased disk I/O in turn raises CPU overhead.
Under highly concurrent access, another potential problem is the network bandwidth bottleneck. If client traffic is heavy and bandwidth is insufficient, the network becomes congested and access suffers. Network bandwidth must therefore also be considered when building a web-based network application.
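As a rough illustration of the trade-off described above, the following Python sketch keeps file contents in memory under a fixed byte budget and evicts the least recently used file once the budget is exceeded. The class name StaticFileCache, the loader callback, the fake disk dictionary, and the 100-byte budget are all assumptions made for the example. Real cache servers such as Squid or Varnish work very differently internally.

```python
from collections import OrderedDict


class StaticFileCache:
    """Keep recently read file contents in memory, bounded by a byte budget."""

    def __init__(self, max_bytes, loader):
        self.max_bytes = max_bytes      # memory budget for cached bodies
        self.loader = loader            # function(path) -> bytes, e.g. a disk read
        self.entries = OrderedDict()    # path -> bytes, least recently used first
        self.used_bytes = 0
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self.entries:
            self.entries.move_to_end(path)   # mark as most recently used
            self.hits += 1
            return self.entries[path]
        self.misses += 1
        body = self.loader(path)             # slow path: go to the disk
        if len(body) <= self.max_bytes:
            self.entries[path] = body
            self.used_bytes += len(body)
            # Evict least recently used files until the budget is met again.
            while self.used_bytes > self.max_bytes:
                _, evicted = self.entries.popitem(last=False)
                self.used_bytes -= len(evicted)
        return body


# A fake "disk" (hypothetical data) so the sketch runs anywhere.
disk = {"/a.css": b"x" * 40, "/b.js": b"y" * 40, "/c.png": b"z" * 40}
cache = StaticFileCache(max_bytes=100, loader=lambda p: disk[p])
for path in ["/a.css", "/b.js", "/a.css", "/c.png", "/b.js"]:
    cache.get(path)
print(cache.hits, cache.misses, list(cache.entries))
```

With a budget that holds only two of the three files, most requests in this access pattern miss the cache. This is the small-scale version of what happens when a cache server has too little memory.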
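A back-of-the-envelope calculation can show whether bandwidth is likely to become the bottleneck. The Python sketch below is only an estimate: it ignores protocol overhead, and the request rate, response size, and link speed are made-up figures chosen for illustration.

```python
def required_mbps(requests_per_second, avg_response_bytes):
    """Return the outbound bandwidth needed, in megabits per second."""
    bits_per_second = requests_per_second * avg_response_bytes * 8
    return bits_per_second / 1_000_000


def max_requests_per_second(link_mbps, avg_response_bytes):
    """Return how many responses per second a link of the given speed can carry."""
    return link_mbps * 1_000_000 / (avg_response_bytes * 8)


# Hypothetical workload: 2000 requests/s with 50 KB responses.
print(required_mbps(2000, 50_000))
# Hypothetical link: 100 Mbit/s serving the same 50 KB responses.
print(max_requests_per_second(100, 50_000))
```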
1.2 Web applications based on dynamic content
One feature of such applications is frequent write operations, and dynamic programs written in Java, PHP, Perl, or as CGI scripts consume a great deal of CPU. Because executing a dynamic program involves compiling code and reading from databases, both of which consume CPU resources, a server for dynamic web applications should have multiple high-performance CPUs, which will greatly improve the overall performance of the system.
When a dynamic web application is accessed under high concurrency, the system runs a large number of processes, so pay attention to load distribution. Too many processes consume a large amount of system memory. If memory is insufficient, virtual memory is used, and heavier use of virtual memory leads to frequent disk writes, which in turn consume CPU. We therefore need to strike a balance between hardware and software resources. On the hardware side, configure a large amount of memory and high-performance CPUs. On the software side, Memcached can be used to speed up access between programs and databases.
1.3 Database applications
The main feature of database applications is heavy consumption of memory and disk I/O, while CPU consumption is comparatively modest. The most basic practice is therefore to give the database server a large amount of memory and a fast disk array. For example, you can choose a RAID level such as RAID 5 or RAID 01 for the database server's disks. Separating the web server from the database server is also a common optimization. If the volume of client requests to the database is too large, you can also consider database load balancing, implemented in either software or hardware, to improve database access performance.
Tables that have grown too large can be split, that is, divided into multiple small tables that are then associated through indexes. This avoids the performance problems of querying a large table: when a table is too large, a query that traverses the whole table causes a sharp increase in disk reads and leaves read operations waiting. Complex query statements are another risk, because large numbers of WHERE clauses and ORDER BY or GROUP BY sorting operations can easily create a CPU bottleneck. Finally, large-volume or frequent updates can cause a surge in disk writes and a write bottleneck, so the program code should avoid them as well.
In day-to-day operation, another method that can significantly improve database server performance is read/write splitting. Reading from and writing to the same database at the same time is a very inefficient access pattern. A better practice is to set up, according to the read and write load, two database servers with identical structures, copy data at regular intervals from the server responsible for writes to the server responsible for reads, and let the two cooperate to improve overall system performance.
Caching can also improve database performance. A cache is a temporary in-memory container for database data or objects. Using one greatly reduces read operations against the database, because data is served from memory instead. For example, you can add a data cache layer between the web server and the database server that keeps copies of frequently requested objects in system memory, so programs can obtain data without accessing the database. The widely used Memcached is based on this principle.
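One simple way to split a table is to route each row to one of several small tables based on a key. The Python sketch below uses the standard sqlite3 module and an in-memory database purely for illustration. The table name orders, the choice of four small tables, and routing by user_id modulo 4 are assumptions, not something the text prescribes.

```python
import sqlite3

NUM_TABLES = 4  # assumed number of small tables


def table_for(user_id):
    """Map a user id to the small table that holds its rows."""
    return f"orders_{user_id % NUM_TABLES}"


conn = sqlite3.connect(":memory:")
for i in range(NUM_TABLES):
    conn.execute(f"CREATE TABLE orders_{i} (user_id INTEGER, item TEXT)")
    conn.execute(f"CREATE INDEX idx_orders_{i}_user ON orders_{i} (user_id)")


def add_order(user_id, item):
    conn.execute(f"INSERT INTO {table_for(user_id)} VALUES (?, ?)", (user_id, item))


def orders_of(user_id):
    # Only one small, indexed table is consulted, never the whole data set.
    rows = conn.execute(
        f"SELECT item FROM {table_for(user_id)} WHERE user_id = ? ORDER BY rowid",
        (user_id,),
    )
    return [item for (item,) in rows]


add_order(7, "disk")
add_order(7, "memory")
add_order(8, "cpu")
print(table_for(7), orders_of(7))
```

The table names are built only from integers computed in the code, so the string formatting here does not expose the queries to user-supplied SQL.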
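The routing logic behind read/write splitting can be sketched in a few lines. In the Python example below, two in-memory SQLite databases stand in for the write server and the read server, and a sync() method stands in for the periodic copy. The class name ReadWriteRouter and the rule "statements starting with SELECT are reads" are simplifying assumptions. A production setup would rely on the database's own replication instead.

```python
import sqlite3


class ReadWriteRouter:
    """Send writes to one database and reads to a periodically refreshed copy."""

    def __init__(self):
        self.writer = sqlite3.connect(":memory:")   # stands in for the write server
        self.reader = sqlite3.connect(":memory:")   # stands in for the read server

    def execute(self, sql, params=()):
        is_read = sql.lstrip().upper().startswith("SELECT")
        target = self.reader if is_read else self.writer
        cursor = target.execute(sql, params)
        rows = cursor.fetchall()
        if not is_read:
            self.writer.commit()
        return rows

    def sync(self):
        """Copy the writer's data to the reader (meant to run at regular intervals)."""
        self.writer.backup(self.reader)


router = ReadWriteRouter()
router.execute("CREATE TABLE users (name TEXT)")
router.sync()
router.execute("INSERT INTO users VALUES (?)", ("alice",))
stale = router.execute("SELECT name FROM users")   # reader not refreshed yet
router.sync()
fresh = router.execute("SELECT name FROM users")
print(stale, fresh)
```

The stale read before the second sync() shows the cost of copying at intervals: the read server lags behind the write server until the next copy completes.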
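The data cache layer described above is often implemented as a "cache-aside" lookup: check memory first and query the database only on a miss. The Python sketch below uses a plain dictionary as a stand-in for Memcached, so no client library is needed. The names CacheAside and fetch_user, and the 60-second expiry, are hypothetical.

```python
import time


class CacheAside:
    """Serve values from memory and fall back to the database on a miss."""

    def __init__(self, fetch_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self.fetch_from_db = fetch_from_db
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.store = {}   # key -> (expires_at, value); a stand-in for Memcached

    def get(self, key):
        entry = self.store.get(key)
        if entry is not None and entry[0] > self.clock():
            return entry[1]                      # served from memory
        value = self.fetch_from_db(key)          # cache miss: query the database
        self.store[key] = (self.clock() + self.ttl_seconds, value)
        return value

    def invalidate(self, key):
        """Drop a cached copy, for example after the row has been updated."""
        self.store.pop(key, None)


db_calls = []   # records every simulated database query


def fetch_user(user_id):
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user{user_id}"}


cache = CacheAside(fetch_user)
first = cache.get(1)     # miss: goes to the database
second = cache.get(1)    # hit: served from memory
cache.invalidate(1)
third = cache.get(1)     # miss again after invalidation
print(len(db_calls))
```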
1.4 Software download applications
Static resource download servers are characterized by high bandwidth consumption and high storage performance requirements. When download volume is extremely high, multiple servers at multiple sites can share the download load. For the HTTP server, from the standpoint of high performance and fewer deployed servers, we recommend Lighttpd over the traditional Apache. The reason is that Apache uses blocking I/O, so its performance is relatively poor and its concurrency is limited, whereas Lighttpd uses asynchronous I/O and handles concurrent downloads far better than Apache.
1.5 Streaming media service applications
Streaming media is mainly used in video conferencing, video on demand, distance education, and online live broadcasting. The main performance bottlenecks of such applications are network bandwidth and storage system bandwidth (mainly read operations). With a massive number of users, the primary problems for streaming media applications are how to ensure that users receive high-definition, smooth pictures and how to save as much network bandwidth as possible.
To optimize a streaming media server, consider the storage policy, the transmission policy, the scheduling policy, the proxy server cache policy, and the architecture of the streaming media server itself. For storage, optimize the video encoding format to save space and improve storage performance. For transmission, intelligent stream technology can control the transmission rate so that video playback is as smooth as possible. For scheduling, static and dynamic scheduling can be used. For proxy servers, cache management policies such as segment caching and dynamic caching can be used. In the server architecture, memory pool and thread pool techniques can reduce the performance impact of memory consumption and excessive threads.
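The benefit of asynchronous I/O can be illustrated with Python's asyncio module. This is a sketch of the general idea only, not of how Lighttpd is implemented. The function names, the 200 simulated clients, and the 0.05-second wait are assumptions, and asyncio.sleep stands in for time spent waiting on the network or disk.

```python
import asyncio
import time


async def serve_download(client_id, io_wait=0.05):
    # While this download waits on (simulated) I/O, the event loop
    # runs the other downloads instead of blocking the thread.
    await asyncio.sleep(io_wait)
    return client_id


async def serve_all(num_clients):
    tasks = [serve_download(i) for i in range(num_clients)]
    return await asyncio.gather(*tasks)


start = time.perf_counter()
served = asyncio.run(serve_all(200))
elapsed = time.perf_counter() - start
print(f"served {len(served)} downloads in {elapsed:.2f}s on one thread")
```

If each wait blocked the thread, 200 downloads at 0.05 seconds each would take about 10 seconds. Because the waits overlap, the whole batch finishes in a small fraction of that time.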
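The memory pool and thread pool ideas mentioned above can be sketched with Python's standard library. Here a fixed set of worker threads handles all segments, and each worker borrows a preallocated buffer instead of allocating a new one per segment. The pool sizes, the 64 KB buffer size, and the send_segment function are hypothetical values chosen for illustration.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

POOL_THREADS = 4          # assumed number of worker threads
POOL_BUFFERS = 4          # assumed number of preallocated buffers
BUFFER_SIZE = 64 * 1024   # assumed segment buffer size in bytes

# Memory pool: buffers are allocated once and reused for every segment.
buffer_pool = queue.Queue()
for _ in range(POOL_BUFFERS):
    buffer_pool.put(bytearray(BUFFER_SIZE))

threads_seen = set()
lock = threading.Lock()


def send_segment(segment_no):
    buf = buffer_pool.get()            # borrow a buffer (waits if none is free)
    try:
        buf[0:4] = segment_no.to_bytes(4, "big")   # pretend to fill the segment
        with lock:
            threads_seen.add(threading.get_ident())
        return int.from_bytes(buf[0:4], "big")
    finally:
        buffer_pool.put(buf)           # return the buffer for the next segment


# Thread pool: 100 segments are handled by at most POOL_THREADS threads.
with ThreadPoolExecutor(max_workers=POOL_THREADS) as pool:
    sent = list(pool.map(send_segment, range(100)))

print(len(sent), len(threads_seen) <= POOL_THREADS, buffer_pool.qsize())
```

Both pools cap resource use: the number of threads and the amount of buffer memory stay constant no matter how many segments are served.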