Five areas of Sybase database application system tuning2011/3/14/13:49 Source: HC IT network
Taking "the Internet banking system of a large commercial bank" as an example, this paper deals with five major areas of database application system tuning: Stress testing, application-side tuning, server-side tuning, optimization of system platform layer, optimization of application architecture, This paper introduces the problems and solutions of the author in the process of project development in detail. This paper discusses the best practice of the performance tuning of the enterprise-class Sybase database application system, which provides a general instructive reference for the work of this kind of nature.
1. Project Background and features
Based on the experience and lessons of the previous generation of online banking system, the application system has been completely redesigned and developed in 2008.
Technically, it shifted from the Java EE Technology architecture to the. NET architecture, greatly improving the application development efficiency, operational efficiency, improve the end-user experience (userexperience), business, at the same time launched a "personal network Silver", "Enterprise network Silver", "mobile banking", etc. So that the bank in the non-traditional business channels to achieve a leap-forward development, quickly catch up and surpass the peers, products, from the SYBASEASE12 series product upgrade to 15 series products, in order to take full advantage of its high performance, high reliability characteristics.
2. System Logic Architecture
, the system is divided into 3 levels, there are several key components--
Access Layer (Accesslayer): including firewalls (Firewall), Network Load Balancing devices (loadbalancer), etc.;
? application and Management (Application&managementlayer): includes static Web server (staticpageshttpservers), Multi-group Application Server (groupedapplicationservers), certificate Management Server (CA&LDAP), Sybase database server, various system management monitoring tools for different services;
Background access: Mainly access to a variety of back-end system Gateway (backendgateways), such as host transaction gateway.
Click here to view all news images
3. Business volume of the system
When measuring the business volume of online banking, there are usually two types of business indicators: one is the account transaction, that is, the transaction involving the core information of the user account (especially the balance of the account), such as "Transaction Details Inquiry", "Transfer", "payment", etc., the other is ancillary transactions such as "common linked account/payee management" Password management, financial product information, user-customized menu management, and more.
In this project, we describe the business volume of the system with the following coefficient relationship: If the "account Class volume" is 1, then the "auxiliary volume" is the total business volume of the 0.6,sybasease database system is 1.6. If not specifically indicated, this article mainly uses "the Account Class volume" ("The Total business volume") The expression way.
Several key points of the project are listed below:
When it was launched in July 2009, its business volume was about 3 million/day (4.8 million/day);
At the end of 2009, its business volume increased to 6 million/day (9.6 million/day);
At the beginning of May 2010, 10 million strokes/day (16 million/day) were reached;
At the beginning of October 2010, 15 million strokes/day (24 million/day) were reached;
The business pressure in the next 2 years is expected to be 30 million per day (48 million per day), which is likely to be reached by the end of 2011.
4. Five areas of Sybase database application system tuning
(1) Pressure test field
From the perspective of the industry, the author observes that before the application system goes online, there are usually different forms of functional testing, but the prevalence of stress tests is very low, and even with stress tests, it is often not possible to predict future system performance well. The objective result is that most large database application systems will have performance problems in the actual operation, and this bottleneck usually occurs prematurely, and the potential of the system hardware and software resources can not be fully exerted.
In addition to emphasizing the necessity of stress testing for large-scale systems, it is necessary to improve the methods of stress testing. I think that in the field of stress testing, we should pay attention to 3 key issues, namely "determination strategy of business indicators", "simulation strategy of Business Pressure" and "selection strategy of performance evaluation Index".
(2) Tuning of database application side
In order to tune the database application system, the best point is to start from the database application, and find out the application module/sql that the actual operation is inefficient.
Traditional positioning method is: In the application logic, or in the application module scheduling/master control program to add "control module/Statement" to measure each module/statement "in/out point in time", resulting in its execution time. This method is easy to understand, often used in the debugging phase of the system, and is often used for inter-layer isolation in a hierarchical application (layeredapplication) environment. However, the disadvantage is that this approach can add to the burden on the application development team and is not easily applied to the production system that is already on-line.
For performance issues with SQL modules/statements in database applications, we recommend tools that have a network sniffing (networksniffing) mechanism, such as the proactivedba of U.S. whitesands companies. It can be used for network monitoring, without affecting the application system, the calculation of each SQL statement/module "in/Out point in time", so that its execution length, find the part of the efficiency problem.
However, the above methods, which are sorted and filtered according to their actual execution time, do not necessarily identify all the SQL that might constitute a system bottleneck. There is another situation to note-some SQL, for various reasons (for example, it involves less data, simple logic, etc.), its individual execution time is not very long, very obscure, but its frequency of execution/total number of special, its cumulative memory I/O (Sybase called it logical I/O) is particularly much, Therefore, it consumes a lot of CPU resources. At this point the system usually behaves as CPU is busy and the overall throughput decreases.
In this case, the traditional means of measurement and control may be ineffective. The corresponding remedy is to make full use of the various monitoring system tables (Monitoringtables), also known as MDA tables (monitoringanddiagnosticstables), which have existed in sybasease and have been improved in version 15.
(3) Tuning of database server side
Database server-side tuning is the most important work of the database administrator (DBA), this article cannot and does not intend to repeat all aspects of Sybasease Server tuning, based on the experience in this online banking project, highlighting some key elements, especially the ASE15-related tuning skills.
3.1StatementCache tuning (especially traceflag757 and CPU busy issues)
Since the 12.5.2 version, ASE has added the Statementcache mechanism. As an extension of the traditional stored procedure (stotedprocedure) processing mechanism, it is used to hold the SQL text of the Ad hoc query (adhocsql), execution plan, in order to improve the execution efficiency of the same kind of SQL (reduce the re-compilation recompiling time of SQL).
The actual effect depends on the characteristics of the application system, some systems are not sensitive to this mechanism, some systems can get dozens of percent or even several times the performance improvement.
When this mechanism is enabled, it is generally recommended to enable the Enableliteralautoparam parameter at the same time so that similar SQL with the same statement body and only different parameters is classified as homogeneous, improving the efficiency of statementcache. The biggest lesson of this project comes from Statementcache's "size Configuration" and "Allocation policy", as well as the potential for system-level performance issues--CPU usage is high, but the cause is unclear.
3.2ProcedureCache Tuning
Since the 12.5.2 version, ASE has introduced the Statementcache mechanism and used it as part of the Procedurecache. The default procedurecache is no longer 20% of the entire cache, but is set by a separate server parameter procedurecachesize.
More procedurecache is needed than ASE12.5,ASE15. Because it employs a larger memory allocation unit, its redesigned query processing engine needs more memory to evaluate the new data access algorithms, and the ALLROWS_DSS and Allrows_mix optimize targets are also consuming more procedurecache than ALLROWS_OLTP, Not to mention the index statistics histogram (histograms), sorting Space (sortbuffers) and so on.
Therefore, although the procedurecache of ASE15 does not have to reach 20% of the entire cache in earlier versions, it is usually 2~6 times the procedurecache of ASE12.5.
(4) Tuning of the system platform layer
In recent years, the rapid development of the hardware industry has made these database administrators (DBAs) Less concerned about the system platform layer features such as memory size, network bandwidth, disk throughput, and so on, especially in the hardware environment dedicated to the database server.
It is a long time to relax and let us in the tuning of the project to take a detour!
(4.1) Tuning of SAN Storage (cpuspikes issues)
A number of "intermittent CPU utilization spike (Cpuspikes)" occurred over a period of time in this project. This is the same as the high CPU utilization caused by the Statementcache allocation policy: It can increase CPU utilization and affect the throughput of the whole system. The two also have differences: the former lasts long, with a high "objectmanagerspinlockcontention", the latter duration is short, "objectmanagerspinlockcontention" is not so high.
After we have ruled out the aforementioned SQL problem, statistical value problem, index problem, data type matching problem, Statementcache and so on, we have to enlarge the scope of the investigation--Simultaneously sampling the database server and disk system, shortening the sampling period, increasing the number of samples, comparing the normal time, Abnormal period 2 key indicators of the system.
We finally found that each "intermittent CPU utilization spike (Cpuspikes)" corresponds to the increase in the disk system "Averageservicetimes". That is, most disks are noticeably slower, about 33% of devices are twice times slower, and 11% are more than 10 times times slower! Correspondingly, the database "Logsemaphorecontention" jumped more than 20 times, "Plclatchcontention" jumped 13 times-fold!
As a result, although San (storageareanetwork) and LVM bring a lot of benefits, but also should be carefully planned. The most common is the "stripeeverythingeverywhere" of the storage design pattern, that is, all database objects are scattered on the logical volume (LV), the LV (stripping) is composed of several PV diskgroup, a SAN storage and I The/O channel is shared across multiple applications on multiple machines. The biggest disadvantage of this model is that multiple applications on multiple machines interact with each other through a shared San, making performance tuning difficult and disaster recovery (disasterrecovery) difficult. For very large database applications with high performance requirements, we recommend that you configure dedicated hardware, whether it is a SAN, a network, and so on.
(4.2) Tuning of NAS Storage
In recent years, NAS (network-attachedstorage) storage has been used more and more widely because it costs less, has easier file sharing, and requires fewer clients than a SAN.
I once had an unforgettable experience. 4 o'clock in the afternoon on a weekday, the customer's Sybase database system fails, the transaction log for 6G is about to be exhausted and cannot be purged, all data modification transactions are pending!
It turns out that Nas is different from San, DAS (directattachedstorage), it is not the host's direct attached storage device, and usually does not have the special high-speed network support of SAN, which is affected by network connectivity, stability, pressure, network performance. In high-availability, high-performance, large database applications, we do not recommend that NAS space participate in the direct operation of the database, Dump/load, Bcpin/out, let alone as a database device. Of course, 2-phase processing can be done flexibly, such as dump the database to a local or SAN, and then dump it on the NAS.
(5) Architectural level tuning
(5.1) Database/application-level split (theoretical multi-Library structure and real-world single-Library structure)
As we all know, Sybase is different from Oracle, and from the day it was born, it was possible to support multiple databases in a single server. The Sybase server can theoretically support multiple database applications in accordance with the Convention of one user database (userdatabase) application. With the rapid development of hardware technology and the re-popularity of centralized data centers, more and more users tend to run multiple database applications on an ASE server because it is easy to manage.
But there is always a two-sided, for the reliability, high performance requirements of large database applications, a single server after all, there is a reliability of mutual interference and performance bottlenecks (whether CPU, memory, or tempdb), not conducive to scheduling system maintenance windows, and further improve performance.
In fact, for the existing multi-machine systems for the OLTP-oriented relational database systems, the reliability benefits outweigh the performance gains. We recommend that: in order to achieve the best performance scalability, the database application system should first consider the multi-server/multi-database architecture at the application architecture level, that is, when necessary, a single application system should be able to easily split across multiple servers, which is the real scalability of the application architecture level!
(5.2) data (table) level splitting (sybasetext/image field and latchsleep problem)
Another major benefit of this project is the exploration of the aselatchsleep phenomenon and the methods found to improve the efficiency of sybasetext/image field modification. This is not much in the past work, but it is a long-term doubt, now summarized as follows.
Latch, also known as Spinlock, is an ASE for "internal data structures (such as Datacache, Object/indexmetadatacache.)" A concurrency access management mechanism to maintain the physical consistency of the page, only exist in the SMP environment. It is lightweight, non-transactional (lightweight&nontransactional).
Conclusion:
The significance of this paper is as follows: (1) The objective practice of the project proves once again that the Sybase/unix platform can be used to construct a national super-large database application system; (2) Large database application systems often need to be fully tuned and even make the necessary adjustments for high performance (e.g. application architecture) ; (3) The tuning work of large database application system needs not only theoretical knowledge, but also the experience of integrating theory into practice as described in this paper--"best Practice", which is commonly referred to abroad.
Five areas of Sybase database application system tuning