I. Analysis Phase
In the system analysis phase there are many areas of concern: functionality, availability, reliability, and security requirements tend to attract most of our attention. But we must note that performance is also an important non-functional requirement, and we must determine, based on the characteristics of the system, its real-time requirements, response-time requirements, hardware configuration, and so on. It is best to have quantifiable indicators for every requirement.
On the other hand, the type of the system should be determined from its requirements during the analysis phase, distinguishing between OLTP (online transaction processing) systems and OLAP (online analytical processing) systems.
II. Design Phase
The design phase largely determines the system's later performance, because this is when database design takes place, and database design touches almost every part of the performance-tuning process.
Once the database design is complete, a preliminary index design can be carried out. Good index design guides the coding phase toward efficient code and lays a solid foundation for the performance of the whole system.
Here is what needs attention in the design phase with respect to performance:
1. Normalization of the database logical design
The normalization of the database logical design is what we usually call the normal forms, which can be understood simply as follows:
First normal form (1NF): no repeating groups or multivalued columns. This is the minimum requirement for database design.
Second normal form (2NF): every non-key field must depend on the whole primary key and must not depend on only part of a composite primary key. This eliminates partial dependencies; in most cases, a database design should reach second normal form.
Third normal form (3NF): a non-key field must not depend on another non-key field. This eliminates transitive dependencies; most tables in a system should reach third normal form, apart from a few special tables.
Higher normal forms are not introduced here. In my view, if the whole design reaches second normal form and most of it reaches third normal form, the system will have fewer columns and more tables, which reduces data redundancy and also benefits performance.
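As a quick illustration, here is a sketch (with hypothetical tables) of eliminating a transitive dependency to reach third normal form: dept_name depends on dept_id, which is itself a non-key field, so it moves into its own table.
-- before: EMPLOYEE (emp_id, emp_name, dept_id, dept_name) -- dept_name depends on dept_id
-- after: the transitively dependent field moves to its own table
CREATE TABLE DEPARTMENT (
    dept_id INT NOT NULL PRIMARY KEY,
    dept_name VARCHAR(50) NOT NULL
)
CREATE TABLE EMPLOYEE (
    emp_id VARCHAR(20) NOT NULL PRIMARY KEY,
    emp_name VARCHAR(50) NOT NULL,
    dept_id INT NOT NULL REFERENCES DEPARTMENT (dept_id)
)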
2. Reasonable redundancy
It is almost impossible to design a system entirely according to the normal forms unless the system is especially small; after the normalized design, redundancy needs to be added systematically.
Redundancy can take the form of a redundant database, a redundant table, or a redundant field, and different granularities of redundancy play different roles.
Redundancy may be added for ease of programming or to improve performance. From the performance point of view, a redundant database can disperse database load, a redundant table can disperse the concurrent pressure on a large table and also speed up special queries, and a redundant field can effectively reduce table joins and improve efficiency.
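For example, a minimal sketch (hypothetical ORDERS and CUSTOMER tables) of a redundant field that eliminates a join:
ALTER TABLE ORDERS ADD cust_name VARCHAR(50) NULL   -- redundant copy of CUSTOMER.cust_name
-- list queries no longer need to join CUSTOMER:
SELECT order_id, order_date, cust_name FROM ORDERS
-- the application (or a trigger) must keep the redundant copy consistent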
3. Design of the primary key
A primary key is necessary. In SQL Server the primary key is at the same time a unique index, and in practice we tend to choose the smallest candidate key as the primary key, so the primary key is often suitable as the table's clustered index. The clustered index has a relatively large impact on queries, as described in the index section below.
In a table with multiple candidate keys, the choice of primary key also matters. Generally choose the key with the smallest total length: a small key is faster to compare, and it also gives the primary key's B-tree structure fewer levels.
Also pay attention to the field order of a composite primary key: for composite primary keys, different field orders can differ greatly in performance. In general, put fields with a low repetition rate, and fields likely to be queried alone or in combination, in front.
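A minimal sketch (hypothetical table) of these rules, with the low-repetition field that is also queried alone placed first in the composite primary key:
CREATE TABLE ORDER_DETAIL (
    order_id INT NOT NULL,       -- often queried alone, low repetition rate: goes first
    line_no SMALLINT NOT NULL,
    item_id INT NOT NULL,
    qty INT NOT NULL DEFAULT 1,
    CONSTRAINT pk_order_detail PRIMARY KEY CLUSTERED (order_id, line_no)
)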
4. Design of foreign keys
Many people regard foreign keys, as database objects, as more trouble than they are worth. In fact, foreign keys are very useful in most cases, because:
A foreign key is the most efficient way to maintain consistency. Database consistency requirements can be met, in turn, by foreign keys, CHECK constraints, rule constraints, triggers, or client programs; it is generally believed that the closer a method is to the data, the more efficient it is.
Use cascading deletes and cascading updates with caution. Cascading deletes and cascading updates were new features in SQL Server 2000 and were retained in 2005, so they should be dependable. I say "with caution" because cascading deletes and updates break through the traditional definition of a foreign key, and their behavior is a bit too powerful: before using them you must be sure you have mastered their scope of effect, otherwise cascading deletes and updates may modify or delete your data without you noticing. In performance terms, cascading deletes and updates are more efficient than the other methods.
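A minimal sketch (hypothetical CATEGORY and PRODUCT tables) of a foreign key with cascading behavior declared explicitly:
CREATE TABLE CATEGORY (
    cat_id INT NOT NULL PRIMARY KEY,
    cat_name VARCHAR(50) NOT NULL
)
CREATE TABLE PRODUCT (
    prod_id INT NOT NULL PRIMARY KEY,
    cat_id INT NOT NULL,
    CONSTRAINT fk_product_category FOREIGN KEY (cat_id) REFERENCES CATEGORY (cat_id)
        ON DELETE CASCADE    -- deleting a category silently deletes its products
        ON UPDATE CASCADE    -- renumbering cat_id propagates to PRODUCT
)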
5. Design of fields
A field is the most basic unit of a database, and its design has a great impact on performance. Note the following (a table sketch applying these points appears after the list):
A, use numeric data types wherever possible; numeric types compare much faster than character types.
B, make data types as small as possible; "small" here means as small as they can be while still satisfying foreseeable future needs.
C, try not to allow NULL; unless it is truly necessary, use NOT NULL plus a DEFAULT instead.
D, use TEXT and IMAGE sparingly; reading and writing binary fields is relatively slow, there are few methods for reading them, and in most cases they are best avoided.
E, use identity (auto-increment) fields with caution, since they are not conducive to data migration.
6. Design of the database's physical storage and environment
In the design phase we can plan the physical storage, operating-system environment, and network environment of the database, so that the system can later cope with more concurrent users and a larger data volume.
Pay attention here to the role of filegroups. Applying filegroups can effectively spread I/O operations across different physical hard disks and improve concurrency.
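A minimal sketch (hypothetical database name and file paths) of spreading I/O with a filegroup placed on a second physical disk:
ALTER DATABASE SalesDB ADD FILEGROUP FG_DATA2
ALTER DATABASE SalesDB ADD FILE (
    NAME = 'SalesDB_Data2',
    FILENAME = 'E:\Data\SalesDB_Data2.ndf',   -- on a different physical disk
    SIZE = 100MB
) TO FILEGROUP FG_DATA2
-- place a heavily accessed table on the new filegroup
CREATE TABLE ORDER_ARCHIVE (
    order_id INT NOT NULL PRIMARY KEY,
    payload VARCHAR(100) NOT NULL
) ON FG_DATA2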
7. System design
The design of the whole system, especially the architecture, has a great influence on performance. For a typical OLTP system we can choose a client/server structure, a three-tier client/server structure, and so on; the performance-critical points of different architectures also differ.
In the system design phase, some business logic should be placed in database programming, which includes stored procedures, triggers, and functions. The benefit of implementing business logic with database programming is reduced network traffic, together with the database's precompilation and caching capabilities.
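A minimal sketch (hypothetical stock-transfer logic) of moving business logic into a stored procedure, so that several statements run in one network round trip:
CREATE PROCEDURE usp_TransferStock
    @from_store INT, @to_store INT, @item_id INT, @qty INT
AS
BEGIN
    SET NOCOUNT ON
    BEGIN TRANSACTION
    UPDATE STOCK SET qty = qty - @qty WHERE store_id = @from_store AND item_id = @item_id
    UPDATE STOCK SET qty = qty + @qty WHERE store_id = @to_store AND item_id = @item_id
    COMMIT TRANSACTION
END
-- one call from the client instead of several separate statements:
-- EXEC usp_TransferStock @from_store = 1, @to_store = 2, @item_id = 100, @qty = 5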
8. Design of indexes
In the design phase you can carry out a preliminary index design based on the functional and performance requirements. Here the indexes are designed from the estimated data volume and queries, which may differ from actual use later.
When selecting indexes, pay attention to the following:
A, decide which tables need indexes according to the data volume; tables with little data can get by with only a primary key.
B, decide which fields need indexes according to usage frequency; fields frequently used as join conditions, filter conditions, aggregation targets, and sort keys are the candidates for indexing.
C, combine fields that often appear together into composite indexes. The field order of a composite index matters, just as with a composite primary key: put the most frequently used fields and the fields with low repetition rates in front (see the sketch after this list).
D, do not add too many indexes to one table, because indexes slow down inserts and updates.
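The sketch referenced above (hypothetical ORDERS table), following points B and C:
-- frequent query pattern: filter by customer, then sort by order date
CREATE INDEX ix_orders_customer_date
    ON ORDERS (customer_id, order_date)   -- low-repetition, most-used field first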
III. Coding Phase
The coding phase is the focus of this article, because once the design is fixed, the quality of the coding almost determines the quality of the whole system.
The first requirement of the coding phase is that every programmer have performance awareness, that is, keep performance in mind while implementing functionality. The database is a tool for set-based operations, and we should use it as such: set-based operation essentially means batch operation, processing large amounts of data with SQL statements or stored procedures rather than row by row. This kind of thinking and awareness is hard to spell out precisely; it has to be accumulated through programming experience.
Here is a list of things to watch for while programming:
1. Return only the data you need
Returning data to the client involves at least extracting the data in the database, transmitting it over the network, receiving it at the client, and processing it at the client. Returning unneeded data adds useless work for the server, the network, and the client, and the harm is obvious. To avoid it, note:
A, horizontally: do not write SELECT * statements; select only the fields you need.
B, vertically: write a reasonable WHERE clause, and do not write SQL statements without one.
C, pay attention to the WHERE clause of SELECT INTO. SELECT INTO inserts data into a temporary table and locks some system tables while doing so; if the WHERE clause returns too much data or runs too slowly, the system tables can stay locked for a long time and block other processes.
D, for aggregate queries, you can further restrict the returned rows with a HAVING clause, as in the sketch below.
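The sketch referenced above (hypothetical ORDERS table), combining points A, B, and D:
SELECT customer_id, COUNT(*) AS order_count   -- A: explicit columns, no SELECT *
FROM ORDERS
WHERE order_date >= '2005-01-01'              -- B: restrict rows before aggregating
GROUP BY customer_id
HAVING COUNT(*) > 10                          -- D: restrict the aggregated result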
2. Do as little repetitive work as possible
Like the previous point, this is about minimizing useless work, but the focus here is the client program. Note the following:
A, avoid executing the same statement multiple times, especially repeated executions over the same base data; many programmers rarely notice this.
B, reduce repeated data conversions. The need for conversion may be a design problem, but reducing the number of conversions is something programmers can do.
C, eliminate unnecessary subqueries and joined tables. A subquery is generally interpreted as an outer join, and every redundant joined table brings extra overhead.
D, merge multiple UPDATE statements that hit the same table under the same condition. For example:
UPDATE EMPLOYEE SET fname = 'haiwer' WHERE emp_id = 'vpa30890f'
UPDATE EMPLOYEE SET lname = 'yang' WHERE emp_id = 'vpa30890f'
These two statements should be merged into the following single statement:
UPDATE EMPLOYEE SET fname = 'haiwer', lname = 'yang'
WHERE emp_id = 'vpa30890f'
E, do not split an UPDATE operation into a DELETE plus an INSERT; the result is the same, but the performance difference is very large.
F, do not write meaningless queries, such as:
SELECT * FROM EMPLOYEE WHERE 1 = 2
3. Pay attention to transactions and locks
Transactions are an important tool of database applications. They have four properties: atomicity, consistency, isolation, and durability, and many operations need transactions to guarantee data correctness. When using transactions we need to avoid deadlocks and reduce blocking as much as possible. Pay particular attention to the following:
A, keep the body of a transaction as small as possible; if a transaction can be split apart, split it.
B, never interact with the user during a transaction, because the transaction does not end while it waits for the interaction, and many resources may stay locked.
C, always access objects in the same order within transactions.
D, improve the efficiency of each statement in a transaction; using indexes and other methods to speed up individual statements effectively shortens the execution time of the whole transaction.
E, try not to specify lock types and indexes by hand. SQL Server allows us to specify the lock type and index a statement uses, but in general the lock type and index chosen by the SQL Server optimizer are optimal for the current data volume and query; what we specify may only be right for the current situation, while data volume and data distribution will change in the future.
F, you can read at a lower isolation level, in particular READ UNCOMMITTED, especially for report queries (see the sketch after this list).
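The sketch referenced in point F (hypothetical report query):
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED   -- the report tolerates dirty reads
SELECT customer_id, SUM(amount) AS total_amount
FROM ORDERS
GROUP BY customer_id
SET TRANSACTION ISOLATION LEVEL READ COMMITTED     -- restore the default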
4. Pay attention to the use of temporary tables and table variables
In complex systems, temporary tables and table variables are hard to avoid. When using them, note the following:
A, if a statement is complex and has too many joins, consider completing it step by step with temporary tables or table variables.
B, if you need to use the same subset of a large table more than once, consider staging that subset in a temporary table or table variable.
C, if you need to combine data from multiple tables into one result, consider using a temporary table or table variable to accumulate the data from those tables step by step.
D, in other cases, the use of temporary tables and table variables should be restrained.
E, on the choice between temporary tables and table variables: many sources say a table variable lives in memory and is fast, so it should be preferred; in actual use, the choice mainly depends on how much data will go into it, and when the data volume is large, a temporary table is faster (see the sketch below).
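A minimal sketch (hypothetical ORDERS table) contrasting the two choices in point E:
-- table variable: fine for small intermediate result sets
DECLARE @recent TABLE (order_id INT, amount MONEY)
INSERT INTO @recent
SELECT order_id, amount FROM ORDERS WHERE order_date >= '2005-01-01'
-- temporary table: usually faster once the intermediate data gets large
SELECT order_id, amount INTO #all_orders FROM ORDERS
CREATE INDEX ix_all_orders_amount ON #all_orders (amount)   -- temp tables can be indexed
DROP TABLE #all_orders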