If you are in charge of a SQL Server-based project, or you have just been in touch with SQL Server, you may have to face some database performance problems, this article Article It will provide you with some useful guidance (most of which can also be used for other DBMS ).
Here, I am not going to introduce the tips for using SQL Server, nor can I provide a solution to cure all diseases. What I do is to sum up some experience-about how to form a good design. These experiences have come from the lessons I have learned over the past few years and I have seen many design errors repeated over and over again.
1. Understand your tools
Don't underestimate this. This is the most important one I have mentioned in this article. You may also see a lot of SQL ServerProgramThe staff did not master all of the T-SQL commands and those useful tools provided by SQL Server.
"What? I will waste a month to learn the SQL commands that I will never use ???", You may say this. Yes, you don't need. But you should use a weekend to browse all the T-SQL commands. Here, your task is to understand. In the future, when you design a query, you will remember: "Yes, here is a command that can fully implement the functions I need ", then, go to msdn to view the exact Syntax of this command.
2. Do not use a cursor
Let me repeat it again: Do not use a cursor. If you want to damage the performance of the entire system, they are your best choice. Most Beginners use cursors without being aware of their impact on performance. They occupy memory and use their incredible ways to lock tables. In addition, they are like snails. The worst thing is that they can make all the performance optimization your DBA can do is not done. I wonder if you know that every execution of fetch is equal to execution of a SELECT command? This means that if your cursor has 10000 records, it will execute 10000 select! If you use a set of select, update, or delete operations to complete the corresponding work, it will be much more efficient.
Beginners generally think that using a cursor is a familiar and comfortable programming method, but unfortunately, this will lead to poor performance. Obviously, the overall purpose of SQL is what you want to implement, not how to implement it.
I used a T-SQL to override a cursor-based stored procedure, the table only had 100,000 records, the original stored procedure took 40 minutes to complete the execution, the new stored procedure only takes 10 seconds. Here, I think you can see what an incompetent programmer is doing !!!
We can write a small program to obtain and process data and update the database, which is sometimes more effective. Remember: T-SQL is powerless for loops.
I will remind you again that using a cursor is not helpful. Except for DBA, I have never seen that using a cursor can effectively complete any work.
3. normalize your data tables
Why not normalize the database? There are two excuses: performance considerations and laziness. As for the second point, you will have to pay for it sooner or later. For performance issues, you don't need to optimize things that are not slow at all. I often see some programmers "de-Standardize" databases. The reason is that "the original design is too slow", but the result is that they often make the system slower. DBMS is designed to process standardized databases. Therefore, remember to design a database according to standardized requirements.
4. Do not use select *
This is not easy to do. I know it too well, because I often do it myself. However, specifying the columns you need in the SELECT statement brings the following benefits:
1. Reduce memory consumption and network bandwidth
2. You can get a safer design.
3. the query optimizer is given the opportunity to read all required columns from the index.
5. Learn about the operations you will perform on the data
Creating a robust index for your database is a matter of merit. This is simply an art. When you add an index to a table, the SELECT statement is faster, but the insert and delete statements are much slower, because creating and maintaining indexes requires a lot of additional work. Obviously, the key question here is: what kind of operations do you want to perform on this table. This problem is difficult to grasp, especially when it comes to delete and update, because these statements often include the SELECT command in the where section.
6. Do not create an index for the "gender" column
First, we must understand how indexes accelerate table access. You can think of indexes as a way to divide tables based on certain standards. If you create an index for a column like "gender", you just divide the table into two parts: male and female. What is the significance of this division when you are processing a table with 1,000,000 records? Remember: It is time-consuming to maintain indexes. When designing an index, follow the rule that the number of different content that may be contained in a column ranges from large to small, such as name, province, and gender.
VII. Use transactions
Use transactions, especially when the query is time consuming. If a problem occurs in the system, this will save your life. Generally, experienced programmers have some experience-unexpected situations that may cause the storage process to crash.
8. Be careful about deadlocks
Access your table in a certain order. If you lock table A and table B first, lock them in this order in all stored procedures. If you first lock table B in a stored procedure and then lock Table A, this may lead to a deadlock. If the lock sequence is not designed in detail in advance, the deadlock is not easy to detect.
9. Do not open large datasets
A frequently asked question is: How can I quickly add 100000 records to ComboBox? This is not correct. You cannot or do this. It's easy. Your user needs to browse 100000 records to find the desired records, and he will curse you. Here, you need a better UI, and you need to display up to 100 or 200 records for your users.
10. Do not use server-side cursors
Compared with server-side cursors, client cursors can reduce the system overhead of servers and networks and reduce the lock time.
11. query using parameters
Sometimes, I see a problem similar to this in the csdn Technology Forum: "select * from a where a. ID = 'A' B, what should I do because of an exception in single quotes query ?", The general answer is: Use two single quotes instead of single quotes. This is incorrect. In this way, you will encounter such problems on some other characters, not to mention that this will lead to serious bugs. In addition, this will make the SQL Server Buffer system unable to play its due role. Use parameter query, kettle
Pay-as-you-go. None of these problems exist.
12. Use a database with a large amount of data during program Encoding
Testing databases used by programmers in development generally have a small amount of data, but often end users have a large amount of data. Our usual practice is wrong. The reason is very simple: the hard disk is not very expensive now, but why should we wait until the performance problem is irrecoverable?
13. Do not use insert to import large amounts of data
Do not do this unless it is necessary. With UTs or BCP, you can enjoy both flexibility and speed.
14. Pay attention to timeout issues
When querying a database, the default value of the database is generally small, for example, 15 seconds or 30 seconds. Some queries run longer than this, especially when the data volume of the database keeps increasing.
15. Do not ignore the issue of simultaneously modifying the same record.
Sometimes, two users modify the same record at the same time, so that the last modifier modifies the operation of the previous modifier, and some updates will be lost. It is not difficult to deal with this situation: Create a timestamp field and check it before writing. If it is allowed, merge and modify it. If there is a conflict, the user is prompted.
16. Do not execute select max (ID) in the master table when inserting records in the detail table)
This is a common error. When two users insert data at the same time, this will cause an error. You can use scope_identity, ident_current, and @ identity. If possible, do not use @ identity, because when a trigger exists, it may cause some problems (see the discussion here ).
17. Avoid setting columns as nullable
If possible, you should avoid setting the column as nullable. The system will allocate an additional byte to each row of the nullable column, resulting in more system overhead during query. In addition, setting columns as nullable makes encoding complex, because each access to these columns must be checked first.
I am not saying that nulls is the root cause of troubles, although some people think so. I think that if you allow "NULL data" in your business rules, setting the column as nullable sometimes plays a good role. However, if you use nullable in a similar case, it's just self-defeating.
Customername1
Customeraddress1
Customeremail1
Customername2
Customeraddress2
Customeremail3
Customername1
Customeraddress2
Customeremail3
In this case, you need to normalize your table.
18. Do not use the text data type whenever possible
Do not use text unless you use text to process a large amount of data. Because it is not easy to query and slow, it will waste a lot of space if it is not used well. Generally, varchar can better process your data.
19. Try not to use temporary tables
Try not to use temporary tables unless you have. Generally, subqueries can replace temporary tables. Using a temporary table will bring about system overhead. If you program using COM +, it will bring you a lot of trouble, because COM + uses the database connection pool, temporary tables exist from beginning to end. SQL Server provides some alternatives, such as the table data type.
20. Learn to analyze and query
SQL Server Query analyzer is a good partner. You can use it to learn how queries and indexes affect performance.
21. Integrity of reference
Define the primary key, Uniqueness constraint, and foreign key, which can save a lot of time.