Database optimization practices

Source: Internet
Author: User
Tags: filegroup

Optimization skills are mainly the province of DBAs, but I think developers should master them too, because not every development team has a dedicated DBA.

Step 9: Properly organize database file groups and files

When you create a SQL Server database, the database server automatically creates a series of files on the file system, and every database object created later is actually stored in these files. SQL Server uses the following three types of files:

1) .mdf file

This is the most important data file. Each database can have only one primary data file, and all system objects are stored in the primary data file. If secondary data files are not created, all user objects (database objects created by users) are stored in the primary data file.

2) .ndf files

These are secondary data files. They are optional and store user-created objects.

3) .ldf files

These are transaction log files; a database has at least one and may have several. They store the transaction log.

By default, the primary data file and the transaction log file are created automatically when you create a SQL Server database. You can also modify the attributes of these two files, such as their storage paths.

File Group

To facilitate management and achieve better performance, data files are usually grouped. When a new SQL Server database is created, the primary file group is created automatically, the primary data file is placed in it, and the primary file group is set as the default group. As a result, all newly created user objects are automatically stored in the primary file group (specifically, in the primary data file).

If you want to store your user objects (tables, views, stored procedures, and functions) in secondary data files, you need:

1) create a new file group and set it as the default file group;

2) create a new data file (.ndf) and add it to the file group created in step 1.

All objects created from then on will be stored in the new file group (that is, in its secondary data files).

Note: The transaction log file does not belong to any file group.
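The two steps above can be sketched in T-SQL. The database name (SalesDB), file group name (UserObjects), logical file name, and path below are all hypothetical:

```sql
-- Step 1a: create a new file group.
ALTER DATABASE SalesDB ADD FILEGROUP UserObjects;

-- Step 2: add a secondary data file (.ndf) to the new file group.
ALTER DATABASE SalesDB ADD FILE
(
    NAME = N'SalesDB_User1',
    FILENAME = N'D:\Data\SalesDB_User1.ndf',
    SIZE = 100MB,
    FILEGROWTH = 50MB
)
TO FILEGROUP UserObjects;

-- Step 1b: make the new file group the default, so user objects
-- created from now on land in it rather than in the primary file group.
ALTER DATABASE SalesDB MODIFY FILEGROUP UserObjects DEFAULT;
```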

Best practices for file/file group organization

If your database is not large, the default file/file group arrangement should meet your needs. However, if your database becomes very large (say, beyond 1000 MB), you can (and should) adjust the files and file groups for better performance. The best practices for doing so are as follows:

1) The primary file group must be kept completely independent: it should store only system objects, and no user objects should be placed in it. Nor should the primary file group be set as the default group. Separating system objects from user objects gives better performance;

2) If there are multiple hard disks, spread the files of each file group across those disks. This distributes disk I/O and greatly improves read/write speed;

3) place frequently accessed tables and their indexes in a separate file group, so that reading their data and indexes is faster;

4) Put frequently accessed tables that contain Text and Image columns into a separate file group, and ideally put the Text and Image column data on an independent hard disk. That way, retrieving the table's non-Text/Image columns is not slowed down by the large object columns;

5) Place the transaction log file on an independent hard disk; do not let it share a disk with data files. Log operations are write-intensive, so ensuring good I/O performance for log writes is very important;

6) Put "read-only" tables into their own file group, and likewise put "write-only" tables into their own file group, so that read-only tables retrieve faster and write-only tables update faster;

7) do not over-rely on SQL Server's "autogrowth" feature, because automatic growth is actually quite expensive. Set the growth increment to a suitable fixed size, for example roughly a week's worth of expected growth. Similarly, do not use "auto-shrink" frequently; it is best to disable it and instead shrink the database manually, or via a scheduled job at a reasonable interval, for example monthly.
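Practices 4) and 7) above can be expressed directly in T-SQL. A sketch, in which the database, table, file group, and logical file names are all hypothetical:

```sql
-- Practice 4: route a table's row data and its large-object data
-- to different file groups; DataFG and BlobFG are assumed to exist.
CREATE TABLE dbo.Documents
(
    DocId INT IDENTITY PRIMARY KEY,
    Title NVARCHAR(200),
    Body  VARBINARY(MAX)       -- large-object column
)
ON DataFG                      -- row data goes to DataFG
TEXTIMAGE_ON BlobFG;           -- large-object data goes to BlobFG

-- Practice 7: use a fixed growth increment instead of the default
-- percentage growth, and disable auto-shrink.
ALTER DATABASE SalesDB
MODIFY FILE (NAME = N'SalesDB_Data', FILEGROWTH = 256MB);

ALTER DATABASE SalesDB SET AUTO_SHRINK OFF;
```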

Step 10: Apply partitioning to large tables

What is Table Partitioning?

Table partitioning means splitting a large table into multiple smaller tables to avoid scanning too much data during retrieval. The idea is based on the "divide and conquer" principle.

When your database has a very large table (say, millions of rows) and queries are still slow after other optimization techniques have been applied, you should consider partitioning the table. First, let's look at the types of partitioning:

Horizontal partitioning: Assume a table contains 10 million rows. To simplify the example, assume the table has an automatically increasing primary key field (such as id). We can split the table into 10 independent partition tables, each containing 1 million rows, with the split based on the value of the id field: the first partition holds records with id 1-1000000, the second partition holds records with id 1000001-2000000, and so on. This row-wise way of splitting a table is called horizontal partitioning.

Vertical partitioning: Assume a table has many columns and many rows, with some columns frequently accessed and others rarely accessed. Because the table is very large, all retrieval operations are slow, so the table should be partitioned on its frequently accessed columns: the large table is split into several smaller tables, each made up of some of the columns of the large table. This column-wise way of splitting a table is called vertical partitioning.

Another vertical partitioning principle is to split indexed columns from non-indexed columns. This approach requires caution, however, because if any query needs to retrieve from both partitions, the SQL engine must join them, and performance will suffer.
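As an illustration, a vertical split might look like the sketch below; the table and column names are hypothetical:

```sql
-- Frequently accessed columns stay in the main table.
CREATE TABLE dbo.Customers
(
    CustomerId INT PRIMARY KEY,
    Name       NVARCHAR(100),
    Email      NVARCHAR(200)
);

-- Rarely accessed (and large) columns move to a companion table
-- that shares the same primary key.
CREATE TABLE dbo.CustomerDetails
(
    CustomerId INT PRIMARY KEY
        REFERENCES dbo.Customers (CustomerId),
    Biography  NVARCHAR(MAX),
    Photo      VARBINARY(MAX)
);

-- A query that needs columns from both halves must join them,
-- which is exactly the cost mentioned above:
-- SELECT c.Name, d.Biography
-- FROM dbo.Customers AS c
-- JOIN dbo.CustomerDetails AS d ON d.CustomerId = c.CustomerId;
```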

This article mainly introduces horizontal partitioning.

Partition Best Practices

1) After partitioning a large table, place each partition in its own file and store the files on separate hard disks. The database engine can then retrieve different data files on multiple disks concurrently, improving read/write throughput;

2) For historical data, consider partitioning based on the "age" of the data. For example, if the table stores order data, the order date column can serve as the partitioning basis, with each year's orders forming one partition.

How to partition?

Assume that the Orders table contains four years (1999-2002) of order data, millions of records in all. To partition the table, take the following steps:

1) Add a file group

Run the following commands to create a file group and add a data file to it:

Alter database OrderDB add filegroup [1999]

Alter database OrderDB add file (NAME = N'1999',
FILENAME = N'C:\OrderDB\1999.ndf', SIZE = 5MB,
MAXSIZE = 100MB, FILEGROWTH = 5MB)
TO FILEGROUP [1999]

Using the preceding statements, we added a file group named 1999 and added the secondary data file C:\OrderDB\1999.ndf to it.

Use similar commands to create three more file groups, 2000, 2001, and 2002; each file group stores one year of order data.
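For example, the pair of commands for the 2000 file group would look like this (2001 and 2002 follow the same pattern):

```sql
ALTER DATABASE OrderDB ADD FILEGROUP [2000];

-- Add one secondary data file to the new file group.
ALTER DATABASE OrderDB ADD FILE
    (NAME = N'2000', FILENAME = N'C:\OrderDB\2000.ndf',
     SIZE = 5MB, MAXSIZE = 100MB, FILEGROWTH = 5MB)
TO FILEGROUP [2000];
```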

2) create a partition function

A partition function is an object that defines the partition boundary points. Use the following command to create a partition function:

Create partition function FNOrderDateRange (DateTime)

as range left for values ('1999/12/31', '2000/12/31', '2001/12/31')

The preceding Partition Function specifies:

Records with DateTime <= 1999/12/31 enter the first partition;

Records with DateTime > 1999/12/31 and <= 2000/12/31 enter the second partition;

Records with DateTime > 2000/12/31 and <= 2001/12/31 enter the third partition;

Records with DateTime > 2001/12/31 enter the fourth partition.

Range left means each boundary value belongs to the partition on its left: values less than or equal to a boundary enter that partition, and the next larger value enters the following partition. Range right means each boundary value belongs to the partition on its right: the boundary value and the values above it (up to the next boundary) all enter the right partition. In this example, which uses range left, the boundary value 2000/12/31 therefore enters the second partition, and values smaller than it (down to the previous boundary) enter the first.
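For comparison, here is a sketch of how the same yearly split could be declared with range right. The function name is hypothetical, and the boundaries shift to the first day of each year because each boundary value now belongs to the partition on its right:

```sql
-- With RANGE RIGHT, 2000/01/01 itself falls into the second partition,
-- so the first partition holds everything strictly before 2000/01/01.
CREATE PARTITION FUNCTION FNOrderDateRangeRight (DATETIME)
AS RANGE RIGHT FOR VALUES ('2000/01/01', '2001/01/01', '2002/01/01');
```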

3) create a partition scheme

The partition scheme establishes a mapping between the table/index partitions and the file groups that store them. The command for creating a partition scheme is as follows:

Create partition scheme OrderDatePScheme as partition FNOrderDateRange

TO ([1999], [2000], [2001], [2002])

In the preceding command, we specify:

The first partition should enter the 1999 file group;

The second partition enters the 2000 file group;

The third partition enters the 2001 file group;

The fourth partition enters the 2002 file group.

4) apply partitions to tables

So far, we have defined the necessary partitioning objects; now we need to apply them to the table. First, use the drop index command to delete any existing clustered index on the table. Usually the clustered index is on the primary key; in that case, drop the primary key constraint instead (with drop constraint), which removes the index along with it. The following command deletes the primary key PK_Orders:

Alter table Orders drop constraint PK_Orders;

Create a clustered index on the partition scheme. The command is as follows:

Create unique clustered index PK_Orders ON Orders (OrderDate)
ON OrderDatePScheme (OrderDate)

Assuming the OrderDate column values are unique in the table, the table will be partitioned according to the partition scheme OrderDatePScheme and ultimately split into four parts stored in the four file groups. If you have questions about partitioning, I suggest reading Microsoft's article "Partitioned Tables and Indexes in SQL Server 2005" (address: http://msdn.microsoft.com/en-us/library/ms345146%28SQL.90%29.aspx).
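To check where the rows actually landed, the built-in $PARTITION function can be applied to the partitioning column. A sketch:

```sql
-- Count the rows in each partition of the Orders table.
SELECT $PARTITION.FNOrderDateRange(OrderDate) AS PartitionNumber,
       COUNT(*)                               AS RowsInPartition
FROM dbo.Orders
GROUP BY $PARTITION.FNOrderDateRange(OrderDate)
ORDER BY PartitionNumber;
```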

Step 11: Use TSQL templates to better manage DBMS objects (an additional step)

To better manage DBMS objects (stored procedures, functions, views, triggers, and so on), a consistent structure is needed. However, for various reasons (mainly time constraints), we often fail to maintain one, and when performance problems or other issues later force us to re-debug the code, it becomes a nightmare.

To help you better manage DBMS objects, I have created some TSQL templates that allow you to quickly develop DBMS objects with consistent structures.

If your team reviews the TSQL code written by its members, these templates include a "Review" section for recording review comments.

I provide several common DBMS object templates:

Template_StoredProcedure.txt: stored procedure template (http://www.codeproject.com/KB/database/OrganizeFilesAndPartition/Template_StoredProcedure.txt)

Template_View.txt: view template (http://www.codeproject.com/KB/database/OrganizeFilesAndPartition/Template_View.txt)

Template_Trigger.txt: trigger template (http://www.codeproject.com/KB/database/OrganizeFilesAndPartition/Template_Trigger.txt)

Template_ScalarFunction.txt: scalar function template (http://www.codeproject.com/KB/database/OrganizeFilesAndPartition/Template_ScalarFunction.txt)

Template_TableValuedFunction.txt: table-valued function template (http://www.codeproject.com/KB/database/OrganizeFilesAndPartition/Template_TableValuedFunction.txt)

1) How to create a template?

First download the template code given above, open the SQL Server console, and click "View" > "Template Browser";

Right-click the "Stored Procedure" node, select "New" > "Template" from the pop-up menu, and give the template an easy-to-understand name;

Right-click the newly created template and choose "Edit" from the shortcut menu. In the window that appears, enter the authentication information and click "Connect";

After the connection succeeds, open the downloaded Template_StoredProcedure.txt file in an editor, copy its contents into the new template, and click "Save".

The above is the process of creating a stored procedure template. The process of creating other DBMS objects is similar.
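For reference, a minimal stored procedure template might look like the sketch below. The <name, type, default> placeholders are the standard SQL Server template-parameter syntax (filled in via Ctrl + Shift + M); everything else, including the "Review" section, is a hypothetical structure rather than the exact contents of the downloadable templates:

```sql
-- ============================================
-- Author:      <author_name, sysname, YourName>
-- Create date: <create_date, , >
-- Description: <description, , >
-- Review:      (reviewer comments go here)
-- ============================================
CREATE PROCEDURE <schema_name, sysname, dbo>.<procedure_name, sysname, usp_ProcName>
    @<param_name, sysname, Id> INT
AS
BEGIN
    SET NOCOUNT ON;

    -- Procedure body goes here.
    SELECT @<param_name, sysname, Id>;
END
```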

2) How to use a template?

After creating a template, the following shows how to use the template.

In the template browser, double-click the stored procedure template you created. In the authentication dialog box that appears, enter the corresponding identity information and click "Connect";

After the connection succeeds, the template opens in the editor; assign appropriate values to the template variables;

Press Ctrl + Shift + M to specify values for the template parameters, as shown in Figure 1;

Figure 1 Specifying the values of the template parameters

Click "OK", select the target database in the SQL Server console, and click "Execute";

If everything goes well, the stored procedure is created successfully. Other DBMS objects can be created by following the same steps.

Summary

Optimization is a mindset. When optimizing database performance, first believe that the performance problem can always be solved; then do your best to optimize based on experience and best practices. Most important of all is to prevent performance problems as far as possible: during development and deployment, use every available technique and all your experience to evaluate in advance, rather than waiting until problems occur to act. An extra hour spent applying best practices during development may ultimately save you hundreds of hours of diagnosis and troubleshooting. Learn to work smart, not just hard!
