SQL Server Performance Optimization-partition (partition creation)

Source: Internet
Author: User

Compared with compression, database partitioning is more complex and cumbersome. In addition, unlike compression, a partition is an operation that requires long-term maintenance cycle change.

Partitioning is to physically cut big data into several independent small parts, so that only one or several partitions can be retrieved during the query to reduce the impact of data; in addition, for partitions in different file groups, the parallel query performance is higher than the query performance for the entire table.

In fact, SQL Server 2005 already contains the partition function. Even before SQL Server 2005, a function called "partitioned views" exists, you can union a table with the same structure in a view to achieve the same effect as the current partition table. In SQL Server 2008, the partition function has been significantly enhanced, so that we can not only partition tables and indexes,You can also lock the partition instead of the entire table..

Partition Column

Like compression, SQL Server 2008 also provides a partition wizard interface. In the Enterprise Manager, right-click the table to be partitioned and choose storage-> create partition:

All fields in the table are listed, including the field type, length, precision, and decimal places. You can select any one column as the partition column ), it is not just a number or date type. Even string columns can be partitioned alphabetically. The following types of Columns cannot be used for partitioning:Text, ntext, image, XML, timestamp, varchar (max), nvarchar (max), varbinary (max), alias, hierarchyid, spatial index, or CLR user-defined data type. In addition, if you use a calculated column as a partition column, you mustSet as persistent Column(Persisit ).

Below the list, two options are provided:

    1. Distribute to available Partition Table:
      This requires that there be another partition table in the same database, and the partition column of the table and the type of the currently selected columnCompletely consistent.
      The advantage is that when two tables are associated in the query and their associated columns are partition columns, using the same partition policy will be more efficient.
    2. Change the storage space of Non-unique indexes and unique indexes to be consistent with the index partition column.:
      In this way, all indexes in the table are partitioned together to achieve "alignment ". This is an important and troublesome option. For specific requirements, see msdn (special guidelines for partitioned indexes ).
      The advantage is that the table and the index partition are consistent. On the one hand, it is more efficient to use the index during queries, andMove into/out PartitionIt will also be more efficient.

Note: It is recommended thatUse clustered index columns as partition Columns. On the one hand, the index structure should be related to the query, so the consistency between the partition column and the index will ensure the maximum efficiency of the query; on the other hand, ensuring index alignment and clustered index alignment is a prerequisite for smooth migration and removal of partitions. Otherwise, migration may fail, migration and removal of partitions are also an important strategy for managing big data --Sliding Window(Slidewindow) basic operation of the policy. In addition, if you want to align indexes,The compression modes of all indexes and tables must be consistent..

Partitioning functions and partitioning Solutions

After the partition column is selected, if the option "allocate to available Partition Table" is not applied, select \ createPartition functionsAndPartition Scheme. The partition function specifies the partition boundary, while the partition scheme plans the file groups stored by each partition.

The wizard interface is as follows:

Left boundary indicates that the boundary value of each partition is included in the left-side partition of the boundary value, that is, the data constraint in each partition is<= Specified Boundary ValueRight boundary indicates that the boundary value of each partition is included in the partition on the right of the boundary value.<Specified boundary value.

The list below lists the existing partitions under the current partition scheme. The filegroup specifies the location where each partition is stored. If the partition is placed in different file groups on different disks, the read/write operations on different disks do not affect each other, this improves the efficiency of parallel processing of Partitioned Tables. Generally, it is safer to place all partitions in the same file group. For more information about file groups, see SQL Server filegroups.

Note: HereThe last partition has no specified boundary.Is used to save all> (left boundary) or> = (right boundary) data of the last partition boundary.

If you select a time field as the partition column, you can use the set button to implement conditional grouping:

This makes it easy to set the start time and End Time to automatically partition the table according to the specified time period, but you still need to manually specify the file group for each partition.

After creating a partition scheme, you can use estimate sotrage to estimate the number of rows and space usage of each partition. However, unless you need to plan your partition policy based on the occupied space or number of rows, generallyIt is not recommended to estimate hereBecause for empty tables, the estimated results are of course 0, and if the table already contains a large amount of data, the estimation takes a long time.

Create a partition

Through the above settings, the partition has been basically completed. At the end of the wizard, you can choose whether to create a script or execute the partition operation immediately.

We can view the script for creating partitions in different situations:

1. If the table does not have an index:

 Begin transactioncreate Partition Function [Testfunction] (  Datetime  )  As range  Left  For values  ( N' 2010-01-01t00: 00: 00'  ,  N' 2010-02-01t00: 00: 00'  ,
N' 2010-03-01t00: 00: 00' , N' 2010-04-01t00: 00: 00' , N' 2010-05-01t00: 00: 00' , N' 2010-06-01t00: 00: 00' ) Create Partition Scheme [testscheme] As Partition [Testfunction] To ( [Primary] , [Primary] , [Primary] ,
[Primary] , [Primary] , [Primary] , [Primary] ) Create clustered Index [Clusteredindex_on_testscheme_634025109502439124] On [DBO] . [Account] ( [Birthday] ) With ( Sort_in_tempdb = Off , Ignore_dup_key = Off , Drop_existing = Off , Online = Off ) On [Testscheme]( [Birthday] ) Drop Index [Clusteredindex_on_testscheme_634025109502439124] On [DBO] . [Account] With ( Online = Off ) Commit transaction

Create the Partition Function and partition scheme first, then create a clustered index on the partition column, partition according to the partition scheme, and finally Delete the index. </>

2. If the table has an index:

If no clustered index exists:

Create clustered Index [Clusteredindex_on_testscheme_634025229911990663] On [DBO] . [Account] ( [Birthday] )  With  (  Sort_in_tempdb  =  Off  ,  Ignore_dup_key  =  Off  ,  Drop_existing =  Off  ,  Online  =  Off  )  On [Testscheme] ( [Birthday] )  Drop Index [Clusteredindex_on_testscheme_634025229911990663] On [DBO] . [Account] With  (  Online =  Off  ) 

This is the same as the case where no index exists. If a clustered index exists in the table, the script changes:

 Create clustered Index [Ix_id] On [DBO] . [Account] ( [ID] ASC  )  With  (  Pad_index  =  Off  ,  Statistics_norecompute =  Off  ,  Sort_in_tempdb  =  Off  ,  Ignore_dup_key  =  Off  ,  Drop_existing  =  On  ,
Online = Off , Allow_row_locks = On , Allow_page_locks = On ) On [Testscheme] ( [Birthday] )

We can see that the original clustered index (ix_id) is rebuilt in the Partition Scheme.

If you select"Alignment Index"Option, partitions will be applied to all indexes:

 Create clustered Index [Ix_id] On [DBO] . [Account]( [ID] ASC  )  With  (  Pad_index  =  Off  ,  Statistics_norecompute  =  Off  ,  Sort_in_tempdb  =  Off  ,  Ignore_dup_key =  Off  ,  Drop_existing  =  On  ,
Online = Off , Allow_row_locks = On , Allow_page_locks = On ) On [Testscheme] ( [Birthday] ) Create nonclustered Index [Uix_birthday] On [DBO] . [Account] ( [Birthday] ASC ) With ( Pad_index = Off , Statistics_norecompute = Off , Sort_in_tempdb = Off , Ignore_dup_key = Off , Drop_existing = On ,
Online = Off , Allow_row_locks = On , Allow_page_locks = On ) On [Testscheme] ( [Birthday] ) Create nonclustered Index [Uix_name] On [DBO] . [Account] ( [Name] ASC ) With ( Pad_index = Off , Statistics_norecompute = Off , Sort_in_tempdb = Off , Ignore_dup_key = Off , Drop_existing = On ,
Online = Off , Allow_row_locks = On , Allow_page_locks = On )

The clustered index ix_id is partitioned as well as non-clustered index uix_name and uix_birthday.

Notes
    1. After a table is partitioned, it cannot be partitioned again.No method for directly canceling table partitions.
    2. If you want to view the partition status of a partitioned table and the number of rows and occupied space in each partition, you can use storage-> Management compression to view the status. You can also specify the compression mode for each partition here.
    3. If the Partition Table index is not alignedThe table cannot be switched in/out, And the Sliding Window operation cannot be performed..
    4. Partitions are actually added to each partition table, and the performance of the corresponding insert operation is also affected.
    5. Even if a partition is performed, if the query condition field is not associated with the partition column, the performance may not be improved.
Appendix: Description of partition parallel query

Because I mainly consider the efficiency of parallel query in actual operationsArticleBut some people mentioned in the comments, so the excerpt is as follows:

    1. Parallel queries certainly require multi-core support, and single-core parallel queries are impossible.
    2. In 2005, if there are more than two partitions, one thread corresponds to one partition, so if there are 10 threads but only three partitions, seven threads will be wasted.
    3. In 2008, this problem was improved, and all threads were put into all partitions. For details, see partitioning enhancements in SQL Server 2008.

Address: http://www.cnblogs.com/smjack/archive/2010/02/23/1671943.html

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.