SQL Server Partitioning

Source: Internet
Author: User
Tags filegroup

I. Introduction of SQL Server partitioning

In SQL Server, all tables and indexes of a database are treated as partitioned tables and indexes, and the default tables and index values contain a partition, which means that the table or index contains at least one partition. In SQL Server, data is partitioned horizontally, and multiple rows of data are mapped to a single partition. Partitioned tables or indexes, when executing a query or update, will be viewed as a single logical entity, simply by partitioning a table data into multiple tables to store, for large data volumes of the table, the table is divided into multiple queries, if only one partition data will reduce consumption and improve efficiency. It is important to note that a single index or partition of a table must reside in a database.

When using large amounts of data management, SQL Server uses partitioning to quickly access subsets of data and reduce IO productivity. At the same time, different partitions can be stored in different filegroups, and if the filegroups can be stored on different logical disks, the concurrent use of IO can be achieved to improve the efficiency. As shown in the following:

Ii. SQL Server partition creation

Although partitioning has such advantages, it cannot be used at will, and it does not say that the cumbersome zoning management, just the negative impact of cross-zoning, requires us to analyze whether it is necessary to use partitions. General partition created business features: used for statistics, less use of historical data, data self-growth, possible data redundancy, large amount of data volume insertion.

Before determining if a partition is appropriate to use, you need to understand how the partition was created, and the partition creation includes:

? New Partition function

? New Files and Filegroups

? New Partition Schema

? Create a new partition table or partition index

As shown in: The partitioning function defines how partitions are partitioned, partition schemas use partition functions and filegroups, determine partitioning schemes, and tables or indexes use partition schemas to implement partitioning. Between them is the use of relationships, a one-to-many relationship.

1 Creating a partition function

The partition function defines how the rows of a table or index are mapped to a specified partition based on the values of some columns. The partitioning function develops the partitioning method.

All data types that are valid for use as index columns can be used as partitioning columns, except for timestamp. You cannot specify ntext, text, image, XML, varchar (max), nvarchar (max), or varbinary (max) data types.

New Partition function Syntax:

CREATE PARTITION FUNCTION partition_function_name (input_parameter_type)

As RANGE [left | Right]

For VALUES ([Boundary_value [,... N]]) [;]

1.1 Left/right bounds range [Ieft | Right]

Specify left and right to determine whether the boundary value is on the left or the far side. The following example rang right partition by time:

/* NEW Partition Function */

use [partiontest];

CREATE PARTITION FUNCTION [pf_partiontest01] (datetime) as RANGE right for VALUES (' 2014-12-09 ', ' 2014-12-10 '

/* Partition function Query */

SELECT

Pf.name Partition Function name

When the case is boundary_value_on_right=1 then the "right" ELSE ' left ' END partition bounds mode

, Value partition bounds values

From Sys.partition_functions PF

Left JOIN sys.partition_range_values prv on prv.function_id = pf.function_id

ORDER by boundary_id

As shown in range right, the bounds value belongs to the left.

Attention:

1, most of the business is to use Range right, after all, the data of the late growth, the boundary is retained in the latest partition to help the boundary query; For example, the logic of partitioning the day is that the day's data exists in the partition of the day, if the partition of the day is 2014-12-08 00:00.000, range Right the 2014-12-08 00:00.000 data is attributed to the day of 2014-12-08, and RANGE left only has this limit time attributed to 2014-12-07. There is a certain difference from logic.

2, since there is a partition boundary problem, when merging partitions, it is worth discussing whether to specify whether the partition is to the left or the right merge is a question to be discussed in detail: Merging partition (merge)

2 Creating a Partition Schema

Partition schemas map partitions specified by partition functions to filegroups

Basic syntax: CREATE PARTITION SCHEME partition_scheme_name

As PARTITION partition_function_name

[All] to ({file_group_name | [PRIMARY]} [,... N]) [ ; ]

2.1 Need to create a specific file and filegroup

If it is necessary to create a specific file and filegroup before creating the partition schema, creating a new filegroup with different logical disks can improve the IO concurrency capability, while different files can improve the capacity of disaster recovery, and the other files can continue to be used if a file goes bad. Separate file storage, can also achieve different partitions independent backup, improve the data recovery rate.

The following example creates a partition function that divides a table or index into four partitions. Then create a partitioning scheme in which you specify the filegroup that owns each of the four partitions. This example assumes that a filegroup already exists in the database.

Create the following partition schema according to the new partition function "pf_partiontest01" and the default existing filegroup

/* New Partition Schema */

use [partiontest];

CREATE PARTITION SCHEME [pc_partiontest01] as PARTITION [pf_partiontest01] to (' Primary ', ' PartionTest201412 ', ' PartionTest201412 ')

/* Partition schema Query */

SELECT

Ps.name Partition_scheme,

Ds.name Filegroup,

Pf.name Partition_function,

pf.type_desc+ ': ' +case when pf.boundary_value_on_right=0 and left ' else ' rigth ' End Function_type

From Sys.partition_schemes PS

JOIN Sys.destination_data_spaces DDS on ps.data_space_id=dds.partition_scheme_id

JOIN sys.data_spaces DS on dds.data_space_id=ds.data_space_id

JOIN sys.partition_functions PF on ps.function_id=pf.function_id

The results are as follows:

A partition specifies that a filegroup is one more than a partition boundary, and the partition schema specifies which filegroup the specific partition data resides on. As shown in the following:

3 Creating a partitioned table

As shown, just make the partition schema and the partitioning column.

4 Creating a partitioned Index

Before you know the index partition, you need to understand the concept of partition index and table alignment, if the partition and table use the same partition schema and the same partitioning column, then the partition is the same as the table partition, we call alignment, and vice versa is not aligned. For partitioned table-aligned partitioned indexes, sort tables are generated at once in memory or tempdb at the time of sorting, and if not aligned with partitioned tables, the sort table is generated at the same time for each partition, so more memory space is required. For details, see misaligned and aligned index partitioning impact on performance

The following tests differ on partitioned tables on partitioned indexes.

--New test table

CREATE TABLE PartionTest01 (logid int identity (), OrderID int,salesdate datetime) on pc_partiontest01 ([salesDate])-- Specify partition scheme
--Query partition status

SELECT object_name (p.object_id) as [object_name],

Id.name as Index_name,

Ps.name Partition_scheme,

Ds.name Filegroup,

Pf.name Partition_function,

pf.type_desc+ ': ' +case when pf.boundary_value_on_right=0 then ' left ' else ' rigth ' End Function_type,

P.partition_number,

Isnull (PRV. VALUE, ') as Boundy_value,

P.rows

From Sys.indexes ID

JOIN sys.partition_schemes ps on ps.data_space_id = id.data_space_id

JOIN sys.destination_data_spaces dds on ps.data_space_id = dds.partition_scheme_id

JOIN sys.data_spaces ds on ds.data_space_id = dds.data_space_id

JOIN sys.partitions p on p.object_id = id.object_id and p.index_id = id.index_id and dds.destination_id = P.partition_numb Er

JOIN sys.partition_functions pf on ps.function_id = pf.function_id

Left JOIN sys.partition_range_values prv on prv.function_id = pf.function_id and prv.boundary_id = P.PARTITION_NUMBER-PF . boundary_value_on_right

WHERE object_name (id.object_id) = ' PartionTest01 '

4.1 Partitioning a clustered index

Test one: If a partition column is not explicitly specified in the clustered key for a non-unique clustered index partition, the partition column is added by default in the clustered index key list

Test two: For a unique clustered index partition, you must add a partition by column, as shown in the following error. New successful after adding the partitioning column, only the partitioning column is added to guarantee uniqueness on a single partition.

4.2 Partitioning a nonclustered index

Test one: Non-unique nonclustered index, which automatically contains the partition by column.

Test two: Unique nonclustered, you still need to specify the partitioning column.

Test Three: Specifies that the partitioning column is included to successfully

4.3 Conclusion

1. Creating a new unique index requires explicitly specifying the partitioning column to ensure that duplicate key values do not exist in the table.

2, new non-unique index if the partition by column is not explicitly specified, the partitioning column is automatically specified as the index key column

3. New contains non-unique nonclustered indexes and will no longer be added to the partition by column index key columns.

III. SQL Server partition management 1 split partition (split)

Splitting the partition is the addition of a new partition on the existing partition. As shown, split partition 03 into 03 and 04 partitions, the Split method first locks all the data of the old 03 partition, and then migrates the old 03 partition related data to partition 04, and finally deletes the corresponding partition data on the old 03, which consumes Io, resulting in IO log read and write is four times times the transferred data.

Therefore, on the administrative partition, the partition is generally added in advance. For example, if there is no data for the current 03 partition, we complete the addition of the 04 partition.

Add partition to specify the next use filegroup in advance. As shown in the following statement

--Adding partitions

Use [Partiontest]

ALTER PARTITION SCHEME [pc_partiontest01] NEXT used [PartionTest201412]

use [partiontest];

ALTER PARTITION FUNCTION [pf_partiontest01] () SPLIT RANGE (' 2014-12-11 ')

Indicates that the partition was added successfully:

2 Merging Partitions (merge)

After the migration of partition data, partition merging is required, that is, deleting useless partition, partition merging is same as splitting, it will cause a lot of IO consumption, so this partition data will usually be sliced out, and the partition can be merged with many partitions.

Since the merge is merging two partitions, is there a need to merge the partition boundary points to the left or merge to the right? This depends on the range mode, and if range left is merged to the right, RANG-R will be merged to the Ieft.

Partition 4, which is derived from the split partition above, now merges partition 4 with the range right partitioning method, so partition 4 is merged to the left and merged into Partition 3. The following results can be used to demonstrate this view.

use [partiontest];

ALTER PARTITION FUNCTION [pf_partiontest01] () Merge RANGE (' 2014-12-11 ')

3 Toggle partition (switch)

Partitioning data can quickly and efficiently manage subsets of data. You can use Alter TABLE ... SWITCH PARTITION. Statement to transfer a subset of data efficiently and quickly. When a partition moves, it simply modifies the relevant metadata and does not move the data. So the switching speed is very fast.

Switching partitions requires the following requirements:

1, the original table and the target table structure must be the same, and must be in the same filegroup.

2. The target table or partition must be empty.

3. If the source table has a clustered partition index, the target table is required to have the same clustered partition index.

4. All indexes of the source table must be aligned with the source table partition.

5, if the target table exists index, constraint, foreign key, requires the source table must be the same structure as the target table index, constraints, foreign keys.

6. Replication distribution cannot exist for both the target table and the source table.

7. The constraints on whether the partitioning column can be empty must be the same.

The general switching partition can be processed as follows:

1. Assign an existing table as a partition to an existing partitioned table.

ALTER TABLE [dbname]. [dbo]. [TableName] SWITCH to [dbname]. [dbo]. [Staging_tablename] PARTITION source_partition_number_expression

2. Switch partitions from one partitioned table to another partitioned table.

ALTER TABLE [dbname]. [dbo]. [TableName] SWITCH PARTITION source_partition_number_expression to [dbname]. [dbo]. [Staging_tablename] PARTITION source_partition_number_expression

3. Switch a partition to an existing table.

ALTER TABLE [dbname]. [dbo]. [TableName] SWITCH PARTITION source_partition_number_expression to [dbname]. [dbo]. [Staging_tablename]

4 $PARTITION

Returns the partition number for any specified partition function, and a set of partition column values is mapped to that partition number.

[database_name.] $PARTITION. Partition_function_name (expression)

As shown in the following example:

/* Returns a value belonging to a partition */

SELECT $PARTITION. pf_partiontest01 (' 2014-12-26 00:00:00.000 ')

/* Query all values for a partition */

SELECT * FROM [dbo]. [PARTIONTEST01]

where $PARTITION. pf_partiontest01 (salesDate) =2

5 Effects of misaligned and aligned index partitioning on performance

For partitioned table-aligned partitioned indexes, sort tables are generated at once in memory or tempdb at the time of sorting, and if not aligned with partitioned tables, the sort table is generated at the same time for each partition, so more memory space is required.

The more partitions you have, the more memory you need. The minimum size of each sorted table for each partition is 40 pages, 8 KB per page. For example, a non-aligned partitioned index with 100 partitions requires sufficient memory to sequentially sort 4,000 (40 * 100) pages at a time. If there is so much available memory, the build operation will succeed, but performance may be degraded. If there is not so much memory available, the build operation will fail. Aligning a partitioned index with 100 partitions requires only a memory that has a sort of 40 pages, because sorting is not performed at the same time.

Whether you are aligning an index or a non-aligned index, you may need more memory if SQL Server applies parallelism to the build operation on a multiprocessor computer. This is because the higher the degree of parallelism, the more memory is required. For example, if SQL Server sets the degree of parallelism to 4, a non-aligned partitioned index with 100 partitions will require enough memory for four processors to sort 4,000 pages (that is, 16,000 pages) at the same time. If the partitioned index is aligned, the required memory will be reduced as long as four processors are sorted on 40 pages (160 pages, or 4 * 40) respectively.

SQL Server Partitioning

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.