Understanding partitioned tables in SQL Server (GO)

Source: Internet
Author: User
Tags filegroup one table

Introduction

The partition table is the feature introduced in the version after SQL SERVER2005. This feature allows a logical table to be physically divided into many parts. For the previous version of SQL SERVER2005, the so-called partitioned table is just a distributed view , which is a union operation of multiple tables.

A partitioned table is logically a table and physically more than one table. This means that the partition table and the normal table are the same from the user's point of view. The concept can be as simple as the following:

For the previous version of SQL SERVER2005, there is no concept of partitioning, so-called partitioning is just a distributed view:

The partition table described in this article refers to the partitioning table feature introduced after SQL SERVER2005.

Why do I want to partition a table

Before answering the title question, it should be explained that the table partitioning feature is only available in enterprise or development , and understanding the concept of table partitioning also requires understanding the concepts of files and filegroups in SQL Server.

Partitioning a table needs to be used in a variety of scenarios. In general, the use of table partitioning is primarily used for:

    • Archiving, such as the distribution of data from 1 years ago in a sales record to a dedicated archive server
    • Easy to manage, such as dividing a large table into several small tables, backup and restore no longer need to back up the entire table, you can back up the partition separately
    • Improved usability, when a partition is kneeling, only one partition is unavailable and the other partitions are unaffected
    • Improve performance, this is often the goal of most people partitioning, the distribution of a table to different hard disks or other storage media, will greatly improve the speed of queries.

steps for partitioning a table

The definition of a partitioned table is broadly divided into three steps:

    1. Defining partition Functions
    2. Defining the partition Architecture
    3. Defining partition Tables

The relationship between partition functions, partition schemas, and partition tables is as follows:

Partitioned tables rely on partition architectures, and partition architectures depend on partition functions. It is important to note that the partitioning function is not part of a specific partition structure or partition table, but that the relationship between them is only used.

Let's look at an example of how to define a partitioned table:

Suppose we need to define the partition table structure as follows:

The first column is self-increment, OrderID is the Order ID column, and salesdate is the Order date column, which is the basis for which we need to partition.

Let's follow the three steps above to implement a partitioned table.

Defining partition Functions

The partition function is used to determine which partition the data row belongs to, by setting the boundary value in the partition function so that its partition is determined based on the value of a particular column in the row, in the above example, we can determine its different partitions by the value of salesdate. Suppose we want to define two boundary values (boundaryvalue ) is partitioned, three partitions are generated, where I set the boundary values to be 2004-01-01 and 2007-01-01, respectively, and the table in the previous example is divided into three zones based on these two boundary values:

In MSDN, the prototype for defining a partition function is as follows:

CREATE PARTITION FUNCTION partition_function_name (input_parameter_type) as RANGE [left | Right] for VALUES ([Boundary_value [,... N]]) [;]

By defining the prototype of the partitioning function, we see that there is no specific table involved. Because the partition function is not bound to a specific table. The range left and right are also visible in the previous prototype. This parameter determines whether the threshold value itself should be "left" or "correct":

Here we define the partition function according to the above parameters:

Through the system view, you can see that this partition function has been created successfully

defining the partition architecture

Defining a partition function simply means knowing how to differentiate the values of a column from a different partition. The partitioning architecture is needed to define the storage of each partition. Using a partitioning framework requires you to have a little knowledge of files and filegroups.

Let's first look at the prototype of the MSDN Partition architecture:

CREATE PARTITION SCHEME partition_scheme_nameas PARTITION partition_function_name[All "to ({file_group_name | [PRIMARY]} [,... N]) [ ; ]

From the prototype point of view, the partition architecture is only dependent on partition functions. The partition structure is responsible for assigning which filegroup each zone belongs to, and the partition function determines how to logically partition:

Create a partition structure based on the previously created partition function:

Defining partition Tables

The next step is to create a partitioned table. The table has been created to determine whether it is a partitioned table. Although in many cases you will think of partitioning a table when you find that it is already large enough, the partition table can only be specified as a partitioned table when it is created.

Add 50,000 test data to the newly created partition table partitionedtable, where salesdate is randomly generated and randomly distributed from 2001 to 2010. After adding the data, we look at the results in the following statement:

Select CONVERT (varchar (), ps.name) as Partition_scheme,p.partition_number, convert (varchar), ds2.name) as Filegroup, CONVERT (varchar), isnull (V.value, "), as Range_boundary, str (p.rows, 9) as Rowsfrom sys.indexes I Join Sys.partition_schemes PS on i.data_space_id = ps.data_space_id Join sys.destination_data_spaces Ddson ps.data_space _id = dds.partition_scheme_id Join sys.data_spaces ds2 on dds.data_space_id = ds2.data_space_id join sys.partitions p on D ds.destination_id = P.partition_numberand p.object_id = i.object_id and p.index_id = i.index_id Join SYS.PARTITION_FUNCTI ONS pf on ps.function_id = pf.function_id left JOIN sys. Partition_range_values V on pf.function_id = V.function_idand v.boundary_id = P.partition_number-pf.boundary_value_on_ Right WHERE i.object_id = object_id ('partitionedtable ') and i.index_id in (0, 1) Order by P.partition_number 

You can see the data distribution of our partitions:

Partition Table Segmentation

Partition Table segmentation. The equivalent of creating a new partition, inserting the original partition into the new partition, and then deleting the contents of the old partition, with the concept of:

Suppose I add a new split point: 2009-01-01, the concept is as follows:

We can see that if the partition is divided into partitions 3 of the content needs to be split into Partition 4, then the data needs to be copied to partition 4, and delete the corresponding data on partition 3.

This operation consumes IO very very much, and locks the contents of partition three in the process of splitting, which makes the content of partition three unusable. Not only that, the log content generated by this operation will be 4 times times the transferred data!

So if we do not want to because of this operation to the customer caused trouble and the boss burst chrysanthemum words ... It is better to build the split point in the future (i.e., to pre-establish a split point), such as 2012-01-01. The content within partition 3 is unaffected. When data is added to the next 2012, it is automatically inserted into partition 4.

Splitting an existing partition requires two steps:

1. First tell SQL Server which filegroup to put the newly established partition into

2. Create a new split point

This can be done using the following statement:

If our partitioning framework has specified next used at the time of definition, then add the split point directly.

View the long statement of the partition in front of the article: See again:

The new partition has been added!

Merging of Partitions

The merging of partitions can be regarded as the inverse operation of partition segmentation. Partition merging needs to provide a split point, which must already exist in the existing split table, or the merge will be an error

Suppose we need to merge partitions according to 2009-01-01, the concept is as follows:

You only need to use the merge parameter:

Then look at the partition information:

It is worth noting that, assuming that partition 3 and partition 4 are no longer a filegroup, which filegroup should exist after the merge? In other words, is the partition 3 merged into Partition 4 or partition 4 merged into Partition 3? This needs to see if our partitioning function defines left or right. If you define a ieft. The partition 3 on the other side is merged into Partition 4 on the other side. Instead, the partition 4 is merged into Partition 3:

Summary

This article explains how partitioned tables are used in SQL Server. Partitioned tables are a very powerful feature. Using partitioned tables is better than traditional partitioned views for less DBA management!

SQL Server Table Partitioning

The principle is similar to put a table of data in different partitions, when the query, if all in the same separate partition, you do not have a full table scan, so in these cases can improve the efficiency of the query, but if all the partitions to query all the data, the partition will not improve efficiency.

123456789Ten One A - - the - - - + - + A at - - - - - in - to --创建分区表过程一共分为三步:创建分区函数、创建分区方案、创建分区表USE [CardID]GOBEGIN TRANSACTION----创建分区函数CREATE PARTITION FUNCTION [SlotecardFunction](datetime) AS RANGE left FOR VALUES (N‘2014-03-26T00:00:00‘, N‘2014-04-26T00:00:00‘, N‘2014-05-26T00:00:00‘, N‘2014-06-26T00:00:00‘, N‘2014-07-26T00:00:00‘, N‘2014-08-26T00:00:00‘, N‘2014-09-26T00:00:00‘)--查看分区函数是否创建成功 --select * from sys.partition_functions--创建分区方案 关联到分区函数CREATE PARTITION SCHEME [Slotecard] AS PARTITION [SlotecardFunction] TO ([PRIMARY], [fg1], [fg2], [fg3], [fg4], [fg5], [fg6], [fg7]) --查看已创建的分区方案--select * from sys.partition_schemesALTER TABLE [dbo].[ak_SloteCardTimes] DROP CONSTRAINT [PK_ak_SloteCardTimes]--这里要注意一个语法,因为现在表已经存在了,那么就不能再通过CREATE TABLE的方式来创建分区表了,而是通过创建一个聚集索引的方式。但又把它删除掉。--但是,如果表上面已经有一个聚集索引呢?肯定会出错,因为一个表只能有一个聚集索引。那么该怎么办呢?--我们发现向导会这样做,先把原先的聚集索引改为非聚聚的。ALTER TABLE [dbo].[ak_SloteCardTimes] ADD  CONSTRAINT [PK_ak_SloteCardTimes] PRIMARY KEY NONCLUSTERED (    [RecordID] ASC)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY< c21/> --创建聚集索引CREATE CLUSTERED INDEX [ClusteredIndex_on_Slotecard_635317831823593750] ON [dbo].[ak_SloteCardTimes] (<br> [SloteCardTime])WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [Slotecard]([SloteCardTime]) --删除聚集索引DROP INDEX [ClusteredIndex_on_Slotecard_635317831823593750] ON [dbo].[ak_SloteCardTimes] WITH ( ONLINE = OFF ) COMMIT TRANSACTION

In addition to providing a wizard to create partitions, there is also a wizard to manage the partitions, which can be done primarily switch,merge,split these operations. You can also view the data or manually create a new partition temp table

12345678 select $PARTITION.Slotecard([SloteCardTime]) as 分区编号,count([RecordID]) as 记录数 from [ak_SloteCardTimes] group by $PARTITION.Slotecard([SloteCardTime])    -- 查询某个分区--这里我们要用到$PARTITION 函数,这个函数可以帮助我们查询某个分区的数据,还可以检索某个值所隶属的分区号。$PARTITION 函数的进一步细节可以查看MSDN--查询已分区表Order的第一个分区,代码如下:   select*from [ak_SloteCardTimes]    where $partition.Slotecard   ([SloteCardTime])=1

Understanding partitioned tables in SQL Server (GO)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.