In layman SQL Server 2008 partition functions and partition tables

Source: Internet
Author: User
Tags failover filegroup getdate new set one table



Http://www.cnblogs.com/zhijianliutang/archive/2012/10/28/2743722.html



When we have a larger amount of data, we need to split the large tables into smaller tables, and queries that only access departmental data can run faster, the rationale being that the data to be scanned becomes smaller. maintenance tasks (for example, rebuilding an index or backing up a table) can also run faster.



We can no longer get the partition by physically placing the table on multiple disk drives to split the table. If you place a table on a physical drive and you place the related table on another drive, you can improve query performance because multiple disk headers read data simultaneously when you run a query that involves inter-table connections. You can use the SQL Server filegroup to specify the disk where the table is placed.



For partitioning, there are three basic ways: hardware partitioning, horizontal partitioning, and vertical partitioning. Related scenarios can refer to SQL Books Online



Here we introduce the specific combat method of partition table:



The first step is to build the database we want to use, and most importantly, to set up multiple filegroups.



Let's start with a new set of four directories to form a filegroup, a directory to hold the main file: Primary



Three data Files directory: FG1, FG2, FG3



To build a library:


create  database  Sales on primary
(
   name=N‘Sales‘,
   filename=N‘G:\data\Primary\Sales.mdf‘,
   size=3MB,
   maxsize=100MB,
   filegrowth=10%
),
filegroup FG1
(
  NAME = N‘File1‘,   
  FILENAME = N‘G:\data\FG1\File1.ndf‘,   
  SIZE = 1MB,   
  MAXSIZE = 100MB,   
  FILEGROWTH = 10% 
),
FILEGROUP FG2   
(   
  NAME = N‘File2‘,   
  FILENAME = N‘G:\data\FG2\File2.ndf‘,   
  SIZE = 1MB,   
  MAXSIZE = 100MB,
  FILEGROWTH = 10%   
),
FILEGROUP FG3   
(   
  NAME = N‘File3‘,   
  FILENAME = N‘G:\data\FG3\File3.ndf‘,   
  SIZE = 1MB,   
  MAXSIZE = 100MB,   
  FILEGROWTH = 10%   
)   
LOG ON   
(   
  NAME = N‘Sales_Log‘,   
  FILENAME = N‘G:\data\Primary\Sales_Log.ldf‘,   
  SIZE = 1MB,   
  MAXSIZE = 100MB,   
  FILEGROWTH = 10%
)
GO


The second step: to establish a partition function, the purpose is to standardize the different data stored in different directories of the standard, simply say how to partition.


USE Sales   
GO
CREATE PARTITION FUNCTION pf_OrderDate (datetime)   
AS RANGE RIGHT   
FOR VALUES (‘2003/01/01‘, ‘2004/01/01‘) 
GO


We created a partition function for datetime data type, divided by time period
Filegroup Partition Value Range
FG1 1 (past year, 2003/01/01)
FG2 2 [2003/01/01, 2004/01/01]
FG3 3 [2004/01/01, next year]



Step three: Create a partition scheme that is associated to the partition function. The goal is to organize the partitioned functions that have been built into a set of scenarios, and the simple points are where we partition the data.


Use Sales
go
create  partition  scheme ps_OrderDate
as partition  pf_OrderDate
to(FG2,FG2,FG3)
go


It is simple to apply the partition function established in the second step to the established partition group.
Fourth step: Create a partitioned table. Creates a table and binds it to a partition scheme. We first set up two tables, one original table and another to archive the data and save the archived data.


Use Sales
go
create table Orders
(
   OrderID int identity(10000,1),
   OrderDate datetime  not null,
   CustomerID int not null,
   constraint  PK_Orders primary key(OrderID,OrderDate)
)
on ps_OrderDate(OrderDate)
go
create table OrdersHistory
(
   OrderID int identity(10000,1),
   OrderDate datetime  not null,
   CustomerID int not null,
   constraint  PK_OrdersHistory primary key(OrderID,OrderDate)
)
on ps_OrderDate(OrderDate)
go


Here, through the four steps above, we have completed a library with partitioned tables, and we are going to insert some data to test the suitability of our establishment.



First, since the January 1, 2003 is used as a differentiator, we first write the 2002 canonical data to the data table


USE Sales   
GO   
INSERT INTO dbo.Orders (OrderDate, CustomerID) VALUES (‘2002/6/25‘, 1000)   
INSERT INTO dbo.Orders (OrderDate, CustomerID) VALUES (‘2002/8/13‘, 1000)   
INSERT INTO dbo.Orders (OrderDate, CustomerID) VALUES (‘2002/8/25‘, 1000)   
INSERT INTO dbo.Orders (OrderDate, CustomerID) VALUES (‘2002/9/23‘, 1000)
GO


Likewise we write 2003 four data


USE Sales   
GO
INSERT INTO dbo.Orders (OrderDate, CustomerID) VALUES (‘2003/6/25‘, 1000)
INSERT INTO dbo.Orders (OrderDate, CustomerID) VALUES (‘2003/8/13‘, 1000)
INSERT INTO dbo.Orders (OrderDate, CustomerID) VALUES (‘2003/8/25‘, 1000)
INSERT INTO dbo.Orders (OrderDate, CustomerID) VALUES (‘2003/9/23‘, 1000)   
GO


Let's see if this data is fully entered:






Because we have not archived data for the Ordershistory table, we are empty.



Let's do this by sub-conditional query:



1. Query a partition



Here we are going to use the $partition function. This function is explained in Books Online:


usage:
Returns the partition number for any specified partition function to which a set of partition column values will be mapped.

Grammar:
[ database_name. ] $PARTITION.partition_function_name(expression)
 
parameter:
Database_name
The name of the database containing the partitioning function.

Partition_function_name
The name of any existing partition function to which a set of partitioned column values is applied.

Expression
An expression whose data type must match or can be implicitly converted to its corresponding partition column data type. Expression can also be the name of the partition column currently participating in partition_function_name.

Return type:
Int

Remarks:
$PARTITION Returns the int value from 1 to the number of partitions of the partition function.

$PARTITION will return the partition number for any valid value, regardless of whether this value currently exists in a partitioned table or index that uses a partition function.


Let's query the first partition of the partition table order, with the following code:



We can see that the data we query is all 2002 years, that is, in the first partition we have deposited data is less than 2003, according to this inference 2003 years of data, it should exist in the second partition:






As we expected, we can group by this partition to see how many rows of data are in each partition, the code is as follows:


select $partition.pf_OrderDate(OrderDate) as Patition,COUNT(*) countRow from dbo.Orders
group by $partition.pf_OrderDate(OrderDate)


You can also use the $partition function to get the partition number of a set of partition identity column values, for example, to get 2002 of which partition, the code is as follows:






2. Archive data



If this is the beginning of 2003, then we can archive 2002 of all your transactions into the history order form Histroryorder we just established. The code is as follows:


Use salesgoalter table orders switch partition 1 to ordershistory partition 1go


Now let's look at the data for these two tables again:






At this point the Orders table has only 2003 years of data left, and the Odershistory table contains 2002 years of data.



The simple point is to import data from the first area into the first section of another partitioned table.



Of course, we can archive all the transaction data for 2003 years by the beginning of the 2004 year.


Use salesgoalter table orders switch partition 2 to ordershistory partition 2go


It is important to note that when we make the data modification according to the district, we must operate the partition table under the same partition function, and the partition structure corresponds, if not this will be an error, for example:






3. Adding partitions



When we need to add a new partition, we need to modify the partitioning scheme, for example, now that we are at the beginning of the year 2005, we need to prepare partitions for the 2005-year transaction, we need to add the partition:


USE [master]
GO
ALTER DATABASE [Sales] ADD FILEGROUP [FG4]
GO
ALTER DATABASE [Sales] ADD FILE ( NAME = N‘File4‘, FILENAME = N‘G:\data\FG4\File4.ndf‘ , SIZE = 3072KB , FILEGROWTH = 1024KB ) TO FILEGROUP [FG4]
GO


We have created a new filegroup, and we have also modified the partitioning function and scheme as described above:


use Sales
go
alter partition scheme ps_OrderDate  next used [FG4]
alter partition function  pf_OrderDate() split range(‘2005/01/01‘)
go


Here we use the ALTER partition Scheme ps_orderdate Next used FG4 to specify the data for the new partition in that file. Here Next used FG4 Specifies the fourth filegroup we have just created. Of course we can put in the original set of filegroups, in order to prevent data chaos storage We are mostly the new set of filegroups.



Alter PARTITION function Pf_orderdate () split range (' 2005/01/01 ') represents the creation of a new partition, where split range is the key syntax for creating a new partition.



At this point, we have four partitions, where the interval is as follows:



Filegroup Partition Value Range
FG1 1 (past year, 2003/01/01)
FG2 2 [2003/01/01, 2004/01/01]
FG3 3 [2004/01/01, 2005/01/01]



FG4 4 [2004/01/01, next year]



4. Delete a partition



Delete partition is also called merge partition, simply say is two partition data to merge, for example we want to merge 2002 year partition and 2003 partition to a partition, we can use the following code:


use Sales
go
alter partition function  pf_OrderDate() merge range(‘2003/01/01‘)
go


That is, the 2003 partition point is removed, the data inside the partition is automatically merged together.



After executing the above code, the partition interval is as follows:
Filegroup Partition Value Range
FG2 1 [last year, 2004/01/01]
FG3 2 [2004/01/01, 2005/01/01]
FG2 3 [2005/01/01, next year]



Merging data from 2002 and 2003 to 2003 years later, we execute the following code:


SELECT Sales $PARTITION. Pf_orderdate (' 2003 ')


You will find that the result of the return is 1. The original return was 2, because the partition where the data was before 2002 was merged into the 2003 partition.
At this point we execute the following code:


SELECT *
FROM dbo.OrdersHistory   
WHERE $PARTITION.pf_OrderDate(OrderDate) = 2


The result is that a row of data is not returned, as is the case, because the Orderhistroy table only stores historical data for 2002 and 2003, and before the partition is merged, executing the code above will definitely query for 2003 data, but after merging the partitions, The above code actually queries the data for 2004 in the second partition.
But let's change to the following code:


SELECT *
FROM dbo.OrdersHistory   
WHERE $PARTITION.pf_OrderDate(OrderDate) = 1


8 rows of data will be queried, including 2002 and 2003, as the data for the 2002 and 2003 years after the merge partition becomes the 1th partition.
5. View Meta data



We can view our partitioning functions, partitioning schemes, boundary value points, etc. through three system views.


select * from sys.partition_functions   
select * from sys.partition_range_values
select * from sys.partition_schemes


Category: SQL SERVER good text to top focus on my collection the text fingertips flow
Follow-47
Fans-892 + plus attention70? Previous: HTTP protocol detailed (GO)
? Next article: yield keyword in C # posted @2012-10-28 16:49 Fingertip flow reading (13527) Comments (9) Edit Collection





List of commentsReply Reference#1楼2012-10-28 17:17 BirchTranscription, and the bad parts of the partition table you don't know yet. Support (0) objection (0)Reply Reference#2楼2012-10-28 17:24 BirchWhen a disk is idle, SQL Server does not detect which disk is idle, but writes the data to each partition (disk), so the original high-load disk is still high-load and cannot achieve load balancing
The good thing is to talk to you. Support (0) objection (0)Reply Reference#3楼 [Landlord] 2012-10-28 17:37 fingertip flow @ Hua Zi
Oh, I did not mention the partition table can achieve load balancing bar, the advantages I summed up to these points:
1, improve query performance: The key point is the parallel IO, of course, the real application we are generally disk array;
2. Enhanced usability: If one partition of the table fails, the data for the table on the other partitions is still available, and of course the production environment is not limited, it will be failover cluster
3, Maintenance Convenience: If a partition of the table fails, need to repair the data, repair the partition can be; Support (0) objection (0)Reply Reference#4楼2012-10-28 18:33 The 50-chord of Jin-CerseiIf sales is in one table, Salesdetail is another table, and both have foreign key associations.
So when you archive historical data from the sales table to the Saleshistory, are the associated records in Salesdetail needed to be archived, or do they affect query efficiency? Support (0) objection (0)Reply Reference#5楼2012-10-29 09:31 john23.netGood study Support (0) objection (0)Reply Reference#6楼2012-10-29 18:44 Birch@ Fingertip Flow
Quote@ Birch Oh, I did not mention the partition table can achieve load balancing bar, the advantages I summed up to these points:
1, improve query performance: The key point is the parallel IO, of course, the real application we are generally disk array;
2. Enhanced usability: If one partition of the table fails, the data for the table on the other partitions is still available, and of course the production environment is not limited, it will be failover cluster
3, Maintenance Convenience: If a partition of the table fails, you need to repair the data, repair the partition can be;

Yes, when a partition fails, only the table data of that partition cannot be queried, the table data of the partitioned partition can be queried

I just said, and did not mean to LZ, after all, LZ fans more than I!! Support (0) objection (0)Reply Reference#7楼2013-12-02 13:52 I'm in! I feel very good for a beginner who has just contacted the partition table. Support (0) objection (0)Reply Reference#8楼2014-06-27 14:20 Happy FlightLandlord, Hello! I now need to create a partitioned table in a detailed split, the original table only the primary key CompanyID, now split will have the primary key CompanyID (aggregation) and non-primary key CompanyID (non-clustered), is not I split the time still missing steps?
My current steps are:
1 、--adding filegroups
2 、--Add Files
3 、----Modify the partitioning scheme to set the new filegroup to next used
4 、--Modifying a partition function
Need to solve the problem, thank you! Support (0) objection (0)Reply Reference#9楼2016-04-26 16:48 dr_twilightAutomated partitioning scheme job
DECLARE @maxValue INT,
@secondMaxValue INT,
@differ INT,
@fileGroupName VARCHAR (200),
@fileNamePath VARCHAR (200),
@fileName VARCHAR (200),
@sql NVARCHAR (1000)


SET @fileGroupName = ' FileGroup ' +replace (replace (CONVERT (varchar, GETDATE (), 120), '-', '), ', '), ': ')
PRINT @fileGroupName
SET @sql = ' ALTER DATABASE [Sales] ADD FILEGROUP ' [email protected]
PRINT @sql
EXEC (@sql)

SET @fileNamePath = ' D:\DATA\ ' +replace (replace (CONVERT (varchar, GETDATE (), 120), '-', '), ', '), ': ', ') + '. NDF '
SET @fileName =n ' File ' +replace (replace (CONVERT (varchar, GETDATE (), 120), '-', '), ', '), ': ', ')

SET @sql = ' ALTER DATABASE [Sales] ADD FILE (name= ' [email protected]+ ' ', filename=n ' [email protected]+ ') to FILEGROUP ' + ' [email protected]
PRINT @sql
PRINT 1
EXEC (@sql)
PRINT 2

--Modify the partition scheme and use a new filegroup to hold the next new data
SET @sql = ' ALTER PARTITION SCHEME [ps_orderdate] NEXT used ' + ' [email protected]
EXEC (@sql)
--Partition architecture
PRINT 3
SELECT @maxValue =convert (Int,max (value))
From SYS. Partition_range_values PRV

SELECT @secondMaxValue = CONVERT (int,min (value))
From
(
SELECT TOP 2 * from SYS. Partition_range_values ORDER by VALUE DESC
)
PRV

SET @[email protected]-@secondMaxValue


ALTER PARTITION function Pf_orderdate ()--partition functions
SPLIT RANGE (@[email protected])





In layman SQL Server 2008 partition functions and partition tables


Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.