Processing massive data volumes with SQL Server partitioned tables

Source: Internet
Author: User
Tags filegroup

Are you struggling to optimize your SQL Server database? If your database contains many very large tables, partitioning can help a great deal, because it lets you split these large tables across separate filegroups. This technique allows you to distribute data across different physical disks and to improve query performance by taking advantage of their parallelism.

Partitioning a table in SQL Server takes three steps:

1) create a partition function

2) create a partition scheme

3) partition the table

  Step 1: Create a partition function

The partition function defines how you want SQL Server to partition data. At this stage we are not referring to any particular table; we are only describing, in general terms, how the data is to be split.

Partitions are defined by specifying the boundary of each partition. For example, assume that we have a Customers table that contains information about all the customers of the enterprise. Each customer is identified by a unique customer number, ranging from 1 to 1000000. We can use the following partition function (customer_partfunc) to divide the table into four equal partitions:

CREATE PARTITION FUNCTION customer_partfunc (int)
AS RANGE RIGHT
FOR VALUES (250000, 500000, 750000)

The three boundary values define four partitions. The first partition contains all records whose values are less than 250000. The second partition contains all records with values between 250000 and 499999. The third partition contains all records with values between 500000 and 749999. All other records, with values greater than or equal to 750000, fall into the fourth partition.

Note that the RANGE RIGHT clause is used in this example. It means that each boundary value belongs to the partition on its right. If the RANGE LEFT clause were used instead, the first partition would contain all records whose values are less than or equal to 250000, the second partition would contain all records with values between 250001 and 500000, and so on.
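To check where a given value lands, the built-in $PARTITION function can be applied to the partition function. The following is a minimal sketch, assuming customer_partfunc has been created exactly as above (RANGE RIGHT with boundaries 250000, 500000, and 750000); the column aliases are illustrative only.

```sql
-- Assumes customer_partfunc exists as RANGE RIGHT (250000, 500000, 750000).
-- Each call returns the number of the partition the value would fall into.
SELECT
    $PARTITION.customer_partfunc(249999) AS below_first_boundary,  -- returns 1
    $PARTITION.customer_partfunc(250000) AS on_first_boundary,     -- returns 2
    $PARTITION.customer_partfunc(500000) AS on_second_boundary,    -- returns 3
    $PARTITION.customer_partfunc(750000) AS on_third_boundary;     -- returns 4
```

With RANGE LEFT and the same boundaries, the value 250000 would map to partition 1 instead of partition 2.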

  Step 2: Create a partition scheme

Once you have created a partition function that defines how to partition data, the next step is to create a partition scheme, which defines where each partition is stored. This is a straightforward process. For example, if I have four filegroups named "fg1" to "fg4", the following partition scheme can be used:

CREATE PARTITION SCHEME customer_partscheme
AS PARTITION customer_partfunc
TO (fg1, fg2, fg3, fg4)

Note that we have linked a partition function to the partition scheme, but we have not yet linked the partition scheme to any specific database table. This is where reuse comes in: the same partition scheme (or just the partition function) can be applied to any number of tables in the database.
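As an illustration of reusing the function on its own, a second scheme can be built on the same partition function. This is only a sketch: the scheme name customer_partscheme_primary is hypothetical, and placing every partition on the PRIMARY filegroup is an assumption made to keep the example short.

```sql
-- Hypothetical second scheme that reuses customer_partfunc.
-- ALL TO sends every partition to a single filegroup (here PRIMARY),
-- so the partition boundaries are shared but the storage layout differs.
CREATE PARTITION SCHEME customer_partscheme_primary
AS PARTITION customer_partfunc
ALL TO ([PRIMARY]);
```

A table created on this scheme would be split at the same boundaries as one created on customer_partscheme, but all of its partitions would be stored in one filegroup.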

  Step 3: Partition the table

After creating the partition scheme, you can partition the table. This is the simplest step. You only need to add an ON clause to the table creation statement, naming the partition scheme and the table column to which it applies. You do not need to specify the partition function, because it is already defined in the partition scheme.

For example, if you want to use the preceding partitioning scheme to create a customer table, you need to use the following Transact-SQL statement:

CREATE TABLE Customers (FirstName nvarchar(40), LastName nvarchar(40), CustomerNumber int)
ON customer_partscheme (CustomerNumber)
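Once the table exists, the sys.partitions catalog view can confirm that it was created with one partition per range. The query below is a sketch; the table name dbo.Customers is an assumption, so substitute the name used in your own CREATE TABLE statement.

```sql
-- Lists each partition of the table and the number of rows it holds.
-- dbo.Customers is an assumed name; replace it with your table name.
SELECT p.partition_number, p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.Customers')
  AND p.index_id IN (0, 1)   -- 0 = heap, 1 = clustered index
ORDER BY p.partition_number;
```

For a table built on the four-partition scheme above, this should list partitions 1 through 4.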

A very large database is usually several hundred GB in size, and sometimes has to be measured in TB. A single table often holds hundreds of millions of records, and the number keeps growing over time. This not only reduces the efficiency of database operations but also makes the database harder to maintain. Besides table size, differing access patterns can also affect performance and availability. Partitioning large tables sensibly can greatly ease all of these problems. When tables and indexes become very large, partitioning divides the data into smaller, more manageable parts and improves system efficiency. If the system has multiple CPUs or multiple disk subsystems, parallel operations can deliver better performance. Partitioning a large table is therefore an efficient way to handle massive data. This article uses a concrete example to describe how to create and modify a partitioned table and how to view its partition information.

1 SQL Server 2005

SQL Server 2005 is a database platform that Microsoft released five years after its predecessor. Its database engine provides safer and more reliable storage for relational and structured data, so that users can build and manage highly available, high-performance data applications for their business. In addition, SQL Server 2005 integrates analysis, reporting, integration, and notification functions. This allows enterprises to build and deploy cost-effective BI solutions that help teams deliver data applications to every area of the business through scorecards, dashboards, Web services, and mobile devices. SQL Server 2005 provides innovative solutions for developers, database administrators, information workers, and decision makers, and helps them get more benefit from their data.

It brings new features such as T-SQL enhancements, data partitioning, Service Broker, and .NET Framework integration, which greatly enhance manageability, availability, scalability, and security.

2 Implementation of table partitions

Table partitioning can be horizontal or vertical. Horizontal partitioning divides a table into multiple tables, each containing the same columns but fewer rows. For example, you can horizontally partition a table with billions of rows into 12 tables, each holding the data for one month of a given year. Any query that needs the data for a specific month then only has to reference the table for that month. Vertical partitioning divides the original table into multiple tables that each contain fewer columns. Horizontal partitioning is the most common method, and it is the one this article describes.

A common approach to horizontal partitioning is to split data by time period and usage. In this example, an SMS sending record table contains data for the last year, but only the data for the current quarter is accessed regularly. In this case, you can divide the data into four partitions, each containing one quarter of data.

2.1 Create a file group

To create a partitioned table, you must first create filegroups; creating several of them helps balance I/O. Generally, the number of filegroups should equal the number of partitions, and the filegroups are usually placed on different disks. Each filegroup can consist of one or more files, and each partition must be mapped to a filegroup. One filegroup can be used by multiple partitions. To make data easier to manage (for example, for finer-grained backup control), design the partitioned table so that only related or logically grouped data resides in the same filegroup. Use ALTER DATABASE to add a logical filegroup name:

ALTER DATABASE [DeanDB] ADD FILEGROUP [FG1]

DeanDB is the database name and FG1 is the filegroup name. After creating the filegroup, use ALTER DATABASE to add a file to it:

ALTER DATABASE [DeanDB] ADD FILE (NAME = N'fg1', FILENAME = N'c:\DeanData\fg1.ndf', SIZE = 3072KB, FILEGROWTH = 1024KB) TO FILEGROUP [FG1]

In the same way, create four filegroups with one file each, and place each data file on a different disk drive.
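The remaining three filegroups follow the same pattern. The sketch below is an illustration only: the drive letters and folder paths are assumptions (the article does not say which disks were used), so adjust them to your own environment.

```sql
-- Drive letters and folders are assumed for illustration; adjust to your disks.
ALTER DATABASE [DeanDB] ADD FILEGROUP [FG2];
ALTER DATABASE [DeanDB] ADD FILE
    (NAME = N'fg2', FILENAME = N'd:\DeanData\fg2.ndf', SIZE = 3072KB, FILEGROWTH = 1024KB)
    TO FILEGROUP [FG2];

ALTER DATABASE [DeanDB] ADD FILEGROUP [FG3];
ALTER DATABASE [DeanDB] ADD FILE
    (NAME = N'fg3', FILENAME = N'e:\DeanData\fg3.ndf', SIZE = 3072KB, FILEGROWTH = 1024KB)
    TO FILEGROUP [FG3];

ALTER DATABASE [DeanDB] ADD FILEGROUP [FG4];
ALTER DATABASE [DeanDB] ADD FILE
    (NAME = N'fg4', FILENAME = N'f:\DeanData\fg4.ndf', SIZE = 3072KB, FILEGROWTH = 1024KB)
    TO FILEGROUP [FG4];

-- Confirm the filegroups now exist in the database.
USE [DeanDB];
SELECT name FROM sys.filegroups ORDER BY name;
```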

2.2 Create a partition function

To create a partitioned table, you must first define the partition function, which determines the criteria by which the table is partitioned. When creating a partition function, you choose either RANGE LEFT or RANGE RIGHT, which determines the side of each boundary on which the boundary value itself falls. For example, for four partitions you define three boundary values and specify whether each value is the upper boundary of the partition to its left (LEFT) or the lower boundary of the partition to its right (RIGHT) [1]. The code is as follows:

CREATE PARTITION FUNCTION [SendSMSPF] (datetime) AS RANGE RIGHT FOR VALUES ('20140901', '20160301', '20160901')
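After the function is created, its boundaries can be read back from the catalog views sys.partition_functions and sys.partition_range_values. This is a minimal sketch that assumes the function is named SendSMSPF as above; SQL Server stores the boundary values in ascending order regardless of the order in which they were listed.

```sql
-- Lists the boundary values of SendSMSPF in ascending order.
-- boundary_value_on_right = 1 means the function was created as RANGE RIGHT.
SELECT pf.name,
       pf.boundary_value_on_right,
       prv.boundary_id,
       prv.value AS boundary_value
FROM sys.partition_functions AS pf
JOIN sys.partition_range_values AS prv
    ON prv.function_id = pf.function_id
WHERE pf.name = N'SendSMSPF'
ORDER BY prv.boundary_id;
```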

2.3 Create a partition scheme

After creating a partition function, you must associate it with a partition scheme, which directs each partition to a specific filegroup. The scheme defines the mapping between the media that physically store the data and each partition. Multiple tables can share the same partition function, but they generally do not share the same partition scheme. Different partition schemes can use the same partition function, so that different tables are partitioned by the same criteria but stored on different media. The code for creating a partition scheme is as follows:

CREATE PARTITION SCHEME [SendSMSPS] AS PARTITION [SendSMSPF] TO ([FG1], [FG2], [FG3], [FG4])
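To see which filegroup each partition has been mapped to, the scheme can be inspected through the catalog views. The following sketch assumes the scheme is named SendSMSPS as above; the column aliases are illustrative.

```sql
-- Shows the filegroup assigned to each partition number of SendSMSPS.
SELECT ps.name             AS partition_scheme,
       dds.destination_id  AS partition_number,
       fg.name             AS filegroup_name
FROM sys.partition_schemes AS ps
JOIN sys.destination_data_spaces AS dds
    ON dds.partition_scheme_id = ps.data_space_id
JOIN sys.filegroups AS fg
    ON fg.data_space_id = dds.data_space_id
WHERE ps.name = N'SendSMSPS'
ORDER BY dds.destination_id;
```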

2.4 Create a partition table

After you have created the partition function and the partition scheme, you can create the partitioned table. The table is associated with the partition scheme through its partitioning key column. When a record is inserted, SQL Server places it in the appropriate partition according to the value of the partitioning key. This ties the partition function, the partition scheme, and the partitioned table together. The code for creating the partitioned table is as follows:

CREATE TABLE SendSMSLog
(
    [ID] [int] IDENTITY(1, 1) NOT NULL,
    [IDNum] [nvarchar](50) NULL,
    [SendContent] [text] NULL,
    [SendDate] [datetime] NOT NULL
) ON SendSMSPS (SendDate)
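To see the partitioning key at work, a few test rows can be inserted with dates on either side of the boundaries. The rows below are made-up sample data, not taken from the article, and the partition numbers in the comments assume the RANGE RIGHT function SendSMSPF with the three boundary dates defined earlier.

```sql
-- Made-up sample rows; SQL Server routes each row by its SendDate value.
-- Single-row INSERT statements are used so the script also runs on SQL Server 2005.
INSERT INTO SendSMSLog (IDNum, SendContent, SendDate)
VALUES (N'A001', 'Sample message 1', '20140815');  -- before 20140901: partition 1

INSERT INTO SendSMSLog (IDNum, SendContent, SendDate)
VALUES (N'A002', 'Sample message 2', '20150101');  -- partition 2

INSERT INTO SendSMSLog (IDNum, SendContent, SendDate)
VALUES (N'A003', 'Sample message 3', '20160401');  -- partition 3

INSERT INTO SendSMSLog (IDNum, SendContent, SendDate)
VALUES (N'A004', 'Sample message 4', '20161001');  -- on or after 20160901: partition 4
```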

2.5 View partition table information

After the system has run for a while, or after historical data has been imported into the partitioned table, we need to check how the data is actually stored: how many records each partition holds, and which records are stored in which partition. We can check this with the $PARTITION.SendSMSPF function. The code is as follows:

SELECT $PARTITION.SendSMSPF(o.SendDate) AS [Partition Number]
     , MIN(o.SendDate) AS [Min SendDate]
     , MAX(o.SendDate) AS [Max SendDate]
     , COUNT(*) AS [Rows In Partition]
FROM dbo.SendSMSLog AS o
GROUP BY $PARTITION.SendSMSPF(o.SendDate)
ORDER BY [Partition Number]

Run the preceding script in the query analyzer. The result is shown in Figure 1:

Figure 1: Partition table information

2.6 Maintain partitions

Partition maintenance mainly covers adding, removing, merging, and switching partitions. You can use the SPLIT and MERGE options of ALTER PARTITION FUNCTION, and the SWITCH option of ALTER TABLE. SPLIT adds one more partition, MERGE combines two partitions and so reduces their number, and SWITCH logically moves a partition from one table to another.
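The three operations can be sketched as follows. This is an illustration under stated assumptions, not the article's own maintenance script: the new boundary date '20170301', the choice of FG1 for the new partition, and the table SendSMSLog_Archive are all hypothetical. For SWITCH to succeed, the target table must be empty, have the same structure as SendSMSLog, and reside on the same filegroup as the partition being switched out.

```sql
-- SPLIT: first tell the scheme which filegroup the new partition will use,
-- then add a boundary value (hypothetical date), creating one more partition.
ALTER PARTITION SCHEME SendSMSPS NEXT USED [FG1];
ALTER PARTITION FUNCTION SendSMSPF() SPLIT RANGE ('20170301');

-- MERGE: remove an existing boundary value, combining the two
-- partitions on either side of it into one.
ALTER PARTITION FUNCTION SendSMSPF() MERGE RANGE ('20140901');

-- SWITCH: move partition 1 into an empty table with an identical structure.
-- SendSMSLog_Archive is a hypothetical table that must already exist.
ALTER TABLE SendSMSLog SWITCH PARTITION 1 TO SendSMSLog_Archive;
```

SWITCH is a metadata operation, which is why it is typically used to archive or load a whole partition without copying rows one by one.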

3 Performance Comparison

We compared performance on a single table holding about 26.5 million records and occupying about 4 GB of storage. The test environment was an IBM365 server with two Xeon 2.7 GHz CPUs, 16 GB of memory, and two 136 GB hard disks. The system platform was Windows 2003 SP1 with SQL Server 2005 SP1. The test results are shown in Table 1:

Table 1: Performance Comparison between partitions and unpartitioned tables (unit: milliseconds)

Test item    Partitioned    Not partitioned
1            16546          61466
2            13             33
3            20140          61546
4            17140          61000

Note:

1: Time taken to retrieve one day's records by date

2: Time taken to insert a single record

3: Time taken to delete one day's records by date

4: Time taken to count the number of records per month

Table 1 shows that operations on the partitioned table are faster than those on the unpartitioned table. Operations on the partitioned table can use CPU and I/O in parallel, less data has to be scanned, and the time spent locating data is shorter.
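The article does not list the statements used in the tests, so the following is only an illustrative sketch of a date-based retrieval like test item 1. The date range is an assumed example. Because the filter is on the partitioning column SendDate, SQL Server only needs to read the partition that covers that range.

```sql
-- Illustrative only: counts the records sent on one (assumed) day.
-- Filtering on the partitioning column lets SQL Server skip partitions
-- whose date ranges cannot contain matching rows.
SELECT COUNT(*) AS records_for_day
FROM dbo.SendSMSLog
WHERE SendDate >= '20160401'
  AND SendDate <  '20160402';

-- Shows which partition that day belongs to.
SELECT $PARTITION.SendSMSPF('20160401') AS partition_number;
```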

4 Conclusion

Processing massive data has always been a headache. Separation is the first technique that designers consider: whether it is separating application functions or separating data access, a reasonable plan can effectively solve the problems of low operating efficiency and high maintenance cost of large tables. The table partitioning feature added in SQL Server 2005 allows you to partition data sensibly. When you access part of the data, the SQL Server optimizer can use the physical placement of the data to find the best execution plan, instead of searching for a needle in a haystack.
