MS SQL Server: Partitioned tables, partitioned indexes, detailed

Source: Internet
Author: User
Tags filegroup microsoft sql server 2005 truncated

1. Introduction to Partitioned Tables
The primary purpose of using partitioned tables is to improve the scalability of large tables and tables with various access patterns and toManagementOf

? Large tables: Tables with huge amounts of data.
? Access patterns: Different sets of data that need to be accessed for different purposes, access to each of these purposes can be called an access pattern.

Partitioning can be used to divide data into smaller, more manageable parts, to improve performance, and on the other hand, for systems with multiple CPUs, partitioning can be done in parallel to the table, which is very helpful for improving performance.

Note: You can onlySQLServerCreate a partition function in Enterprise Edition. Only SQL Server Enterprise Edition supports partitions.
2. Steps to create a partitioned table or partition index
Can be divided into the following steps:
1. Determine the number of partition columns and partitions
2. Determine if multiple filegroups are used
3. Create a partition function
4. Create a partition schema (schema)
5. Create a partitioned table
6. Create a partitioned index

The steps for creating a partitioned table, a partitioned index, are described in detail below.
2.1. Determine the number of partition columns and partitions
Before starting the partitioning operation, first determine the access mode of the table to be partitioned, which determines what columns are suitable for partitioning the key. For example, for sales data, the data is typically scoped to a range based on the date, and then further queries are made on that basis, so that the date can be used as a partitioned column.

After you have determined the partitioning column, you need to further determine the number of partitions, that is, how much data should be included in the partition table, and to what extent each partition's data should be scoped.

2.2. Determine if multiple filegroups are used
To help optimize performance and maintenance, you should use filegroups to separate data. In general, if you frequently operate on a partition's entire dataset, the number of filegroups is best to be the same as the number of partitions, and these filegroups should typically reside on separate disks, and with multiple CPUs, SQL Server can process multiple partitions in parallel, greatly reducing the overall time to process a large number of complex reports and analyses.

2.3. Create a partition function
Partition functions are used to define boundary conditions for partitions, and the syntax for creating partition functions is as follows:
CREATE PARTITION FUNCTION partition_function_name (input_parameter_type)
As RANGE [left | Right]
For VALUES ([Boundary_value [,... N]])
[ ; ]

Parameter description:
? Partition_function_name
is the name of the partition function. The partition function name isDatabaseMust be unique within the rules that conform to the identifiers.

? Input_parameter_type
Is the data type of the column used for partitioning. When used as a partitioning column, except for text, ntext, image, XML, timestamp, varchar (max), nvarchar (max), varbinary (max), alias data types, or CLR user-defined data types, All data types are valid.

The actual column (also known as the partition column) is specified in the Create TABLE or create INDEX statement.

? Boundary_value
Specifies the boundary value for each partition that uses Partition_function_name's partitioned table or index. If Boundary_value is empty, the partition function uses Partition_function_name to map the entire table or index to a single partition. Only one of the partition columns specified in the Create TABLE or create INDEX statement can be used.

Boundary_value is a constant expression that can reference a variable. This includes user-defined type variables, or functions, and user-defined functions. It cannot reference a Transact-SQL expression. The Boundary_value must match the data type provided in the Input_parameter_type or be implicitly convertible to that data type, and if the size and scale of the value are associated with Input_parameter_type Does not match the size and scale of the corresponding value in the implicit conversion process, the value cannot be truncated.

Attention:
If the boundary_value contains datetime or smalldatetime literal values, then for these literal values it is assumed that Us_english is the session language when calculating. It is not recommended to use this behavior. To ensure that the partition function definition has the expected behavior for all conversational languages, it is recommended to use constants that are interpreted in the same way for all language settings, such as the YYYYMMDD format, or to explicitly convert literal values to specific styles. For more information, see Writing Internationalized Transact-SQL statements. To determine the language session for the server, run the SELECT @ @LANGUAGE.

? ... n
Specifies the number of values provided by Boundary_value and cannot exceed 999. The number of partitions created equals n + 1. You do not have to list the values sequentially. If the values are not listed sequentially, the Microsoft SQL Server 2005 Database engine sorts them, creates the function, and returns a warning stating that the values are not supplied sequentially. If n includes any duplicate values, the database engine returns an error.

? Left | Right
Specifies the side (left or right) that Boundary_value [,... n] belongs to each boundary value interval when the interval value is sorted from left to right in ascending order by the database engine. If not specified, the default value is left.

Example of creating a partition function:
CREATE PARTITION FUNCTION pf_left (int)
As RANGE left
For VALUES (10, 20)
GO

CREATE PARTITION FUNCTION pf_right (int)
As RANGE left
For VALUES (10, 20)
GO

Distinction between pf_left and pf_right partition functions:
partition function partition 1 partition 2 partition 3
Pf_left <= > <= > 20
Pf_right < >= and < >= 20

2.4. Create a Partition schema (schema)
After you create a partition function, you must associate it with the partition schema (schema) to direct the partition to a specific filegroup. Define a partition architect, even if multiple partitions are in the same filegroup, you must specify a filegroup for each partition.

The syntax for creating a partition schema is as follows:
Gocreate PARTITION SCHEME Partition_scheme_name
As PARTITION partition_function_name
[All] to ({file_group_name | [PRIMARY]} [,... N])
[ ; ]

Parameters:
? Partition_scheme_name
The name of the partition scheme. The partition scheme name must be unique within the database and conform to the rules for identifiers.

? Partition_function_name
The name of the partition function that uses the partitioning scheme. partitions created by the partition function are mapped to the filegroup specified in the partition scheme. Partition_function_name must already exist in the database.

? All
Specifies that all partitions are mapped to the filegroup provided in File_group_name, or to the primary filegroup if [PRIMARY] is specified. If all is specified, only one file_group_name can be specified.

? File_group_name | [PRIMARY] [,... N]
Specifies the name of the filegroup used to hold the partition specified by Partition_function_name. File_group_name must already exist in the database.

If [PRIMARY] is specified, the partition is stored in the primary filegroup. If all is specified, only one file_group_name can be specified. The order in which partitions are assigned to filegroups starts with partition 1 and is allocated in the order listed in [,... N] of filegroups. in [,... n], you can specify the same file_group_name multiple times. If n is not sufficient to have the number of partitions specified in Partition_function_name, the CREATE partition SCHEME will fail with an error.

If partition_function_name generates fewer partitions than the number of filegroups, the first unassigned filegroup is labeled next used, and information appears that displays the named next used filegroup. If all is specified, a separate file_group_name will keep its NEXT used property for that partition_function_name. If a partition is created in the ALTER PARTITION FUNCTION statement, the NEXT used filegroup will receive another partition. To create another unallocated filegroup to own the new partition, use ALTER PARTITION SCHEME.

When you specify a primary filegroup in file_group_name[1,... n], you must separate PRIMARY as it does in [PRIMARY], because it is a keyword.

Example of creating a partition schema:
CREATE PARTITION FUNCTION myRangePF1 (int)
As RANGE left for VALUES (1, 100, 1000);
GO
CREATE PARTITION SCHEME myRangePS1
As PARTITION myRangePF1
To (TEST1FG, TEST2FG, TEST3FG, TEST4FG);
GO

2.5. Create a partitioned table
After defining the partitioning function (logical structure) and partition schema (physical structure), you can create partitioned tables to take advantage of them. The partition table defines the partition schema that should be used, and the partition schema defines the partition functions it uses. To combine the three, you must specify the columns that apply to the partition function. A range partition is always mapped to only one column in the table.

The CREATE TABLE syntax is as follows:
CREATE TABLE
[database_name. [Schema_name]. | Schema_name. ] table_name
({<column_definition> | <computed_column_definition>}
[<table_constraint>] [,... N])
[on {partition_scheme_name (partition_column_name) | filegroup
| "Default"}]
[{textimage_on {filegroup | "Default"}]
[ ; ]

Examples are as follows:
CREATE TABLE myRangePT1
(
ID int NOT NULL,
Age int,
PRIMARY KEY (ID)
) on myRangePS1 (myRangePF1)
GO


2.6. Create a partitioned index
Indexes are very effective for improving query performance, so it is generally important to consider indexing a partitioned table, indexing a partitioned table and indexing the normal table all the time, but its behavior differs from the normal index.

By default, the indexes created in the partitioned table use the same partition schema and partitioning columns as the partitioned table, so that the indexes are aligned in the table. Aligning a table to its index makes administration easier, especially for sliding window scenarios. To initiate partition switching, all indexes on the table must be aligned.

When you create an index, you can also specify a different partitioning scheme (SCHEMA) or a separate filegroup (FileGroup) to store the index so that SQL Server does not align the index with the table.

When you create an index (a partitioned index) on a partitioned table, you should be aware of the following:
? Unique index
When a unique index (clustered or nonclustered) is established, the partition column must appear in the index column. This restriction causes SQL Server to investigate only a single partition and to ensure that the new key value of the pet is in the table. If the partitioning column cannot be contained in a unique key, you must use a DML trigger instead of enforcing uniqueness.

? Non-unique index
When partitioning a non-unique clustered index, if the partitioning column is not explicitly specified in the clustered key, SQL Server adds the partitioning column to the clustered index column by default.
When partitioning a non-unique nonclustered index, SQL Server adds the partitioning column as an indexed inclusion column by default, to ensure that the index is aligned with the base table, and that SQL Server does not add a partitioning column to the index if the partitioning column already exists in the index.

3. Partitioning operations
partitions are suitable for large tables that can be scaled, so as time and environment changes, there is a need to split, merge, and move partitions.
3.1. Splitting and Merging partitions
Change the partition function by splitting or merging boundary values. By performing ALTER PARTITION function, you can split a partition of any table or index that uses a partition function into two partitions, or you can merge two partitions into one partition.

Note: Multiple tables or indexes can use the same partition function. ALTER PARTITION FUNCTION affects all of these tables or indexes in a single transaction.

The ALTER PARTITION FUNCTION syntax is as follows:
ALTER PARTITION FUNCTION partition_function_name ()
{
SPLIT RANGE (Boundary_value)
| MERGE RANGE (Boundary_value)
} [ ; ]

Parameter description:
? Partition_function_name
The name of the partition function to modify.

? SPLIT RANGE (Boundary_value)
Add a partition to the partition function. Boundary_value determines the scope of the new partition, so it must be different from the existing boundary range of the partition function. Splits an existing scope into two scopes according to the Boundary_value,microsoft SQL Server 2005 database engine. In both scopes, the scope of the new Boundary_value is considered a new partition.

Important NOTES:
The filegroup must be online and must be marked NEXT used by the partitioning scheme using this partition function to save the new partition. In the CREATE PARTITION SCHEME statement, the filegroup is assigned to the partition. If the Create PARTITION scheme statement is assigned an extra filegroup (fewer partitions are created in the Create PARTITION FUNCTION statement than the filegroup used to hold them), there is an unallocated filegroup, and the partitioning scheme marks one of the filegroups as NEXT used. The filegroup will save the new partition. If the partitioning scheme does not mark any filegroups as NEXT used, you must use ALTER PARTITION scheme to add a filegroup or specify an existing filegroup to hold the new partition. You can specify the filegroup of a saved partition to hold additional partitions. Because a partition function can participate in multiple partitioning schemes, all partition schemes that use partition functions (which you add partitions to) must have a next used filegroup. Otherwise, the ALTER PARTITION FUNCTION fails with an error indicating that one or more partition schemes for the NEXT used filegroup are missing.

? MERGE [RANGE (Boundary_value)]
Delete a partition and merge all the values that exist in the partition into one of the remaining partitions. RANGE (Boundary_value) must be an existing boundary value, and the values in the deleted partition will be merged into that value. If the filegroup originally saved Boundary_value is not used by the remaining partition and is not marked with the NEXT used property, the filegroup is removed from the partitioning scheme. The merged partition resides in a filegroup that originally did not save the boundary_value. Boundary_value is a constant expression that can reference variables (including user-defined type variables) or functions (including user-defined functions). It cannot reference a Transact-SQL expression. Boundary_value must match or be implicitly convertible to its corresponding column data type, and the size and scale of the value does not match its corresponding input_parameter_type, it will not be truncated during implicit conversion.

Example:
ALTER PARTITION SCHEME ps_historyarchive
NEXT used [PRIMARY]



Note:
The ALTER PARTITION function re-partitions any tables and indexes that use the function in a single atomic operation. However, the operation takes place offline and can be resource-intensive depending on the extent of the repartitioning.

ALTER PARTITION FUNCTION can only be used to split a partition into two partitions, or to merge two partitions into a single partition. To change the way the table is partitioned in other cases (for example, to merge 10 partitions into 5 partitions), you can try any of the following options. Depending on your system configuration, these options may vary in resource consumption:

? Create a new partitioned table using the required partition function, and then use INSERT into ... The SELECT from statement inserts data from the old table into the new table.

? Create a partitioned clustered index for the heap.
Attention:
Dropping a partitioned clustered index results in a partitioned heap.

? Delete and regenerate an existing partitioned index by using the Transact-SQL CREATE index statement with the DROP EXISTING = ON clause.

? Executes a series of ALTER PARTITION FUNCTION statements.

All filegroups affected by ALTER paritition FUNCTION must be online.
If a disabled clustered index exists in any table that uses the partition function, the ALTER PARTITION function will fail.
Microsoft SQL Server 2005 does not provide replication support for modifying partition functions. Changes to the partition functions in the publication database must be manually applied in the subscription database.

3.2. Moving partition data
You can use ALTER TABLE .... The SWITCH statement moves the subset of data quickly and efficiently in a way:
? Moving data from a table to another table;
? Add a table as a partition to an existing partitioned table;
? Switches a partition from one partitioned table to another partitioned table;
? Delete Partitions to form a single table.

When you use these scenarios to move data, this scenario can be transmitted quickly and efficiently, regardless of the size of the collection, because the operation does not physically move the data, only metadata about the storage location becomes another partition from one partition.

ALTER TABLE .... The syntax of SWITCH is as follows:
ALTER TABLE [database_name. [Schema_name]. | Schema_name. ] table_name
{
SWITCH [PARTITION Source_partition_number_expression]
to [schema_name].target_table
[PARTITION Target_partition_number_expression]
}
[ ; ]

MS SQL Server: Partitioned tables, partitioned indexes, detailed

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.