SQL Server index, table compression

Last Update:2015-08-11 Source: Internet

Author: User

Tags one table

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

What is an index?

An index is a data structure on a disk that is based on a table or view. Using an index can make data acquisition faster and higher in universities, and also affect some other performance, such as inserting or updating.

There are two main types of indexes: clustered and nonclustered indexes.

Dictionary directory is an index, according to pinyin query the desired word is a clustered index (physical continuous, page number and table one by one corresponds), the radicals are a nonclustered index (logical continuous, page number and directory discontinuity).

Clustered index storage records are physically contiguous, while nonclustered indexes are logically contiguous, and physical storage is not contiguous.

A clustered index can have only one table, and a non-clustered index may have more than one table.

The pros and cons of indexes

The index is used to avoid full table scans because full table scans read every data page of a table from disk, and if an index points to a data value, only a few disks are required to read.

Indexed tables take up more space in the database, and it takes longer to add, delete, and change data commands.

Storage mechanism of indexes

The catalog in the book is a word and a list of page numbers in which the index in the database is the value in the table and the list where the values are stored.

A clustered index is a new physical space in the database that holds the values he arranges, and when new data is inserted, he rearranges the entire physical storage space.

A nonclustered index contains only the columns of the nonclustered indexes in the original table and a pointer to the actual physical table.

Basic structure of the data table

When a new data table is created, the system allocates a contiguous space of 8k in the disk. When a 8k run out of time, the database pointer will automatically allocate a 8k of space, each 8k of space is called a data page, and assigned from 0-7 page number, each file's No. 0 page records the boot information called the page header, every 8 data pages by 64k form the expansion area. The combination of all data pages forms a heap.

SQL Server specifies that rows cannot span data pages, so the maximum number of records per row can only be 8k, which is why char and varchar are limited to 8k in size, and storing more than 8k of data should use the text type. In fact, the text type of the field value can not be directly entered and saved, it is to store a pointer to a number of 8k data page composed of the expansion area, the real data is actually placed in these data pages.

When to set the index

1. Define a data column for the primary key (SQL Server defaults to a clustered index for the primary key).

2. Define data columns with foreign keys

3. For frequently queried data columns

4. For data columns that need to be queried frequently within a specified range

5. Data columns that are frequently present in the WHERE clause

6, often appear in the keyword order BY, group by, distinct after the field.

Under what circumstances do not set the index

1, the query is seldom involved in the column, duplicate value more than the column.

2. Columns of text, image, bit data type

3. Frequently accessed columns

4, the regular update operation of the table, the index is generally not more than 3, up to 5. Although the speed of access is increased, it affects the update operation.

Clustered index

1. Create a clustered index using SSMS

Expand the table to create the index, right-click Index, select New index, clustered index, new indexing point, select Columns, select column, and choose Ascending or Descending, enter name, OK.

By default, a clustered index is automatically created at the same time that the primary key is generated.

2. Create a clustered index using T-SQL

Use webdbgo /* Clustered index name */ondesc) with(drop_existing=/* if present delete */

Each table or view can contain only one clustered index, because the clustered index changes the way data is stored and arranged. Whether clustered or nonclustered, the information is stored in a balance tree or B-tree, and B-trees identify and group similar data, precisely because the retrieval in the B-tree is based on key values, so the index can increase the speed of data access. The B-Tree will have a combination of similar keys, so the database engine can find the target record by searching only a few pages.

Nonclustered indexes

You can have more than one nonclustered index on each table, you can create an index on a column, or you can create an index on multiple columns that are already part of an existing index.

The SSMs creation method is the same as the T-SQL creation method:

Use webdbgo/* Clustered index name */on desc)

Add Index Options

FILLFACTOR: Used when creating an index, the data for each index page is a percentage of the index size, default 100. When the table needs to be modified frequently, it is recommended to set to 70-80, which is recommended when updating infrequently. 90.

Pad_index: Used to keep open space on each page in the Index intermediate level. The value cannot be set, and his value inherits from FillFactor.

Small example

UseWebDBGoCreateTable Ceshi--New Table(IDIntIdentity1,1)PrimaryKey, namevarchar20), Codevarchar20),[Date]Datetime)--Inserting 10w test dataDeclare@nIntSet@n=1While@n<100000BeginInsertInto Ceshi (Name,code,[Date])VALUES (‘Name‘+Cast@nAsvarchar20)),‘Code‘+Cast@nAsvarchar20)), getUTCDate ())Set@n=@n+1End--View index StatusSetStatistics IOOn--View Disk IOSetStatistics TIMEOn--Viewing SQL statements parsing compilation and execution timesSelect*FromCeshi--View index status EXECSp_helpindex CeshiSelect*From Ceshiwhere name=‘Name1‘--Ctrl+l View Execution plan clustered index scan cost 100%, consider optimization for index lookup, build nonclustered index on name-- build a nonclustered index create index name_index  Ceshi (name) -- re-view the index again. New nonclustered index exec sp_helpindex Ceshi--select * Span style= "color: #0000ff;" >from ceshi where name =

Manage Indexes

EXEC sp_helpindex Ceshi--View the indexes in the tableEXEC sp_rename ' ceshi.name_index ", Span style= "color: #ff0000;" > '

Table compression

The primary performance of SQL Server depends on disk I/O efficiency, andSQL Server 2008 provides data compression to improve disk I/O efficiency.

Table compression means reducing the disk footprint of the data, so compression can be used on heap tables, clustered indexed tables, nonclustered indexed tables, indexed views, partitioned tables.

Data types that can be compressed

smallint, int, Bigint, decimal, numeric, real, float, money, smallmoeny, bit, datetime, DateTime2, DateTimeOffset, Char, nchar, binary, rowversion.

There are two types of compression in SQL Server: Data and Backup

Row compression

Compression alters the way data is physically stored, but does not require any modifications to the code.

Row compression process: First identify the data type of each column in the table, then convert to a variable length, and finally reduce the total amount of storage space requests to actual demand.

Such as: fixed-length type int, char, nchar, etc., stored in the data page in an indefinite length (storage of real data length).

1, SSMs row compression

Table, right-click Storage, manage compression, using the same compression type for all partitions, right select row-> immediate execution, complete.

When you compress a heap table or clustered index, you do not include nonclustered indexes at the same time, so you need to operate on a separate nonclustered index.

Manage compression--- --------right-------to all partitions using the same compression type, right select row-> immediate execution, complete.

2, T-SQL row compression

--Compress on an existing table--GatheredAlterTable Ceshi RebuildWith (data_compression=Row--Non-clusteredAlterIndex New_nameon Ceshi rebuild with (Data_compression=row) -- compress when creating a table Span style= "color: #0000ff;" >create  Yasuo (id primary keyvarchar (50 "), Mail varchar (50with (Data_compression=row)

The row compression was specified when it was created and has not changed. As long as the data is inserted into the table, the row is compressed.

Page compression

Page compression enhances the functionality of row compression by performing additional steps.

Page compression steps: Row compression, prefix compression, dictionary compression.

First, for each column, you will determine a value that reduces the storage space for the values in each column. Once the value is determined, the row of the prefix value for each column is stored in the page header. All information is called compressed information and is stored under the page header. The identified value (the prefix value) is in no column and is replaced by a reference to the corresponding value in the Compression Information section.

The next step is to compress the dictionary, search the entire page instead of a single column, and the duplicate values are moved to the compression information section of the header and replaced by a reference to that value.

In SSMs, the page compression step is the same as the row compression step, except that compression is selected as page. Change the row to page in T-SQL.

Need to be aware

1. If the data kept in memory is compressed, once it is selected, it must be decompressed first.

2, when inserting new rows, the data is also row or page compression.

3. When updating or deleting, the row compression object retains the current compression level. However, page compression may need to be recalculated, depending on the amount of data that has changed.

What kind of compression do you use

Objects that require frequent updates should use row compression.

Only page compression should be used to perform read operations.

SQL Server index, table compression

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More