Brief introduction
In SQL Server, an index is an enhanced existence, which means that SQL Server can still implement functionality even without indexes. However, indexes can greatly improve query performance in most cases, especially in OLAP. To fully understand the concept of the index, you need to know a lot of the knowledge of the original rationality, including B-tree, heap, Database page, district, fill factor, fragment, filegroup, etc. a series of relevant knowledge, such knowledge to write a small book is not too. So this article does not discuss these topics in depth.
What is an index
An index is a structure that sorts the values of one or more columns in a database table, using indexes to quickly access specific information in a database table.
In the case of streamlining, indexing is a structure. In SQL Server, the storage structure of indexes and tables (referred to as clustered indexed tables) is the same, all are B-trees, and B-trees are a balanced fork tree for lookups. Understand the concept of B-tree as follows:
To understand why a B-tree is used as a structure for indexes and tables (with clustered indexes), you first need to understand the rationale behind SQL Server storage data.
In SQL Server, the smallest unit of storage is the page (page), which cannot be divided. Just as cells are no longer in biology, or atoms are the smallest unit of chemistry. This means that SQL Server reads the page, either the entire read, or none at all, without compromise.
In the case of database retrieval, disk IO scanning is the most time-consuming. Because disk scans involve many physical features, these are quite time-consuming. So the B-tree was designed to reduce the number of scans for disk. If a table or index does not use a B-tree (for tables that do not have a clustered index with heap heap storage), then finding a single data needs to be scanned across the database pages that the entire table contains. This will undoubtedly greatly aggravate the IO burden. While using B-tree for storage in SQL Server, you only need to store the root node of B-tree in memory, and after several lookups, you can find the page containing the leaf node that contains the data you need. This eliminates the overall scan and improves performance.
Here is an example to prove:
In SQL Server, if a clustered index is not established on the table, it is stored according to the heap (HEAP), assuming that I have such a table:
Now that there is no index on the table, that is, heap storage, I show a reduction in IO by adding a clustered index (in B-tree) to it: