Brief introduction
The column storage index already exists in SQL Server 2012, but only nonclustered column indexes are allowed in SQL Server 2012, which means that the column index refers to the underlying data on the original row storage index, and therefore consumes more storage space, But the biggest constraint in 2012 is that once the nonclustered columns are stored on a table, the table becomes read-only, which makes it very painful to update data every time you use a column index in the Data warehouse. The Updatable clustered column index in SQL Server 2014 solves the problem.
Can I update the clustered column storage index?
The concept of clustered column storage indexes can be likened to traditional row storage, which is both the data itself and the concept of column storage. There are many benefits to storing data in columns rather than rows.
First of all, for a large number of aggregation, scanning, grouping and other data Warehouse class query only need to read the selected columns, for the need to join more than one table of the star structure of the scene performance improvement especially
The second is that the column index can be updated, and only one of each table is needed (this is the advantage and disadvantage because no nonclustered indexes are built again) clustered column index can greatly save space
column indexes are stored in columns with the same data type in the same column, making it easier to achieve higher compression ratios
The table stored in the column consumes less storage space, so there is less IO
So what's the downside of column storage indexing?
Row storage is ideal for OLTP operations, because each clustered index key identifies a row that is stored on a physical disk and is contiguous, you can use the seek operation to complete a large number of highly selective queries, while the column store indexes each column in the same row is not physically linked, and the column store clustered index does not have " Primary key concept, so there is no seek operation, and if a large number of OLTP class queries, performance will be problematic.
The column storage index only supports scan operations, as shown in Figure 1.
Figure 1. The column storage index only supports scan operations
So how does a column index store?
Column index storage can be words too literally, or stored by columns. This process can be divided into 3 stages, first of all a bunch of rows grouped, this is called "row group", after the grouping is completed, and then split by column, and finally the column compression, as shown in Figure 2.
Figure 2. The process of column storage