"Confidante jailbait" of the pivot operator in SQL Server

Source: Internet
Author: User

In today's article I would like to discuss a special T-SQL language structure in SQL Server-the pivot operator introduced from SQL Server 2005. I often cite this and the language structure is the most dangerous one in SQL Server--and soon you'll know why. Before we go into specific problems and pitfalls, first I want to give you a basic overview of what you can accomplish with pivot in SQL Server.

Overview

The basic idea behind the pivot operator in SQL Server is that you can rotate the behavior column during a T-SQL query. The operator itself was introduced after SQL Server 2005, and is primarily used in databases based on the entity attribute value model (Attribute-EAV) in principle. The idea behind the EAM model is that you can extend the database entity without the need to modify the database schema. Therefore, all properties of the EAV Model store entity are stored in a table with key/value pairs. Let's take a look at the table below a simple key/value pair model.

CREATE TABLEeavtable (RecordIDINT  not NULL, ElementCHAR( -) not NULL, Value sql_variant not NULL,    PRIMARY KEY(RecordID, Element))GO--Insert Some RecordsINSERT  intoEavtable (RecordID, Element, Value)VALUES(1,'FirstName','Woody'),(1,'LastName','Tu'),(1,' City','Linhai'),(1,'Country',' China'),(2,'FirstName','Bill'),(2,'LastName','Gates'),(2,' City','Seattle'),(2,'Country','USA')GO

As you can see, we insert 2 database entities into the table, and each entity contains multiple attributes. Each property in the table is just an extra record. If you like to expand the entity more properties, you only insert additional records into the table, and there is no need for database schema modification-This is the "power" of the Open database architecture ...

It is obviously difficult to query such a EAV table because you are dealing with the data structure of a flat key/value pair. So you rotate the contents of the table, and the rows are rotated into columns. You can perform this rotation with your own pivot operator, or by using a traditional case expression for purely manual implementation. Before we go into pivot details, I want to show you how to do this by using T-SQL and some case expressions by hand. If you rotate manually, your T-SQL query needs to implement 3 stages:

    1. Group stage (Grouping Phase)
    2. Unfolding stage (spreading Phase)
    3. Aggregation phase (Aggregation Phase)

In the grouping phase (Grouping Phase) We compress our EAV tables for different database entities. Here we have a GROUPby in the RecordID column. In the 2nd stage, the unfolding stage (spreading Phase), we use multiple case expressions to rotate the behavior column. Finally in the aggregation phase (Aggregation Phase) we use the max expression to return different values for each row and column. Let's look at the following T-SQL code.

1 --Pivot the data with a handwritten T-SQL statement.2 --Make sure has an index of defined on the grouping column.3 SELECT4 RecordID,5     --Spreading and aggregation phase6     MAX( Case  whenElement= 'FirstName'  ThenValueEND) as 'FirstName',7     MAX( Case  whenElement= 'LastName'  ThenValueEND) as 'LastName',8     MAX( Case  whenElement= ' City'  ThenValueEND) as ' City',9     MAX( Case  whenElement= 'Country'  ThenValueEND) as 'Country'Ten  fromeavtable One GROUP  byRecordID--Grouping Phase A GO

As you can see from the code, it's easy to distinguish between each phase and how they map to T-SQL queries. Show you the results of the query, and finally we turn the row into columns.

Pivot operator

since SQL Server 2005 (almost 10 years ago!) ), Microsoft introduced the pivot operator in T-SQL. With that operator you can make the same conversion (row to column) as long as you have a native operator. It sounds simple, very promising, isn't it? The following code shows the same conversion using the native pivot operator.

1 --Perform the same query with the native PIVOT operator.2 --The grouping column is not specified explicitly, it's the remaining column3 --That isn't referenced in the spreading and aggregation elements.4 SELECT5 RecordID,6 FirstName,7 LastName,8 City ,9 CountryTen  fromeavtable OnePIVOT (MAX(Value) forElementinch(FirstName, LastName, City, country)) asT A GO

When you execute that query, you receive the same result as the picture just now. But when you look at the pivot operator syntax, you'll see a big difference when compared to the manual method:

You can only specify allocation and aggregation elements! Grouping elements cannot be explicitly defined!

The grouping element is the remaining column that you do not reference in the pivot operator. In our case, we did not reference the RecordID column in the pivot operator, so this column is used in the grouping phase (Grouping Phase) . If we subsequently modify the database schema, this brings interesting side effects, such as adding extra columns to the base table:

1 -- Add A new column to the table 2 ALTER TABLE ADD CHAR (1) 3 GO

Then we assign the value to it:

1 UPDATE SET Somedata=left(CASTasVARCHAR(1)),1)

Now when you execute the same query with the PIVOIT operator (there are non-null values in that Somedata column), you will get back completely different results, because the sort phase is now in RecordID and somedata Column (which we just added).

What happens when we re-execute the manual T-SQL query we just started writing. It still returns the same correct result. This is one of the biggest side effects of the pivot operator in SQL Server: Grouping elements cannot be explicitly defined. To overcome this problem, the best practice is to use a table expression that returns only the columns that you want. With this method, if you subsequently modify the table schema, there is no problem because the extra columns are not returned by default from the table expression. Let's look at the following code:

1 --Use a table expression to state explicitly which columns you want to2 --return from the base table. Therefore you can always control on which3 --columns The PIVOT operator is performing the grouping.4 SELECT5 RecordID,6 FirstName,7 LastName,8 City ,9 CountryTen  from One ( A     --Table Expression -     SELECTRecordID, Element, Value fromeavtable -) asT thePIVOT (MAX(Value) forElementinch(FirstName, LastName, City, country)) asT1 - GO

As you can see from the code, I pass a table expression to the pivot operator. And in the table expression, you select only the columns you want from the list. This means that later you can modify the table schema and also destroy the results of the pivot query.

Summary

I hope this article has shown you why the pivot operator is very dangerous in SQL Server. This syntax itself brings very efficient code, but as a side effect you cannot specify the grouping element directly. You should make sure that you use a table expression to define the columns that are conveyed to the pivot operator to ensure that the results are deterministic.

What's your experience with the pivot operator? Do you like it? If you don't like it, what do you want to change?

Thanks for your attention!

"Confidante jailbait" of the pivot operator in SQL Server

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.