T-SQL query Advanced--in layman view

Source: Internet
Author: User
Tags one table management studio

Introduction

Views can be thought of as virtual tables that are defined on SQL Server. The view, as its name implies, is another entry for viewing data. The general view itself does not store the actual data, but only a SELECT statement and the metadata of the table involved.

The view is simple to understand as follows:

With views, clients no longer need to know the table structure of the underlying table and the relationships between them. The view provides an interface for unified access to data.

Why use Views (view)

So it's easy to see that using a view will have the following benefits:

    • The view hides the underlying table structure, simplifying data access operations
    • Because the underlying table structure is hidden, security is greatly enhanced and the user can only see the data provided by the view
    • Use views to facilitate rights management, allowing users to view permissions rather than the underlying tables to further enhance security
    • The view provides a user-accessible interface that, when the underlying table changes, changes the view's statements to accommodate the client program that has been built on this view without being affected

Classification of Views (view)

Views can be divided into three categories in SQL

    1. Normal views (Regular view)
    2. Indexed Views (Indexed view)
    3. Split Views (partitioned view)

The following views are discussed from these view types

Normal Views (rugular view)

The normal view is defined by a SELECT statement, and the view contains only the metadata of its definition and the referenced table. The data is not actually stored. The template for creating views in MSDN is as follows:

<view_attribute> [,... N]] as select_statement [with CHECK OPTION] [;] <view_attribute>:: = {    [encryption]    [SCHEMABINDING]    [view_metadata]     

parameter is still relatively small, now explain the above parameters:

Encryption: The view is encrypted, and if you choose this option, you cannot modify it. You need to save the script when creating the view, or you can no longer modify the

SCHEMABINDING: Defines the binding to the table to which the underlying reference is referenced. With this option selected, the table referenced by the view cannot alter the schema arbitrarily (such as the data type of the column), and if the underlying table schema needs to be changed, drop or alter the view that is bound on the underlying table.

View_metadata: This is a very interesting option. As indicated by the name of this option, if not selected, the METADATA returned to the client is the METADATA of the table referenced by the view, if this option is selected, The metadata of the view is returned. In plain explanation, View_metadata can make the view look like a table. The definition of each column of the view tells the client directly, not the definition of the underlying table column referenced.

With CHECK option: This option is used to update data to make restrictions, which are explained in the section updating data by view.

Of course, creating a view in addition to conforming to the syntax rules above, there are a few rules to follow:

    • In view, the ORDER BY clause cannot be used unless there is a top keyword (if you are going to use order BY, here is a hack that uses top percent ...)
    • View name must be unique in each schema
    • View nesting can not exceed 32 layers (in fact, the actual work who nested more than two layers will be hit pp-.-)
    • Compute,compute by,into keyword is not allowed in view
    • View cannot be built on a temporary table
    • View cannot query a full-text index

Create a simple example of view:

 * from HumanResources.Employee

Once the view is established, you can access the view as you would access the table:

SELECT * from V_test_view1

In Management Studio, I prefer to use this method when I create a view, which is much more convenient:

Indexed Views (Indexed view)

Before I talked about the indexed view, I suddenly remembered a comic that I had seen before. We can not afford high-end products, but to save money to buy an iphone installed high-end is always OK:

In fact, the indexed view is also very similar, on the basis of a normal view, to create a unique clustered index for the view, then this view becomes an indexed view. Apply the above comic formula: View + Clustered index = indexed view

an indexed view can be thought of as a table-equivalent object!

Indexed views in SQL Server and materialized view in Oracle are a concept. To understand an indexed view, you must first understand the clustered index. A clustered index is simply understood as a primary key, and the data in the database is physically stored in the table in the order of the primary key, just like the Xinhua dictionary, which is set by default in the form of ABCD .... ABCD .... This is equivalent to the primary key. This avoids the whole table scan and improves performance. Therefore, there can be only one clustered index in a table.

Also for indexed views, after a clustered index is added to a view. The view is more than just the metadata of the SELECT statement and the table, and the indexed view keeps the data physically present in the database, and the data stored in the indexed view is synchronized with the underlying tables involved in the indexed view.

After understanding the principles of indexed views, we can see that indexed views have significantly improved performance for a large number of OLAP data analysis and queries. In particular, there are aggregate functions in the indexed view, involving a large number of high-cost joins, because the results of the aggregate function calculations are physically stored in the indexed view, so it is not necessary to perform aggregation operations every time after a large amount of data is used in the indexed view, which will undoubtedly greatly improve performance.

At the same time, SQL Server needs to identify the changed rows each time the table involved in the indexed view is Update,insert,delete. Enables the indexed view to synchronize data. So there's a lot of business going on with this type of OLTP, and the database needs to do a lot of synchronous operations, which can degrade performance.

after talking about the fundamentals and pros and cons of indexed views, take a look at the implementation in SQL Server:

Implementing an indexed view in SQL Server is a very, very simple thing, just to add a unique clustered index to an existing view. But SQL Server's limitations on indexed views have made many DBAs less attractive:

Like what:

    • The base table involved in the indexed view must be ANSI_NULLS set to ON
    • Indexed views must be set ANSI_NULLS and Quoted_indetifier to ON
    • Indexed views can reference only basic tables
    • Schemabinding must be set
    • You must use a full name such as Schema.viewname when defining an indexed view
    • Cannot have subqueries in indexed views
    • Avg,max,min,stdev,stdevp,var,varp These aggregate functions are not available

..................

There are more ... Do not list all, interested in the please Google.

Let me take an example to illustrate the indexed view:

Suppose we have a query in the AdventureWorks database:

SELECT p.name,s.orderqtyfrom production.product p INNER join Sales.SalesOrderDetail SON p.productid=s. ProductID

The execution plan for this query:

At this point, I build the view and create a unique clustered index on this view:

-- CREATE view V_test_indexedviewwith schemabindingasselect P.name,s.orderqty,s.salesorderdetailidfrom Production.Product p INNER Join sales.salesorderdetail sON p.productid=S.productidgo--  Create an index on a view create UNIQUE CLUSTERED index indexedview_test1on v_test_indexedview (salesorderdetailid)

Next, to apply Liu Qian's lines: witness the miracle of the moment, we re-execute the previous query:

From the above example, you can see the power of the indexed view, even if your query statement does not include the indexed view, the Query Analyzer will automatically select this view, which greatly improves performance. Of course, so powerful performance, only in SQL Server Enterprise and development (although I have seen a lot of SQL Server developers let the company out of the enterprise version of the money, with express version of the features ...)

Split Views (partitioned view)

Split view in fact, from a micro implementation, the data returned by the entire view is joined by several parallel tables (which are the same table structure for several tables, that is, columns and data types, but the stored rowset is different) to the data set obtained by the Union connection if you do not understand it, see my previous blog post.

The split view can be divided into two categories as a whole:

1. Local Split view (partitioned)

2. Distributed split Views (distributed partitioned view)

Because the local split view is only for backward compatibility with versions prior to SQL Server 2005, the distributed Split view is only described here.

The distributed split view is actually obtained by connecting several parallel datasets obtained by different data sources or the same data source, a simple concept map is as follows:

The above view obtains data from three different data source tables, each of which contains only four rows of data, resulting in the split view.

The biggest benefit of using distributed split view is to improve performance. For example, I just want to get the information of ContactID for 8, and if it is obtained through distributed view, SQL Server can scan the table 2 containing ContactID 8 very intelligently only. This avoids the whole table scan. This greatly reduces the IO operation, which improves performance.

It is important to note that the primary key between the tables involved in the distributed split view cannot be duplicated, such as table a ContactID above is 1-4, then table B's ContactID cannot be 2-8 this way.

It is also important to note that you must add a check constraint for the primary key of the distributed partitioned index, so that SQL Server's Query Analyzer knows which table to scan, and here's an example.

In the Microsoft Sample database AdventureWorks database, I deposited two tables, Employee100 and Employee200, by ContactID from the first 100 rows and 100-200 rows of data, with the following code:

- - employee100from to HumanResources.Employee ORDER by EmployeeID--*     EmployeeID from   humanresources.employee ORDER by EmployeeID) Order BY HumanResources.Employee.EmployeeID) as E

This is the time to create a distributed split view:

** from Employee200

At this point we are querying the index:

SELECT * from v_part_view_testwhere EmployeeID=

Here is the execution plan:

As you can see, in this way, the execution plan is simply scanning the Employee200, which avoids scanning all the data, which undoubtedly improves performance.

So, when you put different data tables between different servers or use a RAID5 disk array, the distributed split view will further improve query performance.

Can you use distributed split view to improve performance in all situations? That is not possible. Use this approach if the query that you are facing contains an aggregate function, especially an aggregation function that also contains distinct. Or do not add a where condition to sort. That's definitely the killer of performance. Because the aggregation function needs to scan all the tables in the distributed Split view, and then perform the union operation before the operation.

Updating data with Views (view)

Updating data from a view is not recommended. Because the view does not accept parameters. I recommend using stored procedures.

Using view to update data and to update the data in the table is exactly the same (as previously said, view can be considered a virtual table, if the indexed view is a specific table)

There are a few things to keep in mind when updating data from a view

1. There is at least one user table after the FROM clause in the view

2.View queries can only update data from one table at a time, regardless of the number of tables involved

3. Columns calculated by the expression, constant columns, the column computed by the aggregate function cannot be updated

4.Group by,having,distinct keywords cannot affect columns that cannot be updated

Let's say there is a with CHECK option on the CREATE view, and if you choose this option, the data that is updated by view must conform to the conditions that are qualified by the WHERE clause in view, such as:

I create a view:

several tips in view

1. Find the definition of the view by the view name

SELECT * from sys.sql_moduleswhere object_id=object_id (' view name ')

2. As mentioned earlier, the normal view only stores the metadata of the SELECT statement and the referenced table, and when the underlying table data changes, sometimes the metadata of the table in the view is not synchronized in time, and can be manually synchronized by the following code

EXEC Sp_refreshview View Name
Best Practices for Views (view)

This is my personal experience, welcome to add

    • Be sure to tune the SELECT statement performance in view to the best (seemingly nonsense, but the truth is nonsense ...)
    • View is best not nested, if not nested, a maximum of one layer nested
    • Can replace view with stored procedures and custom functions, try not to use view, stored procedures cache execution plan, better performance, less limit
    • On the split view, do not use aggregate functions, especially the aggregation function also contains the distinct
    • In the view, if the WHERE clause can be added to the view, do not add to the view (because the call view will return all rows and then filter, Performance Killer, if you have added order by ...)

Summary

In this paper, the three types of views are explained in detail. Each view has its own scope of use, and using it properly will improve performance by a grade, and improper use will slow down performance.

I think of a famous saying: "Everything has price,always trade-off" ...

T-SQL query Advanced--in layman view

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.