Experience in writing SQL Server High Performance Data

Source: Internet
Author: User

1.1.1 Summary

During the development process, we may encounter system performance bottlenecks from time to time, which may be caused by insufficient code efficiency or hardware or network problems, it may also be a problem with database design.

This blog post will introduce some common database performance tuning methods. In addition, to write efficient SQL code, we need to master some basic code optimization skills, we will introduce some basic optimization techniques.

Contents

Problems in code
Database performance overhead
Use stored procedures
Use Database transactions
Use SqlBulkCopy
Use table parameters

1.1.2 text

Suppose we want to design a blog system that contains a User table, which is used to store the User's account name, password, display name, registration date, and other information.

Because of the time, we have designed the User table, including the account name and password (Note: The encrypted storage of privacy information is not considered here), display name and registration date, etc, the specific design is as follows:

Copy codeThe Code is as follows:
-- ===================================================== ======
-- Author: JKhuang
-- Create date: 7/8/2012
-- Description: A table stores the user information.
-- ===================================================== ======
Create table [dbo]. [jk_users] (
-- This is the reference to Users table, it is primary key.
[ID] [bigint] IDENTITY (1, 1) not null,
[User_login] [varchar] (60) not null,
[User_pass] [varchar] (64) not null,
[User_nicename] [varchar] (50) not null,
[User_email] [varchar] (100) not null,
[User_url] [varchar] (100) not null,

-- This field get the default from function GETDATE ().
[User_registered] [datetime] not null constraint [DF_jk_users_user_registered] DEFAULT (getdate ()),
[User_activation_key] [varchar] (60) not null,
[User_status] [int] not null constraint [DF_jk_users_user_status] DEFAULT (0 )),
[Display_name] [varchar] (250) NOT NULL
)

Figure 1 Users Table Design

We have defined the Users table, which contains 10 fields, including account name, password, display name, and registration date. ID is an auto-increment primary key, user_resistered is used to record the user registration time. It sets the default value GETDATE ().

Next, we will store data to the Users table through the client code. The specific code is as follows:

Copy codeThe Code is as follows:
/// Creates a database connection.
Var conn = new SqlConnection (ConfigurationManager. ConnectionStrings ["SQLCONN1"]. ToString ());
Conn. Open ();

/// This is a massive SQL injection vulnerability,
/// Don't ever write your own SQL statements with string formatting!
String SQL = String. Format (
@ "Insert into jk_users (user_login, user_pass, user_nicename, user_email, user_status, display_name, user_url, user_activation_key)
VALUES ('{0}', '{1}', '{2}', '{3}', '{4}', '{5 }', '{6}', '{7 }')",
UserLogin, userPass, userNicename, userEmail, userStatus, displayName, userUrl, userActivationKey );
Var cmd = new SqlCommand (SQL, conn );
Cmd. ExecuteNonQuery ();

/// Because this call to Close () is not wrapped in a try/catch/finally clause,
/// It cocould be missed if an exception occurs above. Don't do this!
Conn. Close ();

Problems in code
Above, we use the simple ADO. NET Method to write data. But do you find any problems or improvements in the code?

First, we create a database connection in the client code, which requires a certain amount of system resources. After the operation, we need to release the occupied system resources. Of course, you can manually release resources as follows:
Copy codeThe Code is as follows:
/// Creates a database connection.
Var conn = new SqlConnection (ConfigurationManager. ConnectionStrings ["SQLCONN1"]. ToString ());
Conn. Open ();

/// This is a massive SQL injection vulnerability,
/// Don't ever write your own SQL statements with string formatting!
String SQL = String. Format (
@ "Insert into jk_users (user_login, user_pass, user_nicename, user_email, user_status, display_name, user_url, user_activation_key)
VALUES ('{0}', '{1}', '{2}', '{3}', '{4}', '{5 }', '{6}', '{7 }')",
UserLogin, userPass, userNicename, userEmail, userStatus, displayName, userUrl, userActivationKey );
Var cmd = new SqlCommand (SQL, conn );
Cmd. ExecuteNonQuery ();

/// If throws an exception on cmd dispose.
Cmd. Dispose ();
/// Conn can't be disposed.
Conn. Close ();
Conn. Dispose ();

If an exception is thrown when a SqlCommand resource is released, the resource SqlConnection after it will not be released. When an exception occurs, we can catch the exception through try/catch. Therefore, we can use finally to check whether the resource has been released no matter whether an exception occurs. The specific implementation is as follows:
Copy codeThe Code is as follows:
SqlCommand cmd = null;
SqlConnection conn = null;
Try
{
/// Creates a database connection.
Conn = new SqlConnection (ConfigurationManager. ConnectionStrings ["SQLCONN1"]. ToString ());
Conn. Open ();

/// This is a massive SQL injection vulnerability,
/// Don't ever write your own SQL statements with string formatting!
String SQL = String. Format (
@ "Insert into jk_users (user_login, user_pass, user_nicename, user_email, user_status, display_name, user_url, user_activation_key)
VALUES ('{0}', '{1}', '{2}', '{3}', '{4}', '{5 }', '{6}', '{7 }')",
UserLogin, userPass, userNicename, userEmail, userStatus, displayName, userUrl, userActivationKey );
Cmd = new SqlCommand (SQL, conn );
Cmd. ExecuteNonQuery ();
}
Finally
{
/// Regardless of whether there is an exception,
/// We will dispose the resource.
If (cmd! = Null) cmd. Dispose ();
If (conn! = Null) conn. Dispose ();
}

The above finally method is widely used to handle exceptions. However, to release resources more securely, we have added finally and if statements, is there a simpler way to securely release resources?
In fact, we can use the using statement to release resources. The specific implementation is as follows:
Using statement: defines a range where one or more objects will be released.
Copy codeThe Code is as follows:
String SQL = String. Format (
@ "Insert into jk_users (user_login, user_pass, user_nicename, user_email, user_status, display_name, user_url, user_activation_key)
VALUES ('{0}', '{1}', '{2}', '{3}', '{4}', '{5 }', '{6}', '{7 }')",
UserLogin, userPass, userNicename, userEmail, userStatus, displayName, userUrl, userActivationKey );

/// Creates a database connection.
Using (var conn = new SqlConnection (ConfigurationManager. ConnectionStrings ["SQLCONN1"]. ToString ()))
Using (var cmd = new SqlCommand (SQL, conn ))
{
/// Your code here.
}

The above Code uses the using statement to release resources. Can all objects be released using the using statement?

Only the type implements the IDisposable interface and the Dispose () method can be rewritten using the using statement to release resources. Because SqlConnection and SqlCommand implement the IDisposable interface, then we can use the using statement to release resources and handle exceptions.

In the client code, we use concatenated SQL statements to write data. Because SQL statements are dynamically executed, malicious users can perform SQL injection attacks by splicing SQL statements.

You can defend against SQL injection attacks in the following ways:

• Regular Expression verification user input
• Parameterized stored procedure
• Parameterized SQL statements
• Add a new database architecture
• LINQ to SQL
Next, we will use parameterized SQL statements to defend against SQL injection attacks. You can also use other methods to defend against SQL injection attacks. The specific implementation code is as follows:
Copy codeThe Code is as follows:
/// Creates a database connection.
Using (var conn = new SqlConnection (ConfigurationManager. ConnectionStrings ["SQLCONN1"]. ToString ()))
{
Conn. Open ();
String SQL = string. Format (
@ "Insert into jk_users (user_login, user_pass, user_nicename, user_email,
User_status, display_name, user_url, user_activation_key )");

Using (var cmd = new SqlCommand (SQL, conn ))
{
/// Parameterized SQL to defense injection attacks
Cmd. Parameters. Add ("@ user_login", userLogin );
Cmd. Parameters. Add ("@ user_pass", userPass );
Cmd. Parameters. Add ("@ user_nicename", userNicename );
Cmd. Parameters. Add ("@ user_email", userEmail );
Cmd. Parameters. Add ("@ user_status", userStatus );
Cmd. Parameters. Add ("@ display_name", displayName );
Cmd. Parameters. Add ("@ user_url", userUrl );
Cmd. Parameters. Add ("@ user_activation_key", userActivationKey );
Cmd. ExecuteNonQuery ();
}
}

The preceding code is improved by using parameterized SQL statements and using statements. The code is more readable, and SQL injection attacks and resource release are also avoided.

Next, let's test the code execution time. First, we add the Stopwatch Method to the code. startNew () and Stopwatch. stop () to calculate the execution time of the written code. The specific code is as follows:
Copy codeThe Code is as follows:
/// Calc insert 10000 records consume time.
Var sw = Stopwatch. StartNew ();

/// Creates a database connection.
Using (var conn = new SqlConnection (ConfigurationManager. ConnectionStrings ["SQLCONN2"]. ToString ()))
{
Conn. Open ();
Int cnt = 0;
While (cnt ++ <10000)
{
String SQL = string. Format (@ "INSERT INTO jk_users
(User_login, user_pass, user_nicename, user_email, user_status, display_name, user_url, user_activation_key)
VALUES (@ user_login, @ user_pass, @ user_nicename, @ user_email, @ user_status, @ display_name, @ user_url, @ user_activation_key )");

Using (var cmd = new SqlCommand (SQL, conn ))
{
/// Parameterized SQL to defense injection attacks
Cmd. Parameters. Add ("@ user_login", userLogin );
Cmd. Parameters. Add ("@ user_pass", userPass );
Cmd. Parameters. Add ("@ user_nicename", userNicename );
Cmd. Parameters. Add ("@ user_email", userEmail );
Cmd. Parameters. Add ("@ user_status", userStatus );
Cmd. Parameters. Add ("@ display_name", displayName );
Cmd. Parameters. Add ("@ user_url", userUrl );
Cmd. Parameters. Add ("@ user_activation_key", userActivationKey );
Cmd. ExecuteNonQuery ();
}
}
}

Sw. Stop ();
}

Above, we have written 10000 pieces of data into the database, and the execution time is 7.136 seconds (my machine is broken), so that the system performance can still meet the needs of many companies.

If the number of user requests increases, can the system meet the requirements? In fact, we should not be satisfied with the existing system performance, because we know that there is still much room for improvement in code execution efficiency.

Next, we will further introduce the code improvement methods.

Figure 2 data written to the Users table

In order for the database to get a faster write speed, we must understand the main time consumed by the database during write operations.

Database performance overhead
Connection time
When we execute conn. when you Open (), you must first establish a physical channel (such as a socket or named pipe), First shake hands with the server, and analyze the connection string information, the server must authenticate the connection and run the check to register the connection in the current transaction.

This series of operations may take one or two seconds if we execute conn each time. all Open () operations are time-consuming. To minimize the cost of opening a connection, ADO. NET is called the connection pool optimization method.

Connection Pool: reduces the number of times new connections need to be opened. As long as you call the Open () method on the connection, the pool process checks whether there are available connections in the pool. If a pool connection is available, return the connection to the caller instead of creating a new connection. When the application calls Close () or Dispose () on the connection, the pool process returns the connection to the active connection pool, instead of actually closing the connection, you can reuse it in the next Open call after the connection is returned to the pool.

Parser overhead
When we pass the SQL statement insert into… to SQL Server... It needs to parse the SQL statement, because the SQL Server parser runs fast, so the parsing time is often negligible, but we can still use the stored procedure, instead of using SQL statements to reduce the overhead of the parser.

Database Connection
To provide ACID (four features of transactions), SQL Server must ensure that all database changes are sequential. It uses locks to ensure that the insert, delete, or update operations of the database do not conflict with each other (for database locks, refer to here ).

Because most databases are oriented to multi-User environments, when we insert User tables, hundreds of users may also operate on User tables. Therefore, SQL Server must ensure that these operations are performed in an orderly manner.

Then, when SQL Server is doing all these tasks, it will generate a lock to ensure that users get meaningful results. SQL Server ensures that the database is completely predictable when each statement is executed (for example, predicting the SQL Execution Mode) and managing locks takes a certain amount of time.

Constraint Processing
When inserting data, each constraint (such as foreign keys, default values, and SQL CHECK) takes additional time to CHECK whether the data meets the constraints; to ensure that each inserted, updated, or deleted record meets the constraints, SQL Server needs to consider whether to add constraints to a table with a large amount of data.

Varchar
VARCHAR is a common type of database, but it may also lead to unexpected performance overhead. Every time we store variable-length columns, SQL Server must perform more memory management; strings can easily consume several hundred bytes of memory. If we set indexes in a VARCHAR column, SQL Server requires O (String Length) When performing B-tree search) however, the number of times the integer field is compared is limited by the memory latency and CPU frequency.

Disk IO
SQL Server writes data to the disk. First, SQL Server writes the data to the transaction log. When the backup is executed, the transaction log is merged into the permanent database file; this series of operations are completed by the background, which does not affect the speed of data query, but every thing must have its own disk space, therefore, we can allocate independent disk space to transaction logs and primary data files to reduce IO overhead. Of course, the best solution is to minimize the number of transactions.

As you can see, We optimize the database by optimizing the connection time, Parser overhead, database connection, constraint processing, Varchar, and disk IO. Next, we will further optimize the previous example.

Use stored procedures
In the preceding example, the SQL code is directly stored in the client code, so the database needs to use the parser to parse the SQL statements in the client. Therefore, we can use the stored procedure instead, reduce the time overhead of the parser. More importantly, because SQL is executed dynamically, we do not need to re-compile or release programs to modify the SQL statements in the stored procedure.

The user_registered field in the User table sets the default value (GETDATE (), so we can improve the system performance by eliminating the default value constraints of the table. In short, we need to provide the user_registered field value.

Next, let's save the default value constraints in the User table and add the stored procedure. The specific code is as follows:
Copy codeThe Code is as follows:
-- ===================================================== ======
-- Author: JKhuang
-- Create date: 08/16/2012
-- Description: Creates stored procedure to insert
-- Data into table jk_users.
-- ===================================================== ======
Alter procedure [dbo]. [SP_Insert_jk_users]
@ User_login varchar (60 ),
@ User_pass varchar (64 ),
@ User_nicename varchar (50 ),
@ User_email varchar (100 ),
@ User_url varchar (100 ),
@ User_activation_key varchar (60 ),
@ User_status int,
@ Display_name varchar (250)

AS
BEGIN
Set nocount on;

-- The stored procedure allows SQL server to avoid exactly ally all parser work
Insert into jk_users
(User_login, user_pass, user_nicename, user_email, user_status, display_name, user_url, user_activation_key, user_registered)
VALUES (@ user_login, @ user_pass, @ user_nicename, @ user_email, @ user_status, @ display_name, @ user_url, @ user_activation_key, GETDATE ());
END

We have defined the Stored Procedure SP_Insert_jk_users to insert data into the table. When we re-execute the code, we find that the data insertion time is shortened to 6.7401 seconds.

Figure 3 data write time

Use Database transactions

Think about whether data can be extended and written to the database in batches? If a delay of a period is allowed to be written to the database, we can use Transaction to delay Data Writing.

A database transaction is a logical unit in the execution process of the database management system. It consists of a limited sequence of database operations. After the SQL Server ensures that the transaction is successfully executed, the data is written to the database. Otherwise, the transaction is rolled back.

If we perform ten independent operations on the database, SQL Server needs to allocate ten lock overhead. However, if we encapsulate these operations in a transaction, therefore, SQL Server only needs to allocate the lock overhead once.
Copy codeThe Code is as follows:
/// Calc insert 10000 records consume time.
Var sw = Stopwatch. StartNew ();
/// Creates a database connection.
Using (var conn = new SqlConnection (ConfigurationManager. ConnectionStrings ["SQLCONN2"]. ToString ()))
{
Conn. Open ();
Int cnt = 0;
SqlTransaction trans = conn. BeginTransaction ();
While (cnt ++ <10000)
{
Using (var cmd = new SqlCommand ("SP_Insert_jk_users", conn ))
{
/// Parameterized SQL to defense injection attacks
Cmd. CommandType = CommandType. StoredProcedure;
/// Uses transcation to batch insert data.
/// To avoid lock and connection overhead.
Cmd. Transaction = trans;
Cmd. Parameters. Add ("@ user_login", userLogin );
Cmd. Parameters. Add ("@ user_pass", userPass );
Cmd. Parameters. Add ("@ user_nicename", userNicename );
Cmd. Parameters. Add ("@ user_email", userEmail );
Cmd. Parameters. Add ("@ user_status", userStatus );
Cmd. Parameters. Add ("@ display_name", displayName );
Cmd. Parameters. Add ("@ user_url", userUrl );
Cmd. Parameters. Add ("@ user_activation_key", userActivationKey );
Cmd. ExecuteNonQuery ();
}
}
/// If no exception, commit transcation.
Trans. Commit ();
}
Sw. Stop ();
}

Figure 4 data write time

Use SqlBulkCopy
The write operation is encapsulated by transactions. When we run the code again, we find that the data writing speed is greatly improved. It takes only 4.5109 seconds. Because a transaction only needs to allocate a lock resource once, reduces the time required to allocate locks and connect databases.

Of course, we can also use SqlBulkCopy to write a large amount of data. The specific implementation code is as follows:
Copy codeThe Code is as follows:
Var sw = Stopwatch. StartNew ();
/// Creates a database connection.
Using (var conn = new SqlConnection (ConfigurationManager. ConnectionStrings ["SQLCONN2"]. ToString ()))
{
Conn. Open ();
Using (var bulkCopy = new SqlBulkCopy (conn ))
{
/// Maping the data columns.
BulkCopy. ColumnMappings. Add ("user_login", "user_login ");
BulkCopy. ColumnMappings. Add ("user_pass", "user_pass ");
BulkCopy. ColumnMappings. Add ("user_nicename", "user_nicename ");
BulkCopy. ColumnMappings. Add ("user_email", "user_email ");
BulkCopy. ColumnMappings. Add ("user_url", "user_url ");
BulkCopy. ColumnMappings. Add ("user_registered", "user_registered ");
BulkCopy. ColumnMappings. Add ("user_activation_key", "user_activation_key ");
BulkCopy. ColumnMappings. Add ("user_status", "user_status ");
BulkCopy. ColumnMappings. Add ("display_name", "display_name ");
BulkCopy. DestinationTableName = "dbo. jk_users ";
/// Insert data into able.
BulkCopy. WriteToServer (dataRows );
}
Sw. Stop ();
}

Figure 5 data write time

Above, we use transactions and SqlBulkCopy to batch write data into the database, but in fact, every time we call cmd. the ExecuteNonQuery () method generates a round-trip message from the client application to the database. Therefore, do we want to use a method to write data when only one message is sent?

Use table parameters
If you use SQL Server 2008, it provides a new menu variable (Table Parameters) that can aggregate the entire Table data into a parameter and pass it to the stored procedure or SQL statement. Its performance overhead is to integrate the data aggregation parameters (O (data volume )).

Now, we modify the previous Code to define our table variables in SQL Server. The specific definitions are as follows:
Copy codeThe Code is as follows:
-- ===================================================== ======
-- Author: JKhuang
-- Create date: 08/16/2012
-- Description: Declares a user table paramter.
-- ===================================================== ======
Create type jk_users_bulk_insert as table (
User_login varchar (60 ),
User_pass varchar (64 ),
User_nicename varchar (50 ),
User_email varchar (100 ),
User_url varchar (100 ),
User_activation_key varchar (60 ),
User_status int,
Display_name varchar (250)
)

We have defined a table parameter jk_users_bulk_insert. Then we define a stored procedure to accept the table parameter jk_users_bulk_insert. The specific definition is as follows:
Copy codeThe Code is as follows:
-- ===================================================== ======
-- Author: JKhuang
-- Create date: 08/16/2012
-- Description: Creates a stored procedure, receive
-- A jk_users_bulk_insert argument.
-- ===================================================== ======
Create procedure sp_insert_jk_users
@ UsersTable jk_users_bulk_insert READONLY
AS

Insert into jk_users (user_login, user_pass, user_nicename, user_email, user_url,
User_activation_key, user_status, display_name, user_registered)

SELECT user_login, user_pass, user_nicename, user_email, user_url,
User_activation_key, user_status, display_name, GETDATE ()
FROM @ usersTable

Next, we call the stored procedure in the client code and pass the table as a parameter to the stored procedure.
Copy codeThe Code is as follows:
Var sw = Stopwatch. StartNew ();
Using (var conn = new SqlConnection (ConfigurationManager. ConnectionStrings ["SQLCONN2"]. ToString ()))
{
Conn. Open ();
/// Invokes the stored procedure.
Using (var cmd = new SqlCommand ("sp_insert_jk_users", conn ))
{
Cmd. CommandType = CommandType. StoredProcedure;

/// Adding a "structured" parameter allows you to insert tons of data with low overhead
Var param = new SqlParameter ("@ userTable", SqlDbType. Structured) {Value = dt };
Cmd. Parameters. Add (param );
Cmd. ExecuteNonQuery ();
}
}
Sw. Stop ();

Now, we re-execute the write operation and find that the write efficiency is the same as that of SqlBulkCopy.

1.1.3 Summary

This article uses the example of User table design in the blog system to introduce common mistakes and code defects during the design process, such as SQL injection and database resource release; then, we use some common code optimization techniques to optimize the code, and analyze the performance overhead of database writing (connection time, parser, database connection, constraint processing, VARCHAR, and disk IO ), we use stored procedures, database transactions, SqlBulkCopy, and table parameters to reduce database overhead.

[1] http://beginner-sql-tutorial.com/sql-query-tuning.htm

[2] http://www.dzone.com/links/r/ SQL _optimization_tipsquestions.html

[3] http://blackrabbitcoder.net/archive/2010/11/11/c.net-little-wonders---a-presentation.aspx

[4] http://www.altdevblogaday.com/2012/05/16/sql-server-high-performance-inserts/

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.