1.1.1 Summary
In the development process, we occasionally encounter system performance bottlenecks, which can cause a lot of reasons, may be the code is not efficient, may be hardware or network problems, but also may be the problem of database design.
This blog post will be for some common database performance lieu methods, and, in order to write efficient SQL code, we need to master some basic code optimization techniques, so we will be introduced from some basic optimization techniques.
This article directory
Problems in the Code
Database performance Overhead
Using Stored Procedures
Using Database transactions
Using SqlBulkCopy
Using table Parameters
1.1.2 Body
Let's say we're designing a blog system that contains a user table that stores information such as the user's account name, password, display name, and registration date.
Because of the time, we have designed the user table, which includes the account name, password (note: There is no consideration of the privacy of the encrypted storage), display name and registration date, etc., specifically designed as follows:
Copy Code code as follows:
-- =============================================
--Author:jkhuang
--Create date:7/8/2012
--Description:a table stores the user information.
-- =============================================
CREATE TABLE [dbo]. [Jk_users] (
--This is the reference to Users table and it is primary key.
[ID] [bigint] IDENTITY (1,1) not NULL,
[User_login] [varchar] (a) Not NULL,
[User_pass] [varchar] () not NULL,
[User_nicename] [varchar] () not NULL,
[User_email] [varchar] (MB) Not NULL,
[User_url] [varchar] (MB) Not NULL,
--This field get the default from function GETDATE ().
[User_registered] [DateTime] Not NULL CONSTRAINT [df_jk_users_user_registered] DEFAULT (getdate ()),
[User_activation_key] [varchar] (a) Not NULL,
[User_status] [INT] Not NULL CONSTRAINT [Df_jk_users_user_status] DEFAULT ((0)),
[Display_name] [varchar] () Not NULL
)
Figure 1 The Users table design
Above, we define the users table, which contains 10 fields such as account name, password, display name, and registration date, where ID is an user_resistered primary key, which is used to record the user's registration time, and it sets the default value GETDATE ().
Next, we will store the client code implementation data into the users table with the following specific code:
Copy Code code as follows:
Creates a database connection.
var conn = new SqlConnection (configurationmanager.connectionstrings["SQLCONN1"). ToString ());
Conn. Open ();
This is a massive SQL injection vulnerability,
Don ' t ever write your own SQL statements with string formatting!
String sql = String.Format (
@ "INSERT into Jk_users (User_login, User_pass, User_nicename, User_email, User_status,display_name, User_url, User_ Activation_key)
VALUES (' {0} ', ' {1} ', ' {2} ', ' {3} ', ' {4} ', ' {5} ', ' {6} ', ' {7} ') ',
Userlogin, Userpass, Usernicename, UserEmail, UserStatus, DisplayName, Userurl, Useractivationkey);
var cmd = new SqlCommand (SQL, conn);
Cmd. ExecuteNonQuery ();
Because this call to close () isn't wrapped in a try/catch/finally clause,
It could be missed if a exception occurs above. Don ' t do this!
Conn. Close ();
Problems in the Code
Above, we use the common but ado.net way to achieve the data write function, but do you find that the code has problems or can improve the place?
First, in the client code, we create a database connection, it requires a certain amount of system resources, when the operation is completed we need to release the occupied system resources, of course, we can manually release the resources, specifically implemented as follows:
Copy Code code as follows:
Creates a database connection.
var conn = new SqlConnection (configurationmanager.connectionstrings["SQLCONN1"). ToString ());
Conn. Open ();
This is a massive SQL injection vulnerability,
Don ' t ever write your own SQL statements with string formatting!
String sql = String.Format (
@ "INSERT into Jk_users (User_login, User_pass, User_nicename, User_email, User_status,display_name, User_url, User_ Activation_key)
VALUES (' {0} ', ' {1} ', ' {2} ', ' {3} ', ' {4} ', ' {5} ', ' {6} ', ' {7} ') ',
Userlogin, Userpass, Usernicename, UserEmail, UserStatus, DisplayName, Userurl, Useractivationkey);
var cmd = new SqlCommand (SQL, conn);
Cmd. ExecuteNonQuery ();
If throws an exception on Cmd dispose.
Cmd. Dispose ();
Conn can ' t be disposed.
Conn. Close ();
Conn. Dispose ();
If you throw an exception when you release the SqlCommand resource, the resource SqlConnection behind it will not be released. Let's think about it. When an exception occurs, the exception can be caught by try/catch, so it is possible to use finally to check whether the resource has been released, whether or not an exception occurs, as follows:
Copy Code code as follows:
SqlCommand cmd = null;
SqlConnection conn = null;
Try
{
Creates a database connection.
conn = new SqlConnection (configurationmanager.connectionstrings["SQLCONN1"). ToString ());
Conn. Open ();
This is a massive SQL injection vulnerability,
Don ' t ever write your own SQL statements with string formatting!
String sql = String.Format (
@ "INSERT into Jk_users (User_login, User_pass, User_nicename, User_email, User_status,display_name, User_url, User_ Activation_key)
VALUES (' {0} ', ' {1} ', ' {2} ', ' {3} ', ' {4} ', ' {5} ', ' {6} ', ' {7} ') ',
Userlogin, Userpass, Usernicename, UserEmail, UserStatus, DisplayName, Userurl, Useractivationkey);
cmd = new SqlCommand (SQL, conn);
Cmd. ExecuteNonQuery ();
}
Finally
{
Regardless of whether there is a exception,
We'll dispose the resource.
if (cmd!= null) cmd. Dispose ();
IF (conn!= NULL) conn. Dispose ();
}
It is common to deal with exceptions through the finally, but in order to release resources more securely, allowing us to add finally and if statements, is there a more concise way to secure the release of resources?
In fact, we can use the using statement to achieve the release of resources, specifically implemented as follows:
Using statement: Defines a scope that frees one or more objects outside this scope.
Copy Code code as follows:
String sql = String.Format (
@ "INSERT into Jk_users (User_login, User_pass, User_nicename, User_email, User_status,display_name, User_url, User_ Activation_key)
VALUES (' {0} ', ' {1} ', ' {2} ', ' {3} ', ' {4} ', ' {5} ', ' {6} ', ' {7} ') ',
Userlogin, Userpass, Usernicename, UserEmail, UserStatus, DisplayName, Userurl, Useractivationkey);
Creates a database connection.
using (var conn = new SqlConnection (configurationmanager.connectionstrings["SQLCONN1"). ToString ()))
using (var cmd = new SqlCommand (SQL, conn))
{
Your code here.
}
The above code uses a using statement to implement the release of a resource, so is it possible for all objects to be freed using the using statement?
Only the type implements the IDisposable interface and overrides the Dispose () method to implement the resource release using the using statement, because SqlConnection and SqlCommand implement the IDisposable interface. Then we can use the using statement to implement resource release and exception handling.
In client code, we use a concatenation of SQL statements to implement data writes, because SQL statements are executed dynamically, so malicious users can execute SQL injection attacks by stitching SQL.
For SQL injection attacks, we can defend against the following ways:
• Regular expressions Validate user input
• Parameterized Stored procedures
• Parameterized SQL statements
• Adding a new schema for the database
LINQ to SQL
Next, we will defend against SQL injection attacks through parameterized SQL statements, and you can also use other methods to defend against SQL injection attacks, with the following implementation code:
Copy Code code as follows:
Creates a database connection.
using (var conn = new SqlConnection (configurationmanager.connectionstrings["SQLCONN1"). ToString ()))
{
Conn. Open ();
String sql = string. Format (
@ "INSERT into Jk_users (User_login, User_pass, User_nicename, User_email,
User_status,display_name, User_url, User_activation_key) ");
using (var cmd = new SqlCommand (SQL, conn))
{
Parameterized SQL to defense injection attacks
Cmd. Parameters.Add ("@user_login", userlogin);
Cmd. Parameters.Add ("@user_pass", Userpass);
Cmd. Parameters.Add ("@user_nicename", usernicename);
Cmd. Parameters.Add ("@user_email", useremail);
Cmd. Parameters.Add ("@user_status", userstatus);
Cmd. Parameters.Add ("@display_name", displayName);
Cmd. Parameters.Add ("@user_url", Userurl);
Cmd. Parameters.Add ("@user_activation_key", Useractivationkey);
Cmd. ExecuteNonQuery ();
}
}
The code is improved by parameterized SQL statements and using statements, and the code is now more readable and avoids problems such as SQL injection attacks and resource releases.
Next, let's simply test the execution time of the code, and first we add the method Stopwatch.startnew () and Stopwatch.stop () to the code to calculate the execution time of the write code as follows:
Copy Code code as follows:
Calc Insert 10000 records consume time.
var sw = stopwatch.startnew ();
Creates a database connection.
using (var conn = new SqlConnection (configurationmanager.connectionstrings["SQLCONN2"). ToString ()))
{
Conn. Open ();
int cnt = 0;
while (cnt++ < 10000)
{
String sql = string. Format (@ "INSERT into jk_users
(User_login, User_pass, User_nicename, User_email, User_status,display_name, User_url, User_activation_key)
VALUES (@user_login, @user_pass, @user_nicename, @user_email, @user_status, @display_name, @user_url, @user_activation _key) ");
using (var cmd = new SqlCommand (SQL, conn))
{
Parameterized SQL to defense injection attacks
Cmd. Parameters.Add ("@user_login", userlogin);
Cmd. Parameters.Add ("@user_pass", Userpass);
Cmd. Parameters.Add ("@user_nicename", usernicename);
Cmd. Parameters.Add ("@user_email", useremail);
Cmd. Parameters.Add ("@user_status", userstatus);
Cmd. Parameters.Add ("@display_name", displayName);
Cmd. Parameters.Add ("@user_url", Userurl);
Cmd. Parameters.Add ("@user_activation_key", Useractivationkey);
Cmd. ExecuteNonQuery ();
}
}
}
Sw. Stop ();
}
Above, we write 10,000 data to the database, execution time is 7.136 seconds (my machine is very broken), so the system performance still can satisfy many company's demand.
If the volume of user requests increases, can we guarantee that the system will meet the requirements? In fact, we should not be satisfied with the existing system performance, because we know that the execution efficiency of the code and there is a lot of room for improvement.
Next, the method of code improvement will be described further.
Figure 2 Data written to the users table
In order for the database to get faster write speed, we must understand the primary time that the database is being written.
Database Performance Overhead
Connection time
When we execute conn. Open (), first, you must establish a physical channel (such as a socket or named pipe), you must have a first handshake with the server, you must parse the connection string information, the connection must be authenticated by the server, you must run the check to enlist in the current transaction, and so on
This series of operations may take two seconds if we execute conn each time. Open () is very time-consuming to do this series of operations, in order to make the open connection cost is the lowest, ADO. NET uses the optimization method called connection pooling.
Connection pooling: Reduces the number of times a new connection needs to be opened. Whenever a user calls the Open () method on a connection, the pool process checks to see if a pool connection is available and returns the connection to the caller instead of creating a new connection, if one is available; The application calls close on the connection () or Dispose (), the pool process returns the connection to the active connection pool instead of the connection being actually closed, and after the connection is returned to the pool, it can be reused in the next Open call.
the cost of the parser
When we pass SQL statement to SQL Server INSERT INTO ... , it needs to parse the SQL statement, and because the SQL Server parser executes quickly, parsing times can be negligible, but we can still reduce the cost of the parser by using stored procedures rather than straight SQL statements.
Database Connection
To provide acid (four attributes of a transaction), SQL Server must ensure that all database changes are ordered. It is done by using a lock to ensure that the database inserts, deletes, or updates do not conflict with each other (refer to the lock on the database here).
Since most databases are user-facing environments, when we insert the user table, there may be hundreds of users working on the user table, so SQL Server must ensure that these operations are in order.
Then, when SQL Server is doing all of these things, it creates locks to ensure that users get meaningful results. SQL Server guarantees that when each statement is executed, the database is fully predictable (for example, predicting how SQL executes) and managing locks takes a certain amount of time.
Constraint handling
When inserting data, each constraint (such as a foreign key, default value, SQL check, and so on) requires additional time to detect whether the data conforms to the constraint, and because SQL Server guarantees that each inserted, updated, or deleted record is bound, We need to consider whether the constraint should be added to a table with a large amount of data.
Varchar
VarChar is a common type of database, but it can also cause unexpected performance overhead; every time we store variable-length columns, SQL Server has to do more memory management; strings can easily consume hundreds of of bytes of memory if we set the index in a varchar column, Then SQL Server performs a B-tree search, which requires an O (string-length) comparison, however, the number of integer segment comparisons is limited to memory latency and CPU frequency.
Disk IO
SQL Server will eventually write data to disk, first, SQL Server writes data to the transaction log, and when the backup is performed, the transaction log is merged into a permanent database file, which is completed in the background and does not affect the speed of the data query. But everything must have its own disk space, so we can reduce IO overhead by assigning separate disk space to the transaction log and primary data file, and of course the best solution is to minimize the number of transactions.
As you can see, we optimize the database by optimizing the join time, parser overhead, database joins, constraint handling, varchar, and disk IO, and then we'll further optimize the previous examples.
Using Stored Procedures
In the previous example, we hardcode the SQL code directly in the client code, so the database needs to resolve the SQL statements in the client using a parser, so we can use the stored procedure instead, thus reducing the time cost of the parser, and more importantly, because SQL is executed dynamically, So we modify the SQL statements in the stored procedure without recompiling and publishing the program.
The field in the user table user_registered set the default value (GETDATE ()), then we improve the performance of the system by eliminating the table default constraint, in short, we need to provide the value of the field user_registered.
Next, let's omit the default value constraint in the user table and add the stored procedure as follows:
Copy Code code as follows:
-- =============================================
--Author:jkhuang
--Create date:08/16/2012
--Description:creates stored procedure to insert
--Data into table jk_users.
-- =============================================
ALTER PROCEDURE [dbo]. [Sp_insert_jk_users]
@user_login varchar (60),
@user_pass varchar (64),
@user_nicename varchar (50),
@user_email varchar (100),
@user_url varchar (100),
@user_activation_key varchar (60),
@user_status int,
@display_name varchar (250)
As
BEGIN
SET NOCOUNT on;
--The stored procedure allows SQL Server to avoid virtually all parser work
INSERT into Jk_users
(User_login, User_pass, User_nicename, User_email, User_status,display_name, User_url, User_activation_key, User_ Registered)
VALUES (@user_login, @user_pass, @user_nicename, @user_email, @user_status, @display_name, @user_url, @user_activation _key, GETDATE ());
End
Above we defined the stored procedure sp_insert_jk_users insert data into the table, and when we rerun the code, we found that the time to insert the data was shortened to 6.7401 seconds.
Figure 3 Data Write time
Using Database transactions
Think about whether the data can be extended to the database, can it be written in batches? If the delay is allowed to be written to the database, then we can use transaction to delay data writing.
Database transaction is a logical unit in the execution of database management system, which is composed of a limited sequence of database operations. SQL Server ensures that after a successful transaction, the data is written to the database, whereas the transaction is rolled back.
If we do 10 separate operations on the database, SQL Server needs to allocate 10 lock overhead, but if you encapsulate all of these operations in one transaction, SQL Server only needs to allocate one lock overhead.
Copy Code code as follows:
Calc Insert 10000 records consume time.
var sw = stopwatch.startnew ();
Creates a database connection.
using (var conn = new SqlConnection (configurationmanager.connectionstrings["SQLCONN2"). ToString ()))
{
Conn. Open ();
int cnt = 0;
SqlTransaction trans = conn. BeginTransaction ();
while (cnt++ < 10000)
{
using (var cmd = new SqlCommand ("Sp_insert_jk_users", conn))
{
Parameterized SQL to defense injection attacks
Cmd.commandtype = CommandType.StoredProcedure;
Uses transcation to batch insert data.
To avoid lock and connection overhead.
Cmd. Transaction = trans;
Cmd. Parameters.Add ("@user_login", userlogin);
Cmd. Parameters.Add ("@user_pass", Userpass);
Cmd. Parameters.Add ("@user_nicename", usernicename);
Cmd. Parameters.Add ("@user_email", useremail);
Cmd. Parameters.Add ("@user_status", userstatus);
Cmd. Parameters.Add ("@display_name", displayName);
Cmd. Parameters.Add ("@user_url", Userurl);
Cmd. Parameters.Add ("@user_activation_key", Useractivationkey);
Cmd. ExecuteNonQuery ();
}
}
If no exception, commit transcation.
Trans.commit ();
}
Sw. Stop ();
}
Figure 4 Data Write time
Using SqlBulkCopy
By using transactions to encapsulate write operations, when we rerun the code, we find that the data writes are significantly faster, in just 4.5109 seconds, reducing the time required to allocate locks and database joins because one transaction only needs to allocate one lock resource.
Of course, we can also use sqlbulkcopy to achieve a large number of data write operations, the implementation of the code is as follows:
Copy Code code as follows:
var sw = stopwatch.startnew ();
Creates a database connection.
using (var conn = new SqlConnection (configurationmanager.connectionstrings["SQLCONN2"). ToString ()))
{
Conn. Open ();
using (var bulkcopy = new SqlBulkCopy (conn))
{
Maping the data columns.
BULKCOPY.COLUMNMAPPINGS.ADD ("User_login", "User_login");
BULKCOPY.COLUMNMAPPINGS.ADD ("User_pass", "User_pass");
BULKCOPY.COLUMNMAPPINGS.ADD ("User_nicename", "user_nicename");
BULKCOPY.COLUMNMAPPINGS.ADD ("User_email", "User_email");
BULKCOPY.COLUMNMAPPINGS.ADD ("User_url", "User_url");
BULKCOPY.COLUMNMAPPINGS.ADD ("user_registered", "user_registered");
BULKCOPY.COLUMNMAPPINGS.ADD ("User_activation_key", "User_activation_key");
BULKCOPY.COLUMNMAPPINGS.ADD ("User_status", "user_status");
BULKCOPY.COLUMNMAPPINGS.ADD ("Display_name", "display_name");
Bulkcopy.destinationtablename = "Dbo.jk_users";
Insert data into a DataTable.
Bulkcopy.writetoserver (DataRows);
}
Sw. Stop ();
}
Figure 5 Data Write time
Above, we use transaction and SQLBULKCOPY implementation data to bulk write to the database, but in fact, every time we call CMD. The ExecuteNonQuery () method produces a round trip message from the client application to the database, so do we want to see if there is a way to do the write operation with only one message sent?
Using table Parameters
If you use SQL Server 2008, it provides a new function table variable (table Parameters) that aggregates the entire table data into a single parameter passed to a stored procedure or SQL statement. Its attention performance overhead is to assemble the data into parameters (O (data volume)).
Now, we modify the previous code to define our table variables in SQL Server, which are defined as follows:
Copy Code code as follows:
-- =============================================
--Author:jkhuang
--Create date:08/16/2012
--description:declares a user table paramter.
-- =============================================
CREATE TYPE Jk_users_bulk_insert as TABLE (
User_login varchar (60),
User_pass varchar (64),
User_nicename varchar (50),
User_email varchar (100),
User_url varchar (100),
User_activation_key varchar (60),
User_status int,
Display_name varchar (250)
)
Above, we define a table parameter Jk_users_bulk_insert, and then we define a stored procedure to accept the table parameter Jk_users_bulk_insert, which is defined as follows:
Copy Code code as follows:
-- =============================================
--Author:jkhuang
--Create date:08/16/2012
--description:creates a stored procedure, receive
--a jk_users_bulk_insert argument.
-- =============================================
CREATE PROCEDURE Sp_insert_jk_users
@usersTable Jk_users_bulk_insert READONLY
As
INSERT into Jk_users (User_login, User_pass, User_nicename, User_email, User_url,
User_activation_key, User_status, Display_name, user_registered)
SELECT User_login, User_pass, User_nicename, User_email, User_url,
User_activation_key, User_status, Display_name, GETDATE ()
From @usersTable
Next we call the stored procedure in client code and pass the table as a parameter to the stored procedure.
Copy Code code as follows:
var sw = stopwatch.startnew ();
using (var conn = new SqlConnection (configurationmanager.connectionstrings["SQLCONN2"). ToString ()))
{
Conn. Open ();
Invokes the stored procedure.
using (var cmd = new SqlCommand ("Sp_insert_jk_users", conn))
{
Cmd.commandtype = CommandType.StoredProcedure;
Adding a "structured" parameter allows to insert tons of data with low overhead
var param = new SqlParameter ("@userTable", sqldbtype.structured) {Value = dt};
Cmd. Parameters.Add (param);
Cmd. ExecuteNonQuery ();
}
}
Sw. Stop ();
Now, we re performing the write operation and found the write efficiency equivalent to SqlBulkCopy.
1.1.3 Summary
This article through the Blog System User table Design example, describes the mistakes and code flaws that we make in the design process, such as SQL injection, database resource release, and so on, using some common code optimization techniques to optimize your code and analyze the performance overhead of database writes (connection time, parser, Database connectivity, constraint processing, varchar, and disk IO, we use stored procedures, database transactions, SqlBulkCopy, and table parameters to reduce the cost of the database.
[1] http://beginner-sql-tutorial.com/sql-query-tuning.htm
[2] Http://www.dzone.com/links/r/sql_optimization_tipsquestions.html
[3] http://blackrabbitcoder.net/archive/2010/11/11/c.net-little-wonders---a-presentation.aspx
[4] http://www.altdevblogaday.com/2012/05/16/sql-server-high-performance-inserts/