This article presents an entry-level database case. I hope you will share and discuss some typical cases of your own.
Database Case 1: A simple user table.
Business hypothesis:
User table, 10 columns, no foreign keys, 2 million rows (if the data volume were much larger, table sharding should be considered)
The following is a hypothetical distribution of operations (for reference only)
50% query by user ID
40% query by user name
8% query by email
1.5% modify user data, such as status and last logon time
0.5% insert new user data
Operation characteristics: generally, only a single row is queried at a time.
(For analysis and statistics, a synchronized replica database is usually used, and the heavy analytical queries run there.)
(Features such as user rankings and recent-activity lists are generally implemented by other means rather than hammering the user table directly.)
(Of course, if the data volume is small, it hardly matters what you do.)
Software and hardware environment:
CPU: Intel Core i5
Memory: 4 GB
OS: Windows 7 x64 Ultimate Edition
Database: SQL Server 2008 R2 Enterprise Edition
(This is not a server environment, and some settings are not optimized.)
First create the user table and insert more than 2 million rows. The ID column is the clustered index and its values are contiguous; frequent deletes would leave gaps and reduce performance, so you may prefer a soft delete via a status flag instead of physically deleting rows.
CREATE TABLE [dbo].[User] (
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [Username] [varchar](255) NOT NULL,
    [Password] [varchar](255) NOT NULL,
    [Email] [varchar](255) NOT NULL,
    [Age] [smallint] NULL,
    [Gender] [smallint] NULL,
    [Signature] [varchar](255) NULL,
    [Createdtime] [date] NOT NULL,
    [Lastactivitytime] [date] NOT NULL,
    [Status] [int] NULL,
    [Usernamecode] [binary](16) NULL,
    CONSTRAINT [PK_User] PRIMARY KEY CLUSTERED
    (
        [ID] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Since only three columns appear in query conditions, and ID is already the clustered index, we only need two new indexes: one on the Username column and one on the Email column. (In general, indexes should be created for columns that may appear in WHERE clauses; but too many indexes reduce modification performance. In this case queries far outnumber modifications, so the trade-off is acceptable.)
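The creation of these two indexes is not shown in the original script; a minimal sketch (the index names are my own choice) might look like:

```sql
-- Non-clustered index for the 40% of queries that filter by username
CREATE NONCLUSTERED INDEX IX_User_Username ON [dbo].[User] ([Username]);

-- Non-clustered index for the 8% of queries that filter by email
CREATE NONCLUSTERED INDEX IX_User_Email ON [dbo].[User] ([Email]);
```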
First: query by ID (clustered index, contiguous IDs)
DECLARE @Stopwatch DATETIME
DECLARE @Number INT
SET @Stopwatch = GETDATE()
SET @Number = 0
WHILE (@Number < 1000)
BEGIN
    SELECT * FROM [User] WHERE ID = CAST(RAND() * 2000000 AS INT)
    SET @Number = @Number + 1
END
PRINT DATEDIFF(MS, @Stopwatch, GETDATE()) -- 1000 queries take 22773 milliseconds
Second: query by username (email behaves the same)
1. No index, no data cache (a single query takes about 15 seconds)
DBCC DROPCLEANBUFFERS -- this statement clears the data cache
SELECT * FROM [User] WHERE Username = 'user_' + CAST(CAST(RAND() * 2000000 AS INT) AS VARCHAR)
-- The username follows the rule user_<id>; otherwise it would be hard to generate random lookups with a high hit rate.
2. No index, with data cache (1000 random queries: 71473 ms)
DECLARE @Stopwatch DATETIME
DECLARE @Number INT
SET @Stopwatch = GETDATE()
SET @Number = 0
WHILE (@Number < 1000)
BEGIN
    SELECT * FROM [User] WHERE Username = 'user_' + CAST(CAST(RAND() * 2000000 AS INT) AS VARCHAR)
    SET @Number = @Number + 1
END
PRINT DATEDIFF(MS, @Stopwatch, GETDATE())
-- 1000 queries take 71473 milliseconds
Next we create a non-clustered index on this column ourselves, as shown below.
3. Non-clustered Indexes
Create a non-clustered index on the Username column. The index space for the whole table grows from 1 MB to 80 MB, while the table itself remains 461 MB; index size is directly related to the size of the data in the indexed column.
Comparing the execution plans, a query by username consumes roughly twice the resources of a query by ID; in practice the gap is often larger, because usernames are irregular and relatively long.
One optimization is to store a hash code of the username (the Usernamecode binary(16) column above is suited to this). A hash value is smaller (fixed-size, e.g. 128 bits for MD5), evenly distributed, and hash lookup is O(1).
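A minimal sketch of this technique, using the Usernamecode column from the table definition and HASHBYTES (the index name is my own choice):

```sql
-- Populate the hash column once, then index the narrow binary(16) value
UPDATE [dbo].[User] SET Usernamecode = HASHBYTES('MD5', Username);
CREATE NONCLUSTERED INDEX IX_User_Usernamecode ON [dbo].[User] ([Usernamecode]);

-- Seek on the hash; re-check Username to guard against hash collisions
DECLARE @Name VARCHAR(255)
SET @Name = 'user_12345'
SELECT * FROM [dbo].[User]
WHERE Usernamecode = HASHBYTES('MD5', @Name) AND Username = @Name;
```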
Third: modify user data, such as status, password, and last logon time.
In this example only Username and Email are indexed, and both are treated as immutable in the application logic.
So no index is created on any modifiable column (indexing them would make modifications too expensive).
Keep the modification granularity as small as possible; SQL Server 2005 and later support row locks.
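A sketch of such a narrow update, touching only the changed columns and locating the row via the clustered key (the literal values are illustrative):

```sql
-- Update only the columns that changed, seeking on the clustered index;
-- under the default isolation level this locks a single row
UPDATE [dbo].[User]
SET Status = 1,
    Lastactivitytime = GETDATE()
WHERE ID = 12345;
```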
Fourth: insert new user data
Pay attention to the locks acquired when inserting new rows...
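A minimal insert sketch for this schema (all values are illustrative; the hash column is filled at insert time so hash lookups stay consistent):

```sql
INSERT INTO [dbo].[User]
    (Username, Password, Email, Age, Gender, Signature,
     Createdtime, Lastactivitytime, Status, Usernamecode)
VALUES
    ('user_2000001', 'secret', 'user_2000001@example.com', 25, 1, NULL,
     GETDATE(), GETDATE(), 0, HASHBYTES('MD5', 'user_2000001'));
```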
Fifth: mixed operations
How well these perform depends on fundamentals such as lock types, lock granularity, and isolation levels. This article should be useful: http://msdn.microsoft.com/en-us/library/ms190615.aspx
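For example, a read-then-write transaction can state its isolation level explicitly and take an update lock up front to avoid deadlocks (a sketch; READ COMMITTED is already the default, and the ID is illustrative):

```sql
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;
    -- UPDLOCK keeps another session from sneaking in between the read and the write
    SELECT Status FROM [dbo].[User] WITH (UPDLOCK) WHERE ID = 12345;
    UPDATE [dbo].[User] SET Status = 2 WHERE ID = 12345;
COMMIT TRANSACTION;
```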
Tools for simulation and testing: stress-testing tools, performance-testing tools, and SQL Server Profiler.
Appendix:
1. Clearing caches
DBCC FREEPROCCACHE -- removes all elements from the plan cache, a specific plan (by plan handle or SQL handle), or all cache entries associated with a specified resource pool
DBCC DROPCLEANBUFFERS -- removes all clean buffers from the buffer pool (the data cache)
2. Rebuilding indexes
ALTER INDEX ALL ON [User] REBUILD
ALTER INDEX ALL ON [User] REORGANIZE
3. Generating a hash key in SQL Server
HASHBYTES('MD5', Username)
4. Viewing current locks: sp_lock (sp_who and sp_who2 are also very useful; use OBJECT_NAME() to resolve the object IDs it reports into object names)
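For instance (the session ID and object ID below are illustrative placeholders, not real values):

```sql
EXEC sp_lock;          -- list all current locks: spid, dbid, ObjId, type, mode, status
EXEC sp_lock 52;       -- locks held by one session only

-- Resolve an ObjId from the sp_lock output to a table name
SELECT OBJECT_NAME(123456789);
```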