Today, I listened to the msdn webcast lecture about Data Access to entlib. At the end of the lecture, I asked two questions that I was concerned about:
- In a large application, if you need more than two sets of databases (such as SQL Server and Oracle), Can You encapsulate all the required SQL queries in the stored procedure, in this way, you only need a set of accessCodeIs there a better way to solve this problem?
- It is easy to directly use guid as the primary key when setting the primary key of a database (multiple databases are supported at the same time). But there are not many factors that affect the performance during query. Is there any better solution?
The above two questions are briefly answered by Microsoft engineers due to the time. The first question should be taken into consideration for specific applications. However, the second question must have a certain performance impact, but the impact is not big. With this problem, I did this small experiment.
Note: If you have better suggestions, we will discuss them!
Test environment:
- Dell laptop 1.5 GB
- Win XP Professional
- 512 MB DDR RAM
- SQL Server 2000 Personal Edition
Test method:
- Create a database [test_guid] with 10 fields. Use guid as the primary key and other common field types to simulate the actual usage. The SQL code for table creation is as follows:
Create Table [DBO]. [test_guid] ( [guid] [varchar] (50) Collate chinese_prc_ci_as not null, [test1] [varchar] (50) Collate chinese_prc_ci_as null, [Test2] [datetime] Null, [test3] [varchar] (50) Collate chinese_prc_ci_as null [test4] [varchar] (100) collate partition null, [test5] [varchar] (100) Collate chinese_prc_ci_as null, [test6] [varchar] (50) Collate chinese_prc_ci_as null, [test7] [text] collate chinese_prc_ci_as null, [test8] [int] Null, [test9] [int] Null ) on [primary] textimage_on [primary] go alter table [DBO]. [test_guid] With nocheck add constraint [pk_test_guid] primary key clustered ( [guid] ) on [primary] go |
- Create a database [test_iidd] with 10 fields, use iidd as the primary key, and other common field types to simulate the actual usage. The SQL code for table creation is as follows:
Create Table [DBO]. [test_iidd] ( [iidd] [numeric] (9) Identity (1, 1) not null, [test1] [varchar] (50) Collate chinese_prc_ci_as null, [Test2] [datetime] Null, [test3] [varchar] (50) Collate chinese_prc_ci_as null [test4] [varchar] (100) collate partition null, [test5] [varchar] (100) Collate chinese_prc_ci_as null, [test6] [varchar] (50) Collate chinese_prc_ci_as null, [test7] [text] collate chinese_prc_ci_as null, [test8] [int] Null, [test9] [int] Null ) on [primary] textimage_on [primary] go alter table [DBO]. [test_iidd] With nocheck add constraint [pk_test_iidd] primary key clustered ( [iidd] ) on [primary] go |
- As you can see, the first table uses a globally unique identifier (guid) as the primary key, while the second table uses a normal numeric (similar to int type) data type as the primary key, here is a brief introduction to guid:
Guid, a globally unique identifier, is often used in the COM component identifier. Because it is almost impossible to generate duplicate two values, it is often used in various fields. The specific values are as follows: as shown in the A89C9547-032B-4860-ABB5-6EAEAVE934D5, you must have seen a similar string, ^ _ ^, using the newid () function in SQL Server2000 to get a unique guid |
- Run the following two SQL statements to insert 0.1 million statements to the two tables respectively. I am concerned about the effect of large data volumes, so don't blame me for choosing 0.1 million data records when I start.
declare @ num int set @ num = 0 while (@ num <100000) begin insert into test_guid values ( newid (), 'x222222222222222222222222 ', getdate (), 'aaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbb', 'cccccccccccccccccccccccccccccccccccccccc ', '479c8aad-3040-4fc5-b53a-d6af085ad38a average ', '1', '0' ) set @ num = @ num + 1 end |
Declare @ num int Set @ num = 0 While (@ num <100000) Begin Insert into test_iidd Values ( 'X22222222222222222222222222 ', Getdate (), 'Aaaaaaaaaaaaaaaaaaa ', 'Bbbbbbbbbbbbbbbbbbbbbb ', 'Cccccccccccccccccccccccccccccccc ', 'Ddddddddddddddd ', '479c8aad-3040-4fc5-b53a-d6af085ad38a 479c8aad-3040-4fc5-b53a-d6af085ad38a 479c8aad-3040-4fc5-b53a-d6af085ad38a 479c8aad-3040-4fc5-b53a-d6af085ad38a 479c8aad-3040-4fc5-b53a-d6af085ad38a 479c8aad-3040-4fc5-b53a-d6af085ad38a 479c8aad-3040-4fc5-b53a-d6af085ad38a 479c8aad-3040-4fc5-b53a-d6af085ad38a ', '1 ', '0' ) Set @ num = @ num + 1 End |
- Start the test. The test code and result are as follows:
# Test 1 (guid)
------------------ declare @ times datetime set @ times = getdate () -------------------- select * From test_guid where guid = 'a89c9547-hangzhou' or guid = 'fffa8619-BC9F-4B76-ACE8-B3324105BBDE 'or guid = 'fffc26d5-hangzhou' or guid = 'fff9fa53-E115-450A-A52D-B0AET36FF539 'or guid = 'a89c9547-032b-4860-abb5-6eaeave934d5' or guid = 'fff90a0b-CB5B-446F-81FC-CFA661D03CF8' or guid = 'fff85f4a-hangzhou' or guid = 'ffff354a-ED3E-4C3A-A033-3406F229EB34 ' order by guid DESC ------------------- select datediff (second, @ times, getdate () as Second, datediff (MS, @ times, getdate () as millisecond ------------------- |
0 s, 0 ms, sometimes 10 Ms |
# Test 2 (iidd)
-------------------- Declare @ times datetime Set @ times = getdate () -------------------- Select * From test_iidd Where Iidd = '1' or Iidd = '2' or Iidd = '000000' or Iidd = '000000' or Iidd = '000000' or Iidd = '3' or Iidd = '8' or Iidd = '000000' Order by iidd DESC --------------------- Select datediff (second, @ times, getdate () as Second, datediff (MS, @ times, getdate () as millisecond --------------------- |
0 seconds, 0 milliseconds, sometimes 10 milliseconds |
- We can see that the efficiency of normal select queries is not significant in the case of 0.1 million data records.
# Test 3 (guid)
-------------------- declare @ times datetime set @ times = getdate () ------------------ select count (*) from test_guid ------------------- select datediff (second, @ times, getdate () as Second, datediff (MS, @ times, getdate ()) as millisecond ------------------- |
29 seconds, 28793 milliseconds, poor performance! |
# Test 4 (iidd)
-------------------- Declare @ times datetime Set @ times = getdate () -------------------- Select count (*) from test_iidd --------------------- Select datediff (second, @ times, getdate () as Second, datediff (MS, @ times, getdate () as millisecond --------------------- |
The first running is 3 seconds, the second running is 1 second, and the third running is 0 seconds, 50 milliseconds, my God! |
- How is this good? guids suffer a big loss when there is no WHERE clause for aggregation operations
# Test 5 (guid)
-------------------- declare @ times datetime set @ times = getdate () ------------------ select count (*) from test_guid where Test2> '2017-06-03 21:05:33. 330 ' ------------------- select datediff (second, @ times, getdate () as Second, datediff (MS, @ times, getdate ()) as millisecond ------------------- |
29 seconds, 29093 milliseconds, although the query only found more than 200 pieces of data, the speed has not changed! |
# Test 6 (iidd)
-------------------- declare @ times datetime set @ times = getdate () ------------------ select count (*) from test_iidd where Test2> '2017-06-03 21:05:33. 330 ' ------------------- select datediff (second, @ times, getdate () as Second, datediff (MS, @ times, getdate ()) as millisecond ------------------- |
the first operation is 2 seconds, and the second operation is 0 seconds, 160 milliseconds, which is a little slower than the case where there is no where |
- As shown in the results, the results are not ideal.
# Test (guid)
Add the Test2 column (datetime) of the table test_guid as the index column. |
Run Test 3: 0 seconds, 50 milliseconds... Run Test 5: 0 seconds, 0 milliseconds. It is very obvious. |
# Test 8 (iidd)
Add the Test2 column (datetime) of the table test_iidd as the index column. |
Run Test 4: 0 s, 40 ms Run Test 6: 0 s, 40 ms |
-
- Test 7 and Test 8 have different return values, which may cause minor differences. This can be ignored (because I tested that the difference is very small when the return values are the same)
-
- It can be seen that adding a time-type or Int-type index to the table with guid as the primary key can compensate for the performance loss caused by using guid as the primary key.
Summary:
Due to the time, this test is one-sided and superficial. We also hope that we can supplement and improve the deficiencies and omissions, after this test, I think I will perform more tests on performance. Please do it again.
This test only produces such superficial things, hoping that your precious time will not be wasted ^_^!