Design efficient stored procedures


Design background

For historical reasons, the data volume in the production database is huge: many tables hold tens of millions or even hundreds of millions of rows. The goal is to move and archive data from N associated tables that all hang off a single source (base) table. In this article N is 50, and each table holds roughly 50 million rows.

Worst-performing SQL evolution objective

The KeyName field has the same meaning and name in both tables. The task: retrieve the first 500 rows from bug01 whose KeyName does not appear in bug02.

Worst performance:

SELECT TOP 500 a.KeyName
FROM bug01 a
LEFT JOIN bug02 b ON a.KeyName = b.KeyName
WHERE a.KeyName NOT IN (SELECT DISTINCT b.KeyName FROM bug02)
ORDER BY a.KeyName ASC

The final evolved form is revealed at the end of the article.

Detailed Design

Problems to solve: performance, safety, and fault tolerance.

Why is the process designed this way? The reasons are explained below.

Step 1. Filter the source table data

There is not much to say about this part: set filtering rules that fit your own business scenario, as in the sketch below.
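A minimal sketch of such a filtering rule, using the @KeyText and @SynSize variables that appear in the later steps; the SourceTable name and OrderDate column are hypothetical:

-- Hypothetical filter: archive only orders older than six months,
-- one batch of @SynSize keys at a time.
DECLARE @KeyText varchar(4000), @SynSize int
SET @SynSize = 5000
SET @KeyText = 'SELECT TOP ' + CAST(@SynSize AS VARCHAR(10)) + ' OrderID
                FROM SourceTable WITH (NOLOCK)
                WHERE OrderDate < DATEADD(MONTH, -6, GETDATE())
                ORDER BY OrderID'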

Step 2. Copy the source table data

The program's entry point must be the source table, and the contents of the extension tables are expanded outward using the source table as the key. So how should this expansion be done?

First, establish the hierarchy among the 50 tables: perhaps only 10 of them are directly associated with the source table's key.

For example, suppose we want to archive the details of every library in a city, using the library as the source table. Bookshelves, addresses, and member information are directly associated with the library, so we classify these three as level-1 tables.

Bookshelves are in turn associated with book categories, addresses with street information, and members with borrowing records, so we classify those three as level-2 tables. The sketch below illustrates the hierarchy.
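The resulting hierarchy, with hypothetical table names:

-- Library             (source table)
-- ├─ Bookshelf        (level 1) ── BookCategory  (level 2)
-- ├─ Address          (level 1) ── StreetInfo    (level 2)
-- └─ Member           (level 1) ── BorrowRecord  (level 2)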

Solution 1: use a cursor to loop over the source table's key values and process each key's related data. Suppose we process the source table in batches of 500 rows.

That is, traverse all nodes for each library ID. Even counting only the level-1 tables and ignoring the level-2 and level-3 tables, that is 500 * 50 = 25,000 INSERT operations, plus the same number of SELECTs.

That pleases nobody, and it is also hard to see how to traverse the level-2 and level-3 tables this way.
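A minimal sketch of solution 1, with the hypothetical library table names from above, to show where the 500 * 50 operations come from:

-- One cursor pass per source key: 500 keys x 50 tables = 25,000 INSERTs.
DECLARE @LibraryID varchar(50)
DECLARE cur_src CURSOR FOR SELECT TOP 500 LibraryID FROM Library
OPEN cur_src
FETCH cur_src INTO @LibraryID
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO Bookshelf_Archive SELECT * FROM Bookshelf WHERE LibraryID = @LibraryID
    -- ... repeated for each of the other 49 associated tables ...
    FETCH cur_src INTO @LibraryID
END
CLOSE cur_src
DEALLOCATE cur_src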

Solution 2: collect the source table's keys into a variable and use an IN expression. It seems feasible: the number of operations drops to 1/500 of solution 1. But then comes the fatal problem:

Variable length. A varchar cannot exceed its maximum length (65,535 at most), so a long key list simply will not fit.
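A minimal sketch of solution 2, again with hypothetical names, showing where the length cap bites:

-- Concatenate the batch's keys into one variable; this breaks once the
-- list outgrows the varchar limit.
DECLARE @Keys varchar(8000)
SELECT @Keys = COALESCE(@Keys + ',', '') + '''' + LibraryID + ''''
FROM (SELECT TOP 500 LibraryID FROM Library) AS t
EXEC ('INSERT INTO Bookshelf_Archive SELECT * FROM Bookshelf WHERE LibraryID IN (' + @Keys + ')')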

Solution 3: turn the source table's keys into a query filter pool (a reusable WHERE condition fed to the level-1 tables; the details follow below). Compared with solution 2, the number of operations seems to rise again.

Ignoring the hierarchy, that is 50 INSERT operations and 50 * 2 SELECTs, which is acceptable.

Solution 3, extended: for a very large table, even those 50 operations are not optimistic, and the 50 can easily become 50,000 or more.

Another problem: while you operate on those 500 records, the data may shift under you. The 500 records you fetched a second ago are not necessarily the same 500 a second later.

Therefore, a temporary table policy is adopted.

CREATE TABLE #p
(
    OrderID varchar(50),
    PRIMARY KEY (OrderID)
);

SET @temp_text = 'INSERT INTO #p ' + @KeyText
--PRINT @temp_text
EXEC (@temp_text)

SET @KeyText = 'SELECT OrderID FROM #p'
-- If the level-1 tables involve many operations, read from the temporary
-- table instead of hitting the physical source table each time.

SET @SubKeyText = 'SELECT <level-1 table A join key> FROM <level-1 table A> WITH (NOLOCK)
                   WHERE <level-1 table A source-table key> IN (' + @KeyText + ')'

CREATE TABLE #q
(
    OrderID varchar(50),
    PRIMARY KEY (OrderID)
);

SET @temp_text = 'INSERT INTO #q ' + @SubKeyText
EXEC (@temp_text)

SET @SubKeyText = 'SELECT OrderID FROM #q'
-- If a level-1 table involves only a few operations, the data filter pool
-- can be generated directly instead.

SET @SubKeyTextforA = 'SELECT <level-1 table B key> FROM <level-1 table B> WITH (NOLOCK)
                       WHERE <level-1 table B source-table key> IN (' + @KeyText + ')'
SET @SubKeyTextforB = 'SELECT <level-1 table C key> FROM <level-1 table C> WITH (NOLOCK)
                       WHERE <level-1 table C source-table key> IN (' + @KeyText + ')'

-- If there are more layers, keep chaining filter pools; this demo stops at three.
SET @THKeyTextforA = 'SELECT <level-2 table A key> FROM <level-2 table A> WITH (NOLOCK)
                      WHERE <level-2 table A level-1-table key> IN (' + @SubKeyTextforA + ')'

Step 3. Table sharding

The questions at this stage: how to control transaction size safely, how to provide fault tolerance, and how to keep the program scalable and maintainable.

You can set the batch range according to your business scenario. Taking the bug demo as an example: tables at the 50-million-row level should be processed in the inner layer, with transactions batched at 5,000 rows; batches under 5,000 rows can be wrapped in a single transaction at the outermost layer.

The transaction size directly affects how much performance fluctuates.
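A minimal sketch of bounded-batch transactions, using a hypothetical BigTable/BigTable_Archive pair rather than the article's cursor pipeline, to show why committing every 5,000 rows keeps locks and log growth in check:

-- Each 5,000-row batch commits on its own, so a failure loses at most
-- one batch, and long locks / log growth are avoided.
DECLARE @Moved int, @Err int
SET @Moved = 1
WHILE @Moved > 0
BEGIN
    BEGIN TRAN
    DELETE TOP (5000) FROM dbo.BigTable
    OUTPUT DELETED.* INTO dbo.BigTable_Archive
    -- Capture both values in one statement; each resets after every statement.
    SELECT @Moved = @@ROWCOUNT, @Err = @@ERROR
    IF @Err <> 0
    BEGIN
        ROLLBACK TRAN
        BREAK   -- give up; the article instead logs the batch to an exception table
    END
    COMMIT TRAN
END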

For fault tolerance you can design your own scheme. Here a third kind of table, an exception table, is used: when a batch fails, its keys are recorded there, and the next batch filters those keys out and carries on.

-- Import the failed batch's order numbers into the exception table (@ExTable).
-- @ExTable holds the abnormal data: if the current batch errors, that batch's
-- order IDs are stored here, and the next batch filters them out before running.
INSERT INTO <exception table named by @ExTable>
SELECT OrderID FROM #p

SET @KeyText = 'SELECT TOP ' + CAST(@SynSize AS VARCHAR(10)) + ' ' + @Base_Key + '
                FROM ' + @BaseTable + '
                WHERE ' + @Base_Key + ' NOT IN (SELECT ' + @Base_Key + ' FROM ' + @ExTable + ')'

How to make the program elegant and maintainable

We can borrow object-oriented ideas inside the stored procedure. Stored procedures have no such concept natively, so we design our own.

Again, a temporary table (#k) is used, this time as a dispatch table:

-- #k drives the processing loop in step 4; its columns are inferred from
-- the cursor there: (TableName, KeyName, temptext, colname).
CREATE TABLE #k
(
    TableName varchar(100),
    KeyName   varchar(100),
    temptext  varchar(4000),
    colname   varchar(100)
);

-- Level-1 tables: joined directly on the source table's primary key.
INSERT INTO #k VALUES ('Level-1 table A', @Base_Key, @KeyText, '')
INSERT INTO #k VALUES ('Level-1 table B', @Base_Key, @KeyText, '')
INSERT INTO #k VALUES ('Level-1 table C', @Base_Key, @KeyText, '')

-- Level-2 tables: indirectly associated through the level-1 keys (@SubKeyText).
INSERT INTO #k VALUES ('Level-2 table A', '<level-2 table A join key to level 1>', @SubKeyText, '')
INSERT INTO #k VALUES ('Level-2 table B', '<level-2 table B join key to level 1>', @SubKeyText, '')
INSERT INTO #k VALUES ('Level-2 table C', '<level-2 table C join key to level 1>', @SubKeyText, '')

-- Special processing: a custom operation.
INSERT INTO #k VALUES ('Special table', '<special table join key>', '<custom data filtering method>', '')

-- Tables with other auto-increment columns
-- (e.g. order-modification and cancellation history tables).
INSERT INTO #k VALUES ('Auto-increment table', @Base_Key, @KeyText, '<custom field>')

Step 4. Process details

A cursor loops over the temporary table #k and processes each table once.

DECLARE CUR_ORDERHEDER INSENSITIVE CURSOR FOR
    SELECT TableName, KeyName, temptext, colname FROM #k

OPEN CUR_ORDERHEDER
FETCH CUR_ORDERHEDER INTO @Cur_Table, @Cur_Key, @Cur_W, @Cur_K
WHILE @@FETCH_STATUS = 0
BEGIN
    EXECUTE P_Task_Sub_Synchronization
        @OutParam    = @OutParam OUT,
        @OutMessage  = @OutMessage OUT,
        @KeyText     = @Cur_W,
        @Table       = @Cur_Table,
        @Extension   = @Extension,
        @IsDelSource = @IsDelSource,
        @KeyName     = @Cur_Key,
        @ColName     = @Cur_K
    --SET @OutMessage = @OutMessage + @OutMessage
    --PRINT @OutMessage
    IF @OutParam <> 0
    BEGIN
        SET @OutMessage = @OutMessage + @Cur_Table + ' operation failed'
        ROLLBACK TRAN
        -- Import the failed batch's order numbers into the exception table.
        INSERT INTO <exception table named by @ExTable>
        SELECT OrderID FROM #p
        DROP TABLE #k
        DROP TABLE #p
        DROP TABLE #q
        RETURN
    END
    FETCH CUR_ORDERHEDER INTO @Cur_Table, @Cur_Key, @Cur_W, @Cur_K
END
CLOSE CUR_ORDERHEDER
DEALLOCATE CUR_ORDERHEDER

Step 5. Release resources

Step 6. Process

 

These two steps are not covered in detail here; for step 5, a minimal cleanup sketch follows.
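A minimal sketch of the step-5 cleanup, mirroring the error path in step 4:

-- The cursor was already closed and deallocated at the end of step 4;
-- what remains is dropping the three temporary tables.
DROP TABLE #k
DROP TABLE #p
DROP TABLE #q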

Worst-performing SQL Evolution Process

Step 1. NOT IN: drop the DISTINCT. DISTINCT and NOT IN are both notorious performance offenders, and stacking them compounds the cost; NOT IN already performs the membership test, so the DISTINCT in the subquery adds nothing but work.

SQL statement after modification:

SELECT TOP 500 a.KeyName
FROM bug01 a
LEFT JOIN bug02 b ON a.KeyName = b.KeyName
WHERE a.KeyName NOT IN (SELECT b.KeyName FROM bug02)
ORDER BY a.KeyName ASC

Step 2. Aliases. Do not underestimate aliases: the subquery above gives bug02 no alias of its own, so b.KeyName actually binds to the outer join's bug02. (The original article contrasts the execution plans with a diagram.)

SQL statement after modification:

SELECT TOP 500 a.KeyName
FROM bug01 a
LEFT JOIN bug02 b ON a.KeyName = b.KeyName
WHERE a.KeyName NOT IN (SELECT c.KeyName FROM bug02 c)
ORDER BY a.KeyName ASC

Step 3. With the subquery already filtering directly, why keep the outer LEFT JOIN at all?

SQL statement after modification:

SELECT TOP 500 a.KeyName
FROM bug01 a
WHERE a.KeyName NOT IN (SELECT c.KeyName FROM bug02 c)
ORDER BY a.KeyName ASC

Step 4. Following luofer's suggestion, evolve directly to a set operation:

SELECT TOP 500 a.KeyName FROM bug01 a
EXCEPT
SELECT b.KeyName FROM bug02 b

Source: http://www.cnblogs.com/dubing/archive/2011/11/11/2245836.html
