An unconventional fix for a small-table hotspot disk problem in a large-volume data conversion
I encountered a strange problem.
There are two tables, each with about 300 million rows. Table A has field A1 and table B has field B1, and both fields are updated from other tables: A1 is updated from table C, and B1 from table D. The difference is that table C is very small, with only about 10 thousand rows, while table D is very large, with about 300 million rows. (Below, table C is called the small table and table D the large table.)
Because table B's 300 million rows have to pull their updates from table D's 300 million rows, I was worried about that at first and planned to do something special for table D. After testing, however, table B actually updated faster than table A. After thinking it over and asking the DBA, the analysis showed it was a hot disk problem.
During the update, the job takes the modulo of an indexed field to split the work, dividing each table into roughly 50 modulo slices. Because the small table is read on every update and all of its data sits on the same disk, every slice hits that same disk, which creates a hotspot. When reading the large table, the probability that slices hit the same disk is low, so table B updates faster than table A. A sketch of one such slice is shown below.
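To make the access pattern concrete, here is a minimal sketch of one update slice. The names table_a, a1 and new_attr_id are placeholders rather than the actual conversion script; prod_attr_spec stands for the small lookup table that appears later in this post.

-- Hypothetical sketch of one of the ~50 update slices (:mod_num is the slice number).
-- Every slice re-reads the same small lookup table, so all slices hammer the disk
-- that holds it, while reads against a large lookup table spread across many disks.
update table_a a
   set a.a1 = (select c.new_attr_id
                 from prod_attr_spec c
                where c.attr_id = a.attr_id)
 where mod(a.attr_id, 50) = :mod_num;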
It is a bit convoluted, but that is the cause. So how can the problem be solved?
The first approach I thought of was to keep the small table's data in the buffer cache, so that accesses to the small table are served from the database cache as much as possible (see https://www.cnblogs.com/Richardzhu/p/3437925.html).
Following that article, I pinned the small table in memory. Efficiency improved a little, but the effect was not obvious, and I still do not know why.
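For reference, pinning the small table in the KEEP pool along the lines of that article comes down to statements like these. The index name is a placeholder, and this assumes the DBA has already sized a KEEP pool (db_keep_cache_size).

-- Put the small table and its index into the KEEP buffer pool so repeated reads stay in memory.
alter table prod_attr_spec storage (buffer_pool keep);
alter index idx_prod_attr_spec_attrid storage (buffer_pool keep);  -- index name assumed
alter table prod_attr_spec cache;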
The second approach, on the DBA's advice, was to raise the small table's PCTFREE (see https://www.cnblogs.com/linjiqin/archive/2012/01/16/2323320.html) to a large value, 70, so that the rows are stored more sparsely across blocks. Perhaps because my table is too small, this made essentially no difference.
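For completeness, changing PCTFREE only affects blocks formatted after the change; to spread rows that are already stored, the table has to be rebuilt. A sketch of that step (the index name is again a placeholder):

-- Leave 70% of each block free so rows spread over more blocks, then move the
-- table to apply it to existing rows and rebuild the index invalidated by the move.
alter table prod_attr_spec pctfree 70;
alter table prod_attr_spec move;
alter index idx_prod_attr_spec_attrid rebuild;  -- index name assumed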
In the end, I reasoned that since the problem came from the modulo split, why not split the small table by the same modulo logic so that the accesses themselves are spread out? I added a modulo field to the small table and inserted one full copy of its data for each modulo value, about 50 copies in total. I then recreated the original index as a composite index together with the modulo field, and kept the new table in the cache.
-- Table
create table prod_attr_spec_on_mod pctfree 70 as select * from prod_attr_spec where 1 = 2;
alter table prod_attr_spec_on_mod add TABLE_NAME VARCHAR2(50);
alter table prod_attr_spec_on_mod add FUN_NAME VARCHAR2(100);
alter table prod_attr_spec_on_mod add PROCESS_MOD_NUM number;
declare
  i number;
begin
  -- insert one full copy of the small table per modulo value
  for i in 0 .. 50 loop
    insert into prod_attr_spec_on_mod
      select t.*,
             'serv_attr', 'get_PRODUCT_ATTR_ID_BY_OLD_ONMOD', i
        from prod_attr_spec t;
    commit;
  end loop;
end;
/
-- Index
create index IDX_PRIDATTRIDPRODATTRonmod on prod_attr_spec_on_mod (attr_id, PROCESS_MOD_NUM);
-- Keep into memory
alter table prod_attr_spec_on_mod storage (buffer_pool keep);
alter index IDX_PRIDATTRIDPRODATTRonmod storage (buffer_pool keep);
alter table prod_attr_spec_on_mod cache;
-- Table analysis
begin
  dbms_stats.gather_table_stats
    (ownname          => 'gj_user',
     tabname          => 'prod_attr_spec_on_mod',
     estimate_percent => 100,
     method_opt       => 'for all columns',
     cascade          => true,
     degree           => 16
    );
end;
/
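With one copy of the small table per modulo value, each update slice can filter on its own PROCESS_MOD_NUM through the composite index, so the reads are spread over many more blocks. A hedged sketch of how the slice from the earlier example changes (table_a and the non-key column names are still placeholders):

-- Each slice now probes its own copy of the lookup data via (attr_id, process_mod_num),
-- so different slices touch different index and table blocks instead of the same ones.
update table_a a
   set a.a1 = (select c.new_attr_id
                 from prod_attr_spec_on_mod c
                where c.attr_id = a.attr_id
                  and c.process_mod_num = :mod_num)
 where mod(a.attr_id, 50) = :mod_num;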
After this change, the conversion ran more than twice as fast. It is an unconventional trick, but it turned out to be quite practical.