Application of range Partitioning in data expiration Processing
In the database design of massive data, you may need to consider the data storage time in the database in advance, or the data expiration problem. It means that because the data volume is too large, only data with a specific time length is retained in the database. For example, for one year, the data generated one year ago needs to be archived for processing.
At this time, range partitions can play a very good role. Just like in the example above, we can make the table into one partition every month, partitions over one year can be processed according to user requirements.
The following methods are usually used to process expired data.
1. Delete
The method for processing historical data is quite simple. You can directly Delete expired partitions and data on partitions (if each partition is allocated with a tablespace, you can also delete the tablespace and directly release the disk space ), release space.
In the actual production process, to reduce human intervention errors, you can write a scheduled task to complete such operations, for example, the following example.
Or the table above, one partition per month.
Createtable sale_data
(Sale_id number (5 ),
Salesman_namevarchar2 (30 ),
Sales_amount number (10 ),
Sales_date date)
Partition by range (sales_date)
(
Partition sales_2009_1 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2009_2 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2009_3 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2009_4 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2009_5 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2009_6 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2009_7 values less than (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2009_8 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2009_9 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2009_10 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2009_11 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2009_12 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy ')),
Partition sales_2010_1 values lessthan (to_date ('2014/1/123', 'dd/MM/yyyy '))
);
Using the following stored procedure, by creating a job, we can regularly Delete the old partition one year ago.
Create or replace procedure drop_partitionas
V_part_name varchar2 (100); -- add the name prefix of the partition table.
V_over_time number; -- expiration Interval
V_err_num number; -- ora error code
V_err_msg varchar2 (100); -- error description
Begin
-- Obtain the oldest Partition
Select min (partition_name) into v_part_name
From user_tab_partitions where table_name = 'sale _ data ';
-- Obtain the time interval from the earliest partition to the current Partition
Selectmonths_between (sysdate, to_date (substr (min (partition_name), 7,8), 'yyyy-mm') into v_over_time
From user_tab_partitions where table_name = 'sale _ data ';
-- Delete the earliest Partition
If v_over_time> 12 then
Execute immediate 'alter table sale_datadrop partition '| v_part_name;
End if;
Exception
When others then
V_err_num: = sqlcode;
V_err_msg: = substr (sqlerrm, 1,100 );
Dbms_output.put_line ('execution error: '| v_err_num |' error Description: '| v_err_msg );
End drop_partition;
Create a job and execute it once a day.
Declare
Job number;
Begin
Dbms_job.submit (job, 'drop _ partition; ', sysdate, 'sysdate + 1 ');
End;
In this way, you can regularly Delete expired partition data.
2. transplanted to offline databases for Historical Data Processing
For some OLAP systems, data expiration is usually considered in business, and expired data is transplanted to another database for storage. Such databases are called offline databases or historical databases.
Switch some services to the historical database and analyze and process historical data, such as data mining and report analysis.
This database design method is very common. We need to discuss how to migrate online data to offline databases. In this case, partitions play a very important role.
The following is a specific case to illustrate this process.
We call the database used by the current business as an online database. For example, according to business requirements, online databases can only store data for one year, and all the earlier data is transferred to offline databases as historical data. We can roughly list this idea:
(1) When designing an online database, create partitions on a monthly basis for all tables that require data expiration.
(2) Each partition is created in a separate tablespace.
(3) export the tablespace of the partition and copy it to the offline database server.
(4) import the copied data to the offline database on the offline database server.
In this way, a partition data is expired.
3. Remove from database and transfer to archive
In the actual production environment, the third way to process historical data is to archive historical data. It is a compromise, and the cost is lower than that of offline databases, however, historical data can be retained. This processing method requires a large amount of storage space to store expired data.
There are two methods to archive expired data:
(1) data is stored into the database by loading a flat file or a private format file. For example, when loading a data warehouse, archive these data files to the target storage to back up the data.
For example, if a data warehouse system loads data files into the database through SQL * loader, after loading, these files can be transferred to the storage media to be archived at the same time (either a large cabinet or a tape drive ). If you need to use the data in the future, you can copy the data back and reload the data to the database for use.
This method is shown in 2-1.
(2) Export historical data from the database and archive the data. Compared with the previous method, this method obviously consumes a large amount of database resources, but it is also a method.
Export the data in the database by transferring tablespaces or tables, and then archive these files to the storage media.
2-2 shows this method.
This article is excerpted from the book "Making Oracle run faster 2-database design and optimization based on massive data.
Book details: http://blog.csdn.net/broadview2006/article/details/6650914