PostgreSQL partition table (partitioning) application examples detailed _postgresql

Source: Internet
Author: User
Tags create index postgresql system log


Objective



There are requirements in the project to vertical table, that is, according to the time interval to split the data into N tables, PostgreSQL provides the function of the partition table. A partitioned table actually divides a large logical table into several small pieces of physics, providing a number of benefits, such as:



1, query performance greatly improved
2, delete historical data faster
3, the use of less common historical data using tablespace technology to transfer to low-cost storage media
So when should we use the partition table? The official directive is: when the size of the table exceeds the physical memory size of the database server, you should use the partition table, then combine an example to record the detailed process of creating the partitioned table.



Create a partition table



First look at the requirements, now there is a log table, now need to be in the table by the Action Time field (Operation_time) partition, the following figure:









This requirement is a typical time to create a partitioned table, first look at the steps:



1. Create a parent table
2. Create n child tables, each of which is inherited Yu Fu
3. Define a rule or trigger (Trigger), redirect data into the primary table to the appropriate partition table



As shown above, the overall big step is divided into the above three, of course, there are some small optimization measures, such as for each partition, in the keyword field to create an index and so on. First look at the first step-create the parent table.



Before you create a partitioned table, you should create a "parent table", where all the partition tables inherit from it, no data in the table, no check constraints and indexes on the table, and now we create a table, but we first build a sequence:


CREATE SEQUENCE "public". T_sys_log_main_id_seq "
 INCREMENT 1
 minvalue 1
 MAXVALUE 99999999
 START 1
 CACHE 1;

ALTER TABLE "public". T_sys_log_main_id_seq "OWNER to" "Postgres";


Next, create the parent table because it is a log table, so the table name is named "T_sys_log_main":

CREATE TABLE "public". "T_sys_log_main" (
"id" int4 DEFAULT nextval ('t_sys_log_main_id_seq' :: regclass) NOT NULL,
"account_affiliation_code" varchar (100) COLLATE "default" NOT NULL,
"account_affiliation" varchar (50) COLLATE "default" NOT NULL,
"operation_time" timestamp (6) NOT NULL,
"operation_key" varchar (2) COLLATE "default" NOT NULL,
"operation_value" varchar (30) COLLATE "default" NOT NULL,
"operation_loginid" varchar (100) COLLATE "default" NOT NULL,
"operation_message" varchar (300) COLLATE "default" NOT NULL,
"operation_ip" varchar (30) COLLATE "default" NOT NULL
)
WITH (OIDS = FALSE)

;
COMMENT ON TABLE "public". "T_sys_log_main" IS 'System log table';
COMMENT ON COLUMN "public". "T_sys_log_main". "Account_affiliation_code" IS 'Code of the institution to which the account belongs';
COMMENT ON COLUMN "public". "T_sys_log_main". "Account_affiliation" IS 'Account Affiliation';
COMMENT ON COLUMN "public". "T_sys_log_main". "Operation_time" IS 'operation time';
COMMENT ON COLUMN "public". "T_sys_log_main". "Operation_key" IS 'operation type (key)';
COMMENT ON COLUMN "public". "T_sys_log_main". "Operation_value" IS 'operation type (value)';
COMMENT ON COLUMN "public". "T_sys_log_main". "Operation_loginid" IS 'operation account';
COMMENT ON COLUMN "public". "T_sys_log_main". "Operation_message" IS 'operation information';
COMMENT ON COLUMN "public". "T_sys_log_main". "Operation_ip" IS 'Login Address';

ALTER TABLE "public". "T_sys_log_main" ADD PRIMARY KEY ("id");


Run the above DDL statement to create the parent table. After the success of the creation, you can create partitioned tables one after the other, because each partition table is inherited from the parent table, so the partition table does not add any fields, below we create 4 partition child tables on demand, which are used to store the log data for September, October, November, and December respectively:





CREATE TABLE T_sys_log_y2016m09
(CHECK (operation_time >= date ' 2016-09-01 ' and operation_time< date ') 2016-10-01 '))
INHERITS (t_sys_log_main);

CREATE TABLE T_SYS_LOG_Y2016M10
(CHECK (operation_time >= date ' 2016-10-01 ' and operation_time< date ') 2016-11-01 '))
INHERITS (t_sys_log_main);

CREATE TABLE T_sys_log_y2016m11
(CHECK (operation_time >= date ' 2016-11-01 ' and operation_time< date ') 2016-12-01 '))
INHERITS (t_sys_log_main);

CREATE TABLE T_sys_log_y2016m12
(CHECK (operation_time >= date ' 2016-12-01 ' and operation_time< date ' 2017-01 -01 '))
INHERITS (T_sys_log_main);


As shown above, after the run is completed, you can create 4 partition child tables, in which we add a constraint indicating that only the data for this month is allowed to be inserted, and then the index is indexed on each partition key on the 4 partitioned tables:


Create INDEX T_sys_log_y2016m09_operation_time on t_sys_log_y2016m09 (operation_time);
Create INDEX T_sys_log_y2016m10_operation_time on T_SYS_LOG_Y2016M10 (operation_time);
Create INDEX T_sys_log_y2016m11_operation_time on T_SYS_LOG_Y2016M11 (operation_time);
Create INDEX T_sys_log_y2016m12_operation_time on T_SYS_LOG_Y2016M12 (operation_time);


So far, our partition table has been created, then we need to consider the problem of data insertion, how can we get data from different dates automatically inserted into its corresponding partition child table? There are two solutions: rules (rule) and triggers (Trigger), which are more expensive than triggers, so I'm not going to do a lot of introductions here, so here's a direct introduction to the Trigger approach.



Trigger usually incorporates custom functions (function) to implement partition insertion, function is responsible for selecting inserts based on conditions, and trigger is responsible for automatic invocation of function. The function is defined first by the simple ability to insert data according to the date range:


CREATE
OR REPLACE FUNCTION sys_log_insert_trigger () RETURNS trigger as $$
BEGIN

IF (
  NEW. Operation_ Time >= date ' 2016-09-01 ' and
  NEW. Operation_time < date ' 2016-10-01 '
) THEN
  INSERT into t_sys_log_y201 6m09
VALUES
  (NEW. *);
ELSEIF (
  new operation_time >= date ' 2016-10-01 ' and
  new. Operation_time < date ' 2016-11-01 '
) THEN
  INSERT into T_SYS_LOG_Y2016M10
VALUES
  (NEW. *);
ELSEIF (
  new operation_time >= date ' 2016-11-01 ' and
  new. Operation_time < date ' 2016-12-01 '
) THEN
  INSERT into T_sys_log_y2016m11
VALUES
  (NEW. *);
ELSEIF (
  new operation_time >= date ' 2016-12-01 ' and
  new. Operation_time < date ' 2017-01-01 '
) THEN
  INSERT into T_sys_log_y2016m12
VALUES
  (NEW. *);
ELSE
  RAISE EXCEPTION ' Date out of range! ';
End
IF; return NULL;
End; $$ LANGUAGE Plpgsql;


Finally, create a trigger to perform the function you just made:


CREATE TRIGGER Sys_log_insert_trigger before insert on T_sys_log_main for each
ROW
EXECUTE PROCEDURE sys_log_ Insert_trigger ();


Here is the complete creation, the final test to see the results. To confirm that our triggers did trigger, we open the statistics switch for the stored procedure, in postgresql.conf, find track_functions, and change to all:






You can then run several test INSERT statements to see if you can insert the specified time record into its corresponding partition child table before inserting the statistics for Sys_log_insert_trigger ():






You can see that there are no statistical records, and then insert a few test data:


INSERT into T_sys_log_main VALUES

(1, ' 200022 ', ' XI ' an high-tech first secondary school campus ', ' 2016-9-8 18:49:26.004 ', ' ', ' xx ', ' zhsz_t ', ' test! ', ' 127.0.0.1 ');

INSERT into T_sys_log_main VALUES

(1, ' 200023 ', ' XI ' an high-tech first secondary school campus ', ' 2016-9-12 18:49:26.004 ', ' ', ' xx ', ' zhsz_t ', ' test! ', ' 127.0.0.1 ');

INSERT into T_sys_log_main VALUES

(1, ' 200024 ', ' XI ' an high-tech first secondary school campus ', ' 2016-10-8 18:49:26.004 ', ' ', ' xx ', ' zhsz_t ', ' test! ', ' 127.0.0.1 ');

INSERT into T_sys_log_main VALUES

(1, ' 200025 ', ' XI ' an high-tech first secondary school campus ', ' 2016-11-8 18:49:26.004 ', ' ', ' xx ', ' zhsz_t ', ' test! ', ' 127.0.0.1 ');

INSERT into T_sys_log_main VALUES

(1, ' 200026 ', ' XI ' an high-tech first secondary school campus ', ' 2016-12-8 18:49:26.004 ', ' ', ' xx ', ' zhsz_t ', ' test! ', ' 127.0.0.1 ');

INSERT into T_sys_log_main VALUES

(1, ' 200027 ', ' XI ' an high-tech first middle School campus ', ' 2016-12-25 18:49:26.004 ', ' ', ' xx ', ' zhsz_t ', ' test! ', ' 127.0.0.1 ');




Then look at the statistics for Sys_log_insert_trigger ():






As shown above, you can see that the 6-time function was called because we inserted 6 data, and the partition table was successfully completed by the entire process that was created to the test.



Summarize



Simply record the complete steps and considerations for PostgreSQL to create a partitioned table, hoping to help friends with the same problem, the end.


Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.