Detailed description of the PostgreSQL Partition Table (partitioning) application instance, postgresql Partition Table
Preface
Vertical table sharding is required in the project, that is, data is split into n tables according to the time interval. PostgreSQL provides the partition table function. A partition table actually Splits a large logical table into several small physical blocks, providing many advantages, such:
1. query performance is greatly improved
2. Deleting historical data is faster
3. You can use tablespaces to transfer infrequently used historical data to low-cost storage media.
When should we use a partitioned table? The official guidance is: when the table size exceeds the physical memory size of the database server, the partition table should be used. Next, we will record the detailed process of creating a partition table with an example.
Create a partition table
First, let's take a look at the requirement. Now there is a log table, which needs to be partitioned by the operation_time Field In the table, for example:
This requirement is a typical Partition Table creation by time. First, let's take a look at the steps:
1. Create a parent table
2. Create n sub-tables. Each sub-Table inherits from the parent table.
3. Define a Rule or Trigger to redirect the data inserted to the primary table to a suitable partition table.
As shown above, the overall big steps are divided into the above three, of course, there can be some small optimization measures, such as for each partition, create an index on the keyword field and so on. First, let's take a look at Step 1 -- create a parent table.
Before creating a partition table, you should create a "parent table". All partition tables inherit from it. This table has no data and does not define any check constraints and indexes on this table, now we should create such a table first, but we should first create a sequence:
CREATE SEQUENCE "public"."t_sys_log_main_id_seq" INCREMENT 1 MINVALUE 1 MAXVALUE 99999999 START 1 CACHE 1;ALTER TABLE "public"."t_sys_log_main_id_seq" OWNER TO "postgres";
Next, create the "parent table". Because it is a log table, the table name is "t_sys_log_main ":
Create table "public ". "t_sys_log_main" ("id" int4 DEFAULT nextval ('t_sys_log_main_id_seq ': regclass) not null, "account_affiliation_code" varchar (100) COLLATE "default" not null, "account_affiliation" varchar (50) COLLATE "default" not null, "operation_time" timestamp (6) not null, "operation_key" varchar (2) COLLATE "default" not null, "operation_value" varchar (30) COLLATE "default" not null, "operation_loginid" varchar (100) COLLATE "default" not null, "operation_message" varchar (300) COLLATE "default" not null, "operation_ip" varchar (30) COLLATE "default" not null) WITH (OIDS = FALSE); comment on table "public ". "t_sys_log_main" IS 'System log table'; comment on column "public ". "t_sys_log_main ". "account_affiliation_code" IS 'account institution code '; comment on column "public ". "t_sys_log_main ". "account_affiliation" IS 'account authorization'; comment on column "public ". "t_sys_log_main ". "operation_time" IS 'Operation time'; comment on column "public ". "t_sys_log_main ". "operation_key" IS 'Operation type (key) '; comment on column "public ". "t_sys_log_main ". "operation_value" IS 'Operation type (value) '; comment on column "public ". "t_sys_log_main ". "operation_loginid" IS 'Operation account'; comment on column "public ". "t_sys_log_main ". "operation_message" IS 'Operation information'; comment on column "public ". "t_sys_log_main ". "operation_ip" IS 'logon address'; alter table "public ". "t_sys_log_main" add primary key ("id ");
Run the preceding DDL statement to create a parent table. After the creation is successful, you can create a partition table one by one. Because each partition table is inherited from the parent table, no fields are added to the partition table, next we will create four partition sub-tables as needed to store log data for September, October, November, and December respectively:
create table t_sys_log_y2016m09(CHECK (operation_time >= DATE '2016-09-01' AND operation_time< DATE '2016-10-01'))INHERITS (t_sys_log_main);create table t_sys_log_y2016m10(CHECK (operation_time >= DATE '2016-10-01' AND operation_time< DATE '2016-11-01'))INHERITS (t_sys_log_main);create table t_sys_log_y2016m11(CHECK (operation_time >= DATE '2016-11-01' AND operation_time< DATE '2016-12-01'))INHERITS (t_sys_log_main);create table t_sys_log_y2016m12(CHECK (operation_time >= DATE '2016-12-01' AND operation_time< DATE '2017-01-01'))INHERITS (t_sys_log_main);
As shown above, after the operation is complete, four partition sub-tables can be created. In the preceding statement, a constraint is added to indicate that only data of this month can be inserted, next, create an index on each partition key of the four partition tables:
create index t_sys_log_y2016m09_operation_time ON t_sys_log_y2016m09(operation_time);create index t_sys_log_y2016m10_operation_time ON t_sys_log_y2016m10(operation_time);create index t_sys_log_y2016m11_operation_time ON t_sys_log_y2016m11(operation_time);create index t_sys_log_y2016m12_operation_time ON t_sys_log_y2016m12(operation_time);
So far, our partition table has been created. Next, we need to consider the data insertion problem. How can we make the data of different dates be automatically inserted into the corresponding partition subtable? There are two solutions: Rule and Trigger, which have higher overhead than Trigger, so I will not introduce them too much here, the following describes the Trigger method.
A Trigger usually inserts partitions based on a user-defined Function. A Function selects the insert conditions and a Trigger automatically calls a Function. First, define the Function. The Function is simple, that is, insert data according to the date range:
CREATEOR REPLACE FUNCTION sys_log_insert_trigger () RETURNS TRIGGER AS $$BEGINIF ( NEW .operation_time >= DATE '2016-09-01' AND NEW .operation_time < DATE '2016-10-01') THEN INSERT INTO t_sys_log_y2016m09VALUES (NEW .*) ;ELSEIF ( NEW .operation_time >= DATE '2016-10-01' AND NEW .operation_time < DATE '2016-11-01') THEN INSERT INTO t_sys_log_y2016m10VALUES (NEW .*) ;ELSEIF ( NEW .operation_time >= DATE '2016-11-01' AND NEW .operation_time < DATE '2016-12-01') THEN INSERT INTO t_sys_log_y2016m11VALUES (NEW .*) ;ELSEIF ( NEW .operation_time >= DATE '2016-12-01' AND NEW .operation_time < DATE '2017-01-01') THEN INSERT INTO t_sys_log_y2016m12VALUES (NEW .*) ;ELSE RAISE EXCEPTION 'Date out of range!' ;ENDIF ; RETURN NULL ;END ; $$ LANGUAGE plpgsql;
Finally, create a trigger to execute the Function:
CREATE TRIGGER sys_log_insert_trigger BEFORE INSERT ON t_sys_log_mainFOR EACH ROWEXECUTE PROCEDURE sys_log_insert_trigger();
The creation is complete here. Finally, let's test the results. To confirm that our trigger is indeed triggered, we turn on the statistics switch of the stored procedure. In postgresql. conf, find track_functions and change it to all:
Next, we can run several test insert statements to see if we can insert the specified time records into the corresponding partition subtable. before inserting the data, let's take a look at the statistical information of sys_log_insert_trigger:
We can see that there is no statistical record at present, and then insert several pieces of test data:
Insert into t_sys_log_main VALUES (1, '000000', 'xi'an high-tech first middle school junior high school ', '2017-9-8 18:49:26. 004 ', '01', 'XX', 'zhsz _ t', 'test! ',' 127. 0.0.1 '); insert into t_sys_log_main VALUES (1, '000000', 'xi'an high-tech first middle school junior high school', '2017-9-12 18:49:26. 004 ', '01', 'XX', 'zhsz _ t', 'test! ',' 127. 0.0.1 '); insert into t_sys_log_main VALUES (1, '000000', 'xi'an high-tech first middle school junior high school', '2017-10-8 18:49:26. 004 ', '01', 'XX', 'zhsz _ t', 'test! ',' 127. 0.0.1 '); insert into t_sys_log_main VALUES (1, '000000', 'xi'an high-tech first middle school junior high school', '2017-11-8 18:49:26. 004 ', '01', 'XX', 'zhsz _ t', 'test! ',' 127. 0.0.1 '); insert into t_sys_log_main VALUES (1, '000000', 'xi'an high-tech first middle school junior high school', '2017-12-8 18:49:26. 004 ', '01', 'XX', 'zhsz _ t', 'test! ',' 127. 0.0.1 '); insert into t_sys_log_main VALUES (1, '000000', 'xi'an high-tech first middle school junior high school', '2017-12-25 18:49:26. 004 ', '01', 'XX', 'zhsz _ t', 'test! ', '2014. 0.0.1 ');
Next, let's look at the statistical information of sys_log_insert_trigger:
For example, we can see that six functions have been called. Because we have inserted six data records, the entire process from creating a partition table to testing has been completed successfully.
Summary
Briefly record The complete steps and precautions for creating a partition table in PostgreSQL, and hope to help The End users who encounter The same problem.