To export data from MySQL to Greenplum, follow these steps.
1. Export tables from MySQL to external files. Taking schema_name.table_name as an example:
SELECT product_id, number, name, english_name, purchase_name, system_name, bar_code, category_one, category_two, category_three, parent_id, parent_number, brand_id, supplier_id, price, ad_word, give_integral, shelf_life, FROM_UNIXTIME(s
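One common way to do such an export is MySQL's SELECT ... INTO OUTFILE, which writes a delimited file on the MySQL server host that Greenplum's external tables / gpfdist can later load. A minimal sketch; the table, column list, and output path are placeholders, not taken from the original text:

```sql
-- Hypothetical export of one table to a pipe-delimited text file.
SELECT product_id, number, name
  INTO OUTFILE '/tmp/table_name.txt'
  FIELDS TERMINATED BY '|'
  LINES TERMINATED BY '\n'
  FROM schema_name.table_name;
```

Note that INTO OUTFILE writes on the database server, not the client, and the target file must not already exist.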
partitioned for storage. Queries that filter on the partition column can then skip irrelevant partitions and run faster.
Compressed tables: compression is used for large AO (append-only) tables and partitioned tables to conserve storage space and reduce system I/O; compression can also be configured at the column level. Application scenarios:
- The table does not need UPDATE or DELETE operations.
- Access to the table is basically a full table scan, and no index is needed.
- Columns are not frequently added and column types are not frequently modified.
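As a sketch of these options, an append-only, column-oriented, compressed, partitioned table might be declared as follows (table and column names are illustrative, not from the original text; column-level compression could additionally be set per column with an ENCODING clause):

```sql
CREATE TABLE sales_ao (
    sale_id   bigint,
    sale_date date,
    amount    numeric(12,2)
)
WITH (appendonly=true, orientation=column, compresstype=zlib, compresslevel=5)
DISTRIBUTED BY (sale_id)
PARTITION BY RANGE (sale_date)
(
    START (date '2016-01-01') INCLUSIVE
    END   (date '2017-01-01') EXCLUSIVE
    EVERY (INTERVAL '1 month')
);
```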
Greenplum common commands: gpstate
Command parameters and their functions:
gpstate -b => display a brief state summary
gpstate -c => display the primary-to-mirror mapping
gpstate -d => specify the data directory (default: $MASTER_DATA_DIRECTORY)
gpstate -e => display segments with mirror status issues
gpstate -f => display standby master details
gpstate -i => display the Greenplum database version
gpstate -m => display mirror instances and their synchronization status
gpstate -p => display the ports used
Greenplum experiment to dynamically add nodes
1. I initialized a Greenplum cluster on hosts hadoop4 through hadoop6.
2. All primaries and mirrors are started. I connect to the master and insert records into the two tables:
aligputf8=# INSERT INTO pg_catalog.pg_filespace_entry VALUES (3052, 15, '/home/gpadmin1/cxf/gpdata/gpdb_p1/gp6');
INSERT 0 1
aligputf8=# INSERT INTO pg_catalog.pg_filespace_entry VALUES (3052, 16, '/
Create a table. Without a primary key or unique key, Greenplum uses the first column as the distribution key by default:
zwcdb=# create table tab01 (id int, name varchar(20));
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
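To avoid relying on that default, the distribution key can be given explicitly. A minimal sketch following the same example (table names are illustrative):

```sql
-- Explicit hash distribution on a chosen column:
create table tab02 (id int, name varchar(20)) distributed by (id);

-- Or, when no column makes a good hash key, round-robin distribution:
create table tab03 (id int, name varchar(20)) distributed randomly;
```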
Greenplum metadata errors can also affect the data backup process. This article describes the workaround for a backup failure caused by a missing distribution policy when backing up the data structure with pg_dump.
Phenomenon: when you use the pg_dump command to back up the data structure of the entire Greenplum database:
pg_dump -f /data/dailybak/dw-nodata-$(date +%Y%m%d%H%M%... -v -F -p 5432 -h -
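Assuming the failure really is a missing distribution policy, one way to locate the offending tables is to look for user tables that have no row in gp_distribution_policy. A sketch against the GPDB 4.x catalog; schema exclusions are illustrative:

```sql
-- User tables with no entry in gp_distribution_policy.
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN gp_distribution_policy p ON p.localoid = c.oid
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema', 'gp_toolkit')
  AND p.localoid IS NULL;
```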
Navicat Premium remote connection to a GP (Greenplum) cluster, and resolving the error: FATAL: no pg_hba.conf entry for host "172.20.20.2", user "gpadmin", database "ddata", SSL off
1. Environment
(1) Client: macOS, Navicat Premium version 12.0.19
(2) Server: Greenplum version: postgres (Greenplum Database) 4.3.8.2 build 1
2. Configuration method
(1) Add an entry to pg_hba.conf
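The entry to add would look roughly like the following (the address and auth method are assumptions based on the error message; adjust them to your network), after which the configuration is reloaded on the master with `gpstop -u`:

```
# TYPE  DATABASE  USER     ADDRESS          METHOD
host    all       gpadmin  172.20.20.2/32   md5
```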
In Greenplum, it is difficult to obtain the column names of the result set after executing an arbitrary SQL statement, for example when writing a general-purpose tool that uses the COPY command.
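One workaround for recovering the column names is to ask the server for a zero-row CSV with a header. A sketch; the table name is a placeholder:

```sql
-- Emits only the header line, i.e. the column names of the query result.
COPY (SELECT * FROM my_table LIMIT 0) TO STDOUT WITH CSV HEADER;
```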
Greenplum + Hadoop learning notes - 11 - distributed database storage and query processing
3.1 Distributed Storage: Greenplum is a distributed database system, so all of its business data is physically stored across the databases of all segment instances in the cluster. In a Greenplum database all tables are distributed, so every table is sliced...
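The slice each segment holds can be inspected through the hidden gp_segment_id column. A minimal sketch; the table name is illustrative:

```sql
-- Row count per segment; a badly skewed table shows very uneven counts.
SELECT gp_segment_id, count(*)
FROM my_table
GROUP BY gp_segment_id
ORDER BY gp_segment_id;
```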
Log analysis of the ETL background system data is currently underway. You can identify tasks that have been running for a long time, find the jobs that consume the most time, and optimize at the logic and database layers. This article focuses only on database optimization (including SQL statement adjustment and Greenplum table DK (distribution key) adjustment). Take a job that runs for about 30 minutes, find the corresponding source table, and perform the following analysis:
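As one starting point for this kind of analysis, the currently long-running statements can be listed from pg_stat_activity. A sketch against the GPDB 4.x catalog, where the relevant columns are procpid and current_query:

```sql
SELECT procpid, usename, now() - query_start AS runtime, current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'
ORDER BY runtime DESC;
```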
The report tool failed to connect to Greenplum even though there was nothing wrong with the report schema. After investigation, the cause was that search_path needed to be set.
1) Connect to Greenplum
C:\windows\system32>psql -h 1.2.345.678 -p 5432 -d tpc_1 -U gpuser
2) View search_path
tpc_1=# show search_path;
3) Modify search_path
tpc_1=# ALTER DATABASE tpc_1 SET search_path TO "$user", public, "My_schema";
4) Exit
tpc_1=# \q
5) Connect again to verify
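Since ALTER DATABASE ... SET only takes effect for new sessions, a quick per-session alternative (and verification) looks like this; the schema name is taken from the example above:

```sql
-- Applies to the current session only.
SET search_path TO "$user", public, "My_schema";
SHOW search_path;
```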
Tags: Big Data, Greenplum, expansion
Any distributed system has to face the matter of expansion; otherwise the value of a distributed system is greatly diminished. This article covers the preparation process for a GP expansion. Carrying out the expansion itself is actually very simple; the main work is the preparation.
Preparation: create host information files
Create 6 host information files as the GP administrator OS user gpadmin:
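Such a host information file is simply one hostname per line. A hypothetical example for new segment hosts; the file name and host names are placeholders, not from the original text:

```
# hostfile_gpexpand: one new segment host per line
sdw5
sdw6
```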
During recent database schema tuning, part of the business was migrated from MySQL to Greenplum. In MySQL, the unix_timestamp and from_unixtime functions convert between standard time and Unix time. After going through the Greenplum documentation, no similar functions were found, so we used Python to implement these two functions ourselves, and built two business-related functions on top of them; they are recorded here.
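The article's own code is not shown here, but such functions are commonly written as PL/Python UDFs. A hedged sketch of what the two conversions might look like, assuming the plpythonu language is installed in the database; the function names simply mirror their MySQL counterparts:

```sql
CREATE OR REPLACE FUNCTION unix_timestamp(ts timestamp)
RETURNS integer AS $$
    import time
    # PL/Python passes timestamps in as strings like '2016-01-02 03:04:05'.
    return int(time.mktime(time.strptime(str(ts), '%Y-%m-%d %H:%M:%S')))
$$ LANGUAGE plpythonu;

CREATE OR REPLACE FUNCTION from_unixtime(epoch integer)
RETURNS timestamp AS $$
    import time
    # Returned string is cast to timestamp by the function's return type.
    return time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(epoch))
$$ LANGUAGE plpythonu;
```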
Querying partition table information in Greenplum is not particularly friendly out of the box; you have to join several catalog tables and post-process the results. To make later maintenance easier, create two views here that DBAs can query directly, which is very convenient.
1. Create a view for list-partitioned tables:
create or replace view v_gp_list_partition_meta as
select pp.parrelid::regclass table_name,
       pr1.parchildrelid::regclass child_tbl_name,
       pr1.parname as partition
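For comparison, Greenplum also ships a built-in pg_partitions catalog view that exposes much of the same metadata without hand-written joins. A quick sketch; the table name is a placeholder:

```sql
SELECT schemaname, tablename, partitiontablename,
       partitiontype, partitionname, partitionrank
FROM pg_partitions
WHERE tablename = 'my_table';
```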
Before performing a Greenplum installation, you need to configure the relevant system parameters; otherwise you are prone to unexpected errors.
1. Modifying system parameters
Edit /etc/sysctl.conf; the following is the minimum configuration:
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncoo
interrelated SQL, and so on. MySQL is more suitable for simple OLTP applications with business logic. PostgreSQL, by contrast, can support both simple and complex business logic and both OLTP and OLAP workloads, and it has very mature products; many well-known OLAP database products such as Greenplum and AWS Redshift were developed on the basis of PostgreSQL. The query optimizer of PostgreSQL is very powerful, and for the three table join m
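The three join methods alluded to (nested loop, hash join, merge join) can be observed by asking the planner directly. A sketch with illustrative tables:

```sql
-- EXPLAIN shows which of the three join methods the optimizer picked.
EXPLAIN
SELECT *
FROM orders o
JOIN customers c ON c.id = o.customer_id;
```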
Connected to database "devdw" as user "gpadmin".
devdw=# CREATE TABLE tab_01 (id int);  -- create a table
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
devdw=# \d  -- use \d to view table i