greenplum hadoop

Articles, news, trends, analysis, and practical advice about Greenplum and Hadoop on alibabacloud.com

Custom views for querying table partitions in Greenplum

Querying partition-table information in Greenplum is not particularly convenient: it requires joining tables and post-processing the results. To simplify later maintenance, create two views that the DBA can query directly.

1. Create a view of list-partitioned tables:

create or replace view v_gp_list_partition_meta as select pp.parrelid::regclass table_name, pr1.parchildrelid::regclass child_tbl_name, pr1.parname as partition

Configuring Greenplum system parameters

Before installing Greenplum, configure the relevant system parameters; otherwise you are likely to run into unexpected errors.

1. Modifying system parameters

Edit /etc/sysctl.conf. The following is the minimum configuration:

kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncoo
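A quick way to check a host against the single-valued kernel settings listed above is sketched here in Python. The function names and the `MINIMUMS` table are illustrative, and treating each listed value as a lower bound is an assumption based on the phrase "minimum configuration"; multi-valued keys such as kernel.sem are left out.

```python
# Minimum values copied from the listing above (single-valued integer keys only).
# Treating them as lower bounds is an assumption, not something the article states.
MINIMUMS = {
    "kernel.shmmax": 500000000,
    "kernel.shmmni": 4096,
    "kernel.shmall": 4000000000,
    "kernel.msgmnb": 65536,
    "kernel.msgmax": 65536,
    "kernel.msgmni": 2048,
}


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blank lines and '#' or ';' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def find_problems(settings, minimums=MINIMUMS):
    """Return one message per key that is missing or below its minimum."""
    problems = []
    for key, minimum in minimums.items():
        value = settings.get(key)
        if value is None:
            problems.append(f"{key}: missing (want >= {minimum})")
        elif not value.isdigit() or int(value) < minimum:
            problems.append(f"{key}: {value} (want >= {minimum})")
    return problems
```

To use it, read the file with `open("/etc/sysctl.conf").read()`, pass the text to `parse_sysctl`, and print whatever `find_problems` returns.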

PostgreSQL advantages: MySQL's feature set is limited, with weak trigger and stored-procedure support; Greenplum, AWS Redshift, and others are built on PostgreSQL

interrelated SQL and so on. MySQL is better suited to OLTP applications with simple business logic. PostgreSQL, by contrast, can support simple or complex business logic and both OLTP and OLAP workloads. It also has very mature derivative products: many well-known OLAP databases, such as Greenplum and AWS Redshift, are built on PostgreSQL. PostgreSQL's query optimizer is very powerful, and for three-table joins m

Greenplum Sync to Oracle

Development asked for a way to synchronize data from Greenplum to Oracle, so I wrote a script for scheduled processing.

#!/bin/sh
# copy_gp_2_ora.sh
if [ $# -ne 1 ]; then
echo "Usage: sh $0 tablename"
exit 1
fi
TABLENAME=$1
psql -h
\timing off
set client_encoding = 'GB18030';
\copy $TABLENAME to '/home/oracle/$TABLENAME.txt' csv
\q
EOF
echo "load data
infile '$TABLENAME.txt'
discardfile '$TABLENAME.dis'
append
into table $TABLENAME
fields terminated by ','
O

Managing indexes in Greenplum

Given the characteristics of OLAP systems, use indexes in Greenplum carefully and conservatively. Avoid indexes on frequently updated columns; use B-tree indexes on high-selectivity columns and bitmap indexes on low-selectivity columns. In traditional databases, indexes generally improve data access efficiency. In OLTP systems especially, you often only need to retrieve a few rows or parts

Analysis of Greenplum window functions

The system has been quiet recently, with no upgrades, so this is a good time to optimize the whole ETL system on a stable base. The ten most time-consuming jobs are listed daily for analysis. The most expensive job uses the window functions first_value and last_value, and as a result the SQL sorts twice for the window functions. EXPLAIN shows that two sorts cost about 1.7 times as much as one sort, so the two sorts were reduced to one, and the S

Obtaining the field names of an SQL result in Greenplum

In Greenplum it is hard to obtain the field names produced by an arbitrary SQL statement. For example, when writing a general-purpose tool you can use the copy command to export the result of an SQL statement to text, but the exported text does not contain the field names, and parsing the SQL yourself would be too complicated. To obtain the field names we do not actually need to execute the SQL statement, because the execution plan has been generate
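For a client-side tool, one common alternative to the plan-based approach the article appears to describe is to let the database driver report the column names. The sketch below wraps the statement in a `LIMIT 0` subquery and reads the DB-API `cursor.description`. It uses Python's built-in sqlite3 only as a runnable stand-in for a Greenplum connection; the `result_columns` helper and the `sales` table are hypothetical, and the query is interpolated as-is, so it must come from a trusted source and carry no trailing semicolon.

```python
import sqlite3


def result_columns(conn, query):
    """Return the column names `query` would produce, without fetching any rows."""
    cur = conn.cursor()
    try:
        # LIMIT 0 on the wrapped query requests result metadata only.
        cur.execute(f"SELECT * FROM ({query}) AS q LIMIT 0")
        return [col[0] for col in cur.description]
    finally:
        cur.close()


# sqlite3 stands in for a real Greenplum connection in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
names = result_columns(conn, "SELECT id, amount * 2 AS doubled FROM sales")
print(names)
```

The same function should work with any DB-API driver that fills in `cursor.description`, which makes it easy to write a header line before the exported data.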

Viewing the SQL statements of all underlying nodes from the master node in Greenplum

Greenplum is a distributed database built on many PostgreSQL databases. Sometimes we need to know what the underlying nodes are doing. Is there a view or SQL statement on the master node that shows the SQL running on each node and identifies its machine, port, and database? The method is described below. The architecture changed between 3.3 and 4.0, so the method differs. 1. Create the v_active_sql view to view the SQL: Create

Common errors when installing the greenplum-cc-web monitoring software

Errors:

1. no pg_hba.conf entry for host "::1", user "gpmon", database "gpperfmon", SSL off
Solution: add the following to pg_hba.conf: host gpperfmon gpmon ::1/128 trust (the trust here should be md5, otherwise a later step reports an error)

2. An error occurs while installing Workload Manager. Following the prompt to check the log shows that crontab is not installed on one machine.
Solution: install vixie-cron

3. Web login message: Trust login is disabled. trust user gpmon is not allowed to login Command Center
Description: use psql -d gpperfmon -U gpmon -W and enter the password t

Greenplum Database Python custom functions

The Greenplum Database (hereinafter GP) supports custom functions. The following describes simple custom functions written in Python. Clustering functions are more complex and, in my view, not well suited to the GP database. As long as Python can process the data row by row, a Python custom function lets GP do the same. Example: a Python function that returns multiple rows for JSON processing. create or replace function public.

Hadoop installation error: /usr/local/hadoop-2.6.0-stable/hadoop-2.6.0-src/hadoop-hdfs-project/hadoop-hdfs/target/findbugsXml.xml does not exist

Installation error: Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (site) on project hadoop-hdfs: An Ant BuildException has occured: input file /usr/local/hadoop-2.6.0-stable/hadoop-2.6.0-src/hadoop-hdfs-project/hadoop-hdfs/target/findbugsXml.xml

Managing schemas in Greenplum

A Greenplum schema is the same as an Oracle schema: a container used to logically organize database objects. Different schemas do not share a namespace. The public schema is created by default when a database is created, and every user has permission to create objects in it. If no schema is specified, objects are created there by default. After creating a schema, you are advised to modify the default search path. Otherwise, y

Viewing the space occupied by a Greenplum partitioned table

When using Greenplum, two functions are used to check the space a table occupies: pg_relation_size and pg_size_pretty. The former returns the data size, and the latter formats it in human-readable form. Usage:

select pg_size_pretty(pg_relation_size('relation_name'));
select pg_size_pretty(pg_relation_size(oid));

However, this method is useless for partitioned tables, and the

Changing the distribution key (DK) of a Greenplum table

In the previous article, I described how to use gp_segment_id to check how a table's DK value distributes its data. This article describes how to find unevenly distributed tables by checking the space they occupy, and then readjust the DK value. On one of my Greenplum clusters, one machine holds more data than the other nodes, which shows that the data distribution is unbalanced, as in the following example: sdw16: $ du -sh /gpdata{1,2}/data/gp* 347G
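To put a number on this kind of imbalance, the per-segment row counts (for example, the output of a query that groups by gp_segment_id) can be reduced to a single skew ratio. The Python sketch below is illustrative only: the function name is hypothetical and the sample counts are made up, not taken from the article.

```python
def segment_skew(rows_per_segment):
    """Return (max/mean ratio, busiest segment id) for a {segment_id: row_count} map.

    A ratio close to 1.0 means the rows are spread evenly across segments.
    """
    if not rows_per_segment:
        raise ValueError("no segments given")
    total = sum(rows_per_segment.values())
    if total == 0:
        return 0.0, None
    mean = total / len(rows_per_segment)
    busiest = max(rows_per_segment, key=rows_per_segment.get)
    return rows_per_segment[busiest] / mean, busiest


# Made-up counts: segment 3 holds far more rows than the others.
counts = {0: 1000, 1: 1000, 2: 1000, 3: 5000}
ratio, segment = segment_skew(counts)
print(f"segment {segment} holds {ratio:.1f}x the average row count")
```

A table whose ratio stays well above 1.0 is a candidate for a different distribution key.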

Greenplum is_date C Language Interface

In Greenplum/PostgreSQL it is easy to convert a string to a date or time, and GP automatically recognizes many formats. If the string is not a valid time, however, the SQL statement reports an error.

aligputf8=# select '2011-13-10 10:10:10'::date;
ERROR:  date/time field value out of range: "2011-13-10 10:10:10"
LINE 1: select '2011-13-10 10:10:10'::date;
               ^
HINT:  Perhaps you need a different "dates
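The idea behind an is_date check, returning a boolean instead of raising an error, can be illustrated outside the database. The sketch below is plain Python rather than the C interface the article goes on to describe, and it accepts only one fixed format (an assumption), which is much narrower than GP's automatic format recognition.

```python
from datetime import datetime


def is_date(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Return True if `text` parses under `fmt`, otherwise False (never raises)."""
    if not isinstance(text, str):
        return False
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


print(is_date("2011-13-10 10:10:10"))  # month 13 is out of range
print(is_date("2011-12-10 10:10:10"))
```

Logic like this could in principle be wrapped in a Python custom function, at the cost of a per-row interpreter call that a C function avoids.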

Synchronizing Oracle data to Greenplum with GoldenGate

Source end
Oracle 11.2.0.4 RAC, 2 nodes
Oracle Linux 5.8 x86_64
Oracle GoldenGate V11.1.1.0.0 for Oracle 11g on Linux x86-64.zip
Oracle GoldenGate V11.1.1.0.0 For FlatFile on Linux 64-bit for OGG v11.1.1.0.0.zip

Target end
Greenplum Database 4.2.6.1, 4 nodes
CentOS 5.7 x86_64
Oracle GoldenGate V11.1.1.0.0 for Oracle 11g on Linux x86-64.zip
Oracle GoldenGate V11.1.1.0.0 For FlatFile on Linux 64-bit for OGG v11.

Greenplum array intersection, row-to-column, and column-to-row functions

-- 1. Array intersection function using the INTERSECT keyword
create or replace function array_intersect(anyarray, anyarray) returns anyarray as $$ select array(select unnest($1) intersect select unnest($2)); $$ language sql;
select array_intersect(array[1,2,3], array[2,3,4]);
-- 2. Row-to-column function unnest
select unnest(array[1,2,3]);
-- 3. Column-to-row function array_agg:
create temporary table temp_test01 as select array_agg(c) aggtest from (val
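For readers less familiar with these SQL functions, the three operations above behave roughly like the Python sketch below. The Python names simply mirror the SQL ones; sorting the intersection is a choice made here for predictable output, since SQL INTERSECT does not promise any particular order.

```python
def array_intersect(left, right):
    """Elements present in both lists, de-duplicated and sorted."""
    return sorted(set(left) & set(right))


def unnest(values):
    """Yield one single-column row per array element."""
    for value in values:
        yield (value,)


def array_agg(rows):
    """Collect the first column of each row back into one list."""
    return [row[0] for row in rows]


print(array_intersect([1, 2, 3], [2, 3, 4]))
print(list(unnest([1, 2, 3])))
print(array_agg(unnest([1, 2, 3])))
```

Note that `array_agg(unnest(x))` returns the original list, which mirrors how the two SQL functions undo each other.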

Greenplum process and session management

-- 1. Query the active sessions in a given database; the procpid field is the session's process ID
select * from pg_stat_activity where datname = 'dbname';
-- 2. Cancel a query; ${procpid} is the procpid returned by the query above (same below)
select pg_cancel_backend(${procpid});
-- 3. Terminate a session connection
select pg_terminate_backend(${procpid});
-- 4. To cancel queries or terminate sessions in bulk, you can write a function o
