Mpp-greenplum database installation and simple use

Source: Internet
Author: User
Tags install openssl postgresql psql iptables node server postgresql backup

First, the introduction of the cluster

A total of 3 hosts, IP 193.168.0.93 193.168.0.94 193.168.0.95

The cluster corresponds to master and segment as follows, and 193.168.0.93 is the master node. 193.168.0.94 193.168.0.95 for segment nodes, each segment node is configured with two primary segment and two mirror segment (can also do a backup for master, not currently done)

Schema map into the next


Second, server modification (ALLHOST) 2.1 Configuration Hosts
Vi/etc/hosts
192.168.0.93 gpdb-1 mdw192.168.0.94 gpdb-2 sdw1192.168.0.95 gpdb-3 sdw2
2.2 Create users and user groups 2.2.1 Create user groups with Group ID 530
530 Gpadmin
2.2.2 Create a user, give the Gpadmin user group, and customize the user root directory
530 530 -d/home/gpadmin-s/bin/bash GPADMI
2.2.3 Authorized/home/gpadmin
Chown-r  Gpadmin:gpadmin/home/gpadmin
2.2.4 Change Password
passwd gpadmin
2.3 Shutting down the firewall

2.3.1 shutting down the default firewall

        Systemctl Stop Firewalld

2.3.2 Close Iptables

        Systemctl Stop Iptables
2.4 Modifying the network file Vi/etc/sysconfig/network
NETWORKING=yeshostname= corresponding host name
2.5 Modifying System files

2.5.1 Modifying the kernel configuration

Vi/etc/sysctl.conf

Kernel.shmmax =5000000000Kernel.shmmni=4096Kernel.shmall=4000000000Kernel.sem= - 512000  - 2048KERNEL.SYSRQ=1Kernel.core_uses_pid=1KERNEL.MSGMNB=65536Kernel.msgmax=65536Kernel.msgmni=2048net.ipv4.tcp_syncookies=1Net.ipv4.ip_forward=0net.ipv4.conf.default. Accept_source_route =0net.ipv4.tcp_tw_recycle=1Net.ipv4.tcp_max_syn_backlog=4096Net.ipv4.conf.all.arp_filter=1Net.ipv4.ip_local_port_range=1025 65535Net.core.netdev_max_backlog=10000Net.core.rmem_max=2097152Net.core.wmem_max=2097152vm.overcommit_memory=2

Execute the command sysctl-p make the modified value effective

2.5.2 Modifying a throttling configuration

Vi/etc/security/limits.conf

65536 65536 131072 131072

2.5.3 disabling SELinux

Vi/etc/selinux/config

Selinux=disabled

Iii. installation dependencies and creation of the installation directory (allhost) 3.1 Installation dependencies
Yum install-y epel--y apr-devel bison bzip2-devel cmake3 flex gcc gcc-c++ krb5-devel libcurl-devel libevent-deve L LIBKADM5 libyaml-devel libxml2-devel perl-extutils-embed python-devel Python-paramiko python-pip python-psutil Python-setuptools readline-devel xerces-c-devel zlib-develpip install lockfile conanyum install OpenSSL-devel

3.2 Creating the installation directory
mkdir/opt/-Rgpadmin:gpadmin/opt/-R gpadmin:gpadmin/usr/local/yiliao/

----------------------------------------below with gpadmin users------------------------------------------------------------

Iv. Install gpdb4.1 upload the installation package and unzip (Master)
Unzip gpdb-5.6. 0 . ZIPCD gpdb-5.6. 0
4.2 Compiling gpdb (Master) 4.2.1 Installation
./configure--with-perl--with-python--with-libxml--with-gssapi--disable-orca--prefix=/opt/greenplum/  Dbmakemake Install
4.2.2 Introducing Scripts
source/opt/greenplum/db/greenplum_path.sh

4.3 Edit node server file (Master)

(Directory:/home/gpadmin/gpconfig/)

mkdir Gpconfigvi all_hosts    mdw    sdw1    sdw2vi  seg_hosts    sdw1    sdw2
4.4 Getting through the node (Master)
Gpssh-exkeys-F all_hostsgpssh-exkeys-f seg_hosts

4.5 Packaging Distribution installation files (Master) 4.5.1 packaged distribution
TAR-CF GP5. 6. Tar db/-f/home/gpadmin/gpconfig/seg_hosts  gp5.  6. Tar =:/opt/greenplum
4.5.2 Unzip the distribution file (directory structure to and master node always) and create data directory
Gpssh-/opt/greenplum/-xf gp5. 6  -p/usr/local/yiliao/gpdata/-p/usr/local/yiliao/gpdata/-p/usr/local/yiliao/gpdata/  -p/usr/local/yiliao/gpdata/-p/usr/local/yiliao/gpdata/gpm2

4.5.3 Edit. Bash_profile

VI ~/.bash_profile

source/opt/greenplum/db/greenplum_path.shexport master_data_directory=/usr/local/yiliao/gpdata/ gpmaster/gpseg-1export PGPORT=5432export pgdatabase=testdb

Effective . ~/.bash_profile

4.5.4 Editing a database startup file (/home/gpadmin/gpconfig/)
Array_name="Greenplum"Machine_list_file=/home/gpadmin/gpconfig/Seg_hostsseg_prefix=Gpsegport_base=40000Declare-A data_directory= (/usr/local/yiliao/gpdata/gpp1/usr/local/yiliao/gpdata/gpp2) Master_hostname=mdwmaster_directory=/usr/local/yiliao/gpdata/Gpmastermaster_port=5432Trusted_shell=sshencoding=Unicodemirror_port_base=50000Replication_port_base=41000Mirror_replication_port_base=51000Declare-A mirror_data_directory= (/USR/LOCAL/YILIAO/GPDATA/GPM1/USR/LOCAL/YILIAO/GPDATA/GPM2)

4.5.5 Initializing the database
Gpinitsystem-c/home/gpadmin/gpconfig/gpinitsystem_config-a

4.5.6 test for installation success with Psql
Psql-'yiliaodb' createdb;

4.5 Modifying a file (Master)

cd/usr/local/yiliao/gpdata/gpmaster/gpseg-1/

VI pg_hba.conf

Host     all         yiliaodb           0.0.  0.0/0     MD5

Description

(1). Among them, "trust" and "password" for the PostgreSQL database login authentication method, "trust" to express the confidence, that is, do not need to enter a password (even if there is a password), "password" indicates the need to enter a password.

(2). Host all 127.0.0.1/31 Trust means that the local connection database does not need to enter a password (even if there is a password, no password is required after it is set to trust)

(3). Host all 0.0.0.0/0 a password is required for all host connection databases (if a password is available)

(4). Of course, the results of these two combined performances are: Trust for native links (no password required), password for non-native links required

4.6 Installation Precautions

1. The firewall needs to be shut down after each server reboot

2. Each time you need to load the GPDB environment variable to have the gpdb database command ". ~/.bash_profile"

3.gpstop Command Close gpdb

4.gpstart-a command to open gpdb

4.7 GP Basic Commands

1.Shell command:

Su- gpadmingpstart                      #正常启动gpstop                       #正常关闭gpstop  -M  fast #快速关闭gpstop  -R               #重启gpstop  -u               #重新加载配置文件

2.GP command

a) login and exit Greenplum-5432 -U gpadmin                 # exit             in GP database: \QB) View all default data tables under the Greenplum database: /c5>Select from Pg_tables; c) After entering the database, you can view the database of the database by command:
Select from pg_database; d) View users and database users in the database: \du database: \l

3. Checking and repairing Invalid segment nodes

# Status command to view state Show detailed status information of a Greenplum Database system        : - for in the Master Host system catalog:        -qshow information about mirror Segment instances:       -m Show information about the standby master configuration:       -Fdisplay The Greenplum software version information:       --R

V. Data backup and Recovery 5.1 Understanding backup Recovery for Greenplum databases

5.1.1 Parallel Backup (gp_dump)

1) GP backs up master and all active segment instances at the same time

2) The amount of time the backup consumes is not related to the number of instances in the system

3) back up all DDL files and GP -related data dictionary tables on the master host

4) Each segment backs up its own data

5) All backup files make up a full backup collection that is identified by a unique 14-digit timestamp

5.1.2 Parallel Recovery (gp_restore)

1) Identify the backup collection with the timestamp generated by the Gp_dump, restore the database objects and data to the distributed database

2) Each segment restores its own data in parallel

3) The restored GP system must be isomorphic to the backed up system

5.1.3 Non-parallel backup (Pg_dump/pg_dumpall)

1) GP still supports regular PostgreSQL backup commands Pg_dump and Pg_dumpall

2) Backup will create a large backup file with all segment data on the master host

3) Not suitable for all data backup, suitable for a small amount of data migration or backup

Ps:

The general situation does not use Pg_dump, which applies to postgresql migration to Greenplum.

Pg_dump only one single database can be backed up at a time, and role and tablespace information cannot be backed up (because this information is cluster-wide, not in a database (Per-database))

5.1.4 Non-parallel recovery (Pg_restore)

1) restore using a backup file created by Pg_dump or Pg_dumpall

2) heterogeneous system recovery with non-parallel recovery

5.1.5 four choices for backing up your database

1) Create a backup file for each instance

Can be used to migrate data between a full-library backup or a system with the same configuration;

2) Use Gpcrondump for regular backups

The Gp_dump is packaged and backed up in the background using cron to schedule the GP database;

The dispatch job calling Gpcrondump should be deployed to the master host in GP;

The Gpcrondump command also backs up objects other than databases such as database roles and server configuration

3) Create a single backup file using Pg_dump or Pg_dumpall

Can be used for a small amount of data migration or backup between different databases;

4) File System backup

Backup with operating system or other third-party tools for incremental backups

5.2 Creating GP users and databases

5.2.1 Switch to the test user and configure the environment variables

Sugptestusr

source/opt/greenplum/db/greenplum_path.sh

5.2.2 Create a database user and enter the database

' Password '  ;  CREATE DATABASE TestDB;  GRANT all privileges on the DATABASE testdb to TestDB; ALTER ROLE testdb createrole createdb;

To modify the Greenplum configuration file:

Vi/usr/local/yiliao/gpdata/gpmaster/gpseg-1/pg_hba.conf

Entering the database, entering the password mode

Psql-u testdb-d testdb-w

5.3 Backup Instances

table when testing the database:

CREATE TABLE Syn_test (IDint, Name varchar (Ten) ) distributed by (ID), insert into syn_test values (1,'AB'),(2,'DC'),(3,'DFS'),(4,'SFS'); CREATE table Syn_test2 (IDint, Cost varchar (Ten) ) distributed by (ID), insert into syn_test2 values ( One,'AA'),( A,'BB'),( -,'cc'),( the,'DD');

5.3.1 Pg_dump & Pg_restore

# Create backup directory (under root user)

Mkdir/gpbackup

Chown gpadmin.gpadmin/gpbackup/

# <1> Table-level backup recovery

Backup: pg_dump-t syn_test-fc-u testdb-w testdb-f/gpbackup/syn_test_pgdump.dmp

Recovery: pg_restore-d testdb/gpbackup/syn_test_pgdump.dmp

# <2>database Level Backup Recovery

Backup: Pg_dump testdb-fc-ugpadmin-w-f/gpbackup/testdb_pgdump.dmp

Recovery: Create an empty database before you restore

5.3.2 Pg_dumpall

Gp_dump can only back up one database at a time. To solve this problem, use the Pg_dumpall tool, which backs up each database and role, table space definition.

Super user privileges are required to execute pg_dumpall.

# <1> Export all role and tablespace

Pg_dumpall-g >/gpbackup/role_tbs.sql

# <2> Export All database

Pg_dumpall >/gpbackup/all.dmp

5.3.3 Gp_dump & Gp_restore

# Gp_dump

Gp_dump is a parallel backup tool for Greenplum, Master and segment are backed up at the same time, and backups are saved in each node. The longest backup node determines the overall backup duration.

The segment node backs up only data, while master backs up all the DDL commands and some global system tables (such as gp_segment_configuration).

When Gp_dump is executed, only segment primary and active master run the backup operation, and mirror and Stanby master do not participate in the backup.

Gp_dump runs the gp_dump_agent agent process on each segment and reports the backup status to the Gp_dump process of master.

      Backup: Gp_dump testdb--table=syn_test--gp-d=/gpbackup/--gp-r=/gpbackup/

# Gp_restore

The Gp_restore command restores the data definition and the user's data using the backup file generated by Gp_dump. In particular, when Gp_restore is required, specify the timestamp at Gp_dump backup (in Gp_dump_.rpt).

The following requirements are implemented for Gp_restore:

Gp_dump does the backup file, and the backup file is in the original location.

gpdb normal operation

gpdb Number of segment nodes (including primary and mirror) and backup consistency

The database to be restored has been created

When using Gp_dump Backup, parameters such as-s (schema only),-a (data only),--GP-C (compressed),--gp-d (alternate dumpfile location) are specified, then use Gp_rest Ore should also be added when restoring

      Recovery: gp_restore--gp-d=/gpbackup/--gp-r=/gpbackup/--gp-k=201807041720

Mpp-greenplum database installation and simple use

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.