I. Cluster introduction
A total of 3 hosts: 192.168.0.93, 192.168.0.94, 192.168.0.95.
The master/segment layout is as follows: 192.168.0.93 is the master node; 192.168.0.94 and 192.168.0.95 are segment nodes, each configured with two primary segments and two mirror segments. (A standby master could also be set up; that is not done here.)
[Architecture diagram omitted.]
II. Server configuration (all hosts)
2.1 Configure hosts
vi /etc/hosts
192.168.0.93 gpdb-1 mdw
192.168.0.94 gpdb-2 sdw1
192.168.0.95 gpdb-3 sdw2
2.2 Create the user and user group
2.2.1 Create a user group with group ID 530
groupadd -g 530 gpadmin
2.2.2 Create the user, assign it to the gpadmin group, and set a custom home directory
useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin
2.2.3 Grant ownership of /home/gpadmin
chown -R gpadmin:gpadmin /home/gpadmin
2.2.4 Change Password
passwd gpadmin
2.3 Shut down the firewall
2.3.1 Stop the default firewall
systemctl stop firewalld
2.3.2 Stop iptables
systemctl stop iptables
2.4 Modify the network file
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=<corresponding host name>
2.5 Modify system files
2.5.1 Modify the kernel configuration
vi /etc/sysctl.conf
kernel.shmmax = 5000000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.overcommit_memory = 2
Run sysctl -p to make the modified values take effect.
2.5.2 Modify the resource limits configuration
vi /etc/security/limits.conf
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
2.5.3 Disable SELinux
vi /etc/selinux/config
SELINUX=disabled
III. Install dependencies and create the installation directory (all hosts)
3.1 Install dependencies
yum install -y epel-release
yum install -y apr-devel bison bzip2-devel cmake3 flex gcc gcc-c++ krb5-devel libcurl-devel libevent-devel libkadm5 libyaml-devel libxml2-devel perl-ExtUtils-Embed python-devel python-paramiko python-pip python-psutil python-setuptools readline-devel xerces-c-devel zlib-devel
pip install lockfile conan
yum install -y openssl-devel
3.2 Create the installation directory
mkdir /opt/greenplum
mkdir -p /usr/local/yiliao/
chown -R gpadmin:gpadmin /opt/greenplum
chown -R gpadmin:gpadmin /usr/local/yiliao/
---------------------------------------- the following is done as the gpadmin user ----------------------------------------
IV. Install GPDB
4.1 Upload the installation package and unzip it (master)
unzip gpdb-5.6.0.zip
cd gpdb-5.6.0
4.2 Compile GPDB (master)
4.2.1 Installation
./configure --with-perl --with-python --with-libxml --with-gssapi --disable-orca --prefix=/opt/greenplum/db
make
make install
4.2.2 Source the environment script
source /opt/greenplum/db/greenplum_path.sh
4.3 Edit the node host files (master)
(Directory: /home/gpadmin/gpconfig/)
mkdir gpconfig
vi all_hosts
mdw
sdw1
sdw2
vi seg_hosts
sdw1
sdw2
4.4 Set up passwordless SSH between the nodes (master)
gpssh-exkeys -f all_hosts
gpssh-exkeys -f seg_hosts
4.5 Package and distribute the installation files (master)
4.5.1 Package and distribute
cd /opt/greenplum
tar -cf gp5.6.tar db/
gpscp -f /home/gpadmin/gpconfig/seg_hosts gp5.6.tar =:/opt/greenplum
4.5.2 Unpack the distributed file on each segment (the directory structure must match the master) and create the data directories
gpssh -f /home/gpadmin/gpconfig/seg_hosts
=> cd /opt/greenplum/
=> tar -xf gp5.6.tar
=> mkdir -p /usr/local/yiliao/gpdata/gpp1
=> mkdir -p /usr/local/yiliao/gpdata/gpp2
=> mkdir -p /usr/local/yiliao/gpdata/gpm1
=> mkdir -p /usr/local/yiliao/gpdata/gpm2
On the master: mkdir -p /usr/local/yiliao/gpdata/gpmaster
4.5.3 Edit .bash_profile
vi ~/.bash_profile
source /opt/greenplum/db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/usr/local/yiliao/gpdata/gpmaster/gpseg-1
export PGPORT=5432
export PGDATABASE=testdb
Make it take effect: . ~/.bash_profile
4.5.4 Edit the database initialization file (/home/gpadmin/gpconfig/gpinitsystem_config)
ARRAY_NAME="Greenplum"
MACHINE_LIST_FILE=/home/gpadmin/gpconfig/seg_hosts
SEG_PREFIX=gpseg
PORT_BASE=40000
declare -a DATA_DIRECTORY=(/usr/local/yiliao/gpdata/gpp1 /usr/local/yiliao/gpdata/gpp2)
MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/usr/local/yiliao/gpdata/gpmaster
MASTER_PORT=5432
TRUSTED_SHELL=ssh
ENCODING=UNICODE
MIRROR_PORT_BASE=50000
REPLICATION_PORT_BASE=41000
MIRROR_REPLICATION_PORT_BASE=51000
declare -a MIRROR_DATA_DIRECTORY=(/usr/local/yiliao/gpdata/gpm1 /usr/local/yiliao/gpdata/gpm2)
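With two entries in DATA_DIRECTORY and two hosts in seg_hosts, this configuration yields 2 × 2 = 4 primary segments (and, via MIRROR_DATA_DIRECTORY, 4 mirrors), matching the cluster layout in section I. A quick sanity check of that arithmetic:

```shell
# Expected segment counts implied by the config above
hosts=2                         # lines in seg_hosts (sdw1, sdw2)
dirs_per_host=2                 # entries in DATA_DIRECTORY (gpp1, gpp2)
primaries=$((hosts * dirs_per_host))
echo "primary segments: $primaries"   # mirrors follow the same arithmetic
```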
4.5.5 Initialize the database
gpinitsystem -c /home/gpadmin/gpconfig/gpinitsystem_config -a
4.5.6 Test the installation with psql
psql -d postgres
postgres=# CREATE DATABASE yiliaodb;
4.6 Modify pg_hba.conf (master)
cd /usr/local/yiliao/gpdata/gpmaster/gpseg-1/
vi pg_hba.conf
host  all  yiliaodb  0.0.0.0/0  md5
Description:
(1) "trust" and "password" are PostgreSQL login authentication methods: "trust" means the connection is trusted, i.e. no password is required (even if one is set); "password" means a password must be entered.
(2) host all all 127.0.0.1/32 trust means local connections to the database need no password (even if a password is set, none is required once the method is trust).
(3) host all all 0.0.0.0/0 password means connections from any host require a password (if one is set).
(4) Combined, the effect of the two entries is: trust for local connections (no password required), password for non-local connections.
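Putting the description together, a minimal pg_hba.conf fragment implementing this split might look like the following sketch (addresses and user names should be adjusted for your cluster):

```
# TYPE  DATABASE  USER      ADDRESS        METHOD
host    all       all       127.0.0.1/32   trust      # local: no password required
host    all       yiliaodb  0.0.0.0/0      md5        # remote: password (md5) required
```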
4.7 Installation notes
1. The firewall must be shut down again after each server reboot.
2. The GPDB environment variables must be loaded in each session before the gpdb commands are available: ". ~/.bash_profile"
3. gpstop shuts down gpdb.
4. gpstart -a starts gpdb.
4.8 GP basic commands
1. Shell commands:
su - gpadmin
gpstart         # normal start
gpstop          # normal shutdown
gpstop -M fast  # fast shutdown
gpstop -r       # restart
gpstop -u       # reload configuration files
2. GP commands
a) Log in to and out of a Greenplum database:
psql -d <dbname> -p 5432 -U gpadmin
Exit inside the GP database: \q
b) View all the default data tables under the Greenplum database:
select * from pg_tables;
c) After entering the database, view the databases of the cluster:
select * from pg_database;
d) View users in the database: \du    View databases: \l
3. Check and repair invalid segment nodes
# gpstate: view cluster state
gpstate -s   # show detailed status information of the Greenplum Database system
gpstate -Q   # quick status check from the system catalog on the master host
gpstate -m   # show information about mirror segment instances
gpstate -f   # show information about the standby master configuration
gpstate -i   # display the Greenplum software version information
gprecoverseg # recover failed segment instances
V. Data backup and recovery
5.1 Understanding backup and recovery for Greenplum databases
5.1.1 Parallel Backup (gp_dump)
1) GP backs up master and all active segment instances at the same time
2) The amount of time the backup consumes is not related to the number of instances in the system
3) back up all DDL files and GP -related data dictionary tables on the master host
4) Each segment backs up its own data
5) All backup files together make up one full backup set, identified by a unique 14-digit timestamp key
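As a sketch of what that key looks like: the 14 digits are a YYYYMMDDHHMMSS timestamp, and every file in one backup set carries the same key. A minimal shell illustration (the per-segment filename pattern shown is an assumption for illustration only):

```shell
# Build a 14-digit timestamp key of the kind gp_dump uses to label a backup set
ts=$(date +%Y%m%d%H%M%S)     # YYYYMMDDHHMMSS, e.g. something like 20180704172000
echo "key length: ${#ts}"    # always 14 digits
echo "example file: gp_dump_1_1_${ts}"   # hypothetical per-segment dump filename
```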
5.1.2 Parallel Recovery (gp_restore)
1) Identifies the backup set by the timestamp generated by gp_dump and restores the database objects and data to the distributed database
2) Each segment restores its own data in parallel
3) The restored GP system must have the same configuration (be isomorphic to) the backed-up system
5.1.3 Non-parallel backup (pg_dump/pg_dumpall)
1) GP still supports the regular PostgreSQL backup commands pg_dump and pg_dumpall
2) The backup creates one large backup file on the master host containing the data of all segments
3) Not suitable for full backups of large systems; suited to migrating or backing up small amounts of data
PS:
pg_dump is generally not used for routine backups; it is mainly useful for migrating from PostgreSQL to Greenplum.
pg_dump can back up only a single database at a time, and it cannot back up role and tablespace information (because this information is cluster-wide, not per-database).
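As a sketch of that migration path, a single PostgreSQL database can be piped straight into Greenplum (the hostnames pghost and mdw and the database name mydb are placeholders; this assumes the target database already exists in Greenplum):

```shell
# Dump one database from PostgreSQL and load it into Greenplum in one pass.
# Roles and tablespaces are NOT included; recreate them separately (pg_dumpall -g).
pg_dump -h pghost -U postgres mydb | psql -h mdw -p 5432 -U gpadmin -d mydb
```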
5.1.4 Non-parallel recovery (pg_restore)
1) Restores using a backup file created by pg_dump or pg_dumpall
2) Use non-parallel recovery when restoring onto a system with a different configuration (heterogeneous recovery)
5.1.5 Four options for backing up the database
1) Create a backup file for each instance
Can be used for a full backup, or to migrate data between systems with the same configuration;
2) Use gpcrondump for regular backups
gpcrondump wraps gp_dump and is scheduled in the background with cron to back up the GP database;
the cron job that calls gpcrondump should be deployed on the GP master host;
gpcrondump can also back up objects other than databases, such as database roles and server configuration files
3) Create a single backup file using pg_dump or pg_dumpall
Can be used to migrate or back up small amounts of data between different databases;
4) File-system backup
Back up at the file-system level with operating-system or third-party tools, which can also provide incremental backups
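Option 2) above (regular backups via gpcrondump) might be scheduled on the master roughly as follows; the schedule and flags shown are illustrative assumptions (-x names the database, -c clears old dump files, -a suppresses prompts):

```
# crontab entry for the gpadmin user on the master host: nightly backup at 02:00
0 2 * * *  . ~/.bash_profile && gpcrondump -x testdb -c -a
```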
5.2 Create a GP user and database
5.2.1 Switch to the test user and load the environment variables
su - gptestusr
source /opt/greenplum/db/greenplum_path.sh
5.2.2 Create a database user and enter the database
CREATE USER testdb WITH PASSWORD 'password';
CREATE DATABASE testdb;
GRANT ALL PRIVILEGES ON DATABASE testdb TO testdb;
ALTER ROLE testdb CREATEROLE CREATEDB;
Modify the Greenplum configuration file:
vi /usr/local/yiliao/gpdata/gpmaster/gpseg-1/pg_hba.conf
Enter the database in password mode:
psql -U testdb -d testdb -W
5.3 Backup examples
Create test tables in the database:
CREATE TABLE syn_test (id int, name varchar(10)) DISTRIBUTED BY (id);
INSERT INTO syn_test VALUES (1,'AB'),(2,'DC'),(3,'DFS'),(4,'SFS');
CREATE TABLE syn_test2 (id int, cost varchar(10)) DISTRIBUTED BY (id);
INSERT INTO syn_test2 VALUES (11,'AA'),(12,'BB'),(13,'cc'),(14,'DD');
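Both tables are DISTRIBUTED BY (id), so you can check how the inserted rows were spread across the primary segments with Greenplum's hidden gp_segment_id column, for example:

```sql
-- Count rows per segment for the distributed test table
SELECT gp_segment_id, count(*) FROM syn_test GROUP BY 1;
```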
5.3.1 pg_dump & pg_restore
# Create the backup directory (as root)
mkdir /gpbackup
chown gpadmin:gpadmin /gpbackup/
# <1> Table-level backup and recovery
Backup: pg_dump -t syn_test -Fc -U testdb -W testdb -f /gpbackup/syn_test_pgdump.dmp
Recovery: pg_restore -d testdb /gpbackup/syn_test_pgdump.dmp
# <2> Database-level backup and recovery
Backup: pg_dump testdb -Fc -U gpadmin -W -f /gpbackup/testdb_pgdump.dmp
Recovery: create an empty database first, then restore into it
5.3.2 pg_dumpall
pg_dump can back up only one database at a time. To get around this, use the pg_dumpall tool, which backs up every database along with role and tablespace definitions.
Superuser privileges are required to run pg_dumpall.
# <1> Export all roles and tablespaces
pg_dumpall -g > /gpbackup/role_tbs.sql
# <2> Export all databases
pg_dumpall > /gpbackup/all.dmp
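Unlike pg_dump's custom format, pg_dumpall emits plain SQL, so the files above are restored with psql rather than pg_restore; as a sketch:

```shell
# Restore roles/tablespaces first, then the full plain-SQL dump
psql -d postgres -f /gpbackup/role_tbs.sql
psql -d postgres -f /gpbackup/all.dmp
```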
5.3.3 gp_dump & gp_restore
# gp_dump
gp_dump is Greenplum's parallel backup tool: the master and all segments are backed up at the same time, and each node keeps its own backup files, so the slowest node determines the overall backup duration.
The segment nodes back up only data, while the master backs up all the DDL commands and some global system tables (such as gp_segment_configuration).
When gp_dump runs, only the segment primaries and the active master perform the backup; mirrors and the standby master do not participate.
gp_dump runs a gp_dump_agent process on each segment, which reports backup status back to the gp_dump process on the master.
Backup: gp_dump testdb --table=syn_test --gp-d=/gpbackup/ --gp-r=/gpbackup/
# gp_restore
gp_restore restores the data definitions and user data from the backup files generated by gp_dump. In particular, when running gp_restore you must specify the timestamp key of the gp_dump backup (found in the gp_dump_*.rpt report file).
gp_restore has the following requirements:
The backup was made with gp_dump, and the backup files are in their original locations.
gpdb is running normally.
The number of segment nodes (both primary and mirror) matches the backed-up system.
The database to be restored has already been created.
If options such as -s (schema only), -a (data only), --gp-c (compressed), or --gp-d (alternate dump file location) were given to gp_dump, the same options must also be given to gp_restore when restoring.
Recovery: gp_restore --gp-d=/gpbackup/ --gp-r=/gpbackup/ --gp-k=201807041720
MPP Greenplum database installation and simple use