Installing Greenplum from source on CentOS 7
Cluster composition:
One master host and one slave node.
System Environment:
Operating System: CentOS 7, 64-bit, 7.4.1708 (from /etc/redhat-release)
CPU: AMD FX-8300, 8 cores
Memory: 8 GB
Hard Disk: 120 GB
GNOME: 3.22.2
Installation version:
GPDB: V5.4.1
GPORCA: V2.53.11
Prerequisites: Disable the firewall (on every machine, master and slave nodes alike!)
Run the following commands as root. They turn off both the default firewall (firewalld) and iptables, in case it is also installed, so two firewall programs in total:
Stop the default firewall
# systemctl stop firewalld
Mask the default firewall (so that it is not started after a reboot)
# systemctl mask firewalld
Stop iptables
# systemctl stop iptables
Disable iptables (so that it is not started after a reboot)
# systemctl disable iptables
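The result can be checked afterwards on each machine. The following is a minimal sketch rather than part of the original procedure: firewall_report is a made-up helper name, and it relies only on the standard systemctl is-active and is-enabled subcommands. Where systemctl gives no answer, it prints unknown.

```shell
# Report the runtime state and the boot-time state of each firewall service.
firewall_report() {
    for svc in firewalld iptables; do
        # "|| true" keeps the function going when a service is stopped or absent.
        active=$(systemctl is-active "$svc" 2>/dev/null || true)
        enabled=$(systemctl is-enabled "$svc" 2>/dev/null || true)
        printf '%s: active=%s enabled=%s\n' "$svc" "${active:-unknown}" "${enabled:-unknown}"
    done
}

firewall_report
```

On a correctly prepared node, neither service should be reported as active, and firewalld should show as masked.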
Installation Process
1) Create a dedicated account, gpdba, and add it to the root user group.
All of the following operations are performed as gpdba! If an operation fails, run it as root instead.
2) Set the host names on all servers (master and slave nodes)
1) Edit the hosts file with the command vi /etc/hosts.
127.0.0.1 localhost.localdomain
192.168.58.102 Master shsm002
192.168.58.104 Slave1 shsm004
Finally, run source /etc/profile to refresh.
2) Edit the network file with the command vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=<host name>
3) If the host name does not match the device name, write the hosts entries in the following format:
127.0.0.1 localhost.localdomain
<IP address> <host name> <device name>
Finally, run the ping command to verify connectivity.
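Before testing with ping, it can help to confirm that every name the cluster uses is actually listed in the hosts file. The sketch below assumes the four names from the example above (Master, shsm002, Slave1, shsm004). hosts_has_name is a hypothetical helper written for this illustration, not a system command.

```shell
# hosts_has_name FILE NAME: succeed if NAME appears as a host name or alias in FILE.
hosts_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                                   # drop comments
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Report every expected name that is missing from /etc/hosts.
for name in Master shsm002 Slave1 shsm004; do
    hosts_has_name /etc/hosts "$name" || echo "missing from /etc/hosts: $name"
done
```

The loop prints nothing when all names are present. Field 1 (the IP address) is skipped on purpose so that only names are compared.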
3) Modify system files (on all nodes)
1) Modify the kernel configuration
Edit /etc/sysctl.conf with vi and add the following content:
kernel.shmmax = 5000000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.overcommit_memory = 2
Run the command sysctl -p to make the changes take effect.
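To confirm that the kernel actually picked up the new values, the file can be compared with the live settings. This is a rough sketch and not part of the original steps: conf_value and compare_sysctl are hypothetical helper names, and the only system command assumed is sysctl -n, which prints the current value of a key.

```shell
# conf_value FILE KEY: print the last value assigned to KEY in a sysctl.conf-style file.
conf_value() {
    key_re=$(printf '%s' "$2" | sed 's/[.]/\\./g')   # match the dots in KEY literally
    sed -n "s/^[[:space:]]*${key_re}[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# compare_sysctl FILE KEY...: report keys whose live value differs from the one in FILE.
compare_sysctl() {
    file=$1; shift
    for key in "$@"; do
        want=$(conf_value "$file" "$key")
        have=$(sysctl -n "$key" 2>/dev/null || true)
        # Unquoted echo collapses tabs and repeated spaces before comparing.
        [ "$(echo $want)" = "$(echo $have)" ] || echo "MISMATCH $key: file='$want' live='$have'"
    done
}
```

For example, compare_sysctl /etc/sysctl.conf kernel.shmmax kernel.sem vm.overcommit_memory prints nothing when the three settings are in effect.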
2) Modify the resource limits
vi /etc/security/limits.conf
Add the following content:
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
3) Disable SELinux
Edit /etc/selinux/config with vi and set the value of SELINUX to "disabled". After the change, the file reads as follows:
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
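Because this edit has to be repeated on every node, it may be convenient to script it instead of using vi. The following is a sketch under that assumption. set_selinux_disabled is a made-up function name, and sed -i.bak keeps a backup copy of the file next to the original.

```shell
# set_selinux_disabled FILE: rewrite the SELINUX= line in FILE to "disabled".
# The pattern requires "=" right after SELINUX, so SELINUXTYPE= is left alone.
set_selinux_disabled() {
    sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}
```

Run as root, the call would be set_selinux_disabled /etc/selinux/config. The file is only read at boot, so the change takes effect after a restart. getenforce shows the mode currently in force.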
3) Install the GPORCA dependencies (on all nodes)
1) Install cmake (3.10.2)
Download:
$ wget http://www.cmake.org/files/v3.10/cmake-3.10.2.tar.gz
Decompress:
$ tar xzf cmake-3.10.2.tar.gz
Go to the decompressed directory:
$ cd cmake-3.10.2
Configure:
To view the detailed configuration options, run the following command:
$ ./configure --help
Run the configuration command (installing to the directory /usr/cmake):
$ ./configure --prefix=/usr/cmake
Compile:
$ make
Install:
# make install
Final verification:
$ /usr/cmake/bin/cmake --version
The output is similar to the following:
cmake version 3.10.2
Edit the /etc/profile file to add cmake to the PATH environment variable by appending the following content:
### CMAKE 3.10 ###
export PATH=/usr/cmake/bin:$PATH
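A plain export line adds the directory again every time the profile is sourced. One possible alternative is the sketch below, which assumes the /usr/cmake prefix used above. path_prepend is a hypothetical helper name, not something the system provides.

```shell
# path_prepend DIR: put DIR at the front of PATH unless it is already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

path_prepend /usr/cmake/bin
```

Wrapping PATH in colons before matching means that only a whole entry counts as a hit, so /usr/cmake/bin is not confused with a longer path that merely contains it.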
2) Install gp-xerces
As gpdba, decompress the source package, enter the decompressed directory, and run the following commands.
mkdir build
cd build
../configure --prefix=/usr/local # install into the /usr/local directory
(Note: if an error occurs, run the following make commands as root)
make
make install
3) Install re2c (1.0.3)
Download the version you want from the http://re2c.org/install/install.html page.
re2c is required when ninja is configured.
$ ./configure --prefix=/usr/local
(Note: if the user is not in the root user group, run the following make commands as root)
$ make
$ make install
4) Install Ninja
The source can be fetched with git from https://github.com/ninja-build/ninja.git
After downloading, go to the ninja directory and run the following command:
./configure.py --bootstrap
The build produces a single binary file, ninja. As root, copy it to the /usr/bin directory (which is already on PATH).
No further installation is necessary because the only required file is the resulting ninja binary. However, to enable features like Bash completion and the Emacs and Vim editing modes, some files in misc/ must be copied to the appropriate locations.
Note: install all the dependencies on the master first, then use scp to copy the installation packages or archives to the other nodes and install them there one by one.
4) Install GPORCA
Source code: https://github.com/greenplum-db/gporca
Install GPORCA (GPDB 5.4.1 depends on version 2.53.11)
As gpdba, decompress the source package, enter the decompressed directory, and run the following commands.
cmake -GNinja -H. -Bbuild
ninja install -C build
The ORCA version that GPDB depends on is recorded in the file /gpdb-5.4.1/depends/conanfile_orca.txt:
[requires]
orca/v2.53.11@gpdb/stable
After the installation is complete, go to the /gporca/build directory and run the ctest command to check the build.
If the output is similar to the following:
100% tests passed, 0 tests failed out of 119
Total Test time (real) = 195.48 sec
then the build succeeded.
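When the build is repeated on several nodes, the check of the ctest summary can be automated. The sketch below is only an illustration: ctest_all_passed is a made-up helper that reads ctest output on standard input and looks for the summary line shown above.

```shell
# ctest_all_passed: read ctest output on stdin and succeed only if the
# summary line reports that 100% of the tests passed with 0 failures.
ctest_all_passed() {
    grep -Eq '^100% tests passed, 0 tests failed out of [0-9]+'
}
```

From the build directory this could be used as ctest | ctest_all_passed && echo "GPORCA tests OK". ctest also returns a non-zero exit status when a test fails, so checking that status directly is a simpler alternative.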
[Deleting an old GPORCA]
Go to the source directory and run the following commands:
rm -rf build/*
rm -rf /usr/local/include/naucrates
rm -rf /usr/local/include/gpdbcost
rm -rf /usr/local/include/gpopt
rm -rf /usr/local/include/gpos
rm -rf /usr/local/lib/libnaucrates.so*
rm -rf /usr/local/lib/libgpdbcost.so*
rm -rf /usr/local/lib/libgpopt.so*
rm -rf /usr/local/lib/libgpos.so*
5) Install GPDB (version 5.4.1)
1) Install the dependencies as root
sudo yum install -y epel-release
sudo yum install -y apr-devel bison bzip2-devel cmake3 flex gcc-c++ krb5-devel libcurl-devel libevent-devel libkadm5 libyaml-devel libxml2-devel perl-ExtUtils-Embed python-devel python-paramiko python-pip python-psutil python-setuptools readline-devel xerces-c-devel zlib-devel
# Install lockfile with pip because the yum package 'python-pip' is too old (0.8).
sudo pip install lockfile conan
2) Download the source code, decompress it, then compile and install it.
As gpdba, enter the directory of the decompressed source files and run the following command (the path given to --prefix, /usr/gpdb, is the installation directory):
./configure --with-perl --with-python --with-libxml --with-gssapi --prefix=/usr/gpdb
If ORCA is not installed, use: ./configure --with-perl --with-python --with-libxml --with-gssapi --disable-orca --prefix=/usr/gpdb
Then compile:
make -j8
Finally, install:
make -j8 install
3) Distribution
First, set up passwordless ssh connections between the servers.
Create the /usr/gpdb-conf directory and, in it, the host list file hostlist with the following content:
Master
Slave1
Then create seg_hosts in the same gpdb-conf directory with the following content:
Slave1
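The two files differ only in whether the master is listed, so they can be generated together. A minimal sketch, assuming the host names Master and Slave1 used in this article. write_host_files is a hypothetical helper name.

```shell
# write_host_files DIR MASTER SEGMENT...: create DIR/hostlist (all hosts)
# and DIR/seg_hosts (segment hosts only), one host name per line.
write_host_files() {
    dir=$1; master=$2; shift 2
    mkdir -p "$dir" || return 1
    printf '%s\n' "$master" "$@" > "$dir/hostlist"
    printf '%s\n' "$@" > "$dir/seg_hosts"
}
```

For the cluster described here the call would be write_host_files /usr/gpdb-conf Master Slave1. Additional segment hosts can be appended as further arguments.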
Load the greenplum_path configuration:
source /usr/gpdb/greenplum_path.sh
Exchange ssh keys with gpssh-exkeys:
gpssh-exkeys -f /usr/gpdb-conf/hostlist
Next, pack the installed directory into an archive:
gtar -cvf /home/gpdba/gpdb-install-binary-5.4.1.tar /usr/gpdb
Copy it to the other node with gpscp (or use ssh followed by scp):
gpscp -f /usr/gpdb-conf/seg_hosts /home/gpdba/gpdb-install-binary-5.4.1.tar =:/usr
Use gpssh to connect to the master and slave nodes and extract the tar file, so that the installation path is the same as on the master:
gpssh -f /usr/gpdb-conf/hostlist
Once the master has connected to the slave node, each command is run on all n hosts, so seeing n outputs per command is normal.
Extract the file:
gtar -xvf gpdb-install-binary-5.4.1.tar
Create the database working directories:
cd /home/gpdba/gpdata
mkdir gpdatap1 gpdatap2 gpdatam1 gpdatam2 gpmaster
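The cd above fails if /home/gpdba/gpdata does not exist yet, and a plain mkdir fails if a directory is already there. A small sketch that avoids both problems, using the same directory names. make_data_dirs is a made-up helper name.

```shell
# make_data_dirs BASE: create the master, primary, and mirror working
# directories under BASE. mkdir -p also creates BASE itself and does not
# complain about directories that already exist.
make_data_dirs() {
    for d in gpmaster gpdatap1 gpdatap2 gpdatam1 gpdatam2; do
        mkdir -p "$1/$d" || return 1
    done
}
```

Here the call would be make_data_dirs /home/gpdba/gpdata. It is safe to run again after a failed initialization.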
4) Initialize the database (on the master host)
Configure the environment variables in .bash_profile:
vi .bash_profile
Modify it as follows:
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/.local/bin:$HOME/bin
export PATH
# Greenplum Database
source /usr/gpdb/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpdba/gpdata/gpmaster/gpseg-1
export PGPORT=2346
export PGDATABASE=testDB
After saving, reload the file so that the changes take effect:
. ~/.bash_profile
Configure the database startup parameters
Copy the /usr/gpdb/docs/cli_help/gpconfigs/gpinitsystem_config file to the /usr/gpdb-conf directory and edit it. Keep the following content:
# FILE NAME: gpinitsystem_config
# Configuration file needed by the gpinitsystem
################################################
#### REQUIRED PARAMETERS
################################################
#### Name of this Greenplum system enclosed in quotes.
ARRAY_NAME="Greenplum Data Platform"
#### Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg
#### Base number by which primary segment port numbers
#### are calculated.
PORT_BASE=40000
#### File system location(s) where primary segment data directories
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).
declare -a DATA_DIRECTORY=(/data1/primary /data1/primary /data1/primary /data2/primary /data2/primary /data2/primary)
#### OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=mdw
#### File system location where the master data directory
#### will be created.
MASTER_DIRECTORY=/data/master
#### Port number for the master instance.
MASTER_PORT=5432
#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh
#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8
#### Default server-side character set encoding.
ENCODING=UNICODE
################################################
#### OPTIONAL MIRROR PARAMETERS
################################################
#### Base number by which mirror segment port numbers
#### are calculated.
#MIRROR_PORT_BASE=50000
#### Base number by which primary file replication port
#### numbers are calculated.
#REPLICATION_PORT_BASE=41000
#### Base number by which mirror file replication port
#### numbers are calculated.
#MIRROR_REPLICATION_PORT_BASE=51000
#### File system location(s) where mirror segment data directories
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the
#### DATA_DIRECTORY parameter.
#declare -a MIRROR_DATA_DIRECTORY=(/data1/mirror /data1/mirror /data1/mirror /data2/mirror /data2/mirror /data2/mirror)
################################################
#### OTHER OPTIONAL PARAMETERS
################################################
#### Create a database of this name after initialization.
#DATABASE_NAME=name_of_database
#### Specify the location of the host address file here instead of
#### with the -h option of gpinitsystem.
#MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem
Finally, run the following command to start the initialization:
gpinitsystem -c /usr/gpdb-conf/gpinitsystem_config -
NOTE: If initialization fails, reset the environment as follows before running it again:
Find and stop the postgres process that is listening on the configured port.
Delete the partially created database files (possibly on every node server), that is, the /home/gpdba/gpdata/gpmaster/gpseg-1 directory.
6) Troubleshooting
Error:
[gpdba@shsm002 ~]$ gpssh-exkeys -f /usr/gpdb-conf/hostlist
Error: unable to import module: version conflict: '/usr/lib64/python2.7/site-packages/psutil/_psutil_linux.so' C extension module was built for another version of psutil (different than 2.2.1)
Solution: reinstall psutil with sudo pip install psutil==2.2.1
Error:
20180129:23:40:43:gpinitsystem:shsm002:gpdba-[FATAL]:-Found indication of postmaster process on port 2345 on Master host Script Exiting!
Solution: stop the process that occupies port 2345.
First, find the process:
$ lsof -i:2345
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
postgres 10738 gpadmin 3u IPv4 264510 0t0 TCP *:postgres (LISTEN)
postgres 10738 gpadmin 4u IPv6 264511 0t0 TCP *:postgres (LISTEN)
Then kill the process:
$ kill -9 10738
Error:
20180207:00:14:09:005166 gpinitsystem:shsm002:gpdba-[INFO]:-Building the Master instance database, please wait...
20180207:00:14:17:005166 gpinitsystem:shsm002:gpdba-[INFO]:-Starting the Master in admin mode
20180207:00:14:23:gpinitsystem:shsm002:gpdba-[FATAL]:-Unknown host shsm004 Script Exiting!
20180207:00:14:23:005166 gpinitsystem:shsm002:gpdba-[WARN]:-Script has left Greenplum Database in an incomplete state
Cause: the value of hostname differs from the host name shown after the @ in the shell prompt, and shsm004 is not defined in the hosts file, so it must be added.
Solution: edit the hosts file. Each line has the form "IP address, host name, domain name". Put the hostname value shsm004 in the domain name field and save the file. Then use the ping command to check that the name resolves.
Error:
20180207:00:05:00:003516 gpinitsystem:shsm002:gpdba-[INFO]:-Checking Master host
20180207:00:05:00:003516 gpinitsystem:shsm002:gpdba-[WARN]:-Have lock file /tmp/.s.PGSQL.2346.lock but no process running on port 2346
20180207:00:05:00:gpinitsystem:shsm002:gpdba-[FATAL]:-Found indication of postmaster process on port 2346 on Master host Script Exiting!
Solution: delete the lock file with rm /tmp/.s.PGSQL.2346.lock
Error:
[gpdba@shsm002 ~]$ /bin/bash /home/gpdba/gpAdminLogs/backout_gpinitsystem_gpdba_20180207_225128
[FATAL]:-Not on original master host Master, backout script exiting!
Solution: instead of using this script to clear the intermediate data, delete the partially created database files under the gpdata directory directly.
Error:
20180207:23:39:31:028691 gpcreateseg.sh:shsm002:gpdba-[INFO][1]:-Start Function PROCESS_QE
20180207:23:39:31:028691 gpcreateseg.sh:shsm002:gpdba-[INFO][1]:-Processing segment Slave1
/usr/gpdb/bin/postgres: error while loading shared libraries: libgpopt.so.3: cannot open shared object file: No such file or directory
no data was returned by command ""/usr/gpdb/bin/postgres" -V"
The program "postgres" is needed by initdb but was either not found in the same directory as "/usr/gpdb/bin/initdb" or failed unexpectedly.
Check your installation; "postgres -V" may have more information.
/usr/gpdb/bin/postgres: error while loading shared libraries: libgpopt.so.3: cannot open shared object file: No such file or directory
no data was returned by command ""/usr/gpdb/bin/postgres" -V"
The program "postgres" is needed by initdb but was either not found in the same directory as "/usr/gpdb/bin/initdb" or failed unexpectedly.
Check your installation; "postgres -V" may have more information.
cat: /home/gpdba/gpdata/gpdatap1/gpseg0.initdb: No such file or directory
cat: /home/gpdba/gpdata/gpdatap2/gpseg1.initdb: No such file or directory
Solution: edit the /usr/gpdb/greenplum_path.sh file and add the directory that contains libgpopt.so.3 to the LD_LIBRARY_PATH environment variable. Then source the file again (until the computer is restarted, you may need to do this manually each time you open a terminal). The modified file reads as follows:
GPHOME=/usr/gpdb
# Replace with symlink path if it is present and correct
if [ -h ${GPHOME}/../greenplum-db ]; then
    GPHOME_BY_SYMLINK=`(cd ${GPHOME}/../greenplum-db/ && pwd -P)`
    if [ x"${GPHOME_BY_SYMLINK}" = x"${GPHOME}" ]; then
        GPHOME=`(cd ${GPHOME}/../greenplum-db/ && pwd -L)`/.
    fi
    unset GPHOME_BY_SYMLINK
fi
# Setup PYTHONHOME
if [ -x $GPHOME/ext/python/bin/python ]; then
    PYTHONHOME="$GPHOME/ext/python"
fi
PYTHONPATH=$GPHOME/lib/python
PATH=$GPHOME/bin:$PYTHONHOME/bin:$PATH
LD_LIBRARY_PATH=$GPHOME/lib:/usr/local/lib:${LD_LIBRARY_PATH-}
export LD_LIBRARY_PATH
OPENSSL_CONF=$GPHOME/etc/openssl.cnf
export GPHOME
export PATH
export PYTHONPATH
export PYTHONHOME
export OPENSSL_CONF
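The file above adds /usr/local/lib because that is where GPORCA was installed in this article. On another machine the library may live elsewhere, so it is worth locating it first. The sketch below is only an illustration: find_lib_dir is a hypothetical helper, and the candidate directories in the usage note are assumptions.

```shell
# find_lib_dir NAME DIR...: print the first DIR that contains a file called NAME,
# or return non-zero if none of the directories has it.
find_lib_dir() {
    name=$1; shift
    for dir in "$@"; do
        if [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}
```

For example, find_lib_dir libgpopt.so.3 /usr/gpdb/lib /usr/local/lib /usr/local/lib64 prints the directory to add to LD_LIBRARY_PATH. After sourcing greenplum_path.sh again, ldd /usr/gpdb/bin/postgres should no longer list any library as "not found".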
Error:
20180208:01:57:59:012804 gpinitsystem:shsm002:gpdba-[INFO]:-Start Function CREATE_DATABASE
psql: FATAL: DTM initialization: failure during startup recovery, retry failed, check segment status (cdbtm.c:1513)
20180208:01:58:00:012804 gpinitsystem:shsm002:gpdba-[INFO]:-Start Function ERROR_CHK
20180208:01:58:00:012804 gpinitsystem:shsm002:gpdba-[INFO]:-End Function ERROR_CHK
20180208:01:58:00:012804 gpinitsystem:shsm002:gpdba-[INFO]:-Start Function ERROR_EXIT
20180208:01:58:00:gpinitsystem:shsm002:gpdba-[FATAL]:-Failed to complete create database testDB Script Exiting!
Solution: stop and disable the firewall (all firewall programs).
Run the following commands:
# systemctl stop firewalld
# systemctl mask firewalld
# systemctl stop iptables
# systemctl disable iptables
Another possible cause, given for reference: shared_buffers is too large. The official website explains how to size shared_buffers from the amount of memory and the number of segments. As a rule of thumb, set aside about 2 GB for other uses plus statement_mem multiplied by the number of segments, then divide the remaining memory by the number of segments. shared_buffers is normally set during installation. The default value is 125 MB.