Installing Greenplum from Source on CentOS 7


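Cluster layout:

  One master host and one segment (slave) host.

System environment:

  Operating system: CentOS 7, 64-bit, release 7.4.1708 (see /etc/redhat-release)

  CPU: AMD FX-8300, 8 cores

  Memory: 8 GB

  Disk: 120 GB

  GNOME: 3.22.2

Versions installed:

  GPDB: V5.4.1

  GPORCA: V2.53.11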

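Prerequisite: disable the firewall (on the master and on every segment host!)

Run the following commands as root. They disable both the default firewall (firewalld) and iptables, which may also be installed, so two firewall services in total:

Stop the default firewall

# systemctl stop firewalld

Mask the default firewall (so that it does not start after a reboot either)

# systemctl mask firewalld

Stop iptables

# systemctl stop iptables

Disable iptables

# systemctl disable iptables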

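Installation

I) Create a dedicated account, gpdba, and add it to the root group.

Run all of the steps below as gpdba. If a step fails, rerun it as root.

II) Set the host names on every server (master and all segment hosts)

1) Edit the hosts file with vi /etc/hosts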

127.0.0.1 localhost localhost.localdomain

192.168.58.102 Master shsm002

192.168.58.104 Slave1 shsm004

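Finally, run source /etc/profile to reload the environment. (Edits to /etc/hosts themselves take effect as soon as the file is saved.)

2) Edit the network file with vi /etc/sysconfig/network. On CentOS 7 the persistent host name is read from /etc/hostname, so set it with hostnamectl set-hostname as well.

NETWORKING=yes
HOSTNAME=<host name of this machine>

3) If the host name differs from the machine's device name, use the following format in /etc/hosts:

127.0.0.1 localhost localhost.localdomain

<IP address> <host name> <device name>
Finally, use ping to confirm that the hosts can reach each other.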
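Before pinging, it can help to confirm that every name the cluster uses is actually present in the hosts file. The helper below is a minimal sketch; the function name missing_hosts is made up for this example, and it only inspects the file, it does not test connectivity.

```shell
# Print each NAME that does not appear as a host name or alias in FILE.
# Usage: missing_hosts FILE NAME...
missing_hosts() (
    file=$1
    shift
    for name in "$@"; do
        # Drop comments, then look for the name in any field after the IP address.
        if ! sed 's/#.*//' "$file" | awk -v n="$name" '
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit !found }'; then
            printf '%s\n' "$name"
        fi
    done
)
```

For the two hosts in this article, missing_hosts /etc/hosts Master shsm002 Slave1 shsm004 should print nothing; any name it prints still needs an entry.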

III) Modify system files (master and all segment hosts)

1) Kernel parameters

Edit /etc/sysctl.conf with vi and add the following:

kernel.shmmax = 5000000000

kernel.shmmni = 4096

kernel.shmall = 4000000000

kernel.sem = 250 512000 100 2048

kernel.sysrq = 1

kernel.core_uses_pid = 1

kernel.msgmnb = 65536

kernel.msgmax = 65536

kernel.msgmni = 2048

net.ipv4.tcp_syncookies = 1

net.ipv4.ip_forward = 0

net.ipv4.conf.default.accept_source_route = 0

net.ipv4.tcp_tw_recycle = 1

net.ipv4.tcp_max_syn_backlog = 4096

net.ipv4.conf.all.arp_filter = 1

net.ipv4.ip_local_port_range = 1025 65535

net.core.netdev_max_backlog = 10000

net.core.rmem_max = 2097152

net.core.wmem_max = 2097152

vm.overcommit_memory = 2

Run sysctl -p to apply the new values.
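To check that the running kernel really picked up the values, the configured settings can be compared with the live ones. This is only a sketch: the function name sysctl_diff and its optional second argument (a replacement for "sysctl -n", present so the function can be tried without touching the kernel) are inventions of this example.

```shell
# Print every key in a sysctl.conf-style FILE whose live value differs
# from the configured one.
# Usage: sysctl_diff FILE [READER]   (READER defaults to "sysctl -n")
sysctl_diff() (
    file=$1
    reader=${2:-"sysctl -n"}
    # Keep only "key = value" lines; blank lines and comments are skipped.
    grep -E '^[[:space:]]*[a-z0-9_.]+[[:space:]]*=' "$file" |
    while IFS='=' read -r key value; do
        key=$(echo $key)      # trim surrounding whitespace
        want=$(echo $value)   # trim and collapse inner whitespace
        have=$($reader "$key" 2>/dev/null)
        have=$(echo $have)    # sysctl separates multi-value settings with tabs
        [ "$want" = "$have" ] || printf '%s expected=%s actual=%s\n' "$key" "$want" "$have"
    done
)
```

Running sysctl_diff /etc/sysctl.conf after sysctl -p should print nothing when all settings are in effect.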

2) Resource limits

vi /etc/security/limits.conf

Add the following:

* soft nofile 65536

* hard nofile 65536

* soft nproc 131072

* hard nproc 131072
3) Disable SELinux

Edit /etc/selinux/config with vi and set SELINUX to disabled. After the change the file reads:

# This file controls the state of SELinux on the system.

# SELINUX= can take one of these three values:

# enforcing - SELinux security policy is enforced.

# permissive - SELinux prints warnings instead of enforcing.

# disabled - No SELinux policy is loaded.

SELINUX=disabled

# SELINUXTYPE= can take one of these two values:

# targeted - Targeted processes are protected,

# mls - Multi Level Security protection.

SELINUXTYPE=targeted
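The same edit can be scripted instead of made in vi. The sketch below assumes nothing beyond the file format shown above; the function name disable_selinux_in is made up, and it takes the file path as an argument so the change can be rehearsed on a copy first.

```shell
# Set SELINUX=disabled in the given SELinux config file.
# Only the SELINUX= line is rewritten; SELINUXTYPE= and comments are untouched.
# Usage: disable_selinux_in FILE
disable_selinux_in() (
    file=$1
    tmp=$(mktemp) || exit 1
    # Write through a temp file, then copy back so the original inode,
    # owner and mode are preserved.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    exit $status
)
```

As root, disable_selinux_in /etc/selinux/config applies the change. The file is only read at boot, so the new mode takes effect after a restart; getenforce shows the current mode.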


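IV) Install the dependencies of the GPORCA optimizer (master and all segment hosts)

1) Install cmake (3.10.2)

Download:
$ wget http://www.cmake.org/files/v3.10/cmake-3.10.2.tar.gz
Extract:
$ tar xzf cmake-3.10.2.tar.gz

Change into the extracted directory:
$ cd cmake-3.10.2
About the configure command:
To list all configuration options, run:

$ ./configure --help
Configure the build (installing to /usr/cmake):

$ ./configure --prefix=/usr/cmake

Build:
$ make
Install:
# make install
Finally, verify the installation:
$ /usr/cmake/bin/cmake -version

The output should show the version number, for example:
cmake version 3.10.2

Edit /etc/profile and add cmake to the PATH by appending the following:

  ### CMAKE 3.10 ###
  export PATH=/usr/cmake/bin:$PATH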

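2) Install gp-xerces

As gpdba, extract the source archive, change into the extracted directory, and run the commands below.
mkdir build
cd build
../configure --prefix=/usr/local  ## install under /usr/local
(Note: if this fails, run the following make commands as root.)
make
make install

3) Install re2c (1.0.3)

Download the version you need from http://re2c.org/install/install.html
re2c is needed when configuring Ninja.
$ ./configure --prefix=/usr/local
(Note: if your user is not in the root group, run the following make commands as root.)
$ make
$ make install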

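4) Install Ninja

The source can be cloned with git from https://github.com/ninja-build/ninja.git
After downloading, change into the ninja directory and run:
./configure.py --bootstrap
The build produces a single binary named ninja, so it is enough to copy that file to /usr/bin as root (/usr/bin is already on the PATH). The Ninja documentation says: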
Installation is not necessary because the only required file is the resulting ninja binary. However, to enable features like Bash completion and Emacs and Vim editing modes, some files in misc/ must be copied to appropriate locations.

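Note: install all of the dependencies on the master first, then copy the installers or archives to the other hosts with scp and install them there one by one.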

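V) Install GPORCA

Repository: https://github.com/greenplum-db/gporca

Install GPORCA (version 2.53.11, the one GPDB 5.4.1 depends on)
As gpdba, extract the source archive, change into the extracted directory, and run:
cmake -GNinja -H. -Bbuild
ninja install -C build

The ORCA version that GPDB depends on is recorded in /gpdb-5.4.1/depends/conanfile_orca.txt:
[requires]
orca/v2.53.11@gpdb/stable

After installation, change into /gporca/build and run ctest to check the build.
If the output ends with something like:
100% tests passed, 0 tests failed out of 119

Total Test time (real) = 195.48 sec
then the build succeeded.

[Removing an older GPORCA]
From the source directory, run: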
rm -rf build/*
rm -rf /usr/local/include/naucrates
rm -rf /usr/local/include/gpdbcost
rm -rf /usr/local/include/gpopt
rm -rf /usr/local/include/gpos
rm -rf /usr/local/lib/libnaucrates.so*
rm -rf /usr/local/lib/libgpdbcost.so*
rm -rf /usr/local/lib/libgpopt.so*
rm -rf /usr/local/lib/libgpos.so*

VI) Install GPDB (version 5.4.1)

1) Install the dependencies as root

sudo yum install -y epel-release

sudo yum install -y apr-devel bison bzip2-devel cmake3 flex gcc gcc-c++ krb5-devel libcurl-devel libevent-devel libkadm5 libyaml-devel libxml2-devel perl-ExtUtils-Embed python-devel python-paramiko python-pip python-psutil python-setuptools readline-devel xerces-c-devel zlib-devel

# Install lockfile with pip because the yum package for lockfile is too old (0.8).
sudo pip install lockfile conan

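2) Download the source code, extract it, then build and install.

As gpdba, change into the extracted source directory and run the following (the path after --prefix, /usr/gpdb, is the installation directory):
./configure --with-perl --with-python --with-libxml --with-gssapi --prefix=/usr/gpdb
If ORCA is not installed, use this instead: ./configure --with-perl --with-python --with-libxml --with-gssapi --disable-orca --prefix=/usr/gpdb

Then build:
make -j8

Finally, install:
make -j8 install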

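3) Distribute the installation

First, set up passwordless ssh between the servers.

Create the directory /usr/gpdb-conf and, inside it, a host list file named hostlist with the following content:

  Master

  Slave1

Then, in the same gpdb-conf directory, create seg_hosts with the following content:

  Slave1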
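Both files are plain lists with one host name per line, so they can also be generated. The following is a small sketch; write_host_files is a hypothetical helper name, and the directory and host names are passed in rather than hard-coded.

```shell
# Write the two host list files used by gpssh-exkeys, gpscp and gpssh.
# Usage: write_host_files DIR MASTER SEGMENT...
write_host_files() (
    dir=$1
    master=$2
    shift 2
    mkdir -p "$dir" || exit 1
    # hostlist: every host in the cluster, master first.
    printf '%s\n' "$master" "$@" > "$dir/hostlist"
    # seg_hosts: segment hosts only.
    printf '%s\n' "$@" > "$dir/seg_hosts"
)
```

For this cluster the call would be write_host_files /usr/gpdb-conf Master Slave1, run by a user who may write to /usr.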

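Load the Greenplum environment settings

source /usr/gpdb/greenplum_path.sh

Exchange ssh keys with gpssh-exkeys

gpssh-exkeys -f /usr/gpdb-conf/hostlist

Finally, pack the installed directory into a tar archive

gtar -cvf /home/gpdba/gpdb-install-binary-5.4.1.tar /usr/gpdb

Copy the archive to the other hosts with gpscp (or log in with ssh and use scp)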

gpscp -f /usr/gpdb-conf/seg_hosts /home/gpdba/gpdb-install-binary-5.4.1.tar =:/usr

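Use gpssh to connect to the master and the segment hosts, then extract the tar file so that the installation path is the same as on the master.

gpssh -f /usr/gpdb-conf/hostlist

Once the master is connected to the segment hosts, every command should print one block of output per host (n hosts, n outputs); that shows the session is working.

Extract the archive on the segment hosts. tar stored the members without the leading "/" (as usr/gpdb/...), so extract relative to / to end up with /usr/gpdb. On the master, which already has /usr/gpdb, the archive is not in /usr and the command only reports an error:

gtar -xvf /usr/gpdb-install-binary-5.4.1.tar -C /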

Create the database working directories

mkdir -p /home/gpdba/gpdata

cd /home/gpdba/gpdata

mkdir gpdatap1 gpdatap2 gpdatam1 gpdatam2 gpmaster

4) Initialize the database (on the master host)

Configure the environment variables in .bash_profile

vi .bash_profile

Edit it to read:

# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin

export PATH

## Greenplum Database
source /usr/gpdb/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpdba/gpdata/gpmaster/gpseg-1
export PGPORT=2346
export PGDATABASE=testDB

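Save the file, then reload it:

. ~/.bash_profile

Set the database initialization parameters

Copy /usr/gpdb/docs/cli_help/gpconfigs/gpinitsystem_config to /usr/gpdb-conf and edit it, keeping the content below. The values shown are the template defaults: MASTER_HOSTNAME, MASTER_DIRECTORY, MASTER_PORT and DATA_DIRECTORY have to be changed to match this cluster, and the segment host file has to be supplied either through MACHINE_LIST_FILE or with the -h option of gpinitsystem.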

# FILE NAME: gpinitsystem_config

# Configuration file needed by the gpinitsystem

################################################
#### REQUIRED PARAMETERS
################################################

#### Name of this Greenplum system enclosed in quotes.
ARRAY_NAME="Greenplum Data Platform"

#### Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg

#### Base number by which primary segment port numbers
#### are calculated.
PORT_BASE=40000

#### File system location(s) where primary segment data directories
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).
declare -a DATA_DIRECTORY=(/data1/primary /data1/primary /data1/primary /data2/primary /data2/primary /data2/primary)

#### OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=mdw

#### File system location where the master data directory
#### will be created.
MASTER_DIRECTORY=/data/master

#### Port number for the master instance.
MASTER_PORT=5432

#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh

#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8

#### Default server-side character set encoding.
ENCODING=UNICODE

################################################
#### OPTIONAL MIRROR PARAMETERS
################################################

#### Base number by which mirror segment port numbers
#### are calculated.
#MIRROR_PORT_BASE=50000

#### Base number by which primary file replication port
#### numbers are calculated.
#REPLICATION_PORT_BASE=41000

#### Base number by which mirror file replication port
#### numbers are calculated.
#MIRROR_REPLICATION_PORT_BASE=51000

#### File system location(s) where mirror segment data directories
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the
#### DATA_DIRECTORY parameter.
#declare -a MIRROR_DATA_DIRECTORY=(/data1/mirror /data1/mirror /data1/mirror /data2/mirror /data2/mirror /data2/mirror)


################################################
#### OTHER OPTIONAL PARAMETERS
################################################

#### Create a database of this name after initialization.
#DATABASE_NAME=name_of_database

#### Specify the location of the host address file here instead of
#### with the -h option of gpinitsystem.
#MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem
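For the two-host cluster in this article, the edited parameters might look like the sketch below. These values are not copied from the author's file; they are inferred from other parts of the article (the working directories created earlier, PGPORT and PGDATABASE in .bash_profile, and the host and directory names that appear in the error logs). Pointing MACHINE_LIST_FILE at seg_hosts is an assumption of this example.

```shell
# Hypothetical values inferred from the rest of this walkthrough;
# check each one against your own hosts and directories.
ARRAY_NAME="Greenplum Data Platform"
SEG_PREFIX=gpseg
PORT_BASE=40000
# Two primary segments per segment host, in the directories created earlier.
declare -a DATA_DIRECTORY=(/home/gpdba/gpdata/gpdatap1 /home/gpdba/gpdata/gpdatap2)
MASTER_HOSTNAME=Master
# gpinitsystem creates gpseg-1 below this directory (see MASTER_DATA_DIRECTORY).
MASTER_DIRECTORY=/home/gpdba/gpdata/gpmaster
# Same port as PGPORT in .bash_profile.
MASTER_PORT=2346
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
# Same name as PGDATABASE in .bash_profile.
DATABASE_NAME=testDB
# Assumption: reuse the segment host list created for gpscp.
MACHINE_LIST_FILE=/usr/gpdb-conf/seg_hosts
```

The mirror parameters stay commented out in this sketch, which matches the template above.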

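Finally, start the initialization:

gpinitsystem -c /usr/gpdb-conf/gpinitsystem_config -a

Note: if initialization fails and you want to run it again, reset the environment first:

Find and stop the postgres processes listening on the configured port.

Delete the partially created database files, that is, the /home/gpdba/gpdata/gpmaster/gpseg-1 directory (possibly on every server in the cluster).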
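The two reset steps can be wrapped in one helper. This is a rough sketch under stated assumptions: the function name reset_failed_init is made up, lsof must be installed for the process check, and the lock file location is the /tmp/.s.PGSQL.<port>.lock path that appears in the error messages later in this article.

```shell
# Clean up after a failed gpinitsystem run so that it can be retried.
# Usage: reset_failed_init PORT MASTER_DIR [LOCK_DIR]
reset_failed_init() (
    port=$1
    master_dir=$2
    lock_dir=${3:-/tmp}
    # Stop whatever still listens on the master port (skipped without lsof).
    if command -v lsof >/dev/null 2>&1; then
        for pid in $(lsof -t -iTCP:"$port" -sTCP:LISTEN 2>/dev/null); do
            kill "$pid"
        done
    fi
    # Remove the half-built master instance and a stale socket lock file.
    rm -rf "$master_dir/gpseg-1"
    rm -f "$lock_dir/.s.PGSQL.$port.lock"
)
```

With the settings used here the call would be reset_failed_init 2346 /home/gpdba/gpdata/gpmaster. Segment data directories on the other hosts are not touched by this sketch and may need the same treatment.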

VII) Troubleshooting

Error:
[gpdba@shsm002 ~]$ gpssh-exkeys -f /usr/gpdb-conf/hostlist
Error: unable to import module: version conflict: '/usr/lib64/python2.7/site-packages/psutil/_psutil_linux.so' C extension module was built for another version of psutil (different than 2.2.1)
Fix: reinstall psutil: sudo pip install psutil==2.2.1


Error:
20180129:23:40:43:gpinitsystem:shsm002:gpdba-[FATAL]:-Found indication of postmaster process on port 2345 on Master host Script Exiting!
Fix: kill the process that is occupying port 2345.
First find the process:
$ lsof -i:2345

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME

postgres 10738 gpadmin 3u IPv4 264510 0t0 TCP *:postgres (LISTEN)
postgres 10738 gpadmin 4u IPv6 264511 0t0 TCP *:postgres (LISTEN)
Then kill the process:
$ kill -9 10738


Error:
20180207:00:14:09:005166 gpinitsystem:shsm002:gpdba-[INFO]:-Building the Master instance database, please wait...
20180207:00:14:17:005166 gpinitsystem:shsm002:gpdba-[INFO]:-Starting the Master in admin mode
20180207:00:14:23:gpinitsystem:shsm002:gpdba-[FATAL]:-Unknown host shsm004 Script Exiting!
20180207:00:14:23:005166 gpinitsystem:shsm002:gpdba-[WARN]:-Script has left Greenplum Database in an incomplete state
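Cause: the machine's hostname (the name after the @ in the shell prompt) differs from the host name used for the cluster, and shsm004 is not defined in /etc/hosts. Adding it resolves the error.
Fix: edit /etc/hosts so that each line has the form: IP address, host name, domain name. Put the hostname value shsm004 in the domain name field and save. Afterwards ping shsm004 should succeed.


Error: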
20180207:00:05:00:003516 gpinitsystem:shsm002:gpdba-[INFO]:-Checking Master host
20180207:00:05:00:003516 gpinitsystem:shsm002:gpdba-[WARN]:-Have lock file /tmp/.s.PGSQL.2346.lock but no process running on port 2346
20180207:00:05:00:gpinitsystem:shsm002:gpdba-[FATAL]:-Found indication of postmaster process on port 2346 on Master host Script Exiting!
Fix: delete the lock file: rm /tmp/.s.PGSQL.2346.lock


Error:
[gpdba@shsm002 ~]$ /bin/bash /home/gpdba/gpAdminLogs/backout_gpinitsystem_gpdba_20180207_225128
[FATAL]:-Not on original master host Master, backout script exiting!
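Fix: do not use this backout script to clean up the intermediate data; delete the unfinished database files under the gpdata directory directly.


Error: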
20180207:23:39:31:028691 gpcreateseg.sh:shsm002:gpdba-[INFO][1]:-Start Function PROCESS_QE
20180207:23:39:31:028691 gpcreateseg.sh:shsm002:gpdba-[INFO][1]:-Processing segment Slave1
/usr/gpdb/bin/postgres: error while loading shared libraries: libgpopt.so.3: cannot open shared object file: No such file or directory
no data was returned by command ""/usr/gpdb/bin/postgres" -V"
The program "postgres" is needed by initdb but was either not found in the same directory as "/usr/gpdb/bin/initdb" or failed unexpectedly.
Check your installation; "postgres -V" may have more information.
/usr/gpdb/bin/postgres: error while loading shared libraries: libgpopt.so.3: cannot open shared object file: No such file or directory
no data was returned by command ""/usr/gpdb/bin/postgres" -V"
The program "postgres" is needed by initdb but was either not found in the same directory as "/usr/gpdb/bin/initdb" or failed unexpectedly.
Check your installation; "postgres -V" may have more information.
cat: /home/gpdba/gpdata/gpdatap1/gpseg0.initdb: No such file or directory
cat: /home/gpdba/gpdata/gpdatap2/gpseg1.initdb: No such file or directory
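Fix: edit /usr/gpdb/greenplum_path.sh and add the directory that contains libgpopt.so.3 (/usr/local/lib here) to LD_LIBRARY_PATH, then reload the file with source (until the machine is rebooted, you may need to source it by hand each time you open a terminal). The modified file looks like this: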

GPHOME=/usr/gpdb

# Replace with symlink path if it is present and correct
if [ -h ${GPHOME}/../greenplum-db ]; then
    GPHOME_BY_SYMLINK=`(cd ${GPHOME}/../greenplum-db/ && pwd -P)`
    if [ x"${GPHOME_BY_SYMLINK}" = x"${GPHOME}" ]; then
        GPHOME=`(cd ${GPHOME}/../greenplum-db/ && pwd -L)`/.
    fi
    unset GPHOME_BY_SYMLINK
fi
#setup PYTHONHOME
if [ -x $GPHOME/ext/python/bin/python ]; then
    PYTHONHOME="$GPHOME/ext/python"
fi
PYTHONPATH=$GPHOME/lib/python
PATH=$GPHOME/bin:$PYTHONHOME/bin:$PATH
LD_LIBRARY_PATH=$GPHOME/lib:/usr/local/lib:${LD_LIBRARY_PATH-}
export LD_LIBRARY_PATH
OPENSSL_CONF=$GPHOME/etc/openssl.cnf
export GPHOME
export PATH
export PYTHONPATH
export PYTHONHOME
export OPENSSL_CONF


Error:
20180208:01:57:59:012804 gpinitsystem:shsm002:gpdba-[INFO]:-Start Function CREATE_DATABASE
psql: FATAL: DTM initialization: failure during startup recovery, retry failed, check segment status (cdbtm.c:1513)
20180208:01:58:00:012804 gpinitsystem:shsm002:gpdba-[INFO]:-Start Function ERROR_CHK
20180208:01:58:00:012804 gpinitsystem:shsm002:gpdba-[INFO]:-End Function ERROR_CHK
20180208:01:58:00:012804 gpinitsystem:shsm002:gpdba-[INFO]:-Start Function ERROR_EXIT
20180208:01:58:00:gpinitsystem:shsm002:gpdba-[FATAL]:-Failed to complete create database testDB Script Exiting!
Fix: stop and disable the firewall (every firewall service).
Run:
# systemctl stop firewalld
# systemctl mask firewalld
# systemctl stop iptables
# systemctl disable iptables
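Another possible cause, for reference: shared_buffers is set too large. For guidance on sizing shared_buffers from the host's memory and the number of segments, see the official documentation. A common rule of thumb is to set aside 2 GB for other uses plus statement_mem multiplied by the number of segments, then divide what remains by the number of segments. This situation usually arises when shared_buffers was set explicitly during installation; the default is normally 125MB.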
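The rule of thumb above is simple arithmetic, sketched here as a shell function. This is one reading of the rule, not an official formula: the function name shared_buffers_mb is made up, all figures are in MB, and the defaults of 125 for statement_mem and 2048 for the reserve are assumptions of this example that should be replaced with the values from your own configuration.

```shell
# Per-segment ceiling for shared_buffers in MB:
#   (host memory - reserve - statement_mem * segments) / segments
# Usage: shared_buffers_mb TOTAL_MB SEGMENTS [STATEMENT_MEM_MB] [RESERVE_MB]
shared_buffers_mb() (
    total_mb=$1
    segments=$2
    statement_mem_mb=${3:-125}   # assumed statement_mem, in MB
    reserve_mb=${4:-2048}        # the 2 GB set aside for other uses
    echo $(( (total_mb - reserve_mb - statement_mem_mb * segments) / segments ))
)
```

For an 8 GB host with two primary segments, shared_buffers_mb 8192 2 prints 2947, since (8192 - 2048 - 250) / 2 = 2947. Treat the result as an upper bound rather than a recommended setting.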
