Ceph environment setup (2)
1. Layout: there are three hosts, node1, node2, and node3, and each host carries three OSDs, as shown in the figure. osd.1, 3, and 5 are SSD disks; osd.0, 2, and 4 are SATA disks. Each of the three hosts also runs a Monitor and an MDS. We use osd.1, 3, and 5 to create a pool named ssd with three replicas, and osd.0, 2, and 4 to build a pool named sata using erasure code with k = 2, m = 1, that is, two OSDs store data chunks and one OSD stores the coding (parity) chunk. osd.6, 7, and 8 form a pool named metadata that stores the CephFS metadata. The ssd and sata pools form a cache tier in writeback mode: ssd is the hot storage, i.e. the cache, and sata is the cold storage, i.e. the backing store. The sata and metadata pools are then used to build a CephFS, which is mounted at the /mnt/cephfs directory.
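Once the whole procedure below has been completed, the layout can be sanity-checked with a few read-only commands. This is only a suggested sketch; the pool names and mount point come from the plan above:
ceph osd tree                 # the nine OSDs should sit under the sata, ssd and metadata CRUSH roots
ceph osd lspools              # should list the ssd, sata and metadata pools
ceph osd dump | grep tier     # shows the ssd/sata cache-tier relationship and the writeback mode
ceph fs ls                    # shows the CephFS built on the metadata and sata pools
df -h /mnt/cephfs             # confirms the mount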
2. Steps
1. Install software
(1) Install dependencies: apt-get install autoconf automake autotools-dev libbz2-dev debhelper default-jdk git javahelper junit4 libaio-dev libatomic-ops-dev libbabeltrace-ctf-dev libbabeltrace-dev libblkid-dev libboost-dev ...
2. Create a Monitor
(3) Generate a monitor keyring: ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
(4) Generate an admin keyring, ceph.client.admin.keyring: ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --set-uid=0 --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'
(5) Import ceph.client.admin.keyring into ceph.mon.keyring: ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
(6) Create a monitor map containing a mon named node1 on node1, with /tmp/monmap holding the monmap: monmaptool --create --add node1 172.10.2.171 --fsid 2fc115bf-b7bf-439a-9c23-8f39f025a9da /tmp/monmap
(7) Create the folder that stores the monitor data (mainly the keyring and store.db): mkdir -p /var/lib/ceph/mon/ceph-node1
(8) Populate the mon daemon with the required initial data using the monitor map and keyring: ceph-mon --mkfs -i node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
(9) touch /var/lib/ceph/mon/ceph-node1/done
(10) Start the monitor: /etc/init.d/ceph start mon.node1
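A minimal check that the first monitor is up and answering (a suggested sketch; it assumes the admin keyring and a matching /etc/ceph/ceph.conf from the steps above are in place):
ceph mon stat   # should list node1 as the only monitor for now
ceph -s         # overall status; health warnings are expected until OSDs are added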
3. Add OSD
(1) Prepare (format) the disk: ceph-disk prepare --cluster ceph --cluster-uuid 2fc115bf-b7bf-439a-9c23-8f39f025a9da --fs-type xfs /dev/sdb
mkdir -p /var/lib/ceph/bootstrap-osd/
mkdir -p /var/lib/ceph/osd/ceph-0
(2) Mount (activate) the OSD: ceph-disk activate /dev/sdb1 --activate-key /var/lib/ceph/bootstrap-osd/ceph.keyring
(3) After adding the OSD to /etc/ceph/ceph.conf, /etc/init.d/ceph start can start all OSDs. After startup, check with ceph osd stat whether any OSD is still not up; if so, remove its upstart marker (for example rm -rf /var/lib/ceph/osd/ceph-2/upstart) and restart with /etc/init.d/ceph start.
(4) Set up passwordless login to node2 (this step can be skipped): ssh-keygen, then ssh-copy-id node2
(5) To add OSDs on the second node, copy the configuration over: scp /etc/ceph/ceph.conf root@172.10.2.173:/etc/ceph/ ; scp /etc/ceph/ceph.client.admin.keyring root@172.10.2.173:/etc/ceph/ ; scp /var/lib/ceph/bootstrap-osd/ceph.keyring root@172.10.2.173:/var/lib/ceph/bootstrap-osd/ ; then repeat operations (1)-(3) above. Do the same on the remaining node, so that node1, node2, and node3 each end up with three OSDs.
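Once (1)-(3) have been repeated on every node, a quick check such as the following (a suggested sketch, not part of the original steps) confirms that all nine OSDs joined the cluster:
ceph osd stat   # e.g. "9 osds: 9 up, 9 in"
ceph osd tree   # lists every OSD with its host and weight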
4. Create an mds instance and a file system
(1) Create a folder for storing the mds data: mkdir -p /var/lib/ceph/mds/ceph-node1/
(2) Generate the keyring of the mds; this step is required when cephx authentication is used: ceph auth get-or-create mds.node1 mon 'allow rwx' osd 'allow *' mds 'allow *' -o /var/lib/ceph/mds/ceph-node1/keyring
(4) Start the mds: /etc/init.d/ceph start mds.node1. Do the same on the other nodes.
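A short way to confirm the MDS daemons registered with the cluster (a suggested sketch):
ceph mds stat   # one active mds, the others standby, once all three nodes are done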
5. Add a Monitor to the second node
(1) ssh node2
(2) mkdir -p /var/lib/ceph/mon/ceph-node2
(3) ceph auth get mon. -o /tmp/ceph.mon.keyring
(4) ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
(5) ceph mon getmap -o /tmp/monmap
(6) ceph-mon --mkfs -i node2 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
(7) touch /var/lib/ceph/mon/ceph-node2/done
(8) rm -f /var/lib/ceph/mon/ceph-node2/upstart
(9) /etc/init.d/ceph start mon.node2
Do the same on node3.
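With monitors running on all three nodes, quorum can be verified (a suggested sketch using standard monitor commands):
ceph quorum_status --format json-pretty   # quorum_names should contain node1, node2 and node3
ceph mon stat                             # e.g. "3 mons at {node1=...,node2=...,node3=...}"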
At this point, ps -ef | grep ceph should show one mon process, one mds process, and three osd processes on each node; ceph -s can also be used to check. The configuration file is as follows:
[global]
fsid = 2fc115bf-b7bf-439a-9c23-8f39f025a9da
mon initial members = node1, node2, node3
mon host = 172.10.2.171, 172.10.2.172, 172.10.2.173
public network = 172.10.2.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = 1024
filestore xattr use omap = true
osd pool default size = 3
osd pool default min size = 1
osd pool default pg num = 333
osd pool default pgp num = 333
osd crush chooseleaf type = 1
[mon.node1]
host = node1
mon addr = 172.10.2.171:6789
[mon.node2]
host = node2
mon addr = 172.10.2.172:6789
[mon.node3]
host = node3
mon addr = 172.10.2.173:6789
[osd]
osd crush update on start = false
[osd.0]
host = node1
addr = 172.10.2.171:6789
[osd.1]
host = node1
addr = 172.10.2.171:6789
[osd.2]
host = node2
addr = 172.10.2.172:6789
[osd.3]
host = node2
addr = 172.10.2.172:6789
[osd.4]
host = node3
addr = 172.10.2.173:6789
[osd.5]
host = node3
addr = 172.10.2.173:6789
[osd.6]
host = node3
addr = 172.10.2.173:6789
[osd.7]
host = node2
addr = 172.10.2.172:6789
[osd.8]
host = node1
addr = 172.10.2.171:6789
[mds.node1]
host = node1
[mds.node2]
host = node2
[mds.node3]
host = node3
6. Modify the crushmap
(1) Obtain the crush map: ceph osd getcrushmap -o compiled-crushmap-filename
(2) Decompile it: crushtool -d compiled-crushmap-filename -o decompiled-crushmap-filename
(3) Edit decompiled-crushmap-filename and add rulesets. There should be three roots in total, one for each of the three pools; establish the correspondence between each root and its OSDs, reference that root from the matching ruleset with "step take", and set the rule type (replicated or erasure) for the pool.
(4) Compile it: crushtool -c decompiled-crushmap-filename -o compiled-crushmap-filename
(5) Set the crush map: ceph osd setcrushmap -i compiled-crushmap-filename
The edited crushmap is as follows:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
root sata {
    id -1       # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0      # rjenkins1
    item osd.0 weight 0.1
    item osd.2 weight 0.1
    item osd.4 weight 0.1
}
root ssd {
    id -8       # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0      # rjenkins1
    item osd.1 weight 0.1
    item osd.3 weight 0.1
    item osd.5 weight 0.1
}
root metadata {
    id -9       # do not change unnecessarily
    # weight 0.000
    alg straw
    hash 0      # rjenkins1
    item osd.7 weight 0.1
    item osd.6 weight 0.1
    item osd.8 weight 0.1
}
rule ssd {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take ssd
    step chooseleaf firstn 0 type osd
    step emit
}
rule sata {
    ruleset 0
    type erasure
    min_size 1
    max_size 10
    step take sata
    step chooseleaf firstn 0 type osd
    step emit
}
rule metadata {
    ruleset 2
    type replicated
    min_size 1
    max_size 10
    step take metadata
    step chooseleaf firstn 0 type osd
    step emit
}
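Before injecting the edited map (and again afterwards), crushtool can simulate placements to confirm that each ruleset only selects OSDs from its own root. This is a suggested sketch using the file names from steps (1)-(5):
crushtool --test -i compiled-crushmap-filename --rule 1 --num-rep 3 --show-utilization   # ssd rule: only osd.1, 3, 5 should be used
crushtool --test -i compiled-crushmap-filename --rule 0 --num-rep 3 --show-mappings      # sata rule: only osd.0, 2, 4 should appear
ceph osd tree   # after setcrushmap, the OSDs should appear under the sata, ssd and metadata roots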
7. Create a pool
(1) Create the ssd pool with type replicated. Command prototype: ceph osd pool create {pool-name} {pg-num} [{pgp-num}] [replicated] [crush-ruleset-name]. Actual command: ceph osd pool create ssd 128 128 replicated ssd
(2) Create the sata pool with type erasure. Command prototype: ceph osd pool create {pool-name} {pg-num} {pgp-num} erasure [erasure-code-profile] [crush-ruleset-name]. Actual command: ceph osd pool create sata 128 128 erasure default sata. To see which erasure-code-profiles exist, run ceph osd erasure-code-profile ls; to see the contents of a specific profile, run ceph osd erasure-code-profile get default, which returns: directory=/usr/lib/ceph/erasure-code, k=2, m=1, plugin=jerasure, technique=reed_sol_van. The erasure-code-profile is important and cannot be changed once it has been set and applied to a pool. A profile is defined with, for example: ceph osd erasure-code-profile set myprofile k=3 m=2 ruleset-failure-domain=rack
(3) Create the metadata pool with type replicated: ceph osd pool create metadata 128 128 replicated metadata
ceph pg dump shows the PG status, ceph osd lspools lists the existing pools, and ceph osd tree shows the OSD information.
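A few more optional checks and a quick functional test (a suggested sketch; test-obj is just an illustrative object name, and erasure_code_profile is assumed to be a valid get key on this release):
ceph osd pool get sata erasure_code_profile   # should report the default profile (k=2, m=1)
ceph osd pool get ssd size                    # should report 3 replicas
rados -p ssd put test-obj /etc/hosts          # write a small test object into the ssd pool
ceph osd map ssd test-obj                     # shows the PG and OSDs the object maps to
rados -p ssd rm test-obj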
8. Create a cache tier
A cache tier has two modes, writeback and readonly. In writeback mode, writes go to the cache pool and an ACK is returned to the client; the cache pool later flushes the data to the storage pool. On a read, if the data is only in the storage pool it is first copied into the cache pool and then returned to the client. In readonly mode, writes go directly to the storage pool. On a read, the data is copied from the storage pool into the cache pool first; any stale data in the cache pool is cleared and the data is then returned to the client. This mode does not guarantee consistency between the two tiers, so a read may well return stale data from the cache pool; it is therefore not suitable for scenarios where data changes frequently.
(1) Create the tier. Command prototype: ceph osd tier add {storagepool} {cachepool}. Actual command: ceph osd tier add sata ssd
(2) Set the tier mode; there are two modes, writeback and readonly. Command prototype: ceph osd tier cache-mode {cachepool} {cache-mode}. Actual command: ceph osd tier cache-mode ssd writeback
(3) This operation is required for a writeback cache. Command prototype: ceph osd tier set-overlay {storagepool} {cachepool}. Actual command: ceph osd tier set-overlay sata ssd
(4) Set the cache parameters. Command prototype for setting a parameter: ceph osd pool set {cachepool} {key} {value}; command for reading a parameter back: ceph osd pool get {cachepool} {key}
ceph osd pool set ssd hit_set_type bloom
ceph osd pool set ssd hit_set_count 1
ceph osd pool set ssd hit_set_period 3600
ceph osd pool set ssd target_max_bytes 1000000000000
ceph osd pool set ssd cache_target_dirty_ratio 0.4
ceph osd pool set ssd cache_target_full_ratio 0.8
ceph osd pool set ssd target_max_objects 1000000
ceph osd pool set ssd cache_min_flush_age 600
ceph osd pool set ssd cache_min_evict_age 1800
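The values just set can be read back with the get form from step (4), for example (a suggested sketch):
ceph osd pool get ssd hit_set_period
ceph osd pool get ssd cache_target_dirty_ratio
ceph osd pool get ssd target_max_bytes
ceph osd dump | grep ssd   # the pool line also shows cache_mode writeback and the tier relationship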
9. Create cephfs
Command prototype: ceph fs new <fs_name> <metadata> <data>
Actual command: ceph fs new cephfs metadata sata. You can use ceph fs ls to view the CephFS status.
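For example (a suggested check):
ceph fs ls      # name: cephfs, metadata pool: metadata, data pools: [sata]
ceph mds stat   # should now report an active mds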
10. Mount cephfs
(1) Create a mount point: mkdir /mnt/cephfs
(2) Obtain the secret: ceph-authtool --print-key /etc/ceph/ceph.client.admin.keyring returns AQBNw5dU9K5MCxAAxnDaE0f9UCA/zAWo/hfnSg==, or read it directly from the ceph.client.admin.keyring file.
(3) If the kernel itself supports CephFS, mount with: mount -t ceph node1:6789:/ /mnt/cephfs -o name=admin,secret=AQBNw5dU9K5MCxAAxnDaE0f9UCA/zAWo/hfnSg==
If the kernel does not support CephFS (RedHat, for example, does not), mount with FUSE instead: ceph-fuse -m node1:6789 /mnt/cephfs. With cephx authentication, make sure ceph.client.admin.keyring is present in /etc/ceph; the secret is taken from that file.
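Once mounted, a quick functional test and an optional /etc/fstab entry for the kernel client might look like the following (a suggested sketch; the secret is the one printed above, and keeping it in a secretfile is safer than writing it into fstab):
df -h /mnt/cephfs
echo hello > /mnt/cephfs/testfile && cat /mnt/cephfs/testfile
# /etc/fstab entry for the kernel client:
node1:6789:/  /mnt/cephfs  ceph  name=admin,secret=AQBNw5dU9K5MCxAAxnDaE0f9UCA/zAWo/hfnSg==,noatime  0  2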