GlusterFS Distributed Storage: Deployment and Use

Source: Internet
Author: User
Tags: disk usage, glusterfs, automake, gluster

GlusterFS is an easy-to-use distributed file storage system. It implements all standard POSIX interfaces and uses FUSE to present itself to users as a local disk, so a program can switch from a local disk to GlusterFS without modifying any code, making the migration seamless. Making multiple machines appear to share the same disk also simplifies a lot of application logic. If a single machine's disk is no longer enough for your application, GlusterFS is worth considering.

First, GlusterFS Source Installation
1. Installing GlusterFS Dependencies

A. Installation with yum under CentOS

yum install -y flex bison openssl-devel libacl-devel sqlite-devel libxml2-devel libtool automake autoconf gcc attr

liburcu-bp must be installed from source, since it is not available in the yum repositories.
After entering the source directory, run the usual build commands:

./bootstrap
./configure
make
sudo make install

After installation, the following two commands are required so that the system can find liburcu:

sudo ldconfig
sudo pkg-config --libs --cflags liburcu-bp liburcu

B. Installation with apt-get under Ubuntu

sudo apt-get install flex bison libssl-dev libacl1-dev libsqlite3-dev libxml2-dev liburcu-dev automake autoconf gcc attr

D. Optional installation
In addition, if you want geo-replication, extra packages are required and the SSH service must be enabled:

yum install -y passwd openssh-client openssh-server
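Geo-replication also relies on passwordless SSH between the master and slave nodes; a minimal sketch (the hostname is a placeholder):

sudo service sshd start
ssh-keygen -t rsa
ssh-copy-id root@slave-host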

E. Additional steps under Docker
If you have only one machine and want to test a cluster, consider using Docker. However, Docker imposes some restrictions on applications, so GlusterFS cannot be used directly in a container; a few extra steps are needed.
① Install attr:

yum install -y attr

② Create the fuse device node manually if /dev/fuse does not exist:

mknod /dev/fuse c 10 229

③ Run the container with elevated privileges:
docker run --privileged=true
For example:

sudo docker run --privileged=true -it -h glfs0 -v /dk/d0:/d --name=glfs0 gfs7:2 /bin/bash
or
sudo docker run --privileged=true -it --rm -v /dk/rm0:/d gfs7:2 /bin/bash

④ Mount a local volume and place the data files under the local volume's directory; otherwise the disk's extended attributes cannot be used.

2. Compiling and Installing GlusterFS
After installing the dependencies above, download the source code from the official website http://www.gluster.org/download/ and compile GlusterFS with the usual commands. Add --enable-debug to configure to build a debug version with debugging information:

./configure --prefix=/usr
make
sudo make install
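To confirm the build installed correctly (assuming /usr/sbin is on your PATH), a quick check:

glusterfs --version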


Second, Starting and Stopping the GlusterFS Service
Most GlusterFS commands must be run as root; without root, all sorts of errors appear, so the commands here are prefixed with sudo. If you are logged in directly as root, omit the sudo.
1. Start commands

sudo service glusterd start
or
sudo /etc/init.d/glusterd start
or
sudo glusterd
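To verify that the daemon is running:

sudo service glusterd status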

2. Stop commands

sudo service glusterd stop
or
sudo /etc/init.d/glusterd stop
or
ps aux | grep glusterd
sudo kill xxxxxx-pid
or, under Ubuntu,
sudo killall glusterd

If you kill the process directly, it is safer to stop every volume first.


Third, Cluster Association
1. Prepare several machines (physical machines, VMs, or Docker containers). Here I started 4 Docker containers with IPs 172.17.0.2 ~ 172.17.0.5.
2. Start the Glusterfs service on each machine, as in the previous section.
3. Get the IP or hostname of each machine
4. Execute the association commands on the first machine (172.17.0.2):

sudo gluster peer probe 172.17.0.3
sudo gluster peer probe 172.17.0.4
sudo gluster peer probe 172.17.0.5
...

This connects all the machines together. Note the semantics of the command: the existing cluster invites a new member to join its organization.
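You can verify the association from any node with the standard status check:

sudo gluster peer status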


Fourth, Volume Operations
1. Create Volume
A. Single disk; recommended for debugging environments.

sudo gluster volume create vol_name 172.17.0.2:/d/disk0

B. Multiple disks, no RAID; recommended for test environments.

sudo gluster volume create vol_name 172.17.0.2:/d/disk0 172.17.0.3:/d/disk0 172.17.0.4:/d/disk0 172.17.0.5:/d/disk0

C. Multiple disks with RAID1; recommended for high-concurrency production environments.

sudo gluster volume create vol_name replica 2 172.17.0.2:/d/disk0 172.17.0.3:/d/disk0 172.17.0.4:/d/disk0 172.17.0.5:/d/disk0

Note: the number of bricks in the command above must be an integer multiple of the replica count.
There are also RAID0-, RAID10-, RAID5-, and RAID6-style modes, but they are not recommended for online small-file clusters.
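After creating a volume, you can inspect its type, brick list, and replica count:

sudo gluster volume info vol_name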

2. Start Volume
The newly created volume is not running yet; you must execute the start command before it can be used.

sudo gluster volume start vol_name
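To confirm the volume and its brick processes are online:

sudo gluster volume status vol_name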

3. Mount Volume

sudo mkdir /local_mount_dir
sudo mount -t glusterfs -o acl 172.17.0.2:/vol_name /local_mount_dir
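To remount automatically after a reboot, an /etc/fstab entry along the following lines can be used (a sketch; _netdev delays mounting until the network is up):

172.17.0.2:/vol_name /local_mount_dir glusterfs defaults,acl,_netdev 0 0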

4. Using Glusterfs
A. Once a GlusterFS volume is mounted, it can be accessed like local files, using only the native file APIs in your code (see the sketch below). This method does not necessarily require root permission; you only need permission on the corresponding directories or files.
B. Direct API mode, which requires root permission to use; the Java, Python, and Ruby API wrappers are currently incomplete, so it is generally not recommended.
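As a sketch of method A, ordinary shell tools work directly on the mount point (the file name is arbitrary):

echo "hello gluster" > /local_mount_dir/test.txt
cat /local_mount_dir/test.txt
rm /local_mount_dir/test.txt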

5. Unmount Volume
Unmounting is the counterpart of mounting. Although you can stop a volume without unmounting it, doing so causes problems, and in a large cluster the volume may subsequently fail to start.

sudo umount/local_mount_dir

6. Stop volume
Stopping is the counterpart of starting. It is a good idea to unmount all clients before stopping a volume.

sudo gluster volume stop vol_name

7. Delete Volume
Deleting is the counterpart of creating. You must stop a volume before deleting it. Volumes are generally not deleted in production.

sudo gluster volume delete vol_name

8. Online Repair
When a disk is damaged, it needs to be replaced with a new one. If a spare disk is reserved in the cluster, swap the damaged disk for the spare and run the following two commands:

sudo gluster volume replace-brick vol_name 172.17.0.3:/d/damaged_disk 172.17.0.16:/d/new_disk commit
sudo gluster volume heal vol_name full
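Repair runs in the background; its progress can be watched with:

sudo gluster volume heal vol_name info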

9. Online expansion
As the business grows, cluster capacity runs short, and more machines and disks need to be added to the cluster.
A. In the usual case you only need to widen the distribution. The number of disks added must be an integer multiple of the smallest expansion unit, that is, replica × stripe, or the disperse count:

sudo gluster volume add-brick vol_name 172.17.0.11:/d/disk0 172.17.0.12:/d/disk0 172.17.0.13:/d/disk0 172.17.0.14:/d/disk0

After this command completes, the newly added disks are not actually used yet; the data must be rebalanced:

sudo gluster volume rebalance vol_name start
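Rebalancing can take a long time on a large volume; check its progress with:

sudo gluster volume rebalance vol_name status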

B. When the cluster reaches a certain size, you may want to increase the replica count (the first value shown in gluster volume info). You must add a parameter so the system knows the data's replica count has changed. Assuming the original replica count is 2 and you want 3, the command is as follows:

sudo gluster volume add-brick vol_name replica 3 172.17.0.11:/d/disk0 172.17.0.12:/d/disk0 172.17.0.13:/d/disk0 172.17.0.14:/d/disk0

After the add-brick command executes, the new disks are not actually used yet and the system does not copy data automatically; you must trigger a repair so the system reaches the newly specified replica count:

sudo gluster volume heal vol_name full

Note: add only one extra replica at a time; adding more than one at once may go wrong in the current version.

10. Online shrinkage
Perhaps the original deployment was oversized, or some of the storage machines are wanted for other purposes. As with expansion, there are two cases.
A. Reducing the distribution width. The removed disks must form one or more whole storage units, which appear as a contiguous run of disks in the volume info brick list. This command balances the data automatically.

sudo gluster volume remove-brick vol_name 172.17.0.11:/d/disk0 172.17.0.12:/d/disk0 172.17.0.13:/d/disk0 172.17.0.14:/d/disk0 start

After starting, watch the state of the removal (in fact an automatic rebalance) until it changes from 'in progress' to 'completed':

sudo gluster volume remove-brick vol_name 172.17.0.11:/d/disk0 172.17.0.12:/d/disk0 172.17.0.13:/d/disk0 172.17.0.14:/d/disk0 status

Once the status shows completed, commit the removal:

sudo gluster volume remove-brick vol_name 172.17.0.11:/d/disk0 172.17.0.12:/d/disk0 172.17.0.13:/d/disk0 172.17.0.14:/d/disk0 commit

B. Reducing the replica count. The removed disks must satisfy the layout constraints (hard to state concisely); in the volume info brick list they are generally scattered entries (though the IPs may be contiguous). This command does not need to rebalance data.

sudo gluster volume remove-brick vol_name replica 2 172.17.0.11:/d/disk0 172.17.0.12:/d/disk0 172.17.0.13:/d/disk0 172.17.0.14:/d/disk0 force

Reducing the replica count simply drops the bricks, and the command ends with the force parameter, so if the data had not been fully replicated beforehand, part of it will be lost. This operation therefore demands extreme caution. First ensure data integrity: run sudo gluster volume heal vol_name full to repair, then check sudo gluster volume heal vol_name info and sudo gluster volume status to confirm the data is in a normal state.
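A suggested pre-check sequence before the forced removal, collecting the commands above (repair first, then verify):

sudo gluster volume heal vol_name full
sudo gluster volume heal vol_name info
sudo gluster volume status vol_name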

11. Quota Settings
A. A volume is often shared by several systems at once. To ease management, you can put a disk quota on a first- or second-level directory, preventing one system from using too much space and disturbing the others.

sudo gluster volume quota vol_name enable
sudo gluster volume quota vol_name limit-usage /srv_a 10GB
sudo gluster volume quota vol_name limit-usage /srv_b 200MB

B. View current quota usage, displayed as a fairly intuitive list:

sudo gluster volume quota vol_name list

C. Remove the quota on a directory:

sudo gluster volume quota vol_name remove/srv_a

D. Disable quotas. Use this with caution, because it clears all quotas, which is often not what you intended: after re-enabling, the previously configured quotas are gone. It is fine, of course, if you plan to reconfigure every directory anyway.

sudo gluster volume quota vol_name disable

E. If you do not intend to devote all disks to GlusterFS, you can set a quota on the root directory. Since GlusterFS cannot make full use of all disk space anyway, it is best to set the size slightly smaller than the actual space:

sudo gluster volume quota vol_name limit-usage / 100TB

F. To present this quota as the disk size, execute the following command; df will then report the quota as the capacity. Note that quotas use base-1024 units, not the base-1000 units used for disks. If the quota exceeds the actual disk capacity, df still shows the quota amount, so do not set it that way.

sudo gluster volume set vol_name quota-deem-statfs on

The quotas above limit disk usage; GlusterFS also provides quotas on the number of files, via limit-objects and list-objects, which can be used as the scenario requires.
Note that quota enforcement is not instantaneous: when a directory reaches its target size there is a time window (a few seconds) during which data can still be written. With reasonably large quotas this has little impact, since usage generally cannot overshoot by much in such a short period.
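A sketch of the file-count quota commands mentioned above (the directory and count are illustrative, and these subcommands may not exist in older releases):

sudo gluster volume quota vol_name limit-objects /srv_a 10000
sudo gluster volume quota vol_name list-objects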

12. RAID selection
RAID1: suitable for online scenarios with small and medium files; the create commands are as described above.

Single disk, no RAID, and RAID0: these three modes are only suitable for experimental environments where data loss is acceptable; once data is lost, you essentially start over.
RAID0: suitable for large-file experimental environments.

sudo gluster volume create vol_name stripe 3 172.17.0.2:/d/disk0 172.17.0.3:/d/disk0 172.17.0.4:/d/disk0 172.17.0.5:/d/disk0 172.17.0.6:/d/disk0 172.17.0.7:/d/disk0

RAID10: suitable for large-file scenarios.

sudo gluster volume create vol_name replica 2 stripe 3 172.17.0.2:/d/disk0 172.17.0.3:/d/disk0 172.17.0.4:/d/disk0 172.17.0.5:/d/disk0 172.17.0.6:/d/disk0 172.17.0.7:/d/disk0

RAID5, RAID6, and the like are generally unsuitable for online environments but suit geo-backup environments. Because the RAID5-style functionality is implemented in software, it carries CPU overhead, and when a disk fails the reconstruction costs even more CPU; under online pressure this easily produces large delays. If online read/write pressure is low, they can still be considered.

RAID5: not recommended; it is not well balanced, its fault tolerance is too low, and its overhead is relatively high.

sudo gluster volume create vol_name disperse 6 redundancy 1 172.17.0.2:/d/disk0 172.17.0.3:/d/disk0 172.17.0.4:/d/disk0 172.17.0.5:/d/disk0 172.17.0.6:/d/disk0 172.17.0.7:/d/disk0

RAID6: usable; better balanced than RAID5, with much higher fault tolerance at only slightly higher cost.

sudo gluster volume create vol_name disperse 7 redundancy 2 172.17.0.2:/d/disk0 172.17.0.3:/d/disk0 172.17.0.4:/d/disk0 172.17.0.5:/d/disk0 172.17.0.6:/d/disk0 172.17.0.7:/d/disk0 172.17.0.8:/d/disk0

RAID recommendation for a more secure offline geo-backup cluster (tolerates up to half the disks failing; high fault tolerance, data availability up to ten nines):

sudo gluster volume create vol_name disperse 10 redundancy 5 172.17.0.2:/d/disk0 172.17.0.3:/d/disk0 172.17.0.4:/d/disk0 172.17.0.5:/d/disk0 172.17.0.6:/d/disk0 172.17.0.7:/d/disk0 172.17.0.8:/d/disk0 172.17.0.9:/d/disk0 172.17.0.10:/d/disk0 172.17.0.11:/d/disk0


Fifth, System Characteristics
1. Cache, Data consistency
The client has a cache, but it provides no data consistency: cache updates happen on a schedule, by default every 1 second. That is, after a client modifies or deletes a file, it can take 1 second for the whole cluster to become fully aware of it, so data may be inconsistent within that second. GlusterFS is therefore not suitable for data with strong consistency requirements. For files such as images and audio, impose the constraint in the application: never modify an image, only add new ones, and the consistency problem does not arise.
The client cache brings a performance boost, so once a cluster reaches a certain size it is worth planning which files each client accesses, to improve cache utilization.
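If a shorter inconsistency window is required, the FUSE metadata cache timeouts can be lowered at mount time, at the cost of extra metadata traffic; a sketch:

sudo mount -t glusterfs -o attribute-timeout=0,entry-timeout=0 172.17.0.2:/vol_name /local_mount_dir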

2. Users, Permissions
GlusterFS users are ordinary Linux users. A Linux user has two attributes: a user name and a user ID (UID). The file system always records a file's ownership by UID, and it is the UID that is passed between GlusterFS nodes. For example, suppose user user1 with UID 1001 creates file A with permissions 0600.
On another computer, a user also named user1 but with UID 1002 cannot access file A.
That computer, however, has a user user3 with UID 1001, the same UID as the original user1, and that user can access file A.
To keep user names and UIDs from conflicting, specify the UID explicitly when creating system users, choosing a range the system will not auto-assign, so the whole cluster can use consistent user permissions.
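For example, a minimal sketch that creates the same user with a fixed UID on every node (the name and UID are illustrative):

sudo groupadd -g 1001 user1
sudo useradd -u 1001 -g 1001 user1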

Sixth, Cluster Size

Each client (mount process) connects to every brick, using ports below 1024, so the number of bricks must not exceed 1024.
In replica mode, each server runs a glusterfs process that connects to every brick as a client; if you also intend to mount on those servers, the maximum number of bricks is under 500.
Because this consumes the limited supply of ports below 1024, take care to reserve commonly used ports such as 21, 22, 80, and 443, so that critical services are not left unable to start because their port is occupied.
Modify /etc/sysctl.conf and add a line; set the specific ports as needed:

net.ipv4.ip_local_reserved_ports=0-25,80,443

Then execute the sysctl command to make it effective:

sudo sysctl -p

For online clusters at different scales, the recommended configurations are shown in the table below. Assume each machine in the table has 12 disks: on average 10 disks are available for storage, while the other 2 are used for the operating system, logs, backups, and other functions.

[Figure: GlusterFS cluster configurations by number of machines (original image: "Glusterfs number of clusters.png")]

The system can be expanded continuously over time, up to a maximum scale of about 500 machines.

Keep some spare disks or standby machines on hand so that data can be repaired promptly when a disk or machine fails. At a scale of 2-15 machines, keep 1-3 spare disks; at a scale of 15-500 machines, keep 1-3 spare machines.
When the cluster is large, make sure the replicas of each piece of data sit on different machines, so the whole system remains usable when any one machine goes down.

This article is from the "Bar Bar" blog, please be sure to keep this source http://bangbangba.blog.51cto.com/3180873/1712061
