MFS Distributed File System
MFS (MooseFS) is a semi-distributed file system developed in Poland. It provides RAID-like redundancy in software, which reduces storage costs while holding up well against professional storage systems, and it supports online scaling.
In a distributed file system, the physical storage resources managed by the file system are not necessarily attached directly to the local node; they are reached through nodes connected over a computer network.
The advantages of a distributed file system are centralized access, simplified operations, disaster tolerance for data, and improved file access performance.
Architecture of the MFS file system:
- Metadata server (master): manages the file system and maintains the metadata of the whole system;
- Metalogger: backs up the change log files of the master server. The files are named changelog_ml.*.mfs. When data on the master server is lost or damaged, these files can be fetched from the metalogger for recovery;
- Data storage server (chunkserver): the server that actually stores data. A file is saved in blocks (chunks), and the chunks are copied between data servers. The more data servers there are, the larger the available capacity, the higher the reliability, and the better the performance;
- Client: mounts the MFS file system in the same way as an NFS share and uses it with the same operations.
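As a rough illustration of how block-wise storage and copies add up, the sketch below estimates how many chunks a file occupies and how much raw space its copies consume. It assumes the MooseFS maximum chunk size of 64 MiB and ignores per-chunk overhead; the function names `chunks_needed` and `raw_bytes` are made up for this example.

```shell
CHUNK_BYTES=$((64 * 1024 * 1024))   # assumed maximum chunk size: 64 MiB

# Number of chunks a file of $1 bytes is split into (rounded up).
chunks_needed() {
    echo $(( ($1 + CHUNK_BYTES - 1) / CHUNK_BYTES ))
}

# Raw bytes consumed across all chunkservers by $1 bytes kept in $2 copies.
raw_bytes() {
    echo $(( $1 * $2 ))
}

chunks_needed 157286400      # a 150 MiB file, prints 3
raw_bytes 157286400 3        # the same file in three copies, prints 471859200
```

With three chunkservers and three copies, every server ends up holding a full copy of the file, so adding copies buys reliability at the cost of usable capacity.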
MFS data reading process:
- The client sends a read request to the metadata server;
- The metadata server tells the client where the required data is stored (the chunkserver IP address and the chunk number);
- The client sends a data request to that chunkserver;
- The chunkserver sends data to the client.
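The read path can be pictured with a small simulation. This is only a sketch of the message order, not MooseFS code: `master_locate`, `chunkserver_read`, and `client_read` are hypothetical stand-ins, and the file name and chunk IDs are invented. The IP addresses are the chunkservers from the environment used in this article.

```shell
# Hypothetical stand-in for the master's metadata lookup:
# maps "path:chunk index" to "chunkserver-ip chunk-id".
master_locate() {
    case "$1:$2" in
        /report.txt:0) echo "192.168.1.13 chunk-0001" ;;
        /report.txt:1) echo "192.168.1.14 chunk-0002" ;;
        *) return 1 ;;
    esac
}

# Hypothetical stand-in for a chunkserver answering a data request.
chunkserver_read() {  # $1 = chunkserver IP, $2 = chunk id
    echo "data of $2 from $1"
}

# Client side: ask the master first, then fetch from the chunkserver it named.
client_read() {  # $1 = path, $2 = chunk index
    location=$(master_locate "$1" "$2") || { echo "no such chunk" >&2; return 1; }
    set -- $location          # split into IP and chunk id
    chunkserver_read "$1" "$2"
}

client_read /report.txt 1
```

The point of the design is that the master only hands out locations; the file data itself never passes through it.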
MFS data writing process:
- The client sends a write request to the metadata server;
- The metadata server interacts with the chunkservers, but it only creates new chunks on some of them. Once the chunks are created, the chunkservers notify the metadata server that the operation succeeded;
- The metadata server tells the client which chunkserver the data can be written to;
- The client writes data to the specified chunkserver;
- The chunkserver synchronizes data with other chunkservers. After successful synchronization, the chunkserver informs the client that data is written successfully;
- The client informs the metadata server that the write is complete.
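The ordering of the write steps, in particular that the client is acknowledged only after the copies are synchronized, can be traced with a toy function. `write_chunk` is a hypothetical name and the function only prints the message sequence described above; it does not talk to a real cluster.

```shell
# Prints the message sequence for one write.
# $1 = payload, $2 = chunkserver chosen by the master, remaining args = other chunkservers.
write_chunk() {
    payload=$1; shift
    primary=$1; shift
    echo "client -> $primary: $payload"
    for replica in "$@"; do
        echo "$primary -> $replica: sync"
    done
    echo "$primary -> client: write ok"        # sent only after all syncs
    echo "client -> master: write complete"
}

write_chunk hello 192.168.1.13 192.168.1.14 192.168.1.15
```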
System Environment
| Host | Operating System | IP address |
|---|---|---|
| Master Server | CentOS 7.3 x86_64 | 192.168.1.11 |
| Metalogger | CentOS 7.3 x86_64 | 192.168.1.12 |
| Chunk1 | CentOS 7.3 x86_64 | 192.168.1.13 |
| Chunk2 | CentOS 7.3 x86_64 | 192.168.1.14 |
| Chunk3 | CentOS 7.3 x86_64 | 192.168.1.15 |
| Client | CentOS 7.3 x86_64 | 192.168.1.22 |
Start deployment
Master Server:
- Add the GPG key
# curl "https://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS
- Add the repository entry
# curl "http://ppa.moosefs.com/MooseFS-3-el7.repo" > /etc/yum.repos.d/MooseFS.repo
- Install the mfsmaster software packages
# yum -y install moosefs-master moosefs-cgi moosefs-cgiserv moosefs-cli
- Confirm that the installation generated the configuration files (mfsexports.cfg, mfsmaster.cfg, etc.) under /etc/mfs
The following configuration files keep their default values and do not need to be modified: mfsmaster.cfg, mfsexports.cfg, mfstopology.cfg
- Start mfsmaster and check whether the process is running
# mfsmaster start
# ps -ef | grep mfs
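Besides looking at the process list, it can help to confirm that the master is listening on its ports. The helper below is a sketch: `missing_ports` is a made-up name, and the port list assumes the defaults (9419 for metaloggers and 9420 for chunkservers, as named in the configuration comments in this article, plus 9421, which is assumed here to be the default client port).

```shell
# Reads `ss -ltn` output on stdin and prints every expected port
# (given as arguments) that has no listening socket.
missing_ports() {
    listing=$(cat)
    for port in "$@"; do
        printf '%s\n' "$listing" | grep -Eq "[:.]${port}[[:space:]]" || echo "$port"
    done
}

# On the master server (prints nothing when all three ports are open):
#   ss -ltn | missing_ports 9419 9420 9421
```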
Metalogger:
- Add the GPG key
# curl "https://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS
- Add the repository entry
# curl "http://ppa.moosefs.com/MooseFS-3-el7.repo" > /etc/yum.repos.d/MooseFS.repo
- Install the mfsmetalogger package
# yum -y install moosefs-metalogger
- Edit the configuration file mfsmetalogger.cfg and set the IP address of the master
# vim /etc/mfs/mfsmetalogger.cfg
    ###############################################
    # RUNTIME OPTIONS                             #
    ###############################################

    # user to run daemon as (default is mfs)
    # WORKING_USER = mfs

    # group to run daemon as (optional - if empty then default user group will be used)
    # WORKING_GROUP = mfs

    # name of process to place in syslog messages (default is mfsmetalogger)
    # SYSLOG_IDENT = mfsmetalogger

    # whether to perform mlockall() to avoid swapping out mfsmetalogger process (default is 0, i.e. no)
    # LOCK_MEMORY = 0

    # Linux only: limit malloc arenas to given value - prevents server from using huge amount of virtual memory (default is 4)
    # LIMIT_GLIBC_MALLOC_ARENAS = 4

    # Linux only: disable out of memory killer (default is 1)
    # DISABLE_OOM_KILLER = 1

    # nice level to run daemon with (default is -19; note: process must be started as root to increase priority, if setting of priority fails, process retains the nice level it started with)
    # NICE_LEVEL = -19

    # set default umask for group and others (user has always 0, default is 027 - block write for group and block all for others)
    # FILE_UMASK = 027

    # where to store daemon lock file (default is /var/lib/mfs)
    # DATA_PATH = /var/lib/mfs

    # number of metadata change log files (default is 50)
    # BACK_LOGS = 50

    # number of previous metadata files to be kept (default is 3)
    # BACK_META_KEEP_PREVIOUS = 3

    # metadata download frequency in hours (default is 24, should be at least BACK_LOGS/2)
    # META_DOWNLOAD_FREQ = 24

    ###############################################
    # MASTER CONNECTION OPTIONS                   #
    ###############################################

    # delay in seconds before next try to reconnect to master if not connected (default is 5)
    # MASTER_RECONNECTION_DELAY = 5

    # local address to use for connecting with master (default is *, i.e. default local address)
    # BIND_HOST = *

    # MooseFS master host, IP is allowed only in single-master installations (default is mfsmaster)
    # modify the IP address
    MASTER_HOST = 192.168.1.11

    # MooseFS master supervisor port (default is 9419)
    # MASTER_PORT = 9419

    # timeout in seconds for master connections (default is 10)
    # MASTER_TIMEOUT = 10
- Start mfsmetalogger and check whether the process is running
# mfsmetalogger start
# ps -ef | grep mfs
Chunkservers:
The three data storage servers are configured identically.
- Add the GPG key
# curl "https://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS
- Add the repository entry
# curl "http://ppa.moosefs.com/MooseFS-3-el7.repo" > /etc/yum.repos.d/MooseFS.repo
- Install the chunkserver package
# yum -y install moosefs-chunkserver
- Edit the chunkserver configuration file and set the IP address of the master
# vim /etc/mfs/mfschunkserver.cfg
    ###############################################
    # MASTER CONNECTION OPTIONS                   #
    ###############################################

    # labels string (default is empty - no labels)
    # LABELS =

    # local address to use for master connections (default is *, i.e. default local address)
    # BIND_HOST = *

    # MooseFS master host, IP is allowed only in single-master installations (default is mfsmaster)
    # modify the IP address
    MASTER_HOST = 192.168.1.11

    # MooseFS master command port (default is 9420)
    # MASTER_PORT = 9420

    # timeout in seconds for master connections. Value >0 forces given timeout, but when value is 0 then CS asks master for timeout (default is 0 - ask master)
    # MASTER_TIMEOUT = 0

    # delay in seconds before next try to reconnect to master if not connected
    # MASTER_RECONNECTION_DELAY = 5

    # authentication string (used only when master requires authorization)
    # AUTH_CODE = mfspassword
- Specify the storage location that the data storage server provides to MFS
# vim /etc/mfs/mfshdd.cfg
    ... (part of the file omitted) ...
    # This file keeps definitions of mounting points (paths) of hard drives to use with chunk server.
    # A path may begin with extra characters which switch additional options:
    #  - '*' means that this hard drive is 'marked for removal' and all data will be replicated to other hard drives (usually on other chunkservers)
    #  - '<' means that all data from this hard drive should be moved to other hard drives
    #  - '>' means that all data from other hard drives should be moved to this hard drive
    #  - '~' means that significant change of total blocks count will not mark this drive as damaged
    # If there are both '<' and '>' drives then data will be moved only between these drives
    # It is possible to specify optional space limit (after each mounting point), there are two ways of doing that:
    #  - set space to be left unused on a hard drive (this overrides the default setting from mfschunkserver.cfg)
    #  - limit space to be used on a hard drive
    # Space limit definition: [0-9]*(.[0-9]*)?([kMGTPE]|[KMGTPE]i)?B?, add minus in front for the first option.
    # Examples:
    # use hard drive '/mnt/hd1' with default options:
    #/mnt/hd1
    # use hard drive '/mnt/hd2', but replicate all data from it:
    #*/mnt/hd2
    # use hard drive '/mnt/hd3', but try to leave 5GiB on it:
    #/mnt/hd3 -5GiB
    # use hard drive '/mnt/hd4', but use only 1.5TiB on it:
    #/mnt/hd4 1.5TiB
    # use hard drive '/mnt/hd5', but fill it up using data from other drives
    #>/mnt/hd5
    # use hard drive '/mnt/hd6', but move all data to other hard drives
    #</mnt/hd6
    # use hard drive '/mnt/hd7', but ignore significant change of hard drive total size (e.g. compressed file systems)
    #~/mnt/hd7
    # partition directory /data provided to MFS
    /data
- Create the directory, change its owner and group, start the chunkserver service, and check whether the process is running
# mkdir /data
# chown -R mfs:mfs /data
# mfschunkserver start
# ps -ef | grep mfs
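Because the same storage path has to be registered in mfshdd.cfg on all three chunkservers, the edit can be scripted instead of done in an editor. The helper below is a sketch under that assumption; `add_hdd_path` is a hypothetical name, and it appends the path only when no identical line exists, so running it twice is harmless.

```shell
# Appends directory $1 to the mfshdd.cfg-style file $2 unless an identical line is already there.
add_hdd_path() {
    grep -qxF "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# On each chunkserver, before starting the service:
#   add_hdd_path /data /etc/mfs/mfshdd.cfg
```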
Client:
- Add the GPG key
# curl "https://ppa.moosefs.com/RPM-GPG-KEY-MooseFS" > /etc/pki/rpm-gpg/RPM-GPG-KEY-MooseFS
- Add the repository entry
# curl "http://ppa.moosefs.com/MooseFS-3-el7.repo" > /etc/yum.repos.d/MooseFS.repo
- Install the client package
# yum -y install moosefs-client
- Create a mount point, load the fuse module into the kernel, and mount MFS
# mkdir -p /mfs/data
# modprobe fuse
# mfsmount /mfs/data -H 192.168.1.11
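After mounting, a quick check that the mount point really is a FUSE mount avoids writing into an empty local directory by mistake. The function below is a sketch: `is_fuse_mounted` is a made-up name, and it only assumes that the file system type reported in the mount table starts with `fuse`.

```shell
# Succeeds if the mount table on stdin lists $1 as a mount point with a fuse* file system type.
is_fuse_mounted() {
    awk -v mp="$1" '$2 == mp && $3 ~ /^fuse/ { found = 1 } END { exit !found }'
}

# On the client:
#   is_fuse_mounted /mfs/data < /proc/mounts && echo "MFS is mounted"
```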
MFS monitoring
mfscgiserv is installed as part of the Yum installation above. It is a small web server written in Python that listens on port 9425. Start it on the master server with the mfscgiserv command, then open a browser to monitor all client mounts, the chunkservers, the master server, and the various client operations.
The meanings of each part are as follows:
- Info section: displays basic MFS information.
- Servers section: lists the chunkservers.
- Disks section: lists the disk directories and usage of each chunkserver.
- Exports section: lists the directories shared for mounting.
- Mounts section: displays the mounting status.
- Operations section: displays ongoing operations.
- Master charts section: displays master server operations, including reading, writing, creating, and deleting directories.
Common MFS operations
mfsgetgoal and mfssetgoal commands
The goal is the number of copies kept of a file. After the goal is set, it can be verified with the mfsgetgoal command and changed with mfssetgoal.
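A small wrapper can set the goal for a directory tree and immediately verify it. This is a sketch: `ensure_goal` is a hypothetical name, `-r` is used for recursive application, and the parsing assumes that mfsgetgoal prints the path followed by the goal as the last field (for example `/mfs/data: 3`), which may differ between versions.

```shell
# Sets the goal of path $2 (recursively) to $1 copies, then checks what mfsgetgoal reports.
ensure_goal() {
    mfssetgoal -r "$1" "$2" >/dev/null || return 1
    current=$(mfsgetgoal "$2" | awk '{ print $NF }')   # assumed output: "<path>: <goal>"
    [ "$current" = "$1" ]
}

# On the client:
#   ensure_goal 3 /mfs/data && echo "goal is 3"
```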
mfscheckfile and mfsfileinfo commands
The actual number of copies can be confirmed with the mfscheckfile and mfsfileinfo commands.
mfsdirinfo command
A content summary of the entire directory tree can be displayed with mfsdirinfo, an enhanced equivalent of "du -s".
MFS maintenance
The most important task is maintaining the metadata server. Its most important directory is /var/lib/mfs/: every storage, modification, and update operation on MFS data is recorded in files in this directory, so keeping this directory safe is what ensures the security and reliability of the entire MFS file system.
The data in /var/lib/mfs/ consists of two parts: the change logs of the metadata server, with file names like changelog.*.mfs, and the metadata file metadata.mfs, which is named metadata.mfs.back while mfsmaster is running. As long as both parts are kept safe, a new metadata server can be deployed from the backed-up metadata files even after a fatal failure of the metadata server.
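One simple way to keep both parts safe is to archive the whole directory on a schedule. The function below is a minimal sketch, not an official MooseFS tool: `backup_mfs_metadata` and the backup location are assumptions, and both arguments are expected to be absolute paths.

```shell
# Archives metadata directory $1 into a timestamped tarball under $2 and prints the archive path.
backup_mfs_metadata() {
    mkdir -p "$2" || return 1
    archive="$2/mfs-meta-$(date +%Y%m%d-%H%M%S).tar.gz"
    tar -czf "$archive" -C "$1" . || return 1
    echo "$archive"
}

# On the master server (hypothetical target directory):
#   backup_mfs_metadata /var/lib/mfs /backup/mfs
```

Copying the archive to another host matters as much as creating it, since a backup kept only on the master shares its fate.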