This article describes and demonstrates the data write problems that occur in TFS when a data server fails.
Test environment:
TFS name server VIP: 192.168.1.229
TFS name server 1: 192.168.1.225
TFS name server 2: 192.168.1.226
Data Server 1: 192.168.1.226
Data Server 2: 192.168.1.227
Data Server 3: 192.168.1.228
I. Simulate a single data server failure with the following configuration
max_replication = 3 # maximum number of block replicas, default: 2
min_replication = 2 # minimum number of block replicas, default: 2
Restart the name server service after changing the configuration. In this article, each data server uses a single mount point.
225 server:
# /usr/local/tfs/scripts/tfs check_ns
nameserver is running PID: 1061
226 server:
# /usr/local/tfs/scripts/tfs check_ns
nameserver is running PID: 32506
# /usr/local/tfs/scripts/tfs check_ds
dataserver [1] is running
227 server:
# /usr/local/tfs/scripts/tfs check_ds
dataserver [1] is running
228 server:
# /usr/local/tfs/scripts/tfs check_ds
dataserver [1] is running
Use ssm to view the current status:
# /usr/local/tfs/bin/ssm -s 192.168.1.229:8108
show > server -m
Write test:
# /usr/local/tfs/bin/tfstool -s 192.168.1.229:8108
TFS> put /etc/security/limits.conf
put /etc/security/limits.conf => t1rtztbyjt1rcvbvdk success.
Shut down the data server on 226 and run the write test again:
# /usr/local/tfs/scripts/tfs stop_ds 1
dataserver 1 exit successfully
TFS> put /etc/security/limits.conf    // the write fails
put /etc/security/limits.conf => fail.
show > server -m    // no master block found
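The result of this test is consistent with a simple rule: a write needs at least max_replication live data servers. The following is a minimal sketch of that rule, not TFS code; the function name can_write is hypothetical and only models what was observed here.

```python
def can_write(alive_data_servers: int, max_replication: int) -> bool:
    """Hypothetical model of the observed behavior, not part of TFS.

    A write is expected to succeed only when there are at least
    max_replication live data servers to hold the block replicas.
    """
    return alive_data_servers >= max_replication


# This test: three data servers, max_replication = 3.
print(can_write(3, 3))  # all three data servers running
print(can_write(2, 3))  # after stopping the data server on 226
```

The first call prints True and the second prints False, matching the successful put before the shutdown and the failed put after it.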
II. Test again with the following configuration:
max_replication = 2 # maximum number of block replicas, default: 2
min_replication = 2 # minimum number of block replicas, default: 2
# /usr/local/tfs/bin/ssm -s 192.168.1.229:8108
show > server -m
TFS> put /etc/security/limits.conf    // the write succeeds
put /etc/security/limits.conf => t1btxtbyht1rcvbvdk success.
After a file is uploaded successfully, TFS returns three important parameters: block_id, file_id, and filename. We can use admintool to find out which data servers hold a given block_id, and then use the ds_client tool to list every file_id and filename stored in that block.
# /usr/local/tfs/bin/admintool -s 192.168.1.229:8108
TFS > listblk 1011
list block 1011 success.
------block: 1011, has 2 replicas------
block: 1011, (0)th server: 192.168.1.226:10000
block: 1011, (1)th server: 192.168.1.227:9998
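When checking many blocks, it can be handy to extract the replica locations from the listblk output with a script. The sketch below assumes the output format is exactly as in the session above; the helper name parse_replicas and the regular expression are illustrative assumptions, not part of TFS.

```python
import re
from typing import Dict, List

# Sample text copied from the admintool session above.
LISTBLK_OUTPUT = """\
list block 1011 success.
------block: 1011, has 2 replicas------
block: 1011, (0)th server: 192.168.1.226:10000
block: 1011, (1)th server: 192.168.1.227:9998
"""

# Matches lines such as "block: 1011, (0)th server: 192.168.1.226:10000".
# The pattern is an assumption based only on the sample output.
REPLICA_LINE = re.compile(r"block: (\d+), \((\d+)\)th server: ([\d.]+):(\d+)")


def parse_replicas(output: str) -> Dict[int, List[str]]:
    """Map each block id to the "ip:port" servers that hold a replica."""
    replicas: Dict[int, List[str]] = {}
    for match in REPLICA_LINE.finditer(output):
        block_id, _index, host, port = match.groups()
        replicas.setdefault(int(block_id), []).append(f"{host}:{port}")
    return replicas


print(parse_replicas(LISTBLK_OUTPUT))
```

For the sample, this prints {1011: ['192.168.1.226:10000', '192.168.1.227:9998']}, so the number of replicas per block can be compared with the configured min_replication.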
III. Summary
1: A write succeeds only if the number of live data servers is greater than or equal to max_replication. For example, with max_replication set to 3 and three data servers alive, writes fail as soon as one data server goes down.
2: If there is only one data server, both max_replication and min_replication must be set to 1. Otherwise, writes will fail.
3: The data disaster recovery mechanism of TFS
For example, suppose there are three data servers, each with three mount points, and each mount point can hold 200 GB of data, with max_replication and min_replication both set to 2. The total raw TFS space is 200 GB * 3 * 3 = 1.8 TB.
Because max_replication and min_replication are set to 2, two copies of every block must be stored, so the usable TFS space is 1.8 TB / 2 = 900 GB.
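The capacity arithmetic above can be reproduced with a short calculation. This is only a sketch of the example's numbers; the function name tfs_capacity_gb is hypothetical, and real usable space would also depend on overhead that this article does not cover.

```python
def tfs_capacity_gb(servers: int, mounts_per_server: int,
                    gb_per_mount: int, replication: int):
    """Return (raw_gb, usable_gb) for the example in this article.

    Hypothetical helper: raw space is the sum over all mount points,
    and usable space is the raw space divided by the replica count.
    """
    raw_gb = servers * mounts_per_server * gb_per_mount
    usable_gb = raw_gb // replication  # integer division keeps whole GB
    return raw_gb, usable_gb


# Three data servers, three mount points each, 200 GB per mount point,
# max_replication = min_replication = 2.
raw, usable = tfs_capacity_gb(3, 3, 200, 2)
print(raw, usable)  # 1800 900
```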
This article is from the "Bo Yue" blog. Please do not reproduce it.
TFS dataserver Fault Test