I. Preface
I previously wrote about the correct way to remove an OSD; that article only covered, in simple terms, how to reduce the amount of data migration. This article extends it and describes optimized steps for replacing a failed disk, a task that comes up frequently in Ceph operations.
The test environment has two hosts with 8 OSDs each, 16 OSDs in total. The replica count is set to 2 and the PG number to 800, which works out to an average of 100 PGs per OSD. This article uses measured data to analyze how the different procedures differ.
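The per-OSD average can be checked with a short calculation. This is only a sketch of the arithmetic; the variable names are illustrative and are not Ceph settings.

```shell
# Values taken from the test environment described above.
pg_num=800      # PGs in the pool
replicas=2      # replica count
osd_count=16    # 2 hosts x 8 OSDs

# Every PG is stored 'replicas' times, so the cluster holds
# pg_num * replicas PG copies spread over all OSDs.
pgs_per_osd=$(( pg_num * replicas / osd_count ))
echo "$pgs_per_osd"
```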
Before starting the test, set the noout flag on the cluster, then stop an OSD to simulate an OSD failure, and then apply each of the procedures in turn.
II. Testing the three methods
Method one: first mark the OSD out, then remove it, then add a new OSD
1. Stop the specified OSD process
2. Mark the specified OSD out
3. Remove the specified OSD from CRUSH
4. Add a new OSD
Production environments generally set noout. It is also possible to leave it unset and let the cluster mark the OSD out automatically, which by default happens five minutes after the process stops. Either way, once an out is triggered, whether manually or automatically, the resulting data movement is the same. To make testing easier, the out is triggered manually here, which is why the environment described above has noout set.
Record the original PG distribution before starting the test
[root@lab8106 ~]# ceph pg dump pgs|awk '{print $1,$15}'|grep -v pg > pg1.txt
This gets the current PG distribution and saves it to pg1.txt. The record lists the OSDs on which each PG is located, so that later snapshots can be compared against it to work out how much data has to migrate.
Stop the specified OSD process
[root@lab8106 ~]# systemctl stop ceph-osd@15
Stopping the process does not trigger migration; it only changes PG states. For example, if the primary copy of a PG was on the stopped OSD, the replica of that PG is promoted to primary.
Mark the OSD out
[root@lab8106 ~]# ceph osd out 15
Before the out is triggered, PGs should be in the active+undersized+degraded state. After the out is triggered, all PGs should gradually return to active+clean. Wait for the cluster to return to normal, then query the PG distribution again
[root@lab8106 ~]# ceph pg dump pgs|awk '{print $1,$15}'|grep -v pg > pg2.txt
This saves the current PG distribution to pg2.txt.
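Every snapshot in this test is taken only after all PGs have settled back to active+clean. A minimal sketch of automating that wait is shown here. The function names `all_pgs_clean` and `wait_for_clean` are hypothetical, and the check assumes that `ceph pg dump pgs_brief` prints the PG id in the first column and the state in the second; verify the column layout on your Ceph version before relying on it.

```shell
# Reads PG rows on stdin (as printed by `ceph pg dump pgs_brief`) and
# succeeds only if at least one PG row was seen and all are active+clean.
# Assumption: column 1 is the PG id (e.g. 0.1f), column 2 is the state.
all_pgs_clean() {
    awk '
        $1 ~ /^[0-9]+\.[0-9a-f]+$/ {
            total++
            if ($2 != "active+clean") dirty++
        }
        END { if (total == 0 || dirty > 0) exit 1 }
    '
}

# Hypothetical polling loop: re-check every N seconds (default 10)
# until the cluster reports every PG as active+clean.
wait_for_clean() {
    until ceph pg dump pgs_brief 2>/dev/null | all_pgs_clean; do
        sleep "${1:-10}"
    done
}
```

Calling `wait_for_clean` before each snapshot would make sure a comparison is not made in the middle of a migration. Cluster health is not used as the check here because the noout flag set for this test keeps the cluster in a warning state.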
Compare the PG distribution before and after the out. The following command shows the specific changes, listing only the lines that differ
[root@lab8106 ~]# diff -y -w pg1.txt pg2.txt --suppress-common-lines
What matters here is the number of changes, so count only the PGs that changed
[root@lab8106 ~]# diff -y -w pg1.txt pg2.txt --suppress-common-lines|wc -l
102
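The same count can also be obtained by matching the two snapshots on PG id instead of relying on line order. The following is a sketch only: `count_moved_pgs` is a hypothetical helper, and it assumes each snapshot line has the form "pgid osd-set", as produced by the dump pipeline used in this article.

```shell
# count_moved_pgs BEFORE AFTER
# Prints how many PGs present in both files map to a different OSD set.
# Assumption: the first file is not empty.
count_moved_pgs() {
    awk '
        NR == FNR { before[$1] = $2; next }          # first file: remember each PG mapping
        ($1 in before) && before[$1] != $2 { n++ }   # second file: count PGs whose OSD set changed
        END { print n + 0 }
    ' "$1" "$2"
}
```

For example, `count_moved_pgs pg1.txt pg2.txt` would typically report the same number as the diff pipeline when both files list the PGs in the same order.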
After the first out, 102 PGs changed. Keep this number in mind; it is used in the totals later.
Remove the OSD from CRUSH
[root@lab8106 ~]# ceph osd crush remove osd.15
Removing the OSD from CRUSH also triggers migration. Wait for the PGs to rebalance, that is, for all of them to become active+clean
[root@lab8106 ~]# ceph pg dump pgs|awk '{print $1,$15}'|grep -v pg > pg3.txt
This records the current PG distribution.
Now compare the PG changes before and after the crush remove
[root@lab8106 ~]# diff -y -w pg2.txt pg3.txt --suppress-common-lines|wc -l
137
Now add the new OSD
[root@lab8106 ~]# ceph-deploy osd prepare lab8107:/dev/sdi
[root@lab8106 ~]# ceph-deploy osd activate lab8107:/dev/sdi1
Record the PG distribution after the new OSD has been added
[root@lab8106 ~]# ceph pg dump pgs|awk '{print $1,$15}'|grep -v pg > pg4.txt
Compare the distribution before and after the addition
[root@lab8106 ~]# diff -y -w pg3.txt pg4.txt --suppress-common-lines|wc -l
167
The entire replacement process is now complete. Adding up the PG changes above:
102 + 137 + 167 = 406
In other words, this method changes 406 PGs. Because the environment has only two hosts, the numbers may be somewhat amplified, but that is not discussed in depth here: all three tests run in the same environment and are only compared against each other, and the principle is the same. The point here is to use the data to analyze the differences.