Background:
We run a CDH5 cluster in production. At first, through negligence, we never scheduled a regular Balancer run to balance data across the nodes.
Later, after a large number of jobs were introduced, data grew very quickly: many nodes reached disk utilization above 99.9%, and some jobs began to fail.
So we ran the Balancer to rebalance the data and found that about 26 TB needed to be moved, while each Balancer iteration moved only about 50 GB and took roughly 30 minutes; meanwhile, every hour of newly written data added another 40-60 GB that needed balancing. At that rate the Balancer could never catch up.
14/10/14 20:31:11 INFO balancer.Balancer: Need to move 26.49 TB to make the cluster balanced.
14/10/14 20:31:11 INFO balancer.Balancer: Decided to move 10 GB bytes from 10.100.1.10:50010 to 10.100.1.60:50010
14/10/14 20:31:11 INFO balancer.Balancer: Decided to move 10 GB bytes from 10.100.1.20:50010 to 10.100.1.70:50010
14/10/14 20:31:11 INFO balancer.Balancer: Decided to move 10 GB bytes from 10.100.1.30:50010 to 10.100.1.80:50010
14/10/14 20:31:11 INFO balancer.Balancer: Decided to move 10 GB bytes from 10.100.1.40:50010 to 10.100.1.90:50010
14/10/14 20:31:11 INFO balancer.Balancer: Decided to move 10 GB bytes from 10.100.1.50:50010 to 10.100.1.100:50010
14/10/14 20:31:11 INFO balancer.Balancer: Will move 50 GB in this iteration...
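For reference, the log above is the kind of output produced by a run like the following. The exact command line is not in the original notes, so treat this as an assumed, typical CDH5 invocation:

# Run the HDFS Balancer as the hdfs user; -threshold is the allowed deviation
# (in percent) of each DataNode's utilization from the cluster average.
sudo -u hdfs hdfs balancer -threshold 10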
Solution:
1. Increase the available bandwidth of the Balancer.
We wondered whether the Balancer's default bandwidth was simply too small, making it inefficient, so we tried raising the Balancer bandwidth to 500 MB/s:
hadoop dfsadmin -setBalancerBandwidth 524288000
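For clarity, the value is just 500 MB/s expressed in bytes per second; this annotated form is our own restatement of the same command, not part of the original notes:

# 500 * 1024 * 1024 = 524288000 bytes per second.
# -setBalancerBandwidth updates dfs.datanode.balance.bandwidthPerSec on all live
# DataNodes at runtime; the change is not persisted across DataNode restarts.
hadoop dfsadmin -setBalancerBandwidth $((500 * 1024 * 1024))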
However, this did not noticeably improve the situation.
2. Forcibly decommission nodes
We noticed that when we decommission some nodes, even though each holds 10-30 TB or more of data, that data can always be re-replicated to other nodes within about a day: because the cluster's default replication factor is 3, only about 1/3 of the data actually has to be copied, the data stays complete, and the copied blocks are spread evenly across the remaining nodes. So why not use decommissioning as a Balancer-like mechanism to relieve the nodes whose disk usage exceeded 99.9%?
This approach proved very workable. We decommissioned eight production nodes (ideally this should be done one at a time), reformatted their data disks immediately after removal, and added them back to the cluster, where newly written data quickly balanced onto them. It solved the headache perfectly and took less than four days.
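For the mechanics, here is a minimal sketch of the stock HDFS decommission flow; in CDH5 these steps are usually driven from Cloudera Manager instead, and the hostname and exclude-file path below are assumptions:

# Assumed path; on a plain HDFS setup the exclude file is whatever
# dfs.hosts.exclude points to in hdfs-site.xml.
echo "dn01.example.com" >> /etc/hadoop/conf/dfs.exclude
# Tell the NameNode to re-read the include/exclude lists and start
# re-replicating the node's blocks onto other DataNodes.
sudo -u hdfs hdfs dfsadmin -refreshNodes
# Watch the node's status until it reports "Decommissioned", then reformat its
# data disks, remove it from the exclude file, and refresh again to re-add it.
sudo -u hdfs hdfs dfsadmin -report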
3. Hadoop's support for LVM disk volumes
While working on the Balancer problem, we also found that Hadoop does not cope well with LVM disk layouts. On hosts where a single disk held one logical volume for the root (/) partition and another for the /data1 partition, Hadoop kept writing to /data1 until it reached 100%, and some jobs reported there was no space left to write. We suspect Hadoop should account for usage at the level of physical volumes. We therefore had to reinstall the hosts whose data disks were LVM logical volumes and give them dedicated physical partitions, for example mounting /dev/sda3 as /data1.
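A quick way to check whether a data directory shares a filesystem with the root logical volume (the mount points below simply mirror the example above) is:

# If /data1 and / resolve to the same device, DataNode capacity accounting and
# actual OS-level usage can disagree, matching the behaviour described above.
df -h / /data1
lsblk -o NAME,TYPE,MOUNTPOINT,SIZE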
Original article: "Hadoop O&M notes: the Balancer struggles to balance a large amount of data in a rapidly growing cluster". Thanks to the original author for sharing.