HDFS Rolling Upgrade
Introduction
HDFS rolling upgrade allows individual HDFS daemons to be upgraded independently. For example, datanodes can be upgraded independently of the namenodes, a namenode can be upgraded independently of the other namenodes, and the namenodes can be upgraded independently of the datanodes and journal nodes.
Upgrade
In Hadoop v2, HDFS supports highly available (HA) namenode services and wire compatibility. These two capabilities make it feasible to upgrade HDFS without incurring downtime. To upgrade an HDFS cluster without downtime, the cluster must be set up with HA.
Upgrade without downtime
In an HA cluster, there are two or more NameNodes (NNs), many DataNodes (DNs), a few JournalNodes (JNs) and a few ZooKeeperNodes (ZKNs). JNs are relatively stable and in most cases do not need to be upgraded when HDFS is upgraded. The rolling upgrade procedure described here covers only NNs and DNs, not JNs and ZKNs. Upgrading JNs and ZKNs may incur cluster downtime.
Upgrading non-federated Clusters
Assume there are two namenodes, nn1 and nn2, in the active and standby states respectively. The steps for upgrading an HA cluster are as follows:
1. Prepare rolling upgrade
   1. Run "hdfs dfsadmin -rollingUpgrade prepare" to create an fsimage for rollback.
   2. Run "hdfs dfsadmin -rollingUpgrade query" to check the status of the rollback image. Wait and re-run the command until the message "Proceed with rolling upgrade" is shown.
2. Upgrade active and standby NNs
   1. Shut down and upgrade nn2.
   2. Start nn2 as the standby namenode with the "-rollingUpgrade started" option.
   3. Fail over from nn1 to nn2 so that nn2 becomes active and nn1 becomes standby.
   4. Shut down and upgrade nn1.
   5. Start nn1 as the standby namenode with the "-rollingUpgrade started" option.
3. Upgrade DNs
   1. Choose a small subset of datanodes (for example, all datanodes on a particular rack).
      1. Run "hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade" to shut down one of the chosen datanodes.
      2. Run "hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>" to check that the datanode has shut down, waiting if necessary.
      3. Upgrade and restart the datanode.
      4. Perform the steps above on every datanode in the subset. The chosen datanodes can be processed in parallel.
   2. Repeat the steps above until all datanodes in the cluster are upgraded.
4. Finalize rolling upgrade
   1. Run "hdfs dfsadmin -rollingUpgrade finalize" to finalize the rolling upgrade.
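The command-line parts of the procedure above can be sketched as shell functions. This is a minimal sketch rather than an official script: the function names and the POLL_SECONDS variable are hypothetical, and it assumes that "hdfs dfsadmin -getDatanodeInfo" exits with a non-zero status once the datanode no longer answers. Installing the new release and restarting daemons are site-specific and are left to the operator. The arguments to the failover function are namenode IDs as configured for the nameservice.

```shell
#!/bin/sh
# Sketch only: function names and POLL_SECONDS are illustrative assumptions.
POLL_SECONDS=${POLL_SECONDS:-10}

# Step 1: create the rollback fsimage, then poll until it is ready.
prepare_rolling_upgrade() {
    hdfs dfsadmin -rollingUpgrade prepare || return 1
    until hdfs dfsadmin -rollingUpgrade query |
            grep -q "Proceed with rolling upgrade"; do
        sleep "$POLL_SECONDS"
    done
}

# Step 2.3: fail over between two namenode IDs, e.g. "nn1 nn2".
failover_namenode() {
    hdfs haadmin -failover "$1" "$2"
}

# Step 3.1: request shutdown of one datanode, then wait until it stops
# answering. Assumes -getDatanodeInfo exits non-zero when the datanode
# is unreachable.
shutdown_datanode_for_upgrade() {
    dn="$1"   # DATANODE_HOST:IPC_PORT
    hdfs dfsadmin -shutdownDatanode "$dn" upgrade || return 1
    while hdfs dfsadmin -getDatanodeInfo "$dn" >/dev/null 2>&1; do
        sleep "$POLL_SECONDS"
    done
}

# Step 4: finalize once every namenode and datanode runs the new release.
finalize_rolling_upgrade() {
    hdfs dfsadmin -rollingUpgrade finalize
}
```

A typical session would call prepare_rolling_upgrade once, upgrade the namenodes by hand around failover_namenode nn1 nn2, call shutdown_datanode_for_upgrade for each datanode in a batch, and end with finalize_rolling_upgrade.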
Upgrade federated Clusters
In a federated cluster, there are multiple namespaces, each with a pair of active and standby NNs. The procedure for upgrading a federated cluster is similar to that for a non-federated cluster, except that Step 1 and Step 4 are performed on each namespace and Step 2 is performed on each pair of active and standby NNs:
1. Prepare a rolling upgrade for each namespace.
2. Upgrade the active and standby NNs of each namespace.
3. Upgrade the DNs.
4. Finalize the rolling upgrade for each namespace.
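Steps 1 and 4 repeat the same dfsadmin action once per namespace, which lends itself to a loop. The sketch below assumes that the client configuration lists every nameservice in dfs.nameservices and that each one can be addressed with the generic "-fs hdfs://<nameservice>" option. The function name is hypothetical.

```shell
#!/bin/sh
# Run one dfsadmin -rollingUpgrade action (prepare, query or finalize)
# against every nameservice listed in dfs.nameservices.
# Sketch only: assumes the local client configuration knows all nameservices.
rolling_upgrade_all_namespaces() {
    action="$1"
    nameservices=$(hdfs getconf -confKey dfs.nameservices) || return 1
    for ns in $(echo "$nameservices" | tr ',' ' '); do
        echo "== $ns: rollingUpgrade $action"
        hdfs dfsadmin -fs "hdfs://$ns" -rollingUpgrade "$action" || return 1
    done
}
```

For example, rolling_upgrade_all_namespaces prepare covers Step 1 and rolling_upgrade_all_namespaces finalize covers Step 4. The function stops at the first namespace that reports an error so that the failure can be inspected before continuing.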
Upgrade with downtime
For non-HA clusters, it is impossible to upgrade HDFS without downtime, because the upgrade requires restarting the namenode. However, datanodes can still be upgraded in a rolling manner.
Upgrading non-HA Clusters
In a non-HA cluster, there is one NameNode (NN), one SecondaryNameNode (SNN) and many DataNodes (DNs). The procedure for upgrading a non-HA cluster is similar to that for an HA cluster, except that Step 2, "Upgrade active and standby NNs", is replaced by the following:
Upgrade NN and SNN
1. Shut down the SNN.
2. Shut down and upgrade the NN.
3. Start the NN with the "-rollingUpgrade started" option.
4. Upgrade and restart the SNN.
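The ordering of these four steps can be sketched as a single shell function. Several things here are assumptions rather than part of the procedure: install_new_release is a hypothetical, site-specific hook that the sketch does not define, daemon control uses the Hadoop 3 "hdfs --daemon" form (Hadoop 2 installations use hadoop-daemon.sh instead), and for brevity the NN and SNN are assumed to run on the same host.

```shell
#!/bin/sh
# Sketch only. install_new_release is a hypothetical hook (package manager,
# tarball swap, ...) that must be provided by the operator.
# Assumes Hadoop 3 style daemon control and NN and SNN on the same host.
upgrade_nn_and_snn() {
    # 1. Shut down the SNN.
    hdfs --daemon stop secondarynamenode || return 1
    # 2. Shut down and upgrade the NN.
    hdfs --daemon stop namenode || return 1
    install_new_release || return 1
    # 3. Start the NN with the rolling upgrade option.
    hdfs --daemon start namenode -rollingUpgrade started || return 1
    # 4. Restart the (already upgraded) SNN.
    hdfs --daemon start secondarynamenode
}
```

When the SNN runs on a separate host, steps 1 and 4 have to be issued on that host, with its own installation of the new release before the restart.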
Downgrade and rollback
When the upgraded release is undesirable or, in some unlikely case, the upgrade fails (because of bugs in the newer release), administrators may choose to downgrade HDFS to the pre-upgrade release, or to roll HDFS back to the pre-upgrade release and the pre-upgrade state. Both downgrade and rollback require cluster downtime and cannot be done in a rolling manner.
Note that downgrade and rollback are possible only after a rolling upgrade has started and before it is terminated. An upgrade can be terminated by finalize, downgrade or rollback. Therefore, it is not possible to roll back after finalize or downgrade, or to downgrade after finalize.
Downgrade
Downgrade restores the software to the pre-upgrade release and preserves the user data. Suppose time T is the start time of the rolling upgrade and the upgrade is terminated by downgrade. Then files created before or after T remain available in HDFS, and files deleted before or after T remain deleted.
A newer release can be downgraded to the pre-upgrade release only if neither the namenode layout version nor the datanode layout version has changed between the two releases. The steps for downgrading are as follows:
Downgrade HDFS
1. Shut down all NNs and DNs.
2. Restore the pre-upgrade release on all machines.
3. Start the NNs with the "-rollingUpgrade downgrade" option.
4. Start the DNs normally.
Rollback
Rollback restores the software to the pre-upgrade release and also reverts the user data to the pre-upgrade state. Suppose time T is the start time of the rolling upgrade and the upgrade is terminated by rollback. Files created before T remain available in HDFS, but files created after T become unavailable. Files deleted before T remain deleted, but files deleted after T are restored.
Rollback from a newer release to the pre-upgrade release is always supported. The rollback procedure is as follows:
Rollback HDFS
1. Shut down all NNs and DNs.
2. Restore the pre-upgrade release on all machines.
3. Start the NNs with the "-rollingUpgrade rollback" option.
4. Start the DNs normally.
Commands and Startup Options for Rolling Upgrade
dfsadmin Commands
dfsadmin -rollingUpgrade
hdfs dfsadmin -rollingUpgrade <query|prepare|finalize>
Execute a rolling upgrade action.
Options:
query | Query the current rolling upgrade status.
prepare | Prepare a new rolling upgrade.
finalize | Finalize the current rolling upgrade.
dfsadmin -getDatanodeInfo
hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>
Get information about the given datanode. This command can be used to check whether a datanode is alive, much like the Unix ping command.
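The ping-like use can be sketched as a small status check over several datanodes. The function name is hypothetical, and the sketch assumes that the command exits with a non-zero status when the datanode cannot be reached.

```shell
#!/bin/sh
# Print UP or DOWN for each DATANODE_HOST:IPC_PORT argument.
# Sketch only: assumes -getDatanodeInfo exits non-zero when the
# datanode cannot be reached.
datanode_status() {
    for dn in "$@"; do
        if hdfs dfsadmin -getDatanodeInfo "$dn" >/dev/null 2>&1; then
            echo "$dn UP"
        else
            echo "$dn DOWN"
        fi
    done
}
```

Calling datanode_status with the addresses of a rack's datanodes gives a quick view of which ones have already come back after a restart.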
dfsadmin -shutdownDatanode
hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> [upgrade]
Submit a shutdown request to the given datanode. If the optional upgrade argument is specified, clients accessing the datanode are advised to wait for it to restart, and the fast start-up mode is enabled. If the restart does not happen in time, clients time out and ignore the datanode. In that case the fast start-up mode is also disabled.
Note: the command does not wait for the datanode shutdown to complete. Use "dfsadmin -getDatanodeInfo" to check whether the shutdown has completed.
Namenode Startup Options
namenode -rollingUpgrade
hdfs namenode -rollingUpgrade <downgrade|rollback|started>
When a rolling upgrade is in progress, the -rollingUpgrade namenode startup option is used to specify one of the following rolling upgrade options:
downgrade | Restores the namenode to the pre-upgrade release and preserves the user data.
rollback | Restores the namenode to the pre-upgrade release and also reverts the user data to the pre-upgrade state.
started | Specifies that a rolling upgrade has already started, so that the namenode should allow image directories with different layout versions during startup.
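Because the three values have very different effects on user data, a thin wrapper that rejects anything else can guard against typos. This is only an illustrative sketch. The function name is hypothetical, and the namenode is started in the foreground exactly as in the syntax line above.

```shell
#!/bin/sh
# Start the local namenode with a rolling upgrade startup option.
# Sketch only: rejects anything other than the three documented values.
start_namenode_rolling() {
    case "$1" in
        downgrade|rollback|started)
            hdfs namenode -rollingUpgrade "$1" ;;
        *)
            echo "usage: start_namenode_rolling <downgrade|rollback|started>" >&2
            return 2 ;;
    esac
}
```

For instance, start_namenode_rolling started is the call used while upgrading, and start_namenode_rolling finalize fails with a usage message because finalize is a dfsadmin action, not a namenode startup option.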