HDFS rolling upgrade

Introduction

HDFS rolling upgrade allows individual HDFS daemons to be upgraded separately. For example, the datanodes can be upgraded independently of the namenodes, one namenode can be upgraded independently of the other namenodes, and the namenodes can be upgraded independently of the datanodes and journal nodes.

Upgrade

In Hadoop v2, HDFS supports highly available (HA) namenode services and wire compatibility. These two capabilities make it feasible to upgrade HDFS without incurring HDFS downtime. To upgrade an HDFS cluster without downtime, the cluster must be set up with HA.

Upgrade without downtime

In an HA cluster, there are two or more NameNodes (NNs), many DataNodes (DNs), a few JournalNodes (JNs) and a few ZooKeeperNodes (ZKNs). JNs are relatively stable and, in most cases, do not need to be upgraded when upgrading HDFS. The rolling upgrade procedure described here covers only NNs and DNs, not JNs and ZKNs. Upgrading JNs and ZKNs may incur cluster downtime.

Upgrading Non-Federated Clusters

Suppose there are two namenodes, nn1 and nn2, in the active and standby states respectively. The steps for upgrading an HA cluster are as follows:

1. Prepare the rolling upgrade

   1. Run "hdfs dfsadmin -rollingUpgrade prepare" to create an fsimage for rollback.

   2. Run "hdfs dfsadmin -rollingUpgrade query" to check the status of the rollback image. Wait and re-run the command until the message "Proceed with rolling upgrade" is shown.

2. Upgrade the active and standby NNs

   1. Shut down and upgrade nn2.

   2. Start nn2 as the standby namenode with the "-rollingUpgrade started" option.

   3. Fail over from nn1 to nn2 so that nn2 becomes active and nn1 becomes standby.

   4. Shut down and upgrade nn1.

   5. Start nn1 as the standby namenode with the "-rollingUpgrade started" option.

3. Upgrade the DNs

   1. Choose a small subset of datanodes (for example, all datanodes under a particular rack).

      1. Run "hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade" to shut down one of the chosen datanodes.

      2. Run "hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>" to check and wait for the datanode to shut down.

      3. Upgrade and restart the datanode.

      4. Perform the above steps for all the chosen datanodes. The chosen datanodes can be processed in parallel.

   2. Repeat the above steps until all datanodes in the cluster are upgraded.

4. Finalize the rolling upgrade

   1. Run "hdfs dfsadmin -rollingUpgrade finalize" to finalize the rolling upgrade.
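
The prepare and finalize steps above lend themselves to a small wrapper. The following is a minimal sketch, assuming the "hdfs" client is on the PATH and already configured for the cluster. The function names and the POLL_SECONDS and MAX_POLLS variables are hypothetical names chosen for this example, and the readiness check simply looks for the "Proceed with rolling upgrade" message mentioned in step 1.

```shell
# Hypothetical tuning knobs for this sketch (not HDFS settings).
POLL_SECONDS="${POLL_SECONDS:-10}"   # pause between status queries
MAX_POLLS="${MAX_POLLS:-60}"         # give up after this many queries

# Poll the rolling upgrade status until the rollback image is reported ready.
wait_for_rollback_image() {
    polls=0
    while [ "$polls" -lt "$MAX_POLLS" ]; do
        if hdfs dfsadmin -rollingUpgrade query | grep -q "Proceed with rolling upgrade"; then
            return 0
        fi
        polls=$((polls + 1))
        sleep "$POLL_SECONDS"
    done
    echo "rollback image not ready after $MAX_POLLS queries" >&2
    return 1
}

# Step 1: create the rollback fsimage, then wait for it to be ready.
prepare_rolling_upgrade() {
    hdfs dfsadmin -rollingUpgrade prepare || return 1
    wait_for_rollback_image
}

# Step 4: run only after every namenode and datanode has been upgraded.
finalize_rolling_upgrade() {
    hdfs dfsadmin -rollingUpgrade finalize
}
```

Calling "prepare_rolling_upgrade" before touching any namenode, and "finalize_rolling_upgrade" once the last datanode is back, mirrors steps 1 and 4. The bounded polling loop is a design choice of this sketch, so that a stuck prepare surfaces as an error instead of waiting forever.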

Upgrading Federated Clusters

In a federated cluster, there are multiple namespaces, each with a pair of active and standby NNs. The procedure for upgrading a federated cluster is similar to that for a non-federated cluster, except that Step 1 and Step 4 are performed on each namespace and Step 2 is performed on each pair of active and standby NNs:

1. Prepare rolling upgrades for each namespace

2. Upgrade the active and standby NN of each namespace.

3. Upgrade the DNs.

4. Complete the rolling upgrade of each namespace.
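
Repeating the prepare and finalize steps per namespace can be sketched as a loop. This is an illustration only: NAMESERVICES is a hypothetical, space-separated list of nameservice URIs, the function name is invented for this example, and targeting each namespace through the generic "-fs" option is an assumption about how the cluster is addressed rather than something the procedure above prescribes.

```shell
# Hypothetical list of nameservice URIs, one per namespace.
NAMESERVICES="${NAMESERVICES:-hdfs://ns1 hdfs://ns2}"

# Run one rolling upgrade action (prepare, query or finalize)
# against every namespace, stopping at the first failure.
rolling_upgrade_all_namespaces() {
    action="$1"
    for ns in $NAMESERVICES; do
        echo "== $ns: -rollingUpgrade $action"
        # -fs is the generic Hadoop option that selects the target file system.
        hdfs dfsadmin -fs "$ns" -rollingUpgrade "$action" || return 1
    done
}
```

With this in place, "rolling_upgrade_all_namespaces prepare" would cover step 1 and "rolling_upgrade_all_namespaces finalize" would cover step 4, while steps 2 and 3 still have to be carried out per NN pair and per datanode.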

Upgrade with downtime

For non-HA clusters, it is impossible to upgrade HDFS without downtime, because the upgrade requires restarting the namenode. However, the datanodes can still be upgraded in a rolling manner.

Upgrading Non-HA Clusters

In a non-HA cluster, there is one NameNode (NN), one SecondaryNameNode (SNN) and many DataNodes (DNs). The procedure for upgrading a non-HA cluster is the same as for an HA cluster, except that Step 2 (upgrading the active and standby NNs) is replaced by the following:

Upgrade the NN and SNN

1. Shut down the SNN.

2. Shut down and upgrade the NN.

3. Start the NN with the "-rollingUpgrade started" option.

4. Upgrade and restart the SNN.

Downgrade and rollback

If the upgraded release is undesirable or, in the unlikely case that the upgrade fails (because of bugs in the newer release), administrators may choose to downgrade HDFS back to the pre-upgrade release, or to roll back HDFS to the pre-upgrade release and the pre-upgrade state. Both downgrade and rollback require cluster downtime and cannot be done in a rolling fashion.

Note that downgrade and rollback are possible only after a rolling upgrade has started and before it is terminated. An upgrade can be terminated by finalize, downgrade or rollback. Therefore, it is not possible to roll back after finalize or downgrade, or to downgrade after finalize.

Downgrade

Downgrade restores the software to the pre-upgrade release and preserves the user data. Suppose time T is the rolling upgrade start time and the upgrade is terminated by downgrade. Then the files created before or after T remain available in HDFS, and the files deleted before or after T remain deleted in HDFS.

A newer release can be downgraded to the pre-upgrade release only if neither the namenode layout version nor the datanode layout version changed between the two releases. The steps for a downgrade are as follows:

Downgrade HDFS

1. Shut down all NNs and DNs.

2. Restore the pre-upgrade release on all machines.

3. Start the NNs with the "-rollingUpgrade downgrade" option.

4. Start the DNs normally.

Rollback

Rollback restores the software to the pre-upgrade release and also reverts the user data to the pre-upgrade state. Suppose time T is the rolling upgrade start time and the upgrade is terminated by rollback. The files created before T remain available in HDFS, but the files created after T become unavailable. The files deleted before T remain deleted in HDFS, but the files deleted after T are restored.

Rollback from a newer release to the pre-upgrade release is always supported. The steps for a rollback are as follows:

Rollback HDFS

1. Shut down all NNs and DNs.

2. Restore the pre-upgrade release on all machines.

3. Start the NNs with the "-rollingUpgrade rollback" option.

4. Start the DNs normally.

Commands and Startup Options for Rolling Upgrade

dfsadmin Commands

dfsadmin -rollingUpgrade
hdfs dfsadmin -rollingUpgrade <query|prepare|finalize>

Executes a rolling upgrade action.

Options:

query

Query the current rolling upgrade status.

prepare

Prepare a new rolling upgrade.

finalize

Finalize the current rolling upgrade.

dfsadmin -getDatanodeInfo
hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>

Obtains the information of a specified datanode. This command can be used to check whether a datanode is alive, similar to a Unix ping command.

dfsadmin -shutdownDatanode
hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> [upgrade]

Submits a shutdown request to the given datanode. If the optional upgrade argument is specified, clients accessing the datanode are advised to wait for it to restart, and the fast start-up mode is enabled. If the restart does not happen in time, clients time out and ignore the datanode. In that case, the fast start-up mode is also disabled.

Note that the command does not wait for the datanode shutdown to complete. The "dfsadmin -getDatanodeInfo" command can be used to check whether the datanode shutdown has completed.
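
Because the shutdown request returns before the datanode has actually stopped, the two commands are naturally used together. The following is a minimal sketch under stated assumptions: the function name and the POLL_SECONDS and MAX_POLLS variables are hypothetical, and the loop assumes that "hdfs dfsadmin -getDatanodeInfo" exits with a non-zero status once the datanode can no longer be reached.

```shell
# Hypothetical tuning knobs for this sketch (not HDFS settings).
POLL_SECONDS="${POLL_SECONDS:-5}"   # pause between liveness checks
MAX_POLLS="${MAX_POLLS:-24}"        # give up after this many checks

# Ask one datanode to shut down for upgrade, then wait until it stops answering.
# $1 is <DATANODE_HOST:IPC_PORT>.
shutdown_datanode_for_upgrade() {
    dn="$1"
    hdfs dfsadmin -shutdownDatanode "$dn" upgrade || return 1
    polls=0
    while [ "$polls" -lt "$MAX_POLLS" ]; do
        # Assumption: a failing getDatanodeInfo means the datanode is down.
        if ! hdfs dfsadmin -getDatanodeInfo "$dn" > /dev/null 2>&1; then
            return 0
        fi
        polls=$((polls + 1))
        sleep "$POLL_SECONDS"
    done
    echo "$dn is still running after $MAX_POLLS checks" >&2
    return 1
}
```

A call such as "shutdown_datanode_for_upgrade dn1.example.com:50020" (host and port are placeholders) would then return successfully only when it is safe to install the new release on that machine and restart the datanode.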

Namenode Startup Options

namenode -rollingUpgrade
hdfs namenode -rollingUpgrade <downgrade|rollback|started>

When a rolling upgrade is in progress, the -rollingUpgrade namenode startup option is used to specify one of the following rolling upgrade options:

downgrade

Restores the namenode back to the pre-upgrade release and preserves the user data.

rollback

Restores the namenode back to the pre-upgrade release but also reverts the user data back to the pre-upgrade state.

started

Specifies that a rolling upgrade has already started, so that the namenode should allow image directories with different layout versions during startup.
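
Since a mistyped startup option on a namenode is costly, a thin guard around the command can help. The following is a small sketch, not part of HDFS: the function name "start_namenode_rolling" is hypothetical, and it only accepts the three options listed above before handing off to "hdfs namenode".

```shell
# Start the namenode with one of the rolling upgrade startup options.
# Anything other than downgrade, rollback or started is rejected.
start_namenode_rolling() {
    case "$1" in
        downgrade|rollback|started)
            # Runs in the foreground, exactly as the plain command would.
            hdfs namenode -rollingUpgrade "$1"
            ;;
        *)
            echo "usage: start_namenode_rolling <downgrade|rollback|started>" >&2
            return 2
            ;;
    esac
}
```

For example, "start_namenode_rolling started" would be used when restarting an upgraded namenode during the rolling upgrade, while "start_namenode_rolling finalize" is refused because finalize is a dfsadmin action, not a namenode startup option.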
