Hadoop namenode formatting problem summary


(Continuous update)


0 Hadoop Cluster Environment

Three RHEL 6.4 machines form a simple HA cluster: two NameNodes (each paired with a ZKFC), plus three JournalNodes co-located with zookeeper-server.
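For reference, a hosts layout like the following matches the hostnames used in the configs below. The IP addresses are illustrative placeholders, not from the original setup:

```
# /etc/hosts on every node (IPs are placeholders)
192.168.1.11    hacl-node1.pepstack.com    hacl-node1    # NameNode hn1 + ZKFC, JournalNode, ZooKeeper
192.168.1.12    hacl-node2.pepstack.com    hacl-node2    # NameNode hn2 + ZKFC, JournalNode, ZooKeeper
192.168.1.13    hacl-node3.pepstack.com    hacl-node3    # JournalNode, ZooKeeper
```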

1) The hdfs-site.xml configuration is as follows:

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Quorum Journal Manager HA:
  http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
-->
<configuration>
    <!-- Quorum Journal Manager HA -->
    <property>
        <name>dfs.nameservices</name>
        <value>hacl</value>
        <description>unique identifiers for each NameNode in the nameservice.</description>
    </property>
    <property>
        <name>dfs.ha.namenodes.hacl</name>
        <value>hn1,hn2</value>
        <description>Configure with a list of comma-separated NameNode IDs.</description>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.hacl.hn1</name>
        <value>hacl-node1.pepstack.com:8020</value>
        <description>the fully-qualified RPC address for each NameNode to listen on.</description>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.hacl.hn2</name>
        <value>hacl-node2.pepstack.com:8020</value>
        <description>the fully-qualified RPC address for each NameNode to listen on.</description>
    </property>
    <property>
        <name>dfs.namenode.http-address.hacl.hn1</name>
        <value>hacl-node1.pepstack.com:50070</value>
        <description>the fully-qualified HTTP address for each NameNode to listen on.</description>
    </property>
    <property>
        <name>dfs.namenode.http-address.hacl.hn2</name>
        <value>hacl-node2.pepstack.com:50070</value>
        <description>the fully-qualified HTTP address for each NameNode to listen on.</description>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hacl-node1.pepstack.com:8485;hacl-node2.pepstack.com:8485;hacl-node3.pepstack.com:8485/hacl</value>
        <description>the URI which identifies the group of JNs where the NameNodes will write or read edits.</description>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/hacl/data/dfs/jn</value>
        <description>the path where the JournalNode daemon will store its local state.</description>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.hacl</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        <description>the Java class that HDFS clients use to contact the Active NameNode.</description>
    </property>
    <!-- Automatic failover adds two new components to an HDFS deployment:
        - a ZooKeeper quorum;
        - the ZKFailoverController process (abbreviated as ZKFC).
        Configuring automatic failover:
    -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
        <description>a list of scripts or Java classes which will be used to fence the Active NameNode during a failover.</description>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/var/lib/hadoop-hdfs/.ssh/id_dsa</value>
        <description>The sshfence option SSHes to the target node and uses fuser to kill the process
          listening on the service's TCP port. In order for this fencing option to work, it must be
          able to SSH to the target node without providing a passphrase. Thus, one must also configure the
          dfs.ha.fencing.ssh.private-key-files option, which is a comma-separated list of SSH private key files.
             logon namenode machine:
             cd /var/lib/hadoop-hdfs
             su hdfs
             ssh-keygen -t dsa
        </description>
    </property>
    <!-- Optionally, one may configure a non-standard username or port to perform the SSH.
      One may also configure a timeout, in milliseconds, for the SSH, after which this
      fencing method will be considered to have failed. It may be configured like so:
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence([[username][:port]])</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled.hacl</name>
        <value>true</value>
    </property>
    <!-- Configurations for NameNode: -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/hacl/data/dfs/nn</value>
        <description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
    </property>
    <property>
        <name>dfs.blocksize</name>
        <value>268435456</value>
        <description>HDFS blocksize of 256MB for large file-systems.</description>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
        <description></description>
    </property>
    <property>
        <name>dfs.namenode.handler.count</name>
        <value>100</value>
        <description>More NameNode server threads to handle RPCs from large number of DataNodes.</description>
    </property>
    <!-- Configurations for DataNode: -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/hacl/data/dfs/dn</value>
        <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
    </property>
</configuration>
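Before restarting any daemons, the edited file can be sanity-checked for well-formedness with xmllint if it is installed. The /etc/hadoop/conf path is an assumption; adjust it to wherever your distribution keeps the config:

```
# Prints nothing on success; reports the exact line of any XML error.
xmllint --noout /etc/hadoop/conf/hdfs-site.xml && echo "hdfs-site.xml OK"
```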

2) The core-site.xml configuration is as follows:

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hacl</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/hdfs/data/tmp</value>
        <description>chown -R hdfs:hdfs hadoop_tmp_dir</description>
    </property>
    <!-- Configuring automatic failover -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hacl-node1.pepstack.com:2181,hacl-node2.pepstack.com:2181,hacl-node3.pepstack.com:2181</value>
        <description>This lists the host-port pairs running the ZooKeeper service.</description>
    </property>
    <!-- Securing access to ZooKeeper -->
</configuration>
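With both files in place, the hdfs getconf subcommand can confirm which values the daemons will actually see; the output should match the XML above:

```
# Should print: hdfs://hacl
hdfs getconf -confKey fs.defaultFS

# Should print: hn1,hn2
hdfs getconf -confKey dfs.ha.namenodes.hacl
```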

1. The namenode formatting process is as follows:

1) Start all journalnodes. The JNs on all three nodes must come up correctly. Stop all namenodes:

# service hadoop-hdfs-journalnode start
# service hadoop-hdfs-namenode stop
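Since this step has to run on every node, a loop over the hostnames from hdfs-site.xml saves typing. This is a sketch that assumes passwordless root SSH between the nodes; the run() wrapper only echoes each command so it can be dry-run safely (drop the echo to execute for real):

```shell
# Dry-run wrapper: echoes each command instead of executing it.
run() { echo "+ $*"; }

# Start the JournalNode on all three nodes.
for node in hacl-node1 hacl-node2 hacl-node3; do
    run ssh root@"$node".pepstack.com service hadoop-hdfs-journalnode start
done

# Stop the NameNode on the two NameNode hosts.
for nn in hacl-node1 hacl-node2; do
    run ssh root@"$nn".pepstack.com service hadoop-hdfs-namenode stop
done
```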

2) Format the namenode. hacl-pepstack-com is the name I gave the cluster; you can choose any name. su - hdfs -c "..." runs the format as the hdfs user.

All directories specified in hdfs-site.xml and core-site.xml must have the correct permissions:

# chown -R hdfs:hdfs /hacl/data/dfs
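The chown assumes the directories already exist. A fuller sketch, run as root with the paths taken from the configs above, would be:

```
# Create the local storage directories from hdfs-site.xml,
# then hand them to the hdfs user.
mkdir -p /hacl/data/dfs/nn    # dfs.namenode.name.dir      (namenode hosts)
mkdir -p /hacl/data/dfs/dn    # dfs.datanode.data.dir      (datanode hosts)
mkdir -p /hacl/data/dfs/jn    # dfs.journalnode.edits.dir  (journalnode hosts)
chown -R hdfs:hdfs /hacl/data/dfs
```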
Format on either namenode; for example, on hn1 execute:

########## hn1
# su - hdfs -c "hdfs namenode -format -clusterid hacl-pepstack-com -force"
# service hadoop-hdfs-namenode start   ##### hn1
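A quick way to confirm the format took effect is to inspect the VERSION file the namenode writes under dfs.namenode.name.dir; its clusterID should be the value passed to -format:

```
# On hn1, after formatting:
grep clusterID /hacl/data/dfs/nn/current/VERSION
# expect: clusterID=hacl-pepstack-com
```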

hn1 must be started first; then execute on the other namenode (hn2):

########## hn2
# su - hdfs -c "hdfs namenode -bootstrapStandby -force"
# service hadoop-hdfs-namenode start   ##### hn2

At this point, both namenodes are formatted and started.
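Since automatic failover is enabled, hdfs haadmin can verify that one namenode went active and the other standby (getServiceState is a standard haadmin subcommand; run it as the hdfs user, using the NameNode IDs from dfs.ha.namenodes.hacl):

```
# su - hdfs -c "hdfs haadmin -getServiceState hn1"   ### expect active or standby
# su - hdfs -c "hdfs haadmin -getServiceState hn2"   ### expect the other state
```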







