Discover NameNode and DataNode in Hadoop: articles, news, trends, analysis, and practical advice about NameNode and DataNode in Hadoop on alibabacloud.com.
    long remaining = getCapacity() - getDfsUsed();
    long available = usage.getAvailable();
    // Here is a bug: this should be usage.getAvailable() - reserved,
    // because getCapacity() above already subtracts reserved.
    if (remaining > available) {
      remaining = available;
    }
    return (remaining > 0) ? remaining : 0;
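With the reserved space subtracted on both sides, the capacity logic behaves like this shell sketch (all numbers are made up for illustration):

```shell
# Hypothetical byte counts illustrating the corrected capacity logic:
# remaining = capacity - dfsUsed, capped at (available - reserved), floored at 0.
capacity=1000
dfsUsed=300
available=500
reserved=100
remaining=$((capacity - dfsUsed))       # 700
avail=$((available - reserved))         # 400: the fix subtracts reserved here too
if [ "$remaining" -gt "$avail" ]; then remaining=$avail; fi
if [ "$remaining" -lt 0 ]; then remaining=0; fi
echo "$remaining"                       # prints 400
```

Without the fix, the cap would be 500 and the volume would report 100 more bytes than it can safely use.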
The DF and DU classes are used to periodically update the partition's space-usage information so that the statistics stay accurate. This is described below.
Use of DU and DF
To accurately obtain the total capacity, usage,
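The DF and DU helper classes obtain these numbers by shelling out to the platform's df and du commands; roughly the equivalent of this sketch (the directory is a stand-in for a dfs data directory):

```shell
# Roughly what Hadoop's DF/DU helpers run under the hood.
dfs_dir=.                                          # stand-in for a dfs.data.dir partition
df -k "$dfs_dir"                                   # capacity/used/available of the partition
used_kb=$(du -sk "$dfs_dir" | awk '{print $1}')    # KB consumed under the directory tree
echo "used: ${used_kb} KB"
```

Because du walks the whole tree, Hadoop caches the result and refreshes it on an interval rather than on every query.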
Simulating NameNode downtime
1) Kill the NameNode process:
   $ kill -9 13481
2) Delete the directory that dfs.name.dir points to, here /home/hadoop/hdfs/name (containing current, image, in_use.lock, previous.checkpoint):
   $ rm -rf *
   This deletes everything under the name directory, but you must ensure that the
Hadoop Source Learning Notes (3) -- A First Look at DataNode and Learning Threads
Starting from the main function, let's see how execution proceeds:
    public class DataNode extends Configured
        implements InterDatanodeProtocol, ClientDatanodeProtocol,
                   FSConstants, Runnable {
      public static DataNode cr
Distributed File System HDFS -- NameNode Architecture
The NameNode is the management node of the entire file system. It maintains the file directory tree of the whole file system (kept in memory to make retrieval faster), the metadata of each file/directory, and the list of data blocks belonging to each file. It also receives user operation requests.
Recently, while running applications on a Hadoop cluster, we hit a situation where jobs submitted to the cluster stayed stuck in the ACCEPTED state for a long time and struggled to obtain resources. Only after a round of log analysis and state inspection did we find that a NameNode active/standby switch had occurred; the previous NameNode active node had b
Objective: when you build a Hadoop cluster, take a snapshot right after the first format; do not skip any step, and format only once. Problem description: starting Hadoop reports that the NameNode is uninitialized: java.io.IOException: NameNode is not formatted. At the same time, if you start the NameNode alone, it will appear
Hadoop DataNode: Adding a Disk
1. Format the disk
   mkfs -t ext4 /dev/xvdd
2. Mount the disk
   vim /etc/fstab; mount -a
3. Create the DataNode data directories
   mkdir -p /data2/hadoopdata3/dfs/data/ /data2/hadoopdata4/dfs/data/
   chown -R hbase:hbase /data2/hadoopdata3 /data2/hadoopdata4
4. Stop the DataNode
   cd /opt/
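Before the restart, the new directories also have to be added to the DataNode's configured storage list. A sketch of the relevant hdfs-site.xml property (the first path is a hypothetical stand-in for the existing data directory; on Hadoop 1.x the property is named dfs.data.dir instead):

```xml
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- comma-separated list; append the new disks to the existing entries -->
  <value>/data1/dfs/data,/data2/hadoopdata3/dfs/data,/data2/hadoopdata4/dfs/data</value>
</property>
```

After restarting the DataNode, the new volumes show up in its storage report and begin receiving blocks.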
For example, if the IP address of the newly added node is 192.168.1.xxx: add a hosts entry "192.168.1.xxx datanode-xxx" on all NN and DN nodes; create the user on xxx with useradd hadoop -s /bin/bash -m; copy all files under the .ssh directory of another DN to /home/hadoop/.ssh on xxx; install the JDK with apt-get install sun-java6-j
When I first got to know Hadoop, I more or less managed to configure a Hadoop cluster, but I often stumbled over seemingly trivial problems. Every time I executed hadoop namenode -format to reformat the Hadoop file system, an error was reported. As a result
Add several new nodes to a running Hadoop cluster
1. Deploy the Java/Hadoop programs on the new node and configure the corresponding environment variables.
2. Add users on the new node, copy id_rsa.pub from the master, and configure authorized_keys.
3. Set up hosts on the new node; only entries for the local machine and the master are needed.
4. Create the related directories on the new node and modify their owne
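Step 2 above (key distribution) can be sketched with temp files standing in for the real ~/.ssh paths on the two machines (all paths and the key content are hypothetical):

```shell
# Simulate appending the master's id_rsa.pub to the new node's authorized_keys.
# /tmp paths are stand-ins for ~/.ssh on the master and the new node.
master_pub=/tmp/demo_master_id_rsa.pub
new_ssh=/tmp/demo_newnode_ssh
mkdir -p "$new_ssh"
echo "ssh-rsa AAAAB3...demo hadoop@master" > "$master_pub"   # fake key content
cat "$master_pub" >> "$new_ssh/authorized_keys"
chmod 600 "$new_ssh/authorized_keys"                         # sshd rejects looser modes
grep -c "hadoop@master" "$new_ssh/authorized_keys"
```

In practice ssh-copy-id does the same append-and-chmod over the network in one step.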
To make it easy to customize what is presented in Hadoop's management interfaces (NameNode and JobTracker), the Hadoop management interface is implemented using a proxy servlet. First, the constructor in org.apache.hadoop.http.HttpServer: public HttpServer(String name, String bindAddress, int port, boolean findPort, Configuration conf, AccessControlList a
Function of this code: get the DataNode names and write them to the file hdfs://copyoftest.c in the HDFS file system, and count the words (wordcount) in hdfs://copyoftest.c. Unlike Hadoop's bundled examples, which read files from the local file system.
    package com.fora;
    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.
After the Hadoop cluster is started, run the jps command to view the processes. Only the TaskTracker process is found on the DataNode node.
Master processes: the two slave node process listings show that there was no DataNode process on the slave nodes. After checking the log, we found that the data directory permissions on data
Reproduce this problem in the test environment and run a sleep job:
cd /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce;hadoop jar hadoop-mapreduce-client-*-tests.jar sleep -Dmapred.job.queue.name=sleep -m5 -r5 -mt 60000 -rt 30000 -recordt 1000
After restarting the NodeManager, an error is reported. Analyze the logs.
However, where can we find the missing AM log? We have co
1) Modify the namespaceID of each slave to make it consistent with the master's namespaceID, or 2) modify the master's namespaceID so that it is consistent with the slaves'. The namespaceID is located in the /usr/hadoop/tmp/dfs/data/current/VERSION file; the front part of the path may vary with the actual setup, but the trailing part is unchanged. Example: view the VERSION file under master.
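The edit itself can be scripted. This sketch rewrites namespaceID in a throwaway copy of a VERSION file; the file path and both ID values are made up for illustration:

```shell
# Create a throwaway file mimicking dfs/data/current/VERSION.
v=/tmp/demo_VERSION
printf 'namespaceID=123456789\nstorageType=DATA_NODE\n' > "$v"
# Rewrite the slave's namespaceID to match the master's (942793052 is made up).
sed -i 's/^namespaceID=.*/namespaceID=942793052/' "$v"
grep '^namespaceID=' "$v"    # prints namespaceID=942793052
```

Run the same substitution against VERSION on each slave (with the DataNode stopped), then restart the DataNodes.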
Configure core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml, and send them to the other nodes. Modify the node information for RM 2..N as above. Format ZK for HDFS: hdfs zkfc -formatZK. Initialize the JournalNodes: hdfs namenode -initializeSharedEdits. You need to start the JournalNode process on each node before this operation; otherwise formatting fails. No reformatting of data is required when converting from non-HA to HA; just follow this procedure. An issue to be
Errors like "namenode running as process 18472. Stop it first." appear in Hadoop in several similar forms:
namenode running as process 32972. Stop it first.
127.0.0.1: ssh: connect to host 127.0.0.1 port 22: No error
127.0.0.1: ssh: connect to host 127.0.0.1 port 22: No error
jobtracker running as process 81312. Stop it first.
127.0.0.1: ssh: connect to host 127.0.0.1 port 22: No error
Solution: You are not sta
Sometimes, because of a temporary adjustment, it is necessary to remove a DataNode from the Hadoop cluster, as follows:
First, add the machine name of the node you want to delete to /etc/hadoop/conf/dfs.exclude.
On the console page you will then see a dead DataNode.
Refresh the node information with the command:
[hdfs@hmc ~]$ hadoop
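For the exclude file to take effect, the NameNode must be told where it lives. A sketch of the hdfs-site.xml wiring, using the exclude path from the text:

```xml
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
```

After the refresh, the node typically moves through "Decommission In Progress" to "Decommissioned" as its blocks are re-replicated, at which point it can be shut down safely.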
When a problem occurs on a single node of a Hadoop cluster, it is generally not necessary to restart the entire system; just restart that node and it will automatically reconnect to the cluster. Enter the following commands on the dead node:
hadoop-daemon.sh start datanode
hadoop-daemon.sh start secondarynamenode
A case follows: a Hadoop node crashed; it could be pinged, but SSH could not connect.
Time: the morning of 2014/9/11.
Symptom: the tc-hadoop018 node