}"/sbin/start-yarn.sh ]; then "${HADOOP_YARN_HOME}"/sbin/start-yarn.sh --config $HADOOP_CONF_DIRfi
After the execution is complete, call jps to check whether all services have been started:
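For reference, a rough sketch of what jps prints when both the HDFS and YARN daemons of a pseudo-distributed Hadoop 2.x installation are up; the process IDs are placeholders and the exact daemon list depends on your configuration:

$ jps
3312 NameNode
3456 DataNode
3601 SecondaryNameNode
3750 ResourceManager
3898 NodeManager
4021 Jps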
2014-07-21 22:05:21,064 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /usr/local/gz/hadoop-2.4.1/dfs/data/in_use.lock acquired by nodename 3312@ubuntu
2014-07-21 22:05:21,075 FATAL org.apache.
All configuration files for Hadoop are in the directory $HADOOP_INSTALL/hadoop/conf.
Startup scripts
The $HADOOP_INSTALL/hadoop/bin directory contains the scripts used to launch the Hadoop DFS and Hadoop MapReduce daemons. These are:
start-all.sh - Starts all Hadoop daemons.
When selecting machines without rack awareness, placement is essentially random: when writing data, Hadoop may write the first block, Block1, to Rack1 and then randomly choose to write Block2 to Rack2, creating a data transfer flow between the two racks; then, again at random, Block3 may be written back to Rack1, generating another data flow between the two racks. When the amount of data being processed by a job is very large, or the amount of data
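Rack awareness is not enabled by default: Hadoop learns each node's rack from an administrator-supplied topology script, pointed to by topology.script.file.name in core-site.xml (newer releases use net.topology.script.file.name). A minimal sketch of such a script, reusing the example addresses from the server list below purely for illustration; the path and rack names are assumptions:

#!/bin/bash
# topology.sh: print one rack path for every IP/hostname Hadoop passes in
for node in "$@"; do
  case "$node" in
    192.168.1.151|192.168.1.152) echo "/rack1" ;;
    192.168.1.153)               echo "/rack2" ;;
    *)                           echo "/default-rack" ;;
  esac
done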
Servers
192.168.1.151 hadoop-master-001
192.168.1.152 hadoop-slave-001
192.168.1.153 hadoop-slave-002
Set up passwordless SSH login from 001 to 001, 002, and 003.
Install SSH on machine 001, switch to the hdfs user's ~/.ssh directory, and use ssh-keygen -t rsa to generate the public and private keys (press Enter at every prompt; no passphrase is set). Then copy the public key to every node that needs passwordless access, as sketched below.
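A minimal sketch of the key generation and distribution, assuming the hdfs user exists on all three machines and the hostnames from the server list above; substitute whatever user and hostnames your cluster actually uses:

# on hadoop-master-001, as the hdfs user
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa            # generate key pair, empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys      # allow login to itself (001 -> 001)
chmod 600 ~/.ssh/authorized_keys

# copy the public key to the slaves (001 -> 002, 001 -> 003)
ssh-copy-id hdfs@hadoop-slave-001
ssh-copy-id hdfs@hadoop-slave-002

# verify: these should no longer ask for a password
ssh hadoop-slave-001 hostname
ssh hadoop-slave-002 hostname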
DataNode: the data storage node (also called the slave node); it stores the actual data, performs the reading and writing of data blocks, and reports its storage information to the NameNode.
Secondary NameNode: acts as the NameNode's assistant, sharing part of the NameNode's workload; it is a cold backup of the NameNode; it merges the fsimage and edits files and then sends the result back to the NameNode.
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /tmp/hadoop/mapred/system. Name node is in safe mode.
The ratio of reported blocks 0.7857 has not reached the threshold 0.9990. Safe mode will be turned off automatically.
at org.apache.
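This exception simply means the NameNode has not yet received enough block reports to leave safe mode. You can wait for the ratio to reach the threshold, or check and leave safe mode by hand. A sketch using the dfsadmin tool (on Hadoop 2.x the same subcommands are available via hdfs dfsadmin):

hadoop dfsadmin -safemode get      # show whether safe mode is ON or OFF
hadoop dfsadmin -safemode wait     # block until the NameNode leaves safe mode on its own
hadoop dfsadmin -safemode leave    # force the NameNode out of safe mode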
Operating modes. There are three modes of operation for Hadoop:
Standalone mode (standalone or local mode): no daemons are required and all programs run in a single JVM. Mainly used in the development phase. The default properties are set for this mode, so no additional configuration is required.
Pseudo-distributed mode (pseudo-distributed model): the Hadoop daemons run on the local machine, simulating a small-scale cluster.
Fully distributed mode (fully distributed model): the Hadoop daemons run on a cluster of machines.
Hadoop introduction: a distributed system infrastructure developed by the Apache Foundation. It lets you develop distributed programs without understanding the details of the underlying distributed layer, making full use of the power of a cluster for high-speed computation and storage. Hadoop implements a distributed file system, the Hadoop Distributed File System, HDFS for short. HDFS features high fault tolerance and
(Hadoop + ZooKeeper)
I. Overview
1.1 Single point of failure in Hadoop 1.0
The NameNode in Hadoop is like a human heart: it is very important and must never stop working. In the Hadoop 1 era there is only one NameNode, and if its data is lost or it stops working, the whole cluster cannot be used, because there is no way to reassemble the file blocks stored on the DataNodes. Therefore, it is necessary to ensure that the NameNode is reliable enough. Hadoop provides two mechanisms to keep the NameNode's data safe; the first mechanism is to back up the persistent information on the NameNode.
"The files under the directory are all deleted. readers can observe the catalogue themselves " Hadoop.tmp.dir "Changes before and after formatting. There are few scenarios in which a format operation fails. If it does appear, check that the configuration is correct.3. start Hadoopafter formatting is complete, start start Hadoop program. since we are using pseudo-distributed installation mode, it is necessary for a single machine to run all
editing (to save and quit, type :wq!). In fact, if you look carefully you will find that the hadoop-env.sh file already contains a JAVA_HOME line; we only need to delete the leading # comment and change the path to our own JDK home. As shown below:
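For example, assuming the JDK was installed under /usr/lib/jvm/jdk1.7.0 (substitute the path your own Java installation actually uses), the uncommented line in hadoop-env.sh would look roughly like:

# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0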
5. Configure core-site.xml
[root@master conf]# vi core-site.xml
(Note: the IP address after hdfs:// must be your CentOS machine's IP address, which is why ifconfig is run first to get it; the localhost entry should be replaced with that address.)
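A minimal core-site.xml along those lines; 192.168.80.100 is only an example address (use the IP reported by ifconfig on your own machine), and the hadoop.tmp.dir path is an assumption:

<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- your machine's IP address, NameNode port 9000 -->
    <value>hdfs://192.168.80.100:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <!-- assumed working directory for HDFS data and metadata -->
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>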
Hadoop pseudo-distributed mode configuration and installation
The basic installation of Hadoop was introduced in the previous section on Hadoop standalone mode. This section describes the basic configuration and deployment of Hadoop in pseudo-distributed mode.
these two parameters are the IP and port configured for mapred.job.tracker in mapred-site.xml;
DFS Master in this box: Host is the machine running the cluster's NameNode; since I am using pseudo-distribution, everything is on 192.168.80.100. Port is the NameNode's port; write 9000 here (the default port number). These two parameters are the IP and port of fs.default.name in core-site.xml. (Use M/R master h
The cluster cannot function without the NameNode: if the NameNode is lost, every file on the file system is effectively gone, because there is no way to reconstruct the files from the blocks stored on the DataNodes.
Therefore, it is important to implement fault tolerance for the NameNode, and Hadoop provides two mechanisms:
(1) Back up the files that make up the persistent state of the file system metadata.
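The usual way to realize mechanism (1) is to have the NameNode write its metadata to more than one directory, typically a local disk plus a remote NFS mount. A sketch for hdfs-site.xml, assuming the Hadoop 1.x property name dfs.name.dir (Hadoop 2.x calls it dfs.namenode.name.dir) and hypothetical paths:

<property>
  <name>dfs.name.dir</name>
  <!-- metadata is written synchronously to every directory in this list -->
  <value>/data/hadoop/name,/mnt/nfs/hadoop/name</value>
</property>

The second mechanism is the Secondary NameNode described earlier, which periodically merges the fsimage and edits files.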
a distributed system, which requires password-free access between nodes. The task in this section is to set up SSH, create users, and configure the Hadoop parameters, completing the build of the HDFS distributed environment. Task implementation: this task clusters four node machines, each installed with the CentOS-6.5-x86_64 system. The IP addresses used by the four nodes are 192.168.23.111, 192.168.23.112, 192.168.23.113 and 192.168.23.114, corresponding to the host names recorded in /etc/hosts (see the sketch below).
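A sketch of the /etc/hosts mapping that every node would carry; the host names master, slave1, slave2 and slave3 are purely hypothetical, since the original text truncates before naming them:

# /etc/hosts on every node (host names below are placeholders)
192.168.23.111  master
192.168.23.112  slave1
192.168.23.113  slave2
192.168.23.114  slave3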
Install fully distributed Hadoop (Ubuntu 12.10) in Linux
Hadoop installation is very simple. You can download the latest version from the official website; it is best to use a stable release. In this example a three-machine cluster is installed. The Hadoop version used is as follows:
EOF
# use the following hdfs-site.xml
cat > hdfs-site.xml <<EOF
<configuration>
  <property><name>dfs.replication</name><value>1</value></property>
</configuration>
EOF
# use the following mapred-site.xml
cat > mapred-site.xml <<EOF
<configuration>
  <property><name>mapred.job.tracker</name><value>$ip:9001</value></property>
</configuration>
EOF
}
# configure ssh password-free login
function PassphraselessSSH() {
  # generate a key pair without a passphrase, but only if one does not already exist
  [ ! -f ~/.ssh/id_dsa ] && ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Run Hadoop WordCount.jar in Linux.
Run Hadoop WordCount in Linux
Open an Ubuntu terminal with the shortcut key Ctrl + Alt + T.
Hadoop launch command: start-all.sh
The normal execution results are as follows:
hadoop@HADOOP:~$ start-all.sh
Warning: $HADOOP_HOME is deprecated.
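Once start-all.sh has brought the daemons up, the WordCount job itself can be run along these lines; the jar name, the example paths and the input file are assumptions that depend on your Hadoop version and data:

# put some local text into HDFS as the job input
hadoop fs -mkdir /user/hadoop/input
hadoop fs -put ~/test.txt /user/hadoop/input

# run the wordcount example from the examples jar against that input
hadoop jar hadoop-examples-1.2.1.jar wordcount /user/hadoop/input /user/hadoop/output

# inspect the result
hadoop fs -cat /user/hadoop/output/part-r-00000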