are many TaskTracker nodes. I deploy the NameNode and JobTracker on dbrg-1, with dbrg-2 and dbrg-3 as DataNodes and TaskTrackers. You can also deploy the NameNode, DataNode, JobTracker, and TaskTracker all on one machine.
Directory structure
Hadoop requires that the deployment directory structure be the same on all machines and that every machine have an account with the same user name. On all three of my machines, there is
# Set Java environment
export JAVA_HOME=/usr/program/jdk1.6.0_27
After editing, save and exit (hint: enter :wq!). In fact, a closer look shows that hadoop-env.sh itself already contains this JAVA_HOME line; we only need to remove the leading # comment and change the path to our Java home. As shown in the following.
4.5. Configuring core-site.xml
[email protected] conf]# vi core-site.xml
hdfs://192.168.154.129:9000
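For reference, a minimal core-site.xml matching the address above might look like the following sketch. The property name fs.default.name follows the older Hadoop 1.x convention (newer releases prefer fs.defaultFS), so adjust for your version:

```xml
<configuration>
  <!-- Address of the NameNode; clients and DataNodes connect here -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.154.129:9000</value>
  </property>
</configuration>
```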
support concurrent write operations, random access, and data modification on the same file by multiple users.
The HDFS architecture is as follows:
Hive Data Management
Okay, now let's get to the main topic: Hive, the main target of this preliminary study. As shown in the preceding figure, Hive is the data warehouse infrastructure built on Hadoop. How should we understand this in plain terms? To put it bluntly, it is:
1.
Description
Hadoop version: hadoop-2.5.0-cdh5.3.6
Environment: CentOS 6.4, must be networked
Hadoop download URL: http://archive.cloudera.com/cdh5/cdh/5/
In fact, compiling is really manual work: follow the official instructions step by step, but you will always hit pitfalls.
Compile steps:
1. Download the source code and decompress it; in this case, extract it to /opt/softwares:
Command: tar -zxvf hadoop-2.5.
configuration directory that must be modified is hadoop-env.sh, where the location of JAVA_HOME is set.
Put the compiled HadoopGrep and RegMapper .class files into the hadoop/build/classes/demo/hadoop/ directory, find a large log file and put it in a directory, and then run hadoop/bin/
hadoop.tmp.dir is the base path that the Hadoop file system depends on; many other paths derive from it. Its default location is under /tmp/hadoop-${user.name}, but storage under /tmp is unsafe, because files there may be deleted after a Linux restart.
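A sketch of moving hadoop.tmp.dir out of /tmp in core-site.xml. The path /usr/local/hadoop/tmp is only an example; pick any directory that survives reboots and is writable by the Hadoop user:

```xml
<property>
  <!-- Base directory for HDFS metadata and data; keep it outside /tmp -->
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/tmp</value>
</property>
```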
After following the steps in the Single Node Setup section of Hadoop Getting Started, the pseudo-distributed file
Hadoop cmake maven protobuf
Problem description
Hadoop installed on 64-bit Linux may report "libhadoop.so.1.0.0 which might have disabled stack guard" in many places. This is because the bundled native library is 32-bit, so Hadoop needs to be compiled manually.
The Hadoop version is 2.2.0, and the operating
If messages such as "successfully formatted" appear, the format succeeded. Note: each format generates a new ID for the NameNode; after multiple formats, if the DataNode's corresponding ID is not updated to match, running WordCount will fail when uploading files to input.
Start HDFS:
start-all.sh
Show processes:
jps
Enter http://localhost:50070/ in the browser; the following page appears.
Enter http://localhost:8088/; the following page appears.
This indicates that the pseudo-distributed installation is complete.
Hadoop in the Big Data Era (1): Hadoop Installation
If you want a better understanding of Hadoop, you should first understand its start and stop scripts. After all, Hadoop is a distributed storage and computing framework, but how do you start and manage t
To test job submission, create a user named joe, as in the documentation.
[root@com2 mr]# useradd joe
[root@com2 mr]# passwd joe
[root@com2 mr]# su joe
[joe@com2 mr]$ export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
[joe@com2 mr]$ sudo -u hdfs hadoop fs -mkdir /user/joe
[joe@com2 mr]$ sudo -u hdfs hadoop fs -chown joe /user/joe
[joe@com2 mr]$ h
scheduled for the job.
Why is only one DataNode started in the Hadoop distributed configuration?
Is the firewall configured? Is the fs.default.name setting in core-site.xml correct? Are the cluster IDs of the NameNode and DataNodes different because namenode -format was executed after the system was started? What do GoPivotal and Hadoop HDFS mean? To put it simply,
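One quick way to check the cluster ID mismatch mentioned above is to compare the VERSION files under the NameNode and DataNode storage directories (in real deployments they live under hadoop.tmp.dir, typically at dfs/name/current/VERSION and dfs/data/current/VERSION). The sketch below simulates that check with two hypothetical VERSION files; substitute your real paths:

```shell
# Simulated clusterID comparison; replace the two files below with your
# real .../dfs/name/current/VERSION and .../dfs/data/current/VERSION.
d=$(mktemp -d)
printf 'clusterID=CID-aaaa\n' > "$d/name_VERSION"   # NameNode side
printf 'clusterID=CID-bbbb\n' > "$d/data_VERSION"   # DataNode side
nn=$(grep '^clusterID=' "$d/name_VERSION" | cut -d= -f2)
dn=$(grep '^clusterID=' "$d/data_VERSION" | cut -d= -f2)
if [ "$nn" != "$dn" ]; then
  echo "clusterID mismatch: sync the DataNode VERSION or clear its data dir"
fi
```

If the two IDs differ, either copy the NameNode's clusterID into the DataNode's VERSION file or delete the DataNode's data directory so it re-registers on the next start.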
package, click OK to finish. 8) If the project still shows errors when compiling, right-click the project, select Build Path --> Libraries, find JRE System Library --> click Access rules: No rules defined, select Edit, change Resolution to Accessible, fill the Rule Pattern with **/*, and click OK. This completes the setup.
II. Writing Hadoop programs under Eclipse
Basic steps: 1) Create a new Hadoop project, and in the project create a new lib directo
configuration file;
When switching between the two modes, if the cluster does not start properly, you can delete the temporary folders of the nodes involved. Although the previous data will be lost, this ensures the cluster starts correctly. So if the cluster could be started before but now will not start, and especially if the DataNode cannot start, you may want to try deleting, on all nodes (including slave nodes), the /usr/local/hadoop
4. Generate public and private keys:
$ ssh-keygen -t rsa -P ""
At this point, two files are generated under /home/hduser/.ssh: id_rsa and id_rsa.pub; the former is the private key, the latter the public key.
5. Now we append the public key to authorized_keys:
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
6. Log in via SSH and confirm that no password is needed:
ssh localhost
7. Log out:
exit
If you log in again, you will not need a password.
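The key-generation steps above can be sketched non-interactively as follows. This version writes into a throwaway temp directory rather than ~/.ssh, so it will not touch your real keys; swap in ~/.ssh for the real setup:

```shell
# Generate an RSA keypair with an empty passphrase, non-interactively.
d=$(mktemp -d)
ssh-keygen -q -t rsa -P "" -f "$d/id_rsa"
# Append the public key to a demo authorized_keys file (use
# ~/.ssh/authorized_keys in the real setup).
cat "$d/id_rsa.pub" >> "$d/authorized_keys"
ls "$d"
```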
IV. Install Hadoop
1. Create the Hadoop user group and Hadoop user.
Step 1: create the Hadoop user group: ~$ sudo addgroup hadoop
Step 2: create the Hadoop user: ~$ sudo adduser -ingroup hadoop hadoop
Enter the password when prompted; this is the new
function is to periodically merge the namespace image file of the metadata node with the edit log, to prevent the log file from becoming too large. This will be described in the narrative below. The merged namespace image file is also saved on the secondary metadata node, and can be used for recovery when the metadata node fails.
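The checkpoint interval described above is configurable. A hedged sketch for hdfs-site.xml, using the Hadoop 2.x property name (older 1.x releases call it fs.checkpoint.period; 3600 seconds is the usual default):

```xml
<!-- hdfs-site.xml: how often (in seconds) the secondary node merges
     the namespace image with the edit log -->
<property>
  <name>dfs.namenode.checkpoint.period</name>
  <value>3600</value>
</property>
```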
Basic file Commands
The HDFS file system command takes the form:
hadoop fs -cmd <args>, where cmd is a specific file command and <args> is a variable set of arguments.
Install Eclipse
Download Eclipse (click to download) and unzip it to install. I installed it under the /usr/local/software/ directory.
Installing the Hadoop plugin on Eclipse
Download the Hadoop plugin (click to download) and put the plugin in the eclipse/plugins directory.
Restart Eclipse and configure the Hadoop installat
hadoop fs -mkdir input    # create the input directory on HDFS
hadoop fs -put ~/file/file*.txt input    # upload files from the local file folder to HDFS
Run the WordCount program:
hadoop jar /usr/local/hadoop/hadoop-0.20.2/hadoop-0.20.2
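As a sanity check without a cluster, the count that WordCount produces can be approximated locally with standard shell tools. The sample file and its contents below are throwaway examples:

```shell
# Count word frequencies in a small sample file, most frequent first.
f=$(mktemp)
printf 'hello world\nhello hadoop\n' > "$f"
tr -s ' ' '\n' < "$f" | sort | uniq -c | sort -rn
```

This mirrors the map (split into words), shuffle (sort), and reduce (uniq -c) phases of the MapReduce job on a single machine.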