####
#### Install a Hadoop 2.6.0 fully distributed cluster
####
#### File and system versions:
####
hadoop-2.6.0
Java version 1.8.0_77
CentOS 64-bit
#### Preparation
####
Under /home/hadoop/: mkdir Cloud
Put the Java and Hadoop packages under /home/hadoop/Cloud
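A minimal sketch of this preparation step, assuming the downloaded archives are named jdk-8u77-linux-x64.tar.gz and hadoop-2.6.0.tar.gz (the archive names are assumptions; use whatever you actually downloaded):
mkdir -p /home/hadoop/Cloud
cd /home/hadoop/Cloud
tar -zxf ~/jdk-8u77-linux-x64.tar.gz      (produces jdk1.8.0_77)
tar -zxf ~/hadoop-2.6.0.tar.gz            (produces hadoop-2.6.0)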
#### Configure static IPs
####
Master   192.168.116.100
Slave1   192.168.116.110
Slave2   192.168.116.120
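A hedged example of how the static address can be set on CentOS 6, shown for the master; the interface name eth0 and the gateway 192.168.116.1 are assumptions, so use the values of your own network:
vim /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.116.100      (use 192.168.116.110 on slave1 and 192.168.116.120 on slave2)
NETMASK=255.255.255.0
GATEWAY=192.168.116.1
Then: service network restart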
#### Modify the machine hostnames (all as root)
####
su root
vim /etc/hosts
Below the existing entries, add (IP and hostname separated by spaces/tab):
192.168.116.100 master
192.168.116.110 slave1
192.168.116.120 slave2
vim /etc/hostname      (on the master)
master
shutdown -r now        (restart the machine)
vim /etc/hostname      (on slave1)
slave1
shutdown -r now
vim /etc/hostname      (on slave2)
slave2
shutdown -r now
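Note, as an addition to the original steps: on CentOS 6.x the persistent hostname normally lives in /etc/sysconfig/network rather than /etc/hostname, so if the change above does not survive the reboot, set it there instead, for example on the master:
vim /etc/sysconfig/network
HOSTNAME=master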
#### Install OpenSSH and configure passwordless SSH
####
su root
yum install openssh
exit                   (switch back to the hadoop user)
ssh-keygen -t rsa      (run on every node as the hadoop user)
Press Enter through all the prompts.
Send the public keys of slave1 and slave2 to the master:
scp /home/hadoop/.ssh/id_rsa.pub hadoop@master:~/.ssh/slave1.pub      (run on slave1)
scp /home/hadoop/.ssh/id_rsa.pub hadoop@master:~/.ssh/slave2.pub      (run on slave2)
On the master: cd ~/.ssh/
cat id_rsa.pub >> authorized_keys
cat slave1.pub >> authorized_keys
cat slave2.pub >> authorized_keys
Send the merged key file back to slave1 and slave2:
scp authorized_keys hadoop@slave1:~/.ssh/
scp authorized_keys hadoop@slave2:~/.ssh/
ssh slave1
ssh slave2
ssh master
Answer yes when prompted.
Passwordless SSH login is now configured.
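If the passwordless login still prompts for a password, the usual cause is the permissions of the .ssh directory; a quick fix, run as the hadoop user on every node:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys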
####
#### Set JAVA_HOME and HADOOP_HOME
####
su root
vim /etc/profile
Input:
export JAVA_HOME=/home/hadoop/Cloud/jdk1.8.0_77
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/home/hadoop/Cloud/hadoop-2.6.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Then: source /etc/profile
(all three machines must be configured)
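A quick sanity check after sourcing the profile (the version strings simply reflect what this guide installs):
java -version       (should report 1.8.0_77)
hadoop version      (should report Hadoop 2.6.0)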
####
#### Configure the Hadoop files
####
Under /home/hadoop/Cloud/hadoop-2.6.0/sbin:
vim hadoop-daemon.sh
Change the directory the PID file is written to (by default it goes to /tmp, where it may get cleaned up), as shown below.
vim yarn-daemon.sh
Change the PID directory in the same way.
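A sketch of the change itself, assuming the PID files should live under the workspace directory instead of the default /tmp (the exact path is an assumption):
In hadoop-daemon.sh:  HADOOP_PID_DIR=/home/hadoop/Cloud/workspace/pids
In yarn-daemon.sh:    YARN_PID_DIR=/home/hadoop/Cloud/workspace/pids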
Under /home/hadoop/Cloud/hadoop-2.6.0/etc/hadoop:
vim slaves and input:
master
slave1
slave2
vim hadoop-env.sh and input:
export JAVA_HOME=/home/hadoop/Cloud/jdk1.8.0_77
export HADOOP_HOME_WARN_SUPPRESS="TRUE"
vim core-site.xml and input:
############################################## #core
<configuration>
<property>
<name>io.native.lib.available</name>
<value>true</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
<final>true</final>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/Cloud/workspace/temp</value>
</property>
</configuration>
################################################ #core
vim hdfs-site.xml and input:
##################################################### #hdfs
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/Cloud/workspace/hdfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/Cloud/workspace/hdfs/data</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
###################################################### #hdfs
vim mapred-site.xml and input (copy it from mapred-site.xml.template if the file does not exist yet):
##################################### #mapred
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:9001</value>
</property>
</configuration>
##################################### #mapred
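Note: mapred.job.tracker is a Hadoop 1.x property. Since the jps output further below shows the YARN daemons (ResourceManager/NodeManager), a commonly used alternative, offered here only as a sketch and not part of the original setup, is to run MapReduce on YARN; the hostname value master comes from this guide's /etc/hosts:
##################################### #yarn (optional sketch)
In mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
In yarn-site.xml:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
##################################### #yarn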
Send the configured Hadoop to slave1 and slave2:
scp -r hadoop-2.6.0 hadoop@slave1:~/Cloud/
scp -r hadoop-2.6.0 hadoop@slave2:~/Cloud/
Send the Java package to slave1 and slave2:
scp -r jdk1.8.0_77 hadoop@slave1:~/Cloud/
scp -r jdk1.8.0_77 hadoop@slave2:~/Cloud/
Here, the Hadoop cluster configuration is complete
####
#### Now start Hadoop
####
First of all, format the NameNode:
hadoop namenode -format    (can be run from any directory thanks to the hadoop-env.sh and environment settings above)
Check the log output; if it reports success, continue.
start-all.sh
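start-all.sh still works in Hadoop 2.x but is marked deprecated; the equivalent two-step form, using the same sbin scripts, is:
start-dfs.sh
start-yarn.sh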
Then verify with jps on each node:
[hadoop@master ~]$ jps
42306 ResourceManager
42407 NodeManager
42151 SecondaryNameNode
41880 NameNode
41979 DataNode
[hadoop@slave1 ~]$ jps
21033 NodeManager
20926 DataNode
[hadoop@slave2 ~]$ jps
20568 NodeManager
20462 DataNode
At this point, the hadoop-2.6.0 fully distributed configuration is complete.
Here are the Hadoop web UI ports (open them in a browser on the master):
localhost:50070    (HDFS NameNode UI)
localhost:8088     (YARN ResourceManager UI)
####
#### Configure the C API to connect to HDFS
####
find / -name libhdfs.so.0.0.0
vi /etc/ld.so.conf
Add:
/home/hadoop/Cloud/hadoop-2.6.0/lib/native/
/home/hadoop/Cloud/jdk1.8.0_77/jre/lib/amd64/server/
Then refresh the dynamic linker cache:
/sbin/ldconfig -v
Then configure the CLASSPATH environment variable.
Find all the Hadoop jars and print export lines for them:
find /home/hadoop/Cloud/hadoop-2.6.0/share/ -name "*.jar" | awk '{printf("export CLASSPATH=%s:$CLASSPATH\n", $0);}'
The printed output looks like this:
export CLASSPATH=/home/hadoop/Cloud/hadoop-2.6.0/share/hadoop/common/lib/activation-1.1.jar:$CLASSPATH
export CLASSPATH=/home/hadoop/Cloud/hadoop-2.6.0/share/hadoop/common/lib/jsch-0.1.42.jar:$CLASSPATH
......
Add all of the printed lines to the environment variables: vim /etc/profile
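Instead of copying the printed lines by hand, the same output can be appended directly, a convenience sketch to run as root, re-sourcing the profile afterwards:
find /home/hadoop/Cloud/hadoop-2.6.0/share/ -name "*.jar" | awk '{printf("export CLASSPATH=%s:$CLASSPATH\n", $0);}' >> /etc/profile
source /etc/profile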
Then write the C language code to verify that the configuration was successful:
vim above_sample.c
The code reads as follows:
#################################################################################
#include "Hdfs.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (int argc, char **argv) {
HDFSFS FS =hdfsconnect ("192.168.116.100", 9000); Made a little change here
Const char* Writepath = "/tmp/testfile.txt";
Hdfsfile WriteFile = Hdfsopenfile (Fs,writepath, o_wronly| O_creat, 0, 0, 0);
if (!writefile) {
fprintf (stderr, "Failed toopen%s for writing!\n", Writepath);
Exit (-1);
}
char* buffer = "hello,world!";
Tsize num_written_bytes = Hdfswrite (Fs,writefile, (void*) buffer, strlen (buffer) +1);
if (Hdfsflush (FS, WriteFile)) {
fprintf (stderr, "Failed to ' flush '%s\n", Writepath);
Exit (-1);
}
Hdfsclosefile (FS, WriteFile);
}
###############################################################################
Compile the C code:
gcc above_sample.c -I /home/hadoop/Cloud/hadoop-2.6.0/include/ -L /home/hadoop/Cloud/hadoop-2.6.0/lib/native/ -lhdfs /home/hadoop/Cloud/jdk1.8.0_77/jre/lib/amd64/server/libjvm.so -o above_sample
Run the compiled above_sample binary:
./above_sample
Check the output and verify that /tmp/testfile.txt has been created in HDFS.
At this point, the C API connection to HDFS is configured.
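A quick check from the HDFS side that the write really happened (the path is the writePath used in above_sample.c):
hadoop fs -cat /tmp/testfile.txt      (should print hello,world!)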
####
#### File operations on the cluster
####
#### Automatic distribution script: auto.sh
vim auto.sh
chmod +x auto.sh
./auto.sh jdk1.8.0_77 ~/Cloud/
The automatic distribution script:
############################
#!/bin/bash
# distribute a file or directory to every slave node
nodes=(slave1 slave2)
num=${#nodes[@]}
file=$1
dst_path=$2
for ((i=0;i<${num};i++)); do
    scp -r ${file} ${nodes[i]}:${dst_path};
done
####################
####
#### Basic operations on the hadoop-2.6.0 fully distributed cluster
####
hdfs dfs -mkdir /in
echo "Hello hadoop" > test1.txt
Import the files from the current directory into the /in directory of HDFS:
hadoop dfs -put ./* /in
hadoop dfs -ls /in/*
hadoop dfs -cp /in/test1.txt /in/test1.txt.bak
hadoop dfs -ls /in/*
hadoop dfs -rm /in/test1.txt.bak
mkdir dir_from_hdfs
Download all files from the HDFS directory /in into dir_from_hdfs:
hadoop dfs -get /in/* dir_from_hdfs
cd /home/hadoop/Cloud/hadoop-2.6.0
Count the words in all text files under /in, separated by spaces (note that the /output/wordcount directory must not already exist):
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /in /output/wordcount
View the statistics:
hadoop fs -cat /output/wordcount/part-r-00000
####
#### Management
####
1. Cluster-related management:
Edit log: when a file system client performs a write, the operation is first recorded in the edit log. After the change has been logged, the NameNode updates its in-memory data structures. The edit log is synced to the file system before each write is reported as successful.
fsimage: the namespace image, a checkpoint of the in-memory metadata persisted to disk. When the NameNode fails, the latest checkpointed metadata is loaded from the fsimage into memory, and the operations recorded in the edit log are then replayed on top of it. The SecondaryNameNode helps the NameNode checkpoint its in-memory metadata to disk.
2. Cluster properties:
Advantages: 1) it can handle very large files; 2) it provides streaming access to data. HDFS handles "write once, read many times" workloads very well: once a dataset is generated, it is replicated to different storage nodes and then serves a variety of analysis tasks. In most cases an analysis task touches most of the data in the dataset, so reading the whole dataset is more efficient in HDFS than reading individual records.
Disadvantages: 1) it is not suited to low-latency data access: HDFS is designed for analysis of large datasets, so latency can be high. 2) it cannot efficiently store large numbers of small files: because the NameNode keeps the file system metadata in memory, the number of files the file system can hold is limited by the size of the NameNode's memory. 3) it does not support multiple writers or arbitrary file modification: a file in HDFS has only one writer at a time, and writes can only happen at the end of the file (append only). HDFS currently does not support multiple users writing to the same file, nor modifying it at arbitrary positions.
This article is from the "10700016" blog, please be sure to keep this source http://10710016.blog.51cto.com/10700016/1896278