Discover Cloudera Hadoop installation: articles, news, trends, analysis, and practical advice about Cloudera Hadoop installation on alibabacloud.com.
Original work by Inkfish; do not reproduce for commercial purposes. If reproducing, please cite the source (http://blog.csdn.net/inkfish).
Pig is a project Yahoo! donated to Apache; it is currently in the Apache Incubator phase, and the current version is v0.5.0. Pig is a large-scale data analysis platform built on Hadoop. It provides a SQL-like language called Pig Latin, and its compiler translates SQL-style data analysis requests into a series of optimized MapReduce operations.
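To make the Pig Latin style concrete, here is a minimal word-count sketch run in local mode (the file names input.txt and wordcount_out are hypothetical, and it assumes pig is on the PATH):

pig -x local <<'EOF'
-- load each line, split it into words, and count occurrences of each word
lines   = LOAD 'input.txt' AS (line:chararray);
words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
grouped = GROUP words BY word;
counts  = FOREACH grouped GENERATE group, COUNT(words);
STORE counts INTO 'wordcount_out';
EOF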
Mac Hadoop configuration:
1. Modify /etc/hosts: 127.0.0.1 localhost
2. Download hadoop-2.9.0 and the JDK, install them, and set up the environment:
vim /etc/profile
export HADOOP_HOME=/Users/yg/app/cluster/hadoop-2.9.0
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_144.jdk/Contents/Home
JDK installation, step 1:
1. Put the downloaded JDK in the directory where it is to be installed (my directory is /root/hadoop/opt/cloud; use WinSCP to drag it directly to the target directory)
2. Unzip it in the target directory: sudo tar xvf jdk-7u45-linux-x64.tar.gz
3. Configure the environment variables. Here I run the command from that directory:
[root@localhost cloud]# /bin/vi /etc/profile
The advan…
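After editing /etc/profile, a quick sanity check (a sketch, assuming the profile sets JAVA_HOME to the unpacked jdk1.7.0_45 directory and adds its bin to PATH):

source /etc/profile   # make the new variables take effect in the current shell
java -version         # should report 1.7.0_45 if the setup took effect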
1. Installing various software requires configuring the PATH variable and assorted HOME variables; what are these for? The main purpose of configuring the PATH variable is so that the commands shipped with the software, such as start-all.sh, can be run from any directory, under any path.
2. These packages also come with scripts whose names end in -env.sh, where you usually also configure variables such as JAVA_HOME. Why? This is mainly because the daemons these scripts launch need to locate the JDK themselves when they start, as sketched below.
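A minimal sketch of the second point for Hadoop (the JDK path shown is an assumption):

# etc/hadoop/hadoop-env.sh: daemons started by the control scripts read this
# file, so JAVA_HOME must be set here even if it is already in your login shell
export JAVA_HOME=/usr/java/jdk1.7.0_79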
Reference: http://blog.csdn.net/lalaguozhe/article/details/10912527. Environment: hadoop2.3-cdh5.0.2, Hive 1.2.1. Goal: install LZO, then test creating a Hive table stored in LZO format and running a job against it. While trying Snappy earlier, I found that the native libraries shipped with CDH include libsnappy but do not include LZO. Therefore, to use LZO, besides installing the LZO program itself you must also compile and install the Hadoop LZO bindings.
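Once the LZO libraries are in place, the table-creation test can look along these lines (a sketch; the table name and columns are hypothetical, and the input/output format classes are those commonly used with hadoop-lzo):

hive -e "
CREATE TABLE lzo_test (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
"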
1. Hive MySQL metastore installation: preparation
Unzip hive-0.12.0.tar.gz to /zzy/:
# tar -zxvf hive-0.12.0.tar.gz -C /zzy    (-C specifies the extraction path)
Modify the /etc/profile file to add Hive to the environment variables:
# vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_79
export HADOOP_HOME=/itcast/hadoop-2.4.1
export HIVE_HOME=/itcast/hive-0.12.0
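To point Hive at a MySQL metastore, the usual next step is a hive-site.xml with the JDBC connection properties. A minimal sketch (the host, database name, and credentials are assumptions; the property names are standard Hive settings):

cat > $HIVE_HOME/conf/hive-site.xml <<'EOF'
<configuration>
  <!-- MySQL database that will hold the metastore tables -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123</value>
  </property>
</configuration>
EOF

The MySQL JDBC driver jar must also be copied into $HIVE_HOME/lib.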
Replaced by:
export JAVA_HOME=/opt/jdk1.8.0_181/
Third, copy to the slaves.
Fourth, format HDFS. Execute the following command in the shell:
hadoop namenode -format
Formatting succeeded if log content like the following (shown in red in the original) appears:
18/10/12 12:38:33 INFO util.GSet: capacity = 2^15 = 32768 entries
18/10/12 12:38:33 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1164998719-192.168.56.10-1539362313584
18/10/12 12:38:33 INFO common.Storage: Storage directory /opt/hdfs/name has been successfully formatted.
What we use is bridged networking, which connects the virtual machine to the physical network: it occupies an IP address on the physical LAN, enabling communication between the virtual machine and the physical machine as well as across physical machines. Build another virtual machine, this time using VirtualBox. Check the firewall and disable it. chkconfig --list shows all system services; if a service is marked on, it will start under the corresponding runlevels. All of them are disabled here.
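A minimal sketch of checking and disabling the firewall on a CentOS 6-style system (iptables is the usual service name there):

chkconfig --list        # list all services and their per-runlevel on/off state
service iptables stop   # stop the firewall in the current session
chkconfig iptables off  # prevent it from starting at boot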
the mapred.job.shuffle.input.buffer.percent property, which specifies the percentage of heap space used for this purpose). If the amount of data exceeds a certain percentage of the buffer size (set by mapred.job.shuffle.merge.percent), the data is merged and then spilled to disk. 2. As spill files accumulate, a background thread merges them into a larger, sorted file to save time in subsequent merges. In fact, on both the map side and the reduce side, MapReduce repeatedly performs sorting and merging.
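If these two knobs need changing, they go inside the <configuration> element of mapred-site.xml. A minimal sketch (0.70 and 0.66 are the commonly cited defaults, shown for illustration rather than as recommendations):

<property>
  <name>mapred.job.shuffle.input.buffer.percent</name>
  <value>0.70</value>   <!-- share of reducer heap used to buffer map outputs -->
</property>
<property>
  <name>mapred.job.shuffle.merge.percent</name>
  <value>0.66</value>   <!-- buffer fill ratio that triggers merge-and-spill -->
</property>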
2. Environment deployment
The deployment of the Zookeeper cluster is based on the Hadoop cluster deployed in the previous article. The cluster configuration is as follows:
Zookeeper1 rango 192.168.56.1
Zookeeper2 vm2 192.168.56.102
Zookeeper3 vm3 192.168.56.103
Zookeeper4 vm4 192.168.56.104
Zookeeper5 vm1 192.168.56.101
3. Installation and configuration
3.1 Download and install Zookeeper
Download the latest stable release of Zookeeper.
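After unpacking, the core configuration is conf/zoo.cfg. A minimal sketch for the five nodes listed above (dataDir, the ports, and $ZOOKEEPER_HOME are assumptions; each server also needs a myid file in dataDir containing its own number):

cat > $ZOOKEEPER_HOME/conf/zoo.cfg <<'EOF'
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=192.168.56.1:2888:3888
server.2=192.168.56.102:2888:3888
server.3=192.168.56.103:2888:3888
server.4=192.168.56.104:2888:3888
server.5=192.168.56.101:2888:3888
EOF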
(query statement)
sqoop import --connect jdbc:mysql://192.168.1.10:3306/itcast --username root --password 123 \
--query 'SELECT * FROM trade_detail WHERE id > 2 AND $CONDITIONS' --split-by trade_detail.id --target-dir '/sqoop/td3'
Note: when using the --query option, the WHERE clause must include the $CONDITIONS parameter. There is also a difference between single and double quotes: if --query is followed by double quotes, you must escape it as \$CONDITIONS.
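For illustration, the same import written with double quotes, following the escaping rule above (all connection details are copied from the example):

sqoop import --connect jdbc:mysql://192.168.1.10:3306/itcast --username root --password 123 \
--query "SELECT * FROM trade_detail WHERE id > 2 AND \$CONDITIONS" --split-by trade_detail.id --target-dir '/sqoop/td3'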
completes, the JDK folder will be generated in the /opt/tools directory.
./jdk-6u34-linux-i586.bin
To configure the JDK environment:
root@ubuntu:/opt/tools# sudo gedit /etc/profile
In the profile file, add:
export JAVA_HOME=/opt/tools/jdk1.6.0_34
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
Save the file and close it, then execute the following command to make the configuration file take effect:
source /etc/profile
Step one: choose the tar.gz of the Hadoop version you want to install and extract the archive to the chosen directory.
Step two: create a folder to hold the data. You can name this folder yourself, but it should contain three subfolders (the three can be kept separately, but generally we put them in the same parent folder); a sketch of creating them follows below. Of these three folders, data is used by the DataNode node; what it stores is sav…
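A minimal sketch, under the assumption that the three subfolders are the conventional name, data, and tmp directories (the parent path /opt/hadoop-data is hypothetical):

mkdir -p /opt/hadoop-data/{name,data,tmp}   # name: NameNode metadata; data: DataNode blocks; tmp: scratch space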
(pubdate='2010-08-22');
load data local inpath '/root/data.am' into table beauty partition (nation="USA");
select nation, avg(size) from beauties group by nation order by avg(size);
II. UDF
A custom UDF inherits from the org.apache.hadoop.hive.ql.exec.UDF class and implements an evaluate method:
public class AreaUDF extends UDF { private static Map…
Custom function call procedure:
1. Add the jar package (executed at the Hive command line):
hive> add jar /root/nudf.jar;
2. Create a temporary function:
hive> create te…
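A hypothetical completion of step 2 (the function name getarea and the column it is applied to are assumptions; AreaUDF is written without a package here, so substitute the class's fully qualified name):

hive> create temporary function getarea as 'AreaUDF';
hive> select getarea(id) from beauty;   -- call it like any built-in function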
Installing Hadoop
My installation path is the software directory under the root directory.
Unzip the Hadoop package into the software directory and view the directory after decompression.
The following configuration files need to be modified:
Modify hadoop-env.sh
Modify the core-site.xml file
Configure hdfs-site.xml
Configure mapred-site.xml
Configure yarn-site.xml
Configure slaves
Format the HDFS file system
…
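As a sketch of the first XML edit above, a minimal single-node core-site.xml (the localhost:9000 address and the tmp path are assumptions):

<configuration>
  <property>
    <name>fs.defaultFS</name>                    <!-- default file system URI -->
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>                  <!-- base for HDFS working dirs -->
    <value>/root/software/hadoop/tmp</value>
  </property>
</configuration>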
I. Preparation:
1. Four Linux virtual machines (1 NameNode node, 1 secondary node (the SecondaryNameNode shares its machine with 1 DataNode), plus 2 more DataNodes)
2. Download Hadoop; this example uses the hadoop-2.5.2 release
II. Install the Java JDK (JDK 1.7 is best for compatibility):
rpm -ivh jdk-7u79-linux-…
Then add to /root/.bash_profile:
JAVA_HOME=/usr/java/jdk1.7.0_79
PATH=$PATH:$JAVA_HOME/bin
1. First download the Eclipse plugin matching your Hadoop version; for Hadoop 1.0, the corresponding plugin hadoop-eclipse-plugin-1.0.3.jar is used as the example.
2. Place the downloaded plugin in the plugins directory of the Eclipse installation directory.
3. Start Eclipse, click Window -> Show View -> Other, then click MapReduce Tools -> Map…