CDH: The full name is Cloudera's Distribution Including Apache Hadoop. Origin of the CDH version: Hadoop is an open source project, so many companies have commercialized it, and Cloudera has made its own changes to Hadoop. We call Cloudera's release CDH (Cloudera Distribution Hadoop). So far
I. Introduction to Hadoop distributions
There are many Hadoop distributions available, including the Intel distribution, the Huawei distribution, the Cloudera distribution (CDH), the Hortonworks distribution, and so on. All of them are based on Apache Hadoop. There are so many versions because of Apache Hadoop's open source license: Anyo
An error message appears when adding the Hive service in CM after installing CDH
When adding the service, Hive is configured as follows:
Error message:
Error log:
exec /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hadoop/bin/hadoop jar /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive
1. Create the lib121 directory under the Hive 0.13.1 installation
cd /opt/cloudera/parcels/CDH/lib/hive; mkdir lib121
2. Download Hive 1.2.1 and copy all files from its lib directory into lib121
3. Modify the HIVE_LIB variable in /opt/cloudera/parcels/CDH/lib/hive/bin/hive
HIVE_LIB=${HIVE_HOME}/lib121
4. Update the JLine jar package on Hadoop and remove the ol
I. Risks are classified into internal and external
First, internal:
During the deployment of a CDH big data cluster, users named after the services are created automatically. Each user entry has the form:
username (login_name):password placeholder (passwd):user ID (UID):group ID (GID):comment (users):home directory:login shell
cat /etc/shadow
The second column of the shadow file is the encrypted password. This column is "!!", that is, ":!!:",
Original address: http://blog.csdn.net/a921122/article/details/51939692
File Download
CDH (Cloudera's Distribution Including Apache Hadoop) is one of the many branches of Hadoop. It is built and maintained by Cloudera, is based on a stable version of Apache Hadoop, integrates many patches, and can be used directly in prod
Three cluster nodes
192.168.1.170 CDH-Master
192.168.1.171 Cdh-slave-1
192.168.1.171 Cdh-slave-2
1. Install CentOS 6.5 (64-bit) and set up the basic environment, including:
(1) Add sudo permissions
(2) Modify the host name, gateway, static IP address, and DNS
(3) Disable SELinux and the firewall
Refer to the article
(4) Modify the system time zone and configure the NT
Recently the CDH cluster has been raising frequent alarms because some hosts swap frequently, which greatly affects cluster performance. It turned out that a setting (/proc/sys/vm/swappiness) needs to be modified; its default value is 60.
Setting the vm.swappiness Linux Kernel Parameter
vm.swappiness is a Linux kernel parameter that controls how aggressively memory pages are swapped to disk. It can be set to a value between 0 and 100; the higher the value, the more aggressive the ker
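As a rough illustration of checking this setting, the sketch below reads the value from /proc/sys/vm/swappiness and flags it when it is above a threshold. The function names and the threshold of 10 are assumptions made for this example, not values taken from the excerpt; only the file path, the default of 60, and the 0-100 range come from the text.

```python
from pathlib import Path

# Path named in the text; only present on Linux hosts.
SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current vm.swappiness value as an integer."""
    return int(Path(path).read_text().strip())


def needs_tuning(value, threshold=10):
    """Return True when the value is above the threshold.

    The threshold of 10 is a hypothetical choice for illustration only.
    """
    if not 0 <= value <= 100:
        raise ValueError(f"swappiness must be between 0 and 100, got {value}")
    return value > threshold


if __name__ == "__main__":
    current = read_swappiness()
    status = "consider lowering" if needs_tuning(current) else "ok"
    print(f"vm.swappiness = {current} ({status})")
```

With the default value of 60 mentioned above, `needs_tuning(60)` returns True under this assumed threshold.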
Error when executing a Hive count query:
Error: Java heap space
The solution is: set io.sort.mb=10;
Running the Hadoop examples fails with the same Java heap space problem:
Diagnostic Messages for this Task:
Error: Java heap space
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
Hive reports an error when executing HQL: Er
is the module responsible for interaction with Hive in Hue. Therefore, we need to do some configuration before starting the Hue service. The configuration falls into two parts: the configuration that Hue itself needs, and the configuration that must be changed in the Hadoop cluster to work with Hue. The Hue configuration file is /etc/hue/conf/hue.ini. First, you need to perform some basic Hue configuration. The
: from the master node, ssh to the other nodes ...; if that does not succeed, set up password-free login on each of the other nodes themselves. On the node, run the command
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
and then repeat the operation above.
3. Turn off the firewall
Temporarily:
service iptables stop
Permanently (takes effect after reboot):
chkconfig iptables off
4. Turn off SELinux
Temporarily:
setenforce 0
Modify the configuration file /etc/selinux/config (takes effect after restart):
Change SELINUX=enforcing to SELI
Original address: Http://blog.selfup.cn/1631.html?utm_source=tuicoolutm_medium=referral
Rant
Recently, with nothing much to do, I took a look at vcore usage through CM and found that no matter how many tasks were running in the cluster, the allocated vcores never exceeded 120. The cluster has 360 available vcores (15 machines x 24 virtual cores). That means only 1/3 of the CPU resources were being used, and as someone with mild obsessive-compulsive tendencies, this is something that can nev
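The arithmetic behind these figures can be checked with a few lines of Python. The variable names are chosen for this sketch; the numbers (15 machines, 24 virtual cores each, at most 120 allocated) are the ones quoted above.

```python
from fractions import Fraction

# Figures quoted in the text above.
machines = 15
vcores_per_machine = 24
max_allocated_vcores = 120

# Total vcores the cluster could hand out.
total_vcores = machines * vcores_per_machine

# Share of the cluster's vcores that was ever allocated.
utilisation = Fraction(max_allocated_vcores, total_vcores)

print(total_vcores, utilisation)  # prints: 360 1/3
```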
CarbonData is a new type of tabular file format for distributed computing. This article operates CarbonData through the Spark Thrift server and briefly describes how to start the Spark CarbonData Thrift server.
Versions: CDH 5.10.3, Spark 2.1.0, CarbonData 1.2.0
Download:
Spark https://archive.apache.org/dist/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.6.tgz
CarbonData https://dist.apache.org/repos/dist/release/carbondata/1.2.0/apache-carbondata-1.2.0-source-release.zip
ca
subproject of Lucene called Hadoop.
Doug Cutting joined Yahoo at about the same time, and Yahoo agreed to organize a dedicated team to continue developing Hadoop. In February of the same year, the Apache Hadoop project was officially launched to support independent development of MapReduce and HDFS. In January 2008, Hadoop b
1> Remove the UUID of the agent node
# rm -rf /opt/cm-5.4.7/lib/cloudera-scm-agent/*
2> Empty the CM database on the master node
Log in to the MySQL database on the master node, then run: drop database cm;
3> Remove the NameNode and DataNode information on the agent nodes
# rm -rf /opt/dfs/nn/*
# rm -rf /opt/dfs/dn/*
4> Re-initialize the CM database on the master node
# /opt/cm-5.4.7/share/cmf/schema/scm_prepare_database.sh mysql cm -hlocalhost -uroot -p123456 --scm-host localhost scm scm scm
5> Execute the startup script
Master node:
I ran into a problem: by default, CDH writes its logs to the system /var/log directory. Because this is a virtual instance with a small system disk of only 50 GB, CM started raising alerts that the log directory was running out of space once the system was in use. Deleting logs periodically with a script would solve the immediate problem, but it is not a good approach. The other option is to modify the configuration directly and manually change every /var/log/* to /home/var/log/*
Tags: CDH mysql5.7 binary
CDH: CDH 5.8.3 offline installation, MySQL 5.7 binary deployment
1. Check whether MySQL is already installed on the system; if so, uninstall it completely
# rpm -qa | grep -i mysql
mysql-server-5.1.71-1.el6.x86_64
mysql-5.1.71-1.el6.x86_64
mysql-devel-5.1.71-1.el6.x86_64
qt-mysql-4.6.2-26.el6_4.x86_64
mysql-libs-5.1.71-1.el6.x86_64
perl-DBD-MySQL-4.013-3.el6.x86_64
# rpm -e mysql-server-5.1.71-1.el6.x86_64 --nodeps
# rpm -e mysql-5.1.71-1.el6.x86_64 --nodeps
# rpm -e my