The Impala online documentation describes Impala ODBC interface installation and configuration: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/Impala/Installing-and-using-impala/ciiu_impala_odbc.html. The Impala ODBC driver is available at: http://www.cloudera.com/content/support/en/downloads/connectors.html. This article explains in detail the installation and use of Impala ODBC in a CentOS-6.5-x86_64 environment
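As a rough illustration of the flow such an install follows, here is a minimal sketch for CentOS 6.5 (run as root), assuming the unixODBC driver manager and a driver RPM downloaded from the connectors page above; the RPM name, driver path, DSN values, and host are placeholders, not values from the original article:

```bash
# Hedged sketch only: RPM name, driver path, host and port are assumptions.
yum install -y unixODBC
rpm -ivh ClouderaImpalaODBC-*.el6.x86_64.rpm   # driver RPM from the connectors page

# Register a minimal DSN for the driver.
cat >> /etc/odbc.ini <<'EOF'
[Impala]
Driver = /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so
HOST   = impalad-host
PORT   = 21050
EOF

isql -v Impala   # quick connectivity test through unixODBC
```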
Upgrade CDH5.2.0 to CDH5.3.3
The company has a Spark on YARN cluster built on CM5.2.0 + CDH5.2.0, with Spark at version 1.1.0. To use some features of Spark 1.2.0, we decided to upgrade the cluster to CM5.3.3 + CDH5.3.3. CM must be upgraded as well because the CM version number must be greater than or equal to the CDH version number. The upgrade proceeds in two steps, described below: the CM upgrade and the CDH upgrade.
1 CM upgrade process introduction
I. Introduction. CDH is a commercial product developed by Cloudera for rapidly deploying and efficiently managing Hadoop and its various components. It consists of two main parts: Cloudera Manager and the CDH package. Cloudera Manager is responsible for the deployment and management of the cluster, while the CDH package includes the installation
Error 1: time cannot be synchronized (2014.12.18)
To synchronize the time, run:
[root@host ~]# /usr/sbin/ntpdate pool.ntp.org
This fails with:
18 Dec 19:39:39 ntpdate[3592]: name server cannot be used, reason: temporary failure in name resolution
1. First confirm that the ntpd service is running:
[root@host ~]# /etc/init.d/ntpd start
Starting ntpd: [OK]
2. Check whether your DNS server is configured; if it is not, add one by editing the file
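The file name is cut off above; assuming it is /etc/resolv.conf (a guess based on the name-resolution error), the fix might look like the following, with the nameserver address as a placeholder:

```bash
# Assumption: the DNS failure is fixed by adding a nameserver entry.
/etc/init.d/ntpd status || /etc/init.d/ntpd start
echo "nameserver 8.8.8.8" >> /etc/resolv.conf   # placeholder DNS server
/usr/sbin/ntpdate pool.ntp.org                  # retry the synchronization
```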
1 CM upgrade process introduction. 1.1 Log in as the admin user at http://10.10.244.137:7180/cmf/home and turn off
In big data everyone knows Hadoop, but a whole range of other technologies keeps coming into view: Spark, Storm, Impala, and more, almost faster than we can take them in. To better architect big data projects, this post organizes them, helping technicians, project managers, and architects choose the right technology, understand how the various big data technologies relate to each other, and pick the right language.
We can read this article with the following questions in mind: 1. What technologies does Hadoop include? 2.
restart to take effect)
SELINUX=disabled
5. Install the NTP service
su root
yum install -y ntp
yum install -y ntpdate
vi /etc/ntp.conf    (mainly change the restrict and server entries; details can be found on Baidu)
systemctl start ntpd
systemctl enable ntpd
ntpdate -u pool.ntp.org
ntpdate -u h104
6. Install MySQL
rpm -ivh mysql-community-release-el7-5.noarch.rpm
yum install mysql-server
systemctl start mysqld.service
systemctl enable mysqld.service
mysql -u root
use mysql;
update user set password=password('123456') where user='root';
grant all o
CM upgrade
Ops notes: use the unified root password, and be careful not to delete the cluster backup files by mistake.
Log in to the host where cm-server is installed and run: cat /etc/cloudera-scm-server/db.properties
Log in to the PostgreSQL database: psql -U scm -p 7432 and enter the password.
Back up the CM data: pg_dump -h cdhmaster -p 7432 -U scm > /tmp/scm_server_db_backup.$(date +%y%m%d)
Check that the file was generated under /tmp; during this period the file under /tmp must not be deleted.
Stop
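Collected into runnable form, the backup steps above look roughly like this (the host cdhmaster and port 7432 come from the passage; the rest is standard PostgreSQL tooling):

```bash
# Back up the Cloudera Manager database before upgrading.
cat /etc/cloudera-scm-server/db.properties                       # locate the embedded DB settings
pg_dump -h cdhmaster -p 7432 -U scm > /tmp/scm_server_db_backup.$(date +%y%m%d)
ls -l /tmp/scm_server_db_backup.*                                # confirm the dump was written
```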
Impala Hue Hive S
Viewing the JVM configuration and generational memory usage of a running Spark process is a common monitoring task for jobs running online: 1. Find the PID with the ps command: ps -ef | grep 5661. You can locate the PID by a distinctive string in the launch command. 2. Query the JVM parameter settings of the process with the jinfo command: jinfo 105007. This prints the detailed JVM configuration: Attaching to process ID 105007, please wait... Debugger attached successfully. Server compiler detected.
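A minimal sketch of that flow, where the grep pattern and PID handling are illustrative rather than taken from the original:

```bash
# Locate the Spark process and inspect its JVM; the pattern is a placeholder.
PID=$(ps -ef | grep SparkSubmit | grep -v grep | awk '{print $2}' | head -1)
jinfo "$PID"        # JVM flags and system properties
jmap -heap "$PID"   # generational usage: Eden, Survivor, Old generation
```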
Deploy HBase in the Hadoop cluster and enable Kerberos
System: LXC-CentOS6.3 x86_64
Hadoop version: CDH5.0.1 (manual installation, Cloudera Manager not installed)
Existing cluster environment: 6 nodes; JDK 1.7.0_55; ZooKeeper, HDFS (HA), YARN, HistoryServer, and HttpFS installed; Kerberos enabled (the KDC is deployed on one node of the cluster).
Packages to install:
All nodes: yum install hbase
Master node: yum install hbase-master hbase-
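The package list is cut off above; under the stated assumptions (yum-based CDH packages), the installs would start like this, with the truncated remainder left open:

```bash
# On every node:
yum install -y hbase
# On the master node (the original list is truncated after "hbase-"):
yum install -y hbase-master
```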
I. Introduction to Hadoop distributions. There are many Hadoop distributions available: Intel's distribution, Huawei's distribution, Cloudera's distribution (CDH), the Hortonworks version, and so on, all based on Apache Hadoop. So many versions exist because of Apache Hadoop's open-source license: anyone may modify it and publish/sell it as an open-source or commercial product. Currently, there are three main versions of Hadoop that are
A guide to configuring Kerberos for Apache Hadoop
Generally, the security of a Hadoop cluster is guaranteed with Kerberos. After Kerberos is enabled, every access must be authenticated; once a user is verified, GRANT/REVOKE statements can be used for role-based access control. This article describes how to configure Kerberos in a CDH cluster.
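As a hedged illustration of that flow (the principal, role, group, and host names are invented; the GRANT syntax shown is Sentry-style role-based access, which CDH uses):

```bash
# Authenticate first, then manage role-based access; all names are placeholders.
kinit alice@EXAMPLE.COM
beeline -u "jdbc:hive2://hs2-host:10000/default;principal=hive/_HOST@EXAMPLE.COM" \
        -e "GRANT ROLE analyst TO GROUP analysts;"
```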
1. KDC installation and configuration script
The script install_kerberos.sh completes the entire installation and the corresponding parameter configuration.
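The script itself is not reproduced here; the following is a minimal sketch of what such a script typically does on CentOS, assuming the example realm EXAMPLE.COM and a local KDC (all names are placeholders):

```bash
#!/bin/bash
# Hypothetical outline of install_kerberos.sh; realm and host are placeholders.
yum install -y krb5-server krb5-libs krb5-workstation

# Point clients at this KDC.
cat > /etc/krb5.conf <<'EOF'
[libdefaults]
  default_realm = EXAMPLE.COM
[realms]
  EXAMPLE.COM = {
    kdc = kdc01.example.com
    admin_server = kdc01.example.com
  }
EOF

kdb5_util create -s -r EXAMPLE.COM   # create the KDC database (asks for a master key)
service krb5kdc start
service kadmin start
```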
Transferred from: http://www.aboutyun.com/thread-7569-1-1.html
yum and configure it. Here I want to introduce how to install CDH4 through Cloudera Manager.
Cloudera Manager is Cloudera's own management product, not an Apache Foundation project. Currently there are two editions: a free version and a commercial version. The free version supports only 50 nodes, while the commercial version has no limit.
Of course, 50 nodes are generally enough. Here we use the free version of Cloudera Manager.
EXITED_WITH_FAILURE
2014-03-31 19:50:50,496 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType: CLEANUP_CONTAINER
2014-03-31 19:50:50,496 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1396266549856_0001_01_000001
This turned out to be wasted effort, because only afterwards did I find that CDH already provides a ready-made c
Convert multiple columns into one row in Impala.
A friend asked me how to convert multiple columns into one row in Impala. In fact, Impala's built-in functions can do this; no user-defined function is needed.
The following is a demonstration:
-bash-4.1$ impala-shell
Starting Impala Shell without Kerberos authentication
Connected to cdha: 21000
Server version: impalad version 1.4.2-cdh5 RELEASE (build eac952d4ff674663ec3834778c2b981b252aec78)
Welcome to the Impala shell.
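The rest of the session is cut off; the following is a hedged reconstruction of the kind of demonstration the author likely ran (the table t1 and its columns are invented), using the built-in concat_ws() to join several columns into one string per row; the host cdha:21000 comes from the session above:

```bash
# Illustrative only: t1 and col1..col3 are made-up names.
impala-shell -i cdha:21000 -q \
  "SELECT concat_ws(',', col1, col2, col3) AS merged FROM t1;"
```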
Hadoop is a complex mix of systems, and building a Hadoop environment for production is a hassle. But there are always experts in this world who will solve these seemingly painful problems for you, if not now, then sooner or later. CDH is Cloudera's packaged Hadoop environment; for an introduction to CDH see www.cloudera.com, so I won't say more here. This article is mainly about using CDH5.3 to install a Hadoop environment that can be used for production
Document directory
Motivation
Preface
I have been working with Hadoop for two years and have encountered many problems, including the classic NameNode and JobTracker memory-overflow faults, HDFS small-file storage issues, task-scheduling problems, and MapReduce performance problems. Some of these problems are Hadoop's own defects (weak points), while others come from improper use.
In the process of solving these problems, you sometimes need to turn to the source code, and sometimes ask c
Source: Cloudera; translated by ImportNew, Royce Wong
Hadoop starts here! Join me in learning the basics of using Hadoop. The following tutorial describes how to use Hadoop to analyze data!
This topic describes the most important things users face when using the Hadoop MapReduce (hereinafter MR) framework. MapReduce is composed of client APIs and a runtime environment: the client APIs are used to write MR programs, and the runtime environment runs them.
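To make the client-API / runtime split concrete, here is a hedged example of submitting the stock WordCount program to the runtime on a CDH cluster; the input/output paths and the examples-jar location are assumptions, not taken from the original:

```bash
# Submit a prebuilt MR program to the runtime; paths are placeholders.
hadoop fs -mkdir -p /user/demo/input
hadoop fs -put ./sample.txt /user/demo/input/
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount \
  /user/demo/input /user/demo/output
hadoop fs -cat /user/demo/output/part-r-00000 | head
```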