Environment Configuration
Operating system: CentOS 6.5
JDK version: 1.7.0_67
Hadoop cluster version: CDH 5.3.0
Installation Process
1. Install R:
yum install -y R
2. Install curl-devel (very important! Without it, the RCurl package cannot be installed, and without RCurl, devtools cannot be installed):
yum install -y curl-devel
3. Set the necessary environment variables (very important! These must be set to match the corresponding version of the Hadoop environment and
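Step 3 is cut off in the source. A minimal sketch of what such variables typically look like, assuming an RHadoop-style setup; the paths below are hypothetical and depend on where your CDH installation puts Hadoop:

```shell
# Hypothetical paths -- adjust to your actual CDH installation.
export HADOOP_HOME=/usr/lib/hadoop
export HADOOP_CMD=$HADOOP_HOME/bin/hadoop
export HADOOP_STREAMING=/usr/lib/hadoop-mapreduce/hadoop-streaming.jar
# sanity check
echo "$HADOOP_CMD"
```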
/etc/hosts:
192.168.2.200 server1
192.168.2.201 server2
192.168.2.202 server3
192.168.2.203 server4
192.168.2.124 archive.cloudera.com
192.168.2.124 archive-primary.cloudera.com
Entering http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/ on the local machine accesses the local source library you built.
Note: configure the mapped addresses on every machine that will run CDH. If you want to verify that the offline source works, open the source address in the Windows loca
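The mappings above can be appended with a heredoc. A sketch that writes to a scratch copy so it can run anywhere; on the real cluster machines the target is /etc/hosts:

```shell
# Scratch copy of the hosts file (use /etc/hosts on a real node).
hosts=$(mktemp)
cat >> "$hosts" <<'EOF'
192.168.2.200 server1
192.168.2.201 server2
192.168.2.202 server3
192.168.2.203 server4
192.168.2.124 archive.cloudera.com
192.168.2.124 archive-primary.cloudera.com
EOF
# the two Cloudera archive aliases point at the local mirror
grep -c cloudera "$hosts"   # → 2
```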
1). Background: During cluster deployment, consistent configuration and environment settings are required. For virtual machine clusters, you can clone and restore cluster machines using images. Physical machine clusters are different: when there are many machines and many people operating and configuring them, this works for mature, competent teams, but in small teams unfamiliar with the environment, varying skill levels may lead to inconsistent environments. Therefore,
Hadoop User Experience (HUE) Installation and Configuration
HUE: Hadoop User Experience. Hue is a graphical user interface for operating and developing Hadoop applications. Hue is integrated into a desktop-like environment and released as a web application. For individual users, no additional installation is required.
Official website address: http://gethue.com/
Downloads from the Hue official website time out, so install using the CDH version instead.
Download address: http://arch
CDH: full name Cloudera's Distribution including Apache Hadoop. Hadoop is an open source project, so many companies commercialize on this foundation, and Cloudera has made corresponding changes to Hadoop. We call Cloudera's release CDH. So far there are 5 versions of CDH, of which the first two are no longer updated; the last two are, respectively, CDH4, which evolved on the basis of Apache Hadoop 2.0.0,
Preparation of the installation package
Download the parcel package here; it includes a total of three files (for SUSE Linux):
CDH-5.6.0-1.cdh5.6.0.p0.45-sles11.parcel
CDH-5.6.0-1.cdh5.6.0.p0.45-sles11.parcel.sha1
manifest.json
After downloading, place the three files in the /opt/cloudera/parcel-repo directory of the CM server host and execute the following command:
mv CDH-5.6.0-1.cdh5.6.0.p0.45-sles11.parcel.sha1 CDH-5.6.0-1.
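The target of the `mv` is cut off in the source; by Cloudera Manager convention the checksum file is renamed so it ends in `.sha` instead of `.sha1`. A sketch that simulates the whole step in a scratch directory (on a real CM server the directory is /opt/cloudera/parcel-repo and the files are the actual downloads):

```shell
# Simulate the parcel-repo preparation in a scratch directory.
repo=$(mktemp -d)
# stand-ins for the three downloaded files
touch "$repo/CDH-5.6.0-1.cdh5.6.0.p0.45-sles11.parcel"
touch "$repo/CDH-5.6.0-1.cdh5.6.0.p0.45-sles11.parcel.sha1"
touch "$repo/manifest.json"
# Cloudera Manager looks for a .sha file, not .sha1
mv "$repo/CDH-5.6.0-1.cdh5.6.0.p0.45-sles11.parcel.sha1" \
   "$repo/CDH-5.6.0-1.cdh5.6.0.p0.45-sles11.parcel.sha"
ls "$repo"
```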
CDH has already packaged everything for us; if we need Spark on YARN, we just need to yum install a few packages. A previous article describes how to build your own intranet CDH yum server; see "CDH 5.5.1 Yum Source Server Building": http://www.cnblogs.com/luguoyuanf/p/56187ea1049f4011f4798ae157608f1a.html
If you do not have an intranet yum server, use the Cloudera yum server:
wget https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/cloudera-
Hue: Hadoop User Experience
Website address: http://gethue.com/
Downloads from the Hue website time out, so install using the CDH version:
http://archive.cloudera.com/cdh5/cdh/5/
Description document: http://archive.cloudera.com/cdh5/cdh/5/hue-3.9.0-cdh5.5.0/
Install the dependent packages (reference: https://github.com/cloudera/hue). My system is CentOS, so install the CentOS dependency packages:
yum install ant asciidoc c
There are many versions of Hadoop, and here I choose the CDH version. CDH is Apache Hadoop as processed by Cloudera. The specific CDH is: http://archive-primary.cloudera.com/cdh5/cdh/5/
The version information is as follows:
hadoop: hadoop-2.3.0-cdh5.1.0
jdk: 1.7.0_79
maven: apache-maven-3.2.5 (3.3.1 and later require JDK 1.7 or above)
protobuf: protobuf-2.5.0
ant: 1.7.1
1. Install Maven
Maven can be downloaded from the Maven website (http://maven.ap
Convert multiple columns into one row in Impala.
A friend asked me how to convert multiple columns into one row in Impala. In fact, this can be implemented with Impala's built-in functions, without using custom functions.
The following is a demonstration:
-bash-4.1$ impala-shell
Starting Impala Shell without Kerberos authentication
Connected to cdha:21000
Server version: impalad version 1.4.2-cdh5 RELEASE (build eac952d4ff674663ec3834778c2b981b252aec78)
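The rest of the demo is cut off in the source. A sketch of how Impala's built-in functions handle the conversion without a UDF, assuming a hypothetical table `t1(id, c1, c2, c3)`: `concat_ws` joins several columns into one string, and `group_concat` collapses multiple rows into one.

```sql
-- hypothetical table: t1(id INT, c1 STRING, c2 STRING, c3 STRING)

-- several columns -> one string per row
SELECT id, concat_ws(',', c1, c2, c3) AS merged FROM t1;

-- several rows -> one row per id
SELECT id, group_concat(c1, ',') AS merged FROM t1 GROUP BY id;
```

Both functions are Impala built-ins, which is why no custom function is needed.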
2.0, respectively, and they are updated at intervals. Cloudera recently released CDH5 (based on Apache Hadoop 2.2.0; CDH5-beta-1 download), which comes with a YARN HA implementation. Although this version is currently beta, the solution adopts the HA framework implemented in Hadoop 2.0 (both HDFS HA and MapReduce HA adopt this framework), so it is universal.
Cloudera divides minor versions b
Spark supports a variety of Hadoop platforms; for example, starting with version 0.8.1 it supports Hadoop 1 (HDP1, CDH3), CDH4, and Hadoop 2 (HDP2, CDH5), respectively. At present, when installing Cloudera's CDH5 through CM, you can directly select the Spark service to install.
Currently the latest version of Spark is 1.3.0. This article uses version 1.3.0 to see how to implement Spark single-machine pseudo-distr
First, what is Sqoop: Sqoop is an open source tool used primarily to transfer data between Hadoop (Hive) and traditional databases (MySQL, PostgreSQL, ...). Data can be transferred from a relational database (such as MySQL, Oracle, or Postgres) into HDFS in Hadoop, or data in HDFS can be exported into a relational database. Second, the characteristics of Sqoop: one of the highlights of Sqoop is the ability to import data from a relational database into HDFS through Hadoop's MapReduce. Third, S
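The import described above can be sketched with a typical `sqoop import` invocation; the connection string, credentials, table name, and target directory below are hypothetical and require a running cluster and database:

```shell
# Hedged sketch: import a MySQL table into HDFS via MapReduce.
# dbhost, testdb, dbuser, orders, and the target dir are hypothetical.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/testdb \
  --username dbuser -P \
  --table orders \
  --target-dir /user/hadoop/orders \
  --num-mappers 4
```

Under the hood Sqoop generates a MapReduce job whose mappers read table slices in parallel, which is the highlight mentioned above.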
the database, execute `show tables;` to see all the tables under the database:
hive> show tables;
OK
lxw1
lxw1234
table1
t_site_log
3.2 Storage paths for tables
By default, the storage path for a table is:
${hive.metastore.warehouse.dir}/databasename.db/tablename/
You can use the `desc formatted tablename;` command to view the details of the table, including the storage path:
Location: hdfs://cdh5/hivedata/warehouse/
CDH Address
http://archive-primary.cloudera.com/cdh5/cdh/5/
Add sudo permissions to a hadoop user with passwordless access:
# useradd hadoop
# vi /etc/sudoers
hadoop ALL=(root) NOPASSWD:ALL
# su - hadoop
Download and unzip.
Modify the configuration; JDK 1.7 is required first:
[[email protected] jdk1.7.0_80]# echo $JAVA_HOME
/usr/java/jdk1.7.0_80
Modify the etc/hadoop/hadoop-env.sh file, setting JAVA_HOME in hadoop-env.sh:
export JAVA_HOME=/usr/java/jdk1.7.0_80
Add HA
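The sudoers entry above can be checked with a small sketch; here it is written to a scratch copy so the example can run anywhere, whereas on a real system you should edit /etc/sudoers with `visudo`:

```shell
# Scratch copy of sudoers (use `visudo` on /etc/sudoers for real).
sudoers=$(mktemp)
echo 'hadoop ALL=(root) NOPASSWD:ALL' >> "$sudoers"
# verify the entry landed
grep '^hadoop' "$sudoers"
```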
For partitioning, do not use LVM.
Root --> 20 GB
Swap --> 2x system memory
RAM --> 4 GB
Master node: RAID 10, dual Ethernet cards, dual power supplies, etc.
Slave node:
1. RAID is not necessary
2. HDFS partitions, not using LVM
/etc/fstab -- ext3, defaults, noatime
Mount to /data/N, for N = 0, 1, 2... (one partition per disk)
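The fstab recommendation above might look like the following; the device names are hypothetical and depend on your hardware:

```
# hypothetical /etc/fstab entries for HDFS data disks (one per disk)
/dev/sdb1  /data/0  ext3  defaults,noatime  0 0
/dev/sdc1  /data/1  ext3  defaults,noatime  0 0
```

`noatime` avoids a metadata write on every read, which matters for HDFS data disks.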
Cloudera Repository:
http://archive.cloudera.com/cdh5/
http://archive-primary.cloudera.com/cm5/
On cloudera man