Open Source Cloud Computing Technology Series (IV): Installing and Configuring the Latest Cloudera Hadoop 0.20

Continuing this series, let's try out the latest Cloudera packages for Hadoop 0.20.

wget hadoop-0.20-conf-pseudo_0.20.0-1cloudera0.5.0~lenny_all.deb

wget hadoop-0.20_0.20.0-1cloudera0.5.0~lenny_all.deb

debian:~# dpkg -i hadoop-0.20-conf-pseudo_0.20.0-1cloudera0.5.0~lenny_all.deb

debian:~# dpkg -i hadoop-0.20_0.20.0-1cloudera0.5.0~lenny_all.deb

It's as simple as that.

If you are not sure where the package put its files, you can run

debian:~# dpkg -L hadoop-0.20

which lists every installed file and gives you a clear picture of the directory structure.
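For orientation, the Cloudera packaging typically follows a layout along these lines (recalled from the CDH packages, so the exact paths may differ slightly from what dpkg -L shows for your release):

/etc/hadoop-0.20/conf        configuration files
/etc/init.d/hadoop-0.20-*    init scripts for the daemons
/usr/bin/hadoop-0.20         the hadoop wrapper script
/usr/lib/hadoop-0.20/        jars, libraries and the bundled examples
/var/log/hadoop-0.20/        daemon logs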

Start:

debian:~# /etc/init.d/hadoop-0.20-          (press Tab twice to list the available init scripts)
hadoop-0.20-datanode    hadoop-0.20-namenode           hadoop-0.20-tasktracker
hadoop-0.20-jobtracker  hadoop-0.20-secondarynamenode

debian:~# /etc/init.d/hadoop-0.20-namenode start

debian:~# /etc/init.d/hadoop-0.20-namenode status
hadoop-0.20-namenode is running

debian:~# /etc/init.d/hadoop-0.20-datanode start

debian:~# /etc/init.d/hadoop-0.20-datanode status
hadoop-0.20-datanode is running

debian:~# /etc/init.d/hadoop-0.20-jobtracker start

debian:~# /etc/init.d/hadoop-0.20-jobtracker status
hadoop-0.20-jobtracker is running

debian:~# /etc/init.d/hadoop-0.20-tasktracker start

debian:~# /etc/init.d/hadoop-0.20-tasktracker status
hadoop-0.20-tasktracker is running

Startup is complete.
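As a quick sanity check, if a JDK is installed you can also list the running daemon JVMs with jps (jps ships with the JDK, not with the Cloudera packages; the PIDs below are only illustrative):

debian:~# jps
2104 NameNode
2199 DataNode
2287 SecondaryNameNode
2375 JobTracker
2468 TaskTracker
2551 Jps

The NameNode and JobTracker web interfaces should also respond on their default ports, 50070 and 50030 respectively.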

You can now run the usual example jobs to test the setup, for instance the pi estimator sketched below.
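A rough sketch only: the exact name and location of the examples jar depend on the Cloudera release (hence the wildcard), and the wrapper may be installed as hadoop-0.20 or simply hadoop.

debian:~# hadoop-0.20 fs -ls /
debian:~# hadoop-0.20 jar /usr/lib/hadoop-0.20/hadoop-*-examples.jar pi 2 1000

The pi job runs 2 map tasks with 1000 samples each and prints an estimate of pi when it finishes.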

Particularly worth trying is Sqoop:

debian:~# sqoop --help
Usage: hadoop sqoop.jar org.apache.hadoop.sqoop.Sqoop (options)

Database connection options:
--connect (jdbc-uri)          Specify JDBC connect string
--driver (class-name)         Manually specify JDBC driver class to use
--username (username)         Set authentication username
--password (password)         Set authentication password
--local                       Use local import fast path (MySQL only)

Import control options:
--table (tablename)           Table to read
--columns (col,col,col...)    Columns to export from table
--order-by (column-name)      Column of the table used to order results
--hadoop-home (dir)           Override $HADOOP_HOME
--warehouse-dir (dir)         HDFS path for table destination
--as-sequencefile             Imports data to SequenceFiles
--as-textfile                 Imports data as plain text (default)
--all-tables                  Import all tables in database
                              (ignores --table, --columns and --order-by)

Code generation options:
--outdir (dir)                Output directory for generated code
--bindir (dir)                Output directory for compiled objects
--generate-only               Stop after code generation; do not import

Additional commands:
--list-tables                 List tables in database and exit
--list-databases              List all databases available and exit
--debug-sql (statement)       Execute 'statement' in SQL and exit

Generic Hadoop command-line options:
Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>          specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>         specify comma separated jar files to include in the classpath
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]

At minimum, you must specify --connect and either --table or --all-tables.
Alternatively, you can specify --generate-only or one of the additional
commands.

For an end-to-end test, install MySQL on Debian with apt-get install mysql-server and point Sqoop at it; a rough import sketch follows.
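This is only a sketch: the database name testdb, the hadoop/secret credentials and the table employees are placeholders made up for illustration, and the MySQL JDBC driver must be on Sqoop's classpath unless you use the --local fast path.

debian:~# sqoop --connect jdbc:mysql://localhost/testdb \
          --username hadoop --password secret --list-tables

debian:~# sqoop --connect jdbc:mysql://localhost/testdb \
          --username hadoop --password secret \
          --table employees --warehouse-dir /user/root/sqoop-imports

The second command should leave the table contents under /user/root/sqoop-imports/employees in HDFS as plain text, the default --as-textfile format.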

For the hands-on test run, see the previous article in this series; with the setup above you get the complete experience. The arrival of Cloudera's packages really does simplify Hadoop configuration and helps drive open source cloud computing forward.
