Open Source Cloud Computing Technology Series (IV): Installing and Configuring the Latest Cloudera Hadoop 0.20

Continuing this series, let's try out the latest Cloudera packages for Hadoop 0.20.

wget hadoop-0.20-conf-pseudo_0.20.0-1cloudera0.5.0~lenny_all.deb

wget hadoop-0.20_0.20.0-1cloudera0.5.0~lenny_all.deb

debian:~# dpkg -i hadoop-0.20-conf-pseudo_0.20.0-1cloudera0.5.0~lenny_all.deb

debian:~# dpkg -i hadoop-0.20_0.20.0-1cloudera0.5.0~lenny_all.deb

It's as simple as that.

If you are not sure where the package put its files, you can run

debian:~# dpkg -L hadoop-0.20

which lists every installed file and gives you a clear picture of the directory structure.
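For orientation, the Cloudera packaging typically follows a layout along these lines (recalled from the CDH packages, so the exact paths may differ slightly from what dpkg -L shows for your release):

/etc/hadoop-0.20/conf        configuration files
/etc/init.d/hadoop-0.20-*    init scripts for the daemons
/usr/bin/hadoop-0.20         the hadoop wrapper script
/usr/lib/hadoop-0.20/        jars, libraries and the bundled examples
/var/log/hadoop-0.20/        daemon logs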

Start:

debian:~# /etc/init.d/hadoop-0.20-          (press Tab twice to list the available init scripts)
hadoop-0.20-datanode    hadoop-0.20-namenode           hadoop-0.20-tasktracker
hadoop-0.20-jobtracker  hadoop-0.20-secondarynamenode

debian:~# /etc/init.d/hadoop-0.20-namenode start

debian:~# /etc/init.d/hadoop-0.20-namenode status
hadoop-0.20-namenode is running

debian:~# /etc/init.d/hadoop-0.20-datanode start

debian:~# /etc/init.d/hadoop-0.20-datanode status
hadoop-0.20-datanode is running

debian:~# /etc/init.d/hadoop-0.20-jobtracker start

debian:~# /etc/init.d/hadoop-0.20-jobtracker status
hadoop-0.20-jobtracker is running

debian:~# /etc/init.d/hadoop-0.20-tasktracker start

debian:~# /etc/init.d/hadoop-0.20-tasktracker status
hadoop-0.20-tasktracker is running

Startup is complete.
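As a quick sanity check, if a JDK is installed you can also list the running daemon JVMs with jps (jps ships with the JDK, not with the Cloudera packages; the PIDs below are only illustrative):

debian:~# jps
2104 NameNode
2199 DataNode
2287 SecondaryNameNode
2375 JobTracker
2468 TaskTracker
2551 Jps

The NameNode and JobTracker web interfaces should also respond on their default ports, 50070 and 50030 respectively.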

You can now run the usual example jobs to test the setup, for instance the pi estimator sketched below.
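A rough sketch only: the exact name and location of the examples jar depend on the Cloudera release (hence the wildcard), and the wrapper may be installed as hadoop-0.20 or simply hadoop.

debian:~# hadoop-0.20 fs -ls /
debian:~# hadoop-0.20 jar /usr/lib/hadoop-0.20/hadoop-*-examples.jar pi 2 1000

The pi job runs 2 map tasks with 1000 samples each and prints an estimate of pi when it finishes.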

Particularly worth trying is Sqoop:

debian:~# sqoop --help
Usage: hadoop sqoop.jar org.apache.hadoop.sqoop.Sqoop (options)

Database connection options:
--connect (jdbc-uri)          Specify JDBC connect string
--driver (class-name)         Manually specify JDBC driver class to use
--username (username)         Set authentication username
--password (password)         Set authentication password
--local                       Use local import fast path (MySQL only)

Import control options:
--table (tablename)           Table to read
--columns (col,col,col...)    Columns to export from table
--order-by (column-name)      Column of the table used to order results
--hadoop-home (dir)           Override $HADOOP_HOME
--warehouse-dir (dir)         HDFS path for table destination
--as-sequencefile             Imports data to SequenceFiles
--as-textfile                 Imports data as plain text (default)
--all-tables                  Import all tables in database
                              (ignores --table, --columns and --order-by)

Code generation options:
--outdir (dir)                Output directory for generated code
--bindir (dir)                Output directory for compiled objects
--generate-only               Stop after code generation; do not import

Additional commands:
--list-tables                 List tables in database and exit
--list-databases              List all databases available and exit
--debug-sql (statement)       Execute 'statement' in SQL and exit

Generic Hadoop command-line options:
Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>          specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>         specify comma separated jar files to include in the classpath
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]

At minimum, you must specify --connect and either --table or --all-tables.
Alternatively, you can specify --generate-only or one of the additional
commands.

For an end-to-end test, install MySQL on Debian with apt-get install mysql-server and point Sqoop at it; a rough import sketch follows.
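This is only a sketch: the database name testdb, the hadoop/secret credentials and the table employees are placeholders made up for illustration, and the MySQL JDBC driver must be on Sqoop's classpath unless you use the --local fast path.

debian:~# sqoop --connect jdbc:mysql://localhost/testdb \
          --username hadoop --password secret --list-tables

debian:~# sqoop --connect jdbc:mysql://localhost/testdb \
          --username hadoop --password secret \
          --table employees --warehouse-dir /user/root/sqoop-imports

The second command should leave the table contents under /user/root/sqoop-imports/employees in HDFS as plain text, the default --as-textfile format.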

For the hands-on test run, see the previous article in this series; with the setup above you get the complete experience. The arrival of Cloudera's packages really does simplify Hadoop configuration and helps drive open source cloud computing forward.
