Next, let's try out the latest Cloudera release of Hadoop 0.20.
wget hadoop-0.20-conf-pseudo_0.20.0-1cloudera0.5.0~lenny_all.deb
wget hadoop-0.20_0.20.0-1cloudera0.5.0~lenny_all.deb
debian:~# dpkg -i hadoop-0.20-conf-pseudo_0.20.0-1cloudera0.5.0~lenny_all.deb
debian:~# dpkg -i hadoop-0.20_0.20.0-1cloudera0.5.0~lenny_all.deb
It's as simple as that.
If you don't know where the files were installed, you can list them with
debian:~# dpkg -L hadoop-0.20
This shows a clear installation directory structure.
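To inspect both Cloudera packages in one pass, a small loop over the package names works; this is a sketch (package names match the .deb files installed above, and dpkg availability is checked so it degrades gracefully off Debian):

```shell
#!/bin/sh
# Sketch: list the first few installed files of each Cloudera package.
# Package names are taken from the .deb files installed above.
PACKAGES="hadoop-0.20 hadoop-0.20-conf-pseudo"
for pkg in $PACKAGES; do
  if command -v dpkg >/dev/null 2>&1; then
    # -L lists every file the package installed
    dpkg -L "$pkg" | head -n 5
  else
    echo "dpkg not available; skipping $pkg"
  fi
done
```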
Start the services:
debian:~# ls /etc/init.d/hadoop-0.20-*
hadoop-0.20-datanode    hadoop-0.20-namenode           hadoop-0.20-tasktracker
hadoop-0.20-jobtracker  hadoop-0.20-secondarynamenode
debian:~# /etc/init.d/hadoop-0.20-namenode start
debian:~# /etc/init.d/hadoop-0.20-namenode status
hadoop-0.20-namenode is running
debian:~# /etc/init.d/hadoop-0.20-datanode start
debian:~# /etc/init.d/hadoop-0.20-datanode status
hadoop-0.20-datanode is running
debian:~# /etc/init.d/hadoop-0.20-jobtracker start
debian:~# /etc/init.d/hadoop-0.20-jobtracker status
hadoop-0.20-jobtracker is running
debian:~# /etc/init.d/hadoop-0.20-tasktracker start
debian:~# /etc/init.d/hadoop-0.20-tasktracker status
hadoop-0.20-tasktracker is running
Startup complete.
You can now run the usual example jobs to test the cluster.
One tool particularly worth testing is Sqoop:
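With all the daemons up, the status checks above can be collapsed into a single loop; this is a sketch assuming the Cloudera init-script names shown above:

```shell
#!/bin/sh
# Sketch: query the status of all five pseudo-distributed daemons in one
# loop. Init-script paths follow the Cloudera layout shown above; each
# script is checked for existence so the loop is safe to run anywhere.
SERVICES="namenode datanode secondarynamenode jobtracker tasktracker"
for svc in $SERVICES; do
  script="/etc/init.d/hadoop-0.20-$svc"
  if [ -x "$script" ]; then
    "$script" status
  else
    echo "not installed: $script"
  fi
done
```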
debian:~# sqoop --help
Usage: hadoop sqoop.jar org.apache.hadoop.sqoop.Sqoop (options)
Database connection options:
--connect (jdbc-uri)        Specify JDBC connect string
--driver (class-name)       Manually specify JDBC driver class to use
--username (username)       Set authentication username
--password (password)       Set authentication password
--local                     Use local import fast path (MySQL only)
Import control options:
--table (tablename)         Table to read
--columns (col,col,col...)  Columns to export from table
--order-by (column-name)    Column of the table used to order results
--hadoop-home (dir)         Override $HADOOP_HOME
--warehouse-dir (dir)       HDFS path for table destination
--as-sequencefile           Imports data to SequenceFiles
--as-textfile               Imports data as plain text (default)
--all-tables                Import all tables in database
                            (ignores --table, --columns and --order-by)
Code generation options:
--outdir (dir)              Output directory for generated code
--bindir (dir)              Output directory for compiled objects
--generate-only             Stop after code generation; do not import
Additional commands:
--list-tables               List tables in database and exit
--list-databases            List all databases available and exit
--debug-sql (statement)     Execute 'statement' in SQL and exit
Generic Hadoop command-line options:
The generic option keywords are
-conf <configuration file>  Specify an application configuration file
-D <property=value>         Use value for given property
-fs <local|namenode:port>   Specify a namenode
-jt <local|jobtracker:port> Specify a job tracker
-files <comma separated list of files>  Specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>  Specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>  Specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
At minimum, you must specify --connect and either --table or --all-tables.
Alternatively, you can specify --generate-only or one of the additional
commands.
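Putting the required options together, a minimal import might look like the following sketch; the JDBC URI, database name testdb, table employees, credentials, and warehouse path are illustrative assumptions, not values from this article:

```shell
#!/bin/sh
# Sketch of a minimal Sqoop import. The connection URI, credentials,
# table name and warehouse path are hypothetical placeholders.
CONNECT="jdbc:mysql://localhost/testdb"
if command -v sqoop >/dev/null 2>&1; then
  sqoop \
    --connect "$CONNECT" \
    --username hadoop \
    --password secret \
    --table employees \
    --warehouse-dir /user/hadoop/warehouse
else
  echo "sqoop not on PATH; install the Cloudera packages first"
fi
```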
For joint testing, you can install MySQL on Debian with apt-get install mysql-server.
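Before pointing Sqoop at MySQL you need something to import. This sketch creates a tiny test database and then asks Sqoop to list its tables; the database name testdb, the table, and the sample rows are all made up for illustration, and passwordless root access is assumed on a fresh local install:

```shell
#!/bin/sh
# Sketch: create a small MySQL test database for Sqoop to import.
# Database name, table and rows are illustrative placeholders.
SQL="CREATE DATABASE IF NOT EXISTS testdb;
USE testdb;
CREATE TABLE IF NOT EXISTS employees (id INT PRIMARY KEY, name VARCHAR(32));
INSERT IGNORE INTO employees VALUES (1, 'alice'), (2, 'bob');"
if command -v mysql >/dev/null 2>&1; then
  echo "$SQL" | mysql -u root
  # Verify Sqoop sees the new database (only if sqoop is installed)
  command -v sqoop >/dev/null 2>&1 && \
    sqoop --connect jdbc:mysql://localhost/testdb --list-tables
else
  echo "mysql not installed; run apt-get install mysql-server first"
fi
```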
For the test walkthrough, see the previous article; with that, you have the complete experience. The arrival of Cloudera really simplifies Hadoop configuration and helps drive the development of open-source cloud computing.