Installing Snappy with Hadoop 2.2.0 and HBase 0.98
1. Install the required dependencies and software
The dependency packages to be installed are:
gcc, g++, autoconf, automake, and libtool
The required software is:
Java 6 and Maven
To install the dependency packages, run sudo apt-get install <package> on Ubuntu or sudo yum install <package> on CentOS, for example:
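# Ubuntu (package names may vary by release)
$ sudo apt-get install gcc g++ autoconf automake libtool
# CentOS (the gcc-c++ package provides g++)
$ sudo yum install gcc gcc-c++ autoconf automake libtool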
For installing Java and Maven, refer to "Installation of Java, Maven, and Tomcat in Linux".
2. Download snappy-1.1.2
The snappy-1.1.2 source tarball can be downloaded from the Snappy project site (hosted on Google Code at the time of writing).
3. Compile and install snappy
Download the package and extract it. Assuming it was extracted into the home directory, run the following commands:
$ cd ~/snappy-1.1.2
$ ./configure
$ make
$ sudo make install
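If the dynamic linker cannot find the newly installed library afterwards, refreshing the linker cache usually helps (assuming /usr/local/lib is in the linker's search path):
$ sudo ldconfig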
Run the following commands to check whether the installation succeeded:
$ cd /usr/local/lib
$ ll libsnappy.*
-rw-r--r-- 1 root root 233506 Aug  7 11:56 libsnappy.a
-rwxr-xr-x 1 root root    953 Aug  7 11:56 libsnappy.la
lrwxrwxrwx 1 root root     18 Aug  7 11:56 libsnappy.so -> libsnappy.so.1.2.1
lrwxrwxrwx 1 root root     18 Aug  7 11:56 libsnappy.so.1 -> libsnappy.so.1.2.1
-rwxr-xr-x 1 root root 147758 Aug  7 11:56 libsnappy.so.1.2.1
If no error occurred during installation and the files above are present in /usr/local/lib, the installation was successful.
4. Compile the hadoop-snappy source code
1) Download the source code as follows:
A. Install svn. On Ubuntu, run sudo apt-get install subversion; on CentOS, run sudo yum install subversion.
B. Check out the source code from Google's svn repository with the following command:
$ svn checkout http://hadoop-snappy.googlecode.com/svn/trunk/ hadoop-snappy
This copies the hadoop-snappy source code into a hadoop-snappy directory under the directory where the command is executed.
However, Google's services are often unreachable from mainland China, so you can also download the source directly from a mirror.
2) Compile the hadoop-snappy source code.
Switch to the hadoop-snappy source directory and execute one of the following commands:
A. If snappy is installed in the default path, run:
$ mvn package
B. If snappy was installed in a custom path, run:
$ mvn package -Dsnappy.prefix=SNAPPY_INSTALLATION_DIR
where SNAPPY_INSTALLATION_DIR is the path where snappy was installed.
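For example, since the build above installed snappy under the default prefix /usr/local, an explicit invocation would look like this (illustrative; substitute your own prefix):
$ mvn package -Dsnappy.prefix=/usr/local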
Possible problems during compilation:
(a) /root/modules/hadoop-snappy/maven/build-compilenative.xml:62: Execute failed: java.io.IOException: Cannot run program "autoreconf" (in directory "/root/modules/hadoop-snappy/target/native-src"): java.io.IOException: error=2, No such file or directory
Solution: at first glance a file appears to be missing, but the files under target are generated automatically during compilation, so nothing should be missing there. The real cause is that hadoop-snappy has prerequisites: the build needs autoconf, automake, and libtool. See step 1, "Install the required dependencies and software".
(b) The following error message is displayed:
[exec] make: *** [src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.lo] Error 1
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (compile) on project hadoop-snappy: An Ant BuildException has occured: The following error occurred while executing this line:
[ERROR] /home/ngc/Char/snap/hadoop-snappy-read-only/maven/build-compilenative.xml:75: exec returned:
Solution: the official Hadoop Snappy documentation does not state which gcc version is required, but in practice hadoop-snappy needs gcc 4.4. If the system's default gcc is newer than 4.4, this error occurs.
Assuming the system is CentOS, run the following commands (on Ubuntu, replace sudo yum install with sudo apt-get install):
$ sudo yum install gcc-4.4
$ sudo rm /usr/bin/gcc
$ sudo ln -s /usr/bin/gcc-4.4 /usr/bin/gcc
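On distributions where gcc is managed through update-alternatives, a less destructive approach is to register both compilers and switch between them (a sketch; the gcc-4.4 binary path may differ on your system):
$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.4 100
$ sudo update-alternatives --config gcc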
Run the following command to check whether the replacement succeeded:
$ gcc --version
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
(c) The following error message is displayed:
[exec] /bin/bash ./libtool --tag=CC --mode=link gcc -g -Wall -fPIC -O2 -m64 -g -O2 -version-info 0:1:0 -L/usr/local//lib -o libhadoopsnappy.la -rpath /usr/local/lib src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.lo src/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.lo -ljvm -ldl
[exec] /usr/bin/ld: cannot find -ljvm
[exec] collect2: ld returned 1 exit status
[exec] make: *** [libhadoopsnappy.la] Error 1
[exec] libtool: link: gcc -shared -fPIC -DPIC src/org/apache/hadoop/io/compress/snappy/.libs/SnappyCompressor.o src/org/apache/hadoop/io/compress/snappy/.libs/SnappyDecompressor.o -L/usr/local//lib -ljvm -ldl -O2 -m64 -O2 -Wl,-soname -Wl,libhadoopsnappy.so.0 -o .libs/libhadoopsnappy.so.0.0.1
This happens because the libjvm.so from the JVM installation is not linked into /usr/local/lib. On a 64-bit system, libjvm.so is found under the jre/lib/amd64/server directory of the JDK (for example /root/bin/jdk1.6.0_37/jre/lib/amd64/server/libjvm.so). Link it with the following command, adjusting the JDK path to your installation:
$ sudo ln -s /usr/local/jdk1.6.0_45/jre/lib/amd64/server/libjvm.so /usr/local/lib/
This solves the problem.
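If you are unsure where libjvm.so lives on your system, it can be located with (assuming JAVA_HOME points at your JDK):
$ find $JAVA_HOME -name libjvm.so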
5. Configure snappy for Hadoop 2.2.0
After hadoop-snappy compiles successfully, several files are generated in the target directory under the hadoop-snappy directory, one of which is named hadoop-snappy-0.0.1-SNAPSHOT.tar.gz.
1) Extract hadoop-snappy-0.0.1-SNAPSHOT.tar.gz under target, then copy the native libraries:
$ sudo cp -r ~/hadoop-snappy/target/hadoop-snappy-0.0.1-SNAPSHOT/lib/native/Linux-amd64-64/* $HADOOP_HOME/lib/native/Linux-amd64-64/
2) Copy hadoop-snappy-0.0.1-SNAPSHOT.jar under target to $HADOOP_HOME/lib.
3) Configure $HADOOP_HOME/etc/hadoop/hadoop-env.sh and add:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/Linux-amd64-64/:/usr/local/lib/
4) Configure $HADOOP_HOME/etc/hadoop/mapred-site.xml. The compression-related configuration options in this file are:
<property>
  <name>mapred.output.compress</name>
  <value>false</value>
  <description>Should the job outputs be compressed?
  </description>
</property>
<property>
  <name>mapred.output.compression.type</name>
  <value>RECORD</value>
  <description>If the job outputs are to be compressed as SequenceFiles, how should
  they be compressed? Should be one of NONE, RECORD or BLOCK.
  </description>
</property>
<property>
  <name>mapred.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec</value>
  <description>If the job outputs are compressed, how should they be compressed?
  </description>
</property>
<property>
  <name>mapred.compress.map.output</name>
  <value>false</value>
  <description>Should the outputs of the maps be compressed before being
  sent across the network. Uses SequenceFile compression.
  </description>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec</value>
  <description>If the map outputs are compressed, how should they be
  compressed?
  </description>
</property>
Configure these options as needed. The available codecs are registered as follows:
<property>
  <name>io.compression.codecs</name>
  <value>
    org.apache.hadoop.io.compress.GzipCodec,
    org.apache.hadoop.io.compress.DefaultCodec,
    org.apache.hadoop.io.compress.BZip2Codec,
    org.apache.hadoop.io.compress.SnappyCodec
  </value>
</property>
SnappyCodec is the snappy compression codec.
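For example, to compress intermediate map output with snappy (a common setting; adjust to your workload), set:
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>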
5) After configuration, restart the Hadoop cluster.
6. Configure snappy in HBase 0.98
1) Set up the native libraries in HBase's lib/native/Linux-amd64-64 directory. The simplest way is to copy them from $HADOOP_HOME/lib/native/Linux-amd64-64/ to the corresponding HBase directory:
$ sudo cp -r $HADOOP_HOME/lib/native/Linux-amd64-64/* $HBASE_HOME/lib/native/Linux-amd64-64/
2) Configure the HBase environment variables in hbase-env.sh:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/Linux-amd64-64/:/usr/local/lib/
export HBASE_LIBRARY_PATH=$HBASE_LIBRARY_PATH:$HBASE_HOME/lib/native/Linux-amd64-64/:/usr/local/lib/
export CLASSPATH=$CLASSPATH:$HBASE_LIBRARY_PATH
Note: do not forget to set HADOOP_HOME and HBASE_HOME at the beginning of hbase-env.sh.
3) After configuration, restart HBase.
4) Verify that the installation succeeded.
Run the following command in the HBase installation directory:
$ bin/hbase shell
15:11:35,874 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.98.2-hadoop2, r1591526, Wed Apr 30 20:17:33 PDT 2014
hbase(main):001:0>
Then execute a create statement:
hbase(main):001:0> create 'test_snappy', {NAME => 'cf', COMPRESSION => 'snappy'}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/q/hbase/hbase-0.98.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/q/hadoop2x/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
0 row(s) in 1.2580 seconds
=> Hbase::Table - test_snappy
hbase(main):002:0>
View the newly created test_snappy table:
hbase(main):002:0> describe 'test_snappy'
DESCRIPTION ENABLED
 'test_snappy', {NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSIO true
 N => 'SNAPPY', MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOC
 KCACHE => 'true'}
1 row(s) in 0.0420 seconds
As you can see, COMPRESSION => 'SNAPPY'.
Next, try inserting data:
hbase(main):003:0> put 'test_snappy', 'key1', 'cf:q1', 'value1'
0 row(s) in 0.0790 seconds
hbase(main):004:0>
Then try scanning the test_snappy table:
hbase(main):004:0> scan 'test_snappy'
ROW COLUMN+CELL
 key1 column=cf:q1, timestamp=1407395814255, value=value1
1 row(s) in 0.0170 seconds
hbase(main):005:0>
If all of the above steps execute correctly, the configuration is working.
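HBase also ships a small CompressionTest utility that checks whether a codec can be loaded; a quick check (the file path is only illustrative) is:
$ bin/hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/test.txt snappy
If it prints SUCCESS, the snappy codec is usable.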
Error solution:
(a) The following exception occurs when HBase is started after configuration:
WARN [main] util.CompressionTest: Can't instantiate codec: snappy
java.io.IOException: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
at org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:96)
at org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:62)
at org.apache.hadoop.hbase.regionserver.HRegionServer.checkCodecs(HRegionServer.java:660)
at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:538)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
This indicates that the environment variables have not been set up correctly; check the configuration in hbase-env.sh.
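To catch this class of problem at startup rather than at table-creation time, HBase can be told that the snappy codec must load successfully before a region server starts (a sketch; list any other codecs you require as a comma-separated value) by adding to hbase-site.xml:
<property>
  <name>hbase.regionserver.codecs</name>
  <value>snappy</value>
</property>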