Eclipse Integration of Hadoop+Spark+Hive for Local Development (Illustrated)


In the previous article we implemented a Java+Spark+Hive+Maven example with exception handling. That test instance was packaged and run in a Linux environment, but when run directly on a Windows system it produces Hive-related exceptions. This article walks through integrating a Hadoop+Spark+Hive development environment on Windows.

I. Development Environment

System: Windows 7

JDK: jdk1.7

Eclipse: Mars.2 Release (4.5.2)

Hadoop: hadoop-2.6.5

Spark: spark-1.6.2-bin-hadoop2.6

Hive: hive-2.1.1

Two. Pre-preparation

1. System Environment Configuration

Configure the JDK, Hadoop, and Spark in the system environment variables, as sketched below.
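As a minimal sketch (the install paths are illustrative assumptions; adjust them to your machine), the Windows environment variables might look like:

JAVA_HOME   = C:\Program Files\Java\jdk1.7.0_80
HADOOP_HOME = E:\hadoop-2.6.5
SPARK_HOME  = E:\spark-1.6.2-bin-hadoop2.6
Path        = %Path%;%JAVA_HOME%\bin;%HADOOP_HOME%\bin;%SPARK_HOME%\bin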

2. Hadoop Related Files

winutils.exe and hadoop.dll, download address: hadoop2.6.5 winutils and Hadoop

Place the two files above in the \hadoop-2.6.5\bin directory;

At the same time, place winutils.exe in the C:\Windows\System32 directory.
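If Spark still cannot find winutils.exe at runtime, a common additional step is to point hadoop.home.dir at the Hadoop directory before the Spark context is created (a sketch; the path is an assumption following the layout above):

// Hypothetical install path; adjust to where hadoop-2.6.5 actually lives.
System.setProperty("hadoop.home.dir", "E:\\hadoop-2.6.5");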

3. New tmp/hive Directory

Create a tmp/hive directory on the drive that holds the application project. Since my project is on the E drive, I created the tmp/hive directory on the E drive.
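If Hive later complains that this directory is not writable, granting it permissions with winutils is a commonly used fix (a sketch, assuming the E:\tmp\hive directory above):

E:\hadoop-2.6.5\bin\winutils.exe chmod 777 E:\tmp\hive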

Three. Hive Configuration

1. Hive Environment

The Hive used here is deployed on a remote Linux cluster; its metastore service address is 10.32.19.50:9083.

Deploying Hive in a Linux environment is not covered in this article; please refer to the relevant documentation.

2. Windows hive-site.xml File Configuration

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- Data warehouse storage address -->
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>

    <!-- The metastore is not local -->
    <property>
        <name>hive.metastore.local</name>
        <value>false</value>
    </property>

    <!-- Remote metastore (data source) address -->
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://10.32.19.50:9083</value>
    </property>
</configuration>


hive-site.xml configuration on Windows. Place this file on the application classpath (for a Maven project, typically src/main/resources) so the HiveContext can find it at startup.
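Alternatively, the same settings can be applied programmatically on the HiveContext created in the test case below, instead of shipping a hive-site.xml (a sketch using the values above; setConf comes from the Spark 1.6 SQLContext API):

// Apply the metastore settings from hive-site.xml in code instead.
hiveContext.setConf("hive.metastore.uris", "thrift://10.32.19.50:9083");
hiveContext.setConf("hive.metastore.warehouse.dir", "/user/hive/warehouse");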


Four. Instance Test

Requirement: query Hive data and display the results normally in Eclipse.

1. Case Engineering Structure

(Figure: example project structure)

2. pom File

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.lm.hive</groupId>
    <artifactId>SparkHive</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>SparkHive</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>utf-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <!-- spark -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.6.0</version>
            <exclusions>
                <exclusion>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-client</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>1.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.10</artifactId>
            <version>1.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.mongodb.spark</groupId>
            <artifactId>mongo-spark-connector_2.10</artifactId>
            <version>1.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.derby</groupId>
            <artifactId>derby</artifactId>
            <version>10.10.2.0</version>
        </dependency>
        <!-- hadoop -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.6.4</version>
            <exclusions>
                <exclusion>
                    <groupId>javax.servlet</groupId>
                    <artifactId>*</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
    </dependencies>

    <build>
        <sourceDirectory>src/main/java</sourceDirectory>
        <testSourceDirectory>src/main/test</testSourceDirectory>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <mainClass></mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>1.2.1</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>exec</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <executable>java</executable>
                    <includeProjectDependencies>true</includeProjectDependencies>
                    <includePluginDependencies>false</includePluginDependencies>
                    <classpathScope>compile</classpathScope>
                    <mainClass></mainClass>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.1</version>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                    <showWarnings>true</showWarnings>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
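Because the assembly plugin is bound to the package phase with the jar-with-dependencies descriptor, a standard Maven build is enough to produce the self-contained jar used on the Linux cluster:

mvn clean package

The exec-maven-plugin configured above additionally allows launching the application locally through Maven once a mainClass is filled in.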



3. Test Case Implementation

package com.lm.hive.SparkHive;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.hive.HiveContext;

/**
 * Spark SQL fetches Hive data
 */
public class App
{
    public static void main(String[] args)
    {
        SparkConf sparkConf = new SparkConf().setAppName("SparkHive").setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);

        // Use HiveContext rather than SQLContext: with SQLContext the deployment
        // throws an exception that the database and table cannot be found.
        HiveContext hiveContext = new HiveContext(sc);
        // SQLContext sqlContext = new SQLContext(sc);

        // Query the top 10 rows of the table
        hiveContext.sql("select * from bi_ods.owms_m_locator limit 10").show();

        sc.stop();
    }
}
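As a quick sanity check, it can help to first list the databases visible through the metastore before querying a specific table (a sketch; the database names depend on your cluster):

// If only "default" shows up, hive-site.xml was probably not picked up.
hiveContext.sql("show databases").show();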

4. Test Results

(Figure: test results shown in the Eclipse console)


Code download address: Eclipse integrated Hadoop+Spark+Hive development instance code
