Integrating Hadoop + Spark + Hive for Local Development in Eclipse: An Illustrated Guide


The previous article covered a Java + Spark + Hive + Maven setup and its exception handling; the example there was packaged and run on Linux. When the same code is run directly on Windows, Hive-related exceptions are thrown. This article shows how to set up an integrated Hadoop + Spark + Hive development environment on Windows.

I. Development environment

OS: Windows 7

JDK: jdk1.7

Eclipse: Mars.2 Release (4.5.2)

Hadoop: hadoop-2.6.5

Spark: spark-1.6.2-bin-hadoop2.6

Hive: hive-2.1.1

II. Preliminary preparation

1. System environment variables

Configure the system environment variables for the JDK, Hadoop, and Spark (typically JAVA_HOME, HADOOP_HOME, and SPARK_HOME, with the corresponding bin directories added to Path).

2. Hadoop-related files

winutils.exe and hadoop.dll; download: winutils and hadoop.dll for hadoop-2.6.5

Place the two files above in the ..\hadoop-2.6.5\bin directory;

Also copy winutils.exe to C:\Windows\System32;

3. Create the tmp/hive directory

Create a tmp/hive directory on the drive that holds the application project; since my project lives on drive E, the tmp/hive directory is created on drive E.
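The two Windows-specific preparations above (winutils.exe/hadoop.dll and the tmp/hive scratch directory) are exactly the pieces that most often break local runs. Below is a minimal sketch, not part of the original article, that checks them at JVM start; the E:\hadoop-2.6.5 and E:\tmp\hive paths are assumptions that mirror the article's setup. Setting the hadoop.home.dir system property is an alternative to defining the HADOOP_HOME environment variable, since Hadoop's Shell utility consults the property before the environment variable.

import java.io.File;

public class WindowsHadoopCheck {
    public static void main(String[] args) {
        // Assumption: Hadoop was unpacked to E:\hadoop-2.6.5 as in this article.
        // Hadoop reads the "hadoop.home.dir" system property before falling back
        // to the HADOOP_HOME environment variable.
        System.setProperty("hadoop.home.dir", "E:\\hadoop-2.6.5");

        // winutils.exe must sit in %HADOOP_HOME%\bin (step 2 above).
        File winutils = new File("E:\\hadoop-2.6.5\\bin\\winutils.exe");
        if (!winutils.exists()) {
            throw new IllegalStateException("winutils.exe not found: " + winutils);
        }

        // The Hive scratch directory from step 3. If Hive later complains that
        // /tmp/hive is not writable, granting permissions with something like
        //   E:\hadoop-2.6.5\bin\winutils.exe chmod 777 E:\tmp\hive
        // usually resolves it.
        File scratchDir = new File("E:\\tmp\\hive");
        if (!scratchDir.exists() && !scratchDir.mkdirs()) {
            throw new IllegalStateException("could not create " + scratchDir);
        }
        System.out.println("Windows Hadoop/Hive prerequisites look OK.");
    }
}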

III. Hive configuration

1. Hive environment

Hive itself is deployed on a remote Linux cluster; its metastore service is reachable at 10.32.19.50:9083.

Deploying Hive on Linux is outside the scope of this article; please refer to the relevant documentation.

2. hive-site.xml configuration on Windows

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- Warehouse directory for Hive-managed data -->
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
    <!-- Use a remote (non-local) metastore -->
    <property>
        <name>hive.metastore.local</name>
        <value>false</value>
    </property>
    <!-- Remote metastore address -->
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://10.32.19.50:9083</value>
    </property>
</configuration>


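For this file to take effect, hive-site.xml must be on the application's runtime classpath (for a Maven project, src/main/resources is the usual place). As an alternative, the same settings can be applied programmatically on the HiveContext before the first query; the following is a hedged sketch, not taken from the original article, using the article's metastore address.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.hive.HiveContext;

public class ProgrammaticHiveConf {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("ProgrammaticHiveConf").setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        HiveContext hiveContext = new HiveContext(sc);

        // Equivalent to the hive-site.xml entries above; may be useful when the
        // XML file is not on the classpath. Values mirror the article's cluster,
        // adjust them to your own environment.
        hiveContext.setConf("hive.metastore.uris", "thrift://10.32.19.50:9083");
        hiveContext.setConf("hive.metastore.warehouse.dir", "/user/hive/warehouse");

        // Quick connectivity check against the remote metastore.
        hiveContext.sql("show databases").show();
        sc.stop();
    }
}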


IV. Example test

Requirement: query Hive data and display the results correctly in Eclipse.

1. Example project structure

(Screenshot: example project layout)

2. pom file

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.lm.hive</groupId>
    <artifactId>SparkHive</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>SparkHive</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <!-- spark -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.6.0</version>
            <exclusions>
                <exclusion>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-client</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>1.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.10</artifactId>
            <version>1.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.mongodb.spark</groupId>
            <artifactId>mongo-spark-connector_2.10</artifactId>
            <version>1.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.derby</groupId>
            <artifactId>derby</artifactId>
            <version>10.10.2.0</version>
        </dependency>
        <!-- hadoop -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.6.4</version>
            <exclusions>
                <exclusion>
                    <groupId>javax.servlet</groupId>
                    <artifactId>*</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
    </dependencies>

    <build>
        <sourceDirectory>src/main/java</sourceDirectory>
        <testSourceDirectory>src/main/test</testSourceDirectory>
        <plugins>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <mainClass></mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>1.2.1</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>exec</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <executable>java</executable>
                    <includeProjectDependencies>true</includeProjectDependencies>
                    <includePluginDependencies>false</includePluginDependencies>
                    <classpathScope>compile</classpathScope>
                    <mainClass></mainClass>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.1</version>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                    <showWarnings>true</showWarnings>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
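Two details of this pom are worth noting. hadoop-client is excluded from spark-core, presumably so that the explicitly declared hadoop-client 2.6.4 (matching the 2.6.x cluster) is the only Hadoop client on the classpath, and the javax.servlet exclusion avoids servlet-api clashes with Spark's embedded Jetty. The maven-assembly-plugin is bound to the package phase, so a plain mvn clean package produces the jar-with-dependencies used for the Linux deployment described in the previous article.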



3. Test case implementation

package com.lm.hive.SparkHive;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.hive.HiveContext;

/**
 * Fetch Hive data with Spark SQL
 */
public class App {
    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("SparkHive").setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);

        // Do not use SQLContext here: when deployed it cannot find the database and tables
        HiveContext hiveContext = new HiveContext(sc);
//        SQLContext sqlContext = new SQLContext(sc);

        // Query the first 10 rows of the table
        hiveContext.sql("select * from bi_ods.owms_m_locator limit 10").show();

        sc.stop();
    }
}
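Because the master is hard-coded to local[2], this class can be run straight from Eclipse (Run As > Java Application); no spark-submit or local cluster is needed, only network access to the metastore at 10.32.19.50:9083 and the Windows preparations from section II.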

4. Test results

(Screenshot: test results)


Code download: eclipse + hadoop + spark + hive integration example code
