Installation of the Apache Zeppelin for the Spark Interactive analytics platform

Source: Internet
Author: User
Tags install node

Zeppelin Introduction

Apache Zeppelin provides a web version of a similar Ipython notebook for data analysis and visualization. The back can be connected to different data processing engines, including Spark, Hive, Tajo, native support Scala, Java, Shell, Markdown and so on. Its overall presentation and use form is the same as the Databricks cloud, which comes from the demo at the time.

Zeppelin can achieve what you need:
-Data acquisition
-Data discovery
-Data analysis
-Visualization and collaboration of data

Supports multiple languages, the default is Scala (behind the Spark shell), sparksql, Markdown and Shell.

You can even add your own language support. How to write a Zeppelin interpreter

Zeppelin features Apache Spark integration

Zeppelin provides built-in Apache Spark integration. You don't need to build a module, plug-in, or library separately.
The Zeppelin Spark Integration provides:
-Automatic introduction of Sparkcontext and SqlContext
-Load the jar packages that are dependent on the runtime from the local file system or from the MAVEN library. More about dependent Loader
-Can cancel job and show job progress

Visualization of data

Some basic charts are already included in the Zeppelin. Visualization is not limited to sparksql queries, and the output of any language in the backend can be identified and visualized.
Bank

Dynamic Tables
Zeppelin can create some input formats dynamically in your notebook.

Collaboration
Notebook URLs can be shared between collaborators. Zeppelin can then broadcast any changes in real time, just like in Google Docs.

Publish
Zeppelin provides a URL to show only the results, and that page does not include Zeppelin menus and buttons. This way, you can easily integrate it into your site as an IFRAME.

Installation deployment for Zeppelin

Since Zeppelin does not currently provide binary installation packages, the installation of Zeppelin here requires its own compilation.
Here you can refer to Zeppelin GitHub and install Zeppelin

Preparatory work

need to
Java 1.7
Tested on Mac OSX, Ubuntu 14.X, CentOS 6.X
Maven (if you want to build from the source code)
node. js Package Manager

This installation can be done in Ubuntu environments:

sudo apt-get updatesudo apt-get install openjdk-7-jdksudo apt-get install gitsudo apt-get install mavensudo apt-get install npm

Note: If the Maven tool here is not the latest source, it may just be Maven2,zeppelin compilation needs Maven3, or some tools will be affected by the download, you can download the binary compression package from MAVEN website, directly use.
The node command is also required here, and the Nodejs command will be installed automatically when Apt-get installs NPM, where only a link can be created:sudo ln -s /usr/bin/nodejs /usr/bin/node

Installation configuration for the Zeppelin-web project

I used to zeppelin the entire project MAVEN deployment of the time always appear Zeppelin-web project failure, not its solution, referring to the method on the Web, the Zeppelin-web project for a separate installation configuration.
Every step here is critical, I am here to install the installation toss many times, in order to complete the normal installation, the following one by one road.

Remove the contents of the Zeppelin-web project Pom.xml below, in exchange for manual installation:

<plugin>        <groupId>Com.github.eirslett</groupId>        <artifactid>Frontend-maven-plugin</artifactid>        <version>0.0.23</version>        <executions>          <execution>            <ID>Install node and NPM</ID>            <goals>              <goal>install-node-and-npm</goal>            </goals>            <configuration>              <nodeversion>v0.10.18</nodeversion>              <npmversion>1.3.8</npmversion>            </configuration>          </Execution>          <execution>            <ID>NPM Install</ID>            <goals>              <goal>Npm</goal>            </goals>          </Execution>          <execution>            <ID>Bower Install</ID>            <goals>                <goal>Bower</goal>            </goals>            <configuration>              <arguments>--allow-root Install</arguments>            </configuration>          </Execution>          <execution>            <ID>Grunt Build</ID>            <goals>                <goal>Grunt</goal>            </goals>            <configuration>              <arguments>--no-color--force</arguments>            </configuration>          </Execution>        </executions>      </plugin>

Manual Installation steps:
1. Install NPM and Node
2. Enter the Zeppelin-web directory and execute npm install . It installs some grunt components according to Package.json's description, installs the Bower, and then produces a node_modules directory under the catalog.
3. Execution bower –-allow-root install , will be based on Bower.json installation of pre-Library dependencies, a bit similar to Java MVN.
4. Execute grunt --no-color –-force , the Web file will be organized according to Gruntfile.js.
3rd, 4 steps to note that the command used in the Bower and grunt files that were originally given, "node/node" because when using MAVEN automatic installation, the node directory is generated under the current directory, which contains the node command. We have previously installed the Nodejs command and have a new link to the command node, so we need to modify it here "node" .
5. Execute mvn install -DskipTests , Package Web project, generate war in target directory
Pom.xml when generating the war package, refer to the dist\WEB-INF\web.xml file, so before performing this step, it is necessary to clear the Zeppelin-web directory by the Dist directory in order to eventually generate the correct war package.

Compilation of other Zeppelin projects

Other projects are compiled according to normal procedures, installation documentation: http://zeppelin.incubator.apache.org/docs/install/install.html

To compile your own way:
Local mode:
mvn install -DskipTests
Cluster mode:
mvn install -DskipTests -Dspark.version=1.1.0 -Dhadoop.version=2.2.0

Configuration

The configuration file is the environment variable file (conf/zeppelin-env.sh) and the Java Properties File (Conf/zeppelin-site.xml). Configure according to your requirements.

Start, close

The start and close Zeppelin process commands are:
bin/zeppelin-daemon.sh start
bin/zeppelin-daemon.sh stop

Resources

Apache Zeppelin Installation and introduction

reprint Please indicate the author Jason Ding and its provenance
Gitcafe Blog Home page (http://jasonding1354.gitcafe.io/)
GitHub Blog Home page (http://jasonding1354.github.io/)
CSDN Blog (http://blog.csdn.net/jasonding1354)
Jane Book homepage (http://www.jianshu.com/users/2bd9b48f6ea8/latest_articles)
Google search jasonding1354 go to my blog homepage

Copyright NOTICE: This article for Bo Master original article, without Bo Master permission not reproduced.

Installation of the Apache Zeppelin for the Spark Interactive analytics platform

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.