Zeppelin Introduction
Apache Zeppelin is a web-based notebook, similar to the IPython notebook, for data analysis and visualization. Its backend can connect to different data processing engines, including Spark, Hive, and Tajo, and it natively supports Scala, Java, Shell, Markdown, and more. Its overall presentation and usage resemble Databricks Cloud as it appeared in the demo at the time.
Zeppelin covers what you need:
-Data acquisition
-Data discovery
-Data analysis
-Visualization and collaboration of data
Multiple languages are supported; the defaults are Scala (backed by the Spark shell), SparkSQL, Markdown, and Shell.
You can even add support for your own language; see "How to write a Zeppelin interpreter".
Zeppelin features
Apache Spark integration
Zeppelin provides built-in Apache Spark integration. You don't need to build a module, plug-in, or library separately.
The Zeppelin Spark Integration provides:
-Automatic injection of SparkContext and SQLContext
-Runtime loading of jar dependencies from the local file system or from a Maven repository. See the documentation on the dependency loader for more
-Cancelling jobs and displaying job progress
Visualization of data
Some basic charts are already included in Zeppelin. Visualization is not limited to SparkSQL queries; the output of any language backend can be recognized and visualized.
Dynamic forms
Zeppelin can create input forms dynamically in your notebook.
Collaboration
Notebook URLs can be shared between collaborators. Zeppelin can then broadcast any changes in real time, just like in Google Docs.
Publish
Zeppelin provides a URL that shows only the results; that page does not include Zeppelin's menus and buttons. This way, you can easily embed it in your own site as an iframe.
Installing and deploying Zeppelin
Since Zeppelin does not currently provide binary installation packages, it has to be compiled from source.
For reference, see the Zeppelin GitHub page and the "Install Zeppelin" documentation.
Prerequisites
Requirements:
Java 1.7
Tested on Mac OSX, Ubuntu 14.X, CentOS 6.X
Maven (if you want to build from the source code)
Node.js package manager (npm)
On Ubuntu, these can be installed as follows:
sudo apt-get update
sudo apt-get install openjdk-7-jdk
sudo apt-get install git
sudo apt-get install maven
sudo apt-get install npm
Note: If the package source is not up to date, apt-get may install only Maven 2, whereas compiling Zeppelin requires Maven 3, and the download of some tools may also be affected. In that case, download the binary archive from the Maven website and use it directly.
The node command is also required. When apt-get installs npm, the nodejs command is installed automatically, so you only need to create a link:
sudo ln -s /usr/bin/nodejs /usr/bin/node
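Before moving on, it can help to confirm that the required tools are on the PATH and that the installed Maven is version 3. The following is a minimal sketch and not part of the Zeppelin project; missing_tools and maven_major are hypothetical helper names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Zeppelin) for checking build prerequisites.

# Print each command from the argument list that is not found on PATH.
# Returns 1 if at least one command is missing, 0 otherwise.
missing_tools() {
  local status=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Read `mvn -version` output on stdin and print Maven's major version number.
maven_major() {
  sed -n 's/.*Apache Maven \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Example usage on the build machine:
#   missing_tools java git mvn node npm || echo "install the tools listed above"
#   [ "$(mvn -version | maven_major)" -ge 3 ] || echo "Maven 3 is required"
```

If missing_tools prints node, the symbolic link has probably not been created yet.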
Installing and configuring the zeppelin-web project
When I deployed the whole Zeppelin project with Maven, the zeppelin-web project always failed and I could not work out why. Following a method found on the web, I installed and configured the zeppelin-web project separately.
Every step here is critical; it took me many attempts before the installation completed normally. The steps are described one by one below.
Remove the following section from the zeppelin-web project's pom.xml; manual installation takes its place:
<plugin>
  <groupId>com.github.eirslett</groupId>
  <artifactId>frontend-maven-plugin</artifactId>
  <version>0.0.23</version>
  <executions>
    <execution>
      <id>install node and npm</id>
      <goals>
        <goal>install-node-and-npm</goal>
      </goals>
      <configuration>
        <nodeVersion>v0.10.18</nodeVersion>
        <npmVersion>1.3.8</npmVersion>
      </configuration>
    </execution>
    <execution>
      <id>npm install</id>
      <goals>
        <goal>npm</goal>
      </goals>
    </execution>
    <execution>
      <id>bower install</id>
      <goals>
        <goal>bower</goal>
      </goals>
      <configuration>
        <arguments>--allow-root install</arguments>
      </configuration>
    </execution>
    <execution>
      <id>grunt build</id>
      <goals>
        <goal>grunt</goal>
      </goals>
      <configuration>
        <arguments>--no-color --force</arguments>
      </configuration>
    </execution>
  </executions>
</plugin>
Manual installation steps:
1. Install npm and node.
2. Enter the zeppelin-web directory and execute npm install. Following package.json, it installs some grunt components and bower, and produces a node_modules directory there.
3. Execute bower --allow-root install, which installs the front-end library dependencies listed in bower.json, somewhat like mvn for Java.
4. Execute grunt --no-color --force, which assembles the web files according to Gruntfile.js.
Note for steps 3 and 4: the bower and grunt script files as originally provided invoke "node/node", because the automatic Maven installation generates a node directory under the current directory that contains the node command. Since we installed the nodejs command ourselves and created the node link, this needs to be changed to "node".
5. Execute mvn install -DskipTests to package the web project and generate the war file in the target directory.
When generating the war package, pom.xml refers to the dist\WEB-INF\web.xml file, so before running this step, make sure the dist directory is present under the zeppelin-web directory; otherwise the war package will not be generated correctly.
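The manual steps 2 to 5 can be collected into a small script. This is only a sketch under stated assumptions: run and build_zeppelin_web are hypothetical helper names, DRY_RUN is a variable made up for this example, and the script assumes that npm, bower, grunt and mvn can be invoked exactly as written in the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helper script (not part of Zeppelin) wrapping manual steps 2-5.
# Assumes npm, bower, grunt and mvn can be invoked exactly as written in the steps.

# Execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Build zeppelin-web manually; the argument is the path to the zeppelin-web directory.
build_zeppelin_web() {
  local web_dir="${1:-zeppelin-web}"
  (
    # The subshell keeps the directory change local to this function.
    if [ "${DRY_RUN:-0}" != "1" ]; then
      cd "$web_dir" || exit 1
    fi
    run npm install &&                      # step 2: creates node_modules
    run bower --allow-root install &&       # step 3: dependencies from bower.json
    run grunt --no-color --force &&         # step 4: web files via Gruntfile.js
    run test -f dist/WEB-INF/web.xml &&     # the war packaging refers to this file
    run mvn install -DskipTests             # step 5: war in target/
  )
}

# Example usage:
#   DRY_RUN=1 build_zeppelin_web zeppelin-web   # print the commands only
#   build_zeppelin_web zeppelin-web             # run them
```

Chaining the commands with && stops the build at the first failing step, which makes it easier to see which of the steps needs attention.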
Compiling the other Zeppelin projects
The other projects are compiled following the normal procedure; see the installation documentation: http://zeppelin.incubator.apache.org/docs/install/install.html
The way I compiled them:
Local mode:
mvn install -DskipTests
Cluster mode:
mvn install -DskipTests -Dspark.version=1.1.0 -Dhadoop.version=2.2.0
Configuration
The configuration files are the environment variable file (conf/zeppelin-env.sh) and the Java properties file (conf/zeppelin-site.xml). Configure them according to your requirements.
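As an illustration, a conf/zeppelin-env.sh for a standalone Spark cluster might look like the sketch below. The variable names are the ones I understand the zeppelin-env.sh template to use, so check them against the template in your own checkout; the host name, port and memory value are placeholder assumptions, not recommendations.

```shell
# conf/zeppelin-env.sh (sketch; all values are placeholder assumptions)

# Port for the Zeppelin web server.
export ZEPPELIN_PORT=8080

# Spark master URL; leave unset to run Spark in local mode.
export MASTER=spark://master-host:7077

# Extra JVM options, here passing a Spark property.
export ZEPPELIN_JAVA_OPTS="-Dspark.executor.memory=2g"
```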
Starting and stopping
The commands to start and stop the Zeppelin process are:
bin/zeppelin-daemon.sh start
bin/zeppelin-daemon.sh stop
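The two commands above are relative to the Zeppelin directory. If you want to call them from elsewhere, a small wrapper such as the following sketch may be convenient. It is not part of Zeppelin: zeppelin_ctl is a hypothetical function name, and ZEPPELIN_HOME is assumed here to point at the directory where Zeppelin was compiled.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of Zeppelin) around bin/zeppelin-daemon.sh.
# ZEPPELIN_HOME is an assumed variable pointing at the Zeppelin checkout.

zeppelin_ctl() {
  local action="$1"
  local home="${ZEPPELIN_HOME:-.}"
  case "$action" in
    start|stop)
      # Delegate to the daemon script shown above.
      "$home/bin/zeppelin-daemon.sh" "$action"
      ;;
    *)
      echo "usage: zeppelin_ctl start|stop" >&2
      return 2
      ;;
  esac
}

# Example usage:
#   ZEPPELIN_HOME=/path/to/zeppelin zeppelin_ctl start
```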
Resources
Apache Zeppelin Installation and introduction
When reprinting, please credit the author, Jason Ding, and cite the source.
GitCafe blog home page (http://jasonding1354.gitcafe.io/)
GitHub Blog Home page (http://jasonding1354.github.io/)
CSDN Blog (http://blog.csdn.net/jasonding1354)
Jianshu home page (http://www.jianshu.com/users/2bd9b48f6ea8/latest_articles)
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
Installation of the Apache Zeppelin for the Spark Interactive analytics platform