Configuring /etc/profile Environment Variables
Step 1: Use the following command to open the /etc/profile file:

$ sudo vi /etc/profile

Step 2: Set the following variables:

export HADOOP_HOME=/app/hadoop/hadoop-2.2.0
export HIVE_HOME=/app/complied/hive-0.13.1-src
export HIVE_DEV_HOME=/app/complied/hive-0.13.1-src

Step 3: Apply and verify the configuration:

$ source /etc/profile
$ echo $HIVE_DEV_HOME

1.3.3 Run sbt for compilation

To run hive/console, you do not need to start Spark.
Reference site: https://github.com/yahoo/kafka-manager

First, the features:

Manage multiple Kafka clusters
Conveniently inspect Kafka cluster state (topics, brokers, replica distribution, partition distribution)
Run preferred replica election
Generate partition assignments based on the current state of the cluster
Topic configuration and topic creation (configuration differs between 0.8.1.1 and 0.8.2)
Topic deletion (supported only on 0.8.2 and above, and requires delete.topic.enable=true in the broker config)
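Topic deletion only works when every broker allows it. A minimal sketch of enabling the flag (the path below is a stand-in for your broker's real config file, e.g. $KAFKA_HOME/config/server.properties):

```shell
# Stand-in path for the broker config; replace with your broker's server.properties
CONF=/tmp/server.properties

# Append the flag only if it is not already set
grep -q '^delete.topic.enable=true' "$CONF" 2>/dev/null || \
  echo 'delete.topic.enable=true' >> "$CONF"

cat "$CONF"
```

Every broker in the cluster must carry this setting, and brokers need a restart for it to take effect.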
Configuring the Play Framework environment: download the jar package (Play with Activator). This step is somewhat confusing for a Java programmer working from cmd: running the jar downloads a large number of configuration files, and some resources cannot be reached without a VPN, so the machine was left running overnight. Next, go to the folder where the files are located and add the Activator directory (\activator) to the PATH environment variable. When creating a new project you are prompted to choose a template; a Java template should be chosen.
by each task; therefore, these variables are not shared between tasks. However, sometimes variables need to be shared across tasks, or between tasks and the driver program. Spark supports two types of shared variables:

Broadcast variables: cached on every node, used for read-only data
Accumulators: variables that only support addition, such as counters and sums
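As a sketch of both kinds, a spark-shell session using the Spark 1.x API of this era (names and values follow the programming guide's examples) might look like:

```
scala> val broadcastVar = sc.broadcast(Array(1, 2, 3))
scala> broadcastVar.value
res0: Array[Int] = Array(1, 2, 3)

scala> val accum = sc.accumulator(0, "My Accumulator")
scala> sc.parallelize(Array(1, 2, 3, 4)).foreach(x => accum += x)
scala> accum.value
res2: Int = 10
```

The broadcast value is shipped to each node once and read there; the accumulator is only added to inside tasks, and only the driver reads its value.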
Some examples below show some of these features. It is better to be familiar with Scala, especially its packaging method. Note that Spark can run

are some certificate files in the referenced third-party jar packages. Of course, you can also package with Maven: run mvn clean package in the terminal, with the related configuration in the pom:
Question 2: SBT packaging

Using sbt package will package only your own program; the dependent jar packages are not included in the same jar. For that you need the sbt-assembly plug-in.
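A minimal sketch of wiring in sbt-assembly (the project path and plugin version are illustrative; the actual build step is shown commented out, since it needs a complete sbt project and network access):

```shell
# Work in a scratch project directory (illustrative path)
mkdir -p /tmp/demo-project/project
cd /tmp/demo-project

# Register the sbt-assembly plugin (version is illustrative; match it to your sbt release)
echo 'addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")' >> project/plugins.sbt

cat project/plugins.sbt
# sbt assembly   # would then build a single "fat" jar that includes the dependency jars
```

With the plugin registered, `sbt assembly` replaces `sbt package` whenever the dependencies must travel inside the same jar.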
Compilation of Spark Source code package
Spark source can be compiled with Maven or with SBT (Simple Build Tool); the different approaches suit different scenarios:
Maven Compilation
SBT Compilation
IntelliJ IDEA compilation (using the Maven or SBT plug-in), suitable for developers
Deployment package generation (embedded Maven compilation)
Preparatory work:
jdk-7u51-windows-i586.exe
scala-2.10.3.msi
sbt-0.13.2.msi
spark-1.0.0.tgz
scala-SDK-3.0.3-2.10-win32.win32.x86.zip

1.1. Installing the JDK
Install jdk-7u51-windows-i586.exe with the default settings, assuming installation to directory d:\jdk1.7. After the installation finishes, add d:\jdk1.7\bin to the PATH environment variable.

1.2. Installing Scala
Install scala-2.10.3.msi with the default settings, assuming installation to directory d:\scala2.10.3. After the installation is
Questions and exchanges are welcome: email: sparkexpert@sina.com
There are many articles on the web about installing SparkR, but the process always goes wrong somewhere. The most common problem: launching sbt from sbt/sbt-launch-0.13.6.jar fails with error: invalid or corrupt jarfile sbt/sbt-la
The previous article was a primer on Spark SQL and introduced some basics and APIs, but it seemed a step away from our daily use.

There were two reasons for ending Shark:
1. There are many limitations to integrating it with Spark programs.
2. The Hive optimizer was not designed for Spark; the computational models differ, so making the Hive optimizer optimize a Spark program runs into bottlenecks.

Here is a look at the infrastructure of Spark SQL: Spark 1.1 will support the Spark SQL CLI when it is
1. The concept of a channel

A channel represents a stream of data to a device (disk or tape) and produces a corresponding server session on the target database or an auxiliary database instance.

Multiple channels produce multiple server sessions, which carry out the backup, restore, and recovery operations.

Channels are divided into disk channels (DISK), for backing up to or restoring from disk, and tape channels (SBT), for backing up to tape.
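As a sketch, allocating channels explicitly inside an RMAN RUN block might look like this (channel names and the channel count are illustrative):

```
RUN {
  # Each allocated channel opens its own server session on the target instance
  ALLOCATE CHANNEL d1 DEVICE TYPE DISK;
  ALLOCATE CHANNEL d2 DEVICE TYPE DISK;
  # The two channels work on the backup in parallel
  BACKUP DATABASE;
}
```

A tape channel would be allocated with DEVICE TYPE sbt instead of DISK, assuming a media management layer is configured.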
a) Preparatory work

Installing SBT on Linux:

curl https://bintray.com/sbt/rpm/rpm | sudo tee /etc/yum.repos.d/bintray-sbt-rpm.repo
sudo yum install sbt

Download the Spark-jobserver matching your Spark version: https://github.com/spark-jobserver/spark-jobserver/releases
The version used in this example is 0.6.2: https://github.com/spark-jobserver/spark-job
, depending on the individual situation; if installed on the C drive, change "D" to "C".
Set the PATH variable: locate "Path" under System variables (as shown in the figure) and click Edit. Add the following at the front of the "Variable value" field: %SCALA_HOME%\bin;%SCALA_HOME%\jre\bin;

Note: do not leave out the trailing semicolon (;).
Set the Classpath variable: locate "Classpath" under System variables and click Edit; if it does not exist, click New:

· "Variable name": ClassPath
· "Variable value":
step. Please search for the specific installation steps yourself. Second, install SBT. SBT is the build tool for Scala and works much like Maven. Installation is relatively simple; for the specific steps refer to http://www.scala-sbt.org/0.13/tutorial/zh-cn/Setup.html. Third, after the installation is complete, execute the sbt command in the console and a message s
1) Preparatory work

1) Install JDK 6, 7, or 8; Mac users see http://docs.oracle.com/javase/8/docs/technotes/guides/install/mac_jdk.html
2) Install Scala 2.10.x (note the version); see http://www.cnblogs.com/xd502djj/p/6546514.html
3) Download the latest version of IntelliJ IDEA (this article uses IntelliJ IDEA Community Edition 13.1.1 as an example; the interface layout may differ between versions): http://www.jetbrains.com/idea/download/
4) After extracting the downloaded IntelliJ IDEA, install the Sc