Example of integrated development with Spring Boot, Spark, and Cassandra


This article shows how to use Spark as the analysis engine and Cassandra as the data store, with the driver program developed in Spring Boot.

1. Prerequisites

  • Install Spark (this article uses Spark 1.5.1; the installation directory is assumed to be /opt/spark)
  • Install Cassandra (3.0+)

Create keyspace

CREATE KEYSPACE hfcb WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

Create table

USE hfcb;
CREATE TABLE person ( id text PRIMARY KEY, first_name text, last_name text);

Insert test data

insert into person (id,first_name,last_name) values('1','wang','yunfei');
insert into person (id,first_name,last_name) values('2','peng','chao');
insert into person (id,first_name,last_name) values('3','li','jian');
insert into person (id,first_name,last_name) values('4','zhang','jie');
insert into person (id,first_name,last_name) values('5','liang','wei');

2. spark-cassandra-connector Installation

To let Spark 1.5.1 use Cassandra as its data store, make the following jar packages available to Spark (for example, place them in the /opt/spark/managed-lib/ directory; any directory will do):

  • cassandra-clientutil-3.0.2.jar
  • cassandra-driver-core-3.1.4.jar
  • guava-16.0.1.jar
  • cassandra-thrift-3.0.2.jar
  • joda-convert-1.2.jar
  • joda-time-2.9.9.jar
  • libthrift-0.9.1.jar
  • spark-cassandra-connector_2.10-1.5.1.jar
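Before starting Spark, it can be useful to confirm that every jar in the list above is actually in the chosen directory. The following is a minimal sketch using only the JDK. The class name ManagedLibCheck and its methods are hypothetical and not part of the original project, and the default directory is the example path used in this article.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original project.
final class ManagedLibCheck {

    // Jar names taken from the list above.
    static final List<String> REQUIRED = Arrays.asList(
            "cassandra-clientutil-3.0.2.jar",
            "cassandra-driver-core-3.1.4.jar",
            "guava-16.0.1.jar",
            "cassandra-thrift-3.0.2.jar",
            "joda-convert-1.2.jar",
            "joda-time-2.9.9.jar",
            "libthrift-0.9.1.jar",
            "spark-cassandra-connector_2.10-1.5.1.jar");

    // Returns the names from 'required' that are not regular files in 'dir'.
    static List<String> missingJars(Path dir, List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!Files.isRegularFile(dir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // The directory is an assumption; pass another path as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/opt/spark/managed-lib");
        List<String> missing = missingJars(dir, REQUIRED);
        System.out.println(missing.isEmpty()
                ? "All jars present in " + dir
                : "Missing from " + dir + ": " + missing);
    }
}
```

The check only looks at file names, so it does not verify that the jar versions are compatible with each other.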

Under the /opt/spark/conf directory, create a spark-env.sh file and enter the following content:


3. Spring Boot Application Development

Add the spark-cassandra-connector and Spark dependencies

<dependency>
  <groupId>com.datastax.spark</groupId>
  <artifactId>spark-cassandra-connector_2.10</artifactId>
  <version>1.5.1</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.5.1</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.10</artifactId>
  <version>1.5.1</version>
</dependency>

Configure the Spark master URL and the Cassandra connection settings in application.yml.

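spark.master: spark://master:7077
cassandra.host: hfcb
# Read by the configuration class via ${cassandra.keyspace}; hfcb is the keyspace created earlier
cassandra.keyspace: hfcb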

Note that master in spark://master:7077 is a host name rather than an IP address. You can edit the local hosts file to map master to the IP address of the Spark master.
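Because the master URL relies on a host name, it can help to confirm that the name resolves on the machine that runs the driver before starting the application. The following is a minimal sketch using only the JDK. The class name MasterUrlCheck and its methods are hypothetical and not part of the original project.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

// Hypothetical helper, not part of the original project.
final class MasterUrlCheck {

    // Extracts the host part of a URL such as spark://master:7077.
    static String hostOf(String masterUrl) {
        String host = URI.create(masterUrl).getHost();
        if (host == null) {
            throw new IllegalArgumentException("No host in master URL: " + masterUrl);
        }
        return host;
    }

    // Extracts the port, or -1 if the URL does not specify one.
    static int portOf(String masterUrl) {
        return URI.create(masterUrl).getPort();
    }

    // Returns true if the host name resolves, for example through the hosts file.
    static boolean resolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the master URL used in this article.
        String masterUrl = args.length > 0 ? args[0] : "spark://master:7077";
        String host = hostOf(masterUrl);
        System.out.println(host + ":" + portOf(masterUrl)
                + (resolves(host) ? " resolves" : " does not resolve; check the hosts file"));
    }
}
```

A successful lookup only shows that the name maps to an address; it does not prove that a Spark master is listening on that port.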

Configure SparkContext and CassandraSQLContext

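import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.cassandra.CassandraSQLContext;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SparkCassandraConfig {

    @Value("${spark.master}")
    String sparkMasterUrl;

    @Value("${cassandra.host}")
    String cassandraHost;

    @Value("${cassandra.keyspace}")
    String cassandraKeyspace;

    // Spark context that connects to the configured master and Cassandra host.
    @Bean
    public JavaSparkContext javaSparkContext() {
        SparkConf conf = new SparkConf(true)
                .set("spark.cassandra.connection.host", cassandraHost)
                // .set("spark.cassandra.auth.username", "cassandra")
                // .set("spark.cassandra.auth.password", "cassandra")
                .set("spark.submit.deployMode", "client");
        JavaSparkContext context = new JavaSparkContext(sparkMasterUrl, "SparkDemo", conf);
        return context;
    }

    // SQL context bound to the configured keyspace.
    @Bean
    public CassandraSQLContext sqlContext() {
        CassandraSQLContext cassandraSQLContext = new CassandraSQLContext(javaSparkContext().sc());
        cassandraSQLContext.setKeyspace(cassandraKeyspace);
        return cassandraSQLContext;
    }
}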

Simple call

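import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.cassandra.CassandraSQLContext;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Repository;

@Repository
public class PersonRepository {

    @Autowired
    CassandraSQLContext cassandraSQLContext;

    // Counts the rows of the person table in the configured keyspace.
    public Long countPerson() {
        DataFrame people = cassandraSQLContext.sql("select * from person order by id");
        return people.count();
    }
}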

You can run the application like any other Spring Boot program. With the five test rows inserted earlier, countPerson() should return 5.

Source Code address: https://github.com/wiselyman/spring-spark-cassandra.git


