Example of integrating Spring Boot with Spark and Cassandra
This article demonstrates how to use Spark as the analysis engine, Cassandra as the data store, and Spring Boot to develop the driver program.
1. Prerequisites
- Install Spark (this article uses Spark-1.5.1; assume the installation directory is /opt/spark)
- Install Cassandra (3.0+)
Create keyspace
CREATE KEYSPACE hfcb WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };
Create table
CREATE TABLE person ( id text PRIMARY KEY, first_name text, last_name text);
Insert test data
insert into person (id,first_name,last_name) values('1','wang','yunfei');
insert into person (id,first_name,last_name) values('2','peng','chao');
insert into person (id,first_name,last_name) values('3','li','jian');
insert into person (id,first_name,last_name) values('4','zhang','jie');
insert into person (id,first_name,last_name) values('5','liang','wei');
2. spark-cassandra-connector Installation
To enable Spark-1.5.1 to use Cassandra as its data store, add the following jar packages to Spark's classpath (for example, place them in the /opt/spark/managed-lib/ directory; the directory itself can be arbitrary):
cassandra-clientutil-3.0.2.jar
cassandra-driver-core-3.1.4.jar
guava-16.0.1.jar
cassandra-thrift-3.0.2.jar
joda-convert-1.2.jar
joda-time-2.9.9.jar
libthrift-0.9.1.jar
spark-cassandra-connector_2.10-1.5.1.jar
Under the /opt/spark/conf directory, create a spark-env.sh file with the following content:
SPARK_CLASSPATH=/opt/spark/managed-lib/*
3. Spring Boot Application Development
Add the spark-cassandra-connector and Spark dependencies:
<dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector_2.10</artifactId>
    <version>1.5.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.5.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.5.1</version>
</dependency>
Configure the Spark and Cassandra endpoints in application.yml:
spark.master: spark://master:7077
cassandra.host: 192.168.1.140
cassandra.keyspace: hfcb
Note that spark://master:7077 uses a hostname rather than an IP address; you can edit the local hosts file to map the name master to the Spark master's IP address.
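For example, on Linux the mapping can be added as a line in /etc/hosts (a config fragment; the IP address below is a hypothetical example, substitute your Spark master's actual address):

```
192.168.1.130 master
```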
Configure SparkContext and CassandraSQLContext:
@Configuration
public class SparkCassandraConfig {
    @Value("${spark.master}")
    String sparkMasterUrl;
    @Value("${cassandra.host}")
    String cassandraHost;
    @Value("${cassandra.keyspace}")
    String cassandraKeyspace;

    @Bean
    public JavaSparkContext javaSparkContext() {
        SparkConf conf = new SparkConf(true)
                .set("spark.cassandra.connection.host", cassandraHost)
//              .set("spark.cassandra.auth.username", "cassandra")
//              .set("spark.cassandra.auth.password", "cassandra")
                .set("spark.submit.deployMode", "client");
        return new JavaSparkContext(sparkMasterUrl, "SparkDemo", conf);
    }

    @Bean
    public CassandraSQLContext sqlContext() {
        CassandraSQLContext cassandraSQLContext = new CassandraSQLContext(javaSparkContext().sc());
        cassandraSQLContext.setKeyspace(cassandraKeyspace);
        return cassandraSQLContext;
    }
}
Simple call
@Repository
public class PersonRepository {
    @Autowired
    CassandraSQLContext cassandraSQLContext;

    public Long countPerson() {
        DataFrame people = cassandraSQLContext.sql("select * from person order by id");
        return people.count();
    }
}
You can then run it like any ordinary Spring Boot application.
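The entry point is a standard Spring Boot main class that wires in the repository. A minimal sketch might look like the following (the class name and the use of CommandLineRunner are assumptions for illustration, not taken from the source repository):

```java
// Hypothetical entry point: boots the Spring context, then runs the Spark job.
@SpringBootApplication
public class Application implements CommandLineRunner {

    @Autowired
    PersonRepository personRepository;

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }

    @Override
    public void run(String... args) {
        // Triggers the Spark SQL query defined in PersonRepository.
        System.out.println("Person count: " + personRepository.countPerson());
    }
}
```

With the five test rows inserted earlier, the printed count should be 5, provided the Spark master and Cassandra are reachable from the machine running the application.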
Source Code address: https://github.com/wiselyman/spring-spark-cassandra.git
Summary
The above is a complete example of integrating Spring Boot with Spark and Cassandra. I hope it helps you build your own Spark-backed applications.