What Changes Will Combining Cassandra and Spark Bring to Big Data Analytics?

Source: Internet
Author: User
Tags: cassandra, datastax, databricks

At the 2014 Spark Summit in San Francisco, database platform vendor DataStax announced that, in collaboration with Spark vendor Databricks, its flagship product DataStax Enterprise (DSE) 4.5 combines the Cassandra NoSQL database with the open source Apache Spark engine to give users real-time analytics based on in-memory processing.

Databricks is a company founded by the creators of Apache Spark. Speaking of the partnership, DataStax Vice President John Glendenning said: "The integration of Spark and Cassandra is the first collaboration of its kind in the database industry."

Cassandra is a distributed, highly scalable database that allows users to create online applications that process large amounts of data in real time.

Apache Spark is a processing engine for Hadoop clusters that can run workloads up to 100 times faster than Hadoop MapReduce in memory and 10 times faster on disk. Spark also provides features such as SQL, streaming data processing, machine learning, and graph computing.

The combination of Cassandra and Spark makes it easier to implement end-to-end analytics workflows. In addition, the analytical performance of transactional databases can be greatly improved, and enterprises can respond to customer needs more quickly.

The combination of Cassandra and Spark is good news for companies that need to deliver real-time recommendations and personalized online experiences to their customers.

Cassandra/Spark in practice: a video analytics company

The Cassandra plus Spark architecture already has precedents, and Ooyala, a video analytics provider, is one of them. Ooyala handles 2 billion video events per day and processes approximately 28 TB of data on approximately 220 nodes. Even at that scale, Harry Robertson, the head of Ooyala's technical team, can say with confidence: "We don't just tell customers that their video was played 100 times over the past few days; we also provide more detailed information, such as 80 plays from Beijing and 20 from yahoo.com." It is the Cassandra cluster that sustains all of this.

However, being able to handle big data is not enough on its own: Ooyala needs to turn a "mountain" of raw events into small, actionable summaries. The company first considered Hadoop, but Hadoop scales well while falling short on real-time processing. It also considered real-time streaming frameworks such as Storm, which are strong at fixed processing pipelines but weak at flexible, ad hoc queries. In the end, Ooyala chose Spark, an in-memory distributed computing framework.
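The roll-up from raw events to a small summary can be sketched in plain Python. This is an illustration only, not Ooyala's code: the event fields (`video_id`, `source`) and the function name `plays_by_source` are hypothetical, and a production system would presumably run an equivalent aggregation in Spark over data stored in Cassandra.

```python
from collections import Counter, defaultdict


def plays_by_source(events):
    """Roll raw play events up into per-video counts broken down by source.

    The event shape (dicts with "video_id" and "source") is an assumption
    made for this sketch.
    """
    summary = defaultdict(Counter)
    for event in events:
        summary[event["video_id"]][event["source"]] += 1
    return summary


# Hypothetical raw events mirroring the example in the text:
# 80 plays from Beijing and 20 from yahoo.com for one video.
events = (
    [{"video_id": "v1", "source": "Beijing"}] * 80
    + [{"video_id": "v1", "source": "yahoo.com"}] * 20
)

summary = plays_by_source(events)
total_plays = sum(summary["v1"].values())
```

Here `total_plays` is the coarse figure (how many times the video was played), while `summary["v1"]` holds the per-source breakdown that makes the figure actionable.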

Ooyala now runs on the Spark/Cassandra architecture.
