OpenStack Adds Support for Cassandra, an Apache Top-Level Project

Apache Cassandra is a high-performance, scalable, distributed NoSQL database with a flexible yet simple partitioned row store data model. It runs on commodity servers and handles massive data storage across data centers without a single point of failure. It was originally developed at Facebook by Avinash Lakshman (one of the developers of Amazon Dynamo) and Prashant Malik to address the inbox search problem, and it was released as open source in July 2008. Thanks to strong support from IBM, Twitter, and Rackspace, Cassandra has developed at a remarkable pace since then, and in February 2010 it became an Apache top-level project.

Cassandra abandons the widely used master-slave arrangement in favor of peer-to-peer clustering, which removes the single point of failure. Because there is no master server, there is also no master that can be overwhelmed by a large number of requests and thereby leave its slave servers useless. Any number of commodity servers can be joined into a Cassandra cluster. Although this architecture is more complex behind the scenes, it is easy for users to operate. Because there is no need to differentiate between master and slave nodes, you can add any number of machines to any cluster in any data center. Every server accepts requests from any client, and all servers are equal.

What is Cassandra good at?

  • Fast read and write performance
  • The ability to add more machines
  • Reliable replication across data centers

...... ACID transaction processing (atomicity, consistency, isolation, and durability) is not required at the database layer.

Cassandra is good at online transactions, where a request must be fully executed within a short time or users will notice a delay. Such requests need to complete in a few milliseconds rather than hundreds or thousands of milliseconds. Thanks to Cassandra's multiple cache levels, data can be served very quickly. Thanks to its log-structured storage design, each write is fast, and each write is committed to a log. When downtime or data loss is unacceptable, Cassandra is an excellent choice.
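The log-structured write path described above can be illustrated with a toy key-value store. This is only a minimal sketch of the general idea (append each write to a log, then update an in-memory table), not Cassandra's actual implementation. The class name TinyLogStore, the JSON record format, and the file name commit.log are all assumptions made for the example.

```python
import json
import os
import tempfile


class TinyLogStore:
    """Toy store: every write is appended to a log, then applied in memory."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.memtable = {}
        self._replay()

    def _replay(self):
        # After a restart, rebuild the in-memory table from the log.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.memtable[record["key"]] = record["value"]

    def put(self, key, value):
        # Sequential append first, so the write survives a crash.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # Only then update the in-memory view that serves reads.
        self.memtable[key] = value

    def get(self, key):
        return self.memtable.get(key)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "commit.log")  # hypothetical file name
        store = TinyLogStore(path)
        store.put("user:1", "alice")
        store.put("user:1", "alicia")
        # Simulate a crash: a fresh instance recovers its state from the log.
        recovered = TinyLogStore(path)
        return recovered.get("user:1")
```

Writes here never seek or rewrite existing data, which is why an append-only log keeps each write cheap while still allowing recovery after a failure.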

Cassandra is also very good at analytics, and the current version supports running MapReduce over the data it stores. MapReduce is a programming model popularized by Google that processes queries over large datasets in parallel across many servers. It does not work in real time, but it can comb through large datasets to find the information you need. Because Cassandra serves both online and analytics workloads, you can meet most of your data needs with a single technology, which helps development, QA, and operational efficiency.
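As a rough illustration of the MapReduce model mentioned above, the following sketch counts words across several partitions of rows. It runs in a single process with a thread pool standing in for separate servers, and it does not use Cassandra or Hadoop. The function names and the sample data are assumptions chosen for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(partition):
    """Emit a (word, 1) pair for every word in one partition of rows."""
    return [(word.lower(), 1) for row in partition for word in row.split()]


def reduce_phase(pairs):
    """Sum the counts emitted for each distinct word."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def word_count(partitions):
    # Each partition is mapped independently, as it would be on its own server.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_phase, partitions))
    # "Shuffle": gather all emitted pairs before reducing them.
    shuffled = [pair for part in mapped for pair in part]
    return reduce_phase(shuffled)
```

For example, word_count([["error timeout", "ok"], ["error"]]) counts "error" twice and "timeout" and "ok" once each. The map step needs no coordination between partitions, which is what lets the work spread across machines.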

Cassandra and OpenStack

It should now be clear that Cassandra and OpenStack are, at least conceptually, well matched: OpenStack abstracts the server infrastructure and the data centers that Cassandra needs, which simplifies every phase of development, deployment, and operation.

Until recently, however, managing Cassandra on OpenStack was still difficult. Orchestration templates can provision database instances, but they make it largely impractical for end users to manage ordinary security policies, such as blocking access to a database from a wide area network. Trove, the OpenStack DBaaS (database as a service) solution, has since arrived. It provides an API that lets users interact with their databases through in-VM agents as well as through defined management interfaces.

Cassandra and OpenStack DBaaS

OpenStack DBaaS now supports the Apache Cassandra NoSQL database. The first version will contain:

  • Provisioning of Cassandra as a standalone instance
  • Support for maintenance (start, stop, reboot, configuration) and adjustment events

The improved OpenStack Juno version will contain:

  • Configuration management
  • Backup (nodetool snapshot plus a custom script)
  • Restore (custom script)
  • Incremental backup (Cassandra 2.x.x or above)
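The "nodetool snapshot plus a custom script" backup listed above might look something like the following. This is a hypothetical sketch, not the script that Trove ships. It assumes nodetool is on the PATH of the database host, and the tag format and helper names are invented for the example. Only the nodetool snapshot subcommand and its -t tag option are taken from Cassandra itself.

```python
import subprocess
from datetime import datetime, timezone


def build_snapshot_command(keyspace, tag=None, now=None):
    """Return the nodetool argument list for snapshotting one keyspace."""
    now = now or datetime.now(timezone.utc)
    # Hypothetical tag format: "backup-" plus a UTC timestamp.
    tag = tag or "backup-" + now.strftime("%Y%m%dT%H%M%SZ")
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def take_snapshot(keyspace, tag=None, runner=subprocess.run):
    """Run the snapshot command; `runner` can be swapped out for testing."""
    command = build_snapshot_command(keyspace, tag)
    # check=True raises CalledProcessError if nodetool exits with an error.
    runner(command, check=True)
    return command
```

A custom script would typically follow a call such as take_snapshot("my_keyspace") by copying the resulting snapshot files to external storage; that step is left out here because it depends on the deployment.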

Conclusion

Cassandra is a highly available, Internet-scale NoSQL database that differs greatly from traditional relational databases. Those differences account for both its strengths and its weaknesses, and using NoSQL does not preclude using an RDBMS. It is common to adopt a hybrid architecture that uses the appropriate database for each part of a solution, depending on the situation.

Developers using NoSQL for the first time may encounter many new concepts, such as big data and eventual consistency. When moving from the relational model and strong consistency to NoSQL, the biggest shift may be building applications for eventual consistency. Data modeling is another area that developers need to understand.
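To make eventual consistency more concrete, here is a small sketch of two replicas that temporarily disagree and then converge. It uses last-write-wins reconciliation by timestamp, which is the general approach Cassandra takes, but the Replica class and the read_with_repair helper are simplified assumptions for illustration and are not Cassandra APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # writer-supplied; a larger number means a newer write


class Replica:
    def __init__(self):
        self.data = {}

    def write(self, key, value, timestamp):
        # Last write wins: keep the incoming value only if it is newer.
        current = self.data.get(key)
        if current is None or timestamp > current.timestamp:
            self.data[key] = Versioned(value, timestamp)

    def read(self, key):
        return self.data.get(key)


def read_with_repair(replicas, key):
    """Read from all replicas, return the newest value, and fix stale copies."""
    known = [v for v in (r.read(key) for r in replicas) if v is not None]
    if not known:
        return None
    newest = max(known, key=lambda v: v.timestamp)
    for replica in replicas:
        replica.write(key, newest.value, newest.timestamp)
    return newest.value
```

If a write with timestamp 2 reaches only the first of two replicas, a direct read from the second still returns the older value until read_with_repair (or some other repair step) runs. Applications built for eventual consistency have to tolerate that window.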

Cassandra is used in a wide range of applications, especially for:

  • Very large data volumes
  • Very large numbers of user transactions
  • High reliability requirements for data storage
  • A dynamic data model, where the data may be relatively unstructured or its structure may change over time
  • Distribution across data centers

Now, the Apache Cassandra NoSQL database service is part of the OpenStack Database cloud service.
