Batch importing data into Cassandra with pycassa

Source: Internet
Author: User
Tags: cassandra

This week I took over the maintenance of a Cassandra system. One of the tasks was to import an application's data into the Cassandra cluster we maintain and to provide HTTP access to that data for the application. This is my first hands-on contact with a key-value (KV) system. I had read about KV stores and NoSQL before, but I had no practical experience with them. After two days of learning and handover, I finally figured out how the system is used in our production environment, so here are some brief notes. This article covers the following:

An introduction to Cassandra

Cassandra-related CLI tools

Cassandra's Python API, with an example of batch data import


1. Cassandra Introduction

Cassandra is not a single database but a distributed network service made up of a group of database nodes. A write to Cassandra is replicated to other nodes, and a read is routed to one of the nodes that holds the data. Scaling a Cassandra cluster is relatively simple: just add nodes to the cluster.
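The replicate-on-write and route-on-read behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how Cassandra is implemented: the `ToyCluster` class, the node names, the use of MD5 as the ring hash, and the "next nodes clockwise" replica placement are all hypothetical choices made for the illustration.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a ring position (MD5 is an arbitrary choice for this sketch)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ToyCluster:
    """Toy cluster: every node is just an in-memory dict placed on a hash ring."""

    def __init__(self, node_names, replication_factor=2):
        self.replication_factor = replication_factor
        self.stores = {name: {} for name in node_names}
        self._rebuild_ring()

    def _rebuild_ring(self):
        # Sort nodes by their token so the ring can be searched with bisect.
        self.ring = sorted((token(name), name) for name in self.stores)
        self.tokens = [t for t, _ in self.ring]

    def add_node(self, name):
        # Scaling out: the new node takes a position on the ring.
        # A real cluster would also move existing data to it; this sketch does not.
        self.stores[name] = {}
        self._rebuild_ring()

    def replicas_for(self, key):
        # First node clockwise from the key's token, then its successors.
        start = bisect.bisect_right(self.tokens, token(key)) % len(self.ring)
        count = min(self.replication_factor, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def write(self, key, value):
        # The write is copied to every replica responsible for the key.
        for name in self.replicas_for(key):
            self.stores[name][key] = value

    def read(self, key):
        # The read is routed to one replica (here simply the first one).
        name = self.replicas_for(key)[0]
        return self.stores[name].get(key)


cluster = ToyCluster(["node-a", "node-b", "node-c"], replication_factor=2)
cluster.write("user:42", "alice")
print(cluster.replicas_for("user:42"))  # the two nodes holding the key
print(cluster.read("user:42"))
```

The client never needs to know which node holds a key; the ring decides. Because `add_node` does not migrate data, keys written before a node is added may no longer be readable in this toy model, which is exactly the part a real cluster handles for you.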

Cassandra is a hybrid non-relational database, similar to Google's Bigtable. Its feature set is richer than that of Dynomite (a distributed key-value storage system), but not as rich as that of the document store MongoDB. (MongoDB is an open-source product that sits between relational and non-relational databases; among non-relational databases it has the richest feature set and is the most similar to a relational database. Its data structures are very loose, stored in a JSON-like format called BSON, so it can hold fairly complex data types.) Cassandra was initially developed at Facebook and later became an open-source project. It is well suited to social networking and cloud computing workloads. It builds on the fully distributed design of Amazon's proprietary Dynamo and combines it with the column-family data model of Google's Bigtable. Storage is peer-to-peer and decentralized. In many respects Cassandra can be called Dynamo 2.0.

Compared with other databases, it has several outstanding features:

  1. Flexible schema: with Cassandra, as with a document store, you do not have to decide on the fields of a record in advance. You can add or remove fields at will while the system is running. This is a considerable gain in efficiency, especially in large deployments.
  2. True scalability: Cassandra scales purely horizontally. To add more capacity to the cluster, you bring up another machine. You do not have to restart any process, change application queries, or manually migrate any data.
  3. Multi-data-center awareness: you can adjust the node layout so that if one data center is lost, for example in a fire, a backup data center still holds at least one full copy of every record.
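The flexible schema point is easiest to see with a small model. The sketch below does not talk to Cassandra; it stands in for a column family with a plain dict that maps a row key to a dict of columns, and the `users`, `insert`, and `remove` names are hypothetical. pycassa's `ColumnFamily.insert` likewise takes a row key and a dict of columns, but the functions here are only an in-memory imitation of that shape.

```python
from collections import defaultdict

# Hypothetical stand-in for a column family: row key -> {column name: value}.
users = defaultdict(dict)


def insert(row_key, columns):
    """Add or overwrite columns on a row; columns not mentioned are untouched."""
    users[row_key].update(columns)


def remove(row_key, column_names):
    """Drop the named columns from a row, ignoring ones that do not exist."""
    for name in column_names:
        users[row_key].pop(name, None)


# Rows in the same column family do not need the same set of fields.
insert("alice", {"email": "alice@example.com"})
insert("bob", {"email": "bob@example.com", "city": "Hangzhou"})

# A new field can be introduced later without any schema change.
insert("alice", {"last_login": "2012-06-01"})

# A field can also be dropped from one row only.
remove("bob", ["city"])

print(dict(users))
```

No table definition is altered at any point: each row simply carries whatever columns were written to it.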

Some other features that make Cassandra more competitive:

  1. Range queries: if plain key-value lookups are not enough, you can query a range of keys.
  2. List data structures: in the hybrid model, a super column can be added as a fifth dimension. This is very convenient for things like per-user indexes.
  3. Distributed writes: you can read or write any data on any node in the cluster at any time, and there is no single point of failure.
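Range queries and super columns can both be pictured with ordinary Python containers. The following is a minimal sketch, assuming rows are kept in key order; the `SortedRows` class and its `get_range` method are hypothetical names for this illustration and are not part of any Cassandra client. In a real cluster, whether a key range comes back in key order depends on the partitioner in use.

```python
import bisect


class SortedRows:
    """Toy column family that keeps row keys sorted so ranges can be scanned."""

    def __init__(self):
        self._keys = []   # sorted list of row keys
        self._rows = {}   # row key -> {column name: value}

    def insert(self, key, columns):
        if key not in self._rows:
            bisect.insort(self._keys, key)
            self._rows[key] = {}
        self._rows[key].update(columns)

    def get_range(self, start, finish):
        """Yield (key, columns) for start <= key <= finish, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, finish)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


rows = SortedRows()
for n in range(1, 6):
    rows.insert("user:%03d" % n, {"name": "user %d" % n})

# Range query: every row whose key lies between the two bounds, inclusive.
for key, columns in rows.get_range("user:002", "user:004"):
    print(key, columns)

# Super column: one more level of nesting inside a row, so a column can hold
# a list of columns. Here it serves as a per-user index of photos by date.
photo_index = {
    "alice": {                       # row key
        "photos": {                  # super column
            "2012-06-01": "photo-17",
            "2012-06-03": "photo-21",
        },
    },
}
print(sorted(photo_index["alice"]["photos"]))
```

The nested dict shows why super columns suit per-user indexes: all index entries for one user live under a single row key and can be read together.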

To be continued. I have been too busy recently.
