How to install and deploy the Cassandra distributed NoSQL database
Apache Cassandra is an open-source distributed key-value storage system. It was originally developed at Facebook to store very large volumes of data, and it is suited to real-time transaction processing and to serving structured data. Cassandra's data model is a four- or five-dimensional model based on the column family. Its design draws on Amazon's Dynamo and Google's BigTable, and it stores data using Memtables and SSTables. When data is written to Cassandra, the write is first recorded in the commit log (CommitLog) and then applied to the Memtable of the corresponding column family. A Memtable is an in-memory structure that keeps data sorted by key; when certain conditions are met, its contents are flushed to disk in a batch and stored as an SSTable. This article describes how to install and configure Cassandra.
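The write path described above (commit log first, then Memtable, then a batch flush to an SSTable) can be illustrated with a toy shell sketch. It is not how Cassandra is implemented: the function names, the STORE_DIR and MEMTABLE_LIMIT variables, and the one-file-per-structure layout are all assumptions made for illustration, and the "certain conditions" for a flush are reduced to a simple count of writes.

```shell
# Toy write path. Every name here (toy_write, toy_flush, STORE_DIR,
# MEMTABLE_LIMIT) is hypothetical and not part of Cassandra.
STORE_DIR=${STORE_DIR:-./toy-store}     # directory holding the toy files
MEMTABLE_LIMIT=${MEMTABLE_LIMIT:-3}     # flush after this many writes
TAB=$(printf '\t')                      # field separator: key<TAB>value

# Flush: sort the memtable by key and store it as the next "SSTable" file.
toy_flush() {
  n=$(ls "$STORE_DIR" | grep -c '^sstable-' || true)
  sort -s -t "$TAB" -k1,1 "$STORE_DIR/memtable" > "$STORE_DIR/sstable-$n"
  : > "$STORE_DIR/memtable"             # start a fresh, empty memtable
}

# Write: 1) append to the commit log, 2) add to the memtable,
# 3) flush once the memtable has reached the limit.
toy_write() {
  mkdir -p "$STORE_DIR"
  printf '%s\t%s\n' "$1" "$2" >> "$STORE_DIR/commitlog"
  printf '%s\t%s\n' "$1" "$2" >> "$STORE_DIR/memtable"
  if [ "$(($(wc -l < "$STORE_DIR/memtable")))" -ge "$MEMTABLE_LIMIT" ]; then
    toy_flush
  fi
}
```

Unlike this sketch, which only sorts at flush time, a real Memtable keeps its entries sorted by key in memory as they arrive.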
Note: this article assumes that the JDK is already installed.
I. Installing and configuring a Cassandra node
1. Download Cassandra
wget "http://www.apache.org/dyn/closer.cgi?path=/cassandra/2.1.5/apache-cassandra-2.1.5-bin.tar.gz"
2. Decompress the file
tar -zxvf apache-cassandra-2.1.5-bin.tar.gz
mv apache-cassandra-2.1.5 cassandra
3. Cassandra directory description
bin: scripts for operating Cassandra
conf: configuration files
interface: Cassandra's Thrift interface definition file, which can be used to generate interface code for various programming languages
javadoc: API documentation generated from the source code
lib: JAR packages required by the Cassandra runtime
4. Prepare the data storage directory for the Cassandra node
# Edit the configuration file storage-conf.xml
# (storage-conf.xml is the configuration file of Cassandra 0.6 and earlier; later releases use conf/cassandra.yaml instead)
# cd conf
<CommitLogDirectory>/data/db/lib/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
<DataFileDirectory>/data/db/lib/cassandra/data</DataFileDirectory>
</DataFileDirectories>
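To confirm which directories the file now points to, the values can be read back from storage-conf.xml. The following is a minimal sketch, assuming each element sits on a single line as in the excerpt above; conf_value is a hypothetical helper, not a Cassandra tool, and it does not skip elements that are commented out.

```shell
# conf_value FILE TAG
# Print the text between <TAG> and </TAG> with surrounding spaces trimmed.
# Hypothetical helper; assumes the whole element is on one line.
conf_value() {
  sed -n "s:.*<$2>[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*</$2>.*:\1:p" "$1"
}

# Example use (not run here):
#   conf_value conf/storage-conf.xml CommitLogDirectory
#   conf_value conf/storage-conf.xml DataFileDirectory
```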
5. Modify the log configuration file log4j.properties
# Default log path:
# log4j.appender.R.File=/var/log/cassandra/system.log
# Log path after configuration:
log4j.appender.R.File=/data/db/log/cassandra/system.log
6. Create a directory for storing data and logs.
# mkdir -p /data/db/lib/cassandra
# mkdir -p /data/db/log/cassandra
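Cassandra also needs write access to these directories. As a small sketch, the creation and the permission check can be combined; prepare_dirs is a hypothetical helper name, and the check only reflects the user who runs it, so it should be run as the user that will start Cassandra.

```shell
# prepare_dirs DIR...
# Create each directory (with parents) and verify the current user can
# write to it. Hypothetical helper, not part of Cassandra.
prepare_dirs() {
  for dir in "$@"; do
    mkdir -p "$dir" || return 1
    if [ ! -w "$dir" ]; then
      echo "not writable: $dir" >&2
      return 1
    fi
  done
}

# Example use (not run here):
#   prepare_dirs /data/db/lib/cassandra /data/db/log/cassandra
```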
7. After configuration, start Cassandra
# bin/cassandra
INFO 09:29:12,888 Starting up server gossip
INFO 09:29:12,992 Binding thrift service to localhost/127.0.0.1:9160
# When these two lines appear in the output, Cassandra has started successfully.
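The same check can be scripted. This is a sketch that assumes the two messages are also written to the log file configured in step 5; cassandra_started is a hypothetical helper that only searches a file for the two message texts quoted above.

```shell
# cassandra_started LOGFILE
# Succeed only if both startup messages appear in the given log file.
# Hypothetical helper; the message texts are the ones shown in step 7.
cassandra_started() {
  grep -q 'Starting up server gossip' "$1" &&
  grep -q 'Binding thrift service to' "$1"
}

# Example use (not run here):
#   cassandra_started /data/db/log/cassandra/system.log && echo "Cassandra is up"
```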
8. Connect to Cassandra, then write and read data
# bin/cassandra-cli --host localhost --port 9160
# cassandra>
# cassandra> set Keyspace1.Standard2['studenta']['age'] = '18'
# Value inserted.
# cassandra> get Keyspace1.Standard2['studenta']
# => (column=age, value=18, timestamp=1272357045192000)
# Returned 1 results.
9. Stop the Cassandra Service
# ps -ef | grep cassandra
# 16250 is the Cassandra process ID reported by ps; use the ID shown on your system
# kill -9 16250
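Picking the process ID out of the ps listing by hand can be avoided. The sketch below reads `ps -ef` output and prints the second column (the PID) of every line that mentions Cassandra, skipping the grep process itself; cassandra_pids is a hypothetical helper. The usage line sends a plain kill (SIGTERM) rather than kill -9, which gives the JVM a chance to shut down cleanly; that substitution is a suggestion, not part of the original steps.

```shell
# cassandra_pids
# Read `ps -ef` output on stdin and print the PID (second column) of each
# line mentioning Cassandra, ignoring any grep process in the listing.
# Hypothetical helper, not part of Cassandra.
cassandra_pids() {
  awk '/[Cc]assandra/ && !/grep/ { print $2 }'
}

# Example use (not run here):
#   for pid in $(ps -ef | cassandra_pids); do kill "$pid"; done
```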
II. Supplement
Notes on the settings in the Cassandra configuration file storage-conf.xml
# storage-conf.xml
<!-- The cluster name displayed when the cluster is running -->
<ClusterName>Test Cluster</ClusterName>
<!-- Whether the node automatically joins the cluster when it starts. The default value is false. -->
<AutoBootstrap>false</AutoBootstrap>
<!-- Seed nodes of the cluster -->
<Seeds><Seed>127.0.0.1</Seed></Seeds>
<!-- Listening address for communication between nodes -->
<ListenAddress>localhost</ListenAddress>
<!-- Listening address for Thrift-based Cassandra clients. In a cluster, set it to 0.0.0.0 to accept client connections on all interfaces. The default value is localhost. -->
<ThriftAddress>localhost</ThriftAddress>
<!-- Client connection port -->
<ThriftPort>9160</ThriftPort>
<!-- Buffer sizes, in MB, used when Memtable data is written to disk:
FlushDataBufferSizeInMB for the data (32 MB by default) and FlushIndexBufferSizeInMB for the index (8 MB by default). -->
<FlushDataBufferSizeInMB>32</FlushDataBufferSizeInMB>
<FlushIndexBufferSizeInMB>8</FlushIndexBufferSizeInMB>
<!-- Commit log synchronization mode. The default value is periodic, which is used together with CommitLogSyncPeriodInMS. When the mode is batch, configure CommitLogSyncBatchWindowInMS instead. -->
<CommitLogSync>periodic</CommitLogSync>
<!-- By default, the commit log is synchronized every 10 seconds. -->
<CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS>
<!-- <CommitLogSyncBatchWindowInMS>1</CommitLogSyncBatchWindowInMS> -->