Relational database management systems (RDBMS) are the most widely used systems for storing and querying data, but they do not scale well to very large amounts of data.
In recent years, NoSQL has gained wide acceptance as demand for alternatives to relational databases has grown. The main motivation behind NoSQL is scalability. NoSQL databases offer a way to store and use large amounts of data with less overhead, less work, better performance, and less downtime.
Apache Cassandra is a column-oriented NoSQL database. It was developed at Facebook to power its Inbox search and later became an open-source Apache project. Twitter, Digg, Reddit, and many other organizations already use it.
Cassandra itself ships with a very basic interactive command-line interface (CLI). Developers can use the CLI to connect to remote nodes in the cluster, create or update schemas, and set and get records.
The CLI is a useful tool for Cassandra administrators. Even though it offers only basic commands, it is a good example of how to implement a Cassandra client. To develop a custom Cassandra client, or even an extended CLI tool, you need to understand how the CLI works internally.
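As a rough illustration of the set-and-get workflow described above, the following toy interpreter dispatches two commands against an in-memory map. It is only a sketch under stated assumptions: the class name ToyCli and its execute method are hypothetical, the map stands in for a cluster, the command syntax is a simplified imitation of the CLI's set and get statements, and the reply strings are made up. It is not how the real CLI is implemented.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy stand-in for a command-line client (hypothetical, NOT the Cassandra CLI).
// An in-memory map replaces the cluster so the example runs on its own.
final class ToyCli {
    // Matches e.g.: set Users['jsmith']['first'] = 'John';
    private static final Pattern SET = Pattern.compile(
        "set\\s+(\\w+)\\['([^']*)'\\]\\['([^']*)'\\]\\s*=\\s*'([^']*)'\\s*;?");
    // Matches e.g.: get Users['jsmith'];
    private static final Pattern GET = Pattern.compile(
        "get\\s+(\\w+)\\['([^']*)'\\]\\s*;?");

    // column family -> row key -> column name -> value
    private final Map<String, Map<String, Map<String, String>>> store = new TreeMap<>();

    // Parses one command line and returns a reply string (replies are made up).
    String execute(String line) {
        String cmd = line.trim();
        Matcher m = SET.matcher(cmd);
        if (m.matches()) {
            store.computeIfAbsent(m.group(1), k -> new TreeMap<>())
                 .computeIfAbsent(m.group(2), k -> new TreeMap<>())
                 .put(m.group(3), m.group(4));
            return "Value inserted.";
        }
        m = GET.matcher(cmd);
        if (m.matches()) {
            Map<String, String> row = store
                .getOrDefault(m.group(1), Collections.emptyMap())
                .getOrDefault(m.group(2), Collections.emptyMap());
            return row.isEmpty() ? "Returned 0 results." : row.toString();
        }
        return "Unrecognized command: " + cmd;
    }

    public static void main(String[] args) {
        ToyCli cli = new ToyCli();
        System.out.println(cli.execute("set Users['jsmith']['first'] = 'John';"));
        System.out.println(cli.execute("set Users['jsmith']['last'] = 'Smith';"));
        System.out.println(cli.execute("get Users['jsmith'];"));
    }
}
```

A real client would replace the map with network calls to a node and would need a proper grammar rather than two regular expressions; the point here is only the parse-then-dispatch shape.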
This article uses the JArchitect tool and the CQLinq language to analyze the CLI's code base and explore its architecture. JArchitect is used to analyze code structure and to specify design rules for better code quality. With JArchitect, software quality can be measured with code metrics, visualized with graphs and treemaps, and enforced with standard and custom rules.
The following is an analysis of the dependency diagram:
Cassandra uses a number of well-known jars, such as ANTLR, log4j, slf4j, and commons-lang, as well as some that may be less familiar, such as the following:
libthrift: The Java library for Apache Thrift, a framework that spans a variety of programming languages and use cases, with the goal of making cross-language communication and data serialization reliable, fast, and as efficient and seamless as possible.
SnakeYAML: A YAML library for Java. YAML is a data serialization format designed for human readability and interaction with scripting languages. The Cassandra configuration file is in this format.
Jackson: A high-performance JSON processor.
Snappy: A fast compression/decompression library written in C++, originally developed by Google; snappy-java is its Java version.
high-scale-lib: A collection of concurrent and highly scalable utilities intended as drop-in replacements for collection classes in java.util.* or java.util.concurrent.*, with better performance when many CPUs use a collection concurrently.
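To make the "drop-in replacement" idea in the last entry concrete, here is a minimal sketch that uses only the JDK. The class DropInCounter and its method are hypothetical. The code depends solely on the ConcurrentMap interface, so, assuming a high-scale-lib map class implements that interface (as its NonBlockingHashMap is understood to), such a map could be passed in place of ConcurrentHashMap without changing the counting logic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example class; only JDK types are used here.
final class DropInCounter {
    // Each worker thread counts every word once into the shared map.
    // The method sees only the ConcurrentMap interface, so the concrete
    // implementation can be swapped by the caller.
    static Map<String, Integer> countConcurrently(
            ConcurrentMap<String, Integer> counts, String[] words, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (String w : words) {
                    counts.merge(w, 1, Integer::sum); // atomic per-key update
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] words = {"a", "b", "a"};
        // Swap in another ConcurrentMap implementation here (assumption:
        // a high-scale-lib map on the classpath) to compare behavior.
        Map<String, Integer> result =
            countConcurrently(new ConcurrentHashMap<>(), words, 4);
        System.out.println("a=" + result.get("a") + ", b=" + result.get("b"));
    }
}
```

Coding against the interface rather than the concrete class is what makes the swap a one-line change; whether it pays off depends on core count and contention, which this sketch does not measure.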
The dependency matrix in the following figure describes the dependency weights between these jar files in more detail.