Two or three years ago, choosing a database was easy. Well-funded companies chose Oracle, companies built around Microsoft products usually chose SQL Server, and companies with no budget chose MySQL. Now the situation is very different.
In the last two or three years, many companies have launched their own open-source projects for storing information. In many cases, these projects discard the traditional guidelines of relational databases. Many people call them NoSQL, short for "not only SQL." Some NoSQL databases are simple and others are extremely complex, but they share a goal: to take over from relational databases and deliver higher performance.
NoSQL's supporters have succeeded in building faster, more scalable applications by abandoning the traditional architecture. Some conservative database administrators are dismissive, however. They believe that many of the problems SQL solved long ago will become stumbling blocks for NoSQL. NoSQL supporters are unconcerned, because their projects have different needs and they are aiming at new targets.
What is different about the NoSQL projects? Each of these new databases is built in its own way. The old SQL databases, by contrast, bundle a large set of features with a standard language. Most of the new packages pair keys with values, but each is tuned for different use cases. The main variables are not the data format, but how often the data is replicated, stored, and sharded.
For example, are you storing data that is retrieved often, such as a person's e-mail address? Is some of the data kept only for emergencies, such as log files? Do you expect many users with a little data each, or only a few users with a lot of data each? If you lose a few rows of user data, will that threaten your users' livelihoods, and will they sue you?
In the past, every architect fiddled with MySQL settings. Now the architect can choose a new project. If your project's needs match what one of the new databases does well, there is a huge advantage to be found in this confusion. If the match is close, performance can improve dramatically. Developers, however, are not going to build a "dreadnought" that solves every problem.
In the past, developers built good cross-database libraries that smoothed over the differences between databases and made it easier to switch. Many Java developers, for example, write their code against the JDBC library. As a result, SQL databases interoperate well. None of the old libraries work with the new databases. Although many of the projects take similar approaches, porting code from one library to another requires a lot of rewriting.
To make things worse, many ancillary tools are missing. There are many report generation tools, but none of them work with the new databases, at least not without considerable effort. There may be hundreds or thousands of packages that work with SQL, but few that help with NoSQL. The signs are that this kind of interoperability will take a long time to reach NoSQL. On top of that, the query languages differ greatly.
Cassandra
Facebook needed a faster, cheaper way to handle billions of status updates. So it started a project and eventually moved it to Apache, where it became Cassandra. At Apache, the project can draw on help from many communities. Cassandra is no longer just for Facebook, and many of the programmers who work on it come from other companies. Datastax.com now offers commercial support for Cassandra.
Cassandra is an excellent tool for tracking large amounts of data such as Facebook status updates. It helps you build a network of computers in which every machine holds the same data, so any machine can stand in for any other. As data propagates across the peer-to-peer nodes, strict consistency is given up. The key phrase is "eventually consistent", not "consistent". If your status updates have ever disappeared on Facebook and then reappeared, you know what this means.
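To make the idea of replicas that agree eventually rather than immediately more concrete, here is a minimal sketch in Python. It is a toy model, not Cassandra's actual protocol: the `Replica` class, the `sync` function, and the last-write-wins rule based on a plain integer timestamp are all assumptions made for illustration.

```python
class Replica:
    """One machine in the toy cluster (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Keep a write only if it is newer than what we hold (last write wins).
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(replicas):
    """Push every replica's entries to all others so the newest value wins."""
    for source in replicas:
        for key, (timestamp, value) in list(source.data.items()):
            for target in replicas:
                target.write(key, value, timestamp)


a, b = Replica("a"), Replica("b")
a.write("status:alice", "At the beach", timestamp=1)
stale = b.read("status:alice")   # b has not seen the update yet
sync([a, b])
fresh = b.read("status:alice")   # after syncing, both replicas agree
```

Between the write and the sync, a reader that happens to reach replica `b` sees nothing, which is the "disappearing status update" effect described above.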
CouchDB
CouchDB is used to store documents, and its biggest change is in how queries work. Instead of a basic query structure, CouchDB searches for documents using two functions that map and reduce the data. One formats the document and the other decides which documents to include. A skilled Oracle database administrator who knows how to write stored procedures could do the same thing, but the map and reduce structure will be an eye-opener for rank-and-file programmers. Ajax developers can now write fairly complex search routines that embed more sophisticated logic.
The core of CouchDB is written in Erlang, but the APIs and interfaces use JavaScript and JSON.
The JavaScript API only strengthens CouchDB's appeal to ordinary web developers. These developers can store documents, or even an entire website, in the database.
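The two-function query style can be sketched as follows. CouchDB views are written in JavaScript; this Python version only illustrates the pattern, and the document fields, `map_doc`, `reduce_values`, and `run_view` are hypothetical names invented for the example.

```python
from collections import defaultdict

documents = [
    {"_id": "1", "type": "post", "author": "ann", "words": 120},
    {"_id": "2", "type": "post", "author": "bob", "words": 80},
    {"_id": "3", "type": "comment", "author": "ann", "words": 15},
    {"_id": "4", "type": "post", "author": "ann", "words": 200},
]


def map_doc(doc):
    """Decide whether a document is included and what key/value it emits."""
    if doc["type"] == "post":
        yield doc["author"], doc["words"]


def reduce_values(values):
    """Combine all the values emitted under one key."""
    return sum(values)


def run_view(docs, map_fn, reduce_fn):
    """Apply the map function to every document, then reduce per key."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}


words_per_author = run_view(documents, map_doc, reduce_values)
```

Here the query logic lives entirely in two small functions, which is what makes the approach approachable for developers who already write JavaScript.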
MongoDB
MongoDB is a typical example of how JavaScript is taking over the world. It takes data formatted as JSON and stores it. Queries are essentially JavaScript function calls, not much different from what you would type in a browser console, only somewhat simplified. The big difference is that MongoDB builds indexes for your database, and if the indexes are set up correctly, query results come back quickly. The database also works with a large number of other tools.
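Why an index makes queries fast can be shown with a small sketch. This is not MongoDB's implementation (a real database uses more elaborate index structures); the sample documents and the functions `find_by_scan`, `build_index`, and `find_by_index` are assumptions made for illustration.

```python
from collections import defaultdict

users = [
    {"name": "ann", "city": "Oslo"},
    {"name": "bob", "city": "Lima"},
    {"name": "cho", "city": "Oslo"},
]


def find_by_scan(docs, field, value):
    """Without an index: examine every document."""
    return [doc for doc in docs if doc.get(field) == value]


def build_index(docs, field):
    """Map each value of `field` to the positions of the documents holding it."""
    index = defaultdict(list)
    for position, doc in enumerate(docs):
        if field in doc:
            index[doc[field]].append(position)
    return index


def find_by_index(docs, index, value):
    """With an index: jump straight to the matching documents."""
    return [docs[position] for position in index.get(value, [])]


city_index = build_index(users, "city")
oslo_users = find_by_index(users, city_index, "Oslo")
```

Both lookups return the same documents; the indexed one simply avoids touching documents that cannot match, which matters once the collection is large.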
Redis
Like CouchDB and MongoDB, Redis stores documents and files organized by key. Unlike other NoSQL databases, its values are not limited to strings or numbers: a key can also hold a sorted or unsorted set of strings. This lets Redis offer more sophisticated set operations. Users no longer need to download the data to compute an intersection, because Redis can do it on the server.
Redis has inspired simple designs that do not need much code. Luke Melia, for example, tracks the visitors to his website by creating a new set at regular intervals. The union of the last five sets identifies who is online at the moment. Intersecting that union with a user's friend list produces the list of online friends. Set operations like these have many applications, and this is where Redis shows its power.
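The online-friends technique can be sketched with ordinary Python sets. In Redis the same union and intersection would run on the server (the `SUNION` and `SINTER` commands exist for this); the function names and sample data below are hypothetical.

```python
def online_users(interval_sets, window=5):
    """Union of the most recent `window` sets of visitors."""
    recent = interval_sets[-window:]
    return set().union(*recent)


def online_friends(interval_sets, friends, window=5):
    """Friends who appear in any of the most recent visitor sets."""
    return online_users(interval_sets, window) & set(friends)


# One set of visitor names per tracking interval, oldest first.
visits = [{"ann"}, {"bob"}, {"cho", "ann"}, set(), {"dee"}, {"eve"}]
friends_online = online_friends(visits, ["ann", "zed", "eve"])
```

The oldest set falls outside the five-set window, but "ann" also appears in a later set, so she still counts as online.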
Redis keeps the data in memory and writes out only a log of the changes. Some people do not even call Redis a database, and describe it instead as a powerful in-memory cache that also writes to disk. A traditional database waits until the data has reached the disk before confirming a write. Redis waits only until the data is in memory, which is much faster but potentially risky if a failure strikes at the wrong moment.
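The trade-off between speed and durability can be illustrated with a minimal sketch of an in-memory store that logs its changes. This is not how Redis is implemented; the `LoggedStore` class, its `pending` buffer, and the list standing in for a log file are assumptions made for the example.

```python
import json


class LoggedStore:
    """Toy in-memory store with a change log (hypothetical, for illustration)."""

    def __init__(self):
        self.memory = {}    # all reads and writes hit this dict
        self.pending = []   # changes not yet written to durable storage
        self.log = []       # stands in for a log file on disk

    def set(self, key, value):
        # The write is acknowledged as soon as memory is updated.
        self.memory[key] = value
        self.pending.append(json.dumps([key, value]))

    def flush(self):
        """Move buffered changes to the durable log."""
        self.log.extend(self.pending)
        self.pending.clear()

    @classmethod
    def recover(cls, log):
        """Rebuild the in-memory state by replaying the durable log."""
        store = cls()
        for line in log:
            key, value = json.loads(line)
            store.memory[key] = value
        store.log = list(log)
        return store


store = LoggedStore()
store.set("a", 1)
store.flush()
store.set("b", 2)   # never flushed: lost if the power fails now
recovered = LoggedStore.recover(store.log)
```

Anything written after the last flush exists only in memory, which is the window of risk the paragraph above describes.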
Riak
Riak is among the most sophisticated of these data stores, offering most of the features of the other products plus more control over replication. Although the basic structure stores key-value pairs, there are many options for retrieving them and ensuring their consistency. A write operation, for example, can require Riak to confirm that the data was successfully copied to other machines in the cluster. If you do not want to trust a single machine, you can ask it to wait until two, three, or 54 machines have written the data before it sends the confirmation. This is why the team can use the slogan "eventual consistency is no excuse for data loss".
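Waiting for several machines to confirm a write can be sketched as follows. This is a toy model rather than Riak's client API: the `Node` class, the `quorum_write` function, and the parameter name `w` are chosen here for illustration.

```python
class Node:
    """One machine in the toy cluster (hypothetical, for illustration only)."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def store(self, key, value):
        """Return True if this node stored the value, False if it is down."""
        if not self.up:
            return False
        self.data[key] = value
        return True


def quorum_write(nodes, key, value, w):
    """Send the write to every node; succeed only if at least `w` acknowledge."""
    acks = sum(node.store(key, value) for node in nodes)
    return acks >= w


cluster = [Node("n1"), Node("n2"), Node("n3", up=False)]
ok_with_two = quorum_write(cluster, "cart:42", "3 items", w=2)
ok_with_three = quorum_write(cluster, "cart:42", "3 items", w=3)
```

With one node down, a write that needs two acknowledgments succeeds and one that needs three fails, so raising `w` trades availability for stronger assurance that the data survived.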
The data is not simply written to the hard disk. That is one option, but not the main one. Riak uses a pluggable storage engine (Bitcask by default) that writes the data to disk in its own internal format. Several other engines are available, including the InnoDB engine used by MySQL. Riak's clustering helps keep the data safe.
When fetching data, Riak resolves any conflicts it finds. If two nodes hold different versions of an object, Riak will either choose the most recent update or return both versions to your client code so that it can decide. This is a useful option for uncovering potential errors in the data.
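The two ways of handling diverging versions can be sketched in a few lines. Real systems typically track versions with richer metadata than a single timestamp; the `resolve` function, its `strategy` argument, and the `(timestamp, value)` pairs below are assumptions made for illustration.

```python
def resolve(versions, strategy="latest"):
    """Resolve the versions returned by different replicas.

    versions: list of (timestamp, value) pairs, one per replica that answered.
    strategy: "latest" picks the newest value; anything else returns all
    distinct values so the caller can decide.
    """
    distinct = list(dict.fromkeys(versions))  # drop duplicates, keep order
    if not distinct:
        return None
    if len(distinct) == 1:
        return distinct[0][1]
    if strategy == "latest":
        return max(distinct, key=lambda version: version[0])[1]
    return [value for _, value in distinct]


replies = [(1, "draft"), (2, "final"), (2, "final")]
newest = resolve(replies)
both = resolve(replies, strategy="siblings")
```

Returning every distinct version is the safer choice when silently discarding an older write would hide a real problem in the data.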
Neo4j
Of the databases mentioned here, Neo4j is one of the most distinctive. It stores graphs rather than just data, representing them as nodes and edges (relationships). Social networking applications are its strength. Neo4j is still very new, and its developers are still looking for better algorithms. In recent versions they have begun experimenting with a new caching strategy: because Neo4j can cache node information, search algorithms run fast. They have also added a new query language with pattern matching that resembles XSL.
Neo4j is backed by Neo Technology. The company's commercial version of the database adds backup, recovery, and sophisticated monitoring.
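Storing data as nodes and relationships, and why that suits social networks, can be sketched with a small in-memory graph. This does not use Neo4j or its query language; the `Graph` class, the `FRIEND` relationship name, and the sample people are hypothetical.

```python
from collections import defaultdict


class Graph:
    """Toy graph of nodes and named relationships (for illustration only)."""

    def __init__(self):
        self.edges = defaultdict(set)  # node -> set of (relationship, neighbor)

    def relate(self, a, relationship, b):
        """Store an undirected relationship between two nodes."""
        self.edges[a].add((relationship, b))
        self.edges[b].add((relationship, a))

    def neighbors(self, node, relationship):
        """Nodes connected to `node` by the given relationship."""
        return {other for rel, other in self.edges.get(node, ()) if rel == relationship}

    def friends_of_friends(self, node):
        """People two hops away who are not already direct friends."""
        friends = self.neighbors(node, "FRIEND")
        result = set()
        for friend in friends:
            result |= self.neighbors(friend, "FRIEND")
        return result - friends - {node}


g = Graph()
g.relate("ann", "FRIEND", "bob")
g.relate("bob", "FRIEND", "cho")
g.relate("ann", "FRIEND", "dee")
suggestions = g.friends_of_friends("ann")
```

A query like "friends of friends" is a short walk along edges here, whereas in a relational schema it would take repeated self-joins.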
FlockDB
Some people find Neo4j too complex, more than they need. They might try FlockDB instead. FlockDB is a real-time, distributed database that is a core component of Twitter's infrastructure. Twitter released FlockDB as an open-source project under the Apache license more than a year ago. If you want to build your own Twitter, you will also need to download Gizzard, the tool that splits data across multiple FlockDB instances. Because FlockDB stores the associations between pairs of nodes, many people call it a "graph database." Others argue that the term applies only to more sophisticated tools such as Neo4j.
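Storing node-to-node associations and splitting them across several stores can be sketched as follows. The article does not describe how Gizzard actually partitions data, so the hash-and-modulo scheme here is an assumption, as are the `shard_for` function and the `ShardedEdgeStore` class.

```python
import zlib


def shard_for(node_id, shard_count):
    """Pick a shard deterministically from the source node's id."""
    return zlib.crc32(str(node_id).encode("utf-8")) % shard_count


class ShardedEdgeStore:
    """Toy edge store split across shards (hypothetical, for illustration)."""

    def __init__(self, shard_count):
        # Each shard maps a source node to the set of nodes it points at.
        self.shards = [dict() for _ in range(shard_count)]

    def add_edge(self, source, destination):
        shard = self.shards[shard_for(source, len(self.shards))]
        shard.setdefault(source, set()).add(destination)

    def following(self, source):
        """All destinations recorded for `source`, read from a single shard."""
        shard = self.shards[shard_for(source, len(self.shards))]
        return set(shard.get(source, set()))


store = ShardedEdgeStore(shard_count=4)
store.add_edge("ann", "bob")
store.add_edge("ann", "cho")
store.add_edge("bob", "ann")
```

Because all of one user's outgoing edges land on the same shard, a "who does this user follow" lookup touches only one store, which keeps the simple queries fast without a full graph engine.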
How do I choose a NoSQL database?
The question of how to choose a NoSQL database is not easy to answer. Many IT departments pick one more or less at random, and sometimes the database they choose does not meet their needs. Good developers have to weigh the benefits to the project, the availability of commercial support, and the quality of the documentation, which makes it hard to pick the best database.
These databases all store big piles of keys and values; the real questions are how to distribute the load properly across servers and how to propagate changes to them. Another issue is hosting: it is very appealing to let a cloud service handle all the maintenance for you. Moving data between NoSQL databases is harder than between SQL databases. There is no standard query language, and no broad abstraction layer like JDBC. Even so, NoSQL databases are already attractive enough.