15 NoSQL databases


1. MongoDB

Introduction

MongoDB is a database built on distributed file storage and written in C++. It is aimed primarily at efficient access to massive amounts of data, providing a scalable, high-performance data storage solution for web applications. When the data volume exceeds 50 GB, MongoDB's access speed is reportedly more than 10 times that of MySQL. Its concurrent read/write efficiency is not outstanding: according to official performance tests, it handles roughly 5,000 to 15,000 read/write requests per second. MongoDB also ships with GridFS, an excellent distributed file storage mechanism that supports massive amounts of data.

MongoDB also has a Ruby companion project, MongoMapper, a MongoDB interface modeled on Merb's DataMapper; it is very simple to use, nearly identical to DataMapper, and quite powerful.

MongoDB is a product that sits between relational and non-relational databases: among non-relational databases it is the most feature-rich and the most like a relational database. The data structures it supports are very loose, using a JSON-like BSON format, so it can store fairly complex data types. MongoDB's biggest strength is a very powerful query language whose syntax somewhat resembles an object-oriented query language; it covers most of the functionality of single-table queries in a relational database and also supports indexing.

"Collection-oriented" means that data is grouped into datasets, each called a collection. Every collection has a unique name within the database and can contain an unlimited number of documents. A collection is similar to a table in a relational database (RDBMS), except that no schema needs to be defined.
Schema-free means that we do not need to know anything about the structure of the documents stored in a MongoDB database; if necessary, documents with different structures can be stored in the same database.
Documents in a collection are stored as sets of key-value pairs. Keys are strings that identify fields, while values can be any of a variety of complex types. This storage format is called BSON (Binary Serialized Document Format).

The MongoDB server runs on Linux, Windows, and OS X, supports both 32-bit and 64-bit builds, and listens on port 27017 by default. A 64-bit platform is recommended, because 32-bit builds limit the total data size to about 2 GB.

MongoDB stores data in files (the default path is /data/db) and manages them through memory-mapped files for efficiency.
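
As a quick illustration of the document model and query language described above, here is a minimal, hedged sketch using the PyMongo driver (any official driver would do); the database and collection names are made up, and a local server on the default port 27017 is assumed.

    # Minimal PyMongo sketch; database/collection names are illustrative only.
    from pymongo import MongoClient, ASCENDING

    client = MongoClient("localhost", 27017)   # default MongoDB port
    db = client["testdb"]                      # databases are created lazily
    users = db["users"]                        # a schema-free collection

    # Documents are BSON: nested fields and arrays are allowed.
    users.insert_one({"name": "alice", "age": 30,
                      "address": {"city": "Beijing"}, "tags": ["admin", "dev"]})

    # Rich single-collection queries, much like single-table SQL queries.
    for doc in users.find({"age": {"$gte": 18}, "tags": "dev"}):
        print(doc["name"])

    # Secondary indexes are supported, including on embedded fields.
    users.create_index([("address.city", ASCENDING)])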

Characteristics

It is characterized by high performance, easy deployment, ease of use, and convenient data storage. The main features are:

    1. Collection-oriented storage, well suited to storing object-type data.
    2. Schema-free.
    3. Supports dynamic queries.
    4. Supports full indexing, including on embedded objects.
    5. Supports queries.
    6. Supports replication and failure recovery.
    7. Uses efficient binary data storage, including large objects such as video.
    8. Automatic sharding to support cloud-level scalability.
    9. Supports many languages, such as Ruby, Python, Java, C++, PHP, and C#.
    10. The storage format is BSON (an extension of JSON).
    11. Can be accessed over the network.

Official website

http://www.mongodb.org/

2. CouchDB

Introduction

Apache CouchDB is a document-oriented database management system. It provides a REST interface that uses JSON as its data format, and the organization and presentation of documents can be controlled through views. CouchDB is a top-level open source project of the Apache Foundation.

CouchDB is a document-oriented database system developed in Erlang, and its data is stored in a format similar to a Lucene index file. CouchDB's big selling point is that it is a storage system for a new generation of web applications; in fact, CouchDB's slogan is: the next-generation storage system for web applications.
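
Because the interface is plain REST over JSON, any HTTP client will do. Below is a minimal, hedged sketch using Python's requests library against a local server on CouchDB's default port 5984; the database name is made up, and authentication (required by recent CouchDB releases) is omitted for brevity.

    # Minimal sketch of CouchDB's REST/JSON interface via the requests library.
    import requests

    base = "http://localhost:5984"

    requests.put(f"{base}/articles")                     # create a database

    # Create a document; CouchDB assigns an _id and a revision.
    resp = requests.post(f"{base}/articles",
                         json={"title": "NoSQL overview", "views": 1})
    doc_id = resp.json()["id"]

    # Fetch it back as JSON.
    print(requests.get(f"{base}/articles/{doc_id}").json())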

Characteristics

The main features are:

    1. CouchDB is a distributed database that spreads its storage across N physical nodes and coordinates and synchronizes read/write consistency between them, something made practical by Erlang's excellent concurrency features. For large-scale, document-heavy web applications, this distribution removes the need to split tables the way a traditional relational database would, along with the extensive application-layer changes that entails.
    2. CouchDB is a document-oriented database that stores semi-structured data in an index structure similar to Lucene's. It is especially suited to storing documents, which makes it a good fit for CMSs, phone books, address books, and similar applications; in these cases a document database is more convenient than a relational database and performs better.
    3. CouchDB exposes a REST API that lets users manipulate the database with JavaScript and even write query statements in JavaScript. One can imagine how easy and convenient it would be to build a CMS that combines Ajax techniques with CouchDB. In fact, CouchDB is only the tip of the Erlang iceberg: in recent years Erlang-based applications have flourished, especially large-scale distributed web applications, which play precisely to Erlang's strengths.

Official website

http://couchdb.apache.org/

3. HBase

Introduction

HBase is a distributed, column-oriented, open-source database derived from the Google paper "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable builds on the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase is a subproject of the Apache Hadoop project. It differs from a typical relational database in that it is suited to storing unstructured data; another difference is that HBase is column-based rather than row-based.

HBase (Hadoop Database) is a highly reliable, high-performance, column-oriented, scalable distributed storage system; HBase can be used to build large-scale structured storage clusters on inexpensive PC servers. HBase is an open-source implementation of Google Bigtable: where Bigtable uses GFS as its file storage system, HBase uses Hadoop HDFS; where Google runs MapReduce to process the massive data in Bigtable, HBase uses Hadoop MapReduce; and where Bigtable uses Chubby as its coordination service, HBase uses ZooKeeper.

HBase access interfaces:
    1. Native Java API: the most conventional and efficient access method, suitable for Hadoop MapReduce jobs that batch-process HBase table data in parallel.
    2. HBase Shell: HBase's command-line tool and the simplest interface, intended for HBase administration.
    3. Thrift Gateway: uses Thrift serialization to support languages such as C++, PHP, and Python, letting heterogeneous systems access HBase table data online (see the Python sketch after this list).
    4. REST Gateway: supports REST-style HTTP access to HBase, removing language restrictions.
    5. Pig: the Pig Latin dataflow language can be used to manipulate HBase data; like Hive, it is ultimately compiled into MapReduce jobs that process HBase table data, and it is convenient for data statistics.
    6. Hive: the current Hive release does not yet support HBase, but HBase support is planned for Hive 0.7.0, which will allow access to HBase with SQL-like statements.
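
For item 3, a hedged sketch of access through the Thrift gateway using the third-party happybase Python library (an assumption, not part of HBase itself) might look like the following; it assumes a running Thrift server on its default port, and the table and column names are made up.

    # Python access to HBase via the Thrift gateway, using happybase.
    import happybase

    conn = happybase.Connection("localhost")        # Thrift server, default port 9090
    conn.create_table("pages", {"info": dict()})    # one column family: info

    table = conn.table("pages")
    table.put(b"row-001", {b"info:title": b"Welcome", b"info:hits": b"1"})

    print(table.row(b"row-001"))                    # single-row get

    # Server-side scan over a range of row keys.
    for key, data in table.scan(row_start=b"row-000", row_stop=b"row-010"):
        print(key, data)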

Characteristics

The main features are:

Supports billions of rows x millions of columns

Distributed architecture with MapReduce support

Optimized for real-time queries

High-performance Thrift gateway

Query predicate push-down via server-side scan and filter

Supports XML, Protobuf, and binary encodings over HTTP

JRuby-based (JIRB) shell

Rolling restarts for configuration changes and minor upgrades

No single point of failure

Random-access performance comparable to MySQL

Official website

http://hbase.apache.org/

4. Cassandra

Introduction

Cassandra is a hybrid non-relational database, similar to Google's Bigtable. Its feature set is richer than that of Dynomite (a distributed key-value storage system) but not as complete as that of the document store MongoDB (the open-source product between relational and non-relational databases described above, the most feature-rich of the non-relational databases and the most like a relational one, with loose JSON-like BSON data structures that can hold fairly complex types). Cassandra was originally developed by Facebook and later became an open-source project; it is an ideal database for Internet-scale social and cloud computing applications. It combines the fully distributed, peer-to-peer (no central node) architecture of Amazon's Dynamo with the column-family data model of Google Bigtable, and in many respects it can be called Dynamo 2.0.

Characteristics

Compared with other databases, there are several salient features:

Flexible schema: with Cassandra, as with a document store, you do not have to decide the fields of a record in advance. You can add or remove fields while the system is running. This is an astonishing efficiency gain, especially in large deployments.
True scalability: Cassandra scales purely horizontally. To add capacity to a cluster, you simply point it at another machine; you do not have to restart any process, change application queries, or manually migrate any data.
Multi-datacenter awareness: you can arrange your nodes so that even if a fire destroys one data center, an alternate data center holds at least a full copy of every record.

Some other features that make Cassandra more competitive:

Range queries: if plain key-value lookups are not enough, you can query over a range of keys.
List data structures: in mixed mode you can add super columns, giving a five-dimensional data model; this is very convenient for per-user indexes.
Distributed writes: you can read or write any data anywhere at any time, with no single point of failure.
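
The features above describe Cassandra's original Thrift/super-column interface; modern clusters are usually accessed through CQL instead. As a hedged illustration only, a sketch with the DataStax Python driver (keyspace and table names made up, a single local node assumed) could look like this:

    # CQL access via the DataStax Python driver; names are illustrative only.
    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])      # add more contact points to scale out
    session = cluster.connect()

    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS demo
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
    """)
    session.execute("""
        CREATE TABLE IF NOT EXISTS demo.events (
            user_id text, ts timestamp, payload text,
            PRIMARY KEY (user_id, ts))
    """)

    # Writes can go to any node; there is no single point of failure.
    session.execute("INSERT INTO demo.events (user_id, ts, payload) "
                    "VALUES (%s, toTimestamp(now()), %s)", ("alice", "login"))

    # Range-style query over the clustering column within one partition.
    rows = session.execute("SELECT ts, payload FROM demo.events "
                           "WHERE user_id = %s LIMIT 10", ("alice",))
    for row in rows:
        print(row.ts, row.payload)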

Official website

http://cassandra.apache.org/

5. Hypertable

Introduction

Hypertable is an open-source, high-performance, scalable database modeled on Google's Bigtable. Over the past few years, Google has built three key pieces of scalable computing infrastructure designed to run on clusters of PCs. The first is the Google File System (GFS), a highly available file system that provides a global namespace. It achieves high availability by replicating file data across machines (and across racks), and it is therefore protected against many failures that traditional file storage systems cannot avoid, such as power, memory, and network-port failures. The second is a computing framework called MapReduce, which works closely with GFS to help process the massive amounts of data collected. The third is Bigtable, an alternative to a traditional database: it lets you organize massive amounts of data by primary key and query it efficiently. Hypertable is an open-source implementation of Bigtable, with some improvements based on its developers' own ideas.

Characteristics

Main features:

Load balancing

Version control and consistency

Reliability

Distribution across multiple nodes

Official website

http://hypertable.org/

6. Redis

Introduction

Redis is a key-value storage system. It is similar to memcached but supports more value types: strings, lists (linked lists), sets, and zsets (sorted sets). These types support push/pop, add/remove, intersection, union, difference, and other, richer operations, and all of these operations are atomic. On top of this, Redis supports sorting in several different ways. As with memcached, data is held in memory for efficiency; the difference is that Redis periodically writes updated data to disk or appends modification operations to a log file, and it implements master-slave synchronization on that basis.
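
A minimal, hedged sketch of these types and operations using the redis-py client (an assumption; any client would do) follows; the key names are made up, and a local server on the default port 6379 is assumed.

    # redis-py sketch of the value types and atomic operations described above.
    import redis

    r = redis.Redis(host="localhost", port=6379)   # default Redis port

    r.set("greeting", "hello")                     # string
    r.lpush("queue", "job1", "job2")               # list: push/pop
    print(r.rpop("queue"))

    r.sadd("tags:a", "x", "y")                     # sets: union/inter/diff
    r.sadd("tags:b", "y", "z")
    print(r.sinter("tags:a", "tags:b"))

    r.zadd("scores", {"alice": 95, "bob": 87})     # zset (sorted set)
    print(r.zrange("scores", 0, -1, withscores=True))

    r.expire("greeting", 60)                       # key expiration (cache-like)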

Performance Test Results:

About 110,000 SET operations and 81,000 GET operations per second, on a server configured as follows:

Linux 2.6, Xeon X3320 2.5 GHz.

The Stack Overflow website uses Redis as a cache server.

Characteristics

Main features:

Security

Master-slave replication

Exceptionally fast

Supports sets (including union/difference/intersection operations)

Supports lists (also usable as queues, with blocking pop operations)

Supports hashes (objects with multiple fields)

Supports sorted sets (e.g. high-score tables, with range queries)

Supports transactions

Supports key expiration (cache-like behavior)

Pub/sub for building messaging mechanisms

Official website

http://redis.io/

7. Tokyo Cabinet / Tokyo Tyrant

Introduction

Tokyo Cabinet (TC) and Tokyo Tyrant (TT) were developed by Mikio Hirabayashi of Japan and are used mainly by mixi.jp, Japan's largest social networking site. TC appeared first; it is now a very mature project and one of the biggest hotspots in the key-value database field, and it is widely used on websites. TC is a high-performance storage engine, while TT provides a multi-threaded, highly concurrent server on top of it; performance is also excellent, at roughly 40,000 to 50,000 reads and writes per second.

In addition to key-value storage, TC supports a table-like hashtable data type, which makes it much like a simple database table: it supports column-based conditional queries, paging, and sorting, roughly the underlying query functionality of a single table. It can therefore stand in for many relational database operations, which is one of the main reasons TC is popular. There is also a Ruby project, MiyazakiResistance, that wraps TT's hashtable operations in an ActiveRecord-like interface and is very convenient to use.

Characteristics

In production at mixi, TC/TT stores more than 20 million records while supporting tens of thousands of concurrent connections; it is a proven project. TC provides a reliable data persistence mechanism while guaranteeing very high concurrent read/write performance, and its hashtable type supports relational-style table structures with simple conditions, paging, and sorting, making it a very good NoSQL database.

TC's main drawback is that once the data volume reaches the hundreds of millions, concurrent write performance drops sharply: developers found that after inserting 160 million records of 2 KB to 20 KB each, write performance began to fall off dramatically. In other words, TC's performance degrades noticeably at the hundreds-of-millions scale, although according to the TC author's own data from mixi, no such write bottleneck had been hit at least up to the tens-of-millions scale.

Official website

http://fallabs.com/tokyocabinet/

8. Flare

Introduction

TC was developed by mixi.jp, Japan's largest social networking site, while Flare was developed by Japan's second-largest, gree.jp. In short, Flare adds scalability to TC: it replaces TT, the network server written on top of TC. Flare's main feature is scale-out support: it places a node server in front of the network service to manage multiple backend server nodes, so you can dynamically add database nodes, remove nodes, and fail over. If your scenario requires TC to scale out, Flare is worth considering.

Flare's only drawback is that it supports just the memcached protocol, so when you use Flare you cannot use TC's table data structure, only its key-value structure.

Characteristics

No detailed feature description was found.

Official website

http://flare.prefuse.org/

9. Berkeley DB

Introduction

Berkeley DB (BDB) is a high-performance, embedded database programming library with bindings for C, C++, Java, Perl, Python, PHP, Tcl, and many other languages. Berkeley DB can store key/value pairs of any type, and multiple values can be stored under a single key. It supports thousands of concurrent threads operating on a database at once and databases of up to 256 TB, and it runs on a wide range of operating systems, including most Unix-like systems, Windows, and real-time operating systems.
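
Since Python is among the bindings mentioned, here is a minimal, hedged sketch using the bsddb3 package (an assumption about which binding is installed); the database file name and keys are made up.

    # Local key/value storage with the bsddb3 binding for Berkeley DB.
    from bsddb3 import db

    bdb = db.DB()
    bdb.open("example.bdb", dbtype=db.DB_HASH, flags=db.DB_CREATE)

    # Keys and values are arbitrary byte strings.
    bdb.put(b"user:1", b"alice")
    bdb.put(b"user:2", b"bob")

    print(bdb.get(b"user:1"))        # b'alice'

    bdb.close()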

Berkeley DB was originally developed to replace the old hsearch function and the many dbm implementations (such as AT&T's dbm, Berkeley's ndbm, and the GNU project's gdbm) with a new hash access algorithm. The first release of Berkeley DB appeared in 1991 and also included a B+ tree access method. In 1992, BSD Unix 4.4 shipped with Berkeley DB 1.85, which is generally regarded as the first official version. In mid-1996, Sleepycat Software was founded to provide commercial support for Berkeley DB, and since then Berkeley DB has been widely used as a unique embedded database system. In 2006 Sleepycat was acquired by Oracle, making Berkeley DB a member of the Oracle database family; the original Sleepycat developers continue to work on Berkeley DB at Oracle, which has kept the original licensing model, increased investment in development, and continued to strengthen Berkeley DB's reputation in the software industry. The latest release of Berkeley DB at the time of writing is 4.7.25.

Characteristics

Main Features:

Fast access speed

Saves disk space

Official website

http://www.oracle.com/us/products/database/overview/index.html

10. Memcachedb

Introduction

Memcachedb is a distributed, key-value persistent storage system. It is not a cache component but a reliable, fast, persistent storage engine for object access. Its protocol is compatible with memcached (though not completely), so many memcached clients can connect to it. Memcachedb uses Berkeley DB as its persistent storage component, so many of Berkeley DB's features are supported.
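
Because the protocol is largely memcached-compatible, an ordinary memcached client works. A hedged sketch with the pymemcache library (an assumption; any memcached client should behave similarly), with Memcachedb assumed to be listening on port 21201:

    # Talking to Memcachedb through a standard memcached client.
    from pymemcache.client.base import Client

    client = Client(("localhost", 21201))    # assumed Memcachedb port

    # Unlike a pure cache, these writes are persisted in Berkeley DB underneath.
    client.set("session:42", "logged-in")
    print(client.get("session:42"))          # b'logged-in'
    client.delete("session:42")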

Characteristics

Memcachedb stands on the shoulders of giants: its front end is memcached's network layer, and its back end is Berkeley DB storage.

Write speed: setting 200 million records (16-byte keys, 10-byte values) from the local server through the memcache client (libmemcache) took 16,572 seconds, an average of about 12,000 records/sec.

Read speed: getting records with 16-byte keys and 10-byte values from the local server through the memcache client (libmemcache) took 103 seconds, an average of about 10,000 records/sec. Most standard memcached commands are supported.

Official website

http://memcachedb.org/

11. Memlink

Introduction

Memlink is a high-performance, persistent, distributed key-list/queue data engine developed by Tianya. As the name suggests, all data lives in memory, which guarantees high performance (reportedly several times that of Redis), while redo-log technology ensures data persistence. Memlink also supports master-slave replication, read/write splitting, and list filtering operations.

Unlike memcached, its values are lists/queues, and it provides features such as persistence and distribution. That sounds a bit like Redis, but it claims to be better than Redis and to improve on many areas where Redis falls short. Client development packages are provided for four languages: C, Python, PHP, and Java.

Characteristics

Main characteristics:

    • In-memory data engine with extremely high performance
    • Block-linked-list structure that conserves memory and optimizes lookup efficiency
    • Definable data items on nodes, supporting a variety of filtering operations
    • Redo-log support for data persistence (not merely a cache)
    • Distributed, with master-slave synchronization

Official website

http://code.google.com/p/memlink/

12. db4o

Introduction

"Using tables to store objects is like driving a car home, then splitting it into parts and putting it in the garage, and then assembling the car in the morning." But one wonders whether this is the most effective way to park a car. "–esther Dyson db4o is an open-source, pure object-oriented database engine that is easy to use for both Java and. NET developers. At the same time, DB4O has been validated by third parties as an object-oriented database with excellent performance, and the following benchmark diagram compares db4o with some traditional durable scenarios. Db4o ranked second in this comparison, just behind JDBC. With the benchmark results from Figure 1, it is worthwhile to savor the significant difference in performance between the HIBERNATE/HSQLDB scheme and the JDBC/HSQLDB scheme, which confirms the industry's concern for Hibernate. And Db4o's excellent performance, let us believe: more OO does not necessarily sacrifice performance.

db4o is also characterized by needing no DBA administration and having a small footprint, which makes it ideal for embedded and cache applications. Since its release it has quickly attracted a large number of users, who apply db4o in a wide variety of embedded systems, including mobile software, medical devices, and real-time control systems. db4o is developed, commercialized, and supported by Db4objects, an open-source database company based in Silicon Valley, California, and is released under the GPL. Db4objects was founded in 2004 under CEO Christof Wittig, backed by investors including Mark Leslie, CEO of Veritas Software, Vinod Khosla (one of Sun's founders), and other senior Silicon Valley figures. There is no doubt that Db4objects is today one of Silicon Valley's hottest technology innovators.

Characteristics

db4o's goal is to provide a powerful, embeddable database engine that can work on devices, mobile products, desktops, and servers across a variety of platforms. The main features are:

Open-source model. Unlike other ODBMSs, db4o is open-source software, and its development is driven by the power of the open-source community.

Native object database. db4o is a 100% native object-oriented database that is manipulated directly from the programming language. Programmers do not have to write OR-mapping code to store objects, which saves a great deal of development time spent on persistence.

Performance. According to db4o's official benchmark data, db4o is up to 44 times faster than a Hibernate/MySQL solution on some test cases, and it is easy to install, consisting only of a roughly 400 KB .jar or .dll library file.

The rest of this series of articles focuses on the Java platform, but db4o works equally well on the .NET platform.

Figure: Official test data

Easy to embed. Using db4o requires only a .jar or .dll file of a bit over 400 KB, and memory consumption is very small.

Zero administration. db4o needs no DBA and achieves zero administration.

Multi-platform support. db4o supports Java 1.1 through Java 5.0 as well as .NET platforms such as .NET, Compact Framework, and Mono. It can also run in reflection-capable J2ME dialects such as CDC, Personal Profile, Symbian, SavaJe, and Zaurus, and even in J2ME environments without reflection support, such as CLDC, MIDP, RIM/BlackBerry, and Palm OS.

A developer might ask: what if the existing application environment already has a relational database? That is not a problem: the db4o Replication System (dRS) enables two-way synchronization (replication) between db4o and relational databases, as shown in the DRS model figure below. dRS is built on Hibernate; the current version is 1.0 and runs on Java 1.2 or later. Based on dRS, two-way replication is possible between db4o and Hibernate/RDBMS, between db4o and db4o, and between Hibernate/RDBMS instances.

Figure: DRS model

Official website

http://www.db4o.com/china/

13. Versant

Introduction

The Versant Object Database (V/OD) provides powerful data management for C++, Java, or .NET object models and supports high concurrency and large-scale data sets.

The Versant Object Database is an object database management system (ODBMS: Object Database Management System). It is used primarily in complex, distributed, heterogeneous environments to reduce development effort and improve performance, and it is especially useful when programs are written in Java and/or C++.

It is complete e-infrastructure software that simplifies the construction and deployment of distributed, transactional applications.

As a high-end database product, the Versant ODBMS is designed to meet customers' needs for high performance, scalability, reliability, and compatibility across heterogeneous processing platforms and enterprise-class information systems.

The Versant Object Database has a long record of providing reliability, integrity, and performance for enterprise business applications; its high-performance multithreaded architecture, internal parallelism, streamlined client-server structure, and efficient query optimization all reflect its excellent performance and scalability.

The Versant Object Database comprises the Versant ODBMS, C++ and Java language interfaces, an XML toolkit, and an asynchronous replication framework.

Characteristics

1. Key strengths

Versant Object Database 8.0 is intended for application environments with complex object models; it is designed to handle the navigational access these applications commonly need, seamless data distribution, and enterprise scale.

For many applications, the most challenging aspect is managing the inherent complexity of the business model itself. The complexities of telecommunications infrastructure, transportation networks, simulations, financial instruments, and other domains must be supported, and supported in a way that allows the application to evolve as the environment and requirements change. These applications focus on the domain and its logic; complex designs should be based on the object model, and an architecture that mixes technical concerns such as persistence (and SQL) into the domain model can have disastrous consequences.

The Versant Object Database lets you work with objects that carry only domain behavior, independent of persistence. At the same time, it provides seamless data distribution across multiple databases, high concurrency, fine-grained locking, top performance, and high availability through replication and other techniques. Modern Java object-relational mapping tools have simplified many mapping problems, but they do not provide the seamless data distribution functionality and performance that Versant can.

2. Main features

Transparent object persistence for C++, Java, and .NET

Supports object persistence standards such as JDO

Seamless data distribution across multiple databases

Enterprise-class High availability options

Dynamic schema updates

Little or no administration required

End-to-end object support architecture

Fine-grained concurrency control

Multithreaded, multi-session

Support for international character sets

High-speed data acquisition

3. Advantages

Fast storage, retrieval, and browsing of object hierarchies

More than 10 times the performance of relational databases

Reduce development time

4. New features in 8.0

Enhanced multi-core linear scaling capabilities

Enhanced database management tools (monitoring, database inspection, data reorganization)

Support for LINQ-based .NET bindings

A "black box" tool for recording and analyzing database activity in .NET and JDO applications

5. Versant Object Database features

Dynamic schema updates

Versant supports lazy schema updates, meaning that objects are converted from the old schema to the new schema only when they are used, with no mapping required. This supports database schema evolution and agile development.

Seamless data distribution across multiple databases

The client interacts seamlessly with one or more databases. Individual databases are seamlessly federated, letting you partition data, improve read/write capacity, and increase the overall size of the database. The distribution of data across these databases is transparent; together they form a single seamless database, providing great scalability.

Concurrency control

Object-level locks ensure that two applications conflict only when they attempt to update the same object, unlike page-based locking, which can produce false concurrency hotspots.

Transparent C++ object persistence

C++ objects, STL classes, and standard C++ collections such as dictionaries, maps, and maps of maps are stored as-is in the database. State changes are tracked automatically in the background, and when the enclosing transaction commits, all changes are sent to the database automatically. The result is a very natural, low-noise programming style, so applications can be developed quickly and modified flexibly when requirements change.

Transparent Java object persistence

V/OD's JVI and JDO 2.0 APIs provide transparent persistence of plain old Java objects (POJOs), including Java 2 collection classes, interfaces, and any user-defined classes. State changes are tracked automatically in the background, and when a transaction commits, all changes are written to the database automatically. The result is a lightweight programming style that works for both managed and unmanaged deployments.

Fully embeddable: Versant can be embedded in the application, the database can scale to terabytes, and it can run autonomously with no administration required.

6. Enterprise-class features

Objects end to end

End-to-end objects means that your application objects exist as objects on the client, on the network, and in the database. Unlike with relational databases, no mapping or transformation is needed between objects in memory and their representation in the database.

The application's client-side cache transparently caches objects for speed. The database understands objects natively: it executes queries, builds indexes, and lets the application balance processing between itself and the database. XA support makes coordination with other transactional data sources possible.

7. V/OD 8 database architecture

High Availability

High availability of the database is achieved through online database administration.

Fault-tolerant server

The fault-tolerant server option automates failover and data recovery when hardware or software failures occur in the Versant database. It uses synchronous replication between two database instances and also supports transparent resynchronization after a failure.

Asynchronous data replication

The asynchronous data replication option supports master-slave and peer-to-peer asynchronous replication among multiple object servers. It can be used to replicate data to a remote recovery site or across multiple local object databases to improve performance and reliability.

High Availability Backup

The High Availability Data backup option enables Versant to use the disk mirroring features of EMC Symmetrix or other enterprise storage systems to make online backups of large data volumes without compromising availability.

Online reorganization

The Versant database reorganization option is designed for applications that delete large numbers of objects. It lets users reclaim unused space while the database stays up and running, increasing available space and improving database performance.

8. Why use the Versant object database?

Accelerate time to market by shortening development cycles

Object-relational mapping code can account for 40% or more of your application. With the Versant object database, that mapping code is no longer needed.

Vastly improved performance and data throughput

When an application involves a complex in-memory object model, especially with navigational access along associations, an object database performs better than mapping to a relational database. For example, when an application needs to retrieve an object from an object database, a single query is enough; when mapping to a relational database, if the object contains many-to-many associations, one or more joins are needed to retrieve the data in the associated tables. With an object database, retrieval of objects of ordinary complexity is roughly three times faster, retrieval of highly complex objects such as those with many-to-many associations is up to 30 times faster, and for collections and recursive joins the retrieval rate can improve by a factor of 50.

Faster application evolution as requirements change

Today's pace of change in business processes, structures, and applications makes the ability to adapt extremely important. Object-relational mapping and other approaches tied to rigid storage structures make such changes difficult. The Versant object database greatly enhances your application's ability to meet current and future business needs.

Return on investment

When users face a complex object model and a large data set, the object database is the preferred solution. Its main advantages are smaller code size, lower development cost, shorter time to market, little or no administration, and lower costs for hardware and server software licenses. Its performance benefits can also significantly reduce the cost of running high-load applications, whereas large relational databases are expensive to license and require costly hardware.

Official website

http://www.versant.com/index.aspx

14. Neo4j

Introduction

Neo4j is an embedded, disk-based Java persistence engine with full transaction support that stores data in graphs rather than tables. Neo4j offers massive scalability: it can handle graphs of billions of nodes, relationships, and properties on a single machine and can be scaled out across multiple machines running in parallel. Compared with relational databases, graph databases excel at large amounts of complex, interconnected, loosely structured data that changes rapidly and is queried frequently; in a relational database, such queries lead to large numbers of table joins and therefore to performance problems. Neo4j targets the performance degradation that traditional RDBMSs suffer on join-heavy queries. By modeling data as a graph, Neo4j traverses nodes and edges at a speed that is independent of the amount of data making up the graph. In addition, Neo4j provides very fast graph algorithms, recommender systems, and OLAP-style analysis, none of which is achievable in current RDBMS systems.

Neo is a network-oriented database: an embedded, disk-based Java persistence engine with full transactional characteristics that stores structured data in networks rather than in tables. A network (mathematically, a graph) is a flexible data structure that lends itself to a more agile, rapid style of development.

You can think of Neo as a high-performance graph engine with all the features of a mature and robust database. Programmers work with a flexible, object-oriented network structure rather than strict, static tables, while enjoying all the benefits of a fully transactional, enterprise-class database.

With the rise of network-oriented databases, people are curious about Neo. In this model, domain data is expressed as a "node space": a network of nodes, relationships, and properties (key-value pairs), as opposed to the traditional model's tables, rows, and columns. Relationships are first-class objects that can be annotated with properties, and properties indicate the context in which nodes interact. The network model is a natural fit for problem domains that are inherently built on relationships, such as Semantic Web applications. Neo's creators found that such interconnected, semi-structured data does not fit the traditional relational database model:

1. The object-relational impedance mismatch makes it difficult and laborious to squeeze object-oriented "round objects" into relational "square tables", a problem that can be avoided entirely.

2. The static, rigid, inflexible nature of the relational model makes it hard to change schemas to meet shifting business needs; for the same reason, the database often drags the team down when it tries to practice agile software development.

3. The relational model is poorly suited to expressing semi-structured data, which industry analysts and researchers consider the next big thing in information management.

4. A network is a very efficient data storage structure. It is no coincidence that the human brain is a huge network and that the World Wide Web is likewise structured as a network. The relational model can express network-oriented data, but it is very weak at traversing that network and extracting information from it.

Although Neo is a relatively new open-source project, it has already been used in production systems with more than 100 million nodes, relationships, and properties, and it meets enterprise needs for robustness and performance:

It fully supports JTA and JTS, two-phase-commit (2PC) distributed ACID transactions, configurable isolation levels, and large-scale, testable transaction recovery. These are not just verbal promises: Neo has been used in demanding 24/7 environments for more than three years. It is mature and robust, and fully ready for deployment.

Characteristics

Neo4j is a fully ACID-compliant graph database implemented in Java. Data is stored on disk in a format optimized for graph networks. The Neo4j kernel is an extremely fast graph engine with all the features expected of a database product, such as recovery, two-phase commit, and XA compliance.

Neo4j can be used as an embedded database without any administrative overhead, or as a standalone server, in which case it exposes a widely used REST interface that integrates easily with PHP, .NET, and JavaScript environments. The main focus of this article, however, is the direct use of Neo4j.
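
As a hedged illustration only: the article describes embedded Java use and the REST interface, but a quick sketch with the later official Python driver, speaking Cypher over Bolt, shows the node/relationship/property model in action. The URI, credentials, and labels below are made up.

    # Graph of nodes, relationships, and properties via the Neo4j Python driver.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687",
                                  auth=("neo4j", "secret"))

    with driver.session() as session:
        # Nodes with key-value properties, connected by a typed relationship.
        session.run(
            "MERGE (a:Person {name: $a}) "
            "MERGE (b:Person {name: $b}) "
            "MERGE (a)-[:KNOWS]->(b)",
            a="Alice", b="Bob")

        # Traversal-style query (friends of friends) -- no join tables involved.
        result = session.run(
            "MATCH (p:Person {name: $name})-[:KNOWS*1..2]->(f) "
            "RETURN DISTINCT f.name AS friend", name="Alice")
        for record in result:
            print(record["friend"])

    driver.close()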

Typical Neo4j data characteristics:

• Data structures are optional and can even be absent entirely, which simplifies schema changes and defers data migrations.

• Common complex domain datasets are easy to model, for example CMS access control modeled as fine-grained access-control lists, object-database-style use cases, triple stores, and so on.

• Typical areas of use include the Semantic Web and RDF, Linked Data, GIS, genetic analysis, social network data modeling, deep recommendation algorithms, and other fields.

Around the kernel, Neo4j provides a set of optional components: a meta-model for imposing structure on the graph, SAIL (a SPARQL-compliant RDF triple-store implementation), and a set of common graph algorithms.

Performance?

It is difficult to give exact benchmark figures because they depend on the underlying hardware, the datasets used, and other factors. Neo4j scales adaptively and can handle graphs containing billions of nodes, relationships, and properties without any additional work. Its read performance easily allows traversing 2,000 relationships per millisecond (about two million traversal steps per second), fully transactionally and with a warm cache per thread. Using shortest-path calculations, Neo4j is as much as 1,000 times faster than MySQL even on small graphs of a few thousand nodes, and the gap grows as the graph gets larger.

The reason is that in Neo4j graph traversal runs at constant speed, independent of graph size; unlike the join operations common in an RDBMS, it involves no set operations that degrade performance. Neo4j traverses graphs lazily: nodes and relationships are visited and returned only when the consuming iterator needs them, which greatly improves performance for large, deep traversals.

Write speed depends heavily on the file system's seek time and on the hardware. The ext3 file system combined with SSD disks is a good pairing, yielding roughly 100,000 write transactions per second.

Official website

http://neo4j.org/

15. BaseX

Introduction

BaseX is an XML database that stores XML data in a compact form, provides efficient XPath and XQuery implementations, and includes a graphical front end.

Characteristics

One of BaseX's notable advantages is its GUI: the interface includes a query window in which you can run XQuery against the XML documents in a database, plus a dynamic visualization of the XML document hierarchy and node relationships. That said, this advantage matters mostly for interactive use; when programming against BaseX, the GUI is irrelevant.

Compared with Xindice, BaseX is better at storing large XML documents; Xindice does not handle large XML well and is designed to manage collections of small and medium-sized documents.
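
For programmatic use, a hedged sketch against the BaseX HTTP server's REST interface follows; the default port 8984, the /rest path, and the stock admin credentials are assumptions about a default local installation, and the database name and sample document are made up.

    # Creating and querying a BaseX database over its REST interface.
    import requests

    base = "http://localhost:8984/rest"
    auth = ("admin", "admin")                  # default credentials; change them

    # Create a database named "books" from an XML document.
    requests.put(f"{base}/books", auth=auth,
                 data="<books><book><title>NoSQL</title></book></books>")

    # Run an XQuery expression against it.
    xquery = "for $t in collection('books')//title return data($t)"
    print(requests.get(base, params={"query": xquery}, auth=auth).text)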


Official website

http://basex.org/
