Why does MongoDB replace MySQL?


MongoDB is a document-oriented database currently developed and maintained by 10gen. It is feature-rich and complete enough that it can fully replace MySQL. While using MongoDB to build a product prototype, we summarized some of its highlights:

JSON-style syntax that is easy to understand: MongoDB uses BSON, a binary variant of JSON, as its internal storage format. All operations on MongoDB use JSON-style syntax, and the data a client submits or receives is presented as JSON. Compared with SQL, this is more intuitive and easier to learn and master.

Schema-less, with support for embedded sub-documents: MongoDB is a schema-free document database. A database can hold multiple collections, each of which is a set of documents. Collections and documents are not equivalent to the tables and rows of a traditional database. A collection can be created at any time without being defined in advance.

A collection can contain documents with different schemas. One document may have three attributes while the next has ten. Attribute values can be basic data types (numbers, strings, dates, and so on), arrays or hashes, or even a sub-document (embedded document). This makes it possible to use a denormalized data model to improve query speed.

Figure 1 MongoDB is a schema-free document database

Figure 2 shows an example. Works and their comments can be designed as a single collection. Comments are embedded as sub-documents in the comments attribute of a work, and replies are embedded as sub-documents in the replies attribute of each comment. With this design, all the related information can be retrieved with a single lookup by the ID of the work. In MongoDB, there is no insistence that data be normalized; in many cases denormalization is recommended. Developers can set aside the normal-form constraints of traditional relational databases and do not need to map every object to its own collection; only the top-level class needs one. MongoDB's document model lets us easily map our own objects to collections for storage.

Figure 2 MongoDB supports embedding sub-Documents
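
As a rough illustration of the embedding pattern described for Figure 2, the sketch below models one work document as a plain Python dict, with comments embedded in the work and replies embedded in each comment. The field names (_id, title, author, text) and the helper count_replies are assumptions made for illustration and are not taken from the original figure; a real application would store the same structure through a MongoDB driver.

```python
# Illustrative only: one "work" document with embedded comments and replies.
# All field names and values here are hypothetical.
work = {
    "_id": "work-1",
    "title": "Sunset",
    "comments": [
        {
            "author": "alice",
            "text": "Great colours",
            "replies": [
                {"author": "bob", "text": "Agreed"},
                {"author": "carol", "text": "Where was this taken?"},
            ],
        },
        {"author": "dave", "text": "Nice", "replies": []},
    ],
}


def count_replies(doc):
    """Count every reply nested under every comment of one work document."""
    return sum(len(c.get("replies", [])) for c in doc.get("comments", []))


# A dict keyed by _id stands in for the collection: a single lookup by ID
# returns the work together with all of its comments and replies.
works_by_id = {work["_id"]: work}
found = works_by_id["work-1"]
print(len(found["comments"]), count_replies(found))
```

Because the comments and replies travel with the work, no second query is needed to assemble the page for one work.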

Simple, easy-to-use queries: querying MongoDB is very comfortable. There is no hard-to-remember SQL syntax; queries are written directly in JSON, which is quite intuitive. In each programming language, a query is expressed with the language's basic array or hash type. With additional operators, MongoDB supports range queries, regular-expression queries, and queries on sub-document attributes, which can replace most of the SQL queries we used before.
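
To make the query style above more concrete, here is a small pure-Python sketch of how a JSON-style query could be evaluated against a document. It is a simplified model, not MongoDB's implementation: it handles only equality, the $gt, $lt, and $regex operators, and dotted paths into sub-documents. The helper names get_path and matches and the sample data are hypothetical.

```python
import re


def get_path(doc, path):
    """Follow a dotted path such as 'address.city' into nested dicts."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(doc, query):
    """Return True if doc satisfies every condition in the query dict."""
    for path, condition in query.items():
        value = get_path(doc, path)
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gt": 20, "$lt": 40}
            for op, arg in condition.items():
                if op == "$gt":
                    ok = value is not None and value > arg
                elif op == "$lt":
                    ok = value is not None and value < arg
                elif op == "$regex":
                    ok = isinstance(value, str) and re.search(arg, value) is not None
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif value != condition:
            # Plain value: simple equality
            return False
    return True


users = [
    {"name": "alice", "age": 31, "address": {"city": "Beijing"}},
    {"name": "bob", "age": 19, "address": {"city": "Shanghai"}},
    {"name": "alan", "age": 45, "address": {"city": "Beijing"}},
]
query = {
    "age": {"$gt": 20, "$lt": 40},   # range query
    "name": {"$regex": "^al"},       # regular-expression query
    "address.city": "Beijing",       # sub-document attribute query
}
result = [u["name"] for u in users if matches(u, query)]
print(result)
```

The query itself is just a hash, which is why it maps naturally onto the native array or hash type of each driver language.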

Simpler CRUD with in-place updates: an array passed to MongoDB's insert/update methods is inserted or updated automatically. For updates, MongoDB supports an upsert option: if the record exists it is updated, otherwise it is inserted. The update method also supports modifiers, which apply the change directly on the server and save round trips between client and server. These modifiers give MongoDB capabilities similar to key-value stores such as Redis and memcached, and compared with MySQL, MongoDB is simpler and faster for this kind of work. Modifiers are also a handy tool for tracking user behavior: in practice, we use them to quickly save user interactions to MongoDB for later statistical analysis and customization.
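
The upsert and modifier behaviour described above can be sketched in a few lines of Python. This is an in-memory model under stated assumptions, not the driver API: the update function is hypothetical, it matches on equality only, and it supports just the $inc and $set modifiers.

```python
def update(collection, query, change, upsert=False):
    """Apply $inc/$set modifiers to the first document matching query."""
    for doc in collection:
        if all(doc.get(k) == v for k, v in query.items()):
            target = doc
            break
    else:
        # No document matched the query.
        if not upsert:
            return False
        target = dict(query)  # new document seeded from the query fields
        collection.append(target)
    for field, amount in change.get("$inc", {}).items():
        target[field] = target.get(field, 0) + amount
    for field, value in change.get("$set", {}).items():
        target[field] = value
    return True


page_views = []
# First call inserts the counter, second call increments it in place.
update(page_views, {"page": "/home"}, {"$inc": {"views": 1}}, upsert=True)
update(page_views, {"page": "/home"}, {"$inc": {"views": 1}}, upsert=True)
# Without upsert, a missing document is left missing.
created = update(page_views, {"page": "/about"}, {"$inc": {"views": 1}})
print(page_views, created)
```

The caller never reads the counter before changing it, which is the property that makes modifiers suitable for recording user interactions at high frequency.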

Indexes on all attribute types, even arrays: this makes some tasks very easy to implement. In MongoDB, the "_id" attribute is the primary key, and MongoDB creates a unique index on _id by default.

Server-side scripts and Map/Reduce: MongoDB allows scripts to be executed on the server. You can write a function in JavaScript and execute it directly on the server, or store the function definition on the server and call it directly later. MongoDB does not support transaction-level locking, but a server-side script can be used to implement a custom "atomic" operation; while it runs, the entire MongoDB instance is locked. Map/Reduce is another attractive feature. It can perform statistics, classification, and merging over collections with large data volumes, covering aggregate functions such as SQL's GROUP BY. The mapper and reducer are themselves server-side scripts written in JavaScript.
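
The Map/Reduce data flow mentioned above can be illustrated with a short Python sketch that counts works per tag, the equivalent of a GROUP BY with COUNT. In MongoDB the mapper and reducer would be JavaScript functions running on the server; here they are ordinary Python functions, and map_reduce, tag_mapper, count_reducer, and the sample data are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def map_reduce(documents, mapper, reducer):
    """Group the (key, value) pairs emitted by mapper, then reduce each group."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in mapper(doc):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def tag_mapper(doc):
    """Emit (tag, 1) once for every tag on the document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def count_reducer(key, values):
    """Sum the emitted counts for one tag."""
    return sum(values)


works = [
    {"title": "a", "tags": ["photo", "city"]},
    {"title": "b", "tags": ["photo"]},
    {"title": "c", "tags": []},
]
tag_counts = map_reduce(works, tag_mapper, count_reducer)
print(tag_counts)
```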

High performance and speed: MongoDB is written in C++ with Boost. In most cases its queries are much faster than MySQL's, and its CPU usage is very low. Deployment is also very simple: on most systems you only need to download the binary package and unpack it to run it, with almost zero configuration.

Multiple replication modes: MongoDB supports replication between servers, including fault-tolerant setups in which two hosts back each other up.

Master-slave is the most common mode and can be used for data backup. In our practice we use master-slave: the slave is used only for backup, and all actual reads and writes go to the master node.

Replica pairs/replica sets allow two MongoDB instances to monitor each other, providing fault tolerance through mutual backup.

MongoDB has only limited support for dual-master (master-master) mode. Its practical usefulness is low, and it can be ignored.

Built-in GridFS for large-capacity storage: this is the most eye-catching feature and one of the reasons I gave up on other NoSQL products. The implementation of GridFS is actually very simple: in essence, a file is split into chunks that are stored across two collections, one for file metadata and one for chunks (fs.files and fs.chunks by default). All mainstream drivers wrap the GridFS operations. Because GridFS is itself built on collections, you can define and manage a file's attributes directly. With these attributes you can quickly find the file you want and easily manage massive numbers of files, without worrying about how to hash files to avoid file-system lookup performance problems. Combined with the auto-sharding described below, GridFS scales well enough for our needs. In practice, we use GridFS to store images and thumbnails of various sizes.
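
The chunking idea behind GridFS can be sketched as follows. This is a minimal in-memory model, not the driver's implementation: split_into_chunks and reassemble are hypothetical helpers, the field names loosely follow the GridFS layout (files_id, n, data, length, chunkSize), and the tiny chunk size is chosen only to keep the example readable.

```python
def split_into_chunks(file_id, data, chunk_size):
    """Return (file_doc, chunk_docs) describing data cut into fixed-size chunks."""
    chunks = [
        {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]
    file_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    return file_doc, chunks


def reassemble(chunks):
    """Rebuild the original bytes by joining chunks in sequence order."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))


payload = b"0123456789"
file_doc, chunks = split_into_chunks("img-1", payload, chunk_size=4)
print(file_doc, [c["data"] for c in chunks])
```

Because the file document is an ordinary document, extra attributes such as an owner or image dimensions could be stored next to length and chunkSize and then queried like any other field.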

Figure 3 Auto-sharding structure of MongoDB

Built-in sharding with a range-based auto-sharding mechanism: a collection can be divided into segments by record range and distributed across different shards. Shards can be combined with replication: used together with replica sets, they provide sharding plus fail-over, with load balancing across shards. Queries are transparent to the client. The client issues queries, statistics, Map/Reduce, and other operations, and MongoDB routes them automatically to the back-end data nodes. This lets us focus on our own business, and the deployment can be upgraded easily when the time comes. MongoDB's sharding design can support up to 20 petabytes, which is enough for general applications.
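
Range-based routing of the kind described above can be sketched with a sorted list of split points and a binary search. This is an assumption-laden illustration rather than MongoDB's router: the RangeRouter class, the shard names, and the split points are all hypothetical, and real auto-sharding also splits and migrates ranges as data grows.

```python
import bisect


class RangeRouter:
    """Route a shard-key value to the shard that owns its range."""

    def __init__(self, split_points, shards):
        # split_points must be sorted. The first shard owns keys below
        # split_points[0]; shard i owns keys from split_points[i - 1]
        # (inclusive) up to split_points[i] (exclusive); the last shard
        # owns everything from the final split point upward.
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        self.split_points = list(split_points)
        self.shards = list(shards)

    def shard_for(self, key):
        """Binary-search the split points to find the owning shard."""
        return self.shards[bisect.bisect_right(self.split_points, key)]


router = RangeRouter(["h", "p"], ["shard-a", "shard-b", "shard-c"])
for name in ["alice", "mike", "zoe"]:
    print(name, "->", router.shard_for(name))
```

The client only supplies the key; which back-end node holds the range stays hidden behind the router, which is what makes the query path transparent.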

Rich third-party support: the MongoDB community is very active, and many development frameworks have quickly added MongoDB support. Many well-known large companies and websites use MongoDB in production, and more and more innovative companies are adopting MongoDB as a technical solution alongside Django and Ruby on Rails (RoR).

Implementation results

Working with MongoDB has been a pleasure. We adapted our PHP development framework to MongoDB. In PHP, MongoDB queries and updates are built around arrays, so the code is concise. Because no tables need to be created, unit tests run much faster against MongoDB, which also improves the efficiency of agile, test-driven development. Of course, because MongoDB's document model is very different from the relational model, we also ran into a lot of confusion in practice; fortunately, the MongoDB open-source community helped us a great deal. In the end, we took two weeks to port our code from MySQL to MongoDB, much less than the development time we had expected. Our tests show that with a data volume on the order of millions of records and a database size on the order of gigabytes, system consumption such as CPU is quite low at about 2000 read/write requests per second (our data volume is still small; some companies have presented their own cases of MongoDB storing more than 5 billion records, over 1.5 TB). Currently, we have deployed MongoDB alongside other services, which saves a great deal of resources.

Tips

Understand MongoDB's document model and start from your actual needs: drop relational normal-form thinking and redesign your classes.

JavaScript code running on the server should not perform time-consuming operations such as iterating over all records; use Map/Reduce for that kind of collection-wide processing.

The type of an attribute must be consistent between storage and query. If the string "1" is inserted, a query for the number 1 will not match.

To optimize MongoDB performance, start with disk speed and memory.

MongoDB limits each document to a maximum of 4 MB. Within that limit, use embedded documents where possible and avoid database references.

An internal cache can avoid N+1 queries (MongoDB does not support joins).

Capped collections are useful for high-speed writes such as real-time logs.

With large amounts of data, increase the oplog size when setting up replication, and generate data files in advance to avoid client timeouts.

By default, the total number of collections and indexes cannot exceed 24000.

The space for deleting data in the current version (

Conclusion

MongoDB's next milestone is version 1.6, which is expected to be released in May. With it, MongoDB sharding will be used in production environments for the first time. As beneficiaries of MongoDB, we are also actively taking part in MongoDB community activities to improve the Perl/PHP solutions for MongoDB. MongoDB-based open-source projects will be launched within the 1.6 timeframe.

For companies that are just starting out or are building innovative Internet applications, MongoDB's speed, flexibility, light weight, and strong scalability suit rapid product development and rapid iteration, and help adapt to users' rapidly changing needs.

All in all, MongoDB is a full-featured NoSQL product and the one best suited to replace MySQL. The combination of MongoDB with Perl/PHP/Django/RoR will soon become the best combination for developing 3.0 products, just as MySQL once replaced Oracle, DB2, and Informix. History is always surprisingly similar. Let's wait and see!

Introduction to: http://hi.baidu.com/ixigua

Pan Fan (Nightsailer, N.S.), co-founder and head of technology of the Visual China website, has 1 dog and 2 cats. Responsible for website platform design and underlying product R&D. Current focus: apps platform design, distributed file storage, NoSQL, high performance, and post-modern Perl programming. Twitter: @nightsailer. Blog: http://nightsailer.com/
