A detailed analysis of the advantages and uses of the memcached cache and the MongoDB database

Source: Internet
Author: User

Http://www.mini188.com/showtopic-1604.aspx

This article sets out my views on memcached and MongoDB in some detail, along with the benefits of using the two together. Comments and additions are welcome.
Memcached
As I see it, the advantages of memcached come down to the following:
1) Distributed. Ten machines with 4 GB of memory each can be combined into a 40 GB memory pool, and if that is not large enough you simply add more machines. A pool this large can hold most of the hot business data in memory, absorb most of the database read requests, and take considerable pressure off the database.
2) Single point. If the web servers or app servers are load balanced, the caches kept in each server's own memory may hold different data. If that data has to be kept in sync, things get troublesome (let each copy expire on its own, or synchronize the data across servers?). Even if the data does not have to be in sync, inconsistent data can make for a poor user experience. A shared cache gives every server a single place to look.
3) Strong performance. Compared with the database there is no contest: at root, memory reads and writes are several orders of magnitude faster than disk reads and writes. When we complain that database reads and writes are too slow, it is worth looking at disk I/O. If that really is the bottleneck, no database, however powerful, will cope; what makes a database strong is largely how fully it uses memory.
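The pooling idea in point 1 can be sketched in a few lines. This is only an illustration, assuming a hypothetical list of ten node addresses and a simple hash-modulo rule; real memcached clients each have their own distribution scheme.

```python
import hashlib

# Hypothetical pool: ten cache nodes with 4 GB each, as in the example above.
SERVERS = [f"cache{i:02d}:11211" for i in range(10)]
MEMORY_PER_SERVER_GB = 4


def server_for(key: str, servers=SERVERS) -> str:
    """Pick the node for a key from a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# Total pool size: 10 nodes * 4 GB = 40 GB.
pool_size_gb = len(SERVERS) * MEMORY_PER_SERVER_GB
```

With a plain modulo rule like this, adding a machine remaps most keys to a different node, which is why many clients use consistent hashing instead.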
However, memcached is not a recommended replacement for every cache in every situation:
1) Very large values are a poor fit. With the default build, memcached only supports values of up to 1 MB (the limit on key length is rarely the bigger problem). Even from a practical point of view, keeping very large data in memcached is not recommended, because every store and fetch involves serialization and deserialization, and the CPU this consumes should not be underestimated. On this point, I have always felt that memcached suits output-oriented content caching rather than processing-oriented data caching: it is a poor fit for large chunks of data that are taken out, processed, and put back, and a good fit for data that is taken out and sent straight to the output, or used directly without further processing.
2) Data that must never expire is a poor fit. By default a memcached expiration time can be at most 30 days, and memcached evicts the least recently used data once memory reaches its limit. So if we want to treat cached data like a static variable, there must be a process for re-initializing it. This is really the right way to think about a cache anyway: it is a place to keep things we have fetched, and if something is missing there has to be a way to fetch it again, rather than assuming it will exist forever.
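The two limits above can be checked on the client before anything is sent to the cache. The sketch below is an assumption-laden illustration: a plain dict stands in for memcached, pickle stands in for whatever serializer the real client uses, and the helper name store is made up for this example.

```python
import pickle

MAX_VALUE_BYTES = 1024 * 1024          # default 1 MB value limit (approximate)
MAX_RELATIVE_TTL = 60 * 60 * 24 * 30   # 30 days, in seconds


def store(cache: dict, key: str, value, ttl: int) -> bool:
    """Store the pickled value if it fits; report whether it was stored."""
    payload = pickle.dumps(value)      # serialization costs CPU on every store
    if len(payload) >= MAX_VALUE_BYTES:
        return False                   # too large: leave it out of the cache
    # Keep the relative expiration time within the 30-day ceiling.
    cache[key] = (payload, min(ttl, MAX_RELATIVE_TTL))
    return True


cache = {}
small_ok = store(cache, "page:home", "x" * 100, ttl=600)
large_ok = store(cache, "report:big", b"x" * (2 * 1024 * 1024), ttl=600)
```

Here small_ok is True and large_ok is False, so the caller knows the large value has to be served from its original source every time.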
There are, of course, some problems and best practices that come up when using memcached:
1) Clearing part of the data. Memcached is just a key/value pool, a bus that anyone can board. Shared resources like this easily run into trouble if everyone uses them by their own rules. It is therefore best to adopt something like a namespace convention for keys, so that every user clearly knows the key range, or prefix, of a given feature. The benefit is that when we need to clear data, we can find our own batch of keys by this convention and clear only those, instead of clearing everything. Some people use a version-bump approach instead: stop using the old keys, let them expire, and they are cleared naturally. That works too. Either way, a key convention always pays off, and it also makes statistics easier.
2) The organization of values. That is, the granularity of the data we store: for a list, for example, whether to store the whole list under one key or each item under its own key depends on the business. If the granularity is very fine, it is best to fetch in batches and to store in batches as well. The fewer calls across the network the better. Consider a page that outputs 100 rows of data: if every row needs its own fetch, a single page makes a hundred or more network calls, and performance will suffer.
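Both practices can be shown with a small stand-in. The DictCache class below is hypothetical and only imitates a cache client in memory, counting each call as one network round trip; make_key and the "orders" namespace are likewise made up for illustration.

```python
class DictCache:
    """In-memory stand-in for a memcached client (hypothetical interface)."""

    def __init__(self):
        self.data = {}
        self.calls = 0  # counts simulated network round trips

    def get(self, key):
        self.calls += 1
        return self.data.get(key)

    def get_many(self, keys):
        self.calls += 1
        return {k: self.data[k] for k in keys if k in self.data}

    def set_many(self, mapping):
        self.calls += 1
        self.data.update(mapping)


def make_key(namespace: str, version: int, item_id) -> str:
    """Build keys such as 'orders:v1:17' so each feature owns a prefix."""
    return f"{namespace}:v{version}:{item_id}"


cache = DictCache()
rows = {make_key("orders", 1, i): {"id": i} for i in range(100)}
cache.set_many(rows)                  # one round trip stores 100 rows
found = cache.get_many(list(rows))    # one round trip reads them back
round_trips = cache.calls             # 2, instead of 200 single-key calls

# Version bump: readers switch to v2 keys, so v1 entries are never read
# again and simply age out of the cache.
stale = cache.get(make_key("orders", 2, 0))   # miss, because v2 starts empty
```

The version number lives in the application, so clearing a namespace is a matter of changing one value rather than enumerating keys.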
So what is memcached mainly used for?
Anywhere we would think of using an in-memory cache, we can consider whether a distributed cache would fit. The main purpose, though, is to absorb read requests at the front end or the middle tier and so relieve pressure on the web servers, the app servers, and the DB.
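A minimal sketch of this read path, usually called cache-aside, looks like the following. Two dicts stand in for memcached and the database, and the class and key names are assumptions made for the example.

```python
class CachedReader:
    """Cache-aside reads: try the cache, fall back to the database on a miss."""

    def __init__(self, cache: dict, database: dict):
        self.cache = cache          # stand-in for memcached
        self.database = database    # stand-in for the real DB
        self.db_reads = 0           # how many reads reached the database

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value  # later reads are served from the cache
        return value


reader = CachedReader(cache={}, database={"page:home": "rendered home page"})
for _ in range(1000):
    reader.read("page:home")
db_reads = reader.db_reads   # only the first read reaches the database
```

Of the 1000 reads, a single one touches the database; the rest are absorbed by the cache.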
Let's talk about MongoDB.
MongoDB
MongoDB is one of the better non-relational databases, of the document-oriented kind. Its advantages are mainly the following:
1) Open source. This means that even if we never modify it, we can dig into how it works. With MS SQL, who knows how it works internally, other than by reading the documentation?
2) Free of charge. This means we can install a large number of instances on a large number of junk servers. Even if each one does not perform very well, the sheer number of nodes makes up for it.
3) High performance. Compared with MS SQL alone: in the same application (mainly write operations), MS SQL fell over at 500 users while MongoDB could support up to 2000. Once a table holds millions of rows, MS SQL's insert performance falls apart, even with no indexes. Everything is relative, of course: as a system grows more complete and more complex it sacrifices some performance. MS SQL offers very strong security and data integrity, which MongoDB cannot match.
4) Simple and flexible configuration. Setting up failover clustering and replication with read/write splitting is a common requirement in production. The steps for configuring this in MS SQL are dauntingly tedious, whereas MongoDB can have a failover group configured within five minutes, and read/write splitting takes only a minute. The flexibility shows in the topologies available: we can configure one master (M) and one slave (S), two M and one S (data written to the two M is merged onto S for reading), one M and two S (data written to the one M is mirrored on both S), or even multiple M and multiple S. In theory we could create 10 M and 10 S, pick the M to write to by round-robin, and rotate through the S when we need to read, although of course we cannot guarantee that all S hold consistent data at any given moment. We can also configure two M as a failover pair, set up two such pairs, and attach two S to them, that is, 4 M for 2 S, so that the M nodes have failover.
5) Flexible to use. In an earlier article I mentioned that SQL can even be converted into JS expressions so that MongoDB supports queries written as SQL statements. However you go about it, querying MongoDB is very convenient.
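The round-robin idea from point 4 can be modelled without any database at all. The sketch below is a toy: the class, method, and node names are hypothetical, and it only decides which node a request would go to. It does not talk to MongoDB or configure replication.

```python
from itertools import cycle


class RoundRobinRouter:
    """Toy model: writes rotate across masters, reads rotate across slaves."""

    def __init__(self, masters, slaves):
        self._masters = cycle(masters)
        self._slaves = cycle(slaves)

    def node_for_write(self):
        return next(self._masters)

    def node_for_read(self):
        return next(self._slaves)


router = RoundRobinRouter(masters=["m1", "m2"], slaves=["s1", "s2", "s3"])
writes = [router.node_for_write() for _ in range(4)]   # m1, m2, m1, m2
reads = [router.node_for_read() for _ in range(4)]     # s1, s2, s3, s1
```

Because consecutive reads land on different slaves, two reads made close together may see different data if replication has not caught up, which is the consistency caveat mentioned above.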
As I have said before, MongoDB is not a replacement for every database application. Its main drawbacks are:
1) The usual traits of open source software: fast updates and immature tooling. Because updates are frequent, our client has to be upgraded along with the server to get new features, and frequent updates also mean that some important functionality may be missing at any given stage. In addition, MS SQL provides very good GUI tools for maintaining the database from the developer, DBA, and administrator perspectives. MongoDB provides some tools, but they are not very friendly. Our DBAs may find optimizing MongoDB queries a frustrating job.
2) Transactions. At the time of writing, MongoDB has no built-in transaction support (the lack of built-in transactions does not mean transactional behavior is impossible altogether), so it is not suitable for some applications. For the majority of Internet applications, though, this is not a problem.
The following issues came up while using MongoDB:
1) True scale-out? With memcached we have experienced how pleasant this can be: we can add machines almost without limit to scale horizontally. Why? Because the client decides which instance a key is stored on, so on retrieval it knows exactly which instance to ask, and the same holds when fetching several keys at once. A database is different. There are many ways to shard, but leaving everything else aside, how do we handle a query that fetches a batch of data by some condition? Suppose we shard by user ID, but the query does not care about user ID; it filters on the user's age and education level and sorts by name. Where does the data come from? Both client-side and server-side sharding are very hard to get right, and even with automatic sharding, performance is not guaranteed. The simplest approach is to partition by function as far as possible; the next step is to split off historical data. Truly spreading live data across every node is still very hard.
2) Multi-threading and multi-process. When write speed falls short of expectations, we might open several threads at once, or run several MongoDB processes on the same machine (that is, multiple DB instances) and write to different instances. Will this improve performance? Unfortunately, the gain is very limited, arguably nonexistent. Why does multi-threading increase write speed with memcached? Because we have not reached the bottleneck of memory data exchange, whereas for a disk, an I/O bottleneck of a few tens of megabytes per second is very easy to hit, and once it is hit, no number of extra processes will improve performance. Fortunately MongoDB uses memory-mapped files, and seeing it use a lot of memory actually gives me more confidence in it (the more memory is used, the less likely the CPU is to sit idle). What I fear is a DB that uses hardly any memory: the I/O bottleneck is reached while memory and CPU are still underused.
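The sharding problem in point 1 is easier to see with a small model. In the sketch below, lists stand in for shards, and the user records, the modulo sharding rule, and the function names are all invented for illustration. A lookup by user ID touches one shard, while a query on age and education has to visit every shard and merge the sorted partial results.

```python
import heapq
from operator import itemgetter

NUM_SHARDS = 3


def shard_for(user_id: int) -> int:
    """Shard key is the user ID, so an ID lookup maps to exactly one shard."""
    return user_id % NUM_SHARDS


users = [
    {"id": 1, "name": "Chen", "age": 30, "education": "BSc"},
    {"id": 2, "name": "Adams", "age": 30, "education": "BSc"},
    {"id": 3, "name": "Baker", "age": 41, "education": "MSc"},
    {"id": 4, "name": "Diaz", "age": 30, "education": "BSc"},
]
shards = [[] for _ in range(NUM_SHARDS)]
for user in users:
    shards[shard_for(user["id"])].append(user)


def find_by_id(user_id):
    """Routed query: only the owning shard is consulted."""
    return [u for u in shards[shard_for(user_id)] if u["id"] == user_id]


def find_by_age_and_education(age, education):
    """Scatter-gather: query every shard, sort locally, then merge by name."""
    partials = [
        sorted(
            (u for u in shard if u["age"] == age and u["education"] == education),
            key=itemgetter("name"),
        )
        for shard in shards
    ]
    return list(heapq.merge(*partials, key=itemgetter("name")))


names = [u["name"] for u in find_by_age_and_education(30, "BSc")]
```

The second query costs one request per shard no matter how few rows match, which is why adding shards does not speed it up the way it does for key lookups.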
Using memcached and MongoDB together
In fact, with memcached and MongoDB we can move perhaps more than 80% of applications off the traditional relational database. I can imagine the two working together to compensate for each other's shortcomings:
Memcached is good at fetching a value by key, but what do we do when we do not know which keys to read? My idea is to use MongoDB, or another database, as the store of raw data, and to split that data into two parts: the fields to be queried (the indexed fields) and the ordinary data fields. The bulk of the non-query fields are stored in memcached at a fine granularity. At query time we first query the database to find out which records to fetch (a typical query page shows 20 to 100 items) and then fetch the data from memcached in one batch. In other words, read pressure on MongoDB falls mainly on the indexed fields, the data fields are read only when the cache misses, and memcached absorbs most of the actual data reads. Conversely, if we want to clear data from memcached, we know which keys to clear.
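One way to picture this division of labour is the sketch below. It is only a model under stated assumptions: the documents dict plays the MongoDB collection, the cache dict plays memcached, and the field names, key format, and function names are invented for the example.

```python
# Stand-ins: `documents` plays the MongoDB collection, `cache` plays memcached.
documents = {
    1: {"age": 30, "profile": "long profile text 1"},
    2: {"age": 25, "profile": "long profile text 2"},
    3: {"age": 30, "profile": "long profile text 3"},
}
cache = {}
stats = {"full_reads": 0}   # data-field reads that had to hit the database


def cache_key(doc_id) -> str:
    return f"user:data:{doc_id}"


def find_ids(age: int, limit: int = 20):
    """Step 1: query only the indexed field and return the matching IDs."""
    return [i for i, d in sorted(documents.items()) if d["age"] == age][:limit]


def load_page(age: int):
    """Step 2: read data fields from the cache, filling any misses.

    A real client would fetch all the keys with a single multi-get.
    """
    page = {}
    for doc_id in find_ids(age):
        key = cache_key(doc_id)
        if key not in cache:
            stats["full_reads"] += 1
            cache[key] = documents[doc_id]["profile"]
        page[doc_id] = cache[key]
    return page


def invalidate(doc_id):
    """The key is derived from the ID, so we know exactly what to clear."""
    cache.pop(cache_key(doc_id), None)


first = load_page(30)    # cold cache: data fields come from the database
second = load_page(30)   # warm cache: only the ID query touches the database
```

After both calls stats["full_reads"] is 2, one per matching record, because the second page is served entirely from the cache.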
