A few days ago an article appeared on Hacker News with a very blunt title: "Don't use MongoDB". Its content expressed dissatisfaction with MongoDB just as directly, with the author listing the various problems he had run into while using it. It even questioned the development team, saying they may care only about benchmark numbers and not about the safety of user data. Quite an accusation!
Update: the author of the article has since admitted that it was a prank, saying he only wanted to run an experiment showing how easy it is to steer people's thinking. Still, the cases he mentioned are not entirely baseless. A prank like this did fool us, but it may make some blind followers more cautious, and that is a good thing.
Soon after the article appeared, 10gen CTO @ehwizard saw it and responded to each of the author's points. ehwizard said he had gone through 1,600 user case reports and found none matching the problems the author described (in effect questioning the veracity of the accusations: which outfit are you from, anyway?). ehwizard then added, in a friendly tone, that anyone who runs into problems with MongoDB can always report them on the MongoDB Google Group or the MongoDB IRC channel.
MongoDB is red-hot right now, and an article like this has surely poured cold water on some readers. So NoSQLFan lays out both sides of the fight here. You can judge for yourself, or even run your own experiments; when adopting NoSQL or any other new technology, it pays to understand the problems you might run into.
Below, the text in green is the original author's accusations and questions about MongoDB, the text in red is NoSQLFan's idle commentary, and the rest is the response from 10gen CTO ehwizard.
1. To look good in benchmarks, MongoDB ships with an unsafe configuration as its default. (In short, they are being called unscrupulous profiteers.)
ehwizard's reply: dude, you're off the mark. MongoDB's choice of defaults has nothing to do with benchmarks, and neither do the API design or the other functional trade-offs in MongoDB. Of course, the defaults should fit users' main use cases. The way MongoDB is used has changed a lot, and it is indeed possible that the defaults will be adjusted to reflect those changes.
Put another way, MongoDB's write safety is configurable. For example, you can choose the safety level of a write operation: when using replica sets, you can require that a write be replicated to a given number of machines before it is reported as successful. (A slap for the author: do you really not understand this, or are you just pretending?)
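As a rough illustration of requiring a number of machines to acknowledge a write, here is a toy model in Python. It is not MongoDB code: `Node`, `write`, and the `reachable` flag are hypothetical names invented for this sketch, and only the parameter name `w` echoes the setting ehwizard mentions later in the article.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A stand-in for one machine in a replica set (hypothetical)."""
    name: str
    reachable: bool = True
    data: list = field(default_factory=list)

    def apply(self, doc):
        # An unreachable node cannot acknowledge the write.
        if not self.reachable:
            return False
        self.data.append(doc)
        return True


def write(nodes, doc, w=1):
    """Send doc to every node; report success only if at least w acknowledged.

    Note that nodes which did acknowledge keep the document even when the
    overall result is False: the caller only learns how safe the write is.
    """
    acks = sum(1 for node in nodes if node.apply(doc))
    return acks >= w


nodes = [Node("primary"), Node("secondary", reachable=False)]
print(write(nodes, {"x": 1}, w=1))  # one acknowledgement is enough
print(write(nodes, {"x": 2}, w=2))  # second acknowledgement never arrives
```

With a low `w` the call returns quickly but says little about replication; with a higher `w` the caller trades latency for a stronger guarantee.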
2. MongoDB loses data badly, and in many different situations
2.1 MongoDB often loses data in strange ways
ehwizard's response: we have received bug reports about lost data, but as far as we know, every such bug was fixed almost immediately after it was reported. If you can describe the circumstances in which you lost data, we will try to find the cause. If you really do have a data-loss problem, please contact 10gen's engineers right away so the bug can be fixed, ehwizard said. (Brother, if there is a problem, go to the organization; there is no shame in it.)
2.2 Without journaling, data cannot be recovered if MongoDB crashes
ehwizard explained that this is expected behavior. Running a standalone MongoDB without journaling is a dangerous practice that is not recommended, and as of version 2.0 journaling is enabled by default. With replica sets, no data recovery is needed: you only have to resync the data from another node.
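To see why a journal matters after a crash, consider this toy sketch in Python. It is a deliberate simplification and not how MongoDB is implemented: `ToyStore` and its methods are hypothetical, the `journal` list stands in for an on-disk log, and the sketch ignores the periodic flushing of data files that a real server performs.

```python
class ToyStore:
    """A toy key-value store with an optional journal (hypothetical)."""

    def __init__(self, journaling):
        self.journaling = journaling
        self.journal = []  # stands in for the durable on-disk journal
        self.memory = {}   # stands in for state not yet flushed to disk

    def put(self, key, value):
        if self.journaling:
            # Record the operation before applying it.
            self.journal.append((key, value))
        self.memory[key] = value

    def crash_and_restart(self):
        # The crash wipes unflushed state; recovery replays the journal.
        self.memory = {}
        for key, value in self.journal:
            self.memory[key] = value


with_journal = ToyStore(journaling=True)
with_journal.put("a", 1)
with_journal.crash_and_restart()

without_journal = ToyStore(journaling=False)
without_journal.put("a", 1)
without_journal.crash_and_restart()

print(with_journal.memory, without_journal.memory)
```

The store with a journal can rebuild its state after the restart, while the one without has nothing to replay.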
2.3 Master-slave replication is broken: operations get lost, and there is no consistency check between master and slave. Even when data is missing, the status still shows the nodes as in sync.
ehwizard says this should not happen, and if it does, it is a serious bug.
2.4 Master-slave replication sometimes stops for no apparent reason, without reporting any error
ehwizard said this can happen: an error does occur, but the error message is not returned to the client. Because replication itself is asynchronous, if you want the data to be replicated before the call returns, you can use the getLastError command with the w parameter set to 2.
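For readers who have not seen it, the getLastError command with w set to 2 is an ordinary command document. The Python sketch below only builds that document; the helper name `get_last_error_command` and the 5000 ms timeout are assumptions made for illustration, and the command applies to MongoDB versions of that era (later releases replaced it with per-operation write concerns).

```python
def get_last_error_command(w=2, wtimeout_ms=5000):
    """Build a getLastError command document (hypothetical helper).

    w        -- number of replica set members that must have the write
    wtimeout -- milliseconds to wait before giving up, so the call
                does not block forever if a member is down
    """
    # The command name goes first; Python dicts preserve insertion order.
    return {"getlasterror": 1, "w": w, "wtimeout": wtimeout_ms}


print(get_last_error_command())
```

A driver would send this document to the server on the same connection right after the write, and the reply would indicate whether replication to the requested number of members succeeded within the timeout.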
3. MongoDB uses a global write lock for write operations, which is inefficient
ehwizard admits this is indeed a long-standing problem for MongoDB, but version 2.0 brought considerable improvements: write operations that involve disk I/O have been optimized. Version 2.2 will take this optimization further. (Dude, when do we get collection-level locks?)
4. Under heavy load, MongoDB's auto-sharding runs into trouble, and adding a shard node under heavy load is a nightmare. At that point the chunk moves themselves affect the service, so the only option is not to move chunks at all.
ehwizard explains that if the system really has hit its limit, moving chunks is not easy. He has said on many occasions that his advice is to monitor the cluster and act early, rather than waiting until the system is at 100% load to add a node. (Keep an eye on your business growth, and don't get caught out the way 4sq did.)
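The advice to add capacity before the cluster is saturated can be sketched as a simple check over recent load samples. Everything in this Python snippet is an assumption for illustration: the function name `should_add_shard`, the 70% threshold, and the three-sample window are not taken from MongoDB or from ehwizard's reply.

```python
def should_add_shard(load_samples, threshold=0.7, window=3):
    """Return True when the last `window` samples all exceed `threshold`.

    load_samples -- utilisation readings between 0.0 and 1.0, oldest first
    """
    if len(load_samples) < window:
        return False  # not enough history to decide
    return all(sample > threshold for sample in load_samples[-window:])


print(should_add_shard([0.50, 0.75, 0.80, 0.90]))
print(should_add_shard([0.90, 0.50, 0.80, 0.90]))
```

Requiring several consecutive high readings avoids reacting to a single spike, while a threshold well below 100% leaves headroom for the chunk moves that follow adding a node.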
5. mongos is very unreliable. The mongod/config server/mongos architecture looks beautiful, but mongos simply does not deliver. Under even slightly higher load, mongos crashes often: every few days at best, every few hours at worst. Sometimes an assertion is thrown and a key thread is killed while the process keeps running, so having a supervisor process restart it does not always help.
ehwizard said he did not know which "key threads" the author meant, and asked for more details.
6. MongoDB once had a problem that caused all data to be deleted. It occurred with replica sets in MongoDB 1.6: because of a flaw in the election strategy, a node with no data was elected as the new primary, so the nodes that did have data deleted their own, and our 700 GB of data was lost. Fortunately, the problem was fixed in version 1.8.
ehwizard said he had reviewed the relevant reports and found nothing matching this description, and asked for more details.
7. 10gen released things that were not ready for release. As far as we know, some stable versions contain bugs that cause data problems, and we usually discover them only when we run into them. We bought 10gen's Platinum service, but all we got were hot patches from what they called an internal RC version, which we had to apply to our production version. Oh, my God!
ehwizard says 10gen has no Platinum contracts, and all issues are reported through the public Jira system. Everything from the initial report to the fix is on Jira (more transparent than officials' assets). If you can't provide more information, this is really hard to discuss. Our usual practice is to notify the affected users as soon as possible after fixing a problem.
8. On heavily loaded machines, replication works very poorly
ehwizard's view is that the load is probably too high. As he said before, replication is asynchronous by default; if you want to confirm that a write has been replicated, you can use the getLastError command with the w parameter set to 2.
Some of the problems above may have been fixed, but I want to say that a company should put the reliability of its service first. I think 10gen should prioritize MongoDB feature development as follows:
1. Do not lose data; be very careful with data
2. Do more testing to ensure reliability
3. Deliver real multi-node scalability
4. After that, lower latency
5. Improve request performance relative to the resources used
In my opinion, 10gen probably cares only about item 5, and item 1 likely does not even make their list.
At this, ehwizard objected (this is, after all, an attack on their integrity). He said 10gen is not what the author makes it out to be: you can look at our list of bug fixes, all of which are public. We have never quietly fixed a bug, or discussed bugs with only a few special users. If we really cared only about read and write performance, we would have fixed the problem of wasted CPU long ago. If we really cared so much about benchmarks, we would already have optimized the global lock, since that would greatly improve multithreaded benchmark results, and benchmarks are generally run multithreaded. We do not care about benchmark numbers. (My benchmarks are already very good anyway.)
MongoDB really is new and has plenty of problems. If you want to discuss MongoDB issues with us, our office is open to you and we will welcome your questions. If you run into a problem, we look forward to hearing from you.