A few days ago an article with a very strong title, "Don't use MongoDB", appeared on Hacker News. Its content was just as blunt in its dissatisfaction with MongoDB: the author listed the various problems he had run into while using it, and even questioned the development team, suggesting they care only about benchmark numbers and not about the safety of user data. What a rip-off, he cried!
Latest news: the author of that article has since admitted that it was a prank, saying he only wanted to run an experiment showing how easy it is to steer people's opinions. Still, the cases he mentioned are not entirely baseless, and a prank article like this, although it fooled us, may make people who follow trends blindly a little more cautious. That is no bad thing.
Before long, the 10gen CTO @ehwizard saw the article and immediately responded to the author's points. Ehwizard said he had gone through 1,600 user case reports and found none matching the problems the author described (and he effectively questioned whether the accusations were genuine: which outfit are you from, exactly?). Ehwizard then added, in a friendly tone, that if you run into problems with MongoDB you can always report them on MongoDB's Google Group or on the MongoDB IRC channel.
With MongoDB being hyped as hotly as it is today, an article like this certainly poured cold water on some readers. So Nosqlfan lays out both sides of the face-off here. You can judge for yourself, or even run your own experiments; when adopting NoSQL or any other new technology, it pays to know about the problems that may come up.
Below, the numbered items are the original author's accusations and doubts about MongoDB (green in the original post), the remarks in parentheses are Nosqlfan's idle commentary (red in the original post), and the rest is the response from 10gen CTO ehwizard.
1. To look good in benchmarks, MongoDB does not hesitate to ship an unsafe configuration as its default. (He all but yells "unscrupulous profiteers!")
Ehwizard said: "Dude, you are going a bit too far. The choice of MongoDB's defaults has nothing to do with benchmarks, and that goes not only for the defaults but also for the API design and MongoDB's other features. None of it has the slightest connection to benchmarks." Of course, the default configuration does have to match users' main usage scenarios. The way MongoDB is used has changed a lot over time, so adjusting the default policy to follow those changes is certainly possible.
Besides, MongoDB's write behavior is itself configurable. For example, you can choose the safety level of a write operation, and when you use replica sets you can require a write to be replicated to a given number of machines before it returns success. (A slap in the face for the author: do you really not know this, or can you just not understand it?)
2. MongoDB loses data badly, and in many different ways
2.1 MongoDB often loses data for no apparent reason
Ehwizard's response: we have received bug reports about data loss, but we know MongoDB very well, and every such bug was fixed at the first opportunity after it was reported. If you can describe the usage scenario in which you lost data, we will try to find out why. Ehwizard says that if you really do have a data loss problem, please contact a 10gen engineer right away so the bug can be fixed. (Brother, if you have a problem, go find the organization; there is no shame in that.)
2.2 If MongoDB crashes while journaling is turned off, the data cannot be recovered
Ehwizard explained that this is expected. For a standalone MongoDB instance, running without the journal is a dangerous practice that is not recommended, and since version 2.0 journaling has been on by default. With replica sets and similar setups you do not need crash recovery at all; you only need to resync the data from another member.
2.3 Master-slave replication is broken: operations get lost, and there is no consistency check between master and slave. Even after data has been lost, replication still reports a normal, in-sync state.
Ehwizard says this should not happen, and if it does, it is a serious bug.
2.4 Master-slave replication sometimes stops for no apparent reason, without reporting any error
Ehwizard said this could happen: an error may have occurred along the way without the error message being returned to the client. Replication itself is asynchronous, so if you want a write to return only after it has been replicated, you can set the w parameter to 2 with the getLastError command.
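As a rough illustration of the w=2 suggestion above, the following Python sketch builds the command document a client would send after a write. The helper name `build_get_last_error` and the 5000 ms timeout are assumptions made for this example; only the `getlasterror`, `w`, `wtimeout`, and `j` fields come from the getLastError command itself. The sketch only builds the document and does not talk to a server.

```python
from collections import OrderedDict


def build_get_last_error(w=1, wtimeout_ms=None, journal=False):
    """Build a getLastError command document (hypothetical helper).

    w           -- number of replica set members that must have the write
    wtimeout_ms -- optional limit on how long to wait for replication
    journal     -- if True, also wait for the journal commit ("j")
    """
    if isinstance(w, int) and w < 1:
        raise ValueError("w must be at least 1 to wait for an acknowledgement")
    # The command name has to be the first key, so keep insertion order.
    cmd = OrderedDict([("getlasterror", 1), ("w", w)])
    if wtimeout_ms is not None:
        cmd["wtimeout"] = wtimeout_ms
    if journal:
        cmd["j"] = True
    return cmd


# Wait until the write has reached two members, but give up after 5 seconds.
cmd = build_get_last_error(w=2, wtimeout_ms=5000)
print(dict(cmd))
```

Setting a timeout alongside w is a deliberate choice in this sketch: without it, a client waiting for a second member that is down could block indefinitely.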
3. MongoDB uses a global write lock for write operations, which is inefficient
Here ehwizard admits that this is something MongoDB has long been criticized for, but says considerable improvements were made in version 2.0, where write operations that involve disk I/O were optimized. In version 2.2 this optimization will be taken further. (Dude, when is the collection-level lock coming?)
4. Under heavy load, MongoDB's auto-sharding runs into trouble, and adding a shard while the system is heavily loaded is an absolute nightmare. That is exactly when MongoDB has to move chunks, which either hurts the service or means the chunks cannot be moved at all.
Ehwizard explained that if the system really has hit its limit, moving chunks is indeed not easy. He has said on many occasions that his advice is to start monitoring the cluster early, and not to wait until the system is at 100% load before adding nodes. (Pay a little more attention, and don't get burned the way 4sq did.)
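The advice to add nodes well before the cluster is saturated can be sketched as a simple headroom check. Everything here is an assumption for illustration: the function name `needs_more_shards`, the 70% threshold, and the idea of feeding it load readings between 0 and 1 from whatever monitoring is already in place. It is not a MongoDB API.

```python
def needs_more_shards(load_samples, threshold=0.7, min_samples=3):
    """Return True when the most recent readings all exceed the threshold.

    load_samples -- load readings between 0.0 and 1.0, oldest first
    threshold    -- assumed level at which to start adding capacity
    min_samples  -- how many consecutive recent readings must be high
    """
    if len(load_samples) < min_samples:
        return False  # not enough data to call it a sustained trend
    recent = load_samples[-min_samples:]
    return all(sample > threshold for sample in recent)


# Sustained load above 70% while there is still room to move chunks.
print(needs_more_shards([0.5, 0.75, 0.8, 0.9]))
```

Requiring several consecutive high readings keeps a single spike from triggering an expansion, while still reacting long before the system reaches 100%.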
5. mongos is very unreliable. The mongod/config server/mongos architecture looks beautiful, but mongos does not hold up. Under even slightly higher load mongos crashes frequently, every few days at best and every few hours at worst. Sometimes an assertion is thrown and a key thread is killed while the process keeps running, so restarting it from a process manager does not always work.
Ehwizard said he did not know what the author meant by a key thread, and asked for more details.
6. MongoDB once had a problem that caused all of our data to be deleted. It happened on a replica set running MongoDB 1.6: because of a flaw in the election strategy, a node with no data was elected as the new primary, so the nodes that did have data deleted their own copies, and our 700 GB of data was gone. Fortunately, this was fixed in version 1.8.
Ehwizard said he had looked through the relevant reports and found no such issue, and asked for more details.
7. 10gen has shipped things that were not ready to be released. As far as we know, some stable versions contain bugs that cause data problems, and we usually discover them only when we hit them ourselves. We bought 10gen's Platinum support, but all we got were hot fixes that they called internal RC builds, which we then had to patch onto our production version. Oh, my God!
Ehwizard says 10gen has no Platinum contracts, and that all issues are reported through the public Jira system. Everything from the report to the fix is published on Jira (more transparent than certain officials' assets). Without more information, this one is hard to discuss further. Our usual practice is to notify the affected users as soon as an issue is fixed.
8. On heavily loaded machines, replication works very poorly
It sounds as though the load was simply too high. As mentioned above, replication is asynchronous by default, and if you want confirmation that a write has been replicated, you can set the w parameter to 2 with the getLastError command.
The problems above may already have been fixed, but I would still say that, for a company, reliability of the service should come first. I think 10gen should develop MongoDB according to the following priorities:
- 1. Do not lose data; be extremely careful with data
- 2. Do more testing to ensure reliability
- 3. Real multi-node scalability
- 4. Keep latency low
- 5. Improve request throughput per resource
In my opinion, 10gen may care only about item 5, while item 1 probably does not even make their top three.
At this point ehwizard lost his cool (come on, now he is being attacked on moral grounds!). He said 10gen is not what the author makes it out to be: you can look at our list of bug fixes, which is all public. We have never quietly fixed a bug in secret, or disclosed bugs only to a few special users. If all we cared about were read and write performance, we would have fixed the problem of wasted CPU long ago. If all we cared about were benchmarks, we would have optimized the global lock long ago, since doing so gives a big boost to multithreaded benchmark results, and benchmarks generally run multithreaded. We simply do not care that much about benchmark numbers. (My benchmarks were already damn good, thank you.)
MongoDB really is new, and it does have plenty of problems. If you want to discuss MongoDB issues with us, our office is open to you and we will listen with a very open mind, so if you have a problem, we look forward to hearing from you.
Source: news.ycombinator.com (parts of this article are also translated from the hacker log)
MongoDB or not? A user vs. the 10gen CTO