Sharding Pitfalls: Part I

By Adam Comerford, Senior Solutions Engineer

Sharding is a popular feature in MongoDB, primarily used for distributing data across clusters for horizontal scaling. The benefits of sharding for scalability are well known, and often one of the major factors in choosing MongoDB in the first place, but as you add complexity to a distributed system, you increase the chances of hitting a problem.

The good news is that many of the common issues people encounter when moving to a sharded environment are avoidable, and most of them can be mitigated if you have already hit them.

Forewarned is forearmed, and with that in mind, we want users to be aware of best practices and situations to avoid when introducing sharding into their environment. In this three-part blog series we will discuss several pitfalls and gotchas that we have seen occur with some regularity among MongoDB users. We'll give an overview of each problem, how it occurs, and how to avoid it, and then discuss some possible mitigation strategies to employ if you have already run into it.

It should be noted that some of these topics are worthy of full technical articles in their own right, which is beyond the scope of a relatively short blog post. Think of these posts as a good starting point and, if you have not yet hit any of these problems, an informative cautionary tale for anyone running a sharded MongoDB cluster. For additional details, please view the Sharding section in the MongoDB Manual.

Many of these topics are also covered as part of the M102 (MongoDB for DBAs) and M202 (Advanced Deployment and Operations) classes that are available for free on MongoDB University.

For our first set of cautionary tales we will focus on shard keys.

1. Using a monotonically increasing shard key (like ObjectID)

Although this is one of the most commonly covered topics on blogs, training material, MongoDB Days and more, the selection of a shard key remains a formidable exercise for the novice MongoDB DBA or developer.

The most common mistake we see is the selection of a monotonically increasing shard key, which is a fancy way of saying the shard key value for new documents only increases, when using range-based sharding rather than hashed sharding. Examples of this would be a timestamp (naturally) or anything that has a time component as its most significant component, like ObjectID (the first 4 bytes are a timestamp).
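To make the time component concrete, here is a minimal sketch that decodes the leading 4 bytes of an ObjectID, given in its 24-character hex form, as seconds since the Unix epoch. The helper name and the two sample values are made up for illustration.

```python
from datetime import datetime, timezone

def objectid_timestamp(oid_hex: str) -> datetime:
    """Return the creation time encoded in the first 4 bytes of an ObjectID."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectID is 12 bytes, i.e. 24 hex characters")
    seconds = int(oid_hex[:8], 16)  # first 4 bytes = first 8 hex characters
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# Hypothetical ObjectIDs created one second apart; the trailing bytes are arbitrary.
older = "507f1f77bcf86cd799439011"
newer = "507f1f78bcf86cd799439012"

print(objectid_timestamp(older))
print(objectid_timestamp(newer))
# The timestamp is the most significant part, so the later ID sorts higher.
print(older < newer)
```

Because the timestamp leads, any ObjectID generated later sorts after every earlier one at one-second granularity, which is what makes it monotonically increasing as a shard key.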

Why is it a bad idea?

The short answer is insert scalability. If you select such a shard key, all inserts (new documents) will go to a single chunk, the highest range chunk, and that will never change. Hence, regardless of how many shards you add, your maximum write capacity will never increase: you will only ever write new documents to a single chunk, and that chunk will only ever live on a single shard.
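The effect can be illustrated with a small simulation. This is only a sketch of range-based routing, not MongoDB's implementation: the split points, shard names, and chunk placement are invented, and chunk splits and migrations are ignored.

```python
import bisect
from collections import Counter

# Hypothetical split points dividing the key space into four range chunks:
# (-inf, 100), [100, 200), [200, 300), [300, +inf)
SPLIT_POINTS = [100, 200, 300]
# Hypothetical placement of one chunk per shard.
CHUNK_TO_SHARD = {0: "shardA", 1: "shardB", 2: "shardC", 3: "shardD"}

def chunk_for(key: int) -> int:
    """Index of the range chunk that owns this shard key value."""
    return bisect.bisect_right(SPLIT_POINTS, key)

# New documents get ever-increasing keys, all above the last split point.
new_keys = range(301, 1301)
writes_per_shard = Counter(CHUNK_TO_SHARD[chunk_for(k)] for k in new_keys)
print(writes_per_shard)  # Counter({'shardD': 1000})
```

Every one of the 1,000 simulated inserts lands on the shard holding the highest range chunk, while the other three shards receive no writes at all.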

Occasionally, this type of shard key can be the correct choice, but if so, you must accept that you will not be able to scale write capacity.

Possible Mitigation Strategies

  • Change the shard key. This is problematic with large collections, because the data essentially has to be dumped out and re-imported.

  • More specifically, use a hash based shard key, which will allow the use of the same field while providing good write scalability.
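As a rough illustration of why hashing helps, the sketch below routes the same kind of increasing keys by a 64-bit hash of the value instead of the value itself. SHA-256 is used purely as a stand-in hash for the demonstration, and the chunk count is hypothetical; neither is meant to describe what the server actually does.

```python
import hashlib
from collections import Counter

NUM_CHUNKS = 4  # hypothetical number of equally sized hash ranges

def hashed_chunk(key: int) -> int:
    """Map a key to a chunk by hashing it first (stand-in hash, see above)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    hashed = int.from_bytes(digest[:8], "big")  # 64-bit unsigned value
    return (hashed * NUM_CHUNKS) >> 64          # which slice of the hash space

# The same monotonically increasing keys as a timestamp-like field would produce.
keys = range(301, 1301)
writes_per_chunk = Counter(hashed_chunk(k) for k in keys)
for chunk in sorted(writes_per_chunk):
    print(chunk, writes_per_chunk[chunk])
```

Consecutive key values now fall into unrelated chunks, so inserts spread across all of them rather than piling onto the highest range.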

2. Trying to Change Value of the Shard Key

Shard keys are immutable (cannot be changed) for an existing document. This issue usually only crops up when sharding a previously unsharded collection. Prior to sharding, certain updates will be possible that are no longer possible after the collection has been sharded.

Attempting to update the shard key for an existing document will fail with the following error:

cannot modify shard key's value fieldid for collection: foo.foo

Possible Mitigation Strategies

  • Delete and re-insert the document to alter the shard key rather than attempting to update it in place. It should be noted that this will not be an atomic operation, so it must be done with caution.
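The ordering and the non-atomic gap can be sketched with an in-memory stand-in for a collection. Everything here is hypothetical (the dict-based collection, the user_id field, the helper name); a real deployment would issue a remove and an insert through its driver, and other clients could observe the document missing between the two steps.

```python
from typing import Any, Dict

# In-memory stand-in for a sharded collection, keyed by shard key value.
collection: Dict[int, Dict[str, Any]] = {
    42: {"user_id": 42, "name": "example"},
}

def change_shard_key(coll: Dict[int, Dict[str, Any]], old_key: int, new_key: int) -> None:
    """Move a document to a new shard key value by delete and re-insert."""
    if new_key in coll:
        raise KeyError(f"a document with shard key {new_key} already exists")
    doc = dict(coll[old_key])  # keep a copy before deleting anything
    del coll[old_key]          # step 1: delete
    # Not atomic: between the two steps the document is absent, and a
    # failure here would lose it unless the saved copy is restored.
    try:
        doc["user_id"] = new_key
        coll[new_key] = doc    # step 2: re-insert under the new key
    except Exception:
        doc["user_id"] = old_key
        coll[old_key] = doc    # best-effort restore of the original
        raise

change_shard_key(collection, 42, 4242)
print(collection)  # {4242: {'user_id': 4242, 'name': 'example'}}
```

Keeping a copy of the document before deleting it is the cautious part: if the re-insert fails, the original can at least be put back.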

Now you have a better understanding of how to choose and change your shard key if needed. In our next post, we will go through some potential obstacles you will face when scaling your sharded environment.

If you want more insight on scaling techniques for MongoDB, view the slides and video from our recent webinar on how to achieve scale with MongoDB, which reviews three different ways to achieve scale with MongoDB.

Read Part II in the series and see what to look out for when running a sharded cluster