Original: 6 Rules of Thumb for MongoDB Schema Design: Part 3
By William Zola, Lead Technical Support Engineer at MongoDB
This is the last article in the series. In the first article, I introduced the three basic ways to model a "one-to-many" relationship. In the second article, I covered extensions to those basics: two-way referencing and denormalization.
Denormalization lets you avoid some application-level joins, but it also makes updates more complex and expensive. It is worthwhile for fields that are read much more often than they are updated.
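As a rough illustration of that trade-off, the sketch below uses plain Python dictionaries as stand-ins for two collections. The collection names (`parts`, `products`), the field names, and the helpers `part_names` and `rename_part` are all hypothetical and only meant to show where the cost moves: reads need no second lookup, while one logical update turns into several separate writes.

```python
# In-memory stand-ins for two collections; all names here are hypothetical.
parts = {
    "p1": {"_id": "p1", "name": "#4 grommet", "price": 3.99},
    "p2": {"_id": "p2", "name": "#8 washer", "price": 0.50},
}
products = {
    "prod1": {
        "_id": "prod1",
        "name": "smoke shifter",
        # Each entry keeps a redundant copy of the part name next to its id.
        "parts": [
            {"id": "p1", "name": "#4 grommet"},
            {"id": "p2", "name": "#8 washer"},
        ],
    },
}


def part_names(product_id):
    """Read path: names come straight from the product, no second lookup."""
    return [entry["name"] for entry in products[product_id]["parts"]]


def rename_part(part_id, new_name):
    """Write path: update the canonical document, then every redundant copy.

    These are separate writes, so a reader could briefly see the old name
    in a product after the part itself has changed.
    Returns the number of documents written.
    """
    parts[part_id]["name"] = new_name
    writes = 1
    for product in products.values():
        touched = False
        for entry in product["parts"]:
            if entry["id"] == part_id:
                entry["name"] = new_name
                touched = True
        if touched:
            writes += 1
    return writes
```

With many products sharing a part, `rename_part` would touch every one of them, which is why the read/update ratio matters so much.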
If you haven't read the first two articles, it is worth reading them first.
Let's review the options:
You can embed, reference from the "one" side, reference from the "N" side, or combine these techniques.
You can denormalize as many fields as you like into the "one" side or the "N" side.
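To make the three basic shapes concrete, here is a minimal sketch using plain Python dictionaries in place of MongoDB documents. All document contents, field names, and helper functions are illustrative assumptions, not taken from a real schema.

```python
# 1. Embedding: the "N" side lives inside the parent document.
person = {
    "_id": "person1",
    "name": "Kate",
    "addresses": [
        {"street": "123 Sesame St", "city": "Anytown"},
        {"street": "123 Avenue Q", "city": "New York"},
    ],
}

# 2. Reference from the "one" side: the parent holds an array of child ids.
product = {"_id": "prod1", "name": "smoke shifter", "parts": ["p1", "p2"]}
parts_by_id = {
    "p1": {"_id": "p1", "name": "#4 grommet"},
    "p2": {"_id": "p2", "name": "#8 washer"},
}

# 3. Reference from the "N" side: each child points back at its parent.
host = {"_id": "host1", "name": "goofy.example.com"}
log_messages = [
    {"_id": "m1", "host": "host1", "message": "cpu is on fire"},
    {"_id": "m2", "host": "host1", "message": "disk is full"},
    {"_id": "m3", "host": "host2", "message": "all quiet"},
]


def addresses_of(person_doc):
    """Embedded children arrive with the parent; no extra lookup."""
    return person_doc["addresses"]


def parts_of(product_doc, parts):
    """Child references need a second lookup by id."""
    return [parts[part_id] for part_id in product_doc["parts"]]


def messages_for(host_id, messages):
    """Parent references are resolved by filtering the "N" side."""
    return [m for m in messages if m["host"] == host_id]
```

The three helpers show how the "N" side is reached in each shape: directly, through a second lookup by id, or by filtering on the parent's id.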
Here are the rules of thumb to remember:
1. Favor embedding unless there is a compelling reason not to.
2. Needing to access an object on its own is a compelling reason not to embed it.
3. Arrays should not grow without bound. If there are more than a few hundred documents on the "many" side, do not embed them; if there are more than a few thousand, do not use an array of ObjectId references. Which approach to take depends on the size of the array.
4. Do not be afraid of application-level joins: if the indexes are built correctly and the results are limited with a projection (covered in Part 2), an application-level join is barely more expensive than a server-side join in a relational database.
5. Consider the read/write ratio when denormalizing. A field that is mostly read and rarely updated is a good candidate for denormalization into other objects.
6. How you model your data in MongoDB depends on how your application accesses it. Structure the data to match the ways your application reads and writes it.
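Rule 4 can be sketched as two lookups glued together in application code. The example below simulates this with in-memory dictionaries: the dictionary key plays the role of the `_id` index, and the `fields` argument plays the role of a projection. The helper names `find_by_ids` and `product_with_parts` and the sample data are assumptions for illustration; against a real database the second step would typically be a single query using `$in` on `_id` with a projection.

```python
# Hypothetical in-memory collections keyed by _id.
products = {
    "prod1": {"_id": "prod1", "name": "smoke shifter", "parts": ["p1", "p2"]},
}
parts = {
    "p1": {"_id": "p1", "name": "#4 grommet", "qty": 94, "price": 3.99},
    "p2": {"_id": "p2", "name": "#8 washer", "qty": 12, "price": 0.50},
}


def find_by_ids(collection, ids, fields):
    """Fetch documents by id, returning only the requested fields."""
    results = []
    for doc_id in ids:
        doc = collection.get(doc_id)  # keyed lookup stands in for the _id index
        if doc is not None:
            results.append({field: doc[field] for field in fields})
    return results


def product_with_parts(product_id):
    """Application-level join: one lookup for the parent, one for its children."""
    product = products[product_id]                              # lookup 1
    part_docs = find_by_ids(parts, product["parts"], ["name"])  # lookup 2
    return {
        "name": product["name"],
        "parts": [part["name"] for part in part_docs],
    }
```

Because the second lookup goes through an index and returns only the `name` field, the join costs two small round trips rather than a scan of the whole collection.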
Design Guide
When you model a "one-to-many" relationship in MongoDB, you have many options to choose from, so you have to think carefully about the structure of your data. The main questions to consider are:
What is the cardinality of the relationship: is it one-to-few, one-to-many, or one-to-very-many?
Do you need to access the objects on the "N" side on their own, or only in the context of the parent object?
What is the ratio of reads to updates for the redundant fields?
Data Modeling Design Guide
For a one-to-few relationship, you can embed an array in the parent document.
For a one-to-many relationship, or when the "N"-side objects need to be accessed on their own, use an array of ObjectId references. You can also use a parent reference on the "N" side if it speeds up your access pattern.
For a one-to-very-many relationship, use a parent reference on the "N" side.
If you introduce denormalization into your design, make sure that the redundant data is read much more frequently than it is updated, and that you do not need strong consistency, because denormalization makes updates to the redundant fields slower and non-atomic.
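The guidelines above can be read as a small decision procedure. The sketch below is one possible encoding, not a definitive rule: the function names are hypothetical, and the numeric thresholds (`embed_limit`, `ref_array_limit`, `min_ratio`) are illustrative defaults chosen only to reflect "hundreds", "thousands", and "read much more often than updated". Tune them for your own workload.

```python
def choose_model(n_children, standalone_access,
                 embed_limit=200, ref_array_limit=2000):
    """Pick a modeling approach from cardinality and access pattern.

    The two limits are assumed example values, not official numbers.
    """
    if n_children > ref_array_limit:
        # One-to-very-many: even an array of ids would grow too large.
        return "parent-reference"
    if n_children > embed_limit or standalone_access:
        # One-to-many, or children that must be fetched on their own.
        return "child-reference-array"
    # One-to-few, only accessed through the parent.
    return "embed"


def worth_denormalizing(reads, writes, needs_strong_consistency,
                        min_ratio=10.0):
    """Denormalize only when reads dominate and stale copies are tolerable.

    min_ratio is an assumed cut-off for "read much more often than updated".
    """
    if needs_strong_consistency:
        return False
    if writes == 0:
        return reads > 0
    return reads / writes >= min_ratio
```

For example, `choose_model(3, False)` returns `"embed"`, while the same three children with `standalone_access=True` yield `"child-reference-array"`.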
6 important rules of thumb in MongoDB database design, Part 1
6 important rules of thumb in MongoDB database design, Part 2
6 important rules of thumb in MongoDB database design, Part 3