A well-written post with a very detailed analysis of which model fits which scenario.
This blog post is translated from:
http://blog.mongodb.org/post/87200945828/6-rules-of-thumb-for-mongodb-schema-design-part-1
Note: This is not a strict translation. It expresses the original as clearly as possible based on my understanding of it. If anything is unclear, please refer to the original.
Many developers who have just moved from traditional SQL development to MongoDB ask the same question: how do you express a one-to-many (1 to N) relationship from a relational database in MongoDB?
Because MongoDB's document model is so expressive, there is no single standard way to model 1 to N. Below we look at three specific scenarios.
First, we need to pin down the N in 1 to N. What order of magnitude does N represent? A few to a few dozen? Dozens to hundreds? Or thousands and more?
1) 1 to N (N is a few, or a few dozen at most)
For example, each person has multiple addresses. In this case we use the simplest approach and model the addresses as embedded documents.
{
  name: 'Kate Monster',
  id: '123-456-7890',
  addresses: [
    { street: '123 Sesame St', city: 'Anytown', cc: 'USA' },
    { street: '1 Q', city: 'New York', cc: 'USA' }
  ]
}
This approach has clear advantages and disadvantages:
Pros: You don't need a separate query to get all of a person's addresses.
Cons: You can't work with an address as a standalone document. You must first go through the person document (for example, query it) before you can work with its addresses.
In this example we never need to operate on an address independently, and an address is only meaningful when it is associated with a specific person. So embedded modeling is well suited to the person-address scenario.
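To make the trade-off concrete, here is a minimal sketch in plain JavaScript. It uses an in-memory array as a stand-in for the person collection, so no MongoDB driver is involved. The `people` array and the `findPerson` helper are hypothetical names chosen for illustration only.

```javascript
// In-memory stand-in for a person collection (hypothetical; no database involved).
const people = [
  {
    name: 'Kate Monster',
    id: '123-456-7890',
    addresses: [
      { street: '123 Sesame St', city: 'Anytown', cc: 'USA' },
      { street: '1 Q', city: 'New York', cc: 'USA' }
    ]
  }
];

// Hypothetical helper that mimics a single findOne-style lookup by name.
function findPerson(name) {
  return people.find(p => p.name === name) || null;
}

// Pro: one lookup returns the person together with every embedded address.
const kate = findPerson('Kate Monster');
const cities = kate.addresses.map(a => a.city);
console.log(cities);

// Con: an address is only reachable through the person that contains it.
const inNewYork = people.filter(p =>
  p.addresses.some(a => a.city === 'New York')
);
console.log(inNewYork.map(p => p.name));
```

The second half shows the cost of embedding: there is no list of addresses to search directly, so every address lookup has to scan the person documents.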
2) 1 to N (N is larger, from dozens up to a few hundred)
For example, products and parts: each product is made up of many parts. In this scenario we can model the relationship by reference, as follows:
Part (parts collection):
{
  _id: ObjectID('AAAA'),
  partno: '123-aff-456',
  name: '#4 grommet',
  qty: 94,
  cost: 0.94,
  price: 3.99
}

Product (products collection):
{
  name: 'Left-handed smoke shifter',
  manufacturer: 'Acme Corp',
  catalog_number: 1234,
  parts: [                // array of references to part documents
    ObjectID('AAAA'),     // reference to the #4 grommet above
    ObjectID('F17C'),     // reference to a different part
    ObjectID('D2AA'),
    // etc
  ]
}
Each part exists as a separate document. Each product contains an array field (parts) that holds the _id (primary key) of every part the product contains. To fetch all the parts of a product from its catalog number, you can do the following:
> product = db.products.findOne({catalog_number: 1234});
// Fetch all the parts that are linked to this product
> product_parts = db.parts.find({_id: {$in: product.parts}}).toArray();
The advantages and disadvantages of this approach are also clear:
Pros: Each part exists as a standalone document, so you can query or update a part on its own.
Cons: You need two queries to retrieve all the parts that belong to a product.
In this case the drawback is acceptable, and the two-step lookup is not hard to implement. This model also extends easily from 1 to N to N to N: a product can contain multiple parts, and the same part can be referenced by multiple products.
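The sketch below illustrates both directions of this relationship in plain JavaScript, with in-memory arrays standing in for the two collections. The second part, the second product, and the helper names `partsForProduct` and `productsUsingPart` are hypothetical and exist only for illustration. Plain strings play the role of ObjectID values.

```javascript
// In-memory stand-ins for the parts and products collections (hypothetical data).
const parts = [
  { _id: 'AAAA', partno: '123-aff-456', name: '#4 grommet' },
  { _id: 'F17C', partno: 'hypothetical-partno', name: 'hypothetical part' }
];
const products = [
  { name: 'Left-handed smoke shifter', catalog_number: 1234, parts: ['AAAA', 'F17C'] },
  { name: 'hypothetical second product', catalog_number: 5678, parts: ['AAAA'] }
];

// Two-step lookup: fetch the product, then fetch every part whose _id
// appears in product.parts (the in-memory counterpart of the $in query).
function partsForProduct(catalogNumber) {
  const product = products.find(p => p.catalog_number === catalogNumber);
  if (!product) return [];
  const wanted = new Set(product.parts);
  return parts.filter(part => wanted.has(part._id));
}

// Reverse direction (the N to N case): which products use a given part?
// In the mongo shell this would be roughly db.products.find({parts: partId}).
function productsUsingPart(partId) {
  return products.filter(p => p.parts.includes(partId));
}

console.log(partsForProduct(1234).map(p => p.name));
console.log(productsUsingPart('AAAA').map(p => p.name));
```

Because the reference lives in an array on the product, sharing a part between products needs no schema change: the same _id simply appears in more than one parts array.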
3) 1 to N (N is very large, thousands or even more)
For example, each host generates a large number of log messages (logmsg). With embedded modeling, a host document would grow very large and could easily exceed MongoDB's 16 MB document size limit, so embedding is not feasible. The second approach, an array holding the _id of every logmsg, does not work either: with enough log messages, even an array of bare ObjectIDs will exceed the document size limit. So here we take the following approach:
Host (hosts collection):
{
  _id: ObjectID('AAAB'),
  name: 'goofy.example.com',
  ipaddr: '127.66.66.66'
}

Log message (logmsg collection):
{
  time: ISODate("2014-03-28T09:42:41.382Z"),
  message: 'CPU is on fire!',
  host: ObjectID('AAAB')    // reference to the host document
}
In each logmsg document, we store a reference to the _id of the host it belongs to.
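With the reference stored on the log message, finding the messages for a host means filtering the logmsg side by the host's _id. The following is a minimal in-memory sketch in plain JavaScript. The extra log messages, the 'FFFF' host id, and the helper name `recentMessagesForHost` are hypothetical, and plain strings stand in for ObjectID values.

```javascript
// In-memory stand-ins for the hosts and logmsg collections (hypothetical data).
const hosts = [
  { _id: 'AAAB', name: 'goofy.example.com', ipaddr: '127.66.66.66' }
];
const logmsg = [
  { time: new Date('2014-03-28T09:42:41.382Z'), message: 'CPU is on fire!', host: 'AAAB' },
  { time: new Date('2014-03-28T09:43:00.000Z'), message: 'hypothetical follow-up', host: 'AAAB' },
  { time: new Date('2014-03-28T09:44:00.000Z'), message: 'hypothetical other host', host: 'FFFF' }
];

// Look up the host first, then select the messages that reference its _id,
// newest first. In the mongo shell this would be roughly:
//   db.logmsg.find({host: host._id}).sort({time: -1}).limit(limit)
function recentMessagesForHost(ipaddr, limit) {
  const host = hosts.find(h => h.ipaddr === ipaddr);
  if (!host) return [];
  return logmsg
    .filter(m => m.host === host._id)   // filter() copies, so logmsg is not reordered
    .sort((a, b) => b.time - a.time)    // newest first
    .slice(0, limit);
}

console.log(recentMessagesForHost('127.66.66.66', 5).map(m => m.message));
```

The host document stays the same size no matter how many messages arrive, which is the point of putting the reference on the N side. On a real collection, a query like this would normally be backed by an index on the host field.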
In summary, when modeling a 1 to N relationship, we need to consider the following:
1) If N is small and the entities on the N side never need to be accessed on their own, use embedded modeling.
2) If N is larger, or the entities on the N side need to be accessed on their own, store an array of references on the 1 side.
3) If N is very large, we have no choice but to store a reference to the 1 side in each document on the N side.
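The three rules can be written down as a small decision helper. This is only a sketch: the text gives rough magnitudes rather than exact cut-offs, so the two thresholds below are assumptions, and the function name `chooseModel` and its return labels are hypothetical.

```javascript
// Illustrative thresholds only (assumptions, not values from the original post).
const FEW_LIMIT = 50;     // up to here, N counts as "a few to a few dozen"
const MANY_LIMIT = 1000;  // from here on, N counts as "very large"

// n: expected number of documents on the N side
// needsStandaloneAccess: true if the N-side entities are queried or updated on their own
function chooseModel(n, needsStandaloneAccess) {
  if (n >= MANY_LIMIT) {
    return 'reference-to-parent';   // rule 3: each N-side document points at the 1 side
  }
  if (n > FEW_LIMIT || needsStandaloneAccess) {
    return 'array-of-references';   // rule 2: the 1 side holds an array of _id values
  }
  return 'embed';                   // rule 1: N-side documents live inside the 1 side
}

console.log(chooseModel(3, false));      // addresses of a person
console.log(chooseModel(200, true));     // parts of a product
console.log(chooseModel(1000000, true)); // log messages of a host
```

In practice the choice also depends on document size and access patterns, so treat the numbers as placeholders to adjust per application.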
MongoDB one-to-many relationship modeling