Using join operations in MongoDB

Source: Internet
Author: User

One of the biggest differences between SQL and NoSQL is that join is not supported, and in a traditional database, theSQL JOIN clause allows you to use a normal field, in two or more tables, for each row of data in a combined table. For example, if you have a table books andpublishers,你可以像下面这样写命令:

SELECT book.title, Publisher.namefrom bookleft JOIN book.publisher_id on publisher.id;

In other words, the publisher_id field in the Book table refers to thepublishers表中的id字典。这些都是很常见的例子:对于每个publisher都可以拥有成千上万本书,如果你想更新publisher的信息的时候,我们只需要更改一条记录。数据的冗余是很小的,因为我们不需要为每本书来重复更新他的publisher信息,这种技术已基本当做一种规范化的东西了。SQL数据库提供了一些列的规范与约束条件来保障数据关联性。

NoSQL = = No JOIN?

That's not all.

Document-oriented databases, such as MongoDB, are designed to store unstructured data, which, ideally, is not associated with each other in the data set, and if a piece of data contains two or more times, the data is duplicated. Because in most cases we still need data association, only a few cases will not need to correlate the data,

, it seems that the NoSQL features are disappointing. Fortunately MongoDB 3.2 introduces a new $lookup operation that provides an operation similar to the left OUTER join under two or more conditions.

MongoDB Aggregation

$lookup is only allowed in aggregation operations, think of him as a pipe operation: query, filter, combined results. The output of an operation is used as the next input. Aggregation is more difficult to understand than simple query operations, and these operations are often slow, but they are efficient, aggregation can be explained with a good example, assuming we use the user data collection to create a social platform, Store information for a single user in each separate document, such as:

{  "_id": ObjectID ("45b83bda421238c76f5c1969"),  "name": "User One",  "email: "[Email protected]",  "country": "UK",  "DOB": Isodate ("1999-09-13t00:00:00.000z")} 

We can add enough users to the user's collection, but each MongoDB document must have a _ID field value, and the _id field value is like a key in SQL, which is automatically added to the document when we don't explicitly specify _id. Our social networking site now requires a post collection, a combination of user reviews, a document that stores plain text, time, scoring, and a player reference written to the user_id field.

{  "_id": ObjectID ("17C9812ACFF9AC0BBA018CC1"),  "user_id": ObjectID (" 45b83bda421238c76f5c1969 "),  " date:isodate ("2016-09-05t03:05:00.123z"),  "text": "My Life Storiesso far,  "rating": "Important"}

We now want to show the 20 data that recently had important comments, which came from all users and was sorted by time. Each returned document should contain the text of the comment, the time the comment was posted, and the associated user's name and country.

The aggregate query for a MONGODB database is an array of pass-through pipeline operations, in which each operation is ordered in the order of the array. First, we need to extract all the documents from all of the post collections, which use $match memory accurately rating filtering.

{"$match": {"rating": "Important"}}

We now need to sort the filtered documents by time, using $sort operations.

{"$sort": {"Date":-1}}

Because we're going to just return 20 data, we can use $limit to limit the number of documents we need to process.

{"$limit": 20}

We now use the $lookup operation to connect data from the user collection, which requires an object of four parameters:

1. Localfield: The lookup field in the input document

2. From: The collection that needs to be connected

3. Foreignfield: Fields that need to be found in the From collection

4, as: The field name of the output

So here's what we do:

{"$lookup": {  "Localfield": "user_id",  "from": "User",  "Foreignfield": "_id ",  " as":" UserInfo "}}

In our output we will create a new field named UserInfo, which is an array in which each element is a matching element in the user collection.

"UserInfo": [  "name": "User One", ...}]

Between post.user_id and user._id, we have a one-to-two relationship because there is only a single user for each post. So our userinfo array will contain just one element, and we can say that using the$unwind操作来解构他并插入到一个自文档中。

{"$unwind": "$userinfo"}

Now the output will be transformed into a more commonly used structure:

"UserInfo": {  "name": "User One",  "email:" [email protected] ",  ...}

Finally we can use the return comment information in the pipeline, comment on the $project操作 time, comment on the user name, country etc.

{"$project":{  "text": 1,  "date": 1,  "Userinfo.name": 1,  "Userinfo.country": 1}}

Merge all of the above actions

Our final aggregate query matches the comments, sorted in order, restricts the latest 20 messages, connects the user's data, the flat user array, and finally returns only the necessary data we need, the total command is as follows:

db.post.aggregate ([  "$match": {"rating": "Important" }},  "$sort": {"Date":-1 }},  "$limit":  "$lookup":    {"Localfield": "user_id",    "from": "User" ,    " Foreignfield ":" _id ",    " as":" UserInfo "}  ,  " $unwind ":" $userinfo " },   "$project": {    "text": 1,    "date": 1,    "Userinfo.name": 1 ,    " Userinfo.country ": 1  }}]);

The result is a collection of 20 documents, for example:

[  {    "text": "The latest post",    "date:isodate (" 2016-09-27t00:00:00.000z "),    "UserInfo": {      "name": "User One",      "country": "UK"    }  },  {    "text": "Another Post",    "date:isodate (" 2016-09-26t00:00:00.000z "),    " UserInfo ": {      " name ":" User One ",      " country ":" UK "    }  }  ...]

MongoDB's $lookup is very useful and efficient, but the above-mentioned example is just a combination of set queries. He is not an alternative to a more efficient join clause in SQL. And MongoDB also provides some restrictions, if the user collection is deleted, the post document will still be retained.

Ideally, this $lookup operation should not be used frequently, and if you need to use it frequently, you are using the wrong data store (database): If you have associated data, you should use the associated database (SQL).

In other words, $lookup is a new addition to MongoDB 3.2, and he solves some disappointing questions when using some small, correlated data queries in a NoSQL database.

Using join operations in MongoDB

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.