MongoDB Learning--Aggregation

Source: Internet
Author: User

My latest project uses MySQL, so while I have not yet forgotten it, here is a summary of MongoDB aggregation.

Aggregation refers to operations that process batches of records and return computed results. MongoDB provides a rich set of aggregation operations for performing calculations on data sets. Running aggregation on the mongod instance can greatly simplify application code and reduce resource consumption.

In MongoDB, aggregation operations, like queries, take the documents in a collection as input and output one or more documents as the final result.

Aggregation pipeline

The aggregation pipeline is a framework modeled on the concept of data processing pipelines. Documents pass through a multi-stage pipeline that transforms them into the final aggregated result.

The aggregation pipeline is an alternative to map-reduce, and it is often the preferred solution for aggregation tasks where the complexity of map-reduce is not warranted.
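To make the pipeline idea concrete, here is a minimal sketch in plain JavaScript that models stages as functions applied one after another to an array of documents. The helper names (match, groupSum, sortBy, runPipeline) and the orders sample data are hypothetical and are not MongoDB APIs; the comment near the end shows the rough mongo shell equivalent.

```javascript
// Hypothetical sample data standing in for a collection.
const orders = [
  { customer: "A", status: "paid", amount: 50 },
  { customer: "B", status: "paid", amount: 20 },
  { customer: "A", status: "open", amount: 70 },
  { customer: "A", status: "paid", amount: 30 },
];

// Each stage takes an array of documents and returns a new array.
const match = (predicate) => (docs) => docs.filter(predicate);

const groupSum = (keyField, sumField) => (docs) => {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc[keyField], (totals.get(doc[keyField]) || 0) + doc[sumField]);
  }
  return [...totals].map(([key, total]) => ({ _id: key, total }));
};

const sortBy = (field, direction) => (docs) =>
  [...docs].sort((x, y) => direction * (x[field] - y[field]));

// Feed the output of each stage into the next one.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

const pipelineResult = runPipeline(orders, [
  match((doc) => doc.status === "paid"),
  groupSum("customer", "amount"),
  sortBy("total", -1),
]);

// Rough mongo shell equivalent (assuming an "orders" collection):
// db.orders.aggregate([
//   { $match: { status: "paid" } },
//   { $group: { _id: "$customer", total: { $sum: "$amount" } } },
//   { $sort: { total: -1 } }
// ])
console.log(pipelineResult);
```

In this sketch each stage only sees the output of the stage before it, which is the property that lets a real pipeline be reasoned about one stage at a time.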

Each stage of the aggregation pipeline can use at most 100MB of memory. If a stage exceeds this limit, MongoDB returns an error. To process large data sets, set the allowDiskUse option so that pipeline stages can write data to temporary files.
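As a small illustration of where the option goes, the sketch below builds the document for the aggregate database command in plain JavaScript. The buildAggregateCommand helper is hypothetical; the records collection name and the $sort stage are only placeholders.

```javascript
// Hypothetical helper: builds the document for the "aggregate" database command.
// allowDiskUse is only added when the caller asks for it.
function buildAggregateCommand(collectionName, pipeline, allowDiskUse) {
  const command = { aggregate: collectionName, pipeline: pipeline, cursor: {} };
  if (allowDiskUse) {
    command.allowDiskUse = true; // permit stages to spill to temporary files
  }
  return command;
}

const sortCommand = buildAggregateCommand("records", [{ $sort: { count: -1 } }], true);

// In the mongo shell this could be sent with db.runCommand(sortCommand), or
// written as db.records.aggregate([{ $sort: { count: -1 } }], { allowDiskUse: true }).
console.log(JSON.stringify(sortCommand));
```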

Some pipeline stages take a pipeline expression as their operand. A pipeline expression specifies the transformation to apply to the input documents. Expressions have a document structure and can contain other expressions.

A pipeline expression can only operate on the current document in the pipeline and cannot refer to data from other documents; expression operations transform documents in memory.

In general, expressions are stateless and are only evaluated when the aggregation process sees them, with one exception: accumulator expressions.

Accumulators, used with the $group pipeline operator, maintain their own state (for example totals, maximums, minimums, and related data) as documents progress through the pipeline.
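The difference between a stateless expression and an accumulator can be sketched in plain JavaScript. The function names and the scores sample data below are hypothetical; the point is only that each group carries running state that is updated once per document.

```javascript
// Hypothetical accumulator state: running total, minimum and maximum.
function createAccumulator() {
  return { total: 0, min: Infinity, max: -Infinity };
}

// Fold one value into the running state.
function accumulate(state, value) {
  state.total += value;
  state.min = Math.min(state.min, value);
  state.max = Math.max(state.max, value);
  return state;
}

// Group documents by a key field and update each group's state per document.
function groupWithAccumulators(docs, keyField, valueField) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc[keyField])) {
      groups.set(doc[keyField], createAccumulator());
    }
    accumulate(groups.get(doc[keyField]), doc[valueField]);
  }
  return [...groups].map(([key, state]) => ({ _id: key, ...state }));
}

const scores = [
  { team: "x", score: 3 },
  { team: "y", score: 9 },
  { team: "x", score: 7 },
];
const stats = groupWithAccumulators(scores, "team", "score");

// Rough mongo shell equivalent (assuming a "scores" collection):
// db.scores.aggregate([{ $group: { _id: "$team", total: { $sum: "$score" },
//   min: { $min: "$score" }, max: { $max: "$score" } } }])
console.log(stats);
```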

Map-reduce

Map-reduce is a data processing paradigm for condensing large volumes of data into useful *aggregated* results. In MongoDB, use the mapReduce command to run map-reduce operations.

A map-reduce operation works as follows. MongoDB applies the *map* phase to each input document, that is, each document in the collection that matches the query condition. The map function emits key-value pairs. For keys that have multiple values, MongoDB applies the *reduce* phase, which collects and condenses the aggregated data. MongoDB then stores the results in a collection. Optionally, the output of the reduce function can be passed to a *finalize* function to further process the aggregation results.

In MongoDB, all map-reduce functions are written in JavaScript and run within the mongod process. A map-reduce operation takes the documents of a single collection as its *input* and can perform arbitrary sorting and limiting before the map stage begins. The mapReduce command can return the results as a document or write them to a collection. Both the input collection and the output collection can be sharded.
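The map, reduce, and finalize phases can be modeled in memory with a short plain JavaScript sketch. This is not the mapReduce command itself: mapReduceSketch and the sales sample data are hypothetical, and the emit callback is passed explicitly here, whereas in MongoDB the map function reads the document from this and calls a global emit.

```javascript
// Hypothetical in-memory model of the map-reduce flow.
function mapReduceSketch(docs, options) {
  const { query, map, reduce, finalize } = options;
  const emitted = new Map();

  // Map phase: each document matching the query may emit key-value pairs.
  for (const doc of docs.filter(query || (() => true))) {
    map(doc, (key, value) => {
      if (!emitted.has(key)) emitted.set(key, []);
      emitted.get(key).push(value);
    });
  }

  // Reduce phase: only keys with more than one value need reducing.
  const results = [];
  for (const [key, values] of emitted) {
    let value = values.length > 1 ? reduce(key, values) : values[0];
    if (finalize) value = finalize(key, value); // optional finalize step
    results.push({ _id: key, value });
  }
  return results;
}

const sales = [
  { custId: "A123", amount: 500, status: "A" },
  { custId: "A123", amount: 250, status: "A" },
  { custId: "B212", amount: 200, status: "A" },
  { custId: "A123", amount: 300, status: "D" },
];

const totals = mapReduceSketch(sales, {
  query: (doc) => doc.status === "A",
  map: (doc, emit) => emit(doc.custId, doc.amount),
  reduce: (key, values) => values.reduce((sum, v) => sum + v, 0),
});
console.log(totals);
```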

If you choose to have a map-reduce operation return its results inline, the result documents must be within the BSON document size limit, which is currently 16MB.

Single-purpose aggregation operations

Aggregation covers a broad class of operations that process a data set through specific steps to compute a result. MongoDB also provides a set of methods that each perform one specific aggregation operation on a data set.

Although these methods are more limited in scope than the aggregation pipeline and map-reduce, their names clearly convey what they do and they are easy to understand.

1) Count

MongoDB can return the number of documents that match a query. In addition to the count command, the db.collection.count() and cursor.count() methods in the mongo shell return this number.

Example

The collection named records contains only the following documents:

{ a: 1, b: 0 }
{ a: 1, b: 1 }
{ a: 1, b: 4 }
{ a: 2, b: 2 }

The following operation counts all documents in the collection and returns 4:

db.records.count()

The following operation counts the documents whose field a has the value 1 and returns 3:

db.records.count({ a: 1 })
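The same two counts can be reproduced without a server. The sketch below assumes the records collection holds the four documents { a: 1, b: 0 }, { a: 1, b: 1 }, { a: 1, b: 4 } and { a: 2, b: 2 }; countMatching is a hypothetical helper that supports equality conditions only, not the full query language.

```javascript
// The four example documents, held in a plain array.
const records = [
  { a: 1, b: 0 },
  { a: 1, b: 1 },
  { a: 1, b: 4 },
  { a: 2, b: 2 },
];

// Count documents whose fields equal every field in the query (equality only).
function countMatching(docs, query = {}) {
  return docs.filter((doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value)
  ).length;
}

console.log(countMatching(records));           // 4
console.log(countMatching(records, { a: 1 })); // 3
```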

2) Distinct

The distinct operation takes the documents that match a query and returns the unique values of a specified field. In the mongo shell, use the distinct command or the db.collection.distinct() method. Consider the following example:

Example

The collection named records contains only the following documents:

{ a: 1, b: 0 }
{ a: 1, b: 1 }
{ a: 1, b: 1 }
{ a: 1, b: 4 }
{ a: 2, b: 2 }
{ a: 2, b: 2 }

The following operation returns the distinct values of field b using the db.collection.distinct() method:

db.records.distinct("b")

The result of this operation is:

[0, 1, 4, 2]
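A plain JavaScript sketch of the same idea, assuming the six documents of this example; distinctValues is a hypothetical helper. A Set keeps values in first-seen order, which matches the order shown here, although callers should not rely on the order that MongoDB itself returns.

```javascript
// The six example documents, held in a plain array.
const records = [
  { a: 1, b: 0 },
  { a: 1, b: 1 },
  { a: 1, b: 1 },
  { a: 1, b: 4 },
  { a: 2, b: 2 },
  { a: 2, b: 2 },
];

// Collect the unique values of one field, in first-seen order.
function distinctValues(docs, field) {
  return [...new Set(docs.map((doc) => doc[field]))];
}

console.log(distinctValues(records, "b")); // [ 0, 1, 4, 2 ]
```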

3) Group

The group operation takes the documents that match a query and groups them by the values of a given field. It returns an array of documents, where each document holds the computed result for one group of documents.

In the mongo shell, use the group command or the db.collection.group() method to group documents.

The group command cannot run on a sharded collection. Note also that the result set of a group operation must not exceed 16MB.

Example

A collection named records contains the following documents:

{ a: 1, count: 4 }
{ a: 1, count: 2 }
{ a: 1, count: 4 }
{ a: 2, count: 3 }
{ a: 2, count: 1 }
{ a: 1, count: 5 }
{ a: 4, count: 4 }

Consider the following group operation, which groups the documents by field a, restricted to documents where a is less than 3, and sums the count field for each group:

db.records.group({
  key: { a: 1 },
  cond: { a: { $lt: 3 } },
  reduce: function(cur, result) { result.count += cur.count },
  initial: { count: 0 }
})

The result of this grouping operation is:

[ { a: 1, count: 15 }, { a: 2, count: 4 } ]
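The roles of key, cond, reduce, and initial can be checked with an in-memory sketch in plain JavaScript. groupSketch is a hypothetical helper, and its key option is a plain field name rather than a key document. Note that the group command was deprecated in MongoDB 3.4 and removed in 4.2, so on newer servers the aggregation pipeline shown in the trailing comment is the usual replacement; its output carries the group key in _id instead of a.

```javascript
// The seven example documents, held in a plain array.
const records = [
  { a: 1, count: 4 },
  { a: 1, count: 2 },
  { a: 1, count: 4 },
  { a: 2, count: 3 },
  { a: 2, count: 1 },
  { a: 1, count: 5 },
  { a: 4, count: 4 },
];

// In-memory model of the key / cond / reduce / initial options.
function groupSketch(docs, { key, cond, reduce, initial }) {
  const groups = new Map();
  for (const doc of docs.filter(cond)) {
    const groupKey = doc[key];
    if (!groups.has(groupKey)) {
      // Every group starts from its own copy of the initial value.
      groups.set(groupKey, { [key]: groupKey, ...initial });
    }
    reduce(doc, groups.get(groupKey));
  }
  return [...groups.values()];
}

const grouped = groupSketch(records, {
  key: "a",
  cond: (doc) => doc.a < 3,
  reduce: (cur, result) => { result.count += cur.count; },
  initial: { count: 0 },
});
console.log(grouped); // [ { a: 1, count: 15 }, { a: 2, count: 4 } ]

// Rough aggregation pipeline equivalent:
// db.records.aggregate([
//   { $match: { a: { $lt: 3 } } },
//   { $group: { _id: "$a", count: { $sum: "$count" } } }
// ])
```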

Reference: MongoDB Chinese documentation
