MongoDB Aggregation by Time in Java

Source: Internet
Author: User

Some tips for when timestamps are stored in MongoDB as strings:

1. MongoDB's date operators (for example $week or $month) cannot be applied directly to such a field to group by week or month, because they only operate on MongoDB's own date type and, at the time of writing, do not recognize strings.

2. For simple grouping by year, month, day, hour, minute, or second, the $substr operator can stand in for the date operators by cutting the string down to the desired prefix.

For example, from 2015-03-02 22:53:45, the first 4 characters give the year (2015) and the first 10 characters give the day (2015-03-02).

3. Grouping by quarter or by week takes more work: first group by month or by day in MongoDB, then run a second aggregation over those intermediate results in Java. For weekly grouping, for example, group by day first, then use Java's Calendar to work out which week each day belongs to.

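db.myObject.aggregate([
  // Keep only the first 10 characters (yyyy-MM-dd) of the string timestamp
  {$project: {new_time_stamp: {$substr: ["$time_stamp", 0, 10]}}},
  // Count the documents that fall on each day
  {$group: {_id: "$new_time_stamp", count: {$sum: 1}}}
]);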
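The second aggregation step described in tip 3 can be sketched in plain Java. This is a minimal sketch under stated assumptions, not the author's implementation: the class name TimeRollup and its methods are hypothetical, the per-day and per-month counts are assumed to have already been read from MongoDB into a Map keyed by the $substr prefix, weeks are assumed to follow ISO-8601 numbering, and java.time (Java 8+) is used in place of java.util.Calendar.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: rolls up day or month buckets produced by MongoDB.
class TimeRollup {

    /** Rolls per-day counts (keys like "2015-03-02") up into ISO weeks (keys like "2015-W10"). */
    static Map<String, Long> byWeek(Map<String, Long> dailyCounts) {
        Map<String, Long> weekly = new TreeMap<>();
        for (Map.Entry<String, Long> e : dailyCounts.entrySet()) {
            LocalDate day = LocalDate.parse(e.getKey()); // expects yyyy-MM-dd
            // The week-based year can differ from the calendar year around New Year.
            int year = day.get(IsoFields.WEEK_BASED_YEAR);
            int week = day.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
            weekly.merge(String.format("%d-W%02d", year, week), e.getValue(), Long::sum);
        }
        return weekly;
    }

    /** Rolls per-month counts (keys like "2015-03") up into quarters (keys like "2015-Q1"). */
    static Map<String, Long> byQuarter(Map<String, Long> monthlyCounts) {
        Map<String, Long> quarterly = new TreeMap<>();
        for (Map.Entry<String, Long> e : monthlyCounts.entrySet()) {
            String year = e.getKey().substring(0, 4);
            int month = Integer.parseInt(e.getKey().substring(5, 7));
            int quarter = (month - 1) / 3 + 1; // months 1-3 -> Q1, 4-6 -> Q2, ...
            quarterly.merge(year + "-Q" + quarter, e.getValue(), Long::sum);
        }
        return quarterly;
    }

    public static void main(String[] args) {
        Map<String, Long> daily = new TreeMap<>();
        daily.put("2015-03-01", 2L); // a Sunday, so it closes ISO week 9
        daily.put("2015-03-02", 3L); // a Monday, so it opens ISO week 10
        daily.put("2015-03-03", 4L);
        System.out.println(byWeek(daily));
    }
}
```

If the application defines weeks differently (for example, weeks starting on Sunday), the week lookup would need to use java.time.temporal.WeekFields with the appropriate settings instead of IsoFields.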

MongoDB Aggregation

A MongoDB aggregation can be divided into three main stages: $match, $project, and $group.

Each stage is represented as a DBObject, and aggregate accepts a List<DBObject>, so the three stages can be combined into one pipeline.

Here is an example.

$match: {type: 'airfare'}. Here type is a field and airfare is a value; the match must be exact. For a more complex condition, you can write:

$match: {type: 'airfare', date: {$gte: '2015-03-03', $lte: '2015-03-05'}}

Note that the content of $match is not an array but a single object whose conditions are separated by commas.
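A range condition like the one above works on string dates only because zero-padded yyyy-MM-dd strings sort in the same order as the dates they represent. The sketch below imitates that comparison in Java; it is an illustration rather than MongoDB's own code, the class and method names are hypothetical, and it assumes MongoDB's default (non-collated) string comparison, which orders these ASCII strings the same way as String.compareTo.

```java
// Hypothetical helper that mimics {$gte: from, $lte: to} on string dates.
class StringDateRange {

    /** True if from <= date <= to under plain lexicographic string comparison. */
    static boolean inRange(String date, String from, String to) {
        return date.compareTo(from) >= 0 && date.compareTo(to) <= 0;
    }

    public static void main(String[] args) {
        String from = "2015-03-03";
        String to = "2015-03-05";
        System.out.println(inRange("2015-03-04", from, to));          // true
        System.out.println(inRange("2015-03-06", from, to));          // false
        // Without zero padding the string order no longer matches the date order.
        System.out.println(inRange("2015-3-4", from, to));            // false
        // A value with a time part sorts after the bare upper bound.
        System.out.println(inRange("2015-03-05 10:00:00", from, to)); // false
    }
}
```

The last case is worth noting when the field holds full timestamps: an upper bound of '2015-03-05' excludes everything later than midnight on that day, so the bound may need to be written as '2015-03-05 23:59:59' or replaced by $lt with the following day.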

$project: {id: {$substr: ["$date", 0, 4]}}

$project passes documents containing only the specified fields to the next stage of the pipeline. By default, the _id field is also passed to the next stage. It can be excluded explicitly by specifying

"_id": 0

$project can also rename fields and apply operators such as $concat, $substr, $add, $multiply, and $mod.

$group is the most important stage; it performs the actual grouping.

A $group stage must have an _id field, which specifies the fields to group by. The value of _id may be null, in which case all documents fall into a single group, which is useful for overall sums and averages.

Output field names can be chosen freely within $group.

$group: {
  _id: {month: {$month: "$date"}, day: {$dayOfMonth: "$date"}},
  totalPrice: {$sum: {$multiply: ["$price", "$quantity"]}},
  averagePrice: {$avg: "$price"},
  count: {$sum: 1}
}

_id, totalPrice, averagePrice, and count will each be returned as a field of the result documents.
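To make the accumulators concrete, the following sketch computes the same four output fields in memory for a small list of records. It is only an illustration of what the $group stage above calculates, not how MongoDB implements it; the Sale and Totals classes and the "month-day" key format are assumptions introduced for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory equivalent of the $group stage shown above.
class GroupSketch {

    /** One input document with the fields the $group example reads. */
    static final class Sale {
        final int month;
        final int day;
        final double price;
        final int quantity;

        Sale(int month, int day, double price, int quantity) {
            this.month = month;
            this.day = day;
            this.price = price;
            this.quantity = quantity;
        }
    }

    /** Accumulated values for one group (one distinct _id). */
    static final class Totals {
        double totalPrice; // $sum of price * quantity
        double priceSum;   // running sum used to derive $avg of price
        long count;        // $sum: 1

        double averagePrice() {
            return priceSum / count;
        }
    }

    /** Groups sales by month and day, keyed as "month-day". */
    static Map<String, Totals> groupByMonthDay(List<Sale> sales) {
        Map<String, Totals> groups = new TreeMap<>();
        for (Sale s : sales) {
            Totals t = groups.computeIfAbsent(s.month + "-" + s.day, k -> new Totals());
            t.totalPrice += s.price * s.quantity;
            t.priceSum += s.price;
            t.count++;
        }
        return groups;
    }
}
```

Calling groupByMonthDay with two sales on March 1 (10.0 x 2 and 20.0 x 1) yields one group with a total price of 40.0, an average price of 15.0, and a count of 2.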

Other pipeline stages include:

$sort sorts by one or more fields (1 for ascending, -1 for descending)

{$sort: {age: 1, money: -1}}

$skip and $limit specify how many documents to skip and the maximum number to return.

$unwind deconstructs an array field, producing one document per array element.

$out writes the results to a new collection

{$out: "authors"}; it must be the last stage of the pipeline
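The $sort, $skip, and $limit stages behave much like sorting and paging a list in memory. The sketch below shows that analogy with Java streams, using a compound sort of age ascending and then money descending as in the $sort example; the Person class and the page method are hypothetical names introduced only for this illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical in-memory analogy for {$sort: {age: 1, money: -1}}, $skip, $limit.
class PagingSketch {

    static final class Person {
        final String name;
        final int age;
        final int money;

        Person(String name, int age, int money) {
            this.name = name;
            this.age = age;
            this.money = money;
        }
    }

    /** Sorts by age ascending, then money descending, then skips and limits. */
    static List<Person> page(List<Person> people, int skip, int limit) {
        Comparator<Person> byAgeAsc = Comparator.comparingInt((Person p) -> p.age);
        Comparator<Person> byMoneyDesc = Comparator.comparingInt((Person p) -> p.money).reversed();
        return people.stream()
                .sorted(byAgeAsc.thenComparing(byMoneyDesc)) // ties on age fall back to money
                .skip(skip)                                  // like $skip
                .limit(limit)                                // like $limit
                .collect(Collectors.toList());
    }
}
```

As in a pipeline, the order of the steps matters: skipping or limiting before sorting would page through the documents in their original order instead.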

An aggregation example, implemented entirely in Java

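import com.mongodb.AggregationOutput;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import java.util.Arrays;
import java.util.List;

// collection is assumed to be an existing com.mongodb.DBCollection

// $project: expose the uri field under the name url
DBObject fields = new BasicDBObject("url", "$uri");
DBObject project = new BasicDBObject("$project", fields);

// $group: group by url and count the documents in each group
DBObject idField = new BasicDBObject("_id", new BasicDBObject("url", "$url"));
idField.put("count", new BasicDBObject("$sum", 1));
DBObject group = new BasicDBObject("$group", idField);

// $sort: largest count first
DBObject sort = new BasicDBObject("$sort", new BasicDBObject("count", -1));

List<DBObject> pipeline = Arrays.asList(project, group, sort);
AggregationOutput output = collection.aggregate(pipeline);
for (DBObject result : output.results()) {
    System.out.println(result);
}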

Assembling the DBObject instances by hand, as in the code above, is cumbersome and hides the JSON structure of the pipeline, so a better approach is to parse each stage from JSON:

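// JSON is com.mongodb.util.JSON; parse returns Object, so a cast is needed
DBObject project = (DBObject) JSON.parse("{$project: {url: \"$uri\"}}");
DBObject group = (DBObject) JSON.parse("{$group: {_id: {url: \"$url\"}, count: {$sum: 1}}}");
DBObject sort = (DBObject) JSON.parse("{$sort: {count: -1}}");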
