The previous article described how to store relational data in a distributed way with MongoDB. Once data is stored, it has to be queried. Ordinary queries work, but today we will look at querying with the MapReduce feature that MongoDB provides.
I have written an article about MongoDB MapReduce before (it is linked near the end of this article).
Today we will look at how to run a MapReduce query on top of the sharding mechanism. The official MongoDB documentation has this to say:
Sharded Environments
In sharded environments, data processing of map/reduce operations runs in parallel on all shards.
In other words, every shard runs the map/reduce operation over its own portion of the data at the same time.
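To make the quoted sentence concrete, here is a minimal JavaScript sketch that imitates the idea outside MongoDB. Everything in it is an assumption for illustration: the `shards` arrays, the `forumId` field and the `runMapReduce` helper are hypothetical, and a real cluster merges the partial results internally rather than through code like this.

```javascript
// Hypothetical data: one collection split across two shards.
var shards = [
  [{ _id: 1, forumId: 'a' }, { _id: 2, forumId: 'b' }],
  [{ _id: 3, forumId: 'a' }, { _id: 4, forumId: 'a' }]
];

// Map: emit 1 per document, keyed by the (hypothetical) forumId field.
function map(doc, emit) {
  emit(doc.forumId, 1);
}

// Reduce: sum the values collected for one key.
function reduce(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) { total += values[i]; }
  return total;
}

// Runs map over the documents, groups emitted values by key, reduces each group.
function runMapReduce(docs, mapFn, reduceFn) {
  var groups = {};
  docs.forEach(function (doc) {
    mapFn(doc, function (key, value) {
      (groups[key] = groups[key] || []).push(value);
    });
  });
  var out = {};
  Object.keys(groups).forEach(function (key) {
    out[key] = reduceFn(key, groups[key]);
  });
  return out;
}

// Step 1: each shard works only on its own documents
// (sequential here, in parallel on a real cluster).
var partials = shards.map(function (docs) {
  return runMapReduce(docs, map, reduce);
});

// Step 2: partial results that share a key are reduced once more.
var mergedGroups = {};
partials.forEach(function (partial) {
  Object.keys(partial).forEach(function (key) {
    (mergedGroups[key] = mergedGroups[key] || []).push(partial[key]);
  });
});
var merged = {};
Object.keys(mergedGroups).forEach(function (key) {
  merged[key] = reduce(key, mergedGroups[key]);
});

console.log(merged);
```

The point of the second step is that reduce receives its own earlier output as input, which matters for the shape of the value it returns.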
The following describes how to construct a mapreduce query using the environment set up in the previous article:
First of all, map/reduce on sharded data differs somewhat from map/reduce on non-sharded data in the structure of what is returned. So far I have noticed that returning a custom JSON object is not supported, that is, the following form may cause problems:
    return { count: total };
Note: this is the behavior I currently see in my test environment.
It needs to be changed to:
    return count;
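One plausible reason, offered here as an assumption rather than something verified in this test environment: the MongoDB documentation requires reduce to return the same type of value that map emits, because reduce can be called again with its own earlier output. If map emits the number 1 and reduce returns an object, a second reduce pass ends up adding objects. The sketch below reproduces that effect in plain JavaScript; the function names and the key `'k'` are hypothetical.

```javascript
// Reduce variant that wraps the total in a custom object.
function reduceToObject(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) { total += values[i]; }
  return { count: total };
}

// Reduce variant that returns the same kind of value that map emits (a number).
function reduceToNumber(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) { total += values[i]; }
  return total;
}

// Pretend two shards each reduced the 1s emitted for the same key...
var objectPartials = [reduceToObject('k', [1, 1]), reduceToObject('k', [1])];
var numberPartials = [reduceToNumber('k', [1, 1]), reduceToNumber('k', [1])];

// ...and the partial results are then reduced again.
var mergedObject = reduceToObject('k', objectPartials);
var mergedNumber = reduceToNumber('k', numberPartials);

console.log(typeof mergedObject.count); // "string": the objects were concatenated, not summed
console.log(mergedNumber);              // 3
```

If an object result is really needed, the usual way out is to have map emit the same object shape (for example `{ count: 1 }`) and have reduce sum the `count` fields.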
The following is the test code. First, count the matching records by post ID (a group-style query):
public partial class getfile : System.Web.UI.Page
{
    public Mongo Mongo { get; set; }

    public IMongoDatabase DB
    {
        get
        {
            return this.Mongo["dnt_mongodb"];
        }
    }

    /// <summary>
    /// Sets up the test environment. You can override this method to add custom initialization.
    /// </summary>
    public virtual void Init()
    {
        string ConnectionString = "Server=10.0.4.85:27017;ConnectTimeout=30000;ConnectionLifetime=300000;MinimumPoolSize=512;MaximumPoolSize=51200;Pooled=true";
        if (String.IsNullOrEmpty(ConnectionString))
            throw new ArgumentNullException("Connection string not found.");
        this.Mongo = new Mongo(ConnectionString);
        this.Mongo.Connect();
    }

    // Map: emit 1 for the post whose _id matches.
    string mapfunction = "function(){\n" +
        "   if(this._id=='000000') { emit(this._id, 1); }\n" +
        "};";

    // Reduce: sum the values emitted for a key.
    string reducefunction = "function(key, current){" +
        "   var count = 0;" +
        "   for(var i in current) {" +
        "       count += current[i];" +
        "   }" +
        "   return count;\n" +
        "};";

    protected void Page_Load(object sender, EventArgs e)
    {
        Init();
        var mrb = DB["posts1"].MapReduce(); // attach_gfstream.files
        int groupCount = 0;
        using (var mr = mrb.Map(mapfunction).Reduce(reducefunction))
        {
            // Each result document carries the reduced count in its "value" field.
            foreach (Document doc in mr.Documents)
            {
                groupCount = int.Parse(doc["value"].ToString());
            }
        }
        this.Mongo.Disconnect();
    }
}
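The map function in this listing relies on two things the server provides: `this` is bound to the current document, and `emit` is available as a global function. A rough way to try the same two functions outside MongoDB is sketched below. The `emit` stub and the sample documents are hypothetical stand-ins, and `'000000'` is simply the placeholder ID used in the listing.

```javascript
// Hand-written stand-in for the emit() that MongoDB provides to map functions.
var emitted = {};
function emit(key, value) {
  (emitted[key] = emitted[key] || []).push(value);
}

// Same logic as the strings passed to Map() and Reduce() in the C# listing.
var mapFunction = function () {
  if (this._id == '000000') { emit(this._id, 1); }
};
var reduceFunction = function (key, current) {
  var count = 0;
  for (var i in current) { count += current[i]; }
  return count;
};

// Hypothetical sample documents standing in for the posts1 collection.
var posts = [
  { _id: '000000', title: 'first' },
  { _id: '000001', title: 'second' }
];

// The server calls map once per document with `this` bound to that document.
posts.forEach(function (post) { mapFunction.call(post); });

// Reduce the values collected for the one key that was emitted.
var groupCount = reduceFunction('000000', emitted['000000']);
console.log(groupCount);
```

Running the functions this way is only a quick check of the JavaScript logic; it says nothing about how the driver or the shards behave.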
Running the page returns the count for the matching post (the result screenshot from the original post is not reproduced here).
Next, we will show how to return the matching posts themselves and load them into a List collection. Here we only query the two posts with the IDs 548110 and 548111:
// Map: emit the whole document for the two posts we are interested in.
string mapfunction = "function(){\n" +
    "   if(this._id=='548110' || this._id=='548111') { emit(this, 1); }\n" +
    "};";

// Reduce: hand the document back unchanged.
string reducefunction = "function(doc, current){" +
    "   return doc;\n" +
    "};";

protected void Page_Load(object sender, EventArgs e)
{
    Init();
    var mrb = DB["posts1"].MapReduce(); // attach_gfstream.files
    List<Document> postDoc = new List<Document>();
    using (var mr = mrb.Map(mapfunction).Reduce(reducefunction))
    {
        foreach (Document doc in mr.Documents)
        {
            postDoc.Add((Document)doc["value"]);
        }
    }
    this.Mongo.Disconnect();
}
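A more common way to pull whole documents through map/reduce is to emit the `_id` as the key and the document itself as the value. MongoDB does not call reduce for a key that has only one emitted value, so with a unique `_id` the document comes back unchanged in `value`. The variant below is a sketch of that pattern, not the code that was tested above; the `emit` stub and the sample documents are hypothetical.

```javascript
// Hand-written stand-in for the emit() that MongoDB provides to map functions.
var emitted = {};
function emit(key, value) {
  (emitted[key] = emitted[key] || []).push(value);
}

// Map: emit the _id as key and the whole document as value.
var mapFunction = function () {
  if (this._id == '548110' || this._id == '548111') { emit(this._id, this); }
};

// Reduce: only needed when a key has several values; _id is unique,
// so passing the first value through is enough.
var reduceFunction = function (key, values) {
  return values[0];
};

// Hypothetical sample documents standing in for the posts1 collection.
var posts = [
  { _id: '548110', title: 'post A' },
  { _id: '548111', title: 'post B' },
  { _id: '548112', title: 'post C' }
];

posts.forEach(function (post) { mapFunction.call(post); });

// Mimic the server: skip reduce when a key has a single emitted value.
var postDocs = Object.keys(emitted).map(function (key) {
  var values = emitted[key];
  return values.length === 1 ? values[0] : reduceFunction(key, values);
});

console.log(postDocs.length);
```

With this shape, the `value` field of each result is the original document, which is what the C# loop expects when it casts `doc["value"]` to `Document`.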
Running the page returns the two posts (the result screenshot from the original post is not reproduced here).
Map/reduce can be used in many other ways. If you are interested, take a look at the following links:
http://cookbook.mongodb.org/patterns/unique_items_map_reduce/
http://www.mongodb.org/display/DOCS/MapReduce
And the article I wrote earlier: http://www.cnblogs.com/daizhj/archive/2010/06/10/1755761.html
Note that some temporary files are generated when mongos performs map/reduce operations.
My guess is that these temporary files may improve performance when the same query is run again (although I have not observed this so far).
I also tested map/reduce against MongoDB's GridFS (which can be used to build a distributed file storage system, as I described in an earlier article). Unfortunately this was not successful; it often reports errors such as:
Thu Sep 09 12:09:29 Assertion failure _grab client\parallel.cpp 461
It seems that there are still some problems when a map/reduce program connects to MongoDB, but I do not know whether the cause is MongoDB's own stability or my machine environment (memory, or a 64-bit mongos being accessed by a 32-bit client).
Well, that is all for today's article.
Link: http://www.cnblogs.com/daizhj/archive/2010/09/09/1822264.html
BLOG: http://daizhj.cnblogs.com/
Author: daizhj, Dai zhenjun