R Performance Test: rmongodb for MongoDB

Last Update:2014-12-28 Source: Internet

Author: User

At the beginning of September, rmongodb officially released a revised version, which means that R, a language for numerical computing, can now work with NoSQL products. However, no company I know of really uses R and MongoDB together, so I did not dare to take efficiency for granted and ran the following test.

The test environment is an 8-core, 64-bit machine. The database used for testing is a single unsharded collection of about 30 GB, which stores data such as user preferences and tag information.

library(rmongodb)

mongo <- mongo.create()

if (mongo.is.connected(mongo)) {
    ns <- 'rivendell.user'

    # Equality queries on "friend", a field without an index
    print('Query a field without an index, one record')
    print(system.time(p <- mongo.find.one(mongo, ns, list(friend = 600))))

    print('Query a field without an index, multiple records, without buffer')
    print(system.time(p <- mongo.find(mongo, ns, list(friend = 600))))

    print('Check whether there is a caching strategy')
    print(system.time(p <- mongo.find(mongo, ns, list(friend = 600))))

    # Same query, but built with a BSON buffer instead of an R list
    print('Query a field without an index, multiple records, with buffer')
    buf <- mongo.bson.buffer.create()
    mongo.bson.buffer.append(buf, 'friend', 600L)
    query <- mongo.bson.from.buffer(buf)
    print(system.time(p <- mongo.find(mongo, ns, query)))

    print('Check whether there is a caching strategy')
    buf <- mongo.bson.buffer.create()
    mongo.bson.buffer.append(buf, 'friend', 600L)
    query <- mongo.bson.from.buffer(buf)
    print(system.time(p <- mongo.find(mongo, ns, query)))

    # Greater-than queries on the same field
    print('Greater-than query, one record')
    print(system.time(p <- mongo.find.one(mongo, ns, list(friend = list('$gt' = 600L)))))

    print('Greater-than query, multiple records')
    print(system.time(cursor <- mongo.find(mongo, ns, list(friend = list('$gt' = 600L)))))
    mongo.cursor.destroy(cursor)

    # Queries on the indexed _id field
    print('Query one record by an indexed field')
    print(system.time(p <- mongo.find.one(mongo, ns, list('_id' = 3831809L))))

    print('Query records by an indexed field')
    print(system.time(p <- mongo.find(mongo, ns, list('_id' = 3831809L))))

    print('Insert a record')
    buf <- mongo.bson.buffer.create()
    mongo.bson.buffer.append(buf, 'name', 'huangxin')
    mongo.bson.buffer.append(buf, 'age', 22L)
    p <- mongo.bson.from.buffer(buf)
    print(system.time(mongo.insert(mongo, ns, p)))

    print('Find the record just inserted')
    print(system.time(p <- mongo.find.one(mongo, ns, list('name' = 'huangxin'))))
    if (!is.null(p)) {
        print('success')
    }

    # Build three identical documents and insert them in one batch
    print('Bulk insert')
    buf <- mongo.bson.buffer.create()
    mongo.bson.buffer.append(buf, 'name', 'huangxin')
    mongo.bson.buffer.append(buf, 'age', 22L)
    p1 <- mongo.bson.from.buffer(buf)

    buf <- mongo.bson.buffer.create()
    mongo.bson.buffer.append(buf, 'name', 'huangxin')
    mongo.bson.buffer.append(buf, 'age', 22L)
    p2 <- mongo.bson.from.buffer(buf)

    buf <- mongo.bson.buffer.create()
    mongo.bson.buffer.append(buf, 'name', 'huangxin')
    mongo.bson.buffer.append(buf, 'age', 22L)
    p3 <- mongo.bson.from.buffer(buf)

    print(system.time(mongo.insert.batch(mongo, ns, list(p1, p2, p3))))

    print('Find the records just bulk inserted')
    print(system.time(cursor <- mongo.find(mongo, ns, list('name' = 'huangxin'))))
    i <- 0
    while (mongo.cursor.next(cursor)) {
        i <- i + 1
    }
    print(i)

    # Note: called without the mongo.update.multi flag, as in the original test
    print('Bulk update')
    print(system.time(mongo.update(mongo, ns, list(name = 'huangxin'), list('name' = 'kym'))))

    print('Check whether the update succeeded')
    print(system.time(p <- mongo.find.one(mongo, ns, list('name' = 'kym'))))
    if (!is.null(p)) {
        print('success')
    }

    print('Bulk delete')
    print(system.time(mongo.remove(mongo, ns, list(name = 'kym'))))
}

# Look for a deleted record; 'success' is printed only if one is still found
print(system.time(p <- mongo.find.one(mongo, ns, list('name' = 'kym'))))
if (!is.null(p)) {
    print('success')
}

[1] "Query a field without an index, one record"
   user  system elapsed
  0.000   0.000   0.115
[1] "Query a field without an index, multiple records, without buffer"
   user  system elapsed
  0.000   0.000  32.513
[1] "Check whether there is a caching strategy"
   user  system elapsed
  0.000   0.000  32.528
[1] "Query a field without an index, multiple records, with buffer"
   user  system elapsed
  0.000   0.000  32.685
[1] "Check whether there is a caching strategy"
   user  system elapsed
  0.000   0.000  33.172
[1] "Greater-than query, one record"
   user  system elapsed
  0.000   0.000   0.001
[1] "Greater-than query, multiple records"
   user  system elapsed
  0.000   0.000   0.014
[1] "Query one record by an indexed field"
   user  system elapsed
      0       0       0
[1] "Query records by an indexed field"
   user  system elapsed
      0       0       0
[1] "Insert a record"
   user  system elapsed
      0       0       0
[1] "Find the record just inserted"
   user  system elapsed
   0.00    0.00   35.42
[1] "success"
[1] "Bulk insert"
   user  system elapsed
      0       0       0
[1] "Find the records just bulk inserted"
   user  system elapsed
  0.004   0.000  35.934
[1] 7
[1] "Bulk update"
   user  system elapsed
  0.000   0.004   0.000
[1] "Check whether the update succeeded"
   user  system elapsed
  0.000   0.000  67.773
[1] "success"
[1] "Bulk delete"
   user  system elapsed
      0       0       0
   user  system elapsed
  0.000   0.000  91.396

What I did not understand at first was why the gap between the greater-than query and the equality query was so large. Later, when I ran the same test in Python, I found that both queries performed the same there. This shows that the gap is not a MongoDB problem, and I did not believe that the drivers of different languages could differ that much at the database level.

Later I discovered a difference between the Python and R drivers for MongoDB. In Python, find does not pull the whole result set back. It returns a cursor, so the find call itself takes almost no time, and the query is actually executed only once you iterate with while cursor.next().
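To illustrate why timing a lazy find call says little about the query itself, here is a minimal simulation in plain R. The make_lazy_cursor function and its fields are hypothetical names invented for this sketch; it does not talk to MongoDB and is not part of any driver.

```r
# Hypothetical lazy cursor: creating it does no work; records are only
# examined when advance() is called.
make_lazy_cursor <- function(records, predicate) {
  pos <- 0L        # index of the last record examined
  scanned <- 0L    # how many records have been examined so far
  current <- NULL  # the most recent matching record
  list(
    # Move to the next matching record; returns TRUE if one was found
    advance = function() {
      while (pos < length(records)) {
        pos <<- pos + 1L
        scanned <<- scanned + 1L
        if (predicate(records[[pos]])) {
          current <<- records[[pos]]
          return(TRUE)
        }
      }
      FALSE
    },
    value = function() current,
    scanned = function() scanned
  )
}

# 1000 fake user records; exactly one of them has friend == 600
records <- lapply(1:1000, function(i) list(id = i, friend = i %% 700))

cursor <- make_lazy_cursor(records, function(r) r$friend == 600)
scanned_before <- cursor$scanned()  # nothing scanned yet: "find" was free

matches <- 0L
while (cursor$advance()) matches <- matches + 1L
scanned_after <- cursor$scanned()   # the full scan happened during iteration

print(c(before = scanned_before, after = scanned_after, matches = matches))
```

In this sketch all of the scanning cost is paid inside the while loop, which is why a fair timing has to include the iteration.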

R behaves differently: it appears to first consider the size of the result set (or some other factor) and then, depending on the case, either return a cursor or pull the whole result set back. If we include the while (mongo.cursor.next(cursor)) loop in the timing, the efficiency difference between the greater-than and equality queries is no longer obvious.
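A timing that covers both the find call and the cursor iteration could be sketched as follows. The helper name time_full_query is an assumption made for this example; the rmongodb calls are the same ones used in the test script above, and the guarded part only runs if rmongodb is installed and a MongoDB server is reachable.

```r
# Hypothetical helper: time a query together with the iteration over its
# cursor, so that lazy and eager fetching are measured on equal terms.
time_full_query <- function(mongo, ns, query) {
  n <- 0L
  timing <- system.time({
    cursor <- mongo.find(mongo, ns, query)
    while (mongo.cursor.next(cursor)) n <- n + 1L
    mongo.cursor.destroy(cursor)
  })
  list(count = n, elapsed = timing[["elapsed"]])
}

# Only attempted when rmongodb is available and a server answers
if (requireNamespace("rmongodb", quietly = TRUE)) {
  library(rmongodb)
  mongo <- mongo.create()
  if (mongo.is.connected(mongo)) {
    ns <- "rivendell.user"
    print(time_full_query(mongo, ns, list(friend = 600L)))
    print(time_full_query(mongo, ns, list(friend = list("$gt" = 600L))))
    mongo.destroy(mongo)
  }
}
```

Note that the greater-than query may match a very large number of records in a 30 GB collection, so iterating over all of them can take a long time.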

In practice, bulk insert is a very common scenario, but in languages such as R or MATLAB the efficiency of loops has always been a weak point. So next I will try to use the apply family of functions to replace loops in R. If that proves feasible in practice, it will be worth trying parallel computing libraries such as Mutilab to make full use of multiple cores.
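As a rough sketch of that idea, the documents for a bulk insert can be built with lapply instead of an explicit loop. The users data frame and the insert_docs helper are hypothetical names for this example; mongo.bson.from.list and mongo.insert.batch are rmongodb functions, and insert_docs is only defined here, not called, since it needs an open connection. Whether the apply version is actually faster would have to be measured.

```r
# Hypothetical input data for the bulk insert
users <- data.frame(
  name = c("huangxin", "kym", "user3"),
  age = c(22L, 23L, 24L),
  stringsAsFactors = FALSE
)

# Loop version: fill a preallocated list one document at a time
docs_loop <- vector("list", nrow(users))
for (i in seq_len(nrow(users))) {
  docs_loop[[i]] <- list(name = users$name[i], age = users$age[i])
}

# apply version: the same documents built with lapply
docs_apply <- lapply(seq_len(nrow(users)), function(i) {
  list(name = users$name[i], age = users$age[i])
})

print(identical(docs_loop, docs_apply))

# Hypothetical helper: convert the R lists to BSON and insert them in one
# batch. Requires library(rmongodb) and a connected mongo object.
insert_docs <- function(mongo, ns, docs) {
  bson_docs <- lapply(docs, mongo.bson.from.list)
  mongo.insert.batch(mongo, ns, bson_docs)
}
```

With a live connection, insert_docs(mongo, 'rivendell.user', docs_apply) would send all documents in a single batch call.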

Original link: http://www.cnblogs.com/kym/archive/2011/09/26/2191501.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
