R Performance Test: rmongodb for MongoDB

Source: Internet
Author: User

At the beginning of September, rmongodb officially released a revised version, which means that R, a language for numerical computing, can now work with NoSQL products as well. However, no company I know of is actually using R together with MongoDB, so we did not dare take the efficiency question lightly. I therefore ran the following test.

The test environment is an 8-core, 64-bit machine. The collection used for testing is not sharded and is about 30 GB. It stores data such as user preferences and tag information.

    library(rmongodb)

    mongo <- mongo.create()
    if (mongo.is.connected(mongo)) {
        ns <- 'rivendell.user'

        print('Query a field without an index, one record')
        print(system.time(p <- mongo.find.one(mongo, ns, list(friend = 600))))

        print('Query a field without an index, multiple records, without buffer')
        print(system.time(p <- mongo.find(mongo, ns, list(friend = 600))))
        print('Check whether there is a caching strategy')
        print(system.time(p <- mongo.find(mongo, ns, list(friend = 600))))

        print('Query a field without an index, multiple records, with buffer')
        buf <- mongo.bson.buffer.create()
        mongo.bson.buffer.append(buf, 'friend', 600L)
        query <- mongo.bson.from.buffer(buf)
        print(system.time(p <- mongo.find(mongo, ns, query)))
        print('Check whether there is a caching strategy')
        buf <- mongo.bson.buffer.create()
        mongo.bson.buffer.append(buf, 'friend', 600L)
        query <- mongo.bson.from.buffer(buf)
        print(system.time(p <- mongo.find(mongo, ns, query)))

        print('Greater-than query, one record')
        print(system.time(p <- mongo.find.one(mongo, ns, list(friend = list('$gt' = 600L)))))
        print('Greater-than query, multiple records')
        print(system.time(cursor <- mongo.find(mongo, ns, list(friend = list('$gt' = 600L)))))
        mongo.cursor.destroy(cursor)

        print('Query one record by an indexed field')
        print(system.time(p <- mongo.find.one(mongo, ns, list('_id' = 3831809L))))
        print('Query records by an indexed field')
        print(system.time(p <- mongo.find(mongo, ns, list('_id' = 3831809L))))

        print('Insert one record')
        buf <- mongo.bson.buffer.create()
        mongo.bson.buffer.append(buf, 'name', 'huangxin')
        mongo.bson.buffer.append(buf, 'age', 22L)
        p <- mongo.bson.from.buffer(buf)
        print(system.time(mongo.insert(mongo, ns, p)))

        print('Find the record just inserted')
        print(system.time(p <- mongo.find.one(mongo, ns, list('name' = 'huangxin'))))
        if (!is.null(p)) {
            print('success')
        }

        print('Bulk insert')
        # Build three identical documents, each from its own buffer
        buf <- mongo.bson.buffer.create()
        mongo.bson.buffer.append(buf, 'name', 'huangxin')
        mongo.bson.buffer.append(buf, 'age', 22L)
        p1 <- mongo.bson.from.buffer(buf)

        buf <- mongo.bson.buffer.create()
        mongo.bson.buffer.append(buf, 'name', 'huangxin')
        mongo.bson.buffer.append(buf, 'age', 22L)
        p2 <- mongo.bson.from.buffer(buf)

        buf <- mongo.bson.buffer.create()
        mongo.bson.buffer.append(buf, 'name', 'huangxin')
        mongo.bson.buffer.append(buf, 'age', 22L)
        p3 <- mongo.bson.from.buffer(buf)

        print(system.time(mongo.insert.batch(mongo, ns, list(p1, p2, p3))))

        print('Find the records just bulk-inserted')
        print(system.time(cursor <- mongo.find(mongo, ns, list('name' = 'huangxin'))))
        # Count the matching records by walking the cursor
        i <- 0
        while (mongo.cursor.next(cursor)) {
            i <- i + 1
        }
        print(i)

        print('Batch update')
        # With the default flags, MongoDB updates only the first matching document
        print(system.time(mongo.update(mongo, ns, list(name = 'huangxin'), list('name' = 'kym'))))

        print('Check whether the update succeeded')
        print(system.time(p <- mongo.find.one(mongo, ns, list('name' = 'kym'))))
        if (!is.null(p)) {
            print('success')
        }

        print('Bulk deletion')
        print(system.time(mongo.remove(mongo, ns, list(name = 'kym'))))

        # After the deletion this lookup should return NULL, so nothing more is printed
        print(system.time(p <- mongo.find.one(mongo, ns, list('name' = 'kym'))))
        if (!is.null(p)) {
            print('success')
        }
    }
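The test script repeats the same print-then-time pattern for every operation. As a minimal sketch, that pattern could be wrapped in a small helper. The name `timed` is hypothetical and not part of rmongodb, and the example times a plain arithmetic expression so that it runs without a database.

```r
# Hypothetical helper: print a label, time an expression, return elapsed seconds.
timed <- function(label, expr) {
  print(label)
  # expr is evaluated lazily, so it is first run inside system.time()
  timing <- system.time(expr)
  print(timing)
  invisible(timing[["elapsed"]])
}

# The assignment inside the call takes effect in the caller's environment,
# just like print(system.time(p <- ...)) in the test script.
elapsed <- timed("Sum of one million numbers", total <- sum(as.numeric(1:1e6)))
```

In the test script, a call such as `timed('Insert one record', mongo.insert(mongo, ns, p))` would presumably replace the paired `print` lines.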

 

    [1] "Query a field without an index, one record"
       user  system elapsed
      0.000   0.000   0.115
    [1] "Query a field without an index, multiple records, without buffer"
       user  system elapsed
      0.000   0.000  32.513
    [1] "Check whether there is a caching strategy"
       user  system elapsed
      0.000   0.000  32.528
    [1] "Query a field without an index, multiple records, with buffer"
       user  system elapsed
      0.000   0.000  32.685
    [1] "Check whether there is a caching strategy"
       user  system elapsed
      0.000   0.000  33.172
    [1] "Greater-than query, one record"
       user  system elapsed
      0.000   0.000   0.001
    [1] "Greater-than query, multiple records"
       user  system elapsed
      0.000   0.000   0.014
    [1] "Query one record by an indexed field"
       user  system elapsed
          0       0       0
    [1] "Query records by an indexed field"
       user  system elapsed
          0       0       0
    [1] "Insert one record"
       user  system elapsed
          0       0       0
    [1] "Find the record just inserted"
       user  system elapsed
       0.00    0.00   35.42
    [1] "success"
    [1] "Bulk insert"
       user  system elapsed
          0       0       0
    [1] "Find the records just bulk-inserted"
       user  system elapsed
      0.004   0.000  35.934
    [1] 7
    [1] "Batch update"
       user  system elapsed
      0.000   0.004   0.000
    [1] "Check whether the update succeeded"
       user  system elapsed
      0.000   0.000  67.773
    [1] "success"
    [1] "Bulk deletion"
       user  system elapsed
          0       0       0
       user  system elapsed
      0.000   0.000  91.396

What I could not understand at first was why the gap between the greater-than query and the equality query was so large. Later, when I ran the same test in Python, I found that Python's efficiency was the same, which shows that this is not a MongoDB problem. Nor do I believe that, at the database level, one language's driver could differ that much from another's.

Later I discovered a difference between the Python and R drivers for MongoDB. In Python, find does not pull the whole result set back. It returns a cursor. In other words, executing the find command itself takes no time, and the query is actually executed only when you iterate with while cursor.next().

R is different. It first considers the size of the result set (or some other factor) and then, depending on the case, either returns a cursor or pulls the whole result set back. If we include the earlier while (mongo.cursor.next(cursor)) loop in the timing, the efficiency difference between the greater-than and the equality operations is no longer obvious.
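The effect of including the iteration in the measurement can be illustrated without a database. The sketch below simulates a lazy cursor in base R: `make_lazy_cursor` and `cursor_next` are hypothetical stand-ins, not rmongodb functions, and the per-row "work" is only a counter. It shows that creating the cursor fetches nothing, so a fair timing has to cover both the find and the loop that drains it.

```r
# Simulated lazy cursor: no rows are fetched until cursor_next() is called.
# make_lazy_cursor() and cursor_next() are hypothetical, not rmongodb functions.
make_lazy_cursor <- function(n_rows) {
  state <- new.env()
  state$pos <- 0L       # current position in the result set
  state$n <- n_rows     # total number of matching rows
  state$fetched <- 0L   # how many rows have actually been fetched
  state
}

cursor_next <- function(cursor) {
  if (cursor$pos >= cursor$n) return(FALSE)
  cursor$pos <- cursor$pos + 1L
  cursor$fetched <- cursor$fetched + 1L  # stands in for the real per-row work
  TRUE
}

# "find" alone: the cursor exists, but nothing has been fetched yet.
first <- make_lazy_cursor(100000L)
fetched_after_find <- first$fetched

# Time the find and the full iteration together.
count <- 0L
timing <- system.time({
  cursor <- make_lazy_cursor(100000L)
  while (cursor_next(cursor)) count <- count + 1L
})

print(fetched_after_find)
print(count)
print(timing)
```

With rmongodb, the analogous measurement would presumably place both `mongo.find` and the `while (mongo.cursor.next(cursor))` loop inside a single `system.time` call.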

In practice, bulk insert is a very common scenario, but in languages such as R or MATLAB, loop efficiency has always been a weak point. So next, I will try to use the apply family of functions to replace loops in R. If that proves feasible in practice, it will then be worth trying parallel computing libraries such as Mutilab to take full advantage of multiple cores.
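As a rough illustration of the loop-versus-apply idea, the sketch below builds a batch of records both ways in base R and checks that the results are identical. The helper `make_record` and its `seq` field are hypothetical. The records are plain R lists, and converting them to BSON and passing them to `mongo.insert.batch` is left out so that the example runs without a database.

```r
# Hypothetical helper: build one record as a plain R list.
# The name and age fields follow the test script; seq is added for illustration.
make_record <- function(i) list(name = "huangxin", age = 22L, seq = i)

n <- 1000L

# Loop version: preallocate the list, then fill it element by element.
docs_loop <- vector("list", n)
for (i in seq_len(n)) {
  docs_loop[[i]] <- make_record(i)
}

# apply-family version: lapply builds the same list in one call.
docs_apply <- lapply(seq_len(n), make_record)

print(identical(docs_loop, docs_apply))
print(length(docs_apply))
```

One reason to prefer the `lapply` form is that, on Unix-like systems, `parallel::mclapply` accepts the same first two arguments, so a later move to multiple cores might need only a small change. Whether it is actually faster here would have to be measured.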

Original link: http://www.cnblogs.com/kym/archive/2011/09/26/2191501.html
