MySQL, MongoDB, or Hadoop: discussion notes

Source: Internet
Author: User

This article is based on a discussion in a Google group. It compares MongoDB, Hadoop, and MySQL and looks at when each one is the right tool.

The problem is as follows:

The discussion took place on the mongodb-user Google group. The original poster, a technician at an advertising company, needs to store and analyze 0.5 billion rows of log data. He raised the problem in the group and put forward three ideas:

  1. Store all 0.5 billion rows, each with 50 fields, in MySQL and index every field.
  2. Same as above, but store the hundreds of millions of rows in MongoDB and create the indexes there.
  3. Load all rows into Hadoop and analyze the data with MapReduce.
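To get a feel for the scale behind the first two ideas, here is a rough back-of-the-envelope sketch. The row and field counts come from the discussion; the average value size of 20 bytes is purely an assumption for illustration, and the estimate ignores storage-engine overhead and the size of the indexes themselves.

```python
ROWS = 500_000_000            # 0.5 billion rows, from the discussion
FIELDS_PER_ROW = 50           # 50 fields per row, from the discussion
ASSUMED_BYTES_PER_VALUE = 20  # assumption for illustration, not from the thread


def index_entries(rows, indexed_fields):
    """Roughly one index entry per row for every indexed field."""
    return rows * indexed_fields


def raw_size_gb(rows, fields, bytes_per_value):
    """Raw data size in decimal gigabytes, ignoring all storage overhead."""
    return rows * fields * bytes_per_value / 1e9


if __name__ == "__main__":
    print("index entries:", index_entries(ROWS, FIELDS_PER_ROW))
    print("raw data (GB):", raw_size_gb(ROWS, FIELDS_PER_ROW, ASSUMED_BYTES_PER_VALUE))
```

Under these assumptions, indexing every field means maintaining about 25 billion index entries on top of roughly 500 GB of raw data, which is why the choice of storage and processing layer matters here.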

This is a common problem. Try to think through a solution yourself first, then read what the experts in the thread suggested.

The following is the address of this discussion:

http://groups.google.com/group/mongodb-user/browse_thread/thread/632d1648707e51d9/9a504c99168cf4e7?pli=1

Suggestions:

1. Use MongoDB as the storage layer and Hadoop for processing.

2. One view is that MongoDB is not suitable for data that is frequently inserted or updated, because it requires manual sharding. (Is that still necessary? MongoDB has supported sharding since version 1.6.)

3. MongoDB's JavaScript-based MapReduce is single-threaded and may become a bottleneck for data processing, but you can write your own MapReduce program instead.

4. What would a Hadoop plugin mean here? Would MongoDB provide an input reader, input splits, and so on? Answer: the participants really do want to write a JAR package for Hadoop ....

5. I think Hadoop + MongoDB will be a perfect combination.

Suppose the query is: SELECT os, SUM(imps), SUM(clicks) FROM table WHERE cc = 'us' AND ismobile = 1 GROUP BY os;

Store the data in MongoDB.

Use Mongo as the input to Hadoop, using Mongo's index to select cc = 'us' and ismobile = 1.

Use Hadoop to do the aggregation.

Store the result in your d

6. Some participants think this is inappropriate, because MongoDB's query indexes and related features may not be as good as MySQL's. In that case MongoDB might serve only as the storage layer in the early stage.
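The pipeline from suggestion 5 can be sketched in plain Python. This is only an illustration under stated assumptions, not the thread's implementation: the sample rows are made up, `select_rows` stands in for the indexed MongoDB query, `map_row` and `reduce_pairs` stand in for a hand-written Hadoop job, and an in-memory SQLite table stands in for MySQL to check that both routes give the same answer.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample log rows; field names follow the query in the text.
ROWS = [
    {"os": "android", "cc": "us", "ismobile": 1, "imps": 100, "clicks": 7},
    {"os": "ios", "cc": "us", "ismobile": 1, "imps": 80, "clicks": 5},
    {"os": "android", "cc": "us", "ismobile": 1, "imps": 50, "clicks": 2},
    {"os": "windows", "cc": "us", "ismobile": 0, "imps": 300, "clicks": 9},
    {"os": "android", "cc": "de", "ismobile": 1, "imps": 40, "clicks": 1},
]


def select_rows(rows, cc, ismobile):
    """Stand-in for the indexed MongoDB query that feeds the job."""
    return (r for r in rows if r["cc"] == cc and r["ismobile"] == ismobile)


def map_row(row):
    """Map phase: emit (key, value) with the OS as the grouping key."""
    return row["os"], (row["imps"], row["clicks"])


def reduce_pairs(pairs):
    """Reduce phase: sum impressions and clicks per key."""
    totals = defaultdict(lambda: [0, 0])
    for key, (imps, clicks) in pairs:
        totals[key][0] += imps
        totals[key][1] += clicks
    return {key: tuple(value) for key, value in totals.items()}


def run_pipeline(rows):
    """Filter first, then map and reduce, as in suggestion 5."""
    return reduce_pairs(map_row(r) for r in select_rows(rows, "us", 1))


def run_sql(rows):
    """Reference answer: the same query in SQL (SQLite as a stand-in for MySQL)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE logs (os TEXT, cc TEXT, ismobile INTEGER, "
        "imps INTEGER, clicks INTEGER)"
    )
    con.executemany(
        "INSERT INTO logs VALUES (:os, :cc, :ismobile, :imps, :clicks)", rows
    )
    cur = con.execute(
        "SELECT os, SUM(imps), SUM(clicks) FROM logs "
        "WHERE cc = 'us' AND ismobile = 1 GROUP BY os"
    )
    result = {os_name: (imps, clicks) for os_name, imps, clicks in cur}
    con.close()
    return result


if __name__ == "__main__":
    print(run_pipeline(ROWS))
    print(run_sql(ROWS))
```

Filtering before the map phase mirrors the idea of letting the database index narrow the input, so the aggregation step only sees the rows it needs.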
