Big Data: Evolution or Revolution?

Whether you use relational database systems, hash tables, or other structures to maintain your data, you have almost certainly heard about NoSQL and big data. Companies such as Google, Yahoo and Amazon are already developing or using big data/NoSQL solutions. But outside of some very specific cases, are these big data implementations really that useful? In a recent article, Capgemini Consulting's Steve even pointed out that big data can sometimes be a scam, or at least is not a panacea for the problems of traditional relational database management systems, as you may have noticed:

I have noticed that the hype about big data in the market has become rampant. Some companies see this explosive growth in data volume as a continuation of history: new technologies and new methods, but evolution rather than revolution. Admittedly, MapReduce is cool, but it is technically harder than SQL and database design, which also means it is far from being a business panacea.

Steve went on to point out that in-memory database technologies (based on relational database management systems) capable of storing very important data sets of considerable size will soon become a reality. He illustrates his point by citing an article that discusses how, a few years ago, Yahoo used a significantly modified Postgres implementation to store 2 PB of data:

Here is the main point about big data: more than 95% of it is just a continuing increase in the amount of data, which is matched by, or at least grows alongside, increases in processing power and storage capacity. (...) Of course, tuning indexes can be harder, and you may want to move data onto solid-state drives, but strictly speaking the data is simply getting "bigger" rather than undergoing a fundamental shift.

We have heard the same thing before from Mike Stonebraker, who says that many users would benefit from approaches such as re-architected relational database management systems and column stores that make as much use as possible of main memory and solid-state drives, while still maintaining traditional strong consistency and ACID semantics and, in some cases, letting you keep using SQL. Steve then returned to MapReduce, arguing that the model behind it requires a different way of thinking about how to store, query, and manipulate data, and that in some ways this makes it harder for users to integrate such a solution into their existing investments.

Just as not many people can think in a multi-threaded way, not many people can think in a MapReduce way.
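To make the difference in mindset concrete, here is a minimal sketch that computes the same per-group total twice: once declaratively with SQL, and once as explicit map, shuffle and reduce steps. The sample data and all function names are hypothetical and chosen only for illustration. The MapReduce version runs in a single process, so it shows only the shape of the programming model, not a distributed framework such as Hadoop.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, amount) sales records.
SALES = [("east", 10), ("west", 5), ("east", 7), ("north", 3), ("west", 1)]


def total_with_sql(rows):
    """Declarative version: describe the result and let the engine plan it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    cursor = conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
    result = dict(cursor)  # each row is a (region, total) pair
    conn.close()
    return result


def map_phase(record):
    """Map: turn one input record into zero or more (key, value) pairs."""
    region, amount = record
    yield region, amount


def shuffle(pairs):
    """Shuffle: group all values by key (a real framework does this across nodes)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def total_with_mapreduce(rows):
    """Same aggregation, but the programmer spells out every stage."""
    pairs = (pair for record in rows for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(total_with_sql(SALES))
    print(total_with_mapreduce(SALES))
```

Both functions return the same totals, but the SQL version is a single statement, while the MapReduce version makes the author decide how records become keys, how keys are grouped, and how groups are combined. That extra design work is the kind of burden Steve is describing.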

With new implementations announced so often, and vendors hoping to push us toward their solutions, where does big data actually stand? In Steve's view:

We find people using "big data" the same way they used "SOA", claiming to "integrate Hadoop" or "integrate social media" or, alternatively, "we've built a connector." Look behind the story and you may be astonished: it is just an old-school enterprise application integration (EAI) connector hooked up to a new data source, or a new ETL connector.

This may be a generalization, but it reflects some truth. With so much hype, and so many vendors attaching NoSQL/big data labels to their own implementations whether or not those implementations suit the task at hand, is there a risk that the core message gets lost behind these "new data solutions"? As Steve points out, the situation may resemble the early days of SOA, when vendors put the SOA tag on their solutions although in most cases they were not SOA at all. So how do you judge whether what you need really is a big data solution, or whether it is a big scam (as Steve puts it)? Steve made some suggestions that can at least be used when evaluating vendor solutions. These include:

  • Can you replace "big data" with "big database"? If you can, it is just an upgrade.

  • Can "advanced" be boiled down to "we just got an enterprise application integration connector"?

  • Is it the same as the 2009 product, only with a big data/NoSQL label on the new one?

  • Is there any way to move the processing to the data instead of moving the data around? This is a practice that many people have advocated in the past, including Jim Gray.
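The last question, moving the processing to the data, can be illustrated with a small sketch. It uses Python's built-in sqlite3 module with a hypothetical `events` table, and all function names are invented for this example. Because an in-memory SQLite database lives in the same process as the caller, nothing actually crosses a network here. The number of rows handed back to the caller is only a stand-in for the data that would have to be moved in a real deployment.

```python
import sqlite3


def build_db(n_rows=1000):
    """Create a hypothetical in-memory table of events with a byte count each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, bytes INTEGER)")
    conn.executemany(
        "INSERT INTO events (bytes) VALUES (?)",
        ((i % 10,) for i in range(n_rows)),
    )
    return conn


def sum_by_moving_data(conn):
    """Pull every row to the caller, then aggregate on the caller's side."""
    rows = conn.execute("SELECT bytes FROM events").fetchall()
    return sum(b for (b,) in rows), len(rows)


def sum_by_moving_processing(conn):
    """Ship the aggregation to the data store; only the answer comes back."""
    rows = conn.execute("SELECT SUM(bytes) FROM events").fetchall()
    return rows[0][0], len(rows)


if __name__ == "__main__":
    conn = build_db()
    print(sum_by_moving_data(conn))        # (total, rows transferred)
    print(sum_by_moving_processing(conn))  # (total, rows transferred)
    conn.close()
```

Both calls compute the same total, but the first returns one row per event to the caller while the second returns a single row. When evaluating a vendor's offering, it is worth asking which of these two patterns its connector actually follows.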

Unfortunately, these "rules" are unscientific and require some degree of subjective judgment. So are there any other rules available? If you have migrated from a traditional relational database management system to another platform, how did you determine that the migration was needed, and how did you choose the specific implementation to migrate to? Was the migration successful? If not, why?

(Author: Mark Little; translator: Li Wei)

(Responsible editor: The good of the Legacy)
