Future predictions: Hadoop will not be able to handle big data alone
Hadoop will not be able to handle big data alone
"Hadoop and MapReduce models are definitely one way to solve big data problems," Sriram said. But what you need to keep in mind is that Hadoop is just as good for batch processing as it is now. I believe that soon, we need to be able to process this data in real time. "Sriram, a Hadoop consultant, is not saying that this ubiquitous platform is slow. With such a powerful framework, a large amount of data can be processed within a minute, but that is not always good enough. How to solve this problem?
Shaun Connolly, vice president of corporate strategy at Hortonworks, points out that Hadoop has been steadily getting faster and more flexible. "We are increasingly focused on optimizing the NoSQL databases used alongside Hadoop. They can take advantage of in-memory processing, so queries can be answered faster without going through batch processing. If you use YARN, you can actually run more interactive, memory-based queries." In addition, there is a surge of streaming analysis tools and frameworks built on technologies such as Storm, and developers can embed them into Hadoop through the YARN architecture. Today's big data users of Hadoop are pursuing near-real-time performance. However, this is not 100% real time, and one important difference lies between an organization using a computer to make instant, automatic decisions and one that acts on analysis performed long after the fact, which may already have passed through human hands.
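By way of contrast with the batch job above, the fragment below sketches what the streaming side can look like with Apache Storm, assuming the org.apache.storm core API and its bundled TestWordSpout as a stand-in data source: a bolt updates a running count for each tuple as it arrives, so results are continuously available instead of waiting for a batch to finish. On a YARN cluster, a topology like this can run alongside MapReduce jobs.

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.storm.Config;
    import org.apache.storm.LocalCluster;
    import org.apache.storm.testing.TestWordSpout;
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.TopologyBuilder;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;

    public class StreamingCountSketch {

        // Keeps a running count per word and emits the updated total on every tuple.
        public static class RunningCountBolt extends BaseBasicBolt {
            private final Map<String, Long> counts = new HashMap<>();

            @Override
            public void execute(Tuple input, BasicOutputCollector collector) {
                String word = input.getStringByField("word");
                long total = counts.merge(word, 1L, Long::sum);
                collector.emit(new Values(word, total)); // available immediately, not after a batch run
            }

            @Override
            public void declareOutputFields(OutputFieldsDeclarer declarer) {
                declarer.declare(new Fields("word", "count"));
            }
        }

        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("words", new TestWordSpout()); // stand-in stream source
            builder.setBolt("counter", new RunningCountBolt(), 2).shuffleGrouping("words");

            LocalCluster cluster = new LocalCluster(); // local test run only; production would submit to the cluster
            cluster.submitTopology("streaming-count", new Config(), builder.createTopology());
            Thread.sleep(10_000); // let the topology process for a while
            cluster.shutdown();
        }
    }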
This is where the lambda architecture comes in. It allows organizations to separate newly arriving, incremental data from the bulk of their data and process the two independently. Most of the data goes through the batch system, while a so-called "speed layer" handles the real-time processing of recent data. Most NoSQL databases have their own ecosystems, because they provide dedicated tools for managing data to suit specific use cases.
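A minimal sketch of the query side of that idea, in plain Java with hypothetical batchView and speedView structures: the batch layer periodically recomputes a view over the full dataset, the speed layer keeps incremental counts for data that arrived since the last batch run, and a query is answered by merging the two.

    import java.util.HashMap;
    import java.util.Map;

    public class LambdaQuerySketch {
        // Batch layer output: precomputed from the full, immutable master dataset.
        private final Map<String, Long> batchView = new HashMap<>();
        // Speed layer output: incremental counts for data seen since the last batch run.
        private final Map<String, Long> speedView = new HashMap<>();

        // Serve a query by combining both views.
        public long count(String key) {
            return batchView.getOrDefault(key, 0L) + speedView.getOrDefault(key, 0L);
        }

        // The speed layer updates its view as each event streams in.
        public void onEvent(String key) {
            speedView.merge(key, 1L, Long::sum);
        }

        // When a batch recomputation finishes, its results replace the batch view
        // and the speed-layer increments it now covers are discarded.
        public void onBatchComplete(Map<String, Long> freshBatchView) {
            batchView.clear();
            batchView.putAll(freshBatchView);
            speedView.clear();
        }
    }

The essential design choice is that the speed layer only has to cover the gap since the last batch run, so its state stays small and can be thrown away each time the batch view catches up.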
Integration will be critical, but no single tool works for everyone.
When it comes to supplementing Hadoop, well-designed tools are appearing across the big data landscape at a remarkable rate. Elasticsearch, Pentaho, and many other tools cover different niches of the big data ecosystem. But the next big step is getting them to work together better. Until that stage arrives, the management of big data will remain rather ad hoc.
Of course, this does not mean that a single integrated product will suit every business model. Data comes in many forms, and every organization wants to do something different with that information. Organizations will need to process their data in a variety of ways, depending on where the data comes from, its format, why they collect it, how they want to store it, how they want to analyze it, and how quickly they need to act on it. The goal is to consolidate while remaining modular, so that organizations can assemble the right tools for their own unique use cases without having to start from scratch every time.
Software engineers who are familiar with big data technology will be in great demand
Mohan points out that one of the most significant challenges in the big data space is the small talent pool. "There are not many people with this experience." That does not mean software engineers need to go back to school for a doctorate; skilled workers do not need a Ph.D. to understand big data. However, they do need to acquire the knowledge and professional skills. The point, says Sriram, is that any software engineer willing to invest the time and energy can get there. The classroom is not necessarily the only starting point: working to scale relational databases, and then making the transition beyond them, lays a solid foundation for mastering big data problems.
What Dr. Mohan is doing is preparing today's software engineers for the working world of the future. He will offer two classes at Big Data TechCon in Boston: one on Hadoop's data transfer tools and one introducing MapReduce. For those who want to stay competitive in the job market over the next few years, now is the time to start.