The Apache Hadoop distributed data-processing framework is effective, and it is gaining attention. However, it also has drawbacks. Some organizations find that getting started with Hadoop requires rethinking their software architecture and acquiring the data skills it demands.
For some, one of the problems with Hadoop's batch model is that it assumes there is time between rounds of data acquisition in which to process the accumulated batches. That assumption holds for many businesses that operate locally or do most of their business during the day and little or none at night. If the overnight window is large enough to process the data accumulated during the day, everything is fine. But for some businesses the idle window is small or nonexistent; even with Hadoop's high-performance processing, they take in more data in a day than they can work through in 24 hours.
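The arithmetic behind that window problem is simple, and the sketch below makes it concrete. It is illustrative only: the intake volume, cluster throughput, and window length are assumed figures, not numbers from the article.

```java
// Illustrative only: checks whether a nightly batch window can keep up with
// the data a business accumulates during the day. All figures are assumptions.
public class BatchWindowCheck {
    public static void main(String[] args) {
        double dailyIntakeTb = 12.0;              // data collected per business day (assumed)
        double clusterThroughputTbPerHour = 1.0;  // Hadoop cluster processing rate (assumed)
        double nightlyWindowHours = 8.0;          // idle window available for batch jobs (assumed)

        double processableTb = clusterThroughputTbPerHour * nightlyWindowHours;
        double backlogTb = dailyIntakeTb - processableTb;

        if (backlogTb <= 0) {
            System.out.println("The nightly window is large enough; no backlog accumulates.");
        } else {
            System.out.printf("Backlog grows by %.1f TB per day; the window is too small.%n", backlogTb);
        }
    }
}
```

With these assumed numbers the cluster can clear 8 TB a night against 12 TB of daily intake, so the backlog grows by 4 TB every day, which is exactly the situation described above.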
For organizations that have to make do with small windows, adding a component that processes data as it arrives can help, GigaSpaces chief technology officer Nati Shalom wrote in a recent blog post about making Hadoop faster. By continuously condensing incoming data into useful packages and discarding data the business does not need to process (or reprocess), organizations can significantly speed up batch processing of their big data, as the sketch below illustrates.
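The following is a minimal sketch of that pre-processing idea: filter out records the business never needs and roll the rest up into compact packages before the nightly batch job runs. The Event type, the filter rule, and the per-customer aggregation key are assumptions made for illustration, not part of GigaSpaces' product or Shalom's post.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A minimal sketch of pre-processing data as it arrives so the batch job has less to do.
public class IncomingDataPreprocessor {

    // A hypothetical incoming record.
    record Event(String customerId, String type, long bytes) {}

    // Drop records the business never needs to (re)process, and roll the rest
    // up into compact per-customer totals so the nightly Hadoop job reads far
    // less raw data.
    static Map<String, Long> preAggregate(List<Event> incoming) {
        Map<String, Long> totals = new HashMap<>();
        for (Event e : incoming) {
            if ("heartbeat".equals(e.type())) {
                continue; // data that never needs processing: discard instead of storing it
            }
            totals.merge(e.customerId(), e.bytes(), Long::sum);
        }
        return totals; // write these compact packages out for the batch job
    }

    public static void main(String[] args) {
        List<Event> batch = List.of(
                new Event("c1", "purchase", 512),
                new Event("c1", "heartbeat", 16),
                new Event("c2", "purchase", 2048));
        System.out.println(preAggregate(batch)); // {c1=512, c2=2048}
    }
}
```

The point of the design is that the expensive work shifts from the constrained overnight window to the moment data arrives, so the batch job only sees data that actually needs batch treatment.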