Spark becomes an Apache top-level project and looks set to replace MapReduce
Apache Spark is an in-memory data processing framework that has now been promoted to an Apache top-level project. The promotion should help improve Spark's stability and strengthen its position as the replacement for MapReduce in the next generation of big data applications.
Spark has gained strong momentum recently and is on track to replace MapReduce. This Tuesday, the Apache Software Foundation announced that Spark had been promoted to a top-level project.
Because it outperforms MapReduce in speed and ease of use, Spark already has a large community of users and contributors. It is also a better fit for the requirements of next-generation big data applications: low latency, real-time processing, and iterative computation.
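To make the point about iterative computation concrete, here is a toy sketch in plain Python. It does not use the Spark API, and the class and function names (CountingSource, mean_reload_each_pass, mean_cached) are hypothetical, invented only for illustration. It assumes that each pass of a MapReduce-style job re-reads its input from storage, while an in-memory framework loads the data once and reuses it across passes.

```python
# Toy illustration only: plain Python, not the Spark or MapReduce API.
# All names below are hypothetical and exist only for this sketch.

class CountingSource:
    """Stand-in for a dataset on slow storage; counts full reads."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def load(self):
        self.reads += 1  # each call models one full scan from storage
        return list(self._values)


def mean_reload_each_pass(source, passes):
    """Re-read the input from storage on every iteration."""
    estimate = 0.0
    for _ in range(passes):
        data = source.load()
        # Move the estimate halfway toward the true mean on each pass.
        estimate += 0.5 * (sum(data) / len(data) - estimate)
    return estimate


def mean_cached(source, passes):
    """Load the input once, then iterate over the in-memory copy."""
    data = source.load()
    estimate = 0.0
    for _ in range(passes):
        estimate += 0.5 * (sum(data) / len(data) - estimate)
    return estimate


reload_source = CountingSource(range(1, 101))
cached_source = CountingSource(range(1, 101))
reload_result = mean_reload_each_pass(reload_source, passes=10)
cached_result = mean_cached(cached_source, passes=10)

# Same answer either way, but very different numbers of storage scans.
print(reload_source.reads, cached_source.reads)  # prints: 10 1
```

Both functions compute the same estimate. The only difference is how often the input is scanned, which is where an in-memory design would be expected to save time on iterative workloads.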
Spark's creators, who come from the University of California, Berkeley, have now founded a company called Databricks to promote the commercialization of Spark.
Technically, Spark is a standalone project, but it is designed to work with the Hadoop Distributed File System (HDFS) and can run directly on top of it. With SIMR (Spark In MapReduce), users can run Spark on a MapReduce cluster without administrator privileges or any installation. Thanks to YARN (the next-generation Hadoop resource scheduler and resource manager), Spark can now also run on the same cluster as MapReduce. Cloudera, a pioneer in enterprise Hadoop, has started offering enterprise Spark support to its customers.
Many new projects (such as Hortonworks's Stinger) are adopting processing frameworks other than MapReduce. Even so, Spark still lacks many of the tools that MapReduce supports (such as Pig and Cascading), and MapReduce remains a good choice for certain batch tasks. As Cloudera co-founder Mike Olson points out, MapReduce carries a lot of legacy workloads that will not be migrated any time soon, even with Spark available.