Apache Spark (commonly just Spark) is an open-source cluster computing framework similar to Hadoop, but with some useful differences that make it more advantageous for certain workloads: by enabling in-memory distributed datasets, Spark can optimize iterative workloads in addition to providing interactive queries.
Spark is implemented in Scala and uses Scala as its application framework. Unlike Hadoop, Spark integrates tightly with Scala, letting programs manipulate distributed datasets as easily as local collection objects.
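The "distributed datasets as local collections" point can be sketched as a small word-count job. This is a minimal sketch, not official sample code; the input and output paths are hypothetical, and it assumes a running Spark installation that supplies `org.apache.spark` on the classpath.

```scala
// Minimal sketch: manipulating a distributed dataset (an RDD) with the
// same functional style as a local Scala collection. The HDFS paths
// below are placeholders for illustration only.
import org.apache.spark.{SparkConf, SparkContext}

object WordCountSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCountSketch")
    val sc = new SparkContext(conf)

    // textFile returns an RDD; flatMap/map read just as they
    // would on a local List[String].
    val lines = sc.textFile("hdfs:///data/input.txt")
    val counts = lines
      .flatMap(line => line.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // cache() keeps the dataset in memory, which is what lets
    // iterative workloads reuse it without re-reading from disk.
    counts.cache()
    counts.saveAsTextFile("hdfs:///data/output")

    sc.stop()
  }
}
```

The `cache()` call illustrates the in-memory optimization mentioned above: subsequent actions on the same RDD reuse the cached partitions rather than recomputing them from storage.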
Although Spark was created to support iterative operations on distributed datasets, it is actually a complement to Hadoop and can run in parallel on the Hadoop file system; this is enabled by a third-party cluster framework named Mesos. Spark was developed by the University of California, Berkeley AMP Lab (Algorithms, Machines, and People Lab) to build large-scale, low-latency data analysis applications.