Currently, the Hadoop distribution has an open source version of Apache and a Hortonworks distribution (HDP Hadoop), MapR Hadoop, and so on. All of these distributions are based on Apache Hadoop.
"Editor's note" Recently, MAPR has formally integrated the Apache drill into the company's large data-processing platform, and opened up a series of large database-related tools. Today, in the highly competitive field of Hadoop, open source has become a tool for many companies, they have to contribute more code to protect themselves, but also through open source to attack other companies. In this case, Derrick Harris made a brief analysis on Gigaom. Recently, Mapr,apache Drill Project founder, has ...
April 19, 2014 Spark Summit China 2014 will be held in Beijing. The Apache Spark community members and business users at home and abroad will be gathered in Beijing for the first time. Spark contributors and front-line developers from AMPLab, Databricks, Intel, Taobao, NetEase, and others will share their Spark project experience and best practices in production environments. MapR is well-known Hadoop provider, the company recently for its Ha ...
MAPR today updated its Hadoop release, adding Apache Drill 0.5 to reduce the heavy data engineering effort. Drill is an open source distributed ANSI query engine, used primarily for self-service data analysis. This is the open source version of Google's Dremel system, which is used primarily for interactive querying of large datasets-which support its bigquery servers. The objective of the Apache Drill project is to enable it to scale to 10,000 servers or more servers, while processing in a few seconds ...
VMware today unveiled the latest open source project--serengeti, which enables companies to quickly deploy, manage, and extend Apache Hadoop in virtual and cloud environments. In addition, VMware works with the Apache Hadoop community to develop extension capabilities that allow major components to "perceive virtualization" to support flexible scaling and further improve the performance of Hadoop in virtualized environments. Chen Zhijian, vice president of cloud applications services at VMware, said: "Gain competitive advantage by supporting companies to take full advantage of oversized data ...
This year, big data has become a topic in many companies. While there is no standard definition to explain what "big Data" is, Hadoop has become the de facto standard for dealing with large data. Almost all large software providers, including IBM, Oracle, SAP, and even Microsoft, use Hadoop. However, when you have decided to use Hadoop to handle large data, the first problem is how to start and what product to choose. You have a variety of options to install a version of Hadoop and achieve large data processing ...
Large data is one of the most active topics in the IT field today. There is no better place to learn about the latest developments in big data than the Hadoop Summit 2013 held in San Jose recently. More than 60 big data companies are involved, including well-known vendors like Intel and Salesforce.com, and startups like SQRRL and Platfora. Here are 13 new or enhanced large data products presented at the summit. 1. Continuuity Development Public ...
Spark is a memory-based, open-source cluster computing system designed for faster data analysis. Spark was developed using Scala by Matei, AMP Labs, University of California, Berkeley. The core part of the code is only 63 Scala files, which is very lightweight. Spark provides an open source clustered computing environment similar to Hadoop, but Spark performs better on some workloads based on memory and iteratively optimized designs. & nbs ...
Top Ten Open Source technologies: Apache HBase: This large data management platform is built on Google's powerful bigtable management engine. As a database with open source, Java coding, and distributed multiple advantages, HBase was originally designed for the Hadoop platform, and this powerful data management tool is also used by Facebook to manage the vast data of the messaging platform. Apache Storm: A distributed real-time computing system for processing high-speed, large data streams. Storm for Apache Had ...
Large data is one of the most active topics in the IT field today. There is no better place to learn about the latest developments in big data than the Hadoop Summit 2013 held in San Jose recently. More than 60 big data companies are involved, including well-known vendors like Intel and Salesforce.com, and startups like SQRRL and Platfora. Here are 13 new or enhanced large data products presented at the summit. Continuuity Development Company Now ...
The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion;
products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the
content of the page makes you feel confusing, please write us an email, we will handle the problem
within 5 days after receiving your email.
If you find any instances of plagiarism from the community, please send an email to:
info-contact@alibabacloud.com
and provide relevant evidence. A staff member will contact you within 5 working days.