Discover the most active Apache projects, including articles, news, trends, analysis, and practical advice about the most active Apache projects on alibabacloud.com.
Hadoop is currently available as the open source Apache release and as vendor distributions such as Hortonworks (HDP) and MapR. All of these distributions are based on Apache Hadoop.
Spark is an in-memory, open source cluster computing system designed for faster data analysis. Spark was developed in Scala by Matei Zaharia at the AMPLab, University of California, Berkeley. The core of the code base is only 63 Scala files, which makes it very lightweight. Spark provides an open source cluster computing environment similar to Hadoop, but because its design is memory-based and optimized for iteration, it performs better on some workloads. ...
Abstract: An idea from seven years ago grew into Twitter, today's popular social network and microblogging service. Twitter now has more than 200 million monthly active users, and about 500 million tweets are sent every day. Behind all this is the support of a large number of open source projects. Twitter, known as the "SMS of the Internet", lets users post tweets of no more than 140 characters. The idea came from Twitter co-founder Jack Dorsey and was dubbed "the dumbest ever" by analysts seven years ago ...
This article is an excerpt from the book "The Authoritative Guide to Hadoop", written by Tom White, translated by the School of Data Science and Engineering, East China Normal University, and published by Tsinghua University Press. The book begins with the origins of Hadoop and combines theory and practice to introduce Hadoop as an ideal tool for high-performance processing of massive datasets. It consists of 16 chapters and 3 appendices, covering topics including: Hadoop; MapReduce; the Hadoop Distributed File System; Hadoop I/O; MapReduce application Open ...
There's a joke in the IT community: what do you call a programmer who has written six patches for the Linux kernel? The answer: hired. Didn't get it? Let's ask Greg Kroah-Hartman of the Linux Foundation to explain: "The joke is that amateur developers never get past five kernel patches, because by the time you have written five you tend to get a job offer. In fact, it is not really a joke, because it happens so often." This may be those in the open source ...
At the end of 2013, we compiled statistics on the nearly 30,000 open source projects listed on Open Source China, based on the past year's user visits, discussion and sharing, and each project's update frequency, to arrive at the top 10 most popular open source projects, for reference only. The list mainly covers domestic open source software. The 10 projects are not of the same type, so ranking them together is not very scientific. We selected from only a few angles: user visits, software updates, and user discussion of the software. 1. GoAgent ...
The major technology giants have recently been investing heavily in cloud computing, from cloud platform management and massive data analysis to a variety of emerging consumer-facing cloud platforms and cloud services. Large-scale data processing (big data processing) technology, represented by Hadoop, has turned "business is king" into "data is king". The prosperity of the Hadoop community is plain to see. More and more companies at home and abroad are taking part in the development of the Hadoop community or directly open-sourcing the software they run in production. The same year with ...
MapReduce emerged to break through the limitations of the database. Tools such as Giraph, Hama, and Impala are in turn designed to break through the limits of MapReduce. While the tools above all run on Hadoop, graph, document, column, and other NoSQL databases are also an integral part of big data. Which big data tool meets your needs? With the number of available solutions growing so quickly, that question is not easy to answer. Apache Hado ...
HBase offers both scalability and the economy of sharing the same infrastructure as Hadoop, but will its flaws hold it back? A NoSQL expert lays out the framework of the debate. HBase, modeled after Google's Bigtable, is part of Apache Hadoop, the world's most popular big data processing platform. But can this lineage guarantee HBase a dominant role in the competitive and fast-growing NoSQL database market? Michael of the MapR company.
Guide: As we all know, the big data wave is gradually sweeping the world, and Hadoop is the source of the storm's power. Microsoft has partnered with the Apache Hadoop community to an unprecedented degree. Its aim is to build a Microsoft-branded Hadoop ecosystem that leverages its own strengths in the software world. Today, Microsoft has put Hadoop at the heart of its big data strategy. The reason for Microsoft's move is to have a fancy for had ...