Apache Accumulo
The Apache Accumulo sorted, distributed key/value store is based on Google's BigTable design. It is built on top of Apache Hadoop, ZooKeeper, and Thrift. It features a few novel improvements on the BigTable design in the form of cell-level access labels and a server-side programming mechanism that can modify key/value pairs at various points in the data management process. Categories: database | Languages: Java | PMC: Apache Accumulo
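As a sketch of the cell-level access labels, the snippet below uses the 1.x-style Accumulo Java client to write a value guarded by a visibility expression; the instance name, ZooKeeper quorum, table name, and credentials are placeholder assumptions.

    import org.apache.accumulo.core.client.BatchWriter;
    import org.apache.accumulo.core.client.BatchWriterConfig;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.ZooKeeperInstance;
    import org.apache.accumulo.core.client.security.tokens.PasswordToken;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.security.ColumnVisibility;

    public class AccumuloVisibilityExample {
      public static void main(String[] args) throws Exception {
        // Connect through ZooKeeper (instance name, quorum, and credentials are placeholders).
        Connector conn = new ZooKeeperInstance("myInstance", "zk1:2181")
            .getConnector("user", new PasswordToken("secret"));

        BatchWriter writer = conn.createBatchWriter("records", new BatchWriterConfig());

        // Each cell carries its own visibility expression; only scans whose
        // authorizations satisfy "admin|audit" will ever return this value.
        Mutation m = new Mutation("row-001");
        m.put("details", "ssn", new ColumnVisibility("admin|audit"),
            new Value("123-45-6789".getBytes()));
        writer.addMutation(m);
        writer.close();
      }
    }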
Apache Ambari
Apache Ambari makes Hadoop cluster provisioning, managing, and monitoring dead simple. Categories: big-data | Languages: Java, Python, JavaScript | PMC: Apache Ambari
Apache Avro
Apache Avro is a data serialization system. Categories: library, big-data | Languages: C, C++, C#, Java, PHP, Python, Ruby | PMC: Apache Avro
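A minimal sketch of Avro serialization in Java using the generic API: a record schema is declared in Avro's JSON schema language and a record is written to a container file. The schema fields and output file name are illustrative assumptions.

    import java.io.File;
    import org.apache.avro.Schema;
    import org.apache.avro.file.DataFileWriter;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericDatumWriter;
    import org.apache.avro.generic.GenericRecord;

    public class AvroExample {
      public static void main(String[] args) throws Exception {
        // A record schema defined in Avro's JSON schema language.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
            + "{\"name\":\"name\",\"type\":\"string\"},"
            + "{\"name\":\"age\",\"type\":\"int\"}]}");

        GenericRecord user = new GenericData.Record(schema);
        user.put("name", "alice");
        user.put("age", 30);

        // Serialize to an Avro container file; the schema travels with the data.
        try (DataFileWriter<GenericRecord> out =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
          out.create(schema, new File("users.avro"));
          out.append(user);
        }
      }
    }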
Apache Chukwa
Chukwa is an open source data collection system for monitoring large distributed systems. Chukwa is built on top of the Hadoop Distributed File System (HDFS) and MapReduce framework and inherits Hadoop's scalability and robustness. Chukwa also includes a flexible and powerful toolkit for displaying, monitoring, and analyzing results of the collected data. Categories: hadoop | Languages: Java, JavaScript | PMC: Apache Chukwa
Apache Drill
Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. It was inspired in part by Google's Dremel. Categories: big-data | Languages: Java | PMC: Apache Drill
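To illustrate the SQL-over-anything idea, the sketch below runs a query over raw JSON files through Drill's JDBC driver; the ZooKeeper address, file path, and column names are placeholder assumptions.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class DrillQueryExample {
      public static void main(String[] args) throws Exception {
        // Connect via Drill's JDBC driver; the ZooKeeper quorum is a placeholder.
        try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=zk1:2181");
             Statement stmt = conn.createStatement();
             // Query JSON files in place via the dfs storage plugin, no schema definition needed.
             ResultSet rs = stmt.executeQuery(
                 "SELECT t.user_id, COUNT(*) AS events "
                 + "FROM dfs.`/data/clickstream.json` t GROUP BY t.user_id")) {
          while (rs.next()) {
            System.out.println(rs.getString("user_id") + "\t" + rs.getLong("events"));
          }
        }
      }
    }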
Apache Giraph
Apache Giraph is an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their connections. Categories: big-data | Languages: Java | PMC: Apache Giraph
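A rough sketch of Giraph's vertex-centric, iterative model: each vertex keeps the largest value it has seen and floods it to its neighbours until nothing changes. The class name and value types are illustrative assumptions modeled on the Giraph 1.x computation API.

    import java.io.IOException;
    import org.apache.giraph.graph.BasicComputation;
    import org.apache.giraph.graph.Vertex;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.FloatWritable;
    import org.apache.hadoop.io.LongWritable;

    // Propagate the maximum value through the graph, one superstep at a time.
    public class MaxValueComputation
        extends BasicComputation<LongWritable, DoubleWritable, FloatWritable, DoubleWritable> {

      @Override
      public void compute(Vertex<LongWritable, DoubleWritable, FloatWritable> vertex,
                          Iterable<DoubleWritable> messages) throws IOException {
        double max = vertex.getValue().get();
        for (DoubleWritable msg : messages) {
          max = Math.max(max, msg.get());
        }
        // Only flood the value when it changed (or on the first superstep).
        if (getSuperstep() == 0 || max > vertex.getValue().get()) {
          vertex.setValue(new DoubleWritable(max));
          sendMessageToAllEdges(vertex, new DoubleWritable(max));
        }
        vertex.voteToHalt();
      }
    }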
Apache Hadoop
Hadoop is a distributed computing platform. This includes the Hadoop Distributed File System (HDFS) and an implementation of MapReduce. Categories: database | Languages: Java | PMC: Apache Hadoop
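The canonical illustration of the MapReduce model is word count; the sketch below shows a mapper and reducer written against the org.apache.hadoop.mapreduce API (class names are illustrative, and the job driver is omitted).

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Mapper: emit (word, 1) for every token in a line of input.
    class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
      private static final IntWritable ONE = new IntWritable(1);
      private final Text word = new Text();

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
          if (!token.isEmpty()) {
            word.set(token);
            context.write(word, ONE);
          }
        }
      }
    }

    // Reducer: sum the counts emitted for each word.
    class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
      @Override
      protected void reduce(Text key, Iterable<IntWritable> values, Context context)
          throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
          sum += v.get();
        }
        context.write(key, new IntWritable(sum));
      }
    }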
Apache Hama
Apache Hama is an efficient and scalable general-purpose BSP computing engine which can be used to speed up a large variety of compute-intensive analytics applications. Categories: big-data | Languages: Java | PMC: Apache Hama
Apache HBase
Use Apache HBase software when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables (billions of rows by millions of columns) atop clusters of commodity hardware. HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop and HDFS. Categories: database | Languages: Java | PMC: Apache HBase
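A minimal sketch of the random read/write access pattern using the post-1.0 HBase Java client; the table name, row key, and column family are placeholder assumptions.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseReadWriteExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();  // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("webtable"))) {

          // Random write: one row keyed by reversed URL, one cell in the "contents" family.
          Put put = new Put(Bytes.toBytes("com.example/index"));
          put.addColumn(Bytes.toBytes("contents"), Bytes.toBytes("html"),
              Bytes.toBytes("<html>...</html>"));
          table.put(put);

          // Random read of the same cell.
          Result result = table.get(new Get(Bytes.toBytes("com.example/index")));
          byte[] html = result.getValue(Bytes.toBytes("contents"), Bytes.toBytes("html"));
          System.out.println(Bytes.toString(html));
        }
      }
    }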
Apache Hive
The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage. Built on top of Apache Hadoop (TM), it provides tools to enable easy data extract/transform/load (ETL), a mechanism to impose structure on a variety of data formats, access to files stored either directly in Apache HDFS (TM) or in other data storage systems such as Apache HBase (TM), and query execution via MapReduce. Hive defines a simple SQL-like query language, called HiveQL, that enables users familiar with SQL to query the data. At the same time, this language also allows programmers who are familiar with the MapReduce framework to plug in their custom mappers and reducers to perform more sophisticated analysis that is not supported by the built-in capabilities of the language. HiveQL can also be extended with custom scalar functions (UDFs), aggregations (UDAFs), and table functions (UDTFs). Categories: database | Languages: Java | PMC: Apache Hive
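As a sketch of imposing structure on files in HDFS and querying them with HiveQL, the snippet below talks to HiveServer2 over JDBC; the server address, credentials, table layout, and data location are placeholder assumptions.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveQLExample {
      public static void main(String[] args) throws Exception {
        // HiveServer2 host/port, database, and table are placeholders.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://localhost:10000/default", "hive", "");
             Statement stmt = conn.createStatement()) {

          // Impose structure on files already sitting in HDFS.
          stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS page_views "
              + "(user_id STRING, url STRING, ts BIGINT) "
              + "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' "
              + "LOCATION '/data/page_views'");

          // Executed under the hood as one or more MapReduce jobs.
          try (ResultSet rs = stmt.executeQuery(
                   "SELECT url, COUNT(*) AS hits FROM page_views "
                   + "GROUP BY url ORDER BY hits DESC LIMIT 10")) {
            while (rs.next()) {
              System.out.println(rs.getString("url") + "\t" + rs.getLong("hits"));
            }
          }
        }
      }
    }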
Apache Lucene Core
Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform. Categories: database | Languages: Java | PMC: Apache Lucene
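A small index-then-search sketch with the Lucene Java API (written against the Lucene 8+ classes; exact constructors vary across versions, and the field name and query text are illustrative assumptions).

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.ByteBuffersDirectory;
    import org.apache.lucene.store.Directory;

    public class LuceneExample {
      public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        Directory dir = new ByteBuffersDirectory();  // in-memory index for the example

        // Index a single document with one analyzed, stored text field.
        try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
          Document doc = new Document();
          doc.add(new TextField("body",
              "Apache Lucene is a full-text search engine library", Field.Store.YES));
          writer.addDocument(doc);
        }

        // Parse a free-text query and search the index.
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
          IndexSearcher searcher = new IndexSearcher(reader);
          TopDocs hits = searcher.search(
              new QueryParser("body", analyzer).parse("search library"), 10);
          for (ScoreDoc sd : hits.scoreDocs) {
            System.out.println(searcher.doc(sd.doc).get("body"));
          }
        }
      }
    }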
Apache Mahout
Scalable machine learning library. Categories: library | Languages: Java | PMC: Apache Mahout
Apache Nutch
Apache Nutch is a highly extensible and scalable open source web crawler software project. Stemming from Apache Lucene, the project has diversified and now comprises two codebases, namely: Nutch 1.x, a well matured, production ready crawler; 1.x enables fine grained configuration, relying on Apache Hadoop data structures, which are great for batch processing. Nutch 2.x, an emerging alternative taking direct inspiration from 1.x, but which differs in one key area: storage is abstracted away from any specific underlying data store by using Apache Gora for handling object-to-persistent mappings. This means an extremely flexible model/stack can be implemented for storing everything (fetch time, status, content, parsed text, outlinks, inlinks, etc.) into a number of NoSQL storage solutions. Being pluggable and modular of course has its benefits; Nutch provides extensible interfaces such as Parse, Index and ScoringFilter for custom implementations, e.g. Apache Tika for parsing. Additionally, pluggable indexing exists for Apache Solr, Elasticsearch, etc. Nutch can run on a single machine, but gains a lot of its strength from running in a Hadoop cluster. Categories: web-framework | Languages: Java | PMC: Apache Nutch
Apache Oozie
Oozie is a workflow scheduler system to manage Apache Hadoop jobs. Oozie is integrated with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as Java map-reduce, streaming map-reduce, Pig, Hive, Sqoop and DistCp) as well as system specific jobs (such as Java programs and shell scripts). Categories: big-data | Languages: Java, JavaScript | PMC: Apache Oozie
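A sketch of submitting a workflow from Java with the Oozie client library; the server URL, HDFS application path, and cluster endpoints are placeholder assumptions (the workflow.xml itself lives under the application path and is not shown).

    import java.util.Properties;
    import org.apache.oozie.client.OozieClient;

    public class OozieSubmitExample {
      public static void main(String[] args) throws Exception {
        // The Oozie server URL and all paths/hosts below are placeholders.
        OozieClient oozie = new OozieClient("http://oozie-host:11000/oozie");

        Properties conf = oozie.createConfiguration();
        conf.setProperty(OozieClient.APP_PATH, "hdfs://namenode:8020/user/etl/workflows/daily-load");
        conf.setProperty("nameNode", "hdfs://namenode:8020");
        conf.setProperty("jobTracker", "resourcemanager:8032");

        // Submit and start the workflow defined by the workflow.xml under APP_PATH.
        String jobId = oozie.run(conf);
        System.out.println("Workflow job " + jobId + " is " + oozie.getJobInfo(jobId).getStatus());
      }
    }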
Apache Pig
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets. Pig's infrastructure layer consists of a compiler that produces sequences of map-reduce programs. Pig's language layer consists of a textual language called Pig Latin, which has the following key properties: * Ease of programming. It is trivial to achieve parallel execution of simple, "embarrassingly parallel" data analysis tasks. Complex tasks comprised of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain. * Optimization opportunities. The way in which tasks are encoded permits the system to optimize their execution automatically, allowing the user to focus on semantics rather than efficiency. * Extensibility. Users can create their own functions to do special-purpose processing. Categories: database | Languages: Java | PMC: Apache Pig
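To give a flavour of Pig Latin as a data-flow language, the sketch below embeds a small script via the PigServer Java API; the input path, field layout, and output path are placeholder assumptions.

    import org.apache.pig.ExecType;
    import org.apache.pig.PigServer;

    public class PigLatinExample {
      public static void main(String[] args) throws Exception {
        // Run the Pig Latin data flow on the cluster.
        PigServer pig = new PigServer(ExecType.MAPREDUCE);
        pig.registerQuery("logs = LOAD '/data/access_logs' USING PigStorage('\\t') "
            + "AS (user:chararray, url:chararray);");
        pig.registerQuery("by_url = GROUP logs BY url;");
        pig.registerQuery("hits = FOREACH by_url GENERATE group AS url, COUNT(logs) AS n;");
        // Pig compiles this data flow into a sequence of map-reduce jobs.
        pig.store("hits", "/output/url_hits");
      }
    }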
Apache Spark
Apache Spark is a fast and general engine for large-scale data processing. It offers high-level APIs in Java, Scala and Python as well as a rich set of libraries including stream processing, machine learning, and graph analytics. Categories: big-data | Languages: Java, Scala, Python | PMC: Apache Spark
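A short word-count sketch with Spark's Java RDD API (the Iterator-returning flatMap matches the Spark 2.x signature; the HDFS paths are placeholder assumptions).

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class SparkWordCount {
      public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("WordCount");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
          JavaRDD<String> lines = sc.textFile("hdfs:///data/input");

          // Classic word count expressed with Spark's high-level RDD API.
          JavaPairRDD<String, Integer> counts = lines
              .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
              .mapToPair(word -> new Tuple2<>(word, 1))
              .reduceByKey((a, b) -> a + b);

          counts.saveAsTextFile("hdfs:///data/word_counts");
        }
      }
    }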
Apache Sqoop
Apache Sqoop (TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. Categories: big-data | Languages: Java | PMC: Apache Sqoop
Apache Storm
Apache Storm is a distributed real-time computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing real-time computation. Categories: big-data | Languages: Java | PMC: Apache Storm
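The core Storm primitives are spouts, bolts, and stream groupings wired into a topology; the sketch below shows that wiring (SentenceSpout, SplitBolt and CountBolt are hypothetical user-defined components, and the imports assume the org.apache.storm packages used from Storm 1.0 onward).

    import org.apache.storm.Config;
    import org.apache.storm.StormSubmitter;
    import org.apache.storm.topology.TopologyBuilder;
    import org.apache.storm.tuple.Fields;

    public class WordCountTopology {
      public static void main(String[] args) throws Exception {
        // SentenceSpout, SplitBolt and CountBolt are hypothetical user-defined classes.
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("sentences", new SentenceSpout(), 2);
        builder.setBolt("split", new SplitBolt(), 4).shuffleGrouping("sentences");
        // Route each word to the same counting task so counts stay consistent.
        builder.setBolt("count", new CountBolt(), 4).fieldsGrouping("split", new Fields("word"));

        Config conf = new Config();
        conf.setNumWorkers(2);
        StormSubmitter.submitTopology("word-count", conf, builder.createTopology());
      }
    }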
Apache ZooKeeper
Apache ZooKeeper is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination. Categories: database | Languages: Java | PMC: Apache ZooKeeper
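A common coordination building block is group membership via ephemeral znodes; the sketch below registers a worker and lists the current members. The connection string, the /workers parent node (assumed to already exist), and the payload are placeholder assumptions.

    import java.nio.charset.StandardCharsets;
    import java.util.List;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class ZooKeeperMembershipExample {
      public static void main(String[] args) throws Exception {
        // Connect to the ensemble; the no-op lambda ignores watch events.
        ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 5000, event -> {});

        // An ephemeral, sequential znode disappears automatically if this client's session dies,
        // which is the usual building block for group membership and leader election.
        String path = zk.create("/workers/worker-",
            "host-a:9000".getBytes(StandardCharsets.UTF_8),
            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        System.out.println("Registered as " + path);

        List<String> members = zk.getChildren("/workers", false);
        System.out.println("Current members: " + members);
        zk.close();
      }
    }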