Open source big data processing tools:
Query engines: Phoenix, Stinger, Presto, Shark, Pig, Cloudera Impala, Apache Drill, Apache Tajo, Hive
Stream computing: Facebook Puma, Twitter Rainbird, Yahoo S4, Twitter Storm
Iterative computing: Apache Hama, Apache Giraph, HaLoop, Twister
Offline computing: Hadoop MapReduce, Berkeley Spark, DataTorrent
Key-value stores: LevelDB, RocksDB, HyperDex, Tokyo Cabinet, Voldemort, Amazon Dynamo, Tair, Apache Accumulo, Redis
Table storage: OceanBase, Amazon SimpleDB, Vertica, Cassandra, FoundationDB, HBase
File storage: CouchDB, MongoDB, Tachyon, KFS, HDFS
Resource management: Twitter Mesos, Hadoop YARN
Log collection systems: Facebook Scribe, Cloudera Flume, Logstash, Kibana
Messaging systems: StormMQ, ZeroMQ, RabbitMQ, Apache ActiveMQ, Jafka, Apache Kafka
Distributed services: ZooKeeper
RPC: Apache Avro, Facebook Thrift
Cluster management: Nagios, Ganglia, Apache Ambari
Infrastructure: LevelDB, SSTable, RecordIO, FlatBuffers, Protocol Buffers, consistent hashing, Netty, Bloom filter
Search engines: Nutch, Lucene, SolrCloud, Solr, Elasticsearch, Sphinx, SenseiDB
Data mining: Mahout
IaaS: OpenStack, Docker, Kubernetes, lmctfy
Monitoring management: Dapper, Zipkin
Summary of big data processing tools (never complete, only more complete ^_^)