China Big Data Technology Conference: slide deck downloads

Hadoop originated in 2002 in the Apache Nutch project, a subproject of Apache Lucene. In 2004, Google published a paper at OSDI titled "MapReduce: Simplified Data Processing on Large Clusters". Inspired by it, Doug Cutting and others began implementing the MapReduce computing framework and NDFS (Nutch Distributed File System) to support Nutch's main algorithms. By 2006, this work had gradually grown into a complete, independent piece of software named Hadoop.


In early 2008, Hadoop became an Apache top-level project. In the same year, the first China Big Data Technology Conference was held in Beijing. Over the six years since, Hadoop has grown from an obscure newcomer into the yellow elephant of the big data field.


Below is Part I of the highlights from the slide decks of previous China Big Data Technology Conferences:


MemSQL co-founder and CTO Nikita Shamgunov: Real-time data analysis


Temp_13101415146899.pdf Nikita argues that the current era marks the end of Moore's law: computing speed is no longer improving as fast as it used to, yet data growth has not slowed at all and the diversity of data has exploded. In his view, the biggest challenge facing big data technology today is latency, especially data latency and query latency. After comparing Twitter's Storm with Cloudera's Impala, he described MemSQL's performance in detail.


Chang, senior expert on Alibaba Group's data exchange platform: Exploring big data


Temp_13101415141131.pdf Chang said Alibaba would build a data exchange platform on which everyone can obtain valuable data and also contribute their own. This would open up a "blue ocean" in which data is handled the way a bank handles deposits.


Ted Yu: How to apply HBase in an enterprise

temp_13101415172243.pdf Ted has 14 years of software development experience, including more than two years of HBase development. In 2011 he became an HBase committer and PMC member.


Hortonworks Dai Jianyong: Interpreting the performance optimization of Apache Pig


temp_13101415177946.pdf Dai Jianyong explains how to optimize Apache Pig performance from several angles: making full use of the combiner, building a rule-based optimizer, using the column pruner, and pushing filters up, as well as partition pruning, compressing intermediate files, merging MapReduce jobs, and controlling the granularity of that merging.
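
As a rough illustration of why the combiner appears first in that list, the sketch below simulates a word-count-style aggregation in plain Python. It is not Pig or Hadoop code, and the function names and input splits are hypothetical. It only shows that pre-aggregating on the map side reduces the number of records sent to the reducers without changing the result.

```python
from collections import Counter

def map_phase(lines):
    """Emit one (word, 1) pair per word, as a mapper would."""
    return [(word, 1) for line in lines for word in line.split()]

def combine(pairs):
    """Pre-aggregate pairs on the map side so fewer records are shuffled."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())

def reduce_phase(pairs):
    """Sum all values per key, as a reducer would."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Two hypothetical map tasks, each with its own input split.
splits = [["pig pig hadoop", "pig hbase"], ["hadoop hadoop pig"]]

without_combiner = [p for split in splits for p in map_phase(split)]
with_combiner = [p for split in splits for p in combine(map_phase(split))]

# The final counts are identical; only the shuffle volume differs.
assert reduce_phase(without_combiner) == reduce_phase(with_combiner)
print(len(without_combiner), "records shuffled without a combiner")
print(len(with_combiner), "records shuffled with a combiner")
```

This only works because summation is associative and commutative; an operation without that property could not safely be pre-aggregated this way.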


Huawei senior technical manager Anoop Sam John: HBase secondary indexes


temp_13101415173453.pdf Anoop introduces Huawei's optimization work on HBase in production engineering and shares the experience Huawei has gained from building on the open source community and from long-term project work. He also focuses on the HBase secondary indexing capability built by Huawei.
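
The general idea of a secondary index can be sketched with a toy in-memory table. This is not Huawei's implementation and does not use the HBase API; the class and method names are hypothetical. Alongside the main table keyed by row key, a second mapping from a column value to row keys is maintained on every write, so a query by value does not need a full scan.

```python
class IndexedTable:
    """Toy key-value table with one secondary index on a chosen column."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}    # row key -> {column: value}
        self.index = {}   # column value -> set of row keys

    def put(self, row_key, row):
        # Remove the stale index entry if the row is being overwritten.
        old = self.rows.get(row_key)
        if old is not None and self.indexed_column in old:
            old_value = old[self.indexed_column]
            keys = self.index[old_value]
            keys.discard(row_key)
            if not keys:
                del self.index[old_value]
        self.rows[row_key] = dict(row)
        if self.indexed_column in row:
            self.index.setdefault(row[self.indexed_column], set()).add(row_key)

    def scan_by_value(self, value):
        """Full scan: check every row (the cost of a query without an index)."""
        return sorted(k for k, r in self.rows.items()
                      if r.get(self.indexed_column) == value)

    def lookup_by_value(self, value):
        """Index lookup: go straight to the matching row keys."""
        return sorted(self.index.get(value, set()))


table = IndexedTable("city")
table.put("user1", {"city": "Beijing"})
table.put("user2", {"city": "Shenzhen"})
table.put("user3", {"city": "Beijing"})
table.put("user2", {"city": "Beijing"})  # overwrite moves user2 in the index

# Both access paths agree; the index avoids touching unrelated rows.
assert table.lookup_by_value("Beijing") == table.scan_by_value("Beijing")
print(table.lookup_by_value("Beijing"))
```

The trade-off is extra work on every write to keep the index consistent with the data, which is the hard part in a distributed store.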


NetEase senior engineer Gu Feiyong: DataStream, a mover of massive data


temp_13101415181030.pdf Gu Feiyong discusses DataStream's origins, architecture, and characteristics, shares its key technical points, and covers its application scenarios and future prospects. The two key technologies for big data are data collection and data integration and analysis. Gu Feiyong explained that NetEase has built a fairly complete platform for data collection, but it has not yet been combined with back-end data analysis to form a complete big data platform.


Miron Livny, professor of computer science at the University of Wisconsin: Opportunities and challenges when Condor meets Hadoop


temp_13101415181150.pdf Integrating Hadoop with the Condor computing cluster produces a very powerful system capable of tackling complex problems such as the human genome. It replaces traditional high-performance computing with high-throughput computing, which suits most research workloads, where sustained throughput matters more than instantaneous processing speed.


Yahoo! Barcelona research scientist Flavio Junqueira: Apache BookKeeper, a high-performance, reliable write-ahead log


Temp_13101415189355.pdf BookKeeper is designed for efficient sequential writes, good fault tolerance, and scalability. Its structure consists of three parts: bookies (storage nodes), ledgers (log files), and ensembles (the set of bookies that stores a ledger).
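
How the entries of a ledger might be spread across an ensemble can be sketched as follows. This is a simplified model of round-robin striping, not BookKeeper's client code; the function name, bookie names, and the quorum value are assumptions made for illustration. Each entry is written to `write_quorum` consecutive bookies, starting at a position that advances with the entry id.

```python
def write_set(entry_id, ensemble, write_quorum):
    """Return the bookies that store a given entry under round-robin striping."""
    size = len(ensemble)
    if not 1 <= write_quorum <= size:
        raise ValueError("write_quorum must be between 1 and the ensemble size")
    return [ensemble[(entry_id + j) % size] for j in range(write_quorum)]


# Hypothetical ensemble of three bookies storing one ledger.
ensemble = ["bookie-a", "bookie-b", "bookie-c"]

placement = {}  # bookie -> ids of the ledger entries it holds
for entry_id in range(6):
    for bookie in write_set(entry_id, ensemble, write_quorum=2):
        placement.setdefault(bookie, []).append(entry_id)

for bookie in ensemble:
    print(bookie, placement[bookie])
```

In this model every entry lives on two bookies, so losing any single bookie loses no data, and the write load is spread evenly across the ensemble.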


Facebook R&D manager Shao: Puma and the data superhighway, real-time data streaming and analysis


temp_13101415189033.pdf Shao describes the use cases for Facebook's analytics tools and real-time data, as well as the architectures of PUMA2 and PUMA3 for scalable data streams and the differences between them.


Liu Jingrong, senior R&D engineer in Baidu's Infrastructure Department: HDFS transparent compressed storage and compressed transmission


temp_13101415185482.pdf To save storage space, keep the compression process from interfering with compute jobs, and make the whole process transparent to users, Baidu uses transparent compressed storage and compressed transmission in HDFS.
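
What "transparent" means here can be sketched with a toy in-memory store. This is not Baidu's design or HDFS code; the class, its methods, and the choice to defer compression to a later pass are assumptions made for illustration. Writers store raw bytes without waiting, a separate pass compresses them with `zlib`, and readers always get the original bytes back without knowing which form is on disk.

```python
import zlib

class TransparentStore:
    """Toy block store that compresses data behind the caller's back."""

    def __init__(self):
        self._blocks = {}  # name -> (is_compressed, payload bytes)

    def write(self, name, data):
        # Store raw first so the writer is not slowed down by compression.
        self._blocks[name] = (False, bytes(data))

    def compress_pending(self):
        """Compress raw blocks later, e.g. when the cluster is idle."""
        for name, (is_compressed, payload) in list(self._blocks.items()):
            if not is_compressed:
                self._blocks[name] = (True, zlib.compress(payload))

    def read(self, name):
        # The caller never sees whether the block was compressed.
        is_compressed, payload = self._blocks[name]
        return zlib.decompress(payload) if is_compressed else payload

    def stored_size(self, name):
        return len(self._blocks[name][1])


store = TransparentStore()
data = b"hadoop " * 1000
store.write("part-00000", data)
size_before = store.stored_size("part-00000")

store.compress_pending()
size_after = store.stored_size("part-00000")

assert store.read("part-00000") == data  # same bytes before and after
print(size_before, "bytes stored raw,", size_after, "bytes after compression")
```

Keeping compression off the write path is one way to avoid slowing down compute jobs; the same compressed form could also be sent over the network to reduce transfer volume.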


Facebook's Jerry Chen and Liyin Tang: Building a business-critical messaging system on HBase


temp_13101415191745.pdf Facebook chose HBase for its high throughput, very good random read performance, good scalability, automatic provisioning, strong compatibility, and the benefits it inherits from HDFS. Facebook typically stores messages, message metadata, and search indexes in HBase.