"BDTC Sneak Peek" Shao: Using big data to solve Dropbox system operations


From December 12 to 14, 2014, the 2014 China Big Data Technology Conference (Big Data Technology Conference 2014, BDTC 2014), hosted by the China Computer Federation (CCF) and organized by the CCF Big Data Expert Committee together with the Chinese Academy of Sciences and CSDN, will open at the Crowne Plaza Hotel, New Yunnan, Beijing. The conference runs for three days and aims to promote the development of big data technology in industry applications. It will feature theme forums and industry summits including "Big Data Infrastructure", "Big Data Ecosystem", "Big Data Technology", "Big Data Application", "Big Data Internet Finance Technology", and "Intelligent Information Processing". The 2014 Second CCF Big Data Academic Conference, hosted by the China Computer Federation and organized by the CCF Big Data Expert Committee with Nanjing University as co-organizer, will be held at the same time and will share keynote reports with the technology conference.

The conference will bring together nearly 100 top experts and front-line practitioners in big data technology from China and abroad. They will discuss the latest progress of open-source software such as YARN, Spark, Tez, HBase, Kafka, and OceanBase; development trends in NoSQL/NewSQL, in-memory computing, stream computing, and graph computing; how the OpenStack ecosystem addresses big data computing needs; and the latest industry applications of big data visualization, machine learning/deep learning, business intelligence, and data analysis. Speakers will also share the technical characteristics of real production systems and the practical experience gained from running them.

Before the conference, CSDN spoke briefly with Shao, an R&D manager at Dropbox and a speaker at the conference's "Big Data Application" forum. As one of the world's leading cloud storage and sharing platforms, Dropbox places very high demands on system stability, and a good monitoring system can improve the quality of both work and life for engineers. Shao says existing monitoring systems face two problems at large scale: scalability and usability. After comparing the common architectures for big data monitoring systems, Dropbox proposed a hybrid architecture to achieve maximum scalability. Shao will share Dropbox's practices and next steps in detail at the 2014 China Big Data Technology Conference on December 14.

Shao

R&D Manager, Dropbox

Shao is an R&D manager at Dropbox, a US cloud storage company, and a member of the Project Management Committee of Hadoop, the open-source big data platform. He joined Dropbox in March 2014, where he is responsible for distributed database storage and system monitoring. Before that, Shao was one of the early Chinese engineers and engineering managers at Facebook, where he worked for 6 years and took part in and led the development and large-scale deployment of projects including Hive, Scribe, Puma, MySQL, and RocksDB. Shao holds a master's degree in Management Science and Engineering from Stanford University, a master's degree in Computer Science from UIUC, and a bachelor's degree in Computer Science from Tsinghua University. He has won awards in a number of international programming competitions: TopCoder semifinalist, Google Code Jam finalist, 11th place at the 2001 ICPC World Finals, and a gold medal at IOI 1999.

The transcript of the interview with Shao follows:

CSDN: Which big data technologies does your company use? Where are you satisfied with them, and where do they fall short?

Shao: Dropbox uses the following open-source big data projects: Hadoop (HDFS and MapReduce), Hive, Scribe, HBase, Kafka, Presto, and Zipkin. We are fairly satisfied with most of them. Our main dissatisfaction is with stability, specifically how to resolve problems quickly when they occur.

CSDN: As you see it, what are the biggest difficulties that similar companies currently face with data?

Shao: Challenges exist in the following three areas:

From the software point of view, most open-source software is not mature enough and places too much emphasis on performance rather than stability. On the one hand, this has spawned more proprietary systems; on the other hand, the consulting market for open-source software has grown much larger.

From the hardware point of view, cloud platforms are used more and more, while traditional deployments in a company's own data centers are becoming rarer. Because of deployment difficulties and similar issues, big data platforms will move to the cloud faster and faster.

From the user's point of view, the biggest difficulties are ease of use and transparency. Many big data technologies are too low-level to solve a problem directly. It is important to enable developers to keep building on top of big data platforms.

CSDN: Which big data technologies are you watching and studying, and why are you optimistic about them?

Shao: We are currently focusing on the following technologies:

Spark and Spark SQL, which are suited to improving the response time and operating efficiency of the data warehouse;
Elasticsearch, which is suited to collecting, indexing, and querying semi-structured data;
OpenTSDB and InfluxDB, which are suited to collecting and storing operations data;
Grafana, which is suited to visualizing operations data.
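To make "operations data" concrete, the following is a minimal sketch of what a single monitoring data point might look like when prepared for the two time-series stores mentioned above. It is not Dropbox's implementation. The helper names, metric names, tag values, and timestamp are hypothetical; the shapes follow the JSON body accepted by OpenTSDB's HTTP /api/put endpoint and the InfluxDB line protocol. The sketch only formats the data and sends nothing over the network, and it does not escape special characters in names or tags.

```python
import json


def to_opentsdb_point(metric, value, timestamp_s, tags):
    """Build one data point shaped like the JSON body of OpenTSDB's /api/put."""
    return {
        "metric": metric,
        "timestamp": timestamp_s,  # Unix time in seconds
        "value": value,
        "tags": dict(tags),
    }


def to_influx_line(measurement, value, timestamp_s, tags):
    """Render the same point in InfluxDB line protocol (no escaping applied)."""
    # Sorting tags by key keeps the output deterministic.
    tag_part = "".join(f",{key}={val}" for key, val in sorted(tags.items()))
    timestamp_ns = timestamp_s * 1_000_000_000  # nanosecond precision
    return f"{measurement}{tag_part} value={value} {timestamp_ns}"


# Hypothetical sample: CPU usage reported by one host in one data center.
point_tags = {"host": "web01", "dc": "east"}
tsdb_point = to_opentsdb_point("sys.cpu.user", 42.5, 1418544000, point_tags)
influx_line = to_influx_line("cpu_user", 42.5, 1418544000, point_tags)

print(json.dumps(tsdb_point, sort_keys=True))
print(influx_line)
```

In both formats a point is a metric name, a timestamp, a numeric value, and a set of tags such as host or data center. A dashboard tool like Grafana can then group and filter the stored series by those tags.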

CSDN: Please talk about the topic you are about to share at this conference.

Shao: The topic I will share is the application of big data to operations at Dropbox. First, I will explain why using big data for operations matters and where the industry currently stands. Next, I will compare the common architectures for big-data-based operations and present Dropbox's system architecture. Finally, I will introduce our company's next-step plans in this area.

CSDN: Which audiences would benefit most from this topic? What problems can your talk help them solve?

Shao: Engineers who focus on big data applications, and engineers who focus on system operations.


The 2014 China Big Data Technology Conference (Big Data Technology Conference 2014, BDTC 2014) will be held at the Crowne Plaza Hotel, New Yunnan, Beijing, from December 12 to 14, 2014. Running since 2008 and now in its seventh year, the China Big Data Technology Conference is currently the largest and most influential technology event in the big data field in China. At this session, attendees will hear about the latest achievements and development trends of general-purpose open-source big data projects from Apache Hadoop committers Uma Maheswara Rao G (also a member of the Project Management Committee) and Yi Liu, and from Bikas Saha, a member of the Apache Hadoop and Tez Project Management Committees, among others. They will also hear dozens of substantive talks from Tencent, Alibaba, Cloudera, LinkedIn, NetEase, and other organizations. A limited number of discounted tickets are currently available.
