Druid Founder Eric Tschetter on Druid, the Open Source Real-Time Big Data Analysis System

Keywords: cloud computing, open source, Hadoop, Druid

Druid is an open source data analysis and storage system designed for real-time queries on large cold datasets. It provides cost-effective, always-on real-time data ingestion and arbitrary data exploration, and it aims to maintain 100% uptime in the face of code deployments, machine failures, and other contingencies of a production system.

On October 25, 2014, at the "Big Data Summit" hosted by eBay and CSDN in Shanghai, Druid founder Eric Tschetter delivered a talk titled "A Tour of Druid, a Framework for Real-Time Data Analysis and Storage." After the talk, Eric gave an interview to CSDN.

Coincidentally, Alibaba has an open source Java database connection pool that is also called Druid. Wen Shao of the Alibaba Druid project has also been interviewed by CSDN.

Eric Tschetter studied at the University of Texas at Austin and received a master's degree in computer science from the National Institute of Informatics in Tokyo. Later, in Silicon Valley, he joined Ning, a social networking platform founded by Marc Andreessen whose name is taken from Chinese pinyin. He then joined LinkedIn, where he worked on the "People You May Know" product. After leaving LinkedIn, Eric became Metamarkets' first full-time employee and developed Druid there. Eric currently works for Tidepool, a nonprofit organization that provides open source digital health applications for people with diabetes.


Druid founder Eric Tschetter

Druid is an open source distributed real-time processing system designed to process large-scale data quickly and to enable rapid query and analysis. It offers a cheaper option for big data processing and is the only open source product in its field. Druid also provides a UI with some basic functions so that non-technical users can work with it. Asked which project is most similar to Druid, Eric names Google's PowerDrill.

The MapReduce and BigTable papers gave rise to Hadoop, the de facto standard for big data processing. After the advent of Dremel and PowerDrill, many people are curious about which open source big data technology will rise next. Is Druid one of them?

Application Scenario

Most Druid applications resemble the Metamarkets scenarios: advertising analytics, Internet advertising system monitoring, metrics, and network monitoring. eBay also plans to use Druid in its production environment.
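To make the advertising analytics scenario more concrete, the following minimal sketch builds an hourly timeseries query in Druid's native JSON query format. The data source name `ad_events` and the metric `impressions` are hypothetical and do not come from the interview, and the helper `build_timeseries_query` is illustrative only. The sketch builds and prints the JSON payload without sending it to a cluster.

```python
import json
from datetime import datetime, timedelta, timezone


def build_timeseries_query(data_source, metric, start, end, granularity="hour"):
    """Build a Druid native timeseries query as a plain dict.

    data_source and metric are hypothetical names chosen for illustration.
    """
    # Druid expects intervals as ISO 8601 "start/end" strings.
    interval = f"{start.isoformat()}/{end.isoformat()}"
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": granularity,
        "intervals": [interval],
        # longSum adds up a long-typed metric column within each time bucket.
        "aggregations": [
            {"type": "longSum", "name": f"total_{metric}", "fieldName": metric}
        ],
    }


# One day of hourly totals for a hypothetical ad-impressions data source.
start = datetime(2014, 10, 25, tzinfo=timezone.utc)
query = build_timeseries_query(
    "ad_events", "impressions", start, start + timedelta(days=1)
)
payload = json.dumps(query, indent=2)
print(payload)
```

In a real deployment a payload like this would typically be sent as an HTTP POST to a Druid broker node; the host, port, and schema all depend on the installation.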

Development Team

Druid is currently hosted on GitHub, with 44 contributors and more than 1,000 followers. Its main contributors include Metamarkets, Netflix, Yahoo, and several Silicon Valley startups. Druid developers interact and support Druid's development through the Druid forum. The author recently looked at the Druid Google Group and found that discussion there remains fairly active.

Eric says that whenever the team learns something new or has a new idea, they try to put it into practice as soon as possible. As a result, Druid has improved greatly since the first code was committed in March 2011. For example, the data storage format has changed about 9 times, the query processing has changed about 3-4 times, and the coordination among nodes has changed about 3 times, but the principle that each node does one thing has not changed. Eric says there may be more changes in the future, but the basic structure will stay the same.

The Chinese elements of Druid


Chinese engineer Fangjin Yang (Yang Yanjin), who together with Eric is in charge of Druid's main development work

A few months after Eric started the Druid project, Fangjin Yang joined it. In the years that followed, Eric and Fangjin developed Druid together, and they have so far been its main contributors. This year, Eric and Fangjin began working with some Chinese companies to help them evaluate Druid and to answer their questions about it. According to Eric, Yeahmobi (Cloud Wide World (Xi'an) Network Technology Co., Ltd.) is using Druid in China.

Documentation and Support

Perhaps thanks to Eric's work as a translator after graduating from college, Druid's documentation is detailed and well organized. Eric says that what he is proudest of is that, since the project was open sourced, other people have been able to solve many problems using only Druid and its documentation.

At the same time, Eric's development team supports Druid users through a mailing list (druid-development@googlegroups.com), but no dedicated for-profit company provides support.

Druid's future plans

Druid's future plan is to keep this open source project growing in a healthy way. A number of engineers from different companies have gathered around Druid. Every engineer and every company wants to see Druid deliver something new. Their needs are sometimes the same and sometimes different, but with teamwork, Druid can be made better. Eric therefore wants Druid to be a shared project with a community that guides its direction.

Eric's outlook reminds me of the current development of Docker. Eric says that building an ecosystem like Docker's around Druid would be a great success.

At the moment, Druid has no public roadmap, but work on one has started, in collaboration with Metamarkets, Yahoo, Netflix, and eBay. Eric says the team will also take suggestions from other Druid practitioners into account.

The future of big data technology: what is long divided must unite, and what is long united must divide?

Asked about the future of big data technology, Eric looks back at the history of databases in the 1960s, 1970s, and even 1980s. At that time there were many types of database, such as object databases and relational databases. In the end the relational database became mainstream, and the other types disappeared or were marginalized. The relational database still dominated until about 2006. In fact, the database types of the 1970s and 1980s were all designed on the assumption that interacting with the storage medium is very expensive. As storage became cheaper, that assumption no longer held, and the corresponding architectures had to be adjusted, which gave rise to NoSQL. Eric thinks big data technology grew out of the same shift. Now everyone is looking for the best solution for the new hardware environment, and database technology has entered a new phase in which "a hundred schools of thought contend"; in recent years especially, a wide variety of database technologies has emerged. Eric believes that in about 5-10 years database technology will enter a new round of consolidation. By then big data technology will have a clear direction of development, and perhaps someone will offer you the best solution for your particular application scenario.

Asked "Do you think Druid will be the direction of the future?", Eric answered frankly: "I don't know, but I hope so. Druid simply offers a new way of thinking about existing problems. Whether it is right or not, I am not sure, but I know it solves the problems of many companies such as Metamarkets. But does it solve all problems? The answer is no, so I don't know in which direction database technology will converge in the future."
