Interview with Long: Hadoop Is the Standard for Future Big Data

On November 22-23, 2013, the 2013 Hadoop China Technology Summit (Chinese Hadoop Summit 2013), the only large-scale industry event dedicated to sharing Hadoop technology and applications, will be held at the Four Points by Sheraton Beijing hotel. Nearly a thousand CIOs, CTOs, architects, IT managers, consultants, engineers, Hadoop enthusiasts, and IT vendors and technologists engaged in Hadoop research and promotion are expected to attend.

The Hadoop China Technology Summit is hosted by the Chinese Hadoop Summit expert committee and organized by IT168, ITPUB, ChinaUnix, and the media responsible for promotion. Under the theme of "effectiveness, application and innovation", the conference aims to help Chinese enterprise users improve their ability to apply Hadoop, lower both the technical threshold and the investment threshold for adopting it, and spread awareness of the value of big data applications through open and extensive sharing and exchange.

Ahead of the 2013 Hadoop China Technology Summit, our reporter interviewed Long, a member of the conference expert committee and founder of the Hadoop big data company Red Elephant (RedHadoop) Cloud Teng. He is also the founder of the EasyHadoop open source community and a Hadoop cloud computing lecturer who focuses on popularizing and promoting Hadoop big data technology and is dedicated to making Hadoop big data applications simpler. In the interview, he told the reporter his own story with Hadoop and discussed the current state and future of Hadoop.

A bond with Hadoop

From first learning about Hadoop, to getting started with it, to EasyHadoop and then RedHadoop, Long has forged an indissoluble bond with Hadoop. Initially, as a technology enthusiast, he began following Google's three papers (GFS, BigTable, MapReduce). He later used the Lucene library for the core development of a blog search engine and automatic classification. Lucene comes from the same author as Hadoop, Doug Cutting.

Long got the chance to build a Hadoop platform from scratch while working on search engines at Storm Video. Before starting the Hadoop project, he and his colleagues had tried several data warehousing solutions without success, and eventually took the risk of committing to Hadoop. After studying the architecture of Taobao's data platform, the core members of the project gradually designed the Cronhub scheduler, COMETL data analysis, the Friday report platform and the phpHiveAdmin platform, and migrated the original data platform to the new one.

To spare others the same detours, Long registered a domain name and made the automated deployment scripts available for users to download, which helped many people. With the help of friends, the EasyHadoop community has held 9 technical meetups so far, and a single group has about 2,000 members.

In May this year, Long founded the RedHadoop company and assembled its first development team. After several months of effort, the company released RedHadoop Enterprise Edition 1, and it will later launch a RedHadoop Personal Edition for individual learning. Getting more people to learn and use Hadoop is what RedHadoop has been pursuing and working toward.

Looking back on his bond with Hadoop, Long summed it up: "Overall I took a lot of detours, but I made it through in the end. In a word: whatever you do, stick with it to the end. Keep a mindset that is willing to explore, to take risks and to share; keep summarizing and optimizing; and share what you have learned with more people. Cultivate an open research and development team, discover each person's unique value, and let everyone give off their own light and heat. Give yourself opportunities and create opportunities for others."

Hadoop is the standard for future big data

Turning to Hadoop applications, Long said that Hadoop grew out of Internet applications and is now widely used in Internet companies. For example, Baidu has clusters with tens of thousands of nodes, and Taobao has clusters with thousands of nodes storing dozens of petabytes. Non-Internet industries have now begun to explore and use Hadoop, mainly to supplement their existing IOE (IBM, Oracle, EMC) platforms in handling massive volumes of logs and to build data warehouse platforms. Among these, the telecommunications sector is the most mature, with transportation, power and other fields following. Even the banking industry, which is relatively conservative in technology selection, is using Hadoop for backup. Overall, however, there has been no breakthrough in business models.

For the Hadoop platform to be adopted at large scale in non-Internet industries, it needs to improve in data security and ease of use, and it needs more easy-to-use, SQL-like query interfaces. Now that Hadoop 2.0 is out, Long hopes that Hive 2.0 can also develop into a platform and support more storage engines; a platform-based Hive would bring more surprises. Long believes that Hadoop is the standard for future big data and has already developed into a distributed operating system platform.

▲ Long, founder of the Hadoop big data company Red Elephant (RedHadoop) Cloud Teng

For Hadoop beginners, Long suggests practicing and sharing more, and being enthusiastic and brave. Hadoop has a U-shaped learning curve. It is difficult at first, during the installation, deployment, debugging and testing phases: Hadoop requires a combination of multiple components, each with its own dependencies, and it is not easy to tell whether a setup has succeeded. Once the trial phase is over, things get easier: ordinary SQL, scripts and MapReduce can handle a good deal of statistical work. Then, as the cluster grows, the cluster evolves into a platform, and machine learning and deep industry-specific customization come into play, it becomes difficult again. His strongest impression is that you have to dare to try and dare to deploy the system online.

As one of the organizers, Long hopes this conference can reach deep into the industry: while presenting technology developments across the whole industry, it should also uncover more industry cases and establish more model examples of successful Hadoop use. He is looking forward to the Hadoop 2.0 wave that Jeff from Hortonworks will bring to the conference, and to more Hadoop enthusiasts getting involved.

Hadoop China Technology Summit 2013 is reported to be the first large-scale, industry-wide big data technology summit based on the Hadoop platform. The conference will offer a full range of technology sharing, discussion and demonstrations of results around the Hadoop ecosystem. Its topics will cover the following seven major areas: Hadoop technology innovation; Hadoop infrastructure deployment and optimization; virtualization and Hadoop; Hadoop applications in the Internet industry; Hadoop applications in non-Internet industries; integration of Hadoop with existing enterprise IT architecture; and big data start-ups and investment.

More highlights will follow at the Hadoop China Technology Summit (Chinese Hadoop Summit 2013) in Haidian, Beijing, on November 22-23, 2013.
