Interview with Zhou Jincho: We are more inclined to use Greenplum to solve the data skew problem

Source: Internet
Author: User

Zhou Jincho works at Listening Cloud, where he keeps MySQL and Greenplum running stably and researches database technology solutions suited to Listening Cloud's business scenarios.

Zhou Jincho, Listening Cloud

On September 24, Zhou Jincho will take part in an offline event in Beijing, where he will give a talk on Greenplum in practice for real-time analysis of Listening Cloud big data. Ahead of the event, he shared some of his experience with PG and with his work.

Free Registration Link: http://click.aliyun.com/m/6101/

Body:

Zhou Jincho started his career in system operations. He gradually came into contact with a variety of databases, became interested in them, and after a period of accumulating experience moved into a DBA role.

"When I joined Listening Cloud, the business was in a fast-growth phase, which was a big test for our application backend and our databases. We spent most of last year scaling out, and our MySQL cluster grew from a handful of instances at the start to the hundreds of instances we run today." He lived through the explosive growth of Listening Cloud's business volume.

It was this growth that brought Zhou Jincho into close contact with PG: "The single-table data volume of one module grew so large that MySQL sharding could no longer guarantee query performance, so we adopted Greenplum's MPP approach to solve the performance problem."

"The scaling-out work was fairly heavy throughout the process. And with such a huge amount of data, the data skew caused by MySQL sharding gave us a lot of trouble. For now, we have customized our MySQL middleware so that the data of a specified user can be routed to a separate instance, and we then scale that instance's configuration up vertically. But we are now more inclined toward the Greenplum solution: a reasonably designed distribution key can completely avoid the data skew problem."

His talk is therefore about Greenplum in practice for real-time analysis of Listening Cloud big data. It covers the specific application scenarios, why Greenplum was selected, and a performance comparison between the original MySQL architecture and the Greenplum architecture after migration.

In addition, Zhou Jincho talked about why he likes Golang's programming style, his experience building Listening Cloud's internal database management platform, and his view of Uber's recent switch from PG to MySQL.

For details, see the full interview below:

Cloud Habitat: Please introduce yourself and the work you are doing.

Zhou Jincho: My name is Zhou Jincho, and I currently work at Listening Cloud, a company that has spent 10 years in the APM field. I joined Listening Cloud in early 2015 and was fortunate to experience the explosive growth of its business volume.

Listening Cloud's backend database architecture is currently mainly a distributed MySQL cluster, with part of the data on Greenplum. The backend of our upcoming CDN Controller product uses a distributed PostgreSQL + Citus solution.

At present, my main work is keeping MySQL and Greenplum running normally and researching database technology solutions suited to Listening Cloud's business scenarios.

Cloud Habitat: How did you get onto the DBA path? What are the highlights of your current work?

Zhou Jincho: I started my career in system operations. I gradually came into contact with a variety of databases, became interested in them, and after a period of accumulating experience moved into a DBA role.

When I joined Listening Cloud, the business happened to be growing rapidly, which was a big test for our application backend and our databases. We spent most of last year scaling out, and our MySQL cluster grew from a handful of instances at the start to the hundreds of instances we run today.

This year we have done some optimizations. For example, we switched our online MySQL instances from InnoDB to the TokuDB storage engine, which compresses the data and improved performance significantly. We also migrated part of the business data that was originally on MySQL to Greenplum, and query performance improved by up to hundreds of times. Of course, that holds only for our scenario, which compares single-node MySQL against a Greenplum cluster; MySQL is still a very good database.

Cloud Habitat: You mentioned that you prefer Golang's programming style. Can you talk about why? You also used Golang to develop a database management platform for Listening Cloud. Please describe the platform and share some of what you learned while developing it.

Zhou Jincho: Golang's syntax is simpler than Python's, and its programming style leans toward scripting while being far more powerful than shell. With its native concurrency model and cross-platform support, I think Golang can be a sharp tool for day-to-day operations work.

Our database cluster is large, and manually inspecting hundreds of nodes every day is not feasible. Later I came across the Golang web framework Beego and decided to write a database management platform. The platform collects metrics such as QPS, TPS, and slow SQL from the hundreds of nodes in the MySQL cluster and presents them as charts on a web page. It also produces aggregated reports, such as the monthly data growth of each business database and a daily list of the top 12 instances by slow SQL. It summarizes and analyzes slow SQL and supports viewing slow SQL execution plans.

It also has a data query and extraction window that supports querying data and exporting it in Excel format, plus some monitoring of our automatic table partition maintenance.
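
As a rough illustration of the kind of metric collection described above, the Python sketch below derives QPS and TPS from two readings of cumulative counters and ranks instances by slow-query count. It is only a sketch under assumptions: the interview does not say how the platform computes these values, the counter names mirror the MySQL status variables `Questions`, `Com_commit`, and `Com_rollback`, and `StatusSample`, `rates`, `top_slow_instances`, and the instance names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StatusSample:
    """One reading of cumulative status counters for a single instance."""
    taken_at: float    # sample time in seconds, e.g. from time.time()
    questions: int     # cumulative statements executed (assumed: Questions)
    com_commit: int    # cumulative commits (assumed: Com_commit)
    com_rollback: int  # cumulative rollbacks (assumed: Com_rollback)


def rates(prev: StatusSample, curr: StatusSample) -> tuple:
    """Return (qps, tps) averaged over the interval between two samples."""
    elapsed = curr.taken_at - prev.taken_at
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    qps = (curr.questions - prev.questions) / elapsed
    transactions = (curr.com_commit - prev.com_commit) + (
        curr.com_rollback - prev.com_rollback
    )
    return qps, transactions / elapsed


def top_slow_instances(slow_counts: dict, n: int = 12) -> list:
    """Return the n instances with the most slow queries, busiest first."""
    return sorted(slow_counts.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical readings taken 10 seconds apart on one instance.
prev = StatusSample(taken_at=0.0, questions=1000, com_commit=100, com_rollback=5)
curr = StatusSample(taken_at=10.0, questions=6000, com_commit=590, com_rollback=15)
qps, tps = rates(prev, curr)
print(f"QPS={qps:.1f} TPS={tps:.1f}")

# Hypothetical daily slow-query counts per instance.
print(top_slow_instances({"db-a": 3, "db-b": 9, "db-c": 1}, n=2))
```

Working from counter deltas rather than absolute values keeps the rates meaningful even though the counters only ever grow while the server is up.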

Cloud Habitat: As a large application performance monitoring platform in China, how has Listening Cloud's database architecture evolved? What challenges did you meet, and how did you solve them?

Zhou Jincho: Listening Cloud's database evolved from stand-alone MySQL to a distributed architecture with sharded MySQL databases. As the data volume kept growing, we used a compression engine to compress the data. The single-table data volume of one module then grew so large that MySQL sharding could no longer guarantee query performance, so we adopted Greenplum's MPP approach to solve the performance problem.

The scaling-out work was fairly heavy throughout the process. And with such a huge amount of data, the data skew caused by MySQL sharding gave us a lot of trouble. For now, we have customized our MySQL middleware so that the data of a specified user can be routed to a separate instance, and we then scale that instance's configuration up vertically. But we are now more inclined toward the Greenplum solution: a reasonably designed distribution key can completely avoid the data skew problem.
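
To make the skew problem concrete, here is a minimal Python sketch that hashes rows onto a fixed number of segments and compares two choices of key. It is an illustration under assumptions, not Listening Cloud's setup: the workload (one hot user owning 90% of the rows), the segment count, and the MD5-based `segment_for` function are all made up for the example, and Greenplum uses its own hash function for tables declared with `DISTRIBUTED BY`.

```python
import hashlib
from collections import Counter


def segment_for(key: str, num_segments: int) -> int:
    """Map a key value to a segment using a stable hash (illustrative only)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_segments


def rows_per_segment(keys, num_segments: int) -> list:
    """Count how many rows land on each segment for the given key values."""
    counts = Counter(segment_for(k, num_segments) for k in keys)
    return [counts.get(s, 0) for s in range(num_segments)]


def skew_ratio(counts) -> float:
    """Largest segment divided by the mean; 1.0 means perfectly even."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean


# Hypothetical workload: one "hot" user owns 90% of 10,000 rows.
rows = [
    ("user_hot" if i % 10 else f"user_{i}", f"row_{i}")
    for i in range(10_000)
]

# Keying by user sends every row of the hot user to the same segment.
by_user = rows_per_segment((user for user, _ in rows), 8)
# Keying by a high-cardinality column spreads rows almost evenly.
by_row = rows_per_segment((row_id for _, row_id in rows), 8)

print("keyed by user:", by_user, "skew", round(skew_ratio(by_user), 2))
print("keyed by row: ", by_row, "skew", round(skew_ratio(by_row), 2))
```

With the user as the key, one segment holds at least 9,000 of the 10,000 rows, which is the situation the dedicated-instance workaround addresses; a high-cardinality key keeps every segment close to the mean.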

Cloud Habitat: When did you first come into contact with Greenplum and PG? What experience have you accumulated so far?

Zhou Jincho: I have been working with Greenplum and PG for a few months, and Greenplum has only just gone into production. During the early research we accumulated some experience with usage scenarios; our GPDB maintenance experience is still building up.

Cloud Habitat: Next, how will you embrace PG?

Zhou Jincho: We have a new product whose backend database needs the JSONB feature of the latest PostgreSQL version. Taking both performance and operational cost into account, we currently see no alternative to PG, so we will adopt the Citus + PostgreSQL solution.

Cloud Habitat: What will your talk at this salon cover? As a technical person who has only recently come into contact with PG, do you have any message for the attendees?

Zhou Jincho: The talk is mainly about Greenplum in practice for real-time analysis of Listening Cloud big data. I will share our specific application scenarios, why we selected Greenplum, and a performance comparison between the original MySQL architecture and the Greenplum architecture after migration.

PostgreSQL is growing fast, and more and more companies in China are trying it. Some of PG's features are quite attractive. I hope more and more users will share their experience and make the PG community better and better.

Cloud Habitat: Finally, as a MySQL DBA, what do you think of Uber's recent switch from PG to MySQL?

Zhou Jincho: Uber's article may mislead people who are selecting a database. Internet companies go through technical iterations at different stages of their architecture's evolution, and they often look for new technical solutions to address the pain points of the moment, so whatever suits you is best.

MySQL may be more suitable for Uber's current business scenario. It is said that Uber previously migrated from MySQL to PG, so it is hard to rule out that this reflects the personal preference of Uber's DBAs.

But the impact of that article has still been quite negative.

Zhou Jincho will discuss Greenplum technology face to face with attendees on September 24 at the Open Source Database Enterprise Application Practice meeting. Attendance is free and everyone is welcome.

