From Alibaba's perspective: should large application databases choose Oracle, MySQL, or NoSQL?

Source: Internet
Author: User

As one of the first practitioners to introduce MySQL at Alibaba, I went through several stages: first MySQL gradually "eroded" the peripheral applications that ran on Oracle, and eventually the core systems were replaced with MySQL as well. At first this was an exciting process. Although the DBA team was uneasy, things went quite smoothly overall. As the migration pushed forward, however, the pain arrived, and the impact on the development teams in particular grew larger and larger. Then... (I will not elaborate on what came later.)

Looking back and thinking it over carefully, the decision on direction was sound, but the actual implementation process and strategy still leave a lot open to question, for example how the cost comparison was calculated and whether a one-size-fits-all execution scope was appropriate.

It was precisely this background that led to the following somewhat controversial talk at the OTN carnival in 2013, in which I shared my views on product selection in the database field, the "O" in the much-discussed "de-IOE" movement.

As Alibaba's "de-IOE" campaign has drawn more and more attention in the community, a wave of "de-xxx" technology movements has spread across China. Not only Internet companies but also telecom operators and financial institutions have begun to join the trend. Oracle databases, as the "O" in the movement, have naturally become the public target, and many CIOs and CTOs look as if they cannot get rid of them fast enough. In actual application scenarios, how should we choose our data access software?

Around 2010, the whole world suddenly started talking about NoSQL overnight. Claims that NoSQL would replace relational databases were everywhere, and almost everyone was advocating its advantages. So far, however, we have not seen the market for any database software take a serious hit from NoSQL. Even Cassandra, so popular at the time, was replaced by HBase in its original application scenario at its birthplace, Facebook. Twitter, which had set out to move from the relational database MySQL to NoSQL, returned to the embrace of MySQL after a variety of "painful" experiences...

As an architect facing so many choices, on what basis do you make the right decision? The following is the three-step decision process I commonly use in practice.

I. System Comparison

Feature Differences

Oracle undoubtedly has the most comprehensive feature set, with good technical support for both OLTP and OLAP scenarios. MySQL, as the representative open source database, also fully covers the common functions of a relational database. However, its lack of the Hash Join feature, which is indispensable for OLAP scenarios, is a major obstacle on MySQL's path to OLAP. Most NoSQL products do not support anything other than key/value access, and even fewer support multi-dimensional filtering.

Therefore, from a functional perspective: Oracle > MySQL > NoSQL

Performance strength

Based on past tests and practical application scenarios, performance on the same hardware resources can be compared from the following three perspectives:

Writing

Thanks to their architecture and their optimizations in data storage and logging, NoSQL products have a great advantage over Oracle and MySQL. The difference between MySQL and Oracle is not very big, so for now we treat the two as tied.

Therefore, from the perspective of writing performance: NoSQL > Oracle = MySQL


Simple query

The performance of simple queries has long been controversial. Some people have measured Oracle as slower than MySQL, and others have measured MySQL as slower than Oracle. In fact, both sets of tests may be perfectly valid. The real issue is the difference in test scenarios, and the level of concurrency in particular can have a great impact on the results. As concurrency rises to a high level (such as 128), MySQL gradually starts to fall behind. As for NoSQL, at least in my test scenarios, its performance was mostly worse than that of the other two. Of course, a large number of NoSQL fans will certainly object, but remember that what we are comparing here is neither a cache product nor the capacity of a large-scale cluster.

Therefore, from the perspective of simple query performance: Oracle > MySQL > NoSQL

Complex query (with at least Join)

NoSQL products do not support Join, so there is no question about where they rank. The MySQL query optimizer works from a relatively small amount of statistics, so when query complexity is very high the execution plan it picks is often not the best one. Oracle's query optimizer is smarter because it is backed by far richer statistics, and it delivers better performance for complex queries.

Therefore, the performance of complex queries: Oracle > MySQL > NoSQL

Scalability

Scalability, or more precisely the ease of scaling, has always been an important factor in an architect's selection. After all, data is being generated faster and faster, and a single machine often cannot solve the problem.

In terms of ease of scaling, most NoSQL products come with good distributed solutions, which makes them undoubtedly the best choice. Because of Oracle's strict data consistency requirements and its architectural restrictions, its scalability is slightly weaker than that of MySQL.

So in terms of scalability: NoSQL > MySQL > Oracle

Maintainability

This is the factor that matters most to operations and maintenance (O&M) personnel. After all, any software system has to be maintained once it is in service.

Because NoSQL products have had a relatively short time to develop, their support for maintainability is limited. Most of them ship with some small tools, but these are generally too simple, so NoSQL is weaker here than the mature MySQL and Oracle databases. Oracle is undoubtedly the most complete when it comes to ongoing maintenance: it is well developed in both runtime status tracking and basic backup and recovery.

So in terms of maintainability: Oracle > MySQL > NoSQL

Commercial support

NoSQL products currently have very little commercial support, and the options for local MySQL commercial support are also limited. Oracle commercial support offers a wide range of choices, from large companies to startup teams.

Therefore, in terms of commercial support: Oracle > MySQL > NoSQL

Software cost

There is not much controversy in this regard: Oracle > MySQL = NoSQL

Talent environment

This is a factor that many people ignore, but it can have a huge impact on subsequent use and maintenance. Oracle has been a leader in the database field for many years, so the Oracle DBA profession as a whole is relatively mature and its talent pool is relatively stable. As a rising star, MySQL has already attracted many people, but in both number and quality its DBAs still lag far behind Oracle DBAs. NoSQL talent is even scarcer.

From the perspective of talent environment: Oracle > MySQL > NoSQL

II. Scenario analysis

Consistency requirements

Ask any business owner at any time, and you will be told that no data in the system may ever be lost and that every change must be reflected in real time. But if you ask the question differently and explain that guaranteeing no data loss in extreme cases (such as a power outage) will add tens of millions to the cost, the answer may be completely different. Therefore, when gathering the business side's requirements for data consistency, we must spell out what is at stake, distinguish data by level of importance, and make the information as transparent as possible in order to uncover the real requirements. Oracle is undoubtedly the most reliable of the three at protecting data consistency.

Concurrency scale

The scale of concurrency tests our scalability. If concurrency is large, we need good scalability to absorb its later growth. Choosing a system that is hard to scale out will lead to high costs in both time and money as concurrency grows.

Logical complexity

Obviously, if the business logic is very complex, NoSQL at least is definitely not the right choice. As for choosing between MySQL and Oracle, that comes down to the features and performance of each.

Total capacity

The total data capacity of our system naturally affects software selection. If the scale is very large, we need the support of a distributed system, or at the very least database and table sharding, and this is where scalability fully shows its advantages.
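To make "database and table sharding" concrete, here is a minimal sketch of how a request might be routed to one database and one table by hashing a key. The shard counts, the `orders` table name, and the `route` function are hypothetical and chosen only for illustration; real deployments also have to deal with resharding, hot keys, and cross-shard queries.

```python
import zlib
from typing import Tuple

DB_COUNT = 4        # hypothetical number of database instances
TABLES_PER_DB = 8   # hypothetical number of tables inside each database


def route(user_id: int, base_table: str = "orders") -> Tuple[str, str]:
    """Map a key to a (database, table) pair with a stable hash."""
    # crc32 is stable across processes, unlike Python's built-in hash() for strings.
    slot = zlib.crc32(str(user_id).encode("utf-8")) % (DB_COUNT * TABLES_PER_DB)
    db_index, table_index = divmod(slot, TABLES_PER_DB)
    return f"db_{db_index}", f"{base_table}_{table_index:02d}"


if __name__ == "__main__":
    for uid in (1, 2, 3):
        print(uid, route(uid))
```

The same key always lands on the same database and table, which is what lets the application layer spread a large data set over many ordinary instances.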

III. Balanced decision-making

After the "system comparison" in the first step and the "scenario analysis" in the second step, we have accumulated enough information for system selection. So can we now simply pick the right system?

At this point we may find that our data volume is large but we also want the system to be easy to maintain. Then we face a dilemma: for data volume NoSQL is the better choice, while ease of maintenance is Oracle's strength. What should we do? Or suppose we have a transaction system with high concurrency, strict data consistency requirements, and business logic that is far from simple. What then? Oracle gives us good data protection and the best performance on complicated logic, but it faces cost pressure when scaling out. MySQL offers better scaling solutions at a relatively low cost. NoSQL cannot handle business scenarios with complicated logic.

Architects frequently face problems like these. We need to weigh the pros and cons and make a balanced decision, so that we select the system that meets as many of the business needs as possible. In some cases you may have to sacrifice a very small number of business needs in exchange for the better overall development of the system, rather than letting the needs of a few extreme scenarios dictate the whole selection.
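One way to make this weighing explicit is a simple weighted score over the rankings from the first step. The sketch below is only an illustration: the dimension names, the point scheme (one point for every product ranked strictly lower), and the example weights are all assumptions, not something the rankings themselves prescribe.

```python
# Rankings from the system comparison, best first.
# Each inner list is a group of products that are tied on that dimension.
RANKINGS = {
    "features":           [["Oracle"], ["MySQL"], ["NoSQL"]],
    "write_performance":  [["NoSQL"], ["Oracle", "MySQL"]],
    "simple_query":       [["Oracle"], ["MySQL"], ["NoSQL"]],
    "complex_query":      [["Oracle"], ["MySQL"], ["NoSQL"]],
    "scalability":        [["NoSQL"], ["MySQL"], ["Oracle"]],
    "maintainability":    [["Oracle"], ["MySQL"], ["NoSQL"]],
    "commercial_support": [["Oracle"], ["MySQL"], ["NoSQL"]],
    # Oracle costs the most, so it ranks last on "low cost".
    "low_software_cost":  [["MySQL", "NoSQL"], ["Oracle"]],
    "talent":             [["Oracle"], ["MySQL"], ["NoSQL"]],
}


def points(groups):
    """Give each product one point per product ranked strictly below it."""
    result = {}
    below = sum(len(group) for group in groups)
    for group in groups:
        below -= len(group)
        for product in group:
            result[product] = below
    return result


def score(weights):
    """Sum weight * points over the dimensions that appear in `weights`."""
    totals = {"Oracle": 0.0, "MySQL": 0.0, "NoSQL": 0.0}
    for dimension, weight in weights.items():
        for product, value in points(RANKINGS[dimension]).items():
            totals[product] += weight * value
    return totals


if __name__ == "__main__":
    # Hypothetical weights for a large, cost-sensitive system with complex logic.
    weights = {
        "scalability": 3,
        "complex_query": 3,
        "low_software_cost": 2,
        "maintainability": 1,
    }
    for product, total in sorted(score(weights).items(), key=lambda kv: -kv[1]):
        print(product, total)
```

A score like this is only a starting point for discussion. Hard constraints, such as NoSQL being unable to handle complicated business logic, still have to be checked separately, and changing the weights changes the outcome.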

Summary

The trend toward diversity in data storage software is inevitable. Whether it is Oracle, the traditional leader, MySQL, the open source model, or NoSQL, the newcomer, each has its own strengths and its own shortcomings. None of them is omnipotent, and no architecture can cope with every problem.

As architects, we need to do the following:

1. Build up the information for the first step, "system comparison", through your day-to-day work.

2. When facing specific business needs, dig out the real requirements and make all the advantages and disadvantages transparent.

3. Strike a balance in the final decision, weighing the pros and cons from the perspectives of requirement delivery, cost control, and maintenance management.

4. Do not let the desire to learn new technologies sway the selection, and do not be afraid to use new technologies either.

Oracle, MySQL, and NoSQL are each just a piece of software. How well they work in practice depends more on the user's ability. A good user plays to a product's strengths, avoids its weaknesses, and ultimately maximizes its value.

Finally, during the selection process we need to fully absorb industry experience without simply parroting what others say. Do not write Oracle off in one stroke just because other people's "de-O" movement is making a lot of noise: you are only seeing what others want you to see.
