The process of recording a database from a library to a sub-library __ Database

Source: Internet
Author: User
Tags redis

Preface

Currently in the project team, because the project is in a business outbreak, the daily data growth has given us a database and even the system has caused a lot of uncertainties, the early rely on optimizing the business and SQL and so on temporarily can support. However, recently found that some table data to 500w+ after the query statistics performance seriously decreased, peak period of a lot of SQL blocking cases such as:


The disaster caused by this blockage is a snowball, because the more piled up basically the database has been dragged to death, so we face the problem of database segmentation. Technology Selection

Now that we have to divide the database into a list of databases, the cluster is indispensable, so how do our projects deal with these clusters? I have researched the following to complete this function (only for Java projects)

Middleware

For example, Taobao Open source Cobar, and later open source community based on Cobar to do two times mycat (personal advice if the use of middleware can be considered mycat)

Open source tools in jar form

such as Taobao's Tddl, and when it comes out of the open, SHARDING-JDBC, etc.

Dynamic Data source

Specify a data source to perform operations on different libraries and tables according to your business

Given the top three I finally chose the third way of looking at the most low: I said the basis of my own choice:

The above Kaiyuan products Taobao Cobar is not maintained, TDDL Open source and the internal version is not the same as there are a lot of bugs we have a shortage of manpower, it is estimated that there is no time to climb the pit. SHARDING-JDBC, just launched to have no project combat experience. Mycat is actually I found a better solution, but Mycat is to give the agent mode of maintenance, if the maintenance is not efficient, performance will not be too high, based on the ability of our team and I finally gave up.

The reason for choosing a Dynamic Data source: mainly because the technology is relatively simple, the business code is less modified, more controllable, reducing the risk of adding middleware or third-party tools. (to affirm that our project does not use JPA things at all, the compensation mechanism used for things is not to be taken into account here)

Primary key generation Policy

Now that you want to divide the tables so that the global unique primary key is also a problem we need to consider, I know and have experience with the following several technologies:

Problem

Feasibility

Based on Redis

Single point problem, Redis reboot problem, etc.

Higher, the company has projects to use

given to db (to be taken out each time a multiple use is generated)

Single point problem, concurrent quantity problem

Low concurrency, small amount of data can be used

Uuid

Take up the storage space is larger, not sortable, can not show the trend of growth

Higher

Twitter snowflake

There may be duplication problems after XX years, you need to configure production parameters

High, distributed no single point of failure problem, time is incremental. Recommended

Method based on DB step size

Not all databases are supported

Low

My choice is snowflake.

Implementation Details

Because of the sub-Library, queries and additions require a primary key with the allocation policy.

Add process


Query logic



Our performance problems can be solved by the sub tables, but there are a number of other problems: I've summed up the pit after the sub-table

1. After the separation can only be directly in the slicing key query, in order to avoid sweeping all fragmentation, if the non-piecewise key query, in the OLTP environment to go search engine. Database and search engine synchronization data by Binlog 2. According to different dimensions, such as buyer dimension and seller dimension check order. In addition to taking a search engine, you can also write an order data in a different system. 3.ID has to pass the ID generator. 4. There are hot data issues, such as a super buyer, bought a lot of goods, but there are not hot buyers, no orders. The solution is two, the hot data is taken out and put to separate system. or block pieces of data, such as 10 kinds of goods to calculate a block, but the details of this method I forget, just heard people share. 5. Cross-Library transaction issues, NPC generally not, compensation is a method, TCC is a method, TCC variants, such as Saga, such as XTs, trying to deliver is a method 6. Data expansion problem, you can see Ali's foolish. I personally think it is more reliable to stop the maintenance at night. 7. Paging pits, the early stage can be used middleware limit, the medium-term to go search engines, late OLAP 8. Availability issues, dependent on database high-availability scenarios. It is said that the sharding algorithm will cause some partitioning errors caused by network jitter 9. Configuration Center problem. Try to use the configuration center, do not use zookeeper 10. Non-agent mode, that is, JDBC routing mode each client will open pool to DB, the database may die on the database connection, one way is Custom Mysql, set high and low water level, Queues database connections that exceed database processing capabilities. The second approach is to do MySQL proxy over the JDBC routing mode

I pulled out my current project a simple model to put out: the source I will put on git above

Https://github.com/bingzhilanmo-bowen/spring-multi-datasource


                                                                    &NB Sp                                   &NB Sp                                   &NB Sp                                   &NB Sp         If you have any questions you can contact me:

                                                                    &NB Sp                                   &NB Sp                                   &NB Sp                                   &NB Sp          mail:bingzhilanmo@gmail.com

Welcome to join the Java Advanced Group: 329019348


Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.