The process of recording a database from a library to a sub-library _

The process of recording a database from a library to a sub-library __ Database

Last Update:2018-07-26 Source: Internet

Author: User

Tags redis

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Preface

Currently in the project team, because the project is in a business outbreak, the daily data growth has given us a database and even the system has caused a lot of uncertainties, the early rely on optimizing the business and SQL and so on temporarily can support. However, recently found that some table data to 500w+ after the query statistics performance seriously decreased, peak period of a lot of SQL blocking cases such as:

The disaster caused by this blockage is a snowball, because the more piled up basically the database has been dragged to death, so we face the problem of database segmentation. Technology Selection

Now that we have to divide the database into a list of databases, the cluster is indispensable, so how do our projects deal with these clusters? I have researched the following to complete this function (only for Java projects)

Middleware	For example, Taobao Open source Cobar, and later open source community based on Cobar to do two times mycat (personal advice if the use of middleware can be considered mycat)
Open source tools in jar form	such as Taobao's Tddl, and when it comes out of the open, SHARDING-JDBC, etc.
Dynamic Data source	Specify a data source to perform operations on different libraries and tables according to your business

Given the top three I finally chose the third way of looking at the most low: I said the basis of my own choice:

The above Kaiyuan products Taobao Cobar is not maintained, TDDL Open source and the internal version is not the same as there are a lot of bugs we have a shortage of manpower, it is estimated that there is no time to climb the pit. SHARDING-JDBC, just launched to have no project combat experience. Mycat is actually I found a better solution, but Mycat is to give the agent mode of maintenance, if the maintenance is not efficient, performance will not be too high, based on the ability of our team and I finally gave up.

The reason for choosing a Dynamic Data source: mainly because the technology is relatively simple, the business code is less modified, more controllable, reducing the risk of adding middleware or third-party tools. (to affirm that our project does not use JPA things at all, the compensation mechanism used for things is not to be taken into account here)

Primary key generation Policy

Now that you want to divide the tables so that the global unique primary key is also a problem we need to consider, I know and have experience with the following several technologies:

	Problem	Feasibility
Based on Redis	Single point problem, Redis reboot problem, etc.	Higher, the company has projects to use
given to db (to be taken out each time a multiple use is generated)	Single point problem, concurrent quantity problem	Low concurrency, small amount of data can be used
Uuid	Take up the storage space is larger, not sortable, can not show the trend of growth	Higher
Twitter snowflake	There may be duplication problems after XX years, you need to configure production parameters	High, distributed no single point of failure problem, time is incremental. Recommended
Method based on DB step size	Not all databases are supported	Low

My choice is snowflake.

Implementation Details

Because of the sub-Library, queries and additions require a primary key with the allocation policy.

Add process

Query logic

Our performance problems can be solved by the sub tables, but there are a number of other problems: I've summed up the pit after the sub-table

1. After the separation can only be directly in the slicing key query, in order to avoid sweeping all fragmentation, if the non-piecewise key query, in the OLTP environment to go search engine. Database and search engine synchronization data by Binlog 2. According to different dimensions, such as buyer dimension and seller dimension check order. In addition to taking a search engine, you can also write an order data in a different system. 3.ID has to pass the ID generator. 4. There are hot data issues, such as a super buyer, bought a lot of goods, but there are not hot buyers, no orders. The solution is two, the hot data is taken out and put to separate system. or block pieces of data, such as 10 kinds of goods to calculate a block, but the details of this method I forget, just heard people share. 5. Cross-Library transaction issues, NPC generally not, compensation is a method, TCC is a method, TCC variants, such as Saga, such as XTs, trying to deliver is a method 6. Data expansion problem, you can see Ali's foolish. I personally think it is more reliable to stop the maintenance at night. 7. Paging pits, the early stage can be used middleware limit, the medium-term to go search engines, late OLAP 8. Availability issues, dependent on database high-availability scenarios. It is said that the sharding algorithm will cause some partitioning errors caused by network jitter 9. Configuration Center problem. Try to use the configuration center, do not use zookeeper 10. Non-agent mode, that is, JDBC routing mode each client will open pool to DB, the database may die on the database connection, one way is Custom Mysql, set high and low water level, Queues database connections that exceed database processing capabilities. The second approach is to do MySQL proxy over the JDBC routing mode

I pulled out my current project a simple model to put out: the source I will put on git above

Https://github.com/bingzhilanmo-bowen/spring-multi-datasource

&NB Sp &NB Sp &NB Sp &NB Sp If you have any questions you can contact me:

&NB Sp &NB Sp &NB Sp &NB Sp mail:bingzhilanmo@gmail.com

Welcome to join the Java Advanced Group: 329019348

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More