How should a super-large real-time transaction system like 12306 be designed? It's not hard.

Source: Internet
Author: User

I heard Alibaba wants to help 12306 redesign its booking system, but is this system really so difficult? Have the people who write software never thought about this problem?


Obviously, the 12306 system is different from a typical relational-database application, so the architecture must be designed around the actual business workflow and the nature of the data itself.


Frankly, I don't see why this problem is so hard. To design this system, a few things need to be considered:


1, Assume the system must allow 100 million people to grab tickets at the same time; frankly, even a billion wouldn't matter, because I don't believe the system cannot scale horizontally. If the whole load fell on a single hypothetical single-core machine, the average transaction would have to complete in about 1/10,000 of a millisecond (100 nanoseconds). A mid-range machine can be assumed to finish a transaction in about 10 milliseconds, so total capacity needs to grow by a factor of roughly 100,000 through a combination of horizontal and vertical scaling.
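The capacity arithmetic above can be sketched as a back-of-the-envelope calculation. The 10-second burst window is an assumption added here to make the numbers line up; it is not stated in the original analysis.

```go
package main

import "fmt"

func main() {
	// Numbers from the text; the burst window is an added assumption.
	const concurrentUsers = 100_000_000 // peak ticket-grabbing users
	const windowSeconds = 10.0          // assumed length of the burst
	txPerSecond := concurrentUsers / windowSeconds

	perTxBudgetNs := 1e9 / txPerSecond // latency budget on one machine
	const midRangeTxMs = 10.0          // one transaction on a mid-range machine
	scaleFactor := midRangeTxMs * 1e6 / perTxBudgetNs

	fmt.Printf("budget per tx: %.0f ns, required scale factor: %.0fx\n",
		perTxBudgetNs, scaleFactor)
}
```

With these assumptions the per-transaction budget comes out to 100 ns and the required scale factor to 100,000, matching the figures in the text.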


2, With that theoretical upper bound established, the rest is straightforward: how do we actually do this?

2.1 Vertical scaling: increase the performance of the server hosts by 10×. This should be feasible;


2.2 Use an in-memory database and in-memory computation. Transactions should not require synchronous disk I/O; data can instead be synchronized to disk asynchronously via a write-ahead log (WAL), as most NoSQL database engines do.

Naturally, the disks themselves should be SSDs; no question about that.


2.3 Offloading the network
For communication with the core servers, you should not open one TCP connection per transaction; obviously you can use a single long-lived encrypted TCP connection and multiplex everything over it.
Alternatively, you can use UDP-style datagrams, packing multiple booking requests into one large packet.
The remaining-ticket query need not be updated in real time; guaranteeing an update every 30 seconds or 1 minute is enough. And that applies only at busy times; off-peak, even this is unnecessary.
For the actual UI interaction, you can take a "pessimistic view": if a user sees a remaining ticket and places the order within 30 seconds, the system should make sure he can buy it. You can also change the UI to use a "smart proxy"-style automatic rule engine so that tickets can be booked without querying first; after all, an ordinary read-then-write transaction takes a read lock, which can reduce the degree of concurrency.
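A minimal sketch of packing booking requests into compact frames that can share one long-lived connection or be batched into a datagram. The field layout and sizes are assumptions for illustration, not 12306's actual protocol.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// bookingRequest is a hypothetical compact wire format; many such
// frames are multiplexed over one long-lived encrypted TCP
// connection, or batched into one large UDP-style datagram.
type bookingRequest struct {
	UserID  uint64
	TrainID uint32
	FromSta uint16
	ToSta   uint16
}

// encode packs one request into a fixed 16-byte frame.
func encode(r bookingRequest) []byte {
	var buf bytes.Buffer
	// Writing a fixed-size struct cannot fail here, so the error is dropped.
	_ = binary.Write(&buf, binary.BigEndian, r)
	return buf.Bytes()
}

func main() {
	frame := encode(bookingRequest{UserID: 42, TrainID: 101, FromSta: 1, ToSta: 4})
	fmt.Println(len(frame)) // 8 + 4 + 2 + 2 = 16 bytes per request
}
```

At 16 bytes per request, thousands of bookings fit in the bandwidth one bloated per-transaction handshake would cost.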

Network packets should be routable to different CPU cores at the kernel level, so that if each machine has 16 cores, scalability improves by roughly another 10×.
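In user space, the same idea can be approximated by hashing each connection to one worker per core, loosely mirroring what kernel-level receive-side scaling does. The connection key and worker model here are illustrative assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"runtime"
)

// dispatch pins each connection's traffic to one shard by hashing a
// connection key, so a shard's state is only ever touched by its own
// worker goroutine (one per core).
func dispatch(connKey string, workers []chan string, payload string) {
	h := fnv.New32a()
	h.Write([]byte(connKey))
	workers[int(h.Sum32())%len(workers)] <- payload
}

func main() {
	n := runtime.NumCPU() // e.g. 16 cores -> 16 shards
	workers := make([]chan string, n)
	for i := range workers {
		workers[i] = make(chan string, 1024)
		go func(in chan string) {
			for msg := range in {
				_ = msg // handle the booking request on this shard
			}
		}(workers[i])
	}
	dispatch("conn-42", workers, "book a->b") // hypothetical key and payload
	fmt.Println("dispatched across", n, "shards")
}
```

Because a given connection always lands on the same shard, per-shard data needs no cross-core synchronization.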


2.4 Avoid locking overhead under high concurrency
Xu Xiwei of Qiniu (which uses Go as its server-side development language) thought locking was unavoidable. Really?
In fact, unnecessary global locks can be eliminated with a finer-grained, materialized-view-style database. For example, divide each trip into small operating units, one per pair of adjacent stations, rather than running reads and writes against a single global "ticket" counter.
In other words, if someone buys a ticket a-->b that passes through stations C and D along the way, the database holds three data items: a->c, c->d, and d->b (each seat being a separate data item), and the whole booking transaction walks through this sequence of items using lock-free atomic operations. Note that seats should not be allocated randomly: random allocation is generally pseudo-random, and under high concurrency it is bound to produce conflicts with high probability.

At this granularity, there is basically no overhead from lock-based design.


2.5 That leaves a horizontal scaling factor of 1,000×. Time to consider load balancing in a distributed system.
First, considering the different ticketing terminals, the 12306 system should be deployed as a hierarchical topology, in which a transaction terminal (with better performance and a faster network link to the core servers) can act as a transaction agent for the tier below it. At this level I still treat the whole system as centralized, but traffic on some local, non-core parts of the network can be delegated entirely to a given tier.
Second, message queues can be introduced in place of synchronous requests. The lock-free queue is the most common structure used for message queuing, and storing requests in such a queue can reduce the network and memory overhead caused by instantaneous concurrency spikes to a minimal degree, provided the queue is not built on a global lock (that is, a semaphore).


As for the core design of 12306, I think that covers it. The key is the materialized view mentioned earlier, or the core concept of "minimizing the data items a transaction operates on." A generic relational database may not be able to meet this need, but it could perhaps be achieved by customizing a MySQL storage engine.


Based on this "materialized view", or what might be called the technique of "minimizing transactional data items", plus sequential lock-free test-and-set, there is basically none of the concurrent-access contention that comes with the traditional technique of protecting shared-variable reads and writes behind a mutex.

