12306 ticket pool architecture (I)

Source: Internet
Author: User

Recently, people on the forum have been thinking about the design of the ticket pool. These are my ideas about the ticket pool architecture. For detailed discussion, see the forum thread: http://12306ng.org/thread-1572-1-1.html


Requirement Discussion
So far, my understanding of the ticket pool requirements is:
1. The pre-sale period is not fixed. It can be 30 days or 10 days, but in most cases it is 10 days.
2. Ticket allocation is planned in advance: there is a regular ticket quota plan and a temporary ticket quota plan, and tickets are allocated dynamically on that basis. A few days before the pre-sale period, some tickets are pre-allocated to popular route segments; some days into the pre-sale period, tickets that remain unused are recycled into the ticket pool.
3. Refunds and ticket changes must be considered.
4. Some tickets may be reserved for special organizations. Reserved tickets that end up unsold are recycled into the ticket pool.
5. Boarding and alighting at intermediate stations must be considered. To maximize passenger throughput, the same seat should be sold as many times as possible. For example, on a train from Shanghai to Beijing, if passenger A buys a ticket from Shanghai to Nanjing and passenger B buys a ticket from Nanjing to Beijing, then for the sake of seat utilization A and B should of course be given the same seat.
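Requirement 5 can be made concrete with one bit per leg between adjacent stations. The following is only a minimal sketch: the stop numbering (Shanghai = 0, Nanjing = 1, Beijing = 2) and the helper names `leg_mask` and `can_share` are assumptions made for illustration, not part of any existing system.

```cpp
#include <cassert>
#include <cstdint>

// One bit per leg between adjacent stops; bit i covers the leg from stop i to stop i + 1.
// Assumption for this sketch: a route has at most 64 legs.
using LegMask = std::uint64_t;

// Mask of the legs used by a trip that boards at stop `from` and alights at stop `to`.
inline LegMask leg_mask(unsigned from, unsigned to) {
    assert(from < to && to <= 64);
    unsigned legs = to - from;
    LegMask ones = (legs == 64) ? ~LegMask{0} : ((LegMask{1} << legs) - 1);
    return ones << from;
}

// Two trips can be given the same seat when they have no leg in common.
inline bool can_share(LegMask a, LegMask b) {
    return (a & b) == 0;
}
```

With this numbering, `leg_mask(0, 1)` (Shanghai to Nanjing) and `leg_mask(1, 2)` (Nanjing to Beijing) do not overlap, so both passengers can be assigned the same seat, while either of them conflicts with a full Shanghai to Beijing ticket.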

There are probably other requirements as well. I suggest narrowing the scope at the beginning to avoid requirements creep. I think we only need to consider the following at first:
1. Only a 10-day pre-sale period is supported.
2. Planning itself is not supported; we simply take the planned ticket quota from the upstream planning system and put it into the ticket pool.
3. Refunds and ticket changes are supported.
4. Reserved tickets are supported.
5. Boarding and alighting at intermediate stations is supported, so one seat can be sold several times.

Architecture Design Ideas
The architecture of the ticket pool should take the following into account:
1. The ticket pool should be easy to distribute. From the requirements above, the ticket pool can be distributed along at least two dimensions. The first is location, that is, the departure station of the ticket, so that the ticket pool server is the one closest to the passengers; passengers buying tickets for a remote departure station can be redirected to the remote server, or some tickets can be cached locally. The second is time: for example, trains departing one day from now and trains departing nine days from now can be placed on different servers.
2. To keep ticket sales fast, try to keep the entire ticket pool in memory, and keep key data in the CPU cache as far as possible. Random access to a hard disk is on the order of 10,000 times slower than memory access (sequential disk access is much faster than random disk access); the CPU's L2 cache is several times faster than main memory, and the L1 cache is faster still, roughly ten times or more.
3. The CPU reads data into the cache in batches (cache lines) rather than one byte at a time. To make full use of the CPU cache, store related data in contiguous memory; that is, prefer arrays to linked lists wherever possible.
4. Non-contiguous structures such as linked lists have several problems. First, allocating memory is expensive, because the program has to search for free memory on each allocation; for garbage-collected languages such as Java, updating the linked list's references after a GC is a further cost. Second, there is memory fragmentation. For servers that stay online for a long time, I think linked structures should be avoided as far as possible.
5. Operate lock-free as far as possible. Even if the entire ticket pool is in memory, using locks to synchronize multiple threads causes several problems. First, taking a lock can involve a switch from user mode to kernel mode, which may cost thousands of instructions or more. Second, as threads are switched in and out, the code and data previously held in the CPU cache become invalid and have to be moved back and forth between cache and memory.
6. For concurrent processing of the ticket pool, I think only one thread should be responsible for writing, and the other threads should only read. With a single writer, first, the data can be handled without locks, and second, false sharing can be avoided.
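Points 5 and 6 can be illustrated with a remaining-ticket counter that one thread writes and any number of threads read. This is a sketch under stated assumptions, not a finished design: the names `RemainingCount`, `sell_one`, `refund_one`, and `remaining` are invented here, and the 64-byte cache line size is typical but not universal.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines.
constexpr std::size_t kCacheLine = 64;

// Each counter sits on its own cache line, so updating one train's counter
// does not invalidate the line that holds another train's counter.
struct alignas(kCacheLine) RemainingCount {
    std::atomic<std::uint32_t> value{0};
};

// Called only from the single writer thread. Because there is exactly one writer,
// a load followed by a store is enough; no lock or compare-and-swap loop is needed.
inline bool sell_one(RemainingCount& c) {
    std::uint32_t v = c.value.load(std::memory_order_relaxed);
    if (v == 0) return false;  // sold out
    c.value.store(v - 1, std::memory_order_release);
    return true;
}

// Also writer-only: puts a refunded ticket back.
inline void refund_one(RemainingCount& c) {
    std::uint32_t v = c.value.load(std::memory_order_relaxed);
    assert(v < UINT32_MAX);
    c.value.store(v + 1, std::memory_order_release);
}

// Safe to call from any number of reader threads.
inline std::uint32_t remaining(const RemainingCount& c) {
    return c.value.load(std::memory_order_acquire);
}
```

If a second writer thread were added, the load-then-store in `sell_one` would no longer be safe and would have to become a compare-and-swap loop or be protected by a lock, which is exactly the cost the single-writer rule is meant to avoid.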

Comparison of Existing Solutions
In a previous post I proposed a design based on directed graphs. I still stand by that design, but I have changed it to use the directed graph for indexing. First, a comparison with several other schemes from the forum (for details, see: http://12306ng.org/forum.php? MoD... 01 & fromuid = 5805 ):
1. The binary scheme. Although it is similar to the binary representation in my design, a major problem with the original binary scheme is that it does not seem to take the database implementation into account. For example, the post contains a query like this:
Where (station> 0011111100) and (not (Station & 0011111100) ^ 0011111100) limit 10

From the database implementation point of view, we need to consider how to build an index. A B-tree should not be able to support indexes on bitwise operations (correct me if I am wrong), and I do not know whether a bitmap index could support them; in any case MySQL does not seem to support bitmap indexes. Without an effective index, running the binary scheme in the database may degrade to a row-by-row scan, which means a large number of disk accesses and very low efficiency.

2. Two integers representing the origin station and the end station. This scheme has to maintain a binary tree structure constantly, and the number of nodes and the height of the tree are not fixed, because if a seat is split into several short-distance orders, that seat has several nodes in the binary tree.
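For comparison, the predicate from the first scheme's query can be evaluated over a contiguous in-memory array with a plain linear scan. The sketch below is an illustration only: `find_seats` is a hypothetical name, each seat is assumed to fit in one 64-bit word, and, following the quoted query, a set bit is taken to mean that the seat is still free on that leg.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the indexes of up to `limit` seats whose free legs include every leg in `wanted`.
// This is the in-memory counterpart of the row-by-row scan: every element is visited in
// order, but the data is contiguous, so the scan runs through memory sequentially.
inline std::vector<std::size_t> find_seats(const std::vector<std::uint64_t>& free_legs,
                                           std::uint64_t wanted, std::size_t limit) {
    assert(wanted != 0);
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < free_legs.size() && hits.size() < limit; ++i) {
        // Same test as (Station & mask) ^ mask == 0 in the query.
        if ((free_legs[i] & wanted) == wanted) hits.push_back(i);
    }
    return hits;
}
```

The `limit` parameter plays the role of `limit 10` in the query: the scan stops as soon as enough matching seats have been found.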

 

 


In the figure, each station (a node of the graph) keeps a list of all the trains (edges) passing through it, and directed edges indicate the direction of travel. A train is therefore made up of multiple edges.

Stations (such as Beijing) and trains (such as Shanghai 17) can be regarded as indexes for looking up data. For example, server-core/CPP/sites.h defines all stations as an enumeration type, and server-core/CPP/trains.h defines all trains as an enumeration type (names that start with a digit are prefixed with an underscore). Because stations and trains do not change often, they can be fixed in code. If they are updated in the future, you only need to supply the station and train configuration files and regenerate the two header files. If purchase orders store the indexes of the origin and terminal stations, the index value of each station name must stay unchanged when the code is regenerated; if orders store station names directly, the index values do not need to stay unchanged.
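The generated headers might look roughly like the following. The actual contents of sites.h and trains.h are not shown in this post, so every enumerator here, as well as the helpers `index_of` and `site_from_index`, is a hypothetical example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical contents of a generated sites.h: each station gets a stable index.
enum class Site : std::uint16_t { Shanghai, Nanjing, Beijing, Count };

// Hypothetical contents of a generated trains.h. An identifier cannot start with a
// digit, so a train number such as 1461 is written with a leading underscore.
enum class Train : std::uint16_t { G107, G108, _1461, Count };

// Enumerators double as array indexes.
constexpr std::size_t index_of(Site s) { return static_cast<std::size_t>(s); }
constexpr std::size_t index_of(Train t) { return static_cast<std::size_t>(t); }

constexpr std::size_t kSiteCount = index_of(Site::Count);
constexpr std::size_t kTrainCount = index_of(Train::Count);

// Converts an index stored in an order back to a station. This only gives the right
// answer if regenerating the header keeps every existing station at the same index.
inline Site site_from_index(std::size_t i) {
    assert(i < kSiteCount);
    return static_cast<Site>(i);
}
```

Appending new stations just before `Count` keeps existing indexes stable; inserting one in the middle would silently change the meaning of indexes already stored in orders.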

The index is shown in the inventory table. The two arrays at the top hold the remaining-ticket information for trains G108 and G107 respectively. A "-" indicates that the train passes through the station; its value is actually a pointer to the remaining-ticket data of the corresponding train.
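One way to picture this index in code is a per-station list of directed edges, each carrying a pointer to the inventory of the train it belongs to. This is a rough sketch only: the type names `TrainInventory`, `Edge`, and `StationIndex` and the method `add_route` are assumptions, and the inventory is reduced to a single counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Remaining-ticket data for one train (reduced to a counter for this sketch).
struct TrainInventory {
    std::uint32_t remaining = 0;
};

// One directed edge of the index: a train leaving this station toward `next_site`.
struct Edge {
    std::size_t next_site;      // index of the next stop on the train's route
    TrainInventory* inventory;  // the "-" cell of the table: a pointer, not a copy
};

// The index: for each station, the edges of all trains passing through it.
struct StationIndex {
    std::vector<std::vector<Edge>> edges_by_site;

    explicit StationIndex(std::size_t site_count) : edges_by_site(site_count) {}

    // Registers a train's route as a chain of directed edges between consecutive stops.
    void add_route(const std::vector<std::size_t>& stops, TrainInventory* inv) {
        assert(stops.size() >= 2 && inv != nullptr);
        for (std::size_t i = 0; i + 1 < stops.size(); ++i) {
            assert(stops[i] < edges_by_site.size());
            assert(stops[i + 1] < edges_by_site.size());
            edges_by_site[stops[i]].push_back(Edge{stops[i + 1], inv});
        }
    }
};
```

Because every edge of a train points at the same `TrainInventory`, a sale recorded once is visible from every station the train passes through.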




Because boarding at intermediate stations has to be supported, putting the binary scheme in the database causes many performance problems, so I wondered whether the binary scheme could be kept in memory instead. I think it can, mainly because of the following observations:
1. First, the remaining-ticket information of a train is essentially one large array, and it can be a bit array in which each bit represents the ticket status of one seat. As long as a ticket has been sold for the seat, whether from the origin station to the terminal station or for an intermediate segment, the bit is set to 1. The carriage configuration, the seats, and the carriage layout are fixed over a given period (at least within one day), so they can be treated as rarely changing.
2. A seat for which no ticket has been sold does not need a record in memory; its bit in the array above is simply 0.
3. A ticket from the origin station to the terminal station does not need a record in memory either; its bit in the array above is simply set to 1.
4. In memory, we only need a data structure that records the seats involved in intermediate boarding and alighting. This information can be expressed in binary: first, it uses little memory, and second, it is fast to compare and modify.
5. As for refunds, I am still deciding whether to put the ticket back into the ticket pool or to keep it in a separate linked list. I currently lean toward putting it back into the ticket pool.
6. We can borrow the data structure that Windows uses to manage memory allocation. It can be built as an array of arrays, where the index into the outer array is the number of free bits of the elements stored at that position, as shown in the figure.
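Points 1 to 3 above amount to one bit per seat for each train. A minimal sketch of such a bit array follows; the class name `SeatBits` and its methods are assumptions made for illustration, and refunds (clearing a bit) are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One bit per seat: 0 = no ticket sold for the seat, 1 = at least one ticket sold
// (for the whole route or for part of it). The bits live in contiguous 64-bit words.
class SeatBits {
public:
    explicit SeatBits(std::size_t seat_count)
        : seat_count_(seat_count), words_((seat_count + 63) / 64, 0) {}

    bool sold(std::size_t seat) const {
        assert(seat < seat_count_);
        return ((words_[seat / 64] >> (seat % 64)) & 1u) != 0;
    }

    void mark_sold(std::size_t seat) {
        assert(seat < seat_count_);
        words_[seat / 64] |= std::uint64_t{1} << (seat % 64);
    }

    // Index of the first seat with no ticket sold, or seat_count() if none is left.
    std::size_t first_unsold() const {
        for (std::size_t s = 0; s < seat_count_; ++s) {
            if (!sold(s)) return s;
        }
        return seat_count_;
    }

    std::size_t seat_count() const { return seat_count_; }

private:
    std::size_t seat_count_;
    std::vector<std::uint64_t> words_;
};
```

At one bit per seat, a train with 4096 seats needs only 512 bytes, which is why this part of the pool can reasonably be expected to stay in cache.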


 

Each train has a two-dimensional array like this. The first element holds all the seats whose longest run of consecutive free stations is one, that is, seats with only one station in a row where nobody is sitting. The second element holds the seats whose longest run of consecutive free stations is two. A seat in the second element may actually have three free stations in total, but it still goes in the second element because no more than two of them are consecutive.

Now suppose someone buys a ticket covering one station, for example the first stop. We look for the first matching seat in the first array and fill in its bit. The seat is now full, so we remove it from the array, and the slot it occupied becomes empty, as shown in the figure.

 


If someone buys a two-station ticket, we likewise look for the first matching seat in the second array. After the purchase the seat's value becomes "11111101"; because only one station is still free, we move the seat to the first array, as shown in the figure.
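The bucket operations described above can be sketched as follows. Several details are assumptions of this sketch rather than part of the design as posted: a set bit means the leg is already sold, bit i of the integer stands for the i-th leg of the route (so a seat written left to right as "11001101" is stored as 0xB3), larger buckets are searched when the exact one has no match, and the names `SeatBuckets`, `buy`, and `longest_free_run` are invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Occupancy = std::uint64_t;  // bit i set = leg i of the route is already sold

// Length of the longest run of free (zero) bits among the lowest `legs` bits.
inline unsigned longest_free_run(Occupancy occ, unsigned legs) {
    unsigned best = 0, run = 0;
    for (unsigned i = 0; i < legs; ++i) {
        run = ((occ >> i) & 1u) ? 0 : run + 1;
        if (run > best) best = run;
    }
    return best;
}

// Per-train buckets: buckets[k] holds the seats whose longest free run is exactly k legs.
struct SeatBuckets {
    unsigned legs;
    std::vector<std::vector<Occupancy>> buckets;

    SeatBuckets(unsigned leg_count, std::size_t empty_seats)
        : legs(leg_count), buckets(leg_count + 1) {
        assert(leg_count >= 1 && leg_count <= 64);
        buckets[leg_count] = std::vector<Occupancy>(empty_seats);  // all legs free
    }

    // Sells the legs from stop `from` to stop `to` on some seat; false if none fits.
    bool buy(unsigned from, unsigned to) {
        assert(from < to && to <= legs);
        unsigned need = to - from;
        Occupancy ones = (need == 64) ? ~Occupancy{0} : ((Occupancy{1} << need) - 1);
        Occupancy want = ones << from;
        for (unsigned k = need; k <= legs; ++k) {  // tightest bucket first
            std::vector<Occupancy>& bucket = buckets[k];
            for (std::size_t i = 0; i < bucket.size(); ++i) {
                if ((bucket[i] & want) != 0) continue;  // a requested leg is taken
                Occupancy updated = bucket[i] | want;
                bucket[i] = bucket.back();  // remove from the old bucket (order not kept)
                bucket.pop_back();
                unsigned run = longest_free_run(updated, legs);
                if (run > 0) buckets[run].push_back(updated);  // full seats are dropped
                return true;
            }
        }
        return false;
    }
};
```

For the example in the text, a seat "11001101" starts in bucket 2; after `buy(2, 4)` it becomes "11111101" and moves to bucket 1, and a further `buy(6, 7)` fills it completely, so it leaves the structure.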

 


In this two-dimensional array, the number of columns for each train is fixed, because the number of stations on each train's route is fixed. The array behind each column can be grown dynamically if it runs out of space (this is a risk, and I have not yet worked through the extreme cases carefully).

To save memory, each seat of a train is stored as one long integer of 8 bytes. The first 14 bits identify the seat within the train: 12 of the 14 bits hold the index, which can represent 4096 seats and should be enough to cover seats, berths, and standing tickets, and the other 2 bits can be used as flags (I have not yet decided what for). The last 50 bits hold the station occupancy information of the seat, as shown in the figure.
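The 64-bit layout can be sketched with a few shifts and masks. The text fixes only the field widths (12 + 2 + 50 bits); placing the index above the flags within the first 14 bits, and the function names used here, are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kOccBits = 50;    // station occupancy, lowest 50 bits
constexpr unsigned kFlagBits = 2;    // two flag bits, purpose not yet decided
constexpr unsigned kIndexBits = 12;  // seat index within the train, highest 12 bits
static_assert(kOccBits + kFlagBits + kIndexBits == 64, "fields must fill the word");

constexpr std::uint64_t kOccMask = (std::uint64_t{1} << kOccBits) - 1;
constexpr std::uint64_t kFlagMask = (std::uint64_t{1} << kFlagBits) - 1;
constexpr std::uint64_t kIndexMask = (std::uint64_t{1} << kIndexBits) - 1;

// Packs the three fields into one 8-byte word.
inline std::uint64_t pack_seat(unsigned index, unsigned flags, std::uint64_t occupancy) {
    assert(index <= kIndexMask && flags <= kFlagMask && occupancy <= kOccMask);
    return (std::uint64_t{index} << (kOccBits + kFlagBits))
         | (std::uint64_t{flags} << kOccBits)
         | occupancy;
}

inline unsigned seat_index(std::uint64_t word) {
    return static_cast<unsigned>(word >> (kOccBits + kFlagBits));
}

inline unsigned seat_flags(std::uint64_t word) {
    return static_cast<unsigned>((word >> kOccBits) & kFlagMask);
}

inline std::uint64_t seat_occupancy(std::uint64_t word) {
    return word & kOccMask;
}
```

Keeping the occupancy in the low bits means the leg masks used for matching can be applied to `seat_occupancy(word)` without any further shifting.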



I also have ideas about distributed support and load balancing, and I will write up more of my thoughts over the next two weeks.
