Short URL Service Design


Background

A short URL service converts a long URL supplied by the user into a short one (see the case at the end of this article). When a user later requests the short URL, the service looks up and returns the real URL.
What points need to be considered when designing such a service?

Data

First, consider how the short URLs should be stored. A key-value structure is sufficient:
Key: the generated short URL, which must be unique;
Value: the original real URL;

Algorithm

The algorithm for computing short URLs can be very simple, since each short URL only has to map to exactly one original URL.
Start from 1 and increment a counter, assigning the next number to each new URL;
Each character of the key can be one of 26 letters + 10 digits, i.e. the number is written in base 36;
Of course, before generating a new key, look up the existing data by value to check whether the URL has already been shortened; if it has, return the existing key directly;
How can duplicates be located quickly by value? Deduplicating with an STL set is one method; is there a better way?
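
The increment-and-encode scheme above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the implementation of any real service: the names `to_base36` and `Shortener` and the digit order `0-9a-z` are hypothetical, and a second dictionary keyed by the long URL is used as one possible answer to the lookup-by-value question.

```python
import string

# Assumed alphabet: 10 digits followed by 26 lowercase letters (36 symbols).
ALPHABET = string.digits + string.ascii_lowercase


def to_base36(n: int) -> str:
    """Encode a non-negative integer in base 36."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 36)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class Shortener:
    """In-memory shortener: incrementing ID plus a reverse index for dedup."""

    def __init__(self) -> None:
        self.next_id = 1          # IDs start from 1, as in the text
        self.key_to_url = {}      # key -> original URL
        self.url_to_key = {}      # original URL -> key (lookup by value)

    def shorten(self, url: str) -> str:
        existing = self.url_to_key.get(url)
        if existing is not None:
            return existing       # duplicate: return the existing key
        key = to_base36(self.next_id)
        self.next_id += 1
        self.key_to_url[key] = url
        self.url_to_key[url] = key
        return key

    def resolve(self, key: str):
        """Return the original URL, or None if the key is unknown."""
        return self.key_to_url.get(key)
```

The reverse dictionary gives average constant-time duplicate checks at the cost of storing every long URL twice; whether that trade is acceptable depends on the memory budget.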

Determine the length of the key and the length of the value

The value length can be capped at 500 characters; URLs generally do not exceed this;
key:t.cn/* *
The length of the key determines how many short URLs can be supported;
With a 36-symbol alphabet, a key length of 5 characters supports more than 60 million URLs (36^5), and a length of 6 characters more than 2.1 billion (36^6);
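
The capacity figures follow directly from the alphabet size raised to the key length. A small Python sketch of that arithmetic, assuming the 36-symbol alphabet (26 letters + 10 digits) described earlier; the function names are hypothetical.

```python
def key_capacity(length: int, alphabet_size: int = 36) -> int:
    """Number of distinct keys of exactly `length` characters."""
    return alphabet_size ** length


def min_key_length(n_urls: int, alphabet_size: int = 36) -> int:
    """Smallest key length whose capacity covers n_urls."""
    length = 1
    while alphabet_size ** length < n_urls:
        length += 1
    return length
```

For example, `key_capacity(5)` is 60,466,176 and `key_capacity(6)` is 2,176,782,336.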

Data capacity

Estimate the data capacity: how much space will the data occupy? For efficiency, this kind of service generally operates entirely in memory;
If the data fits on a single machine, use a single machine;
If it does not fit on a single machine, sharding is required. The sharding strategy can be based on the increment range of the key, or on the key modulo the number of servers;
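
A rough way to decide between a single machine and sharding is to multiply the number of entries by the per-entry size. The sketch below is a back-of-the-envelope estimate only: it assumes one byte per character, the 500-character value cap from the previous section, and no container or allocator overhead (real in-memory stores will use noticeably more). The function names and the per-server memory figure are hypothetical.

```python
def estimate_bytes(n_urls: int, key_len: int = 6, value_len: int = 500) -> int:
    """Upper-bound payload size, ignoring container overhead."""
    return n_urls * (key_len + value_len)


def servers_needed(n_urls: int, mem_per_server_bytes: int,
                   key_len: int = 6, value_len: int = 500) -> int:
    """Number of servers required to hold the payload in memory."""
    total = estimate_bytes(n_urls, key_len, value_len)
    # Integer ceiling division avoids floating-point rounding.
    return max(1, -(-total // mem_per_server_bytes))
```

Under these assumptions, all 5-character keys (about 60 million entries) fit in roughly 30 GB, so a single large-memory machine could hold them, while the full 6-character key space would not.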

Sharding Policy

Sharding based on the increment range of the key

Advantage: Simple to expand; once the current server's capacity is exceeded, just add another machine;
Disadvantage: The load may be unbalanced. Newly generated short URLs are generally the ones accessed most often, so the servers holding older short URLs sit idle;

Sharding based on the key modulo the number of servers

Advantage: The load is more evenly balanced;
Disadvantage: Difficult to expand capacity;

Trade-off: Estimate the data capacity first and determine the number of servers, then use the second (modulo) method. When the data exceeds the estimated capacity, use the first (range) method to route the excess keys to new servers (a patch).
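
The trade-off described above (modulo for the planned capacity, range for the overflow) could look like the following routing function. This is an illustrative sketch; the function name, parameters, and server numbering are assumptions rather than part of the original design.

```python
def route(key_id: int, base_servers: int, planned_capacity: int,
          overflow_per_server: int) -> int:
    """Return the index of the server that owns key_id (IDs start at 1).

    IDs up to planned_capacity are spread over the first `base_servers`
    servers by modulo. IDs beyond that are assigned by range to extra
    servers numbered base_servers, base_servers + 1, ...
    """
    if key_id < 1:
        raise ValueError("key_id must be >= 1")
    if key_id <= planned_capacity:
        return key_id % base_servers
    overflow = key_id - planned_capacity - 1
    return base_servers + overflow // overflow_per_server
```

With 4 base servers, a planned capacity of 1000 IDs and 250 IDs per overflow server, ID 8 goes to server 0, ID 1001 goes to the first overflow server (index 4), and ID 1251 goes to the next one (index 5). Existing keys never move when an overflow server is added, which is the point of the patch.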

Interface design

Define the interface protocol for incoming user requests, along with the user-facing inputs and outputs.

Concurrent read and write and data storage

What should be used to store this key-value data?
An STL hash map container looks sufficient, but map is not thread-safe, so locking has to be considered;
If the real-time requirements are low, two in-memory copies (A and B) can be used: one is online and serves reads, the other is offline and takes writes, and the two are synchronized periodically;
However, once the user has entered a long URL, the resulting short URL has to be displayed right away, so writes also need to be real-time;
With a real-time requirement, either lock the map or directly use a third-party in-memory store such as Redis or memcached;
Reading from and writing to Redis asynchronously further improves concurrency;
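
The "lock the map" option can be illustrated with a small in-process store. This is a Python sketch of the idea rather than the C++ STL code the text has in mind; the class name `LockedStore` is hypothetical, and plain integer IDs stand in for the encoded short keys to keep the example short.

```python
import threading


class LockedStore:
    """Key-value store guarded by a single mutex."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._id_to_url = {}
        self._url_to_id = {}
        self._next_id = 1

    def put(self, url: str) -> int:
        """Return the ID for url, allocating a new one if needed."""
        # The check-then-insert sequence must be atomic, otherwise two
        # threads could allocate different IDs for the same URL.
        with self._lock:
            existing = self._url_to_id.get(url)
            if existing is not None:
                return existing
            new_id = self._next_id
            self._next_id += 1
            self._id_to_url[new_id] = url
            self._url_to_id[url] = new_id
            return new_id

    def get(self, url_id: int):
        """Return the URL for url_id, or None if it is unknown."""
        with self._lock:
            return self._id_to_url.get(url_id)
```

A single lock serializes all reads and writes, which is simple but limits throughput; that limitation is one reason the text goes on to suggest an external in-memory store.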

Network

If the volume of user requests can be handled by a gigabit network link, use a single-threaded event loop (non-blocking I/O + I/O multiplexing);
If the request volume is larger, use multiple reactor event loops: the accepting reactor is only responsible for listening for connection events, and once a connection is established, handling of the user's requests is handed off to a downstream worker reactor;
Simple query and update logic can be processed directly in the I/O event loop (similar to the Nginx architecture);
If the update logic is complex, consider adding a background process/thread pool to handle the write operations asynchronously;
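
The single-threaded event loop case can be sketched with Python's standard `asyncio` library, which multiplexes non-blocking sockets on one thread. The line-based `GET <key>` / `PUT <url>` protocol, the function names, the port, and the use of decimal counters as keys are all assumptions made for illustration; the original text does not specify a protocol.

```python
import asyncio


def handle_line(store: dict, line: str) -> str:
    """Process one request line and return the reply (no I/O here)."""
    parts = line.strip().split(" ", 1)
    if len(parts) != 2:
        return "ERR bad request"
    command, arg = parts
    if command == "GET":
        url = store.get(arg)
        return "OK " + url if url is not None else "ERR not found"
    if command == "PUT":
        key = str(len(store) + 1)   # simplified key: decimal counter
        store[key] = arg
        return "OK " + key
    return "ERR unknown command"


async def handle_client(store, reader, writer):
    """Serve one connection; simple logic runs directly in the loop."""
    try:
        while True:
            raw = await reader.readline()
            if not raw:             # client closed the connection
                break
            reply = handle_line(store, raw.decode("utf-8", "replace"))
            writer.write((reply + "\n").encode("utf-8"))
            await writer.drain()
    finally:
        writer.close()
        await writer.wait_closed()


async def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the event loop server until cancelled."""
    store = {}

    async def on_connect(reader, writer):
        await handle_client(store, reader, writer)

    server = await asyncio.start_server(on_connect, host, port)
    async with server:
        await server.serve_forever()
```

Running `asyncio.run(serve())` starts the server. Because every handler runs on the same thread, the shared dictionary needs no lock here; that stops being true once work is handed to a thread pool.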

Security

(Optional) Consider a malicious user who constructs non-existent URLs and submits them continuously in order to use up short URL IDs;
The URL can be checked for validity (fetching the URL directly to verify it is too time-consuming and not very practical);
Limit the number of requests from the same source;
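
Both countermeasures can be sketched without any network access: a syntactic URL check and a per-source sliding-window limit. The names `looks_like_url` and `RateLimiter`, the accepted schemes, and the limits are assumptions chosen for illustration; the 500-character cap reuses the value length from earlier.

```python
import time
from collections import defaultdict, deque
from urllib.parse import urlsplit


def looks_like_url(url: str, max_len: int = 500) -> bool:
    """Cheap syntactic check; does not fetch the URL."""
    if len(url) > max_len:
        return False
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


class RateLimiter:
    """Allow at most max_requests per source within a sliding window."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock=time.monotonic) -> None:
        self.max_requests = max_requests
        self.window = window_seconds
        self._clock = clock                 # injectable for testing
        self._hits = defaultdict(deque)     # source -> request timestamps

    def allow(self, source: str) -> bool:
        now = self._clock()
        hits = self._hits[source]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request would be accepted only if `looks_like_url(url)` and `limiter.allow(source_ip)` both return true. What identifies a "source" (IP address, account, API token) is left open, as it is in the text.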

Case

The short URL generator at http://t.im/ uses base-36 increments.
For example, submitting several different long URLs in a row returns these short URLs:
http://t.im/vgu8
http://t.im/vgu9
http://t.im/vgu0
http://t.im/vgua
This also shows that the site's concurrency is low: my requests were several seconds apart, yet they received consecutive keys;
The site also applies no special URL validation rules: input such as A.BB.CCC is accepted as a legitimate URL;

Posted by: Big CC | 06 Nov 2015
Blog: blog.me115.com
Github: Big cc
