Short URL algorithm: an analysis of how Weibo short URLs work (implemented in Java)


I recently needed a short URL algorithm for a project, so I searched the internet. I found C#/.NET and PHP versions but, frustratingly, no Java version. I also found many people posting for help with implementing the short URL algorithm in Java. So I took the PHP short URL algorithm that is popular on the internet as a starting point.

Based on my understanding of it, I implemented the short URL algorithm in Java. (\(^O^)/ Yes! I'm rather pleased with myself!)

First, some background (taken from other people's posts, mainly so that everyone knows what short URLs are).

Short URLs are now widely used by microblogging services across the country, for example Tencent (QQ) Weibo's url.cn and Sina Weibo's t.cn.

When we post a URL on Sina Weibo, Weibo automatically recognizes it and converts it into a short URL such as http://t.cn/hrynr0. The reasons for doing so are as follows:

1. Weibo posts are limited to 140 characters. If we want to share a link that is long enough to take up nearly half of the post, that is clearly unacceptable, and this is why short URLs came into being.

2. Short URLs make it easy to manage the URLs that users post in our project. Some URLs may lead to inappropriate content, violence, advertisements, and so on. Based on user reports, we can block such a link entirely so that it no longer appears in our application. This works because the same long URL always produces the same short URL.

3. We can collect statistics such as traffic and click counts for a set of URLs and find out what most users care about. This helps us make better decisions about the future work of the project.

These three points are purely my personal opinion. Because I will be using short URLs in some of my later projects, I took the time to study them. Next, let's look at the theory behind the short URL mapping algorithm (from information found on the internet):

① Use the MD5 algorithm to generate a 32-character hexadecimal signature string for the long URL, and split it into 4 segments of 8 characters each;

② Process the four segments in a loop: treat the 8 characters of each segment as a hexadecimal number and bitwise AND it with 0x3fffffff (30 one-bits), so that anything beyond 30 bits is discarded;

③ Split the resulting 30 bits into six groups of 5 bits, use each 5-bit number as an index into the alphabet to get one character, and collect the 6 characters in order into a 6-character string;

④ The MD5 string as a whole therefore yields four 6-character strings, and any one of them can be used as the short URL for the long URL.
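Steps ② and ③ can be sketched for a single segment as follows. This is a minimal illustration, not the code from the original post: the class name SegmentSketch, the method encodeSegment, and the 32-symbol ALPHABET are assumptions, chosen so that a 5-bit index always lands on exactly one symbol. The full listing later in this article instead indexes a 62-character table with the mask 0x3d.

```java
// Sketch of steps 2 and 3 for one 8-character hexadecimal segment.
// All names here are illustrative assumptions, not part of the original post.
class SegmentSketch {
    // Assumed alphabet: 32 symbols, so every 5-bit value (0..31) maps to one character.
    static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz012345";

    static String encodeSegment(String hex8) {
        // Step 2: parse the segment as hexadecimal and keep only the low 30 bits.
        long value = Long.parseLong(hex8, 16) & 0x3FFFFFFFL;
        StringBuilder out = new StringBuilder(6);
        // Step 3: consume the 30 bits as six 5-bit groups, lowest group first.
        for (int k = 0; k < 6; k++) {
            int index = (int) (value & 0x1F);
            out.append(ALPHABET.charAt(index));
            value >>>= 5;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeSegment("00000000")); // every index is 0  -> aaaaaa
        System.out.println(encodeSegment("ffffffff")); // every index is 31 -> 555555
    }
}
```

The two top bits of each segment are dropped by the 30-bit mask, which is why "ffffffff" and "3fffffff" would encode to the same six characters.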

In theory, the short URL obtained this way is not guaranteed to be unique. However, because each long URL yields four candidates, a collision on all four should be very rare.
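One way to make use of the four candidates is to try them in order and keep the first one that is not already taken by a different long URL. The sketch below assumes a hypothetical in-memory map as the store; the class name CandidatePicker, the method register, and the sample codes are made up for illustration (the codes are placeholders, not real hash output), and a real service would presumably use a database instead.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: choose among candidate codes while avoiding collisions.
// The store and all names are hypothetical, for illustration only.
class CandidatePicker {
    // Hypothetical store: short code -> long URL.
    private final Map<String, String> store = new HashMap<>();

    // Returns the code now mapped to longUrl, or null if every candidate
    // is already used by a different long URL.
    String register(String longUrl, String[] candidates) {
        for (String code : candidates) {
            // putIfAbsent stores the mapping only if the code is free
            // and returns the previous value otherwise.
            String existing = store.putIfAbsent(code, longUrl);
            if (existing == null || existing.equals(longUrl)) {
                return code;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        CandidatePicker picker = new CandidatePicker();
        // Placeholder candidates; in practice these come from the hashing step.
        String[] a = {"aaaaaa", "bbbbbb", "cccccc", "dddddd"};
        String[] b = {"aaaaaa", "eeeeee", "ffffff", "gggggg"};
        System.out.println(picker.register("http://example.com/a", a)); // aaaaaa
        System.out.println(picker.register("http://example.com/b", b)); // eeeeee (first candidate was taken)
        System.out.println(picker.register("http://example.com/a", a)); // aaaaaa (same URL, same code)
    }
}
```

Registering the same long URL twice returns the same code, which preserves the property that one long URL maps to one short URL.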

First, you need to know how to compute the MD5 hash of a string in Java and get it as a 32-character hexadecimal string. Below is the Java MD5 helper I have put together:

import java.security.MessageDigest;

// The class name MD5Util is illustrative; the original post does not show the enclosing class.
public class MD5Util {
    private final static String[] hexDigits = {
        "0", "1", "2", "3", "4", "5", "6", "7",
        "8", "9", "a", "b", "c", "d", "e", "f"};

    // Convert a byte array to a lowercase hexadecimal string.
    public static String byteArrayToHexString(byte[] b) {
        StringBuffer resultSb = new StringBuffer();
        for (int i = 0; i < b.length; i++) {
            resultSb.append(byteToHexString(b[i]));
        }
        return resultSb.toString();
    }

    // Convert one byte to two hexadecimal digits.
    private static String byteToHexString(byte b) {
        int n = b;
        if (n < 0)
            n = 256 + n; // map the signed byte to 0..255
        int d1 = n / 16;
        int d2 = n % 16;
        return hexDigits[d1] + hexDigits[d2];
    }

    // Return the MD5 digest of origin (encoded as UTF-8) as 32 hex characters,
    // or null if hashing fails.
    public static String MD5Encode(String origin) {
        String resultString = null;
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            resultString = byteArrayToHexString(md.digest(origin.getBytes("UTF-8")));
        } catch (Exception ex) {
            // Errors are ignored, as in the original; the method then returns null.
        }
        return resultString;
    }

    public static void main(String[] args) {
        String data = "189022881112011111118:09sz0000123456789987654321";
        System.out.println(MD5Encode(data));
    }
}
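For comparison, the same 32-character hexadecimal digest can be produced with less code using only the standard library. This is an alternative sketch rather than the helper from the original post; the class name Md5Hex and the method md5Hex are assumptions for illustration.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch: MD5 as a 32-character lowercase hex string using the standard library.
// The class and method names are illustrative.
class Md5Hex {
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            // Interpret the 16 digest bytes as a non-negative integer and
            // left-pad with zeros to exactly 32 hexadecimal digits.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // Fail loudly instead of returning null.
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex(""));    // d41d8cd98f00b204e9800998ecf8427e
        System.out.println(md5Hex("abc")); // 900150983cd24fb0d6963f7d28e17f72
    }
}
```

The zero padding matters: without it, a digest whose leading bytes are zero would come out shorter than 32 characters and could not be split into four 8-character segments.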

 

 

 

public class ShortUrl {
    public static void main(String[] args) {
        String url = "http://www.sunchis.com";
        for (String string : shortText(url)) {
            print(string);
        }
    }

    public static String[] shortText(String string) {
        String key = "xuliang"; // custom key mixed in before the MD5 hash is computed
        // Characters used to build the short URL. The letter case in the source
        // listing is garbled, so the ordering here (lowercase, digits, uppercase)
        // is an assumption.
        String[] chars = new String[] {
            "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
            "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
            "0", "1", "2", "3", "4", "5", "6", "7", "8", "9",
            "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M",
            "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"};

        // MD5Encode is the MD5 helper listed above; copy it into this class
        // or qualify the call with the name of the class that holds it.
        String hex = MD5Encode(key + string);
        int hexLen = hex.length();
        int subHexLen = hexLen / 8; // 32 / 8 = 4 segments
        String[] shortStr = new String[4];

        for (int i = 0; i < subHexLen; i++) {
            String outChars = "";
            int j = i + 1;
            String subHex = hex.substring(i * 8, j * 8);
            // Keep only the low 30 bits of the segment.
            long idx = Long.valueOf("3fffffff", 16) & Long.valueOf(subHex, 16);
            for (int k = 0; k < 6; k++) {
                // The mask 0x3d is kept from the original listing; it is 6 bits
                // wide, while idx is shifted by 5 bits on each pass.
                int index = (int) (Long.valueOf("0000003d", 16) & idx);
                outChars += chars[index];
                idx = idx >>> 5;
            }
            shortStr[i] = outChars;
        }
        return shortStr;
    }

    private static void print(Object messagr) {
        System.out.println(messagr);
    }
}

Now for the result. You can call the shortText(url) method directly, and it returns the following four values:

shortText("http://www.sunchis.com")[0]; // value: Jzyqma
shortText("http://www.sunchis.com")[1]; // value: QBrMzm
shortText("http://www.sunchis.com")[2]; // value: bQreM3
shortText("http://www.sunchis.com")[3]; // value: VNBRna

Any one of these four values can be chosen as the short URL for the original URL.
