Simple implementation of the PHP "related article recommendation" Function

Last Update:2014-04-11 Source: Internet

Author: User


On content websites, each article usually needs to show a list of related articles. Most people use the following method: create a keyword list, determine which keywords each article contains, and then find the most relevant articles based on those keywords. For websites with varied content, maintaining the keyword list is obviously troublesome.

Later I checked some PHP functions and found that similar_text (PHP 4, PHP 5) could easily meet my requirements. The idea is: retrieve all article titles from the article list, use similar_text to compare each one with the current title, store the comparison results in an array, and re-sort the titles by similarity. The result is a list of similar articles.

The key function used in this approach is:

int similar_text ( string $first, string $second [, float &$percent ] )

It returns the number of matching bytes in the two strings. If the optional third argument is passed by reference, it receives the similarity as a percentage.
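As a quick illustration of the return value and the optional percentage argument, here is a minimal sketch. The two sample strings are arbitrary and are not taken from the article.

```php
<?php
// Count the bytes that the two strings have in common.
$common = similar_text("World", "Word");

// Passing a third argument also yields the similarity as a percentage.
similar_text("World", "Word", $percent);

echo $common, "\n";            // 4
echo round($percent, 2), "\n"; // 88.89
```

The percentage is computed from the matching byte count and the combined length of both strings, so it is often easier to compare across titles of different lengths than the raw count.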

Following this idea, we create the function below. It rearranges the $arr_title array in descending order of similarity to $title.

<?php
// Title of the current article; set this to a real title before running.
$demo_title = "";
$demo_arr_title = array(
    "simple modern magic",
    "simple modern magic",
    "concise ancient magic",
    "modern magic is not simple",
    "modern magic is difficult to understand"
);

$new_array = getSimilar($demo_title, $demo_arr_title);
// print_r($new_array);

echo "the first three articles most relevant to [$demo_title] are: <br/>";
for ($j = 0; $j <= 2; $j++) {
    echo ($j + 1) . ":" . $new_array[$j] . "<br/>";
}

// $title: current title. $arr_title: array of titles to be searched.
function getSimilar($title, $arr_title)
{
    $arr_similar = array();
    $arr_len = count($arr_title);
    for ($i = 0; $i <= ($arr_len - 1); $i++) {
        // get the number of similar bytes between the two strings
        $arr_similar[$i] = similar_text($arr_title[$i], $title);
    }
    arsort($arr_similar); // sort by the number of similar bytes, highest first
    reset($arr_similar);  // move the pointer to the first element of the array

    $new_title_array = array();
    $index = 0;
    foreach ($arr_similar as $old_index => $similar) {
        $new_title_array[$index] = $arr_title[$old_index];
        $index++;
    }
    return $new_title_array;
}
?>

Program running result:

The first three articles most relevant to [helper's house] are:
1: simple and clear modern magic
2: easy to understand modern magic
3: Concise and concise ancient magic
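A possible variation on the same idea is sketched below. It ranks by the percentage that similar_text reports instead of the raw byte count, skips the current article's own title, and returns only the top few entries. The function name getTopSimilar, its parameters, and the sample titles are hypothetical and do not come from the original article.

```php
<?php
// Hypothetical helper: rank titles by similar_text()'s percentage
// and keep the $limit most similar ones.
function getTopSimilar(string $title, array $titles, int $limit = 3): array
{
    $scores = [];
    foreach ($titles as $index => $candidate) {
        if ($candidate === $title) {
            continue; // skip the current article itself
        }
        similar_text($candidate, $title, $percent);
        $scores[$index] = $percent;
    }

    arsort($scores); // highest percentage first, original keys preserved

    $top = [];
    foreach (array_slice($scores, 0, $limit, true) as $index => $score) {
        $top[] = $titles[$index];
    }
    return $top;
}

// Usage with made-up titles: "abd" shares two bytes with "abc", "xyz" none.
$result = getTopSimilar("abc", ["abc", "abd", "xyz"], 2);
print_r($result);
```

Using the percentage keeps long titles from outranking short ones merely because they contain more bytes.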

Note the following:

Someone has tested the speed of similar_text and reported the following:

The speed issues for similar_text seem to be only an issue for long sections of text (> 20000 chars).

I found a huge performance improvement in my application by just testing if the string to be tested was less than 20000 chars before calling similar_text.

20000+ took 3-5 secs to process, anything else (10000 and below) took a fraction of a second. Fortunately for me, there was only a handful of instances with > 20000 chars which I couldn't get a comparison % for.

So comparing the full text of articles directly may be slow; comparing titles avoids this problem.
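The length check described in the quoted test could look roughly like the sketch below. The function name guardedSimilarText and the choice to return null for oversized input are assumptions made for illustration; the 20000 threshold comes from the quoted report and is not a documented limit.

```php
<?php
// Hypothetical wrapper: skip similar_text() when either string is too long.
// Returns null when the comparison is skipped.
function guardedSimilarText(string $a, string $b, int $maxLen = 20000): ?int
{
    if (strlen($a) >= $maxLen || strlen($b) >= $maxLen) {
        return null; // too long, the comparison would likely be slow
    }
    return similar_text($a, $b);
}

var_dump(guardedSimilarText("World", "Word"));               // int(4)
var_dump(guardedSimilarText(str_repeat("a", 20000), "a"));   // NULL
```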

This function may not work very well for English (I have not tried it). You can split an English sentence on spaces into words and then write a word-based function that works like similar_text.
When a title contains many characters that are not part of any keyword, the result may be unsatisfactory.
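The word-splitting idea for English could be sketched as follows. This is only one possible interpretation: it counts distinct words that both titles share, ignoring case. The function name wordSimilarity is hypothetical, and unlike similar_text it does not take word order into account.

```php
<?php
// Hypothetical word-level counterpart to similar_text():
// returns the number of distinct words the two strings have in common.
function wordSimilarity(string $a, string $b): int
{
    // Lowercase, split on whitespace, and drop duplicate words.
    $wordsA = array_unique(preg_split('/\s+/', strtolower($a), -1, PREG_SPLIT_NO_EMPTY));
    $wordsB = array_unique(preg_split('/\s+/', strtolower($b), -1, PREG_SPLIT_NO_EMPTY));

    return count(array_intersect($wordsA, $wordsB));
}

// "modern", "magic" and "simple" appear in both titles.
echo wordSimilarity("modern magic is not simple", "Simple modern magic"), "\n"; // 3
```

Common words such as "is" or "the" would still count as matches here, so filtering them out before comparing may improve the ranking.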

This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
