The Concept of Ranking Web Pages Based on User Browsing Records

Source: Internet
Author: User

 

Google's PageRank needs little introduction. It is an algorithm that measures the importance of a web page, and it is essentially the result of web pages voting for one another through their links. Based on this property, SEO usually works in one of two ways: using a sitemap so that search engines can crawl as much of a site's content as possible, or building more external links to raise the site's PR value.
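
The "mutual voting" idea can be illustrated with a minimal sketch of the textbook PageRank power iteration. This is not Google's production algorithm; the function name pagerank, the damping factor 0.85, the iteration count, and the tiny link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    The returned scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: "d" links out but nobody links to it.
graph = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
scores = pagerank(graph)
```

In this toy graph, page "c" collects the most votes, while "d" keeps only the minimum share because no page links to it, regardless of how good its content is.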

Most search engines on the market use methods similar to PageRank. To ensure fairness, they all rely on purely automated means, traversing websites with web crawlers. This leads to some interesting problems:

1. A web page's content may be excellent, but because too few external links point to it, crawlers may fail to reach it within the preset depth threshold, so it becomes "dark content" that few people ever see.

2. Some websites rank well in search even though their content is reposted or low-value, simply because they have a high PR value. Even when technology-leading search engines use semantic networks to identify high-quality content, the results are still not good enough.
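
The first problem can be reproduced with a small sketch of a depth-limited, breadth-first crawl. It runs over an in-memory link graph rather than the real web; the function name crawl, the max_depth parameter, and the page names are hypothetical.

```python
from collections import deque


def crawl(links, seeds, max_depth):
    """Breadth-first traversal of a link graph, stopping at max_depth.

    Returns a dict mapping each reached page to the depth where it was found.
    """
    depth_of = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        if depth_of[page] >= max_depth:
            # Links on pages at the depth limit are not followed.
            continue
        for target in links.get(page, []):
            if target not in depth_of:
                depth_of[target] = depth_of[page] + 1
                queue.append(target)
    return depth_of


# Hypothetical site in which a good page sits four clicks from the home page.
site = {
    "home": ["news", "about"],
    "news": ["archive"],
    "archive": ["old-post"],
    "old-post": ["great-essay"],
}
reached = crawl(site, ["home"], max_depth=3)
is_dark = "great-essay" not in reached
```

With a depth threshold of 3, "great-essay" is never visited, so it stays out of the index no matter how valuable it is. Raising max_depth to 4 would reach it, at the cost of crawling far more pages on a real site.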

To avoid these problems, one research direction is to introduce user data when judging the importance and quality of web page content. How might this be done?

Hypothesis: browsing behavior is the best way to judge the quality of web pages, because it amounts to labeling by users. With large-scale data, the results should be better than those of machine-only methods.

Principle:

1. Use a browser or other client software (ideally a firewall or other security software) to collect user browsing logs, and upload them to the search engine's crawler database.

2. The crawler matches the logs against the existing index, finds content that has not been indexed yet, and crawls it.

3. Use the user logs to vote for web pages: the longer the browsing time, the higher the weight of the vote. The rank of each page is then calculated from these votes.
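
Steps 2 and 3 above can be sketched as follows. The original gives no concrete formula, so everything here is an assumption: the log format of (url, seconds) pairs, the function names find_unindexed and people_rank, the linear dwell-time weight, and the cap_seconds cap that keeps a single idle tab from dominating.

```python
from collections import defaultdict


def find_unindexed(log_entries, indexed_urls):
    """Step 2: URLs seen in user logs that the index does not contain yet."""
    return {url for url, _ in log_entries} - set(indexed_urls)


def people_rank(log_entries, cap_seconds=600.0):
    """Step 3: each visit votes for its URL, weighted by browsing time.

    cap_seconds is a hypothetical upper bound on the weight of one visit.
    The returned scores are normalized so that they sum to 1.
    """
    votes = defaultdict(float)
    for url, seconds in log_entries:
        votes[url] += min(max(seconds, 0.0), cap_seconds)
    total = sum(votes.values())
    if total == 0:
        return {url: 0.0 for url in votes}
    return {url: weight / total for url, weight in votes.items()}


# Hypothetical browsing log: (url, seconds spent on the page).
logs = [
    ("example.com/essay", 300.0),
    ("example.com/essay", 240.0),
    ("example.com/repost", 5.0),
    ("example.com/home", 55.0),
]
index = {"example.com/home", "example.com/repost"}

to_crawl = find_unindexed(logs, index)
scores = people_rank(logs)
```

In this example the essay is missing from the index, so it is queued for crawling, and it also receives most of the vote because readers stayed on it far longer than on the reposted page. A real system would need a more careful weighting, since long dwell time can also mean a confusing page or an abandoned tab.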

Defects:

1. Dependence on client software

2. User privacy issues

Workarounds:

1. Promote concepts such as cloud anti-virus, cloud defense, and cloud security, so that users agree to upload their browsing records.

2. Upload covertly: encrypt and split the browsing records (other files could be handled the same way), then combine and restore them on the server.

Now, let's give it a grand and profound name: PeopleRank.

Finally, I am very serious about technology.

By sluke (Lu Weiqing). Original address: http://luplusplus.com/peoplerank-modle
