Two or three questions from the Iveely search engine: use your wisdom to solve them!

Last Update:2018-12-05 Source: Internet

Author: User

This evening I wrote up two simple questions about search engines. Both come from the Iveely search engine. Please share your wisdom! They are not difficult, but I hope we can find the best solutions.

Question 1:

Background:

In the search process, we segment the user's query into keywords and then match each one. For example, if the user enters "Program Life", word segmentation yields "program" and "life". We then extract the webpage set for "program" (9.00235, 123.00691, 96.00035, ...) and the webpage set for "life" (6.00025, 123.00128, 95.00245, ...). In each value, the integer part is the webpage number and the fractional part is the weight of the keyword on that webpage. Next, we merge the webpage sets of "program" and "life" and return the result to the user.

Problem:

During the merge, the same webpage may appear in both sets (webpage 123 in the example above). In that case we add the fractional parts and leave the integer part unchanged. If the summed fractional part is greater than 1, multiply the fractional parts of the entire set by 0.1 and then accumulate.

Problems to be Solved:

Please design a data structure that solves the above problem with the lowest possible time complexity and space complexity.
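One possible approach, offered only as a minimal sketch and not as the Iveely implementation, is to decode each set into a hash map from webpage number to weight and merge the two maps in a single pass, which takes time linear in the total number of entries. The function names `decode` and `merge` are hypothetical. The sketch assumes that a summed weight of 1 or more counts as overflow (it would otherwise spill into the webpage number) and that the 0.1 scaling applies to every entry of the merged set.

```python
import math

def decode(postings):
    """Split encoded values (page_number.weight) into a {page_number: weight} dict."""
    table = {}
    for value in postings:
        weight, page = math.modf(value)  # modf returns (fractional part, integer part)
        table[int(page)] = weight
    return table

def merge(postings_a, postings_b):
    """Merge two encoded webpage sets, adding the weights of shared webpages."""
    merged = decode(postings_a)
    for page, weight in decode(postings_b).items():
        merged[page] = merged.get(page, 0.0) + weight
    # Assumption: if any summed weight reaches 1, scale the whole set by 0.1.
    # Scaling after summing gives the same result as scaling both inputs first.
    if any(weight >= 1.0 for weight in merged.values()):
        merged = {page: weight * 0.1 for page, weight in merged.items()}
    return merged

program = [9.00235, 123.00691, 96.00035]
life = [6.00025, 123.00128, 95.00245]
result = merge(program, life)        # webpage 123 gets 0.00691 + 0.00128
overflow = merge([1.6, 2.2], [1.7])  # webpage 1 sums to 1.3, so all weights are scaled
```

Because each input weight is below 1, a sum is always below 2, so a single scaling by 0.1 is enough to bring every weight back under 1.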

Question 2:

Background:

In a search engine, each keyword corresponds to many webpages, and each webpage corresponds to several keywords. Given a keyword, the search engine must obtain the set of webpages containing it as quickly as possible. Currently, the most common practice is an inverted index (inverted file). However, although an inverted file lets us quickly extract the webpages for a keyword, the weights of those webpages are not the same. That is, the set of objects to be sorted is unordered.
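As a rough illustration of this background (a sketch under assumptions, not Iveely's actual storage format), an inverted index can be modeled as a map from keyword to a list of (webpage number, weight) pairs. The names `build_index` and `lookup` and the sample data are hypothetical. The lists come out in insertion order, so the sketch sorts by weight at query time.

```python
from collections import defaultdict

def build_index(pages):
    """Turn {page_number: {keyword: weight}} into {keyword: [(page_number, weight), ...]}."""
    index = defaultdict(list)
    for page, keywords in pages.items():
        for keyword, weight in keywords.items():
            index[keyword].append((page, weight))
    return index

def lookup(index, keyword):
    """Return the webpages for a keyword, highest weight first."""
    # The stored list is unordered with respect to weight, so sort on demand.
    return sorted(index.get(keyword, []), key=lambda entry: entry[1], reverse=True)

# Hypothetical sample data: three webpages and their keyword weights.
pages = {
    9: {"program": 0.00235},
    123: {"program": 0.00691, "life": 0.00128},
    96: {"program": 0.00035},
}
index = build_index(pages)
ranked = lookup(index, "program")
```

Sorting on every query costs O(n log n) per lookup. Keeping each list sorted when the index is built is an alternative when queries far outnumber updates.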

Next, we abstract the problem using Beijing subway information. Every station is a keyword and every line is a webpage. Each station is served by multiple lines (each keyword is contained in several webpages), and each line contains multiple stations (each webpage contains multiple keywords).

Problem generation:

The inverted file lets us quickly extract the lines that correspond to a station. For example, a search for Xizhimen returns Metro Line 2, Metro Line 4, and Metro Line 13. Unfortunately, it does not tell us that Metro Line 4 and Metro Line 2 have another intersection: Xuanwumen. Why do we need to know about Xuanwumen? In the Iveely design, the author's view is that when a shared station appears in enough of the search results, it may also be a station that the user is interested in (mathematical proof omitted). For example, a user who searches for Xizhimen may want to transfer there. If many of the resulting subway lines also contain Xuanwumen, we assume that Xuanwumen can also be a good transfer option.

Problems to be Solved:

Please design a data structure that computes, with the lowest possible time complexity and space complexity, the ranked set of other stations shared by the search results, ordered by the number of results that contain them. For example, if the user enters Xizhimen, the recommended station Xuanwumen can be returned. If there are other shared stations, they are listed by number of occurrences.
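One straightforward baseline, shown here as a sketch and not as the author's solution, keeps both a forward map (line to stations) and an inverted map (station to lines), then counts how often every other station appears on the lines returned for the query. The names `build_inverted` and `related_stations` are hypothetical, and the line data is a small illustrative subset of each line, not the full network.

```python
from collections import Counter, defaultdict

def build_inverted(lines):
    """Turn {line: {stations}} into {station: {lines}}."""
    inverted = defaultdict(set)
    for line, stations in lines.items():
        for station in stations:
            inverted[station].add(line)
    return inverted

def related_stations(query, lines, inverted):
    """Rank the other stations by how many of the query's lines contain them."""
    counts = Counter()
    for line in inverted.get(query, ()):
        for station in lines[line]:
            if station != query:
                counts[station] += 1
    # Highest count first; ties are broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Illustrative subset only: a few stations per line.
lines = {
    "Line 2": {"Xizhimen", "Fuxingmen", "Xuanwumen"},
    "Line 4": {"Xizhimen", "Xidan", "Xuanwumen"},
    "Line 13": {"Xizhimen", "Dazhongsi", "Zhichunlu"},
}
inverted = build_inverted(lines)
recommended = related_stations("Xizhimen", lines, inverted)
```

The query cost is proportional to the total number of stations on the returned lines, plus the final sort. Whether a count is high enough to recommend a station is left open here, since the source omits the threshold and its proof.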

The above questions are my own. They are problems I encountered while developing the open-source Iveely project. I think they are meaningful because they test not only our thinking but also our coding skills and, most importantly, our mathematics. I will post other similar questions one after another so that we can discuss and learn together. You are welcome to follow the Iveely search engine. If you have any comments or suggestions, you can email liufanping@iveely.com or contact me on Weibo.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
