No. 361: Building a search engine with a Python distributed crawler (Scrapy explained): the inverted index
Inverted index
The inverted index arises in practice from the need to find records based on the value of an attribute. Each entry in the index table holds an attribute value together with the addresses of all records that have that value. In an ordinary lookup the record determines its attribute values; here the attribute value determines the position of the record, which is why the structure is called an inverted index. A file that carries an inverted index is called an inverted index file, or inverted file for short.
Inverted index principle
The principle is to segment the text of each article into words and, for every word, record which articles it appears in. When a user searches for a word, the articles containing that word can then be looked up directly.
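The idea can be sketched in a few lines of Python. This is only a minimal illustration, not the structure a real search engine uses: the function name build_inverted_index and the sample documents are made up for this example, and word segmentation is reduced to lowercasing and splitting on whitespace, which would not be enough for Chinese text.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each word to the sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive segmentation (an assumption): lowercase, split on whitespace.
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


# Hypothetical sample articles, keyed by document id.
docs = {
    1: "python distributed crawler",
    2: "scrapy crawler tutorial",
    3: "python search engine",
}

index = build_inverted_index(docs)
print(index["python"])   # [1, 3]
print(index["crawler"])  # [1, 2]
```

A search for a word is then a single dictionary lookup instead of a scan over every article.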
Inverted index term weight record (term frequency)
The weight of each segmented word is recorded using TF-IDF. For details, see https://baike.so.com/doc/433640-459181.html
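As a rough illustration of how such weights might be computed, the sketch below uses one common textbook form of TF-IDF: tf is the word's count divided by the document length, and idf is log(N / df), where N is the number of documents and df is the number of documents containing the word. The function name tf_idf and the sample documents are assumptions for this example, and real search engines typically use smoothed or otherwise adjusted variants.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {doc_id: {word: weight}} with tf = count / length, idf = log(N / df)."""
    # Naive segmentation (an assumption): lowercase, split on whitespace.
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents each word occurs.
    df = Counter()
    for words in tokenized.values():
        df.update(set(words))

    weights = {}
    for doc_id, words in tokenized.items():
        counts = Counter(words)
        weights[doc_id] = {
            word: (count / len(words)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        }
    return weights


# Hypothetical sample articles, keyed by document id.
docs = {
    1: "python crawler",
    2: "python search",
}

weights = tf_idf(docs)
print(weights[1])
```

With this formula a word that occurs in every document, such as "python" here, gets weight zero, while a word that is frequent in one article but rare across the collection gets a higher weight.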