How does SOLR calculate the score?
SOLR calculates a document's score for a query from two parts:
- Lucene score model
- Boost
Lucene's score model includes:
1. TF-term frequency. The frequency with which a term appears in a document. Given a search query, the higher the term frequency, the higher the document score.
2. IDF-inverse document frequency. The rarer a term is across all documents in the index, the higher its contribution to the score.
3. coord-Coordination factor. If a query contains multiple terms, the more of them that are found in a document, the higher its score.
4. fieldnorm-field length. The more words a field contains, the lower its score. This factor penalizes documents with longer field values.
In other words, a match on a shorter field scores higher than a match on a longer field.
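The four factors above can be sketched in a few lines of Python. This is only a rough illustration, assuming the formulas of Lucene's classic TF-IDF similarity (square-root TF, logarithmic IDF, matched-terms ratio for coord, inverse square root of field length for the norm). It leaves out query normalization and boosts, and the function names and the toy data are made up for the example; they are not SOLR API calls.

```python
import math

def tf(freq):
    # Term frequency factor: grows with the number of occurrences, but sub-linearly.
    return math.sqrt(freq)

def idf(doc_freq, num_docs):
    # Inverse document frequency: rarer terms (small doc_freq) get a larger value.
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def coord(overlap, max_overlap):
    # Coordination factor: fraction of the query terms found in the document.
    return overlap / max_overlap

def length_norm(num_terms):
    # Field norm: longer fields get a smaller value.
    return 1.0 / math.sqrt(num_terms)

def score(query_terms, field_tokens, doc_freqs, num_docs):
    # Simplified score for one field; assumes query_terms has no duplicates.
    matched = [t for t in query_terms if t in field_tokens]
    if not matched:
        return 0.0
    norm = length_norm(len(field_tokens))
    total = sum(
        tf(field_tokens.count(t)) * idf(doc_freqs[t], num_docs) ** 2 * norm
        for t in matched
    )
    return coord(len(matched), len(query_terms)) * total

# Hypothetical index statistics: 100 documents, "solr" is rare, "search" is common.
doc_freqs = {"solr": 1, "search": 50}
short_field = ["solr", "search"]
long_field = ["solr", "search", "engine", "for", "text", "and", "more", "things"]

short_score = score(["solr"], short_field, doc_freqs, 100)
long_score = score(["solr"], long_field, doc_freqs, 100)
```

With this toy data, short_score comes out higher than long_score, which matches the field length rule: the same match counts for more in the shorter field.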
Boost can be divided into index-time boost and query-time boost:
Index-time boosts are applied when adding documents, and apply to the entire document or to specific fields.
Query-time boosts are applied when constructing a search query, and apply to specific fields.
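As a rough illustration of a query-time boost, the sketch below builds request parameters that weight one field more than another. It assumes the edismax query parser and its qf parameter with the field^boost syntax; the field names title and body, the boost values, and the helper function are hypothetical.

```python
from urllib.parse import urlencode

def boosted_query_params(text, field_boosts):
    # Build the qf value, e.g. "title^2 body^1": each field with its boost.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return {"q": text, "defType": "edismax", "qf": qf}

# Matches in the (hypothetical) title field count twice as much as in body.
params = boosted_query_params("solr scoring", {"title": 2, "body": 1})

# URL-encoded form, ready to append to a select request.
query_string = urlencode(params)
```

The same idea works in the standard query syntax by attaching the boost to a clause directly, for example title:solr^2 body:solr.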