Elasticlunr.js Latest Version v0.6.7 released

Source: Internet
Author: User
Tags: idf

Elasticlunr.js, my first open-source library with some practical value, has just released version 0.6.7. It is built on top of lunr.js. I would appreciate help with code review, since I am not very familiar with JavaScript. If it is convenient, please also give the project a star on GitHub; your support is my biggest motivation to keep improving it.

Project homepage: Elasticlunr.js
Source code: Elasticlunr.js Code
Documentation: Elasticlunr.js doc
npm package: Elasticlunr.js NPM
Online demo: Elasticlunr.js Demo

Compared with lunr.js, the major changes so far are:
1. Query-time boosting: Elasticlunr.js lets you adjust field weights at query time, so they can be tuned for different application scenarios. You no longer have to write the field weights into the index when it is built, after which they cannot be changed.
2. More rational scoring mechanism: Elasticlunr.js uses document scoring criteria consistent with Elasticsearch and Lucene, which are more accurate and objective than the lunr.js score. The score is based on the general tf*idf measure, with field length normalization added. Field length normalization is somewhat like the BM25 model: a very long field tends to produce a larger tf, so the normalization penalizes the tf of long fields. Elasticlunr.js also applies coord normalization and query normalization, and the query-time boosting information is folded into query normalization. For how Elasticlunr.js and Elasticsearch compute scores, see my article "Elasticsearch scoring detailed explanation".
3. Field search: when building the index you specify which document fields are indexed; at query time you specify which fields to search and how much weight each field receives, which enables very flexible queries.
4. Boolean model: you can specify a global Boolean model at query time, or a Boolean model for an individual field; the field-level model overrides the global one. This lets you decide whether a field must contain all of the query tokens.
5. Combined Boolean model, TF/IDF model and vector space model: Elasticlunr.js currently combines these three models to compute the document ranking score, which improves ranking accuracy. For the score calculation, see "Elasticsearch scoring detailed explanation". Other ranking models, such as BM25, will be included in the future.
6. Fast: Elasticlunr.js does not build a query vector and a document vector to compute similarity. Instead, it computes the score of each query token in the documents where that token appears, sums the per-token scores, and finally applies query normalization, so queries are evaluated faster.
7. Small index size: Elasticlunr.js does not store the token corpus, because it never needs to compute document vectors, and this saves part of the storage. Users can also choose whether the original JSON documents are stored; if index size is a concern, document storage can be turned off. Most importantly, Elasticlunr.js optimizes the information stored in the index, so that the index shrinks to roughly half of its original size.
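To make points 1, 2 and 6 more concrete, here is a minimal sketch of per-token scoring. It assumes the formulas of Lucene's classic TF/IDF similarity (square-root tf, logarithmic idf, inverse-square-root field length norm) and leaves out coord and query normalization. The function names and the `stats` object are hypothetical and are not elasticlunr.js internals.

```javascript
// Sketch only: helper names are hypothetical; the formulas follow Lucene's
// classic TF/IDF similarity, not necessarily elasticlunr.js's exact code.

// Term frequency: square root of how often the token occurs in the field.
function tf(count) {
  return Math.sqrt(count);
}

// Inverse document frequency: tokens found in fewer documents weigh more.
function idf(numDocs, docFreq) {
  return 1 + Math.log(numDocs / (docFreq + 1));
}

// Field length normalization: the longer the field, the smaller the factor.
function fieldLengthNorm(numTokens) {
  return 1 / Math.sqrt(numTokens);
}

// Score one field by summing the per-token scores of the query tokens it contains.
function scoreField(queryTokens, fieldTokens, stats, boost) {
  var norm = fieldLengthNorm(fieldTokens.length);
  return queryTokens.reduce(function (sum, token) {
    var count = fieldTokens.filter(function (t) { return t === token; }).length;
    if (count === 0) return sum; // token absent from this field
    return sum + tf(count) * idf(stats.numDocs, stats.docFreq[token]) * norm * boost;
  }, 0);
}

// Made-up corpus statistics and token lists, for illustration only.
var stats = { numDocs: 2, docFreq: { oracle: 2, database: 1 } };
var query = ['oracle', 'database'];
var shortField = ['oracle', 'database'];
var longField = ['oracle', 'database', 'sales', 'hardware', 'report', 'year', 'profit', 'billion'];

console.log(scoreField(query, shortField, stats, 1)); // short field scores higher
console.log(scoreField(query, longField, stats, 1));  // long field is penalized
console.log(scoreField(query, longField, stats, 2));  // a boost of 2 doubles the score
```

Because each query token is scored only against the fields that contain it, no vector has to be built for the query or for the document, which is the idea behind point 6.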

Application Example: Creating an Index

If you do not specify whether documents are stored, they are stored by default, so that a summary can be generated from the JSON document:

var index = elasticlunr(function () {
    this.addField('title');
    this.addField('body');
    this.setRef('id');
});

You can also choose not to store the JSON documents:

var index = elasticlunr();
index.addField('title');
index.addField('body');
index.setRef('id');
index.saveDocument(false);
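As a rough illustration of why turning off document storage shrinks the serialized index, the sketch below builds a toy inverted index with an optional document store. `TinyIndex` is a hypothetical stand-in written for this example; it does not reflect how elasticlunr.js lays out its index internally.

```javascript
// Hypothetical TinyIndex: an inverted index plus an optional document store.
function TinyIndex(saveDocument) {
  this.saveDocument = saveDocument;
  this.inverted = Object.create(null); // token -> { ref: term frequency }
  this.docs = {};                      // ref -> original document, if stored
}

TinyIndex.prototype.addDoc = function (doc) {
  var self = this;
  // Naive tokenizer: lowercase, split on non-word characters, drop empties.
  var tokens = (doc.title + ' ' + doc.body).toLowerCase().split(/\W+/).filter(Boolean);
  tokens.forEach(function (token) {
    var postings = self.inverted[token] || (self.inverted[token] = Object.create(null));
    postings[doc.id] = (postings[doc.id] || 0) + 1;
  });
  // Keep the original JSON document only when requested.
  if (this.saveDocument) this.docs[doc.id] = doc;
};

// Sample document, made up for this example.
var doc = {
  id: 1,
  title: 'Oracle released its latest database',
  body: 'Oracle has released its new database Oracle 12g.'
};

var withDocs = new TinyIndex(true);
withDocs.addDoc(doc);
var withoutDocs = new TinyIndex(false);
withoutDocs.addDoc(doc);

// Compare the serialized sizes of the two indexes.
console.log(JSON.stringify(withDocs).length, JSON.stringify(withoutDocs).length);
```

The trade-off is that without the stored documents the index can only return references, so any summary has to be produced from documents kept elsewhere.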

Indexing Documents

Adding documents to the index is simple. Documents in JSON format can be indexed directly:

var doc1 = {
    "id": 1,
    "title": "Oracle released its latest database Oracle 12g",
    "body": "Yesterday Oracle released its new database Oracle 12g, this would make a nice profit report of annual year."
};

var doc2 = {
    "id": 2,
    "title": "Oracle released its profit",
    "body": "As expected, Oracle released its profit, during the good sales of database and hardware, Oracle's profit reached 12.5 billion."
};

index.addDoc(doc1);
index.addDoc(doc2);

Simple Document Retrieval

index.search("Oracle database profit");

Search results:

[{
    "ref": 1,
    "score": 0.5376053707962494
}, {
    "ref": 2,
    "score": 0.5237481076838757
}]
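Each hit carries only a document reference and a score, so the caller still has to map the reference back to a document. The sketch below shows one way to do that, assuming a hypothetical `docsById` lookup table that the application keeps itself; the result values are the ones listed above.

```javascript
// Search results, copied from the output listed above.
var results = [
  { ref: 1, score: 0.5376053707962494 },
  { ref: 2, score: 0.5237481076838757 }
];

// Hypothetical lookup table of the original documents, keyed by their "id".
var docsById = {
  1: { id: 1, title: 'Oracle released its latest database Oracle 12g' },
  2: { id: 2, title: 'Oracle released its profit' }
};

// Turn each hit into a display line, keeping the ranking order.
var lines = results.map(function (hit) {
  return docsById[hit.ref].title + ' (score ' + hit.score.toFixed(3) + ')';
});

console.log(lines.join('\n'));
```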

Because Elasticlunr.js already has a fairly sophisticated and reasonable scoring system, the simplest query is enough in most cases. Users who are more familiar with retrieval principles can also use the configuration to build more complex searches tailored to different applications.

Field Search and Configuration

Users can specify which field to query, how much weight to give each field, and which Boolean model to use in each field for retrieval.

index.search("Oracle database profit", {
    fields: {
        title: {boost: 2},
        body: {boost: 1}
    },
    bool: "OR"
});

The query above sets a global Boolean model, which means that this model is used in every field. If a different Boolean model is specified for a particular field, that field's model overrides the global one. For example:

index.search("Oracle database profit", {
    fields: {
        title: {boost: 2, bool: "AND"},
        body: {boost: 1}
    },
    bool: "OR"
});
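The override rule can be sketched as follows. The helpers `matchesField` and `matchingFields` are hypothetical, the fields are represented as pre-tokenized arrays, and falling back to "OR" when no model is given is an assumption of this sketch; the actual matching logic inside elasticlunr.js may differ.

```javascript
// Hypothetical helper: does a field's token list satisfy the query
// under the given Boolean model?
function matchesField(queryTokens, fieldTokens, bool) {
  var has = function (token) { return fieldTokens.indexOf(token) !== -1; };
  // "AND" requires every query token; anything else is treated as "OR".
  return bool === 'AND' ? queryTokens.every(has) : queryTokens.some(has);
}

// Return the names of the configured fields in which the document matches.
// A field-level "bool" takes precedence over the global "bool".
function matchingFields(queryTokens, doc, config) {
  return Object.keys(config.fields).filter(function (name) {
    var bool = config.fields[name].bool || config.bool || 'OR'; // assumed fallback
    return matchesField(queryTokens, doc[name], bool);
  });
}

// Made-up tokenized document for illustration.
var doc = {
  title: ['oracle', 'released', 'profit'],
  body: ['oracle', 'database', 'sales', 'profit']
};
var query = ['oracle', 'database', 'profit'];

var globalOnly = { fields: { title: { boost: 2 }, body: { boost: 1 } }, bool: 'OR' };
var withOverride = { fields: { title: { boost: 2, bool: 'AND' }, body: { boost: 1 } }, bool: 'OR' };

console.log(matchingFields(query, doc, globalOnly));   // both fields match under "OR"
console.log(matchingFields(query, doc, withOverride)); // title lacks "database", so only body matches
```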
Why do you need Elasticlunr.js?
    1. In many cases you want to make some of your documents searchable without setting up a complex server environment. You can build the document index entirely with Elasticlunr.js and run the search in the browser; the server only needs to serve static files, for example through a simple Apache setup.

    2. Users cannot always reach the network, or may be on a restricted connection, as is common on mobile devices. Providing offline documents with search support is then very user-friendly. For example, the documentation of a module is often bundled with the module download; if users read that documentation locally, a search function is very useful.

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
