Distributed Search: Elasticsearch Source Code Analysis 2 - A Brief Analysis of the Index Process Source Code

This article gives a brief analysis of the Elasticsearch indexing logic. It only traces the main flow; the details will be covered in later articles. When you call the Elasticsearch index interface through the Java API, you first build a JSON string (represented in ES as XContent, an abstraction over the content to be processed) and wrap it in an IndexRequest that specifies the target index, the document type, and the document ID. If no document ID is specified, ES generates one with its UUID utility and switches the operation type to CREATE. The code is in the process method of IndexRequest:
    if (allowIdGeneration) {
        if (id == null) {
            id(UUID.randomBase64UUID());
            opType(IndexRequest.OpType.CREATE);
        }
    }
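The ID fallback above can be sketched outside of ES with the standard library only. This is a minimal illustration, not the ES implementation: the class IdGenerationSketch and the method resolveId are hypothetical names, and java.util.UUID plus URL-safe Base64 stand in for the ES UUID utility, whose internals are not shown in this article.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

class IdGenerationSketch {

    /** Returns the caller's id, or a freshly generated URL-safe Base64 UUID when none is given. */
    static String resolveId(String id) {
        if (id != null) {
            return id; // an explicit ID is always kept
        }
        UUID uuid = UUID.randomUUID();
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putLong(uuid.getMostSignificantBits());
        buffer.putLong(uuid.getLeastSignificantBits());
        // 16 bytes encode to 22 characters once the padding is dropped
        return Base64.getUrlEncoder().withoutPadding().encodeToString(buffer.array());
    }

    public static void main(String[] args) {
        System.out.println(resolveId("doc-1")); // explicit ID, returned unchanged
        System.out.println(resolveId(null));    // generated ID, different on every run
    }
}
```

Because a freshly generated ID cannot already be present in the index, it is reasonable for the original code to mark such requests as CREATE.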
The request is then sent to the ES server over TCP by TransportService, which wraps Netty (REST requests arrive over HTTP instead). The server looks up the TransportAction that handles the index request (TransportShardReplicationOperationAction) and calls AsyncShardOperationAction.start() to begin resolving the target shard. It first reads the cluster state and extracts the target index and its shard information. It then hashes the document ID (together with the type if configured, or the routing value if one is given) and takes the result modulo the number of shards of the index to decide which shard the document belongs to:
    private int shardId(ClusterState clusterState, String index, String type, @Nullable String id, @Nullable String routing) {
        if (routing == null) {
            if (!useType) {
                return Math.abs(hash(id) % indexMetaData(clusterState, index).numberOfShards());
            } else {
                return Math.abs(hash(type, id) % indexMetaData(clusterState, index).numberOfShards());
            }
        }
        return Math.abs(hash(routing) % indexMetaData(clusterState, index).numberOfShards());
    }
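The hash-modulo routing above can be tried in isolation. The sketch below is a simplified version under stated assumptions: the hash function is not shown in the snippet, so a DJB2-style string hash is assumed here, the useType branch is left out, and ShardRoutingSketch and djbHash are hypothetical names.

```java
class ShardRoutingSketch {

    /** DJB2-style string hash (an assumption; the real hash function is not shown in the article). */
    static int djbHash(String value) {
        long hash = 5381;
        for (int i = 0; i < value.length(); i++) {
            hash = ((hash << 5) + hash) + value.charAt(i); // hash * 33 + c
        }
        return (int) hash;
    }

    /** Picks a shard from the routing value if present, otherwise from the document ID. */
    static int shardId(String id, String routing, int numberOfShards) {
        String key = (routing != null) ? routing : id;
        // The remainder lies strictly between -numberOfShards and numberOfShards,
        // so Math.abs cannot overflow here.
        return Math.abs(djbHash(key) % numberOfShards);
    }

    public static void main(String[] args) {
        int shards = 5;
        System.out.println(shardId("1", null, shards));
        System.out.println(shardId("2", null, shards));
        // Documents sharing a routing value always land on the same shard.
        System.out.println(shardId("1", "user-42", shards));
        System.out.println(shardId("2", "user-42", shards));
    }
}
```

Since the shard number depends on the number of shards, a document can only be found again if the shard count of the index stays the same between indexing and lookup.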
Once the primary shard for the document has been found, the index request is handed to it for processing (TransportIndexAction.shardOperationOnPrimary). The primary first checks whether the mapping requires a routing value:
    MappingMetaData mappingMd = clusterState.metaData().index(request.index()).mappingOrDefault(request.type());
    if (mappingMd != null && mappingMd.routing().required()) {
        if (request.routing() == null) {
            throw new RoutingMissingException(request.index(), request.type(), request.id());
        }
    }
It then determines the operation type. There are two: index and create. With index, if a document with the same ID already exists, no error is raised and the existing document is updated with the new content. With create, if the document ID already exists, an error saying that the document already exists is thrown.
if (request.opType() == IndexRequest.OpType.INDEX) 
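The difference between the two operation types can be illustrated with a plain in-memory map. This is only a sketch of the semantics described above: OpTypeSketch, its write method, and the use of IllegalStateException are hypothetical and do not mirror the ES classes.

```java
import java.util.HashMap;
import java.util.Map;

class OpTypeSketch {

    enum OpType { INDEX, CREATE }

    private final Map<String, String> docs = new HashMap<>();

    /** Returns true if a new document was created, false if an existing one was replaced. */
    boolean write(OpType opType, String id, String source) {
        boolean exists = docs.containsKey(id);
        if (opType == OpType.CREATE && exists) {
            // create refuses to touch a document that is already there
            throw new IllegalStateException("document [" + id + "] already exists");
        }
        docs.put(id, source); // index replaces any previous content
        return !exists;
    }

    String get(String id) {
        return docs.get(id);
    }

    public static void main(String[] args) {
        OpTypeSketch store = new OpTypeSketch();
        store.write(OpType.INDEX, "1", "{\"v\":1}");
        store.write(OpType.INDEX, "1", "{\"v\":2}"); // replaces the first version
        System.out.println(store.get("1"));
        try {
            store.write(OpType.CREATE, "1", "{\"v\":3}");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```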
For the index operation type, InternalIndexShard is called to perform the index operation:
    Engine.Index index = indexShard.prepareIndex(sourceToParse)
            .version(request.version())
            .versionType(request.versionType())
            .origin(Engine.Operation.Origin.PRIMARY);
    indexShard.index(index);
InternalIndexShard.prepareIndex looks up the mapping that matches the document type of the request, parses the JSON source to be indexed, and converts it into the corresponding ParsedDocument according to that mapping:
    public Engine.Index prepareIndex(SourceToParse source) throws ElasticSearchException {
        long startTime = System.nanoTime();
        DocumentMapper docMapper = mapperService.documentMapperWithAutoCreate(source.type());
        ParsedDocument doc = docMapper.parse(source);
        return new Engine.Index(docMapper, docMapper.uidMapper().term(doc.uid()), doc).startTime(startTime);
    }
Finally, the add or update methods of the engine are called to operate on the underlying Lucene index. At this point the document is written to Lucene's in-memory index (the engine's innerIndex method):
    if (currentVersion == -1) {
        // document does not exists, we can optimize for create
        if (index.docs().size() > 1) {
            writer.addDocuments(index.docs(), index.analyzer());
        } else {
            writer.addDocument(index.docs().get(0), index.analyzer());
        }
    } else {
        if (index.docs().size() > 1) {
            writer.updateDocuments(index.uid(), index.docs(), index.analyzer());
        } else {
            writer.updateDocument(index.uid(), index.docs().get(0), index.analyzer());
        }
    }
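Why a missing document allows an optimization becomes clearer with a toy model of an append-only index. In Lucene, updateDocument deletes the documents matching the given term and then adds the new one, while addDocument only appends. The sketch below imitates that behavior with a list; AppendOnlyIndexSketch and its methods are hypothetical names, and a currentVersion of -1 is taken to mean "not found", as in the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

class AppendOnlyIndexSketch {

    static final class Entry {
        final String uid;
        final String source;
        boolean deleted;

        Entry(String uid, String source) {
            this.uid = uid;
            this.source = source;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Cheap path: append only, no lookup of older copies. */
    void addDocument(String uid, String source) {
        entries.add(new Entry(uid, source));
    }

    /** Expensive path: mark every older copy as deleted, then append the new one. */
    void updateDocument(String uid, String source) {
        for (Entry entry : entries) {
            if (entry.uid.equals(uid)) {
                entry.deleted = true;
            }
        }
        addDocument(uid, source);
    }

    /** Chooses the path the same way as the snippet above: -1 means the document does not exist. */
    void index(long currentVersion, String uid, String source) {
        if (currentVersion == -1) {
            addDocument(uid, source);
        } else {
            updateDocument(uid, source);
        }
    }

    int liveCount() {
        int live = 0;
        for (Entry entry : entries) {
            if (!entry.deleted) {
                live++;
            }
        }
        return live;
    }

    int totalCount() {
        return entries.size();
    }

    public static void main(String[] args) {
        AppendOnlyIndexSketch index = new AppendOnlyIndexSketch();
        index.index(-1, "tweet#1", "{\"v\":1}"); // new document, plain append
        index.index(1, "tweet#1", "{\"v\":2}");  // existing document, delete then append
        System.out.println(index.liveCount() + " live of " + index.totalCount() + " stored");
    }
}
```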
After the write to the in-memory index, the operation is also appended to the translog (a log of index operations that have not yet been persisted), so that index data is not lost if the power fails before a flush:
Translog.Location translogLocation = translog.add(new Translog.Create(create));
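The role of the translog can be sketched as a small write-ahead log. This is a conceptual model only, with everything held in memory: TranslogSketch and its fields are hypothetical, "committed" stands for what a flush has made durable, and a crash is simulated by discarding the in-memory index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TranslogSketch {

    private final Map<String, String> memoryIndex = new HashMap<>(); // lost on a crash
    private final Map<String, String> committed = new HashMap<>();   // durable after a flush
    private final List<String[]> translog = new ArrayList<>();       // durable log of unflushed operations

    /** Writes to the in-memory index and records the operation in the log. */
    void index(String id, String source) {
        memoryIndex.put(id, source);
        translog.add(new String[] {id, source});
    }

    /** Persists the in-memory state; the logged operations are no longer needed. */
    void flush() {
        committed.putAll(memoryIndex);
        translog.clear();
    }

    /** Simulates a crash followed by recovery: reload the last flush, then replay the log. */
    void crashAndRecover() {
        memoryIndex.clear();
        memoryIndex.putAll(committed);
        for (String[] operation : translog) {
            memoryIndex.put(operation[0], operation[1]);
        }
    }

    String get(String id) {
        return memoryIndex.get(id);
    }

    int pendingOperations() {
        return translog.size();
    }

    public static void main(String[] args) {
        TranslogSketch shard = new TranslogSketch();
        shard.index("1", "flushed");
        shard.flush();
        shard.index("2", "not flushed yet");
        shard.crashAndRecover();
        System.out.println(shard.get("1")); // restored from the last flush
        System.out.println(shard.get("2")); // restored by replaying the log
    }
}
```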
Once the primary shard has finished the index request, the request is forwarded to the replicas for indexing. Finally, a success response is returned to the client.
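The order of operations (primary first, then replicas, then the response) can be summarized in a few lines. The sketch is synchronous and ignores failures, which is a simplification; ReplicationSketch, Shard, and the returned count of successful copies are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReplicationSketch {

    static final class Shard {
        final String name;
        final Map<String, String> docs = new HashMap<>();

        Shard(String name) {
            this.name = name;
        }
    }

    /** Indexes on the primary first, then on every replica; returns the number of copies written. */
    static int index(Shard primary, List<Shard> replicas, String id, String source) {
        primary.docs.put(id, source); // step 1: the primary applies the operation
        int successful = 1;
        for (Shard replica : replicas) { // step 2: the same operation goes to each replica
            replica.docs.put(id, source);
            successful++;
        }
        return successful; // step 3: only now is the caller answered
    }

    public static void main(String[] args) {
        Shard primary = new Shard("primary");
        List<Shard> replicas = Arrays.asList(new Shard("replica-1"), new Shard("replica-2"));
        int copies = index(primary, replicas, "1", "{\"user\":\"kimchy\"}");
        System.out.println("written to " + copies + " shard copies");
    }
}
```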

Address: http://blog.csdn.net/laigood12345/article/details/8450331

References: http://www.searchtech.pro/articles/2013/02/15/1360941961206.html
