This article gives a brief analysis of Elasticsearch's indexing logic. It covers only the main flow; details will be elaborated in future articles. When you call the Elasticsearch index interface through the Java API, you first build a JSON string (represented in ES as XContent, an abstraction over the content to be processed). In the IndexRequest you then specify the index the document goes into, its type, and the document ID. If no document ID is specified, ES generates one automatically with its UUID utility; the code is in the process method of IndexRequest:
if (allowIdGeneration) {
    if (id == null) {
        id(UUID.randomBase64UUID());
        opType(IndexRequest.OpType.CREATE);
    }
}
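As a rough illustration of the ID auto-generation step, the sketch below fills in a missing ID with a URL-safe Base64 encoding of a random UUID and switches the operation type to CREATE. SketchIndexRequest and its fields are hypothetical stand-ins, not the real IndexRequest, and the encoding only approximates what the ES UUID utility produces.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

// Hypothetical stand-in for IndexRequest; only the ID handling is modeled.
class SketchIndexRequest {
    enum OpType { INDEX, CREATE }

    String id;                      // null means "let the server pick an ID"
    OpType opType = OpType.INDEX;

    // Mirrors the shape of the process step: generate an ID only when
    // generation is allowed and the caller did not supply one.
    void process(boolean allowIdGeneration) {
        if (allowIdGeneration && id == null) {
            id = randomBase64Uuid();
            // A freshly generated ID is assumed not to exist yet,
            // so the operation can be treated as a create.
            opType = OpType.CREATE;
        }
    }

    // Assumption: 16 random UUID bytes encoded as URL-safe Base64 without padding.
    static String randomBase64Uuid() {
        UUID uuid = UUID.randomUUID();
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return Base64.getUrlEncoder().withoutPadding().encodeToString(buf.array());
    }
}
```

A request created without an ID ends up with a 22-character ID and the CREATE operation type; a request that already carries an ID is left unchanged.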
The request is then sent to the ES server over TCP by TransportService, which wraps Netty (REST requests go over HTTP instead). The server obtains the corresponding TransportAction, which processes the index request (TransportShardReplicationOperationAction), and calls AsyncShardOperationAction.start() to begin shard routing. It first reads the cluster state and extracts the target index and its shard information, then computes a hash modulo the number of shards, based on the document's ID, its type, and the index's shard information, to determine which shard the document is assigned to:
private int shardId(ClusterState clusterState, String index, String type, @Nullable String id, @Nullable String routing) {
    if (routing == null) {
        if (!useType) {
            return Math.abs(hash(id) % indexMetaData(clusterState, index).numberOfShards());
        } else {
            return Math.abs(hash(type, id) % indexMetaData(clusterState, index).numberOfShards());
        }
    }
    return Math.abs(hash(routing) % indexMetaData(clusterState, index).numberOfShards());
}
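The routing rule above can be tried out in isolation with a minimal sketch. ShardRoutingSketch is a hypothetical class, the DJB2-style hash is only a stand-in for whatever hash function the cluster is configured with, and the type-based variant is left out.

```java
// Hypothetical, self-contained model of hash-modulo shard routing.
class ShardRoutingSketch {
    // DJB2-style string hash, used here only as a stand-in hash function.
    static int hash(String value) {
        int h = 5381;
        for (int i = 0; i < value.length(); i++) {
            h = ((h << 5) + h) + value.charAt(i);
        }
        return h;
    }

    // An explicit routing value takes precedence over the document ID.
    // The modulo result lies strictly between -n and n, so Math.abs is safe here.
    static int shardId(String id, String routing, int numberOfShards) {
        String key = (routing != null) ? routing : id;
        return Math.abs(hash(key) % numberOfShards);
    }

    public static void main(String[] args) {
        for (String id : new String[] {"1", "2", "user-42"}) {
            System.out.println(id + " -> shard " + shardId(id, null, 5));
        }
    }
}
```

Because the shard number depends on the shard count, the same ID generally maps to a different shard if the number of primary shards changes, which is why the shard count is fixed when the index is created.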
Next, locate the primary shard the document is assigned to and hand the index request to it for processing (TransportIndexAction.shardOperationOnPrimary). The primary first checks whether a routing value is required:
MappingMetaData mappingMd = clusterState.metaData().index(request.index()).mappingOrDefault(request.type());
if (mappingMd != null && mappingMd.routing().required()) {
    if (request.routing() == null) {
        throw new RoutingMissingException(request.index(), request.type(), request.id());
    }
}
It then determines the operation type. There are two kinds of index operation. The first is index: when the document ID already exists, the request does not fail; the existing document is replaced by the new one through an update and its version is incremented. The second is create: when the document ID already exists, an error is thrown reporting that the document already exists.
if (request.opType() == IndexRequest.OpType.INDEX)
For an index operation, InternalIndexShard is called to perform the work:
Engine.Index index = indexShard.prepareIndex(sourceToParse)
        .version(request.version())
        .versionType(request.versionType())
        .origin(Engine.Operation.Origin.PRIMARY);
indexShard.index(index);
InternalIndexShard.prepareIndex looks up the mapping that matches the document type in the request, parses the JSON source to be indexed, and converts it into the corresponding ParsedDocument according to that mapping:
public Engine.Index prepareIndex(SourceToParse source) throws ElasticSearchException {
    long startTime = System.nanoTime();
    DocumentMapper docMapper = mapperService.documentMapperWithAutoCreate(source.type());
    ParsedDocument doc = docMapper.parse(source);
    return new Engine.Index(docMapper, docMapper.uidMapper().term(doc.uid()), doc).startTime(startTime);
}
Finally, the relevant engine method (add or update) is called to operate on the underlying Lucene index, and the document is written to Lucene's in-memory index (the engine's innerIndex method):
if (currentVersion == -1) {
    // document does not exists, we can optimize for create
    if (index.docs().size() > 1) {
        writer.addDocuments(index.docs(), index.analyzer());
    } else {
        writer.addDocument(index.docs().get(0), index.analyzer());
    }
} else {
    if (index.docs().size() > 1) {
        writer.updateDocuments(index.uid(), index.docs(), index.analyzer());
    } else {
        writer.updateDocument(index.uid(), index.docs().get(0), index.analyzer());
    }
}
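The add-or-update decision above, together with the index and create operation types, can be modeled with a small in-memory store. This is only a sketch: VersionedStoreSketch is hypothetical, a HashMap stands in for the Lucene index, the version numbering starting at 1 is an assumption of the sketch, and a plain IllegalStateException stands in for the real "document already exists" error.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of index vs. create semantics.
class VersionedStoreSketch {
    enum OpType { INDEX, CREATE }

    static final class Entry {
        final String source;
        final long version;
        Entry(String source, long version) {
            this.source = source;
            this.version = version;
        }
    }

    private final Map<String, Entry> docs = new HashMap<>();

    // Applies one operation and returns the resulting document version.
    long apply(OpType opType, String id, String source) {
        Entry current = docs.get(id);
        if (current == null) {
            // Document does not exist: plain add (assumed initial version 1).
            docs.put(id, new Entry(source, 1));
            return 1;
        }
        if (opType == OpType.CREATE) {
            // CREATE never touches an existing document.
            throw new IllegalStateException("document already exists: " + id);
        }
        // INDEX on an existing ID: replace the document and bump the version.
        long next = current.version + 1;
        docs.put(id, new Entry(source, next));
        return next;
    }

    String source(String id) {
        Entry e = docs.get(id);
        return e == null ? null : e.source;
    }
}
```

Indexing the same ID twice with INDEX yields versions 1 and 2 and leaves the second source in place, while a CREATE on that ID fails and leaves the stored document untouched.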
After the in-memory index is written, the operation is also appended to the translog (an operation log that records index operations not yet persisted to disk), so that index data is not lost if power fails before the next flush:
Translog.Location translogLocation = translog.add(new Translog.Create(create));
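To make the role of the translog concrete, the following sketch models the write, flush, and recovery cycle in memory. All names are hypothetical: one map plays the volatile in-memory index, a second map plays the flushed on-disk segments, and a list plays the on-disk translog. Real ES and Lucene behavior is considerably more involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of "write to memory, log the operation, replay after a crash".
class TranslogSketch {
    static final class Op {
        final String id;
        final String source;
        Op(String id, String source) {
            this.id = id;
            this.source = source;
        }
    }

    private final Map<String, String> memoryIndex = new HashMap<>();  // lost on crash
    private final Map<String, String> durableIndex = new HashMap<>(); // stands in for flushed segments
    private final List<Op> translog = new ArrayList<>();              // stands in for the on-disk log

    // Every write goes to the in-memory index and is recorded in the log.
    void index(String id, String source) {
        memoryIndex.put(id, source);
        translog.add(new Op(id, source));
    }

    // A flush persists the in-memory state, after which the log can be discarded.
    void flush() {
        durableIndex.putAll(memoryIndex);
        translog.clear();
    }

    // Simulates power loss: memory is wiped, then rebuilt from the
    // durable state plus a replay of the operations still in the log.
    void crashAndRecover() {
        memoryIndex.clear();
        memoryIndex.putAll(durableIndex);
        for (Op op : translog) {
            memoryIndex.put(op.id, op.source);
        }
    }

    String get(String id) {
        return memoryIndex.get(id);
    }

    int pendingOps() {
        return translog.size();
    }
}
```

A document indexed after the last flush survives crashAndRecover() only because its operation is still in the log; without the replay loop it would be gone.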
Once the index request has completed on the primary shard, it is forwarded to the replica shards, which index the document as well. Finally, a success response is returned to the client.
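The primary-then-replica order can be sketched as follows. ReplicationSketch and ShardCopy are hypothetical, the replicas are updated synchronously in a loop, and failure handling is omitted entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: apply on the primary first, then on every replica.
class ReplicationSketch {
    static final class ShardCopy {
        final Map<String, String> docs = new HashMap<>();
        void index(String id, String source) {
            docs.put(id, source);
        }
    }

    final ShardCopy primary = new ShardCopy();
    final List<ShardCopy> replicas = new ArrayList<>();

    ReplicationSketch(int replicaCount) {
        for (int i = 0; i < replicaCount; i++) {
            replicas.add(new ShardCopy());
        }
    }

    // Returns the number of shard copies that applied the operation;
    // the caller would report success to the client after this returns.
    int index(String id, String source) {
        primary.index(id, source);
        int applied = 1;
        for (ShardCopy replica : replicas) {
            replica.index(id, source);
            applied++;
        }
        return applied;
    }
}
```

With two replicas, a single call to index applies the document to three shard copies before success is reported.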
Address: http://blog.csdn.net/laigood12345/article/details/8450331
References: http://www.searchtech.pro/articles/2013/02/15/1360941961206.html