Sphinx is best known as a full-text indexing engine, and its fast query performance is well established. Can it be put to other uses as well, for example as a simple cache server?
First, let's look at the files Sphinx uses: .sph, .spa, .spi, .spd, .spp, .spm, and .spl.
- Sph: the index header.
- Spi: stores each WordId together with a pointer into the .spd file, where the document information for that WordId lives. The .spi file is fully loaded into memory when the search daemon starts. Its contents are sorted and split into blocks, presumably so that a WordId can be located quickly: because WordIds in .spi are stored with variable-length compression, a lookup first does a binary search at the block level and then decompresses and scans within the matching block.
- Spa: the file that stores DocInfo (the document attributes). It is loaded into memory when the search daemon starts. Sphinx lets you choose how DocInfo is stored:
  - Inline: stored inside the .spd file.
  - Extern: stored separately, in which case the .spa file is generated.
- Spd: the document list.
- Spp: the position list for each keyword.
- Spm: DocInfo has a special kind of attribute called MVA (multi-value attribute). Sphinx handles it specially and stores its values in the .spm file, which is loaded into memory when the search daemon starts. The attribute's slot in DocInfo holds the byte offset of its values within this file.
- Spk: the kill list.
- Spl: the index lock file.
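The two-step .spi lookup described in the list (binary search over blocks, then decompress and scan inside one block) can be sketched as follows. This is only an illustration under assumptions: the varint encoding, the block structure, and the function names `encodeVarint`, `decodeVarint`, `buildBlocks` and `lookup` are hypothetical and do not reproduce the real .spi on-disk format.

```php
<?php
// Illustrative only: NOT the real .spi format. All names here are hypothetical.

/** Encode a non-negative integer as a variable-length byte string (7 bits per byte). */
function encodeVarint(int $value): string
{
    $out = '';
    while ($value >= 0x80) {
        $out .= chr(($value & 0x7F) | 0x80); // high bit set: more bytes follow
        $value >>= 7;
    }
    return $out . chr($value);
}

/** Decode one varint from $data starting at $pos; $pos is advanced past it. */
function decodeVarint(string $data, int &$pos): int
{
    $value = 0;
    $shift = 0;
    do {
        $byte = ord($data[$pos++]);
        $value |= ($byte & 0x7F) << $shift;
        $shift += 7;
    } while ($byte & 0x80);
    return $value;
}

/** Split a wordId => spdOffset map into sorted blocks with delta-encoded wordIds. */
function buildBlocks(array $entries, int $blockSize): array
{
    ksort($entries);
    $blocks = [];
    foreach (array_chunk($entries, $blockSize, true) as $chunk) {
        $first = array_key_first($chunk);
        $prev = $first;
        $data = '';
        foreach ($chunk as $wordId => $offset) {
            $data .= encodeVarint($wordId - $prev) . encodeVarint($offset);
            $prev = $wordId;
        }
        // The first wordId stays uncompressed so blocks can be binary-searched.
        $blocks[] = ['first' => $first, 'count' => count($chunk), 'data' => $data];
    }
    return $blocks;
}

/** Return the stored offset for $wordId, or null if it is not indexed. */
function lookup(array $blocks, int $wordId): ?int
{
    // Step 1: binary search for the last block whose first wordId <= $wordId.
    $lo = 0;
    $hi = count($blocks) - 1;
    $found = -1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        if ($blocks[$mid]['first'] <= $wordId) {
            $found = $mid;
            $lo = $mid + 1;
        } else {
            $hi = $mid - 1;
        }
    }
    if ($found < 0) {
        return null;
    }
    // Step 2: decompress the block entry by entry and scan for the wordId.
    $block = $blocks[$found];
    $pos = 0;
    $current = $block['first'];
    for ($i = 0; $i < $block['count']; $i++) {
        $current += decodeVarint($block['data'], $pos);
        $offset = decodeVarint($block['data'], $pos);
        if ($current === $wordId) {
            return $offset;
        }
        if ($current > $wordId) {
            break; // entries are sorted, so the wordId cannot appear later
        }
    }
    return null;
}

$blocks = buildBlocks([10 => 0, 25 => 64, 300 => 128, 301 => 200, 9000 => 512], 2);
var_dump(lookup($blocks, 301));
```

Keeping only the first WordId of each block uncompressed is what makes the binary search possible: the compressed entries inside a block can only be read sequentially.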
From this overview we can see that Sphinx stores the attributes of each document (versions earlier than 0.98 did not). So can we use that data as a cache and fetch a document's information by its DocID?
We can: by patching the searchd service to add a SEARCHD_COMMAND_DOCINFO command, and adding a GetDocinfo function to the client API, we get the expected result.
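To give a rough idea of what the client side of such a command involves, here is a minimal sketch of building the request packet. The header layout (command, version, body length) and the length-prefixed string packing follow what the stock sphinxapi.php does for its existing commands. Everything specific to the new command is an assumption: the command id, the version number, the body layout, and the helper names `packString` and `buildDocinfoRequest` are hypothetical, and the actual patch may define them differently.

```php
<?php
// Hypothetical values for illustration; the real ones are whatever the patch defines.
const SEARCHD_COMMAND_DOCINFO = 100;   // assumed command id
const VER_COMMAND_DOCINFO     = 0x100; // assumed command version

/** Pack a string as a 32-bit big-endian length followed by the bytes. */
function packString(string $s): string
{
    return pack('N', strlen($s)) . $s;
}

/** Build a full request: 8-byte header plus body. */
function buildDocinfoRequest(int $docId, string $index): string
{
    // Assumed body layout: index name, then the DocID as two 32-bit halves.
    $body = packString($index)
          . pack('NN', $docId >> 32, $docId & 0xFFFFFFFF);

    // Header: 16-bit command, 16-bit version, 32-bit body length (all big-endian).
    return pack('nnN', SEARCHD_COMMAND_DOCINFO, VER_COMMAND_DOCINFO, strlen($body))
         . $body;
}

$request = buildDocinfoRequest(1, 'singer');
echo bin2hex($request), "\n";
```

On the server side, the patched searchd would read the same fields back in the same order, look up the DocInfo row for that DocID, and send the attributes in its reply.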
PHP sample code:

    require 'sphinxapi.php';

    $cl = new SphinxClient();
    $cl->SetServer();

    // GetDocinfo is the function added by the patch: look up DocID 1 in 'singer'.
    $res = $cl->GetDocinfo(1, 'singer');
    print_r($res);
The result is as follows:

    Array
    (
        [singer_id] => 1
        [singer_name] => A Niu
        [cate_id] => 1
        [tag_ids] => Array
            (
                [0] => 110
                [1] => 114
                [2] => 127
            )
        [song_number] => 137
        [album_number] => 14
    )
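The tag_ids entry in this result is a multi-value attribute. As noted in the file list, the DocInfo row does not hold the values themselves, only a byte offset into the .spm file. A minimal sketch of that indirection follows, assuming a simplified pool layout (a 32-bit count followed by that many 32-bit values). The real .spm layout may differ, and `appendMva` and `readMva` are hypothetical names.

```php
<?php
// Simplified stand-in for an .spm-style value pool; not the real on-disk layout.

/** Append a value list to the pool and return the byte offset where it starts. */
function appendMva(string &$pool, array $values): int
{
    $offset = strlen($pool);
    $pool .= pack('V', count($values)); // 32-bit little-endian count
    foreach ($values as $v) {
        $pool .= pack('V', $v);         // each value as 32-bit little-endian
    }
    return $offset;
}

/** Read back the value list stored at byte offset $offset. */
function readMva(string $pool, int $offset): array
{
    $count = unpack('V', $pool, $offset)[1];
    if ($count === 0) {
        return [];
    }
    return array_values(unpack("V{$count}", $pool, $offset + 4));
}

$pool = '';
$docinfo = [
    'singer_id' => 1,
    // The row stores only the offset; the values live in the pool.
    'tag_ids'   => appendMva($pool, [110, 114, 127]),
];

print_r(readMva($pool, $docinfo['tag_ids']));
```

Storing an offset keeps every DocInfo row the same size even though each document can carry a different number of values.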
Patch file: https://gist.github.com/2251422