I want to build a vertical Q&A library.
Stack: PHP + MySQL (Redis or MongoDB may be added later).
It involves some specialized domain word libraries, and later I hope to adjust the search results according to the semantics of the user's query.
I have looked at the mainstream search options: Sphinx, Elasticsearch, and Solr.
Time is limited, so which one has the lowest development cost?
I would appreciate suggestions from anyone who has used them. Thank you ~
Reply content:
I have used all four (Xunsearch, Sphinx, Solr, and Elasticsearch), and the development cost is basically the same, because all four have SDKs for mainstream languages that can be used directly. The real cost is configuring the software, and all four are relatively simple to configure.
Differences
Xunsearch uses SCWS for Chinese word segmentation, which is highly accurate and includes part-of-speech tagging. Index creation speed is acceptable, and query efficiency is high. However, because it is built on Xapian, it lacks some syntactic sugar. We also once lost the index while rebuilding it. At that time the data volume was about 10 million records, roughly 35 GB in MySQL. I don't know whether that was because we were on an earlier version.
Sphinx is relatively slow at creating indexes and has no built-in Chinese word segmentation. However, you can look at Coreseek or configure your own word segmenter; many mainstream segmenters are supported. Query performance is weak, and distributed support is not perfect. Some features are missing, such as the result folding that Xunsearch has. However, it is easy to use and relatively stable.
Solr and ES are similar to each other, but there is more Chinese-language material on Solr than on Elasticsearch. Both support Chinese word segmentation, including common segmenters such as IK and jieba. Their query and index creation efficiency is about the same, and both have good distributed solutions. Personally I think ES handles distribution better; at least its configuration is simpler than Solr's. The architecture was adjusted in Solr 5, but a lot of the Chinese-language material is still based on Solr 4.
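To make the IK point concrete, here is a minimal sketch of the JSON bodies one might send to Elasticsearch for a Q&A index. It assumes the elasticsearch-analysis-ik plugin is installed (it provides the `ik_max_word` and `ik_smart` analyzers) and uses the typeless mapping format of Elasticsearch 7 and later. The field names `title` and `body` are hypothetical, and the code only builds the request bodies; it does not talk to a server.

```python
import json


def build_index_body(text_fields):
    """Mapping that indexes with fine-grained IK segmentation and
    searches with the coarser ik_smart analyzer."""
    properties = {
        field: {
            "type": "text",
            "analyzer": "ik_max_word",      # used at index time
            "search_analyzer": "ik_smart",  # used on the query string
        }
        for field in text_fields
    }
    return {"mappings": {"properties": properties}}


def build_query(user_input, size=10):
    """Full-text query over both fields; title matches are weighted double."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": user_input,
                "fields": ["title^2", "body"],  # hypothetical field names
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(["title", "body"]), indent=2))
    print(json.dumps(build_query("高血压 用药"), ensure_ascii=False, indent=2))
```

From PHP, the same bodies could be built as arrays, passed through `json_encode`, and sent to the index-creation and `_search` endpoints.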
In fact, the cost of replacing the index and migrating is not high. I recommend using Sphinx while the data size is small, since it is simple and stable. When a performance bottleneck appears, switch to Solr or ES; I do not recommend going to distributed Sphinx. Personally, I feel Xunsearch is still at the toy level. Of course, I reached that conclusion based on earlier versions, and I am not familiar with its current state.
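One way to keep that migration cost low is to hide the engine behind a small interface from the start, so that only one class changes when the engine does. The following is a rough sketch of that idea, not code from any of these engines: `SearchBackend`, `InMemoryBackend`, and `QAService` are hypothetical names, and the in-memory backend is only a stand-in for a real Sphinx, Solr, or Elasticsearch client.

```python
from typing import Dict, List, Protocol


class SearchBackend(Protocol):
    """The only surface the application is allowed to depend on."""

    def index(self, doc_id: int, text: str) -> None: ...

    def search(self, query: str, limit: int = 10) -> List[int]: ...


class InMemoryBackend:
    """Stand-in engine: whitespace tokens, ranked by term frequency."""

    def __init__(self) -> None:
        self._docs: Dict[int, List[str]] = {}

    def index(self, doc_id: int, text: str) -> None:
        self._docs[doc_id] = text.lower().split()

    def search(self, query: str, limit: int = 10) -> List[int]:
        terms = query.lower().split()
        scored = []
        for doc_id, words in self._docs.items():
            score = sum(words.count(term) for term in terms)
            if score > 0:
                scored.append((score, doc_id))
        # Highest score first; ties broken by the lower document id.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [doc_id for _, doc_id in scored[:limit]]


class QAService:
    """Application code; swapping engines means passing a different backend."""

    def __init__(self, backend: SearchBackend) -> None:
        self._backend = backend

    def add_question(self, question_id: int, text: str) -> None:
        self._backend.index(question_id, text)

    def find(self, query: str, limit: int = 10) -> List[int]:
        return self._backend.search(query, limit)


if __name__ == "__main__":
    service = QAService(InMemoryBackend())
    service.add_question(1, "how to configure a sphinx index")
    service.add_question(2, "sphinx query syntax and sphinx ranking")
    print(service.find("sphinx"))
```

A later move from Sphinx to Solr or ES would then amount to writing one more class with the same two methods and re-indexing from MySQL.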
Sphinx is simple; Elasticsearch and Solr have more features. For a search engine, what matters most is a reasonably accurate dictionary.
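Since the question mentions professional word libraries, a toy example may show why the dictionary matters so much. The sketch below uses forward maximum matching, a simple textbook dictionary-based segmentation method. It is not the algorithm used by SCWS, IK, or jieba, and the word lists are made up for illustration.

```python
def segment(text, dictionary):
    """Forward maximum matching: at each position, take the longest
    dictionary word; fall back to a single character if none matches."""
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical dictionaries: a general one, and the same one extended
# with a small domain (medical) word library.
general = {"高血压", "患者", "可以", "服用"}
medical = general | {"阿司匹林", "肠溶片"}

text = "高血压患者可以服用阿司匹林肠溶片"

if __name__ == "__main__":
    print(segment(text, general))
    print(segment(text, medical))
```

With only the general dictionary, the drug name falls apart into single characters, so a query for it would match on noise. With the domain words added, it is indexed as one term. Whichever engine is chosen, loading the domain word library into its segmenter has the same effect.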
I recommend Sphinx; it is easy to operate.