The Neo4j version used here is the current latest, 3.1.0, and there were plenty of pits to step in along the way. This post covers how to build a Chinese full-text index in Neo4j 3.1.0, using IKAnalyzer as the word segmenter.
1. First, refer to this article:
https://segmentfault.com/a/1190000005665612
It gives a rough outline of how to index with IKAnalyzer, but it is not explicit about one thing: the approach assumes embedded Neo4j, i.e. Neo4j must be embedded inside your Java application (https://neo4j.com/docs/java-reference/current/#tutorials-java-embedded). Remember this; otherwise you cannot supply a custom Analyzer. Secondly, the method in that article no longer works as-is, because Neo4j 3.1.0 ships with Lucene 5.5, which the official IKAnalyzer does not support.
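For reference, a minimal sketch of starting embedded Neo4j 3.1 (this snippet is mine, not from the article above; the store path and class name are placeholders):

import java.io.File;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;

public class EmbeddedNeo4j {
    public static void main(String[] args) {
        // Open (or create) a graph database at the given store directory
        GraphDatabaseService graphDb = new GraphDatabaseFactory()
                .newEmbeddedDatabase(new File("data/graph.db"));
        // Shut the database down cleanly when the JVM exits
        Runtime.getRuntime().addShutdownHook(new Thread(graphDb::shutdown));
    }
}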
2. Correction
Switch to IKAnalyzer2012FF_u1.jar, which can be downloaded from Google Code (https://code.google.com/archive/p/ik-analyzer/downloads). This build of IKAnalyzer was modified by a community contributor to fix IKAnalyzer's incompatibility with Lucene versions after 3.5. But using this package still fails, with the error:
Caused by: java.lang.AbstractMethodError: org.apache.lucene.analysis.Analyzer.createComponents(Ljava/lang/String;)Lorg/apache/lucene/analysis/Analyzer$TokenStreamComponents;
That is, the Analyzer class in this IKAnalyzer build is still incompatible with the current Lucene version.
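The cause is a signature change in Lucene's Analyzer API (this summary is mine, not the original article's):

// Lucene 4.x declared the abstract method as:
//   protected abstract TokenStreamComponents createComponents(String fieldName, Reader reader);
// Lucene 5.x changed it to:
//   protected abstract TokenStreamComponents createComponents(String fieldName);
// IKAnalyzer2012FF_u1 only implements the old two-argument form, so when Lucene 5.5
// calls the one-argument form there is no implementation to dispatch to: AbstractMethodError.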
Solution: add two classes of our own.
package com.uc.wa.function;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.Tokenizer;

public class IKAnalyzer5x extends Analyzer {

    private boolean useSmart;

    public boolean useSmart() {
        return useSmart;
    }

    public void setUseSmart(boolean useSmart) {
        this.useSmart = useSmart;
    }

    public IKAnalyzer5x() {
        this(false);
    }

    public IKAnalyzer5x(boolean useSmart) {
        super();
        this.useSmart = useSmart;
    }

    /*
    // The Lucene 4.x version, no longer called by Lucene 5.x:
    protected TokenStreamComponents createComponents(String fieldName, final Reader in) {
        Tokenizer _IKTokenizer = new IKTokenizer(in, this.useSmart());
        return new TokenStreamComponents(_IKTokenizer);
    }
    */

    /**
     * Override the latest (Lucene 5.x) createComponents:
     * implements the new Analyzer interface and builds the tokenizer chain.
     */
    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        Tokenizer _IKTokenizer = new IKTokenizer5x(this.useSmart());
        return new TokenStreamComponents(_IKTokenizer);
    }
}
package com.uc.wa.function;

import java.io.IOException;

import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.analysis.tokenattributes.TypeAttribute;
import org.wltea.analyzer.core.IKSegmenter;
import org.wltea.analyzer.core.Lexeme;

public class IKTokenizer5x extends Tokenizer {

    // The IK segmenter implementation
    private IKSegmenter _IKImplement;

    // Term text attribute
    private final CharTermAttribute termAtt;
    // Term offset attribute
    private final OffsetAttribute offsetAtt;
    // Term type attribute (see the type constants in org.wltea.analyzer.core.Lexeme)
    private final TypeAttribute typeAtt;
    // End position of the last lexeme processed
    private int endPosition;

    /*
    // The Lucene 4.x constructor took a Reader; in Lucene 5.x the Tokenizer
    // base class manages the input Reader itself:
    public IKTokenizer(Reader in, boolean useSmart) { super(in); ... }
    */

    /**
     * Lucene 5.x Tokenizer constructor: implements the new Tokenizer interface.
     * @param useSmart whether to use IK's smart segmentation mode
     */
    public IKTokenizer5x(boolean useSmart) {
        super();
        offsetAtt = addAttribute(OffsetAttribute.class);
        termAtt = addAttribute(CharTermAttribute.class);
        typeAtt = addAttribute(TypeAttribute.class);
        _IKImplement = new IKSegmenter(input, useSmart);
    }

    @Override
    public boolean incrementToken() throws IOException {
        // Clear the attributes left over from the previous term
        clearAttributes();
        Lexeme nextLexeme = _IKImplement.next();
        if (nextLexeme != null) {
            // Copy the Lexeme into Lucene attributes: text, length, offsets, type
            termAtt.append(nextLexeme.getLexemeText());
            termAtt.setLength(nextLexeme.getLength());
            offsetAtt.setOffset(nextLexeme.getBeginPosition(), nextLexeme.getEndPosition());
            // Record the end position of the last lexeme
            endPosition = nextLexeme.getEndPosition();
            typeAtt.setType(nextLexeme.getLexemeTypeString());
            return true;  // another term was produced
        }
        return false;     // no more terms
    }

    @Override
    public void reset() throws IOException {
        super.reset();
        _IKImplement.reset(input);
    }

    @Override
    public final void end() {
        // Set the final offset
        int finalOffset = correctOffset(this.endPosition);
        offsetAtt.setOffset(finalOffset, finalOffset);
    }
}
This resolves the incompatibility between IKAnalyzer2012FF_u1.jar and Lucene 5. Just use IKAnalyzer5x wherever you would otherwise use IKAnalyzer.
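As a quick sanity check (this snippet is my own; the sample sentence and field name are arbitrary), you can print the tokens the new analyzer produces:

import java.io.IOException;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class IKAnalyzer5xSmokeTest {
    public static void main(String[] args) throws IOException {
        try (IKAnalyzer5x analyzer = new IKAnalyzer5x(true);
             TokenStream ts = analyzer.tokenStream("content", "南昌是江西省的省会")) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                // Each line printed is one segmented Chinese term
                System.out.println(term.toString());
            }
            ts.end();
        }
    }
}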
3. Finally
An example of building and querying the Neo4j Chinese full-text index:
/**
 * Create a full-text index for a single node.
 * @param propKeys the property keys to index
 */
public static void createFullTextIndex(long id, List<String> propKeys) {
    log.info("Method[createFullTextIndex] begin. propKeys<" + propKeys + ">");
    Index<Node> entityIndex = null;
    try (Transaction tx = Neo4j.graphDb.beginTx()) {
        entityIndex = Neo4j.graphDb.index().forNodes("NodeFullTextIndex",
                MapUtil.stringMap(IndexManager.PROVIDER, "lucene", "analyzer", IKAnalyzer5x.class.getName()));
        Node node = Neo4j.graphDb.getNodeById(id);
        log.info("Method[createFullTextIndex] get node id<" + node.getId() + "> name<" + node.getProperty("knowledge_name") + ">");
        /* Get the node's property values */
        Set<Map.Entry<String, Object>> properties = node.getProperties(propKeys.toArray(new String[0])).entrySet();
        for (Map.Entry<String, Object> property : properties) {
            log.info("Method[createFullTextIndex] index prop<" + property.getKey() + ":" + property.getValue() + ">");
            entityIndex.add(node, property.getKey(), property.getValue());
        }
        tx.success();
    }
}
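A hypothetical call (the node id and property keys are made up for illustration):

// Index the knowledge_name and address properties of node 42
createFullTextIndex(42L, Lists.newArrayList("knowledge_name", "address"));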
/**
 * Query using the full-text index.
 * @throws IOException
 */
public static List<Map<String, Object>> selectByFullTextIndex(String[] fields, String query) throws IOException {
    List<Map<String, Object>> ret = Lists.newArrayList();
    try (Transaction tx = Neo4j.graphDb.beginTx()) {
        IndexManager index = Neo4j.graphDb.index();
        /* Query the index */
        Index<Node> addressNodeFullTextIndex = index.forNodes("NodeFullTextIndex",
                MapUtil.stringMap(IndexManager.PROVIDER, "lucene", "analyzer", IKAnalyzer5x.class.getName()));
        Query q = IKQueryParser.parseMultiField(fields, query);
        IndexHits<Node> foundNodes = addressNodeFullTextIndex.query(q);
        for (Node n : foundNodes) {
            Map<String, Object> m = n.getAllProperties();
            if (!Float.isNaN(foundNodes.currentScore())) {
                m.put("score", foundNodes.currentScore());
            }
            log.info("Method[selectByIndex] score<" + foundNodes.currentScore() + ">");
            ret.add(m);
        }
        tx.success();
    } catch (IOException e) {
        log.error("Method[selectByIndex] fields<" + Joiner.on(",").join(fields) + "> query<" + query + ">", e);
        throw e;
    }
    return ret;
}
Notice that I used IKQueryParser here, which automatically builds a Query from the search terms and the fields to be searched. This bypasses another pit: passing a raw Lucene query string is problematic. For example, the query string "address:南昌市" would match every address containing the character 市 ("city"), which is clearly unreasonable. Using IKQueryParser fixes the problem. IKQueryParser is a tool bundled with the original IKAnalyzer, but it was cut from IKAnalyzer2012FF_u1.jar, so I brought the original IKAnalyzer jar back in as well; the project ends up carrying both jars side by side.
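For illustration, the difference in code (a sketch reusing addressNodeFullTextIndex from the example above; the field and text are made up):

// The pit: passing a raw Lucene query string to the legacy index
IndexHits<Node> bad = addressNodeFullTextIndex.query("address:南昌市");

// The fix: let IKQueryParser build the Query from IK's own segmentation
Query good = IKQueryParser.parseMultiField(new String[]{"address"}, "南昌市");
IndexHits<Node> hits = addressNodeFullTextIndex.query(good);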
And that is roughly all the pits I stepped in.