Why does this exception occur:
If you use RangeQuery, PrefixQuery, WildcardQuery, or FuzzyQuery in a Lucene search, a TooManyClauses exception may be thrown. Why does this happen? Consider an example:
Take RangeQuery. If the date range is 19990101 to 20091231 and the index contains date terms such as 19990102 and 19990103, the RangeQuery is expanded to "19990102 OR 19990103", which is two clauses. If the index contains many dates within this period, many clauses are generated.
The same applies to PrefixQuery. For example, if the query term is "legal*" and the index contains terms such as "legal", "legal field", "forensic", and "legal code", the query is expanded into one clause per matching term, for example "legal OR legal field OR legal code", and perhaps more.
To save memory, Lucene limits the number of clauses to 1024 by default. If the limit is exceeded, a TooManyClauses exception is thrown.
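The expansion described above can be modelled in a few lines of plain Java. This is only a sketch and not Lucene code: the class name ClauseExpansionSketch, its methods, and the use of IllegalStateException in place of Lucene's TooManyClauses are all assumptions made for illustration. The index is modelled as a sorted set of terms, and every term matched by a range or prefix becomes one clause until the limit of 1024 is hit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model of multi-term query expansion; hypothetical names, not Lucene code.
class ClauseExpansionSketch {
    // Mirrors the default clause limit mentioned in the text.
    static final int MAX_CLAUSES = 1024;

    // One clause per indexed term in [lower, upper], both bounds inclusive.
    static List<String> expandRange(SortedSet<String> terms, String lower, String upper) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms.tailSet(lower)) {
            if (term.compareTo(upper) > 0) {
                break; // past the upper bound
            }
            addClause(clauses, term);
        }
        return clauses;
    }

    // One clause per indexed term that starts with the prefix.
    static List<String> expandPrefix(SortedSet<String> terms, String prefix) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // prefixed terms are contiguous in sorted order
            }
            addClause(clauses, term);
        }
        return clauses;
    }

    // Stands in for the exception Lucene throws when the limit is exceeded.
    private static void addClause(List<String> clauses, String term) {
        if (clauses.size() >= MAX_CLAUSES) {
            throw new IllegalStateException("too many clauses: limit is " + MAX_CLAUSES);
        }
        clauses.add(term);
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>();
        terms.add("19990102");
        terms.add("19990103");
        terms.add("20100101");
        // prints [19990102, 19990103]
        System.out.println(expandRange(terms, "19990101", "20091231"));
    }
}
```

With only two matching dates the expansion is harmless; with more than 1024 distinct matching terms the sketch fails in the same way the real query does.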
How can we solve this problem? Lucene provides three methods:
(1) Use a Filter instead of a Query. This comes at the expense of query speed, but the cost can be mitigated through caching. For example, you can replace RangeQuery with RangeFilter as follows:
Original code:
    // Lucene 2.x API: Term is in org.apache.lucene.index, the rest in org.apache.lucene.search
    BooleanQuery simpleQuery = new BooleanQuery();
    Term dateLower = new Term("publishdate", startYear + "0101");
    Term dateUpper = new Term("publishdate", endYear + "1231");
    RangeQuery dateQuery = new RangeQuery(dateLower, dateUpper, true); // true = inclusive bounds
    simpleQuery.add(dateQuery, BooleanClause.Occur.MUST);
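Revised code:
    // simpleQuery is assumed to hold the remaining, non-date clauses
    BooleanQuery simpleQuery = new BooleanQuery();
    // arguments: field, lower term, upper term, include lower, include upper
    RangeFilter dateFilter = new RangeFilter("publishdate", startYear + "0101", endYear + "1231", true, true);
    FilteredQuery filteredQuery = new FilteredQuery(simpleQuery, dateFilter);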
(2) Call BooleanQuery.setMaxClauseCount(10240) to raise the limit. This increases memory consumption. Calling BooleanQuery.setMaxClauseCount(Integer.MAX_VALUE) removes the restriction entirely.
(3) Reduce the precision of the range query as far as possible. For example, if the query does not need to be accurate to the month or day and only needs to be accurate to the year, index and query the date at year precision. Lucene's DateTools class reportedly makes this kind of date conversion easy; I have not tried it.
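Approaches (1) and (3) can be sketched in plain Java to show why they avoid the clause limit. This is an illustrative model under stated assumptions, not Lucene code: the class RangeWorkaroundSketch, its methods, and the in-memory postings map are hypothetical. The first method mimics a filter by marking matching documents in a single bit set instead of creating one clause per term. The second truncates yyyyMMdd strings to show how coarser precision shrinks the number of distinct terms: the range 19990101 to 20091231 can contain up to 4018 distinct days but only 11 distinct years.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy models of approaches (1) and (3); hypothetical names, not Lucene code.
class RangeWorkaroundSketch {

    // Approach (1): walk the terms in [lower, upper] and mark their documents
    // in one bit set, so no per-term clause is ever created.
    // postings maps each indexed term to the ids of the documents containing it.
    static BitSet rangeBits(SortedMap<String, List<Integer>> postings,
                            String lower, String upper) {
        BitSet bits = new BitSet();
        for (Map.Entry<String, List<Integer>> entry : postings.tailMap(lower).entrySet()) {
            if (entry.getKey().compareTo(upper) > 0) {
                break; // past the upper bound
            }
            for (int docId : entry.getValue()) {
                bits.set(docId);
            }
        }
        return bits;
    }

    // Approach (3): keep only the leading characters of each yyyyMMdd string
    // (4 = year, 6 = month, 8 = day) and collect the distinct terms that remain.
    static SortedSet<String> distinctTerms(List<String> dates, int digits) {
        SortedSet<String> terms = new TreeSet<>();
        for (String date : dates) {
            terms.add(date.substring(0, digits));
        }
        return terms;
    }

    public static void main(String[] args) {
        SortedMap<String, List<Integer>> postings = new TreeMap<>();
        postings.put("19990102", Arrays.asList(0, 3));
        postings.put("19990103", Arrays.asList(1));
        postings.put("20100101", Arrays.asList(2));
        // prints {0, 1, 3}
        System.out.println(rangeBits(postings, "19990101", "20091231"));

        List<String> dates = Arrays.asList("19990102", "19990103", "19990215", "20000101");
        System.out.println(distinctTerms(dates, 8).size()); // prints 4
        System.out.println(distinctTerms(dates, 4));        // prints [1999, 2000]
    }
}
```

In the bit-set model the cost grows with the number of documents rather than the number of matching terms, which is the trade-off between speed and clause count described in (1).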