Collator simplifies handling the differences between languages when comparing strings. It mainly deals with:
- Normalization of canonically equivalent characters
- Multi-level comparison
By default, Java compares strings by Unicode value: String.compareTo compares the UTF-16 code units one by one. That would imply that a character's sort weight is determined by its position in the Unicode code charts, but this is not the case. The same character can have different sort weights in different languages.
For example, if you do not know German, you might expect ß (\u00df) to sort like B or b, but it is actually treated as ss, so in German it sorts after a single s.
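To make the difference between code-point order and language-aware order concrete, here is a minimal sketch. The class name CollatorSortDemo, its helper methods, and the sample words are assumptions chosen for illustration, not part of the original article.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class: contrasts String.compareTo ordering with Collator ordering.
class CollatorSortDemo {

    // Sorts by UTF-16 code unit value, which is the natural order of String.
    static List<String> sortByCodeUnit(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    // Sorts with the collation rules of the given locale.
    static List<String> sortByLocale(List<String> words, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(words);
        copy.sort(collator); // Collator implements Comparator<Object>
        return copy;
    }

    public static void main(String[] args) {
        // "\u00c4pfel" is "Äpfel"; the escape avoids source-encoding problems.
        List<String> words = List.of("Zebra", "\u00c4pfel", "Apfel");
        System.out.println("By code unit: " + sortByCodeUnit(words));
        System.out.println("By German collation: " + sortByLocale(words, Locale.GERMAN));
    }
}
```

With plain code-unit order, Äpfel lands after Zebra because Ä (U+00C4) has a higher value than Z (U+005A). With the German collator, Äpfel sorts directly after Apfel.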
Multi-level comparison distinguishes four levels: base characters, accents, case, and punctuation.
Note that Collator does not support the punctuation level.
Let's look at some concrete code for multi-level comparison:
System.out.println("A equals B-> " + (collator.compare("A", "B") == 0 ? "true" : "false"));
System.out.println("A equals à-> " + (collator.compare("A", "à") == 0 ? "true" : "false"));
System.out.println("A equals a-> " + (collator.compare("A", "a") == 0 ? "true" : "false"));
With collator.setStrength(Collator.PRIMARY):
A equals B-> false
A equals à-> true
A equals a-> true
With collator.setStrength(Collator.SECONDARY):
A equals B-> false
A equals à-> false
A equals a-> true
With collator.setStrength(Collator.TERTIARY):
A equals B-> false
A equals à-> false
A equals a-> false
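The three runs above can be reproduced with a small self-contained program. This is only a sketch: the class name CollatorStrengthDemo, the helper equalAt, and the choice of Locale.ENGLISH are assumptions, since the original does not show how the collator was created.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical demo class: runs the same three comparisons at each strength.
class CollatorStrengthDemo {

    // Returns true when the collator treats a and b as equal at the given strength.
    static boolean equalAt(int strength, String a, String b) {
        Collator collator = Collator.getInstance(Locale.ENGLISH); // assumed locale
        collator.setStrength(strength);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        int[] strengths = {Collator.PRIMARY, Collator.SECONDARY, Collator.TERTIARY};
        String[] names = {"PRIMARY", "SECONDARY", "TERTIARY"};
        for (int i = 0; i < strengths.length; i++) {
            System.out.println("With Collator." + names[i] + ":");
            System.out.println("A equals B-> " + equalAt(strengths[i], "A", "B"));
            // "\u00e0" is "à"; the escape avoids source-encoding problems.
            System.out.println("A equals \u00e0-> " + equalAt(strengths[i], "A", "\u00e0"));
            System.out.println("A equals a-> " + equalAt(strengths[i], "A", "a"));
        }
    }
}
```

PRIMARY ignores both accents and case, SECONDARY adds accent differences, and TERTIARY adds case differences, which matches the pattern of true and false values listed above.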
Here is another example. Without its first line, the code below can print false, even though the two strings look exactly the same:
collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
String single = "abgaskr\u00fc";
String combined = "abgaskr\u0075\u0308";
System.out.println("single equals combined? " + (collator.compare(single, combined) == 0 ? "true" : "false"));
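As a self-contained illustration of canonical equivalence, the sketch below compares a precomposed ü (\u00fc) with u followed by a combining diaeresis (\u0075\u0308). The sample strings, the class name DecompositionDemo, and the use of Locale.ENGLISH are assumptions for illustration. It also shows java.text.Normalizer as an alternative way to check that two strings are canonically equivalent.

```java
import java.text.Collator;
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical demo class: two canonically equivalent spellings of "ü".
class DecompositionDemo {

    static final String SINGLE = "\u00fc";         // precomposed ü
    static final String COMBINED = "\u0075\u0308"; // u + combining diaeresis

    // Compares with a collator that decomposes canonical equivalents first.
    static boolean collatorEqual(String a, String b) {
        Collator collator = Collator.getInstance(Locale.ENGLISH); // assumed locale
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator.compare(a, b) == 0;
    }

    // Compares after normalizing both strings to NFC.
    static boolean nfcEqual(String a, String b) {
        String left = Normalizer.normalize(a, Normalizer.Form.NFC);
        String right = Normalizer.normalize(b, Normalizer.Form.NFC);
        return left.equals(right);
    }

    public static void main(String[] args) {
        System.out.println("String.equals: " + SINGLE.equals(COMBINED));
        System.out.println("Collator with canonical decomposition: " + collatorEqual(SINGLE, COMBINED));
        System.out.println("Equal after NFC normalization: " + nfcEqual(SINGLE, COMBINED));
    }
}
```

String.equals reports the two strings as different because their code units differ, while the collator with canonical decomposition and the NFC comparison both treat them as the same text.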
The Collator class can seem hard to understand at first, but you can't do without it when you need to handle different languages.
For details, see the Javadoc for the Collator class.