To process a document set, you first need to collect all the words that occur in the corpus, that is, to build a vocabulary. The vocabulary must store each word together with its index. Of course, stopword removal and stemming should be applied to the documents before any counting is done. Once we have a vocabulary, two requirements naturally follow: looking up the index for a given word, and looking up the word for a given index. However, the Map classes in the JDK only support looking up a value by its key. To find a key from a value, you have to call entrySet() to get all the entries, iterate over them, and check whether each entry's value is the one you want. If it is, record the key of that entry. For details, see the reference at the end of this post.
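The entrySet() traversal described above can be sketched as follows. This is only an illustration: the class name ReverseLookup, the method keysForValue, and the sample words are assumptions made for this example and do not come from any library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class ReverseLookup {

    // Hypothetical helper: collects every key whose value equals the target.
    // Each call scans the whole map, so a lookup costs O(n).
    static <K, V> List<K> keysForValue(Map<K, V> map, V target) {
        List<K> keys = new ArrayList<>();
        for (Map.Entry<K, V> entry : map.entrySet()) {
            if (Objects.equals(entry.getValue(), target)) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // A tiny word -> index vocabulary, used only for illustration.
        Map<String, Integer> vocabulary = new HashMap<>();
        vocabulary.put("map", 0);
        vocabulary.put("index", 1);

        System.out.println(keysForValue(vocabulary, 1)); // prints [index]
    }
}
```

The helper returns a list because an ordinary Map may hold the same value under several keys. A full scan on every lookup is what makes this approach unattractive for a large vocabulary.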
Of course, this is an ugly approach. The BidiMap interface in Apache Commons Collections does the job more cleanly. One limitation of this interface is that keys and values must be in one-to-one correspondence. The following is a simple example:
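// Package names below are those of Commons Collections 3.x
import org.apache.commons.collections.BidiMap;
import org.apache.commons.collections.bidimap.DualHashBidiMap;

// DualHashBidiMap is a hash-based BidiMap implementation
BidiMap bidi = new DualHashBidiMap();
bidi.put("SIX", "6");
bidi.get("SIX"); // returns "6"
bidi.getKey("6"); // returns "SIX"
bidi.removeValue("6"); // removes the mapping
BidiMap inverse = bidi.inverseBidiMap(); // returns a map with keys and values swapped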
More examples can be found online, for instance: http://java.dzone.com/articles/guavas-bidirectional-maps
The Apache Commons Collections API used above (the 3.x line) does not support generics, but there is another option: guava-libraries, a class library from Google. Its BiMap interface can do the same job. A simple example is as follows:
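import com.google.common.collect.BiMap;
import com.google.common.collect.HashBiMap;

BiMap<Integer, String> biMap = HashBiMap.create();
biMap.put(1, "a");
biMap.put(2, "b");
biMap.put(3, "c");
// inverse() returns a view of the same map with keys and values swapped
BiMap<String, Integer> invertedMap = biMap.inverse();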
The examples above are not complete programs, but the usage should be simple enough to follow.
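For the vocabulary case specifically, where the indexes are consecutive integers starting at 0, the same two-way lookup can also be sketched with plain JDK collections. This is not what either library does internally. It is a minimal illustration in which the class Vocabulary and its methods add, indexOf and wordAt are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Vocabulary {

    // word -> index
    private final Map<String, Integer> wordToIndex = new HashMap<>();
    // index -> word (the list position is the index)
    private final List<String> indexToWord = new ArrayList<>();

    // Adds the word if it has not been seen before and returns its index.
    public int add(String word) {
        Integer existing = wordToIndex.get(word);
        if (existing != null) {
            return existing;
        }
        int index = indexToWord.size();
        wordToIndex.put(word, index);
        indexToWord.add(word);
        return index;
    }

    // Returns the index of the word, or -1 if the word is unknown.
    public int indexOf(String word) {
        Integer index = wordToIndex.get(word);
        return index == null ? -1 : index;
    }

    // Returns the word stored at the given index.
    public String wordAt(int index) {
        return indexToWord.get(index);
    }

    public int size() {
        return indexToWord.size();
    }

    public static void main(String[] args) {
        Vocabulary vocabulary = new Vocabulary();
        vocabulary.add("corpus");
        vocabulary.add("stem");
        vocabulary.add("corpus"); // duplicate, keeps index 0

        System.out.println(vocabulary.indexOf("stem")); // prints 1
        System.out.println(vocabulary.wordAt(0));       // prints corpus
    }
}
```

Because each word is stored once and each index is assigned once, the one-to-one requirement holds by construction. A BidiMap or BiMap remains the more general choice when the values are not consecutive integers.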
Reference: http://stackoverflow.com/questions/1383797/java-hashmap-how-to-get-key-from-value