How to efficiently implement a counter map

Source: Internet
Author: User

This comes from a Stack Overflow discussion from many years ago, whose answers cover several ways of counting. With a key-value map, we often use an object as the key and an Integer or Long as the value, in order to count how often each key occurs. Even a requirement this basic can be implemented in many ways. The most basic is to use a JDK Map directly, with an Integer or Long value:

    final Map<String, Integer> freq = new HashMap<String, Integer>();
    int count = freq.containsKey(word) ? freq.get(word) : 0;
    freq.put(word, count + 1);

The logic is simple: check whether the key exists; if it does, get its value, otherwise use 0; then put back the value plus 1. Three methods are called in total: containsKey, get and put.

We can go a step further and remove the containsKey check:

    final Map<String, Integer> freq = new HashMap<String, Integer>();
    Integer count = freq.get(word);
    if (count == null) {
        freq.put(word, 1);
    } else {
        freq.put(word, count + 1);
    }

For most people this logic is good enough and its performance is acceptable: it is just one get and one put. Still, this implementation is not as efficient as it could be, so we start looking for a faster one, either by writing it ourselves or by checking the open-source collection libraries.

One of them is Trove:

    final TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();
    freq.adjustOrPutValue(word, 1, 1);

This is very elegant. How does it perform? I don't know; I would need to check the source code for details.

Then how does the well-known Guava do it?

    AtomicLongMap<String> map = AtomicLongMap.create();
    map.getAndIncrement(word);

The implementation is still elegant, but the name and the source code show that it is thread-safe and supports concurrency. That is not necessarily a good deal.
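To make the thread-safety remark concrete, here is a minimal sketch of concurrent counting that uses only JDK classes. It is not Guava's implementation; the class name ConcurrentCounterSketch and the helper countConcurrently are hypothetical, and computeIfAbsent assumes Java 8 or later.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

class ConcurrentCounterSketch {

    // Hypothetical helper: several threads increment the counter of the same word.
    static long countConcurrently(final String word, int threads, final int incrementsPerThread) {
        final ConcurrentMap<String, AtomicLong> freq = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    // computeIfAbsent creates the counter atomically the first time;
                    // incrementAndGet is an atomic increment on the shared AtomicLong.
                    freq.computeIfAbsent(word, k -> new AtomicLong()).incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until every increment has been applied
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return freq.get(word).get();
    }

    public static void main(String[] args) {
        // 4 threads x 10000 increments each; no update is lost.
        System.out.println(countConcurrently("a", 4, 10000)); // prints 40000
    }
}
```

The atomic operations are what a plain HashMap with Integer values lacks: there, two threads can read the same count and both write back count + 1, losing an update. That safety is the extra work a single-threaded counter does not need.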
Do we need it in our scenario? If we don't, intuition tells us that it must be "slow". Searching further turns up:

    Multiset<String> bag = HashMultiset.create();
    bag.add(word);

This seems more appropriate, and the bag implementation is much better. Semantically, such an interface is also easier to understand.

How do these methods perform? In a simple comparison, the 26 English letters are used as keys and each method is run in the same loop for several cycle counts. Only time efficiency is compared, and the time does not count the build overhead. A thread-safe ConcurrentMap implementation is also added; in fact, the AtomicLongMap in Google's Guava is also built on the ConcurrentMap of java.util.concurrent. Finally there is the MutableInt method. Look out for it: it is the one that performs best.

    import gnu.trove.map.hash.TObjectIntHashMap;

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.atomic.AtomicLong;

    import com.google.common.collect.HashMultiset;
    import com.google.common.collect.Multiset;
    import com.google.common.util.concurrent.AtomicLongMap;

    public class IntMapTest {

        public static void main(String[] args) {
            int cycles[] = { 100, 1000, 10000, 100000 };
            Tester baseLine = new BaseLine();
            Tester testForNull = new UseNullTest();
            Tester useAtomicLong = new UseAtomicLong();
            Tester useTrove = new UseTrove();
            Tester useMutableInt = new UseMutableInt();
            Tester useGuava = new UseGuava();
            Tester useGuava2 = new UseGuava2();

            // Run every implementation once per cycle count.
            for (int i = 0; i < cycles.length; i++) {
                System.out.println("----- With " + cycles[i] + " cycles -----");
                baseLine.test(cycles[i]);
                testForNull.test(cycles[i]);
                useAtomicLong.test(cycles[i]);
                useTrove.test(cycles[i]);
                useMutableInt.test(cycles[i]);
                useGuava.test(cycles[i]);
                useGuava2.test(cycles[i]);
                System.out.println("------------------------");
            }
        }
    }

    // Common timing harness: pre() records the start time, post() prints the elapsed time.
    abstract class Tester {
        long ms;
        static String[] strs = "abcdefghijklmnopqrstuvwxyz".split("");

        void pre() {
            System.out.println("=== " + this.getName() + "Test Case");
            ms = System.currentTimeMillis();
            System.out.println("start at " + ms);
        }

        void post() {
            ms = System.currentTimeMillis() - ms;
            System.out.println("Time used: " + ms + " ms");
        }

        abstract void doAction(int cycles);

        public void test(int cycles) {
            pre();
            doAction(cycles);
            post();
        }

        abstract String getName();
    }

    // containsKey + get + put
    class BaseLine extends Tester {
        final Map<String, Integer> freq = new HashMap<String, Integer>();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    int count = freq.containsKey(word) ? freq.get(word) : 0;
                    freq.put(word, count + 1);
                }
            }
        }

        @Override
        String getName() {
            return "BaseLine";
        }
    }

    // get + null check + put
    class UseNullTest extends Tester {
        final Map<String, Integer> freq = new HashMap<String, Integer>();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    Integer count = freq.get(word);
                    if (count == null) {
                        freq.put(word, 1);
                    } else {
                        freq.put(word, count + 1);
                    }
                }
            }
        }

        @Override
        String getName() {
            return "TestForNull";
        }
    }

    // Thread-safe: ConcurrentMap with AtomicLong values
    class UseAtomicLong extends Tester {
        final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    map.putIfAbsent(word, new AtomicLong(0));
                    map.get(word).incrementAndGet();
                }
            }
        }

        @Override
        String getName() {
            return "AtomicLong";
        }
    }

    class UseTrove extends Tester {
        final TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    freq.adjustOrPutValue(word, 1, 1);
                }
            }
        }

        @Override
        String getName() {
            return "Trove";
        }
    }

    class MutableInt {
        int value = 1; // note that we start at 1 since we're counting

        public void increment() {
            ++value;
        }

        public int get() {
            return value;
        }
    }

    // One get per word; the counter object is mutated in place.
    class UseMutableInt extends Tester {
        Map<String, MutableInt> freq = new HashMap<String, MutableInt>();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    MutableInt count = freq.get(word);
                    if (count == null) {
                        freq.put(word, new MutableInt());
                    } else {
                        count.increment();
                    }
                }
            }
        }

        @Override
        String getName() {
            return "MutableInt";
        }
    }

    class UseGuava extends Tester {
        AtomicLongMap<String> map = AtomicLongMap.create();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    map.getAndIncrement(word);
                }
            }
        }

        @Override
        String getName() {
            return "Guava AtomicLongMap";
        }
    }

    class UseGuava2 extends Tester {
        Multiset<String> bag = HashMultiset.create();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    bag.add(word);
                }
            }
        }

        @Override
        String getName() {
            return "Guava HashMultiSet";
        }
    }

Output:

    ----- With 100 cycles -----
    === BaseLineTest Case
    start at 1358655702729
    Time used: 7 ms
    === TestForNullTest Case
    start at 1358655702736
    Time used: 3 ms
    === AtomicLongTest Case
    start at 1358655702739
    Time used: 14 ms
    === TroveTest Case
    start at 1358655702753
    Time used: 2 ms
    === MutableIntTest Case
    start at 1358655702755
    Time used: 2 ms
    === Guava AtomicLongMapTest Case
    start at 1358655702757
    Time used: 4 ms
    === Guava HashMultiSetTest Case
    start at 1358655702761
    Time used: 7 ms
    ------------------------
    ----- With 1000 cycles -----
    === BaseLineTest Case
    start at 1358655702768
    Time used: 17 ms
    === TestForNullTest Case
    start at 1358655702785
    Time used: 7 ms
    === AtomicLongTest Case
    start at 1358655702792
    Time used: 44 ms
    === TroveTest Case
    start at 1358655702836
    Time used: 17 ms
    === MutableIntTest Case
    start at 1358655702853
    Time used: 5 ms
    === Guava AtomicLongMapTest Case
    start at 1358655702858
    Time used: 9 ms
    === Guava HashMultiSetTest Case
    start at 1358655702868
    Time used: 50 ms
    ------------------------
    ----- With 10000 cycles -----
    === BaseLineTest Case
    start at 1358655702918
    Time used: 16 ms
    === TestForNullTest Case
    start at 1358655702934
    Time used: 14 ms
    === AtomicLongTest Case
    start at 1358655702948
    Time used: 29 ms
    === TroveTest Case
    start at 1358655702977
    Time used: 10 ms
    === MutableIntTest Case
    start at 1358655702988
    Time used: 5 ms
    === Guava AtomicLongMapTest Case
    start at 1358655702993
    Time used: 15 ms
    === Guava HashMultiSetTest Case
    start at 1358655703009
    Time used: 77 ms
    ------------------------
    ----- With 100000 cycles -----
    === BaseLineTest Case
    start at 1358655703086
    Time used: 124 ms
    === TestForNullTest Case
    start at 1358655703210
    Time used: 118 ms
    === AtomicLongTest Case
    start at 1358655703329
    Time used: 240 ms
    === TroveTest Case
    start at 1358655703569
    Time used: 102 ms
    === MutableIntTest Case
    start at 1358655703671
    Time used: 45 ms
    === Guava AtomicLongMapTest Case
    start at 1358655703716
    Time used: 126 ms
    === Guava HashMultiSetTest Case
    start at 1358655703842
    Time used: 98 ms
    ------------------------

General conclusion: in a single thread, use MutableInt; with multiple threads, use Guava's AtomicLongMap. It is also worth reading Guava's implementation of addAndGet, which uses a loop and is very interesting.

Finally, when we optimize this problem, the obvious idea is to reduce method calls. MutableInt is the most efficient because it minimizes the number of map calls: one get per increment, plus one put the first time a key is seen. This is where the power of holding a reference (a "pointer") to a mutable counter shows. Of course, real business code has to weigh several factors, such as readability and how well the code fits the business logic. In practice we do not have to chase this level of efficiency, but we should also avoid writing the baseline code without thinking: it is obviously optimizable, so why not optimize it?
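The point about method calls can be checked without timing anything. The sketch below is not part of the original benchmark: CountingMap is a hypothetical HashMap subclass that tallies calls to containsKey, get and put, and the three helper methods repeat the baseline, null-test and MutableInt loops over a small, made-up word list.

```java
import java.util.HashMap;

class CallCountSketch {

    // Hypothetical helper: a HashMap that tallies containsKey, get and put calls.
    static final class CountingMap<V> extends HashMap<String, V> {
        int calls = 0;

        @Override
        public boolean containsKey(Object key) { calls++; return super.containsKey(key); }

        @Override
        public V get(Object key) { calls++; return super.get(key); }

        @Override
        public V put(String key, V value) { calls++; return super.put(key, value); }
    }

    static final class MutableInt {
        int value = 1; // starts at 1 because it is created on the first occurrence
        void increment() { ++value; }
    }

    // Made-up input: "a" occurs three times, "b" once.
    static final String[] WORDS = { "a", "b", "a", "a" };

    static int baselineCalls() {
        CountingMap<Integer> freq = new CountingMap<>();
        for (String word : WORDS) {
            int count = freq.containsKey(word) ? freq.get(word) : 0;
            freq.put(word, count + 1);
        }
        return freq.calls;
    }

    static int nullTestCalls() {
        CountingMap<Integer> freq = new CountingMap<>();
        for (String word : WORDS) {
            Integer count = freq.get(word);
            freq.put(word, count == null ? 1 : count + 1);
        }
        return freq.calls;
    }

    static int mutableIntCalls() {
        CountingMap<MutableInt> freq = new CountingMap<>();
        for (String word : WORDS) {
            MutableInt count = freq.get(word);
            if (count == null) {
                freq.put(word, new MutableInt());
            } else {
                count.increment(); // no map call: the stored object is mutated in place
            }
        }
        return freq.calls;
    }

    public static void main(String[] args) {
        System.out.println("baseline:   " + baselineCalls());   // 10
        System.out.println("null test:  " + nullTestCalls());   // 8
        System.out.println("MutableInt: " + mutableIntCalls()); // 6
    }
}
```

With only two distinct keys the gap is small, but it grows with the number of repeated keys: each repeat costs three map calls in the baseline, two with the null test, and one with MutableInt.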
