Setting the initial capacity of commonly used containers

Source: Internet
Author: User
Tags rehash set set

In earlier Java SE posts I went through several of the relevant data structures; here I summarize their performance aspects. In day-to-day coding, these data structures should be initialized as described below.

    • 1. StringBuffer and StringBuilder

StringBuffer() constructs a string buffer with no characters in it and an initial capacity of 16 characters.

StringBuffer(int capacity) constructs a string buffer with no characters in it and the specified initial capacity.

I won't repeat the differences between the two classes or their relative performance here. In most cases StringBuilder is the better choice: these string operations usually do not involve data shared between threads, so the unsynchronized (not thread-safe) StringBuilder is fine.

Underlying principle: both StringBuffer and StringBuilder store their data in a char[] (a byte[] since JDK 9), and this backing array has to be resized as characters are appended. Resizing means allocating a new, larger array, copying the characters from the old array into it, and discarding the old array, which is then left for garbage collection.

In practice, when we create a builder without any initial content, we usually call the first (no-argument) constructor and get the default capacity of 16 characters. Real strings are rarely shorter than 16 characters, so it is better to estimate the approximate length of the string beforehand and pass that estimate to the constructor.
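The advice above can be sketched as follows. This is only an illustration: the class name BuilderCapacityDemo and the helper joinWithCommas are made up for this example, and the length estimate is deliberately rough.

```java
// Minimal sketch: presizing a StringBuilder when the final length is roughly known.
// BuilderCapacityDemo and joinWithCommas are illustrative names, not from the article.
class BuilderCapacityDemo {

    // Joins the parts with commas, reserving the backing storage once up front.
    static String joinWithCommas(String[] parts) {
        int estimated = 0;
        for (String p : parts) {
            estimated += p.length() + 1; // +1 leaves room for a separator
        }
        StringBuilder sb = new StringBuilder(estimated);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(new StringBuilder().capacity());   // 16, the default
        System.out.println(new StringBuilder(64).capacity()); // 64, as requested
        System.out.println(joinWithCommas(new String[] {"alpha", "beta", "gamma"}));
    }
}
```

Because the estimate is at least as large as the final string, the builder never needs to grow while the loop runs.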

    • 2. Arrays

Not much needs to be said about arrays. An array has a fixed length: once it is initialized, the space it occupies in memory is fixed and its length cannot change. Array elements get their initial values in one of two ways: assigned by the system, or specified by the programmer.

Static initialization: the programmer explicitly specifies the initial value of each element, and the system determines the length.

Dynamic initialization: the programmer specifies only the length of the array, and the system assigns each element its default initial value.
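The two forms look like this in code. The class and method names are made up for this sketch.

```java
// Minimal sketch of the two array initialization styles described above.
// ArrayInitDemo and its methods are illustrative names.
class ArrayInitDemo {

    // Static initialization: values are given, the length (3) is inferred.
    static int[] staticInit() {
        return new int[] {3, 1, 4};
    }

    // Dynamic initialization: only the length is given; every int element
    // starts at the default value 0 (object references would start at null).
    static int[] dynamicInit(int length) {
        return new int[length];
    }

    public static void main(String[] args) {
        System.out.println(staticInit().length);
        System.out.println(dynamicInit(5)[0]);
    }
}
```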


    • 3. ArrayList

ArrayList() constructs an empty list with an initial capacity of ten.
ArrayList(int initialCapacity) constructs an empty list with the specified initial capacity.

The JDK API documentation says: each ArrayList instance has a capacity. The capacity is the size of the array used to store the elements in the list. It is always at least as large as the list size. As elements are added to an ArrayList, its capacity grows automatically. The details of the growth policy are not specified beyond the fact that adding an element has constant amortized time cost.

Before adding a large number of elements, an application can use the ensureCapacity operation to increase the capacity of the ArrayList instance. This may reduce the amount of incremental reallocation.

In practice, if we do not know the exact size of the list, we can specify an approximate one. If we do know it, we should use it: for example, if some business operation is known to put exactly 4 objects into a list, the list should be created with a capacity of 4. The most common scenario is a paged database query. Here we know the number of rows returned, and we map each row to an ORM object and add it to a list, so we can use the number of rows returned (or the page size) as the initial capacity of the list.
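A minimal sketch of both techniques follows. The names ListCapacityDemo, mapPage and appendRange are hypothetical, and the String array merely stands in for one page of rows returned by a query.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: sizing an ArrayList from a known row count.
// All names here are illustrative; String[] stands in for a page of query rows.
class ListCapacityDemo {

    // The row count is known, so the list is created with exactly that capacity.
    static List<String> mapPage(String[] rows) {
        List<String> result = new ArrayList<>(rows.length);
        for (String row : rows) {
            result.add(row.trim()); // stand-in for mapping a row to an object
        }
        return result;
    }

    // For an existing list, ensureCapacity reserves room before a bulk add.
    static ArrayList<Integer> appendRange(ArrayList<Integer> target, int count) {
        target.ensureCapacity(target.size() + count);
        for (int i = 0; i < count; i++) {
            target.add(i);
        }
        return target;
    }

    public static void main(String[] args) {
        System.out.println(mapPage(new String[] {" a ", "b "}));
        System.out.println(appendRange(new ArrayList<>(), 1000).size());
    }
}
```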

    • 4. HashMap

HashMap() constructs an empty HashMap with the default initial capacity (16) and the default load factor (0.75).
HashMap(int initialCapacity) constructs an empty HashMap with the specified initial capacity and the default load factor (0.75).
HashMap(int initialCapacity, float loadFactor) constructs an empty HashMap with the specified initial capacity and load factor.

This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iteration over the collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Therefore, if iteration performance is important, do not set the initial capacity too high (or the load factor too low).

An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (its internal data structures are rebuilt) so that it has approximately twice the number of buckets.

As a general rule, the default load factor (0.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (this is reflected in most operations of the HashMap class, including get and put; it is worth thinking about why). When setting the initial capacity, take into account the expected number of entries in the map and its load factor, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor (that is, the maximum number of entries is less than initial capacity * load factor), no rehash operation will ever occur.

If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large initial capacity stores them more efficiently than letting it rehash automatically as needed to grow the table. As a HashMap stores more and more elements and reaches its threshold, the entry array has to be expanded; this automatic growth is one of the great conveniences of the Java collections framework. On expansion, the new array has twice the capacity of the old one. Because the capacity has changed, the bucket index of every existing element must be recalculated before the element is stored in the new array; this is called rehashing. With the default initial capacity of 16 and load factor of 0.75, a HashMap can hold 16 * 0.75 = 12 elements, and putting the 13th triggers a rehash. Rehashing involves a fair amount of work and hurts performance, so when a HashMap needs to store many elements it is best to specify an appropriate initial capacity and load factor. Otherwise the map starts out with room for only 12 elements and will rehash several times as it grows.
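The sizing rule above (initial capacity at least the expected number of entries divided by the load factor) can be sketched as a small helper. MapCapacityDemo, capacityFor and presizedMap are hypothetical names introduced for this example; HashMap itself does not expose its capacity, so the sketch only shows the arithmetic and the constructor call.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch: choosing a HashMap initial capacity from the expected entry count.
// MapCapacityDemo, capacityFor and presizedMap are illustrative names.
class MapCapacityDemo {

    // Smallest capacity c such that c * loadFactor >= expectedEntries.
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor);
    }

    // Builds a map that should not need to grow while the loop fills it.
    static Map<Integer, String> presizedMap(int expectedEntries) {
        float loadFactor = 0.75f;
        Map<Integer, String> map =
                new HashMap<>(capacityFor(expectedEntries, loadFactor), loadFactor);
        for (int i = 0; i < expectedEntries; i++) {
            map.put(i, "value-" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(12, 0.75f));  // 12 / 0.75 = 16
        System.out.println(capacityFor(100, 0.75f)); // 133.33... rounded up to 134
        System.out.println(presizedMap(100).size());
    }
}
```

For 12 expected entries the helper returns 16, which matches the default capacity discussed above: 16 * 0.75 = 12 entries fit before the first rehash.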

In practice we can set the load factor to 1, provided the concept of the load factor is well understood. If the load factor is set too low, the effect is the same as setting the initial capacity too high: lookups benefit, but iteration suffers. If it is set too high, iteration gets faster but lookups get more expensive. Lookup is the most frequent collection operation in general, but in everyday code we iterate a great deal as well, so when we specify these two parameters by hand, a load factor of 1 is a reasonable choice. It also makes the code easier to reason about: the container can hold as many elements as the capacity we specified before it has to resize, with no need to multiply capacity by load factor. (HashMap rounds the requested capacity up to a power of two, so the actual number may be somewhat larger.)

    • 5. HashSet

HashSet() constructs a new, empty set; the backing HashMap instance has the default initial capacity (16) and load factor (0.75).
HashSet(int initialCapacity) constructs a new, empty set; the backing HashMap instance has the specified initial capacity and the default load factor (0.75).
HashSet(int initialCapacity, float loadFactor) constructs a new, empty set; the backing HashMap instance has the specified initial capacity and the specified load factor.


This class offers constant-time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets. Iterating over the set requires time proportional to the sum of the HashSet instance's size (the number of elements) and the "capacity" of the backing HashMap instance (the number of buckets). Therefore, if iteration performance is important, do not set the initial capacity too high (or the load factor too low).

As I mentioned in an earlier Java SE post, the Java source code shows that Map is implemented first, and Set is then implemented by wrapping a Map in which every key maps to the same placeholder value (HashSet is backed by a HashMap). With that in mind, here is a performance summary for the hash-based collections.

If you know from the start that a HashSet or HashMap will hold many entries, create it with a larger but reasonable initial capacity. "Reasonable" means the value should not be exaggerated: we make it somewhat larger only to keep the collection from rehashing repeatedly. If the value is far too large, it wastes space and also hurts iteration performance.
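The same sizing idea applies to HashSet. The sketch below assumes the default load factor of 0.75 and uses made-up names (SetCapacityDemo, distinct); it sizes the set for the worst case in which every input item is unique.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Minimal sketch: presizing a HashSet for a known upper bound on its size.
// SetCapacityDemo and distinct are illustrative names.
class SetCapacityDemo {

    // Collects the distinct items; the set can never hold more than items.size()
    // elements, so that count divided by the load factor is a safe capacity.
    static Set<String> distinct(List<String> items) {
        int capacity = (int) Math.ceil(items.size() / 0.75);
        Set<String> seen = new HashSet<>(capacity);
        seen.addAll(items);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(distinct(Arrays.asList("a", "b", "a")).size());
    }
}
```

If the input is known to contain many duplicates, a smaller estimate is the more "reasonable" value in the sense described above.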
