Thinking logic of computer programs (41): HashSet

Source: Internet
Author: User


The previous section introduced HashMap and mentioned the Set interface: the keySet and entrySet methods of the Map interface both return a Set. In this section, we look at HashSet, an important implementation class of the Set interface.

As with HashMap, the name HashSet combines two words: Hash and Set. Set is the interface being implemented. There are several ways to implement the Set interface, each with its own characteristics; HashSet implements it by hashing.

Next, let's first look at the usage of HashSet, then look at the implementation principle, and finally summarize and analyze the characteristics of HashSet.


Set Interface

Set is a container interface that holds no duplicate elements and does not guarantee any order. It extends Collection but defines no new methods; for some of the inherited methods, however, it has its own specifications.

The complete definition of the Set interface is as follows:

public interface Set<E> extends Collection<E> {
    int size();
    boolean isEmpty();
    boolean contains(Object o);
    Iterator<E> iterator();
    Object[] toArray();
    <T> T[] toArray(T[] a);
    boolean add(E e);
    boolean remove(Object o);
    boolean containsAll(Collection<?> c);
    boolean addAll(Collection<? extends E> c);
    boolean retainAll(Collection<?> c);
    boolean removeAll(Collection<?> c);
    void clear();
    boolean equals(Object o);
    int hashCode();
}

These are the same methods that the Collection interface defines, but some of them have different requirements in Set.

Add Element

boolean add(E e);

If the Set already contains an equal element, the set is not changed and false is returned. Otherwise, the element is added and true is returned.

Batch add

boolean addAll(Collection<? extends E> c);

Duplicate elements are not added. If the set changes, true is returned. If the set does not change, false is returned.
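The return values of add and addAll described above can be checked with a small sketch. The class name SetAddDemo and the helper addEach are hypothetical names used only for this illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of the JDK
class SetAddDemo {
    // Adds each word to the set and records what add() returned for it
    static List<Boolean> addEach(Set<String> set, String... words) {
        List<Boolean> results = new ArrayList<>();
        for (String word : words) {
            results.add(set.add(word));
        }
        return results;
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();

        // The second "hello" is a duplicate, so add() returns false for it
        System.out.println(addEach(set, "hello", "world", "hello")); // [true, true, false]

        // "java" is new, so the set changes and addAll() returns true
        System.out.println(set.addAll(Arrays.asList("hello", "java"))); // true

        // Every element is already present, so nothing changes
        System.out.println(set.addAll(Arrays.asList("hello", "java"))); // false

        System.out.println(set.size()); // 3
    }
}
```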


Iterator

Iterator<E> iterator();

Iteration does not require any particular order among the elements. HashSet iterates in no particular order, but some Set implementations, such as TreeSet, do have a defined order. We will introduce them in subsequent sections.
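As a rough illustration of the difference in iteration order, the following sketch puts the same words into a HashSet and a TreeSet. The class name IterationOrderDemo and the helper inIterationOrder are hypothetical; only the TreeSet result is predictable, so no output is claimed for the HashSet:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class, not part of the JDK
class IterationOrderDemo {
    // Copies the elements into a list in the order the set's iterator returns them
    static List<String> inIterationOrder(Set<String> set) {
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "apple", "fig");

        // HashSet: the order depends on hash codes and is not specified
        System.out.println(inIterationOrder(new HashSet<>(words)));

        // TreeSet: elements come back in their natural (alphabetical) order
        System.out.println(inIterationOrder(new TreeSet<>(words))); // [apple, fig, pear]
    }
}
```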


As with HashMap, HashSet has the following constructors:

public HashSet()
public HashSet(int initialCapacity)
public HashSet(int initialCapacity, float loadFactor)
public HashSet(Collection<? extends E> c)

The meanings of initialCapacity and loadFactor are the same as those in HashMap.

Using HashSet is also simple, for example:

Set<String> set = new HashSet<String>();
set.add("hello");
set.add("world");
set.addAll(Arrays.asList(new String[] {"hello", "Old Horse"}));
for (String s : set) {
    System.out.print(s + " ");
}


The output looks like this (the actual order may differ):

hello Old Horse world

"hello" is added twice, but only one copy is kept, and the output is in no particular order.

hashCode and equals

As with HashMap keys, HashSet requires that elements override the hashCode and equals methods consistently: if two objects are equal according to equals, their hashCode values must also be the same. Pay attention to this when the elements are instances of a custom class.

For example, consider a class Spec that represents a specification with two attributes, size and color:

class Spec {
    String size;
    String color;

    public Spec(String size, String color) {
        this.size = size;
        this.color = color;
    }

    @Override
    public String toString() {
        return "[size=" + size + ", color=" + color + "]";
    }
}

Let's look at a Spec Set:

Set<Spec> set = new HashSet<Spec>();
set.add(new Spec("M", "red"));
set.add(new Spec("M", "red"));
System.out.println(set);


The output is:

[[size=M, color=red], [size=M, color=red]]

The same specification is output twice, because Spec inherits equals and hashCode from Object, which compare object identity rather than field values. To avoid this, override the hashCode and equals methods in Spec. Both methods can be generated by an IDE; in Eclipse, for example, use "Source" -> "Generate hashCode() and equals()...". We will not go into details here.
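One possible way to write the two methods by hand is sketched below. It is not the code an IDE would generate; it uses java.util.Objects for brevity, and the wrapper class SpecSetDemo is a hypothetical name for this illustration:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo class wrapping a Spec variant that overrides equals and hashCode
class SpecSetDemo {
    static class Spec {
        final String size;
        final String color;

        Spec(String size, String color) {
            this.size = size;
            this.color = color;
        }

        // Two specs are equal when both size and color are equal
        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Spec)) {
                return false;
            }
            Spec other = (Spec) o;
            return Objects.equals(size, other.size) && Objects.equals(color, other.color);
        }

        // Built from the same fields as equals, so equal specs share a hash code
        @Override
        public int hashCode() {
            return Objects.hash(size, color);
        }

        @Override
        public String toString() {
            return "[size=" + size + ", color=" + color + "]";
        }
    }

    public static void main(String[] args) {
        Set<Spec> set = new HashSet<>();
        set.add(new Spec("M", "red"));
        set.add(new Spec("M", "red")); // equal to the first one, so it is not added
        System.out.println(set); // [[size=M, color=red]]
    }
}
```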

Application scenarios

HashSet has many application scenarios, for example:

  • Deduplication. If the deduplicated elements do not need to be in any particular order, HashSet is a convenient way to remove duplicates.
  • Saving special values. A Set can hold various special values. When a program processes user requests or data records, it can treat a value differently depending on whether it is in the Set, for example an IP address blacklist or whitelist.
  • Set operations. A Set makes it convenient to perform mathematical set operations such as intersection and union. These operations have practical uses. For example, in user tag calculation, each user has some tags. The intersection of two users' tags indicates their common features, and the size of the intersection divided by the size of the union can indicate their similarity.
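The tag similarity from the last item can be sketched with retainAll and addAll. The class name SetOpsDemo, the helper names, and the sample tags are all made up for this illustration, and returning 0 for two empty sets is an assumption of the sketch:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the JDK
class SetOpsDemo {
    // Elements contained in both a and b; the inputs are left unchanged
    static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Elements contained in a or b; the inputs are left unchanged
    static <T> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Size of the intersection divided by the size of the union
    static <T> double similarity(Set<T> a, Set<T> b) {
        Set<T> all = union(a, b);
        if (all.isEmpty()) {
            return 0.0; // assumption: two empty sets count as not similar
        }
        return (double) intersection(a, b).size() / all.size();
    }

    public static void main(String[] args) {
        Set<String> userA = new HashSet<>(Arrays.asList("java", "music", "hiking"));
        Set<String> userB = new HashSet<>(Arrays.asList("java", "hiking", "chess", "film"));

        // 2 common tags out of 5 distinct tags
        System.out.println(similarity(userA, userB)); // 0.4
    }
}
```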

Implementation Principle

Internal components

HashSet is implemented using HashMap internally. It has a HashMap instance variable, as shown below:

private transient HashMap<E,Object> map;

We know that a Map holds keys and values. A HashSet is equivalent to a map that has only keys, all mapped to the same fixed value. This value is defined as:

private static final Object PRESENT = new Object();

After understanding the internal components, the implementation method is easier to understand. Let's look at the code.


The HashSet constructor mainly calls the corresponding HashMap constructor, for example:

public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<>(initialCapacity, loadFactor);
}

public HashSet(int initialCapacity) {
    map = new HashMap<>(initialCapacity);
}

public HashSet() {
    map = new HashMap<>();
}

The constructor that accepts Collection parameters is slightly different. The code is:

public HashSet(Collection<? extends E> c) {
    map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
    addAll(c);
}

This is also easy to understand: c.size()/.75f is used to calculate initialCapacity, where 0.75f is the default value of loadFactor, and the result is never smaller than 16.
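To make the arithmetic concrete, this sketch evaluates the same expression for a few collection sizes. The class name CapacityDemo and the helper initialCapacityFor are hypothetical; only the expression itself comes from the constructor shown above:

```java
// Hypothetical demo class, not part of the JDK
class CapacityDemo {
    // Same expression as in the HashSet(Collection) constructor, with c.size() passed in
    static int initialCapacityFor(int size) {
        return Math.max((int) (size / .75f) + 1, 16);
    }

    public static void main(String[] args) {
        // 3 / 0.75 = 4.0, plus 1 is 5, which is below the minimum of 16
        System.out.println(initialCapacityFor(3)); // 16

        // 100 / 0.75 = 133.33..., truncated to 133, plus 1 is 134
        System.out.println(initialCapacityFor(100)); // 134
    }
}
```

With a capacity of 134 and a load factor of 0.75, the threshold 134 * 0.75 = 100.5 is just above 100, so the requested capacity is large enough to hold all 100 elements at the default load factor.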

Add Element

Let's look at the code of the add method:

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

add calls the put method of map with the element e as the key and the fixed value PRESENT as the value. If put returns null, the key did not exist before and the element was added successfully. HashMap keeps only one copy of each key, so adding an element that is already present leaves the set of keys unchanged.
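The return value of put can be observed directly on a HashMap. This is only a sketch of the same comparison; the class name PutReturnDemo and the helper addKey are hypothetical, and PRESENT here is a local stand-in for the private field inside HashSet:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the JDK
class PutReturnDemo {
    // Stand-in for the shared dummy value that HashSet stores for every key
    private static final Object PRESENT = new Object();

    // Mirrors the body of HashSet.add: true only if the key was not mapped before
    static boolean addKey(Map<String, Object> map, String key) {
        return map.put(key, PRESENT) == null;
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();

        System.out.println(addKey(map, "hello")); // true: put returned null
        System.out.println(addKey(map, "hello")); // false: put returned the old value PRESENT
        System.out.println(map.size());           // 1
    }
}
```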

Check whether the element is included


public boolean contains(Object o) {
    return map.containsKey(o);
}

Check whether the map contains the corresponding key.

Delete Element


public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
}

remove calls the remove method of map. If the returned value is PRESENT, the key existed and the deletion succeeded.
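The same kind of check works for remove. In this sketch, the class name RemoveReturnDemo and the helper removeKey are hypothetical, and PRESENT is again a local stand-in for the field inside HashSet:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the JDK
class RemoveReturnDemo {
    // Stand-in for the shared dummy value that HashSet stores for every key
    static final Object PRESENT = new Object();

    // Mirrors the body of HashSet.remove: true only if the key was mapped to PRESENT
    static boolean removeKey(Map<String, Object> map, String key) {
        return map.remove(key) == PRESENT;
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        map.put("hello", PRESENT);

        System.out.println(removeKey(map, "world")); // false: no such key, remove returned null
        System.out.println(removeKey(map, "hello")); // true: remove returned PRESENT
        System.out.println(map.isEmpty());           // true
    }
}
```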



Iterator

public Iterator<E> iterator() {
    return map.keySet().iterator();
}

It returns the iterator of the map's keySet.

HashSet Feature Analysis

HashSet implements the Set interface, which is implemented through HashMap internally. This determines that it has the following features:

  • No duplicate elements.
  • It can efficiently add and delete elements and check whether an element exists. These operations take O(1) time on average.
  • No order.

HashSet is an ideal choice if the requirements exactly match these features.


This section described the usage and implementation principles of HashSet. It implements the Set interface, holds no duplicate elements, and uses HashMap internally. This makes it a convenient and efficient way to implement features such as deduplication and set operations.

Like HashMap, HashSet has no order. To keep elements in the order they were added, you can use LinkedHashSet, a subclass of HashSet. Set also has another important implementation class, TreeSet, which keeps its elements sorted. These two classes are described in subsequent sections.

The common implementation mechanism of HashMap and HashSet is the hash table. Map and Set share another important implementation mechanism, the tree, whose implementation classes are TreeMap and TreeSet respectively. We will discuss them in the next two sections.
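As a brief preview, the following sketch uses LinkedHashSet to remove duplicates while keeping the order in which elements first appear. The class name InsertionOrderDemo and the helper dedupKeepingOrder are hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical demo class, not part of the JDK
class InsertionOrderDemo {
    // Drops duplicates; each element stays at the position of its first occurrence
    static List<String> dedupKeepingOrder(List<String> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("c", "a", "c", "b", "a");
        System.out.println(dedupKeepingOrder(items)); // [c, a, b]
    }
}
```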

