Discussion: deleting repeated elements in an array in Java

Source: Internet
Author: User

This is an old problem, but many people understand it only superficially. I am writing it up here as a starting point for discussion; corrections and criticism are welcome.

Problem: suppose I have an array that starts out empty, and the elements added to it must not repeat.

Faced with this problem, I might quickly write the code below. Here the array is an ArrayList.

    private static void testListSet() {
        List<String> arrays = new ArrayList<String>() {
            @Override
            public boolean add(String e) {
                // Linear scan: compare the new element with every existing one.
                for (String str : this) {
                    if (str.equals(e)) {
                        System.out.println("add failed !!! duplicate element");
                        return false;
                    }
                }
                // Report success once, after the scan has found no duplicate.
                System.out.println("add succeeded !!!");
                return super.add(e);
            }
        };

        arrays.add("a");
        arrays.add("b");
        arrays.add("c");
        arrays.add("b");
        for (String e : arrays)
            System.out.print(e);
    }

I am ignoring everything else here and focusing only on the check performed when an element is added (assuming, of course, that elements are added only through the add method): does an equal element already exist? If it does not, the element is added; otherwise it is rejected. This is simple to write, but it is clumsy for a large array: with 100000 elements already in the list, a single add may call equals up to 100000 times. That is the baseline.
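
One way to avoid the linear scan is to keep a HashSet next to the list and ask the set whether the element has been seen. The following is only a minimal sketch under that assumption; the names UniqueListSketch and UniqueList are hypothetical and do not come from the original code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UniqueListSketch {

    /** Hypothetical helper: an ArrayList that rejects duplicates in add(E). */
    static class UniqueList<E> extends ArrayList<E> {
        private static final long serialVersionUID = 1L;

        // Mirrors the list contents so the duplicate check is a hash lookup
        // instead of a scan over every element.
        private final Set<E> seen = new HashSet<>();

        @Override
        public boolean add(E e) {
            if (!seen.add(e)) {
                return false; // an equal element is already in the list
            }
            return super.add(e);
        }
    }

    public static void main(String[] args) {
        List<String> list = new UniqueList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        boolean added = list.add("b"); // duplicate, rejected
        System.out.println(added + " " + list); // prints "false [a, b, c]"
    }
}
```

Like the anonymous subclass above, this sketch only guards add(E). Calls such as addAll, add(int, E), set, or remove would bypass the check or leave the set out of sync, so a real implementation would have to cover those as well.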

Q: Given an array that already contains some elements, how can I delete the repeated ones?

We all know that the Java collections can be divided into two broad categories: List and Set. A List keeps its elements in order and allows duplicates, while a Set does not allow duplicates and, in the case of HashSet, makes no guarantee about order. So we can consider using this property of Set to delete duplicate elements. After all, the algorithms that already exist in the library are usually better than ones we write ourselves.
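
As a small illustration of the Set idea with plain strings, here is a hedged sketch. It uses LinkedHashSet, a standard Set that also remembers insertion order, which is a substitution on my part and not what the original code uses; the helper name dedupKeepOrder is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;

public class SetDedupSketch {

    /** Returns a new list without duplicates, keeping first-occurrence order. */
    static <T> List<T> dedupKeepOrder(List<T> input) {
        // LinkedHashSet drops duplicates and iterates in insertion order.
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("c", "a", "b", "a", "c");
        System.out.println(dedupKeepOrder(letters));       // prints [c, a, b]
        // A plain HashSet also removes the duplicates, but its iteration
        // order is not guaranteed, so only the size is printed here.
        System.out.println(new HashSet<>(letters).size()); // prints 3
    }
}
```

String already implements equals and hashCode, which is why this works without any extra code.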

    public static void removeDuplicate(List<People> list) {
        HashSet<People> set = new HashSet<People>(list);
        list.clear();
        list.addAll(set);
    }

    private static People[] ObjData = new People[]{
        new People(0, "a"), new People(1, "b"), new People(0, "a"), new People(2, "a"), new People(3, "c"),
    };

    public class People {
        private int id;
        private String name;

        public People(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public String toString() {
            return ("id = " + id + " , name " + name);
        }
    }
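
To see what this code does in practice, here is a minimal runnable sketch. It assumes a stand-in class PlainPerson (hypothetical, not from the original) that, like People above, overrides neither equals nor hashCode.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class IdentityPitfallSketch {

    /** Hypothetical stand-in for People: no equals() or hashCode() override. */
    static class PlainPerson {
        final int id;
        final String name;

        PlainPerson(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Number of elements that survive when the list is copied into a HashSet. */
    static int countAfterDedup(List<PlainPerson> list) {
        return new HashSet<>(list).size();
    }

    public static void main(String[] args) {
        PlainPerson first = new PlainPerson(0, "a");

        // Two separate objects with identical data: both are kept, because
        // the inherited Object.equals compares references, not fields.
        List<PlainPerson> twoObjects = Arrays.asList(first, new PlainPerson(0, "a"));
        System.out.println(countAfterDedup(twoObjects)); // prints 2

        // The same reference twice: this is the only "duplicate" that is removed.
        List<PlainPerson> sameReference = Arrays.asList(first, first);
        System.out.println(countAfterDedup(sameReference)); // prints 1
    }
}
```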

The removeDuplicate code above uses a custom People class. When I add two objects that hold the same data and then call removeDuplicate, the problem is not solved: both objects are still in the list. How does a HashSet decide whether two objects are the same? Open the HashSet source code and you will find that every element goes through the add method:

    @Override
    public boolean add(E object) {
        return backingMap.put(object, this) == null;
    }

Here, backingMap is the HashMap in which the HashSet keeps its data. It uses a clever trick: each added object becomes a key in the HashMap, with the HashSet object itself as the value. Because HashMap keys are unique, the HashSet never holds duplicates. Whether an element counts as a duplicate therefore depends on how HashMap decides that two keys are the same:

    @Override
    public V put(K key, V value) {
        if (key == null) {
            return putValueForNullKey(value);
        }

        int hash = secondaryHash(key.hashCode());
        HashMapEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                preModify(e);
                V oldValue = e.value;
                e.value = value;
                return oldValue;
            }
        }

        // No entry for (non-null) key is present; create one
        modCount++;
        if (size++ > threshold) {
            tab = doubleCapacity();
            index = hash & (tab.length - 1);
        }
        addNewEntry(key, value, hash, index);
        return null;
    }
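
As a rough illustration of the lookup that put performs, here is a heavily simplified sketch. TinyHashSet is a hypothetical class written for this example only: it has a fixed number of buckets, no resizing, no secondary hash, and no null support, so it is not how the real HashSet or HashMap is implemented.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyHashSetSketch {

    /** Hypothetical, simplified hash set used only to show the duplicate check. */
    static class TinyHashSet<E> {
        private final List<List<E>> buckets = new ArrayList<>();

        TinyHashSet(int bucketCount) {
            for (int i = 0; i < bucketCount; i++) {
                buckets.add(new ArrayList<>());
            }
        }

        /** Returns true if the element was added, false if an equal one was present. */
        boolean add(E element) {
            int hash = element.hashCode();
            // The hash code selects one bucket; only that bucket is searched.
            List<E> bucket = buckets.get(Math.floorMod(hash, buckets.size()));
            for (E existing : bucket) {
                // Same two-part test as in put(): equal hash AND equals() agrees.
                if (existing.hashCode() == hash && element.equals(existing)) {
                    return false;
                }
            }
            bucket.add(element);
            return true;
        }
    }

    public static void main(String[] args) {
        TinyHashSet<String> set = new TinyHashSet<>(16);
        System.out.println(set.add("a"));             // prints true
        System.out.println(set.add("b"));             // prints true
        System.out.println(set.add(new String("a"))); // prints false
    }
}
```

The third call is rejected even though it passes a different String object, because String defines both hashCode and equals in terms of its characters.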

In general, put works like this: it walks the entries in the bucket selected by the key's hash code (in fact the hash code is first passed through a secondary hash), and for each entry it checks that the hashes are equal and then calls the key's equals method. If both conditions are met, the two keys are treated as the same element. So if the elements are of a custom type and you want to rely on the Set mechanism, you need to implement equals and hashCode yourself (the HashMap algorithm is not described in detail here; understanding the idea is enough):

    public class People {
        // Number of equals() calls. This field is not shown in the original
        // listing; it is inferred from the test program, which reads People.count.
        public static int count = 0;

        private int id;
        private String name;

        public People(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public String toString() {
            return ("id = " + id + " , name " + name);
        }

        public int getId() {
            return id;
        }

        public void setId(int id) {
            this.id = id;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        // Two People are equal when both id and name match.
        @Override
        public boolean equals(Object obj) {
            count++;
            if (!(obj instanceof People))
                return false;
            People o = (People) obj;
            return id == o.getId() && name.equals(o.getName());
        }

        // Equal objects must return the same hash code; id alone satisfies that.
        @Override
        public int hashCode() {
            return id;
        }
    }

With these two methods overridden, removeDuplicate(list) no longer leaves two identical People objects in the list.
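
For comparison, equals and hashCode can also be written with the java.util.Objects helpers, which handle a null name and combine both fields into the hash code. This is only a sketch of that alternative; the Person class below is hypothetical and is not the People class from this article.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class ValueEqualitySketch {

    /** Hypothetical value class with the same two fields as People. */
    static final class Person {
        private final int id;
        private final String name;

        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Person)) return false;
            Person other = (Person) obj;
            // Objects.equals is null-safe, unlike name.equals(...).
            return id == other.id && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            // Uses the same fields as equals, so equal objects share a hash code.
            return Objects.hash(id, name);
        }
    }

    public static void main(String[] args) {
        // Same data as ObjData in the article: (0, "a") appears twice.
        List<Person> people = Arrays.asList(
                new Person(0, "a"), new Person(1, "b"), new Person(0, "a"),
                new Person(2, "a"), new Person(3, "c"));
        Set<Person> unique = new HashSet<>(people);
        System.out.println(unique.size()); // prints 4
    }
}
```

Returning only id from hashCode, as the article does, is also valid; the choice mainly affects how evenly objects with the same id but different names are spread across buckets.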

Let's test their performance here:


    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Random;
    import java.util.Set;

    public class RemoveDeplicate {

        public static void main(String[] args) {
            //testListSet();
            //removeDuplicateWithOrder(Arrays.asList(data));
            //ArrayList<People> list = new ArrayList<People>(Arrays.asList(ObjData));
            //removeDuplicate(list);

            People[] data = createObjectArray(10000);
            ArrayList<People> list = new ArrayList<People>(Arrays.asList(data));

            // Time the HashSet-based approach.
            long startTime1 = System.currentTimeMillis();
            System.out.println("set start time --> " + startTime1);
            removeDuplicate(list);
            long endTime1 = System.currentTimeMillis();
            System.out.println("set end time --> " + endTime1);
            System.out.println("set total time --> " + (endTime1 - startTime1));
            System.out.println("count : " + People.count);
            People.count = 0;

            // Time the nested-loop approach on the same data.
            long startTime = System.currentTimeMillis();
            System.out.println("Efficient start time --> " + startTime);
            EfficientRemoveDup(data);
            long endTime = System.currentTimeMillis();
            System.out.println("Efficient end time --> " + endTime);
            System.out.println("Efficient total time --> " + (endTime - startTime));
            System.out.println("count : " + People.count);
        }
        public static void removeDuplicate(List<People> list) {
            HashSet<People> set = new HashSet<People>(list);
            list.clear();
            list.addAll(set);
        }

        public static void removeDuplicateWithOrder(List<String> arlList) {
            Set<String> set = new HashSet<String>();
            List<String> newList = new ArrayList<String>();
            for (Iterator<String> iter = arlList.iterator(); iter.hasNext();) {
                String element = iter.next();
                // set.add returns false for an element that was already seen
                if (set.add(element))
                    newList.add(element);
            }
            arlList.clear();
            arlList.addAll(newList);
        }


        @SuppressWarnings("serial")
        private static void testListSet() {
            List<String> arrays = new ArrayList<String>() {
                @Override
                public boolean add(String e) {
                    // Linear scan: compare the new element with every existing one.
                    for (String str : this) {
                        if (str.equals(e)) {
                            System.out.println("add failed !!! duplicate element");
                            return false;
                        }
                    }
                    // Report success once, after the scan has found no duplicate.
                    System.out.println("add succeeded !!!");
                    return super.add(e);
                }
            };

            arrays.add("a");
            arrays.add("b");
            arrays.add("c");
            arrays.add("b");
            for (String e : arrays)
                System.out.print(e);
        }

        private static void EfficientRemoveDup(People[] peoples) {
            // number of equals() comparisons performed
            int count = 0;
            // new temporary array to hold non-duplicate data
            People[] newArray = new People[peoples.length];
            // current index in the new array (also the number of non-dup elements)
            int currentIndex = 0;

            // loop through the original array...
            for (int i = 0; i < peoples.length; ++i) {
                // contains => true iff newArray contains peoples[i]
                boolean contains = false;

                // search through newArray to see if it contains an element equal
                // to peoples[i]; note that j <= currentIndex also visits the first
                // empty (null) slot, which costs one extra equals call whenever
                // no duplicate is found
                for (int j = 0; j <= currentIndex; ++j) {
                    count++;
                    // if the same element is found, don't add it to the new array
                    if (peoples[i].equals(newArray[j])) {
                        contains = true;
                        break;
                    }
                }

                // if we didn't find a duplicate, add the new element to the new array
                if (!contains) {
                    // note: you may want to use a copy constructor, or a .clone()
                    // here if the situation warrants more than a shallow copy
                    newArray[currentIndex] = peoples[i];
                    ++currentIndex;
                }
            }

            System.out.println("efficient medthod inner count : " + count);
        }

        // Builds test data: 'length' People with random ids in [0, 10000) and the same name.
        private static People[] createObjectArray(int length) {
            int num = length;
            People[] data = new People[num];
            Random random = new Random();
            for (int i = 0; i < num; i++) {
                int id = random.nextInt(10000);
                System.out.print(id + " ");
                data[i] = new People(id, "i am a man");
            }
            return data;
        }
    }

Test results:

set end time -->  1326443326724
set total time --> 26
count : 3653
Efficient start time --> 1326443326729
efficient medthod inner count : 28463252
Efficient end time --> 1326443327107
Efficient total time --> 378
count : 28463252

In this run the HashSet approach finished in 26 ms with 3653 calls to equals, while the nested-loop method took 378 ms and made 28463252 calls to equals.
