Problem solving and programming practices for finding the different elements in two lists

Zheng Haibo 2013-07-08

Problem:

Given two lists, List<String> list1 and List<String> list2, each holding tens of thousands of elements, how can we find the elements that differ between them, that is, the elements that appear in only one of the two lists?

Problem Analysis:

Since each list holds tens of thousands of elements, a simple nested traversal needs on the order of 10000*10000 comparisons in the worst case, which is clearly very inefficient. Is there a better approach? After some thought I came up with two methods; comments and corrections are welcome.
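For reference, the plain traversal mentioned above might look like the following minimal sketch. The class name NaiveDiff and the method diff are hypothetical and do not appear in the original code; the sketch only illustrates why the cost grows with the product of the two list lengths.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class, not part of the original article.
class NaiveDiff {
    // Returns the elements that appear in only one of the two lists.
    // List.contains scans the other list linearly, so the total work is
    // roughly list1.size() * list2.size() comparisons in the worst case.
    static List<String> diff(List<String> list1, List<String> list2) {
        List<String> result = new ArrayList<>();
        for (String s : list1) {
            if (!list2.contains(s)) { // linear scan of list2
                result.add(s);
            }
        }
        for (String s : list2) {
            if (!list1.contains(s)) { // linear scan of list1
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("a", "b", "c");
        List<String> b = Arrays.asList("b", "c", "d");
        System.out.println(diff(a, b)); // prints [a, d]
    }
}
```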

Method One: An improved traversal algorithm

Idea: For each element of list1, search for it in list2. If it is not found, add the element to listDiff; if it is found, remove the matching element from list2. Whatever remains in list2 at the end appears only in list2 and is also added to listDiff. Because list2 shrinks as matches are removed, later searches get cheaper, so the more elements the two lists share, the faster the improved algorithm runs. If the lists share few elements relative to their length, however, its time complexity is close to that of the plain traversal, and it becomes too slow to be practical.

Method Two: Using the fact that a map has no duplicate keys

Idea: First copy every element of list1 into a Map<String, Integer> with the value 1. Then look up each element of list2 in the map: if the string is already there, increase its value by 1 (the value counts the occurrences of the string); if not, put it into the map with the value 1. In the end, the strings whose value is still 1 are the elements that differ between the two lists.
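To make the counting idea concrete, here is a small worked sketch on two three-element lists. The class name CountMapDemo is hypothetical, and Map.merge (available since Java 8) is used here only as shorthand for the "add 1 or insert 1" step. The sketch assumes, as the idea above implicitly does, that neither list contains duplicates of its own; a string that occurs twice in list2 and never in list1 would end with a count of 2 and be missed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class, not part of the original article.
class CountMapDemo {
    static List<String> diff(List<String> list1, List<String> list2) {
        // LinkedHashMap keeps insertion order so the output is predictable.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String s : list1) {
            counts.put(s, 1);                  // every element of list1 starts at 1
        }
        for (String s : list2) {
            counts.merge(s, 1, Integer::sum);  // add 1 if present, else insert 1
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() == 1) {           // seen in only one of the lists
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("a", "b", "c");
        List<String> b = Arrays.asList("b", "c", "d");
        // The counts end up as {a=1, b=2, c=2, d=1}.
        System.out.println(diff(a, b)); // prints [a, d]
    }
}
```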

The following code is implemented in Java; the same approach can also be tried with the C++ STL.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    /*
     * @author: Zhenghaibo
     * 2013-07-08 Nanjing, conris, China
     */
    public class TestMian {
        private static final int listLen = 10000;    // length of each list
        private static final Integer flagUnique = 1; // count of a string that occurs only once
        public List<String> list1 = new ArrayList<String>();
        public List<String> list2 = new ArrayList<String>();

        public static void main(String[] args) {
            TestMian mTest = new TestMian();
            mTest.initList();
            List<String> listDiff1 = mTest.getDiffElementUseEach(mTest.list1, mTest.list2); // Method 1
            mTest.initList(); // Method 1 removes elements from list2, so rebuild both lists
            List<String> listDiff2 = mTest.getDiffElementUseMap(mTest.list1, mTest.list2); // Method 2
            System.out.println("The number of the diff element is:" + listDiff1.size());
            System.out.println("The number of the diff element is:" + listDiff2.size());
            // mTest.printList(listDiff1);
            // mTest.printList(listDiff2);
        }

        // Fills both lists so that they are guaranteed to share some elements
        public void initList() {
            list1.clear();
            list2.clear();
            for (int i = 0; i < listLen; i++) {
                list1.add("conris_list_of" + i + "test");
                list2.add("conris_list_of" + 3 * i + "test");
            }
        }

        // Method 1: find the different elements by searching and deleting
        public List<String> getDiffElementUseEach(List<String> list1, List<String> list2) {
            System.out.println("-----------------------Method 1----------------------");
            long runtime = System.nanoTime(); // start timing
            List<String> diffList = new ArrayList<String>(); // elements found in only one list
            for (String string : list1) {
                int index = list2.indexOf(string); // linear search in list2
                if (index == -1) { // not in list2, so it occurs only in list1
                    diffList.add(string);
                } else { // also in list2: delete it there so later searches are shorter
                    list2.remove(index);
                }
            }
            // The shared elements have been deleted from list2; the rest occur only in list2
            for (String string : list2) {
                diffList.add(string);
            }
            System.out.println("getDiffElementUseRemove run time:"
                    + (System.nanoTime() - runtime));
            return diffList;
        }

        // Method 2: find the different elements with a map
        public List<String> getDiffElementUseMap(List<String> list1, List<String> list2) {
            System.out.println("-----------------------Method 2----------------------");
            long runtime = System.nanoTime(); // start timing
            // A map cannot hold duplicate keys, so each string is stored once with a count
            Map<String, Integer> map = new HashMap<String, Integer>(list1.size() + list2.size());
            List<String> diffList = new ArrayList<String>(); // elements found in only one list
            for (String string : list1) {
                map.put(string, flagUnique); // first store the elements of list1 with count 1
            }
            for (String string : list2) {
                Integer count = map.get(string); // current count, or null if absent
                if (count != null) { // already in the map, so it also occurs in list1
                    map.put(string, count + 1);
                } else { // not present yet: store it with count 1
                    map.put(string, flagUnique);
                }
            }
            for (Map.Entry<String, Integer> entry : map.entrySet()) {
                // equals() compares the values; == would compare Integer references
                if (flagUnique.equals(entry.getValue())) {
                    diffList.add(entry.getKey());
                }
            }
            System.out.println("getDiffElementUseMap run time:"
                    + (System.nanoTime() - runtime));
            return diffList;
        }

        public void printList(List<String> list) {
            for (int i = 0; i < list.size(); i++) {
                System.out.println(list.get(i));
            }
        }
    }

Experimental results:

When listLen is set to 10000:

Result 1:

    -----------------------Method 1----------------------
    getDiffElementUseRemove run time:2015792051
    -----------------------Method 2----------------------
    getDiffElementUseMap run time:37966034
    The number of the diff element is:13332
    The number of the diff element is:13332

When listLen is set to 100000, Method 1 had still not produced a result after a long wait. Method 2 gave the following result:

    -----------------------Method 2----------------------
    getDiffElementUseMap run time:471017640
    The number of the diff element is:133332

When the data volume grows to 100000 (10 times larger), Method 2 still works, and its running time grows roughly linearly with the data volume: about 38 ms (37966034 ns) at 10000 elements versus about 471 ms (471017640 ns) at 100000. Method 1, by contrast, had still not produced a result after a long time.
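The reported element counts can be cross-checked by hand. With the test data used here, list1 holds the indices 0 to n-1 and list2 holds the multiples of 3 from 0 to 3(n-1), so the shared elements are the multiples of 3 below n. The following sketch (the class ExpectedDiffCount is hypothetical and only does this arithmetic) reproduces both reported counts.

```java
// Hypothetical illustration class, not part of the original article.
class ExpectedDiffCount {
    // n is the length of each list (listLen in the article's code).
    static long expected(long n) {
        long shared = (n + 2) / 3; // multiples of 3 in [0, n-1], i.e. ceil(n / 3)
        return 2 * (n - shared);   // elements unique to list1 plus elements unique to list2
    }

    public static void main(String[] args) {
        System.out.println(expected(10000));  // prints 13332
        System.out.println(expected(100000)); // prints 133332
    }
}
```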

This shows that the HashMap approach is much faster and meets the basic requirement. If you have other ideas, I would be glad to discuss them. I hope this is of help to everyone.
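As one further idea, and not one of the two methods measured above, the same result can be sketched with two HashSet instances. This is a different technique from the counting map: each list is turned into a set, and an element is kept if the other set does not contain it. The class name SetDiff is hypothetical. Unlike the counting map, this variant also tolerates duplicates inside a single list, and it should show similar roughly linear behaviour, although it has not been timed here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class, not part of the original article.
class SetDiff {
    // Returns the elements that occur in exactly one of the two lists.
    static Set<String> diff(List<String> list1, List<String> list2) {
        Set<String> set1 = new HashSet<>(list1);
        Set<String> set2 = new HashSet<>(list2);
        Set<String> result = new TreeSet<>(); // sorted only to make the output predictable
        for (String s : set1) {
            if (!set2.contains(s)) { // hash lookup instead of a linear scan
                result.add(s);
            }
        }
        for (String s : set2) {
            if (!set1.contains(s)) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("a", "b", "b", "c");
        List<String> b = Arrays.asList("c", "d", "d");
        System.out.println(diff(a, b)); // prints [a, b, d]
    }
}
```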

PS: An implementation with the C++ STL might run even faster. I will try it when I have time.
