Python list deduplication methods you should know

Last Update:2017-01-23 Source: Internet

Author: User

Tags python list

Preface

List deduplication is a common problem when writing Python scripts. No matter where the source data comes from, once we convert it into a list the result may not be what we finally want; the most common issue is that elements in the list are repeated. In that case, the first thing we need to do is remove the duplicates.

Let's start with the simplest method, which uses Python's built-in set data type.

Assume that our list data is as follows:

level_names = [ u'Second Level', u'Second Level', u'Second Level', u'First Level', u'First Level']

Because the elements of a set cannot be repeated, duplicate elements are removed automatically when the list is converted to a set. This is the basic principle. The code is as follows:

>>> the_list = set(level_names)
>>> print(the_list)
set([u'Second Level', u'First Level'])
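As a minimal sketch of the full round trip, the set can be converted back into a list. The name unique_levels is introduced here only for illustration, and sorted() is used just to make the printed result stable, since the order of elements coming out of a set is not guaranteed.

```python
level_names = [u'Second Level', u'Second Level', u'Second Level',
               u'First Level', u'First Level']

# Convert to a set to drop duplicates, then back to a list.
# The order of the resulting list is not guaranteed.
unique_levels = list(set(level_names))

print(len(unique_levels))     # 2
print(sorted(unique_levels))  # ['First Level', 'Second Level']
```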

The disadvantage of this method is that the original list order is not preserved when you convert the set back to a list. If you do not need the order, this is the simplest answer. Some readers may think this is too easy to involve any real skill. That's right, which is why interview questions about list deduplication are usually phrased like this:

Please write a list deduplication method (set may not be used)

The question says set cannot be used, so sometimes this trick is not available. That does not stop us, of course; there are other methods.

We all know that a list can be traversed, and once we can traverse it the problem becomes easy. Define an empty list, traverse the list that holds the data, and add a check inside the loop: if the element is not yet in the new list, append it. The code is as follows:

the_list = []
for level in level_names:
    if level not in the_list:
        the_list.append(level)
print(the_list)

Is this method acceptable? It is fine for small lists. For a very large list, however, it performs poorly: as the_list grows, every membership check gets slower, because a list is searched element by element from the start, so each check takes time proportional to the current length of the_list.
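The cost of the list-only loop can be made visible by counting equality checks instead of measuring time. The sketch below is only an illustration: CountedValue, dedupe_with_list, and count_comparisons are hypothetical helpers introduced here, not part of the original article. With n distinct items, the loop performs n * (n - 1) / 2 comparisons.

```python
class CountedValue:
    """Hypothetical helper: wraps a value and counts equality checks."""
    comparisons = 0  # counter shared by all instances

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        CountedValue.comparisons += 1
        return self.value == other.value


def dedupe_with_list(items):
    """List-only deduplication, same logic as the loop above."""
    result = []
    for item in items:
        if item not in result:  # linear scan of result
            result.append(item)
    return result


def count_comparisons(n):
    """Return the number of equality checks needed for n distinct items."""
    CountedValue.comparisons = 0
    dedupe_with_list([CountedValue(i) for i in range(n)])
    return CountedValue.comparisons


print(count_comparisons(100))   # 4950
print(count_comparisons(1000))  # 499500
```

Ten times more data needs roughly a hundred times more comparisons, which is why this approach degrades on large lists.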

You may ask what to do with a large list, and whether there is a better way. There is. Since checking membership in a list hurts efficiency, we switch to another idea: use a set for the check. Is a set really faster? Yes, because a set locates a value through its hash. Although a set is unordered, each element's position is determined by its hash, so checking whether a specific element exists takes roughly a single lookup, regardless of how many elements the set holds. One comparison of list and set membership tests reported online found that, on the same data, the list version took 16 minutes and the set version took 52 seconds. The code is as follows:

the_list = []
the_set = set()
for level in level_names:
    if level not in the_set:
        the_set.add(level)
        the_list.append(level)
print(the_list)
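The same logic can be wrapped in a reusable function. This is a minimal sketch: the name dedupe_keep_order is chosen here for illustration, and it assumes every element is hashable. The last lines show a different idiom that is not from the original article: on Python 3.7 and later, dict preserves insertion order, so dict.fromkeys gives the same result.

```python
def dedupe_keep_order(items):
    """Return a new list without duplicates, keeping first occurrences."""
    seen = set()   # fast membership checks
    result = []    # preserves the original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


level_names = [u'Second Level', u'Second Level', u'Second Level',
               u'First Level', u'First Level']

print(dedupe_keep_order(level_names))
# ['Second Level', 'First Level']

# Alternative idiom (Python 3.7+): dict keys keep insertion order.
print(list(dict.fromkeys(level_names)))
# ['Second Level', 'First Level']
```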

Summary

That is all for this article. I hope it helps you in your study or work. If you have any questions, please leave a message.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
