LINQ Distinct Is Not Enough

Last Update:2015-08-30 Source: Internet

Author: User


Background: in practice I ran into a case where a collection had to be deduplicated. The collection holds a reference type, and duplicates must be identified by ID. LINQ's parameterless Distinct is not enough here: it uses the default equality comparer, which, for a reference type that does not override Equals and GetHashCode, compares object references. The test data is as follows:

    class Person
    {
        public int ID { get; set; }
        public string Name { get; set; }
    }

    List<Person> list = new List<Person>()
    {
        new Person() { ID = 1, Name = "name1" },
        new Person() { ID = 1, Name = "name1" },
        new Person() { ID = 2, Name = "name2" },
        new Person() { ID = 3, Name = "name3" }
    };
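To make the problem concrete, the following minimal sketch runs the parameterless Distinct over the test data. The class name DefaultDistinctDemo and the helper CountAfterDistinct are illustrative names, not part of the original article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
}

static class DefaultDistinctDemo
{
    // Hypothetical helper: counts the items left after a plain Distinct().
    public static int CountAfterDistinct()
    {
        var list = new List<Person>
        {
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 2, Name = "name2" },
            new Person { ID = 3, Name = "name3" }
        };

        // Person does not override Equals/GetHashCode, so the default
        // comparer falls back to reference equality and nothing is removed.
        return list.Distinct().Count();
    }

    static void Main()
    {
        Console.WriteLine(CountAfterDistinct()); // prints 4
    }
}
```

All four objects survive, even though two of them carry the same ID and Name.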

We need to deduplicate by Person ID. This can be done without Distinct: use GroupBy to group by ID and take the first item of each group. For example:

list.GroupBy(x => x.ID).Select(x => x.FirstOrDefault()).ToList()

GroupBy does the job, and in-memory operations like this are fast. Still, this article explores other implementations to find the best one.

1. IEqualityComparer Interface

The Distinct extension method of IEnumerable<T> is defined as follows:

    public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source);
    public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer);

Distinct has an overload that takes an IEqualityComparer<T> parameter. The interface is defined as follows:

    // Type parameter T: the type of the objects to compare.
    public interface IEqualityComparer<T>
    {
        bool Equals(T x, T y);
        int GetHashCode(T obj);
    }

By implementing this interface, we can implement our own comparator and define our own comparison rules.

There is one problem. The T of IEqualityComparer<T> is the type of the objects being compared, here Person. How does the comparer get at the ID property of Person? More generally, for an arbitrary type, how does it know which property to compare? The answer is a delegate: the caller specifies the property to compare from outside. The LINQ extension methods are designed the same way. Their parameters are delegate types, so the rules are defined by the caller and merely invoked internally. Here is the final implementation:

    // Deriving from the EqualityComparer<T> class works the same way.
    class CustomerEqualityComparer<T, V> : IEqualityComparer<T>
    {
        private IEqualityComparer<V> comparer;
        private Func<T, V> selector;

        public CustomerEqualityComparer(Func<T, V> selector)
            : this(selector, EqualityComparer<V>.Default)
        {
        }

        public CustomerEqualityComparer(Func<T, V> selector, IEqualityComparer<V> comparer)
        {
            this.comparer = comparer;
            this.selector = selector;
        }

        // Two items are equal when their selected keys are equal.
        public bool Equals(T x, T y)
        {
            return this.comparer.Equals(this.selector(x), this.selector(y));
        }

        // The hash code must come from the same key that Equals compares.
        public int GetHashCode(T obj)
        {
            return this.comparer.GetHashCode(this.selector(obj));
        }
    }
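As a rough illustration of how such a comparer plugs into the two-argument Distinct overload, here is a self-contained sketch. It repeats a compact copy of the comparer so that it compiles on its own; ComparerDemo and CountDistinctById are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Compact copy of the key-based comparer described in the text.
class CustomerEqualityComparer<T, V> : IEqualityComparer<T>
{
    private readonly IEqualityComparer<V> comparer;
    private readonly Func<T, V> selector;

    public CustomerEqualityComparer(Func<T, V> selector)
        : this(selector, EqualityComparer<V>.Default) { }

    public CustomerEqualityComparer(Func<T, V> selector, IEqualityComparer<V> comparer)
    {
        this.comparer = comparer;
        this.selector = selector;
    }

    public bool Equals(T x, T y) { return comparer.Equals(selector(x), selector(y)); }
    public int GetHashCode(T obj) { return comparer.GetHashCode(selector(obj)); }
}

static class ComparerDemo
{
    // Hypothetical helper: deduplicates by ID and returns the remaining count.
    public static int CountDistinctById(IEnumerable<Person> people)
    {
        var byId = new CustomerEqualityComparer<Person, int>(p => p.ID);
        return people.Distinct(byId).Count();
    }

    static void Main()
    {
        var list = new List<Person>
        {
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 2, Name = "name2" },
            new Person { ID = 3, Name = "name3" }
        };
        Console.WriteLine(CountDistinctById(list)); // prints 3
    }
}
```

The call site still has to construct the comparer explicitly.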

(Supplement 1) I did not post the extension method originally, and some readers asked about case-insensitive string comparison (the two constructors above exist precisely to handle that). The extension methods can be written as follows:

    static class EnumerableExtention
    {
        public static IEnumerable<TSource> Distinct<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector)
        {
            return source.Distinct(new CustomerEqualityComparer<TSource, TKey>(selector));
        }

        // In C# 4.0 and later the comparer can be made an optional parameter,
        // which merges the two Distinct overloads into one. An optional
        // parameter's default must be a compile-time constant, so it would
        // default to null and fall back to EqualityComparer<TKey>.Default.
        public static IEnumerable<TSource> Distinct<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector, IEqualityComparer<TKey> comparer)
        {
            return source.Distinct(new CustomerEqualityComparer<TSource, TKey>(selector, comparer));
        }
    }

For example, to deduplicate by Person Name while ignoring case, you can write:

    list.Distinct(x => x.Name, StringComparer.CurrentCultureIgnoreCase).ToList(); // StringComparer implements IEqualityComparer<string>
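The comment in the extension class suggests merging the two overloads with an optional parameter. One way this might look is sketched below, assuming a null default for the comparer and a fallback to EqualityComparer<TKey>.Default inside the method. The Program class and its Sample helper are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Compact copy of the key-based comparer so the sketch compiles on its own.
class CustomerEqualityComparer<T, V> : IEqualityComparer<T>
{
    private readonly Func<T, V> selector;
    private readonly IEqualityComparer<V> comparer;

    public CustomerEqualityComparer(Func<T, V> selector, IEqualityComparer<V> comparer)
    {
        this.selector = selector;
        this.comparer = comparer;
    }

    public bool Equals(T x, T y) { return comparer.Equals(selector(x), selector(y)); }
    public int GetHashCode(T obj) { return comparer.GetHashCode(selector(obj)); }
}

static class EnumerableExtention
{
    // Assumption: null stands in for "use the default comparer", because
    // EqualityComparer<TKey>.Default is not a compile-time constant.
    public static IEnumerable<TSource> Distinct<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> selector,
        IEqualityComparer<TKey> comparer = null)
    {
        return source.Distinct(new CustomerEqualityComparer<TSource, TKey>(
            selector, comparer ?? EqualityComparer<TKey>.Default));
    }
}

static class Program
{
    // Hypothetical sample data: two entries differ only in the case of Name.
    public static List<Person> Sample()
    {
        return new List<Person>
        {
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 4, Name = "NAME1" },
            new Person { ID = 2, Name = "name2" },
            new Person { ID = 3, Name = "name3" }
        };
    }

    static void Main()
    {
        var people = Sample();
        Console.WriteLine(people.Distinct(p => p.ID).Count());   // prints 4
        Console.WriteLine(people.Distinct(p => p.Name).Count()); // prints 4
        Console.WriteLine(people.Distinct(p => p.Name, StringComparer.OrdinalIgnoreCase).Count()); // prints 3
    }
}
```

Only the case-insensitive call treats "name1" and "NAME1" as the same key.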

2. Use a hash table

The drawback of the first approach is that it requires not only new extension methods but also a new class. Can it be done with a single extension method? Yes, by using a Dictionary (or a HashSet where one is available). The implementation is as follows:

    public static IEnumerable<TSource> Distinct<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        // Keep the first item seen for each key.
        Dictionary<TKey, TSource> dic = new Dictionary<TKey, TSource>();
        foreach (var s in source)
        {
            TKey key = selector(s);
            if (!dic.ContainsKey(key))
                dic.Add(key, s);
        }
        return dic.Select(x => x.Value);
    }
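The text mentions HashSet as an alternative to Dictionary. A minimal sketch of that variant follows. It is not the author's code: it tracks only the keys already seen and yields items lazily in source order. The names HashSetDistinctDemo and DistinctIds are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
}

static class EnumerableExtensions
{
    public static IEnumerable<TSource> Distinct<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        var seen = new HashSet<TKey>();
        foreach (var item in source)
        {
            // HashSet<T>.Add returns false when the key is already present,
            // so only the first item for each key is yielded.
            if (seen.Add(selector(item)))
                yield return item;
        }
    }
}

static class HashSetDistinctDemo
{
    // Hypothetical helper: returns the surviving IDs as a comma-separated string.
    public static string DistinctIds()
    {
        var list = new List<Person>
        {
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 2, Name = "name2" },
            new Person { ID = 3, Name = "name3" }
        };
        return string.Join(",", list.Distinct(p => p.ID).Select(p => p.ID));
    }

    static void Main()
    {
        Console.WriteLine(DistinctIds()); // prints 1,2,3
    }
}
```

On .NET 6 and later, the built-in Enumerable.DistinctBy method covers the same use case without a custom extension.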

3. Override the object methods

Can we skip the extension method altogether? Yes. object is the base class of all types, and it has two virtual methods: Equals and GetHashCode. By default, .NET compares objects using these two methods. Does the parameterless Distinct rely on them as well? We override both methods in Person with our own comparison rules. Stepping through with breakpoints confirms that Distinct calls into both. The code is as follows:

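    class Person
    {
        public int ID { get; set; }
        public string Name { get; set; }

        public override bool Equals(object obj)
        {
            // "as" yields null for a null argument or a non-Person object,
            // so guard against it before reading ID.
            Person p = obj as Person;
            return p != null && this.ID.Equals(p.ID);
        }

        public override int GetHashCode()
        {
            return this.ID.GetHashCode();
        }
    }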
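A quick way to check that the parameterless Distinct picks up these overrides is sketched below. OverrideDemo and CountAfterDistinct are illustrative names, and the Equals shown here includes a null check so that comparing against null or a non-Person object returns false.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Person
{
    public int ID { get; set; }
    public string Name { get; set; }

    // Two Person objects are equal when their IDs match.
    public override bool Equals(object obj)
    {
        var other = obj as Person;
        return other != null && ID == other.ID;
    }

    // Must be consistent with Equals: equal objects return equal hash codes.
    public override int GetHashCode()
    {
        return ID.GetHashCode();
    }
}

static class OverrideDemo
{
    // Hypothetical helper: counts the items left after a plain Distinct().
    public static int CountAfterDistinct()
    {
        var list = new List<Person>
        {
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 2, Name = "name2" },
            new Person { ID = 3, Name = "name3" }
        };
        return list.Distinct().Count();
    }

    static void Main()
    {
        Console.WriteLine(CountAfterDistinct()); // prints 3
    }
}
```

Note that the overrides change equality for Person everywhere, including in HashSet and Dictionary keys, not just in Distinct.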

My requirement was deduplication by ID, so the third method is the most elegant fit. In other cases, the earlier approaches are more general.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
