LINQ Distinct Is Not Enough

Source: Internet
Author: User

Background: In practice I ran into a problem where a collection had to be deduplicated. The collection held objects of a reference type, and duplicates had to be identified by ID. LINQ's parameterless Distinct is not enough here: for a reference type that does not override Equals and GetHashCode, it compares references, so two separate objects with the same ID are never treated as equal. The test data is as follows:

    class Person
    {
        public int ID { get; set; }
        public string Name { get; set; }
    }

    List<Person> list = new List<Person>()
    {
        new Person() { ID = 1, Name = "name1" },
        new Person() { ID = 1, Name = "name1" },
        new Person() { ID = 2, Name = "name2" },
        new Person() { ID = 3, Name = "name3" }
    };
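The problem can be reproduced with a minimal sketch like the one below, assuming the Person class and test data above. The Demo class and the CreateList helper are illustrative names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
}

static class Demo
{
    // Hypothetical helper that builds the same test data as above.
    public static List<Person> CreateList()
    {
        return new List<Person>
        {
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 2, Name = "name2" },
            new Person { ID = 3, Name = "name3" }
        };
    }

    static void Main()
    {
        List<Person> list = CreateList();

        // Person does not override Equals/GetHashCode, so the default comparer
        // falls back to reference equality and no element is removed.
        Console.WriteLine(list.Distinct().Count()); // 4

        // int compares by value, so projecting to the ID first does deduplicate,
        // but the Person objects themselves are lost.
        Console.WriteLine(list.Select(p => p.ID).Distinct().Count()); // 3
    }
}
```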

 

We need to deduplicate by Person ID. This can be done without Distinct: use GroupBy to group the elements by ID, then take the first element of each group. For example:

list.GroupBy(x => x.ID).Select(x => x.FirstOrDefault()).ToList()

GroupBy works, and for an in-memory collection it is fast enough. Below we look at other implementations to find the most suitable one.

 

1. IEqualityComparer Interface

The Distinct extension method on IEnumerable<T> is defined as follows:

    public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source);
    public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer);

Distinct has an overload that takes an IEqualityComparer<T>. The interface is defined as follows:

    // Type parameter T: the type of the objects to compare.
    public interface IEqualityComparer<in T>
    {
        bool Equals(T x, T y);
        int GetHashCode(T obj);
    }

By implementing this interface, we can write our own comparer and define our own comparison rules.

There is one problem. The T in IEqualityComparer<T> is the type of the objects being compared, here Person. How does the comparer get at the ID property of Person? More generally, for an arbitrary type, how does it know which property to compare? The answer is a delegate: the caller passes in a delegate that selects the property to compare. This is the same design the LINQ extension methods use. Their parameters are delegates, so the rule is defined by the caller and merely invoked inside the method. Here is the final implementation:

    // Deriving from the EqualityComparer<T> class works the same way.
    class CustomerEqualityComparer<T, V> : IEqualityComparer<T>
    {
        private IEqualityComparer<V> comparer;
        private Func<T, V> selector;

        public CustomerEqualityComparer(Func<T, V> selector)
            : this(selector, EqualityComparer<V>.Default)
        {
        }

        public CustomerEqualityComparer(Func<T, V> selector, IEqualityComparer<V> comparer)
        {
            this.comparer = comparer;
            this.selector = selector;
        }

        // Two objects are equal when their selected keys are equal.
        public bool Equals(T x, T y)
        {
            return this.comparer.Equals(this.selector(x), this.selector(y));
        }

        // The hash code must come from the same key that Equals compares.
        public int GetHashCode(T obj)
        {
            return this.comparer.GetHashCode(this.selector(obj));
        }
    }
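To show how such a comparer might be passed to Distinct directly, here is a small self-contained sketch. It repeats the comparer so that it compiles on its own. The ComparerDemo class and the sample data are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Same shape as the comparer above, repeated so the sketch is self-contained.
class CustomerEqualityComparer<T, V> : IEqualityComparer<T>
{
    private readonly IEqualityComparer<V> comparer;
    private readonly Func<T, V> selector;

    public CustomerEqualityComparer(Func<T, V> selector)
        : this(selector, EqualityComparer<V>.Default) { }

    public CustomerEqualityComparer(Func<T, V> selector, IEqualityComparer<V> comparer)
    {
        this.selector = selector;
        this.comparer = comparer;
    }

    public bool Equals(T x, T y)
    {
        return comparer.Equals(selector(x), selector(y));
    }

    public int GetHashCode(T obj)
    {
        return comparer.GetHashCode(selector(obj));
    }
}

static class ComparerDemo
{
    static void Main()
    {
        var list = new List<Person>
        {
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 1, Name = "NAME1" },
            new Person { ID = 2, Name = "name2" },
            new Person { ID = 3, Name = "name3" }
        };

        // Deduplicate by ID using the default comparer for int.
        var byId = list.Distinct(new CustomerEqualityComparer<Person, int>(p => p.ID));
        Console.WriteLine(byId.Count()); // 3

        // Deduplicate by Name, ignoring case, via the second constructor.
        var byName = list.Distinct(
            new CustomerEqualityComparer<Person, string>(p => p.Name, StringComparer.OrdinalIgnoreCase));
        Console.WriteLine(byName.Count()); // 3
    }
}
```

The key type V is supplied explicitly here because the constructor call cannot infer it. Wrapping the call in a generic extension method lets the compiler infer both type arguments from the lambda.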

 

(Supplement 1) I did not post the extension method originally, and some readers asked about case-insensitive string comparison (the second constructor above exists to handle exactly that). The extension methods can be written as follows:

    static class EnumerableExtention
    {
        public static IEnumerable<TSource> Distinct<TSource, TKey>(
            this IEnumerable<TSource> source, Func<TSource, TKey> selector)
        {
            return source.Distinct(new CustomerEqualityComparer<TSource, TKey>(selector));
        }

        // With C# 4.0 or later the two overloads can be merged into one by making
        // comparer an optional parameter. EqualityComparer<TKey>.Default is not a
        // compile-time constant, so the default value has to be null, with a
        // fallback to EqualityComparer<TKey>.Default inside the method.
        public static IEnumerable<TSource> Distinct<TSource, TKey>(
            this IEnumerable<TSource> source, Func<TSource, TKey> selector, IEqualityComparer<TKey> comparer)
        {
            return source.Distinct(new CustomerEqualityComparer<TSource, TKey>(selector, comparer));
        }
    }

For example, to deduplicate by Person Name while ignoring case, you can write:

list.Distinct(x => x.Name, StringComparer.CurrentCultureIgnoreCase).ToList(); // StringComparer implements IEqualityComparer<string>

 

2. Use a hash table

The first approach has a drawback: besides the new extension methods, it also requires a new class. Can it be done with an extension method alone? Yes, with a Dictionary (or a HashSet where one is available). The implementation is as follows:

    public static IEnumerable<TSource> Distinct<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        // Keeps the first element seen for each key. Note that a Dictionary
        // key cannot be null, so selector must not return null.
        Dictionary<TKey, TSource> dic = new Dictionary<TKey, TSource>();
        foreach (var s in source)
        {
            TKey key = selector(s);
            if (!dic.ContainsKey(key))
                dic.Add(key, s);
        }
        return dic.Select(x => x.Value);
    }
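Where HashSet is available, the same idea might be written as the sketch below. It is a variation rather than the original author's code. The name DistinctByKey is hypothetical and chosen to avoid clashing with Enumerable.Distinct, and the optional comparer parameter with a null default is an assumption added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class HashSetDistinctExtensions
{
    // Hypothetical extension method: yields the first element seen for each key.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> selector,
        IEqualityComparer<TKey> comparer = null)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (selector == null) throw new ArgumentNullException(nameof(selector));
        return Iterate(source, selector, comparer);
    }

    private static IEnumerable<TSource> Iterate<TSource, TKey>(
        IEnumerable<TSource> source,
        Func<TSource, TKey> selector,
        IEqualityComparer<TKey> comparer)
    {
        // Passing null makes HashSet use EqualityComparer<TKey>.Default.
        var seen = new HashSet<TKey>(comparer);
        foreach (TSource item in source)
        {
            // Add returns false when the key is already in the set.
            if (seen.Add(selector(item)))
                yield return item;
        }
    }
}

static class HashSetDemo
{
    static void Main()
    {
        var words = new List<string> { "a", "A", "b", "a" };

        // Default comparer: case matters.
        Console.WriteLine(string.Join(", ", words.DistinctByKey(w => w))); // a, A, b

        // Case-insensitive comparer: "A" is treated as a duplicate of "a".
        Console.WriteLine(string.Join(", ",
            words.DistinctByKey(w => w, StringComparer.OrdinalIgnoreCase))); // a, b
    }
}
```

Compared with the Dictionary version, this one stores only the keys, streams results lazily in source order, and tolerates a null key.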

 

3. Override the object methods

Can we skip the extension method altogether? Yes. object is the base class of every type and has two virtual methods, Equals and GetHashCode, which .NET uses by default to compare objects. Does the parameterless Distinct rely on these two methods as well? To find out, we override both in Person with our own comparison rules. Stepping through with breakpoints shows that Distinct does call both methods. The code is as follows:

    class Person
    {
        public int ID { get; set; }
        public string Name { get; set; }

        public override bool Equals(object obj)
        {
            // A null argument or an object of another type is never equal.
            Person p = obj as Person;
            return p != null && this.ID.Equals(p.ID);
        }

        public override int GetHashCode()
        {
            return this.ID.GetHashCode();
        }
    }
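The effect of the overrides can be checked with a short sketch such as the following. The OverrideDemo class and the sample data are assumptions for illustration, and the Equals shown here includes a null check so that comparing against null or another type returns false.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Person
{
    public int ID { get; set; }
    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        // A null argument or an object of another type is never equal.
        Person p = obj as Person;
        return p != null && ID == p.ID;
    }

    public override int GetHashCode()
    {
        return ID.GetHashCode();
    }
}

static class OverrideDemo
{
    static void Main()
    {
        var list = new List<Person>
        {
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 1, Name = "name1" },
            new Person { ID = 2, Name = "name2" },
            new Person { ID = 3, Name = "name3" }
        };

        // The parameterless Distinct now deduplicates by ID.
        Console.WriteLine(list.Distinct().Count()); // 3

        // The overrides apply everywhere, not only in Distinct:
        // Contains and HashSet also treat equal IDs as the same person.
        Console.WriteLine(list.Contains(new Person { ID = 2 })); // True
        Console.WriteLine(new HashSet<Person>(list).Count);      // 3
    }
}
```

Because the overrides change equality for the whole type, this approach fits best when ID really is the identity of a Person throughout the program.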

My requirement was to deduplicate by ID, so for me the third approach is the most elegant. In other cases the earlier approaches are more general, since they let the caller choose the key.
