C# equality comparison
This article describes equality comparison in C#, focusing on two questions:
- The == and != operators: when can they be used for equality comparison, when are they not appropriate, and what are the alternatives?
- When should you customize the equality comparison logic of a type?
Before describing equality comparison and how to customize it, we first need to understand the difference between value equality and reference equality.
Comparison of value types and reference types
There are two kinds of equality in C#:
- Value equality: two values are equivalent in some sense.
- Reference equality: two references point to exactly the same object.
By default,
- value types use value equality;
- reference types use reference equality.
In fact, value types can only use value equality (unless they are boxed). Here is a simple example that compares two numbers; it prints True.
int x = 5, y = 5;
Console.WriteLine(x == y);   // True
By default, reference types use reference equality. The following example prints False, because x and y refer to two different boxed objects:
object x = 5, y = 5;
Console.WriteLine(x == y);   // False
If x and y refer to the same object, the result is True:
object x = 5, y = x;
Console.WriteLine(x == y);   // True
Equality criteria
There are three standard ways for a type to express equality:
- The == and != operators
- The virtual Equals method of object
- The IEquatable<T> interface
We will describe each of them in turn.
1. The == and != operators
The subtlety with == and != arises because they are operators, and operators are resolved as static functions. When you use == or !=, C# decides at compile time which type performs the comparison, and no virtual behavior (Object.Equals) is involved. This is normally what you would expect. In the first numeric example, the compiler binds == to the int type because both x and y are declared as int.
In the second example, the compiler binds == to the object type. Because object is a class (a reference type), its == operator compares x and y by reference. The result is False because x and y refer to two different boxed int objects on the heap.
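The compile-time binding described above can be observed with strings as well. The following is a minimal sketch; the class and method names (StaticBindingDemo, OperatorOnObject, OperatorOnString) are hypothetical and chosen only for illustration. The same two string objects are compared once through parameters typed as string and once through parameters typed as object.

```csharp
using System;

public static class StaticBindingDemo
{
    // The compile-time type is object, so == is bound to reference equality.
    public static bool OperatorOnObject(object a, object b) { return a == b; }

    // The compile-time type is string, so == is bound to string's value equality.
    public static bool OperatorOnString(string a, string b) { return a == b; }

    public static void Main()
    {
        // Built at run time so that s1 and s2 are two distinct objects.
        string s1 = new string('a', 3);
        string s2 = new string('a', 3);

        Console.WriteLine(OperatorOnString(s1, s2));   // True
        Console.WriteLine(OperatorOnObject(s1, s2));   // False
    }
}
```

The objects are identical in both calls; only the declared type of the parameters differs, and that alone decides which == is used.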
2. The virtual Object.Equals method
To compare x and y correctly in the second example, we can use the virtual Equals method. System.Object defines it, so it is available on every type:
object x = 5, y = 5;
Console.WriteLine(x.Equals(y));   // True
Equals is resolved at run time, according to the actual type of the object. In the preceding example, the Equals method of Int32 is called, which applies value equality, so the result is True. For reference types, Equals performs reference equality by default. For structs, Equals performs structural value equality by calling Equals on each field.
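To illustrate the default behavior for structs, here is a small sketch. The Point struct is a hypothetical type invented for this example; it does not override Equals, so the inherited field-by-field comparison applies.

```csharp
using System;

// Hypothetical struct that relies entirely on the default Equals.
public struct Point
{
    public int X;
    public int Y;

    public Point(int x, int y) { X = x; Y = y; }
}

public static class StructEqualsDemo
{
    public static void Main()
    {
        Point p1 = new Point(1, 2);
        Point p2 = new Point(1, 2);
        Point p3 = new Point(1, 3);

        Console.WriteLine(p1.Equals(p2));   // True: all fields are equal
        Console.WriteLine(p1.Equals(p3));   // False: Y differs

        // Note: "p1 == p2" would not compile, because Point does not overload ==.
    }
}
```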
You may wonder why the designers of C# did not make == virtual, and therefore identical to Equals, to avoid the problem above. There are three reasons:
- If the first operand is null, Equals throws a NullReferenceException; a static operator does not.
- Because == is bound statically at compile time, it executes extremely quickly. Computation-heavy code can therefore perform many equality comparisons without a noticeable performance cost.
- Sometimes it is useful for == and Equals to apply different definitions of equality (described later).
In short, the complexity of the design reflects the complexity of the problem: the concept of equality covers many scenarios.
Equals is therefore suitable for comparing two objects whose types are not known in advance. The following method compares two objects of any type:
public static bool AreEqual(object obj1, object obj2)
{
    return obj1.Equals(obj2);
}
However, this method cannot handle the case where the first argument is null: it throws a NullReferenceException. We therefore need to modify it:
public static bool AreEqual(object obj1, object obj2)
{
    if (obj1 == null) return obj2 == null;
    return obj1.Equals(obj2);
}
The static object.Equals method
The object class also defines a static Equals method, which does the same job as AreEqual:
public static bool Equals(Object objA, Object objB)
{
    if (objA == objB) { return true; }
    if (objA == null || objB == null) { return false; }
    return objA.Equals(objB);
}
This gives us a null-safe way to compare objects whose types are unknown at compile time:
object x = 5, y = 5;
Console.WriteLine(object.Equals(x, y));   // -> True
x = null;
Console.WriteLine(object.Equals(x, y));   // -> False
y = null;
Console.WriteLine(object.Equals(x, y));   // -> True
Console.WriteLine(x.Equals(y));           // -> NullReferenceException, because x is null
Note that when you write a generic type, the following code does not compile unless the == or != operator is replaced with a call to object.Equals:
public class Test<T>
{
    T _value;
    public void SetValue(T newValue)
    {
        // Compile-time error: Operator '!=' cannot be applied to operands of type 'T' and 'T'.
        // It should be: if (!object.Equals(newValue, _value))
        if (newValue != _value)
            _value = newValue;
    }
}
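A version of the generic class that does compile could look like the sketch below. It uses EqualityComparer<T>.Default instead of the object.Equals call suggested in the comment; that substitution is my own choice, not something the text above prescribes. The ValueHolder<T> class and its ChangeCount property are hypothetical names added to make the effect observable.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical generic holder that only stores a value when it actually changes.
public class ValueHolder<T>
{
    private T _value;

    // Number of times the stored value was replaced.
    public int ChangeCount { get; private set; }

    public void SetValue(T newValue)
    {
        // Works for any T, including null references.
        if (!EqualityComparer<T>.Default.Equals(newValue, _value))
        {
            _value = newValue;
            ChangeCount++;
        }
    }
}

public static class ValueHolderDemo
{
    public static void Main()
    {
        var numbers = new ValueHolder<int>();
        numbers.SetValue(5);   // differs from default(int), which is 0
        numbers.SetValue(5);   // unchanged, so not counted
        numbers.SetValue(6);
        Console.WriteLine(numbers.ChangeCount);   // 2

        var text = new ValueHolder<string>();
        text.SetValue(null);   // null equals the initial null; no exception
        Console.WriteLine(text.ChangeCount);      // 0
    }
}
```

Replacing the comparison with !object.Equals(newValue, _value) gives the same results here, at the cost of boxing when T is a value type.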
The static object.ReferenceEquals method
Sometimes you need to force a reference equality comparison. For that, use object.ReferenceEquals:
internal class Widget
{
    public string UID { get; set; }

    public override bool Equals(object obj)
    {
        // "is" yields false for null, so this also covers obj == null.
        if (!(obj is Widget)) return false;
        Widget w = (Widget)obj;
        return this.UID == w.UID;
    }

    public override int GetHashCode()
    {
        // UID is null for a freshly created Widget.
        return this.UID == null ? 0 : this.UID.GetHashCode();
    }

    // object.Equals handles null operands before calling the virtual Equals.
    public static bool operator ==(Widget w1, Widget w2) { return object.Equals(w1, w2); }
    public static bool operator !=(Widget w1, Widget w2) { return !object.Equals(w1, w2); }
}

static void Main(string[] args)
{
    Widget w1 = new Widget();
    Widget w2 = new Widget();
    Console.WriteLine(w1 == w2);                         // -> True
    Console.WriteLine(w1.Equals(w2));                    // -> True
    Console.WriteLine(object.ReferenceEquals(w1, w2));   // -> False
    Console.ReadLine();
}
ReferenceEquals is needed here because the Widget class overrides the virtual Equals method of object and also overloads the == and != operators, so == returns True as well. Calling ReferenceEquals guarantees a reference equality comparison.
3. The IEquatable<T> interface
A consequence of calling object.Equals is that value types are boxed, which is undesirable in performance-sensitive code. Since C# 2.0, this has been addressed by the IEquatable<T> interface:
public interface IEquatable<T>
{
    bool Equals(T other);
}
When a type implements IEquatable<T>, calling the interface method gives the same result as calling the virtual Equals method of object, but faster, because no boxing or cast is required. Most basic .NET types implement IEquatable<T>. You can also use IEquatable<T> as a constraint on a generic type:
internal class Test<T> where T : IEquatable<T>
{
    public bool IsEqual(T t1, T t2)
    {
        return t1.Equals(t2);
    }
}
If we remove the IEquatable<T> constraint, the Test<T> class still compiles, but t1.Equals(t2) then binds to the slower object.Equals method.
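As an illustration, a custom struct can implement IEquatable<T> and then be used with such a constrained generic method. This is only a sketch: the Meters struct and the EquatableDemo class are hypothetical names invented for the example.

```csharp
using System;

// Hypothetical value type that offers a strongly typed Equals.
public struct Meters : IEquatable<Meters>
{
    public readonly double Value;

    public Meters(double value) { Value = value; }

    // Strongly typed comparison: the argument is not boxed.
    public bool Equals(Meters other) { return Value.Equals(other.Value); }

    // Keep the virtual Equals and GetHashCode consistent with the typed version.
    public override bool Equals(object obj) { return obj is Meters && Equals((Meters)obj); }
    public override int GetHashCode() { return Value.GetHashCode(); }
}

public static class EquatableDemo
{
    // The constraint lets the compiler bind to IEquatable<T>.Equals(T).
    public static bool IsEqual<T>(T t1, T t2) where T : IEquatable<T>
    {
        return t1.Equals(t2);
    }

    public static void Main()
    {
        Console.WriteLine(IsEqual(new Meters(1.5), new Meters(1.5)));   // True
        Console.WriteLine(IsEqual(3, 4));                               // False
    }
}
```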
4. When Equals and == give different results
As mentioned earlier, it is sometimes useful for == and Equals to apply different definitions of equality. For example:
double x = double.NaN;
Console.WriteLine(x == x);        // False
Console.WriteLine(x.Equals(x));   // True
This is because the == operator of double enforces that NaN is never equal to anything, not even another NaN. This is the most natural behavior from a mathematical point of view. The Equals method, however, must be reflexive: x.Equals(x) must always return True.
Collections and dictionaries rely on Equals behaving this way; otherwise they could not find an element that they had previously stored.
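The effect on collections can be seen with a HashSet<double>. The sketch below uses hypothetical helper names (NaNLookupDemo, OperatorEquals, StoredValueIsFound) and assumes that the set uses its default comparer, which relies on Equals and GetHashCode rather than on ==.

```csharp
using System;
using System.Collections.Generic;

public static class NaNLookupDemo
{
    // Comparison through the == operator of double.
    public static bool OperatorEquals(double a, double b) { return a == b; }

    // Stores a value in a set and then looks it up again.
    public static bool StoredValueIsFound(double value)
    {
        var set = new HashSet<double> { value };
        return set.Contains(value);
    }

    public static void Main()
    {
        Console.WriteLine(OperatorEquals(double.NaN, double.NaN));   // False
        Console.WriteLine(StoredValueIsFound(double.NaN));           // True
    }
}
```

If the set compared elements with ==, the NaN that was just stored could never be found again.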
For value types, it is rare for Equals and == to apply different definitions of equality. It is more common with reference types: the author of a reference type typically customizes Equals to perform value equality while leaving == to perform reference equality. The StringBuilder class behaves this way:
StringBuilder buffer1 = new StringBuilder("123");
StringBuilder buffer2 = new StringBuilder("123");
Console.WriteLine(buffer1 == buffer2);        // False
Console.WriteLine(buffer1.Equals(buffer2));   // True
Comparing custom types
To recap the default comparison behavior:
- Value types use value equality.
- Reference types use reference equality.
Further,
- The Equals method of a struct applies structural value equality by default: it compares each field.
Sometimes you need to override this behavior when writing a type. There are two cases for doing so:
- To change the meaning of equality
- To speed up equality comparisons for structs
1) Changing the meaning of equality
You need to change the meaning of equality when the default behavior of == and Equals is unnatural for your type, that is, when it is not what a consumer would expect. One example is DateTimeOffset, a struct with two private fields: a UTC DateTime and an integer offset. If you were writing this type, you would probably want equality comparisons to consider only the UTC DateTime field and to ignore the offset field. Another example is numeric types that support NaN values, such as float and double. If you were implementing these types yourself, you would want the equality comparison to handle NaN correctly.
For classes, it is often more natural to offer value equality as the default behavior. This is especially true for small classes that hold a simple piece of data, such as System.Uri or System.String.
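The DateTimeOffset behavior mentioned above can be checked directly. In this sketch the two values are arbitrary examples that denote the same instant with different offsets; EqualsExact is the framework method that additionally compares the offset.

```csharp
using System;

public static class DateTimeOffsetDemo
{
    public static void Main()
    {
        // 12:00 at UTC+2 and 10:00 at UTC are the same instant.
        var local = new DateTimeOffset(2020, 1, 1, 12, 0, 0, TimeSpan.FromHours(2));
        var utc = new DateTimeOffset(2020, 1, 1, 10, 0, 0, TimeSpan.Zero);

        Console.WriteLine(local == utc);             // True: only the UTC instant is compared
        Console.WriteLine(local.Equals(utc));        // True
        Console.WriteLine(local.EqualsExact(utc));   // False: the offsets differ
    }
}
```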
2) Speeding up equality comparisons for structs
The default structural equality comparison for structs is relatively slow. Overriding Equals can improve performance by 5%. Overloading the == operator and implementing IEquatable<T> allow equality comparisons without boxing, which can improve performance by a further 5%.
There is one more special case for customizing equality: improving the hashing algorithm of a struct so that it performs better in a hash table. This is because hashing and equality comparison are closely tied together.
3) How to override equality
In general, there are three steps:
- Override GetHashCode() and Equals()
- [Optional] Overload != and ==
- [Optional] Implement IEquatable<T>
I) Override GetHashCode
The virtual GetHashCode method of object exists primarily for the benefit of the Hashtable and Dictionary<TKey, TValue> types.
Both are hash tables: collections in which each element has a key that is used to store and retrieve it. A hash table applies a specific strategy to distribute elements efficiently based on their keys. This requires every key to have an Int32 number, known as its hash code. The hash code does not need to be unique for each key, but it should be as varied as possible for the hash table to perform well. Hash tables are considered important enough that GetHashCode is defined on the object class, so that every type can produce a hash code.
Both value types and reference types have a default implementation of GetHashCode, so you do not need to override it unless you override Equals. (Conversely, if you override GetHashCode, you will almost certainly want to override Equals as well.)
When overriding GetHashCode, follow these rules:
- Two objects for which Equals returns True must return the same hash code.
- It must not throw exceptions.
- It must return the same hash code when called repeatedly on the same object, unless the object has changed.
To get the best performance from a hash table, GetHashCode should be written to minimize the chance of two different values returning the same hash code. This is also why Equals and GetHashCode are overridden on structs: a custom implementation can be more efficient than the default hashing algorithm. The default GetHashCode implementation for structs is left to the runtime and may be based on every field of the struct.
// Char
public override int GetHashCode()
{
    return (int)m_value | ((int)m_value << 16);
}
// Int32
public override int GetHashCode()
{
    return m_value;
}
For classes, the default implementation of the GetHashCode method is based on an internal object identifier, which is unique for each object instance in the CLR.
public virtual int GetHashCode()
{
    return RuntimeHelpers.GetHashCode(this);
}
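The rules for GetHashCode listed above can be sketched for a custom struct as follows. The GridCell type is hypothetical, and combining the fields with the multiplier 397 is just one common convention, not a requirement.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable struct used as a dictionary key.
public struct GridCell : IEquatable<GridCell>
{
    public readonly int Row;
    public readonly int Column;

    public GridCell(int row, int column) { Row = row; Column = column; }

    public bool Equals(GridCell other) { return Row == other.Row && Column == other.Column; }
    public override bool Equals(object obj) { return obj is GridCell && Equals((GridCell)obj); }

    public override int GetHashCode()
    {
        // Equal cells always produce the same code; unchecked lets the
        // multiplication overflow silently instead of throwing.
        unchecked { return (Row * 397) ^ Column; }
    }
}

public static class HashCodeDemo
{
    public static void Main()
    {
        var names = new Dictionary<GridCell, string>();
        names[new GridCell(2, 3)] = "start";

        Console.WriteLine(names.ContainsKey(new GridCell(2, 3)));   // True
        Console.WriteLine(names.ContainsKey(new GridCell(3, 2)));   // False
    }
}
```

Because the fields are readonly, the hash code of a key cannot change while it is stored in the dictionary.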
II) Override Equals
The axioms for object.Equals are as follows:
- An object cannot equal null (unless it is of a nullable type).
- Equality is reflexive (an object equals itself).
- Equality is symmetric (if a equals b, then b equals a).
- Equality is transitive (if a equals b and b equals c, then a equals c).
- Equality operations are repeatable and reliable (they do not throw exceptions).
III) Overload == and !=
In addition to overriding Equals, you can optionally overload the equality and inequality operators.
For structs, this is almost always done, because otherwise the == and != operators simply cannot be used on the type: the code does not compile.
For classes, there are two ways to proceed:
- Do not overload == and !=, so that they perform reference equality.
- Overload == and != so that they are consistent with Equals.
The first approach suits most custom types, especially mutable ones. It keeps your type in line with the expectation that == and != perform reference equality on reference types, and so avoids confusing consumers. Recall the StringBuilder example above:
StringBuilder buffer1 = new StringBuilder("123");
StringBuilder buffer2 = new StringBuilder("123");
Console.WriteLine(buffer1 == buffer2);        // False, reference equality
Console.WriteLine(buffer1.Equals(buffer2));   // True, value equality
The second approach makes sense for types for which a consumer would never want reference equality. These are typically immutable types, and they include reference types such as string and System.Uri.
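For the struct case discussed in this section, overloading the operators so that they agree with Equals might look like the sketch below. The Money struct and its fields are hypothetical and exist only to show the pattern.

```csharp
using System;

// Hypothetical struct whose operators are kept consistent with Equals.
public struct Money : IEquatable<Money>
{
    public readonly long Cents;
    public readonly string Currency;

    public Money(long cents, string currency) { Cents = cents; Currency = currency; }

    public bool Equals(Money other)
    {
        return Cents == other.Cents && Currency == other.Currency;
    }

    public override bool Equals(object obj) { return obj is Money && Equals((Money)obj); }

    public override int GetHashCode()
    {
        // Currency is null for default(Money), so guard against it.
        unchecked { return (Cents.GetHashCode() * 397) ^ (Currency == null ? 0 : Currency.GetHashCode()); }
    }

    // Both operators delegate to the typed Equals, so all three always agree.
    public static bool operator ==(Money left, Money right) { return left.Equals(right); }
    public static bool operator !=(Money left, Money right) { return !left.Equals(right); }
}

public static class OperatorDemo
{
    public static void Main()
    {
        Console.WriteLine(new Money(500, "EUR") == new Money(500, "EUR"));   // True
        Console.WriteLine(new Money(500, "EUR") != new Money(500, "USD"));   // True
    }
}
```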
IV) Implement IEquatable<T>
For completeness, it is recommended to implement IEquatable<T> when you override Equals. The interface method must always return the same result as the overridden Equals method. If you have already overridden Equals, implementing IEquatable<T> takes almost no extra code: one method simply calls the other.
internal class Staff : IEquatable<Staff>
{
    public string FirstName { get; set; }

    // Implements IEquatable<Staff>
    public bool Equals(Staff other)
    {
        if (other == null) return false;
        return this.FirstName == other.FirstName;
    }

    // Overrides object.Equals by delegating to Equals(Staff);
    // "as" yields null when obj is null or is not a Staff.
    public override bool Equals(object obj)
    {
        return Equals(obj as Staff);
    }

    // Overrides object.GetHashCode; FirstName may be null.
    public override int GetHashCode()
    {
        return this.FirstName == null ? 0 : this.FirstName.GetHashCode();
    }
}
V) Pluggable equality comparers
If you want a type to use a different equality comparison in a specific scenario, you can plug in an IEqualityComparer. This is particularly useful with the collection classes (see the summary that follows).
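A quick way to see a pluggable comparer at work is to pass one to a collection. The sketch below uses the built-in StringComparer.OrdinalIgnoreCase, which implements IEqualityComparer<string>; the class and method names (PluggableComparerDemo, ContainsWith) are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class PluggableComparerDemo
{
    // Builds a set with the given comparer and checks whether it contains the value.
    public static bool ContainsWith(IEqualityComparer<string> comparer, string stored, string lookup)
    {
        var set = new HashSet<string>(comparer) { stored };
        return set.Contains(lookup);
    }

    public static void Main()
    {
        // Default comparer: ordinal, case-sensitive string equality.
        Console.WriteLine(ContainsWith(EqualityComparer<string>.Default, "Finance", "FINANCE"));   // False

        // Plugged-in comparer: same type, different notion of equality.
        Console.WriteLine(ContainsWith(StringComparer.OrdinalIgnoreCase, "Finance", "FINANCE"));   // True
    }
}
```

The string type itself is unchanged; only the comparer handed to the collection decides what counts as equal.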
Equality comparison summary
The .NET class library provides three interfaces for equality comparison: IEquatable<T>, IEqualityComparer, and IEqualityComparer<T>.
The difference between IEqualityComparer and IEqualityComparer<T> is simple. The former is non-generic: its Equals method takes arguments of type object, so a T must be converted to object first. The latter compares instances of type T directly.
So what is the difference between IEquatable<T> and IEqualityComparer<T>, and when should each be used?
1. IEquatable<T> is implemented by a type to determine whether another object of the same type equals the current instance, whereas IEqualityComparer<T> determines whether two given instances of type T are equal.
2. If there is only one way for two instances to be equal, or if there are several ways but one of them is clearly the most meaningful, choose IEquatable<T>. The type T implements the interface itself, so an instance knows how to compare itself with another instance. If, on the other hand, there are several equally valid ways to compare instances, IEqualityComparer<T> is the better fit. This interface is not implemented by T; instead, a separate class implements IEqualityComparer<T>. Because T itself does not know which comparison is wanted, you explicitly pass an IEqualityComparer<T> instance that performs the comparison required in that scenario.
3. Example:
using System;
using System.Collections.Generic;
using System.Linq;

internal class Staff : IEquatable<Staff>
{
    public string FirstName { get; set; }
    public string Title { get; set; }
    public string Dept { get; set; }

    public override string ToString()
    {
        return string.Format(
            "FirstName: {0}, Title: {1}, Dept: {2}",
            FirstName, Title, Dept);
    }

    // Implements IEquatable<Staff>
    public bool Equals(Staff other)
    {
        return this.FirstName.Equals(other.FirstName);
    }

    // Overrides Object.GetHashCode
    public override int GetHashCode()
    {
        return this.FirstName.GetHashCode();
    }
}

// Treats two staff members as equal when they have the same title.
internal class StaffTitleComparer : IEqualityComparer<Staff>
{
    public bool Equals(Staff x, Staff y)
    {
        return x.Title == y.Title;
    }

    public int GetHashCode(Staff obj)
    {
        return obj.Title.GetHashCode();
    }
}

// Treats two staff members as equal when they work in the same department.
internal class StaffDeptComparer : IEqualityComparer<Staff>
{
    public bool Equals(Staff x, Staff y)
    {
        return x.Dept == y.Dept;
    }

    public int GetHashCode(Staff obj)
    {
        return obj.Dept.GetHashCode();
    }
}

internal class Program
{
    static void Main(string[] args)
    {
        IList<Staff> staffs = new List<Staff>
        {
            new Staff { FirstName = "AAA", Title = "Manager", Dept = "Sale" },
            new Staff { FirstName = "BBB", Title = "Accountant", Dept = "Finance" },
            new Staff { FirstName = "BBB", Title = "Accountant", Dept = "Finance" },
            new Staff { FirstName = "AAA", Title = "Sales", Dept = "Sale" },
            new Staff { FirstName = "ABA", Title = "Manager", Dept = "HR" }
        };
        Print("All Staffs", staffs);
        Print("No duplicated first name", staffs.Distinct());
        Print("No duplicated title", staffs.Distinct(new StaffTitleComparer()));
        Print("No duplicated department", staffs.Distinct(new StaffDeptComparer()));
        Console.ReadLine();
    }

    private static void Print(string group, IEnumerable<Staff> staffs)
    {
        Console.WriteLine(group);
        foreach (Staff s in staffs)
            Console.WriteLine(s.ToString());
        Console.WriteLine();
    }
}
-- Update --
In the last example, you can also extend IEnumerable<T> with a DistinctBy method:
public static class IEnumerableExtension
{
    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        // Add returns false when the key has already been seen.
        HashSet<TKey> keys = new HashSet<TKey>();
        foreach (TSource element in source)
            if (keys.Add(keySelector(element)))
                yield return element;
    }
}
You can then write:
- staffs.DistinctBy(s => s). Note that the Staff class must implement IEquatable<T> (or override Equals and GetHashCode).
- staffs.DistinctBy(s => s.Dept), which removes the need to write the StaffDeptComparer class.
Furthermore, if the selected field of Staff is itself of a class type, that class must also implement IEquatable<T> (or override Equals and GetHashCode).
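The last point can be illustrated with two hypothetical key classes, one without and one with custom equality. Both class names (PlainOffice, Office) and the CountDistinct helper are assumptions made for this sketch; the helper uses a HashSet<T>, the same mechanism that DistinctBy relies on for its keys.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key class that keeps the default reference equality.
public class PlainOffice
{
    public string City { get; set; }
}

// Hypothetical key class with value equality based on City.
public class Office : IEquatable<Office>
{
    public string City { get; set; }

    public bool Equals(Office other) { return other != null && City == other.City; }
    public override bool Equals(object obj) { return Equals(obj as Office); }
    public override int GetHashCode() { return City == null ? 0 : City.GetHashCode(); }
}

public static class KeyEqualityDemo
{
    // Counts the distinct keys using the default equality of T.
    public static int CountDistinct<T>(IEnumerable<T> keys)
    {
        return new HashSet<T>(keys).Count;
    }

    public static void Main()
    {
        var plain = new[] { new PlainOffice { City = "Oslo" }, new PlainOffice { City = "Oslo" } };
        var custom = new[] { new Office { City = "Oslo" }, new Office { City = "Oslo" } };

        Console.WriteLine(CountDistinct(plain));    // 2: different references
        Console.WriteLine(CountDistinct(custom));   // 1: equal by value
    }
}
```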