Implementing Equality and Hashing in Objective-C


Implementing Equality and Hashing, by Mike Ash

Welcome back to a late edition of Friday Q&A. WWDC pushed the schedule back one week, but it's finally time for another one. This week, I'm going to discuss the implementation of equality and hashing in Cocoa, a topic suggested by Steven Degutis.


Equality
Object equality is a fundamental concept that gets used all over the place. In Cocoa, it's implemented with the isEqual: method. Something as simple as [array indexOfObject:] will use it, so it's important that your objects support it.


It's so important that Cocoa actually gives us a default implementation of it on NSObject. The default implementation just compares pointers. In other words, an object is only equal to itself, and is never equal to another object. The implementation is functionally identical to:

    - (BOOL)isEqual: (id)other
    {
        return self == other;
    }


While this may look oversimplified, it is actually good enough for a lot of objects. For example, an NSView is never considered equal to another NSView, only to itself. For NSView, and many other classes which behave that way, the default implementation is enough. That's good news, because it means that if your class has that same equality semantic, you don't have to do anything, and get the correct behavior for free.


Implementing Custom Equality
Sometimes you need a deeper implementation of equality. It's common for objects, typically what you might refer to as a "value object", to be distinct from another object but be logically equal to it. For example:

    // use mutable strings because that guarantees distinct objects
    NSMutableString *s1 = [NSMutableString stringWithString: @"Hello, world"];
    NSMutableString *s2 = [NSMutableString stringWithFormat: @"%@, %@", @"Hello", @"world"];
    BOOL equal = [s1 isEqual: s2]; // gives you YES!


Of course NSMutableString implements this for you in this case. But what if you have a custom object that you want to be able to do the same thing?

    MyClass *c1 = ...;
    MyClass *c2 = ...;
    BOOL equal = [c1 isEqual: c2];


In this case you need to implement your own version of isEqual:.


Testing for equality is fairly straightforward most of the time. Gather up the relevant properties of your class, and test them all for equality. If any of them are not equal, then return NO. Otherwise, return YES.


One subtle point with this is that the class of your object is an important property to test as well. It's perfectly valid to test a MyClass for equality with an NSString, but that comparison should never return YES (unless MyClass is a subclass of NSString, of course).


A somewhat less subtle point is to ensure that you only test properties that are actually important to equality. Things like caches that do not influence your object's externally-visible value should not be tested.


Let's say your class looks like this:

    @interface MyClass : NSObject
    {
        int _length;
        char *_data;
        NSString *_name;
        NSMutableDictionary *_cache;
    }
    @end


Your equality implementation would then look like this:

    - (BOOL)isEqual: (id)other
    {
        // note: no comparison of _cache
        return ([other isKindOfClass: [MyClass class]] &&
                [other length] == _length &&
                memcmp([other data], _data, _length) == 0 &&
                [[other name] isEqual: _name]);
    }
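For experimenting outside Cocoa, the same field-by-field comparison can be sketched in plain C, which also compiles as Objective-C. This is only an illustration under assumptions: the my_value struct and both function names are hypothetical, and data is assumed to be non-NULL. Unlike the Objective-C version above, where messaging a nil name returns NO, this sketch treats two missing names as equal.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C counterpart of MyClass; fields mirror the ivars above. */
typedef struct {
    int length;
    const char *data;   /* `length` bytes, assumed non-NULL */
    const char *name;   /* NUL-terminated, may be NULL */
    void *cache;        /* derived state, deliberately ignored below */
} my_value;

/* Two NULL names count as equal; one NULL and one non-NULL do not. */
static bool names_equal(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

/* Compare only the properties that make up the externally visible value. */
bool my_value_equal(const my_value *a, const my_value *b)
{
    if (a == b)
        return true;
    if (a == NULL || b == NULL)
        return false;
    return a->length == b->length &&
           memcmp(a->data, b->data, (size_t)a->length) == 0 &&
           names_equal(a->name, b->name);
}
```

Two values that differ only in their cache pointer compare equal here, which is the behavior the text asks for.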



Hashing

Hash tables are a commonly used data structure which are used to implement, among other things, NSDictionary and NSSet. They allow fast lookups of objects no matter how many objects you put in the container.


If you're familiar with how hash tables work, you may want to skip the next paragraph or two.


A hash table is basically a big array with special indexing. Objects are placed into an array with an index that corresponds to their hash. The hash is essentially a pseudorandom number generated from the object's properties. The idea is to make the index random enough to make it unlikely for two objects to have the same hash, but have it be fully reproducible. When an object is inserted, the hash is used to determine where it goes. When an object is looked up, its hash is used to determine where to look.
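To make the insert and lookup steps concrete, here is a minimal sketch of a fixed-size set of integers in plain C. It is not how NSSet or NSDictionary are implemented; the int_set type, the function names, the 16-slot size, and the choice of linear probing to handle collisions are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SLOTS 16u   /* a fixed size keeps the sketch short */

typedef struct {
    bool used[TABLE_SLOTS];
    uint32_t keys[TABLE_SLOTS];
} int_set;

/* Hypothetical hash: a multiplicative scramble of the key. */
uint32_t hash_u32(uint32_t key)
{
    return key * 2654435761u;
}

/* Insert: the hash picks the starting slot; on a collision, probe forward. */
bool int_set_insert(int_set *set, uint32_t key)
{
    uint32_t start = hash_u32(key) % TABLE_SLOTS;
    for (uint32_t i = 0; i < TABLE_SLOTS; i++) {
        uint32_t slot = (start + i) % TABLE_SLOTS;
        if (set->used[slot] && set->keys[slot] == key)
            return true;            /* already present */
        if (!set->used[slot]) {
            set->used[slot] = true;
            set->keys[slot] = key;
            return true;
        }
    }
    return false;                   /* table is full */
}

/* Lookup: recompute the same hash and search from the same slot. */
bool int_set_contains(const int_set *set, uint32_t key)
{
    uint32_t start = hash_u32(key) % TABLE_SLOTS;
    for (uint32_t i = 0; i < TABLE_SLOTS; i++) {
        uint32_t slot = (start + i) % TABLE_SLOTS;
        if (!set->used[slot])
            return false;           /* an empty slot ends the probe */
        if (set->keys[slot] == key)
            return true;
    }
    return false;
}
```

A zero-initialized int_set is an empty set. Because the sketch never deletes keys, stopping a lookup at the first empty slot is safe.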


In more formal terms, the hash of an object is defined such that two objects have an identical hash if they are equal. Note that the reverse is not true, and can't be: two objects can have an identical hash and not be equal. You want to try to avoid this as much as possible, because when two unequal objects have the same hash (called a collision), the hash table has to take special measures to handle this, which is slow. However, it's provably impossible to avoid it completely.
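The reason collisions cannot be avoided is a counting argument: a hash has a fixed number of possible values, while the set of possible objects is larger. The sketch below shrinks the problem to a hypothetical 8-bit hash (tiny_hash and find_collision are names invented here), so that 257 inputs are enough to force a collision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hash with only 256 possible outputs. */
uint8_t tiny_hash(uint32_t value)
{
    return (uint8_t)(value * 31u + 7u);
}

/* Scan the 257 inputs 0..256 for two different values with the same hash. */
bool find_collision(uint32_t *first, uint32_t *second)
{
    bool seen[256] = { false };
    uint32_t owner[256];

    for (uint32_t v = 0; v <= 256; v++) {
        uint8_t h = tiny_hash(v);
        if (seen[h]) {
            *first = owner[h];
            *second = v;
            return true;
        }
        seen[h] = true;
        owner[h] = v;
    }
    return false;   /* not reached: 257 inputs cannot fit in 256 outputs */
}
```

The same argument applies to NSUInteger hashes; the numbers are just much larger.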


In Cocoa, hashing is implemented with the hash method, which has this signature:

    - (NSUInteger)hash;


As with equality, NSObject gives you a default implementation that just uses your object's identity. Roughly speaking, it does this:

    - (NSUInteger)hash
    {
        return (NSUInteger)self;
    }

The actual value may differ, but the essential point is that it's based on the actual pointer value of self. And just as with equality, if object identity equality is all you need, then the default implementation will do fine for you.


Implementing Custom Hashing
Because of the semantics of hash, if you override isEqual: then you must override hash. If you don't, then you risk having two objects which are equal but which don't have the same hash. If you use these objects in a dictionary, set, or something else which uses a hash table, then hilarity will ensue.
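The mismatch is easy to reproduce in plain C. In this sketch, with hypothetical function names, text_equal plays the role of an overridden isEqual:, identity_hash plays the role of the inherited pointer-based hash, and value_hash is what a matching override could look like.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value equality: same characters, regardless of where they are stored. */
bool text_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Identity hash: derived from the address, like the NSObject default. */
size_t identity_hash(const char *s)
{
    return (size_t)(uintptr_t)s;
}

/* Value hash: derived from the characters (a simple multiply-and-add). */
size_t value_hash(const char *s)
{
    size_t h = 0;
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h;
}
```

Two separate buffers holding "Hello" are equal under text_equal but get different identity hashes, so a hash table would look for the second one in the wrong place. With value_hash, equal strings always agree.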


Because the definition of the object's hash follows equality so closely, the implementation of hash likewise closely follows the implementation of isEqual:.


An exception to this is that there's no need to include your object's class in the definition of hash. That's basically a safeguard in isEqual: to ensure the rest of the check makes sense when used with a different object. Your hash is likely to be very different from the hash of a different class simply by virtue of hashing different properties and using different math to combine them.



Generating Property Hashes
Testing properties for equality is usually straightforward, but hashing them isn't always. How you hash a property depends on what kind of object it is.


For a numeric property, the hash can simply be the numeric value.


For an object property, you can send the object the hash method, and use what it returns.


For data-like properties, you'll want to use some sort of hash algorithm to generate the hash. You can use CRC32, or even something totally overkill like MD5. Another approach, somewhat less speedy but easy to use, is to wrap the data in an NSData and ask it for its hash, essentially offloading the work onto Cocoa. In the above example, you could compute the hash of _data like so:

    [[NSData dataWithBytes: _data length: _length] hash]
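If you would rather hash the bytes directly, a small byte-oriented hash is enough. The sketch below uses 32-bit FNV-1a, which is a substitute chosen here for brevity and is not one of the algorithms named above; the function name fnv1a_32 is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a byte buffer. */
uint32_t fnv1a_32(const void *bytes, size_t length)
{
    const unsigned char *p = bytes;
    uint32_t hash = 2166136261u;        /* FNV offset basis */

    for (size_t i = 0; i < length; i++) {
        hash ^= p[i];                   /* mix in one byte */
        hash *= 16777619u;              /* multiply by the FNV prime */
    }
    return hash;
}
```

For the example class, the call would be fnv1a_32(_data, _length), with the result widened to NSUInteger.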



Combining Property Hashes

So you know how to generate a hash for each property, but how do you put them together?


The easiest way is to simply add them together, or use the bitwise xor operator. However, this can hurt your hash's uniqueness, because these operations are symmetric, meaning that the separation between different properties gets lost. As an example, consider an object which contains a first and last name, with the following hash implementation:

    - (NSUInteger)hash
    {
        return [_firstName hash] ^ [_lastName hash];
    }


Now imagine you have two objects, one for "George Frederick" and one for "Frederick George". They will hash to the same value even though they're clearly not equal. And, although hash collisions can't be avoided completely, we should try to make them harder to obtain than this!
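The effect can be checked in plain C. Here str_hash is a hypothetical stand-in for the string hash (any deterministic string hash shows the same behavior), and name_hash_xor is the xor-only combination. Besides the swapped-name collision, xor has a second weakness: when both names are the same string, the two hashes cancel and the result is always zero.

```c
#include <stddef.h>

/* Hypothetical stand-in for a string's hash method. */
size_t str_hash(const char *s)
{
    size_t h = 5381;
    while (*s != '\0')
        h = h * 33u + (unsigned char)*s++;
    return h;
}

/* The xor-only combination: the order of the two names is lost. */
size_t name_hash_xor(const char *first, const char *last)
{
    return str_hash(first) ^ str_hash(last);
}
```

With this combination, ("George", "Frederick") and ("Frederick", "George") hash identically, and ("John", "John") hashes to the same value as ("Bob", "Bob").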


How to best combine hashes is a complicated subject without any single answer. However, any asymmetric way of combining the values is a good start. I like to use a bitwise rotation in addition to the xor to combine them:

    #define NSUINT_BIT (CHAR_BIT * sizeof(NSUInteger))
    #define NSUINTROTATE(val, howmuch) ((((NSUInteger)(val)) << (howmuch)) | (((NSUInteger)(val)) >> (NSUINT_BIT - (howmuch))))

    - (NSUInteger)hash
    {
        return NSUINTROTATE([_firstName hash], NSUINT_BIT / 2) ^ [_lastName hash];
    }
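One caveat with the macro form: shifting by zero or by the full width leaves the second shift undefined in C, so it is only safe for rotation amounts strictly between 0 and NSUINT_BIT. A function version, sketched below in plain C with size_t standing in for NSUInteger, can guard against that. The names rotl and combine are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

#define SIZE_BITS (CHAR_BIT * sizeof(size_t))

/* Rotate left. The count is reduced modulo the word width, and a zero count
   returns early, so neither shift is ever by the full width. */
size_t rotl(size_t value, unsigned count)
{
    count %= SIZE_BITS;
    if (count == 0)
        return value;
    return (value << count) | (value >> (SIZE_BITS - count));
}

/* Rotate the first hash by half the word before xoring in the second. */
size_t combine(size_t first_hash, size_t second_hash)
{
    return rotl(first_hash, SIZE_BITS / 2) ^ second_hash;
}
```

Swapping the two arguments now generally changes the result, and two identical hashes no longer cancel to zero.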



Custom Hash Example

Now we can take all of the above and use it to produce a hash method for the example class. It follows the basic form of the equality method, and uses the above techniques to obtain and combine the hashes of the individual properties:

    - (NSUInteger)hash
    {
        NSUInteger dataHash = [[NSData dataWithBytes: _data length: _length] hash];
        return NSUINTROTATE(dataHash, NSUINT_BIT / 2) ^ [_name hash];
    }


If you have more properties, you can add more rotation and more xor operators, and it'll work out just the same. You'll want to adjust the amount of rotation for each property to make each one different.
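One way to get a different rotation for every property without picking each amount by hand is to fold the hashes in a loop, rotating the running value before each xor. This is a sketch of that idea in plain C, not the article's own method; combine_all, rotl, and the step of 7 bits are assumptions made here.

```c
#include <limits.h>
#include <stddef.h>

#define SIZE_BITS (CHAR_BIT * sizeof(size_t))

/* Rotate left, guarding against a shift by zero or by the full width. */
size_t rotl(size_t value, unsigned count)
{
    count %= SIZE_BITS;
    if (count == 0)
        return value;
    return (value << count) | (value >> (SIZE_BITS - count));
}

/* Fold `count` property hashes into one. The running value is rotated by 7
   bits before each xor, so the first hash ends up rotated by 7 * (count - 1)
   bits, the second by 7 * (count - 2), and the last one not at all. */
size_t combine_all(const size_t *hashes, size_t count)
{
    size_t result = 0;
    for (size_t i = 0; i < count; i++)
        result = rotl(result, 7) ^ hashes[i];
    return result;
}
```

Because each position receives a different total rotation, reordering the property hashes generally produces a different combined value.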


A Note on Subclassing
You have to be careful when subclassing a class which implements custom equality and hashing. In particular, your subclass should not expose any new properties which equality is dependent upon. If it does, then it must not compare equal with any instances of the superclass.


To see why, consider a subclass of the first/last name class which includes a birthday, and includes that as part of its equality computation. It can't include it when comparing equality with an instance of the superclass, though, so its equality method would look like this:

    - (BOOL)isEqual: (id)other
    {
        // if the superclass doesn't like it then we're not equal
        if(![super isEqual: other])
            return NO;

        // if it's not an instance of the subclass, then trust the superclass
        // it's equal there, so we consider it equal here
        if(![other isKindOfClass: [MySubClass class]])
            return YES;

        // it's an instance of the subclass, the superclass properties are equal
        // so check the added subclass property
        return [[other birthday] isEqual: _birthday];
    }


Now say you have an instance of the superclass for "John Smith", which I'll call A, and an instance of the subclass for "John Smith" with a birthday of 5/31/1982, which I'll call B. Because of the definition of equality above, A equals B, and B also equals itself, which is expected.


Now consider an instance of the subclass for "John Smith" with a birthday of 6/7/1994, which I'll call C. C is not equal to B, which is what we expect. C is equal to A, also expected. But now there's a problem. A equals both B and C, but B and C do not equal each other! This breaks the standard transitivity of the equality operator, and leads to extremely unexpected results.
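The broken transitivity can be reproduced in a few lines of plain C. The sketch flattens the two classes into one hypothetical person struct, where a NULL birthday stands for a plain superclass instance; person_equal follows the same rule as the isEqual: above, comparing birthdays only when both sides have one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattening of the two classes: `birthday` is NULL for a
   superclass instance and set for a subclass instance. */
typedef struct {
    const char *first;
    const char *last;
    const char *birthday;
} person;

/* Names must match; birthdays are compared only when both sides have one. */
bool person_equal(const person *a, const person *b)
{
    if (strcmp(a->first, b->first) != 0 || strcmp(a->last, b->last) != 0)
        return false;
    if (a->birthday == NULL || b->birthday == NULL)
        return true;
    return strcmp(a->birthday, b->birthday) == 0;
}
```

With A = {"John", "Smith", NULL}, B = {"John", "Smith", "5/31/1982"}, and C = {"John", "Smith", "6/7/1994"}, person_equal reports A equal to B and A equal to C, yet B and C unequal.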


In general this should not be a big problem. If your subclass adds properties which influence object equality, that's probably an indication of a design problem in your hierarchy anyway. Rather than working around it with weird implementations of isEqual:, consider redesigning your class hierarchy.


A Note on Dictionaries
If you want to use your object as a key in an NSDictionary, you need to implement hashing and equality, but you also need to implement -copyWithZone:. Techniques for doing that are beyond the scope of today's post, but you should be aware that you need to go a little bit further in that case.


Conclusion
Cocoa provides default implementations of equality and hashing which work for many objects, but if you want your objects to be considered equal even when they're distinct objects in memory, you have to do a bit of extra work. Fortunately, it's not difficult to do, and once you implement them, your class will work seamlessly with the Cocoa collection classes.

