Hashtables provide a useful way to maximize the performance of your application.
The hashtable (hash table) is not a new concept in computing. Hash tables were designed to speed up processing on computers that were very slow by today's standards, letting you search many data items and quickly find a particular entry. Although modern machines are thousands of times faster, hashtables are still a useful way to get the best performance out of an application.
Imagine that you have a data file containing about 1,000 records, for example the customer records of a small business, and a program that reads the records into memory for processing. Each record contains a unique five-digit customer ID number, the customer name, address, account balance, and so on. Assuming the records are not sorted by customer ID number, the only way for the program to find a particular customer record using the customer number as the "key" is to search the records sequentially. Sometimes it will find the record quickly, but sometimes it will have searched almost to the last record before finding it. With 1,000 records, finding any record requires checking an average of 500.5 ((1000 + 1)/2) records. If you need to look up data often, you need a quicker way to find a record.
One way to speed up the search is to divide the records into segments, so that instead of searching one large list you search one of several short lists. For our numeric customer ID numbers, you could build 10 lists: one list for ID numbers that begin with 0, one for ID numbers that begin with 1, and so on. To find customer ID number 38016, you only need to search the list for numbers that start with 3. With 1,000 records, the average length of each list is 100 (1,000 records divided among 10 lists), so the average number of comparisons needed to find a record drops to about 50 (see Figure 1).
Of course, this approach works well only if about one-tenth of the customer numbers start with 0, another tenth start with 1, and so on. If 90% of the customer numbers start with 0, that list will hold 900 records, and each lookup in it will need about 450 comparisons on average. Worse, 90% of the searches the program performs will be for numbers that start with 0. The average number of comparisons is therefore far higher than the simple arithmetic above suggests.
It would be better if we could distribute the records so that each list holds about the same number of entries, regardless of how the digits in the key values are distributed. We need a way to mix up the customer numbers and spread the results more evenly. For example, we could take each digit of the number, multiply it by a large number (a different one for each digit position), add the results into a total, divide that total by 10, and use the remainder as the index value. When a record is read, the program runs this hash function on the customer number to determine which list the record belongs to. When the user makes a query, the same hash function is applied to the customer number used as the "key", so that the correct list can be searched. A data structure like this is called a hash table (hashtable).
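The digit-mixing idea can be sketched in Java roughly as follows. The class name CustomerBuckets, the method bucketFor(), and the particular weights are assumptions made for illustration; the text only asks for a different large multiplier per digit position, a running total, and the remainder after dividing by 10.

```java
import java.util.ArrayList;
import java.util.List;

class CustomerBuckets {
    // Hypothetical multipliers, one per digit position.
    private static final int[] WEIGHTS = {7919, 6271, 4409, 3571, 2203};
    static final int LIST_COUNT = 10;

    // Hash function: weight each digit by its position, sum, take the remainder.
    static int bucketFor(String customerId) {
        int total = 0;
        for (int i = 0; i < customerId.length(); i++) {
            int digit = customerId.charAt(i) - '0';
            total += digit * WEIGHTS[i % WEIGHTS.length];
        }
        return total % LIST_COUNT;
    }

    public static void main(String[] args) {
        // Build the 10 lists and file each ID under its hash value.
        List<List<String>> lists = new ArrayList<>();
        for (int i = 0; i < LIST_COUNT; i++) {
            lists.add(new ArrayList<>());
        }
        String[] ids = {"38016", "00017", "00018", "00123"};
        for (String id : ids) {
            lists.get(bucketFor(id)).add(id);
        }
        // A query runs the same function, then searches only one list.
        String wanted = "38016";
        boolean found = lists.get(bucketFor(wanted)).contains(wanted);
        System.out.println(wanted + " found: " + found);
    }
}
```

Because the same function is used for storing and for searching, a lookup never has to touch the other nine lists.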
Hashtables in Java
Java contains two classes, java.util.Hashtable and java.util.HashMap, that provide a general-purpose hashtable mechanism. The two classes are very similar and provide largely the same public interface. They do have some important differences, which I'll talk about later.
Hashtable and HashMap objects let you associate a key with a value and enter the key/value pair into the table using the put() method. You can then retrieve the value by calling the get() method with the key as a parameter. The key and the value can be any objects, as long as two basic requirements are met. Note that because keys and values must be objects, primitive types must be converted to objects, for example with a wrapper class constructor such as Integer(int).
To use objects of a particular class as keys, the class must provide two methods, equals() and hashCode(). Both methods are defined in java.lang.Object, so every class inherits them, but the implementations in the Object class are generally not useful for this purpose, so you usually need to override the two methods yourself.
The equals() method compares its object to another object and returns true if the two objects represent the same information. The method also checks that the two objects belong to the same class. Object.equals() returns true only if the two references refer to the very same object, which explains why this default is usually not a good fit. In most cases you need a method that compares the objects field by field, so that different objects representing the same data are considered equal.
The hashCode() method generates an int value by running a hash function over the contents of the object. Hashtable and HashMap use this value to work out which bucket (or list) a key/value pair belongs in.
As an example, we can look at the String class, which has its own implementations of both methods. String.equals() compares two String objects character by character and returns true if the strings are the same:
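As a rough sketch of how a hash code can be turned into a list index, the snippet below masks off the sign bit and takes the remainder after dividing by the number of lists. The class BucketIndex, the method indexFor(), and the table size of 11 are hypothetical names and values chosen for the example; real implementations may mix the bits differently.

```java
class BucketIndex {
    // Maps any key to a list index in the range 0 .. listCount-1.
    static int indexFor(Object key, int listCount) {
        // hashCode() may be negative; clearing the sign bit keeps the
        // remainder usable as an array index.
        return (key.hashCode() & 0x7FFFFFFF) % listCount;
    }

    public static void main(String[] args) {
        int listCount = 11; // assumed number of lists for this demonstration
        System.out.println(indexFor("Einstein", listCount));
        System.out.println(indexFor(Integer.valueOf(38017), listCount));
        System.out.println(indexFor(Integer.valueOf(-5), listCount));
    }
}
```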
String myName = "Einstein";
// The following test is
// always true
if (myName.equals("Einstein"))
{ ...
String.hashCode() runs a hash function on a string. The numeric code of each character in the string is multiplied by a power of 31, with the power depending on the position of the character in the string. The results of these calculations are then added together to obtain a total. This process may seem complicated, but it ensures a better distribution of values. It also shows how far you can go when developing your own hashCode() method to spread the results widely, even though different strings can still occasionally produce the same hash code.
For example, suppose I want to use a hashtable to implement a book catalog, with each book's ISBN number as the search key. I can use the String class to hold the key, since it already has equals() and hashCode() methods ready (see Listing 1). We can then add key/value pairs to the Hashtable using the put() method (see Listing 2).
The put() method accepts two parameters, both of type Object. The first argument is the key and the second is the value. The put() method calls the key's hashCode() method and divides the result by the number of lists in the table, using the remainder as the index value that determines which list the record is added to. Note that keys are unique in the table: if you invoke put() with an existing key, the matching entry is modified so that it references the new value, and the old value is returned (put() returns null when the key does not already exist in the table).
To read a value from the table, we pass the search key to the get() method. It returns an Object reference that we cast to the correct type:
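The calculation can be reproduced by hand with a short loop. The class StringHashDemo and the method manualHash() are hypothetical names for this sketch; the loop multiplies the running total by 31 before adding each character, which gives earlier characters higher powers of 31.

```java
class StringHashDemo {
    // Recomputes the hash the way the String documentation describes it:
    // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], using int overflow.
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String name = "Einstein";
        System.out.println(manualHash(name) == name.hashCode());
        // Two different strings that happen to share a hash code.
        System.out.println("Aa".hashCode() + " " + "BB".hashCode());
    }
}
```

The second line of output is a reminder that a good distribution is not the same as uniqueness, which is why equals() is still needed to confirm a match inside a list.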
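The return value of put() can be observed with a small experiment. The class PutDemo and the placeholder values below are invented for the example, and the sketch uses generics, which the listings in this article do not.

```java
import java.util.Hashtable;

class PutDemo {
    // Returns what the two put() calls returned, followed by the stored value.
    static String[] putSameKeyTwice() {
        Hashtable<String, String> isbnTable = new Hashtable<>();
        String first = isbnTable.put("0-345-40946-9", "first value");
        String second = isbnTable.put("0-345-40946-9", "second value");
        return new String[] {first, second, isbnTable.get("0-345-40946-9")};
    }

    public static void main(String[] args) {
        for (String s : putSameKeyTwice()) {
            System.out.println(s);
        }
    }
}
```

The first call finds no existing entry, so it returns null; the second call replaces the value and hands back the one it displaced.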
BookRecord br =
    (BookRecord) isbnTable.get(
        "0-345-40946-9");
System.out.println(
    "Author: " + br.author
    + " Title: " + br.title);
Another useful method is remove(), which is used in almost the same way as get(). It deletes the entry from the table and returns the removed value to the calling program.
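A minimal sketch of remove() in action follows. The class RemoveDemo and the placeholder record string are assumptions for the example; the point is only what remove() and a later get() return.

```java
import java.util.Hashtable;

class RemoveDemo {
    // Returns: result of the first remove(), of get() afterwards,
    // and of a second remove() on the same key.
    static String[] removeTwice() {
        Hashtable<String, String> isbnTable = new Hashtable<>();
        isbnTable.put("0-345-40946-9", "placeholder record");
        String removed = isbnTable.remove("0-345-40946-9");
        String afterwards = isbnTable.get("0-345-40946-9");
        String removedAgain = isbnTable.remove("0-345-40946-9");
        return new String[] {removed, afterwards, removedAgain};
    }

    public static void main(String[] args) {
        for (String s : removeTwice()) {
            System.out.println(s);
        }
    }
}
```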
Your own class
If you want to use a primitive type as a key, you must create an object of the corresponding wrapper type. For example, if you want to use integer keys, you can use the constructor Integer(int) to create an object from an int. All the wrapper classes, such as Integer, Float, and Boolean, treat primitive values as objects and override the equals() and hashCode() methods, so they can be used as keys. The same is true of many other classes provided in the JDK (even the Hashtable and HashMap classes implement their own equals() and hashCode() methods), but you should check the documentation before using objects of any class as Hashtable keys. It is also worth looking at the class's source to see how equals() and hashCode() are implemented. For example, Byte, Character, Short, and Integer return the integer value they represent as the hash code. This may or may not be appropriate for your needs.
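The behavior of the wrapper classes is easy to check. The sketch below uses the valueOf() factory methods rather than the constructors, since newer Java versions deprecate constructors such as Integer(int); the class name WrapperHashCodes is hypothetical.

```java
class WrapperHashCodes {
    public static void main(String[] args) {
        // Each of these wrappers reports the value it holds as its hash code.
        System.out.println(Integer.valueOf(38016).hashCode());
        System.out.println(Short.valueOf((short) 7).hashCode());
        System.out.println(Byte.valueOf((byte) 3).hashCode());
        System.out.println(Character.valueOf('A').hashCode());
        // Two distinct wrapper objects with the same value are equal keys.
        System.out.println(Integer.valueOf(38016).equals(Integer.valueOf(38016)));
    }
}
```

With keys like these, customer numbers that differ only by a multiple of the table size land in the same list, which is one reason the default may not suit every data set.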
Using Hashtables in Java
If you want to create a Hashtable that uses objects of a class you have defined yourself as keys, you should make sure that the equals() and hashCode() methods of that class provide useful values. First look at the class you are extending to determine whether its implementations meet your needs. If not, you should override the methods.
The basic design constraint for any equals() method is that it should return true if the object passed to it belongs to the same class and its data fields are set to values that represent the same data. You should also make sure that your code returns false if a null argument is passed to the method:
public boolean equals(Object o)
{
    if ((o == null)
        || !(o instanceof MyClass))
    {
        return false;
    }
    // Now compare data fields ...
You should also keep some rules in mind when designing a hashCode() method. First, the method must return the same value for a particular object no matter how many times it is invoked (as long as the object's contents are not changed between invocations, of course; such changes should be avoided while an object is in use as a Hashtable key). Second, if two objects are equal according to your equals() method, they must also produce the same hash code. The third rule is more a guideline than a requirement: you should try to design the method so that it produces different results for different object contents. It does not matter if different objects occasionally produce exactly the same hash code. However, if the method can only return values ranging from 1 to 10, only 10 lists can ever be used, no matter how many lists the Hashtable contains.
Another factor to keep in mind when designing equals() and hashCode() is performance. Every call to put() or get() calls hashCode() to find the correct list, and when get() scans the list to find the key, it calls equals() for each element in the list. Implement these methods to run as quickly and efficiently as possible, especially if you intend to make your classes publicly available, because other users may want to use them in high-performance applications where execution speed matters.
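Putting the equals() and hashCode() rules together, a key class might look like the sketch below. CustomerKey, its two fields, and the helper lookupWithEqualKey() are hypothetical; the fields are final so the hash code cannot change while the object is in a table, and equal objects produce equal hash codes because both methods use the same fields.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class CustomerKey {
    private final int id;
    private final String region;

    CustomerKey(int id, String region) {
        this.id = id;
        this.region = Objects.requireNonNull(region);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CustomerKey)) {
            return false; // also covers o == null
        }
        CustomerKey other = (CustomerKey) o;
        return id == other.id && region.equals(other.region);
    }

    @Override
    public int hashCode() {
        // Built from exactly the fields that equals() compares.
        return 31 * id + region.hashCode();
    }

    // Stores a value under one key object and fetches it with another, equal one.
    static String lookupWithEqualKey() {
        Map<CustomerKey, String> table = new HashMap<>();
        table.put(new CustomerKey(38016, "EU"), "found");
        return table.get(new CustomerKey(38016, "EU"));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithEqualKey());
    }
}
```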
Hashtable performance
The main factor affecting hashtable efficiency is the average length of the lists in the table, because the average search time is directly related to this average length. Obviously, to reduce the average length you have to increase the number of lists in the Hashtable; if the number of lists is so large that most or all lists contain only one record, you get the best search efficiency. However, this may be going too far. If your Hashtable has far more lists than data entries, you are using memory that you do not need to, and in some cases that is unacceptable.
In our earlier example, we knew in advance how many records we had: 1,000. Knowing this, we can decide how many lists our hashtable should contain to achieve the best compromise between search speed and memory usage. In many cases, however, you do not know beforehand how many records you will have to process; the files being read may keep growing, or the number of records may change significantly from day to day.
The Hashtable and HashMap classes handle this problem by dynamically expanding the table as entries are added. Both classes have constructors that accept the initial number of lists in the table and a load factor as parameters:
public Hashtable(
    int initialCapacity,
    float loadFactor)
public HashMap(
    int initialCapacity,
    float loadFactor)
The two numbers are multiplied to calculate a threshold value. Each time a new entry is added to the hash table, the count is updated, and when the count exceeds the threshold the table is rehashed. (The number of lists increases to twice the previous number plus 1, and all entries are transferred to the correct list.) The default Hashtable constructor sets the initial capacity to 11 and the load factor to 0.75, so the threshold value is 8. When the ninth record is added to the table, the hash table is resized so that it has 23 lists, and the new threshold becomes 17 (the integer portion of 23*0.75). As you can see, the load factor is an upper limit on the average number of records per list in the hash table, which means that, by default, a Hashtable rarely has many lists that contain more than one record. Compare this with our initial example, in which we had 1,000 records distributed among 10 lists. If we use the defaults, the table will be expanded to contain more than 1,500 lists. But you can control this. If the number of lists multiplied by the load factor is greater than the number of entries you are working with, the table will never be rehashed, so we could reproduce the earlier arrangement as follows:
// Table will not rehash until it
// has 1,100 entries (10*110):
Hashtable myHashtable =
    new Hashtable(10, 110.0F);
You probably would not want to do this unless you need to save the memory used by empty lists and don't mind the extra search time, which may be the case in embedded systems. However, the technique can be useful because rehashing is very time-consuming, and this approach guarantees that it never happens.
Note that while calls to put() can make the table grow (the number of lists increases), calling remove() does not have the opposite effect. So if you have a large table and remove most of its entries, you are left with a large but mostly empty table.
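The threshold arithmetic described in this section can be simulated in a few lines. This is a model of the growth rule as the text states it (capacity times load factor, then twice the capacity plus 1), not the JDK source; the class RehashSimulation and the method capacityAfter() are hypothetical names.

```java
class RehashSimulation {
    // Returns the number of lists after adding the given number of entries.
    static int capacityAfter(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        int threshold = (int) (capacity * loadFactor);
        for (int count = 0; count < entries; count++) {
            // count = entries already stored before this one is added
            if (count >= threshold) {
                capacity = capacity * 2 + 1;
                threshold = (int) (capacity * loadFactor);
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(8, 11, 0.75f));
        System.out.println(capacityAfter(9, 11, 0.75f));
        System.out.println(capacityAfter(1000, 11, 0.75f));
        System.out.println(capacityAfter(1100, 10, 110.0f));
    }
}
```

Under this model the default settings grow through 23, 47, 95, 191, 383, and 767 lists before settling at 1,535 for 1,000 records, while a capacity of 10 with a load factor of 110 never grows at all.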
Hashtable and HashMap
There are three important differences between the Hashtable and HashMap classes. The first difference is mainly historical: Hashtable is based on the obsolete Dictionary class, while HashMap is an implementation of the Map interface introduced in Java 1.2.
Perhaps the most important difference is that the methods of Hashtable are synchronized, and the methods of HashMap are not. This means that although you can use a Hashtable in a multithreaded application without taking any special action, you must provide external synchronization for a HashMap. A convenient way to do so is the static synchronizedMap() method of the Collections class, which creates a thread-safe Map object and returns it as a wrapper. The wrapper's methods give you synchronized access to the underlying HashMap. The downside of Hashtable is that you cannot switch off its synchronization when you don't need it (for example in a single-threaded application), and synchronization adds a lot of processing overhead.
The third difference is that only HashMap lets you use null as the key or value of a table entry. Only one entry in a HashMap can have a null key, but any number of entries can have null values. This means that get() returns null both when the search key is not found in the table and when the key is found but is mapped to a null value. If necessary, use the containsKey() method to distinguish between these two situations.
Some sources suggest using Hashtable when you need synchronization and HashMap otherwise. However, because a HashMap can be synchronized when needed, offers more functionality than Hashtable, and is not based on an obsolete class, others argue that HashMap should be preferred over Hashtable in all situations.
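A small sketch of the wrapper approach: two threads fill one shared map obtained from Collections.synchronizedMap(). The class SyncMapDemo, the method fillFromTwoThreads(), and the key ranges are made up for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SyncMapDemo {
    // Two threads insert 1,000 distinct keys each; returns the final size.
    static int fillFromTwoThreads() {
        Map<Integer, String> shared =
                Collections.synchronizedMap(new HashMap<Integer, String>());
        Thread a = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                shared.put(i, "a");
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 1000; i < 2000; i++) {
                shared.put(i, "b");
            }
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads());
    }
}
```

Each individual call on the wrapper is synchronized, so no insertions are lost; the same loop against a bare HashMap would have no such guarantee.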
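The two meanings of a null result from get(), and the way Hashtable refuses null keys, can be checked with the following sketch. The class NullKeyDemo and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    // Returns: get("present") is null, get("absent") is null,
    // containsKey("present"), containsKey("absent").
    static boolean[] nullValueVersusMissing() {
        Map<String, String> map = new HashMap<>();
        map.put("present", null);            // key exists, value is null
        map.put(null, "value for null key"); // a single null key is allowed
        return new boolean[] {
            map.get("present") == null,
            map.get("absent") == null,
            map.containsKey("present"),
            map.containsKey("absent")
        };
    }

    // Hashtable throws NullPointerException for a null key.
    static boolean hashtableRejectsNullKey() {
        try {
            new Hashtable<String, String>().put(null, "x");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        for (boolean b : nullValueVersusMissing()) {
            System.out.println(b);
        }
        System.out.println(hashtableRejectsNullKey());
    }
}
```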
About Properties
Sometimes you might want to use a hashtable to map key strings to value strings. Environment strings in DOS, Windows, and Unix are an example: the key string PATH is mapped to the value string C:\WINDOWS;C:\WINDOWS\SYSTEM. A Hashtable is a simple way to represent these, but Java provides another way.
The java.util.Properties class is a subclass of Hashtable designed for String keys and values. A Properties object is used in much the same way as a Hashtable, but the class adds two time-saving methods that you should know about.
The store() method saves the contents of a Properties object to a file in a readable form. The load() method does the opposite: it reads the file and sets the Properties object to contain the keys and values.
Note that because Properties extends Hashtable, you can use the superclass put() method to add keys and values that are not String objects. This is not advisable. In addition, if you call store() on a Properties object that contains objects that are not Strings, store() will fail. Instead of put() and get(), you should use setProperty() and getProperty(), which take String arguments.
Well, I hope you now know how to use hashtables to speed up your processing.
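Here is a minimal round trip through store() and load(). To keep the sketch self-contained it writes to an in-memory StringWriter instead of a file, which is an assumption of the example; the class PropertiesRoundTrip, the method roundTrip(), and the comment text "sample settings" are likewise invented.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;

class PropertiesRoundTrip {
    // Stores one key/value pair as text, reloads it, and returns the value read back.
    static String roundTrip(String key, String value) {
        try {
            Properties original = new Properties();
            original.setProperty(key, value);

            StringWriter out = new StringWriter();
            original.store(out, "sample settings");

            Properties copy = new Properties();
            copy.load(new StringReader(out.toString()));
            return copy.getProperty(key);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("PATH", "C:\\WINDOWS;C:\\WINDOWS\\SYSTEM"));
    }
}
```

To work with a real file, the writer and reader can be swapped for file-based streams without changing the rest of the code.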