[Disclaimer: All rights reserved. You are welcome to reprint this article, but please do not use it for commercial purposes. Contact email: feixiaoxing@163.com]
Whether in a database or an ordinary ERP system, searching is a basic data-processing operation. Searching itself is not complicated, but how can we search quickly and effectively? The methods our predecessors have accumulated in practice are worth learning from. Throughout this article we assume that the value being searched for appears at most once, i.e. the array contains no duplicates.
(1) Common Data Search
Imagine how we can find the value we want among one million records. Here the data has no structure at all, so the value we need could appear anywhere in the array: at the beginning, somewhere in the middle, or at the very end. In that case we have no choice but to traverse the array until we reach it.
int find(int array[], int length, int value)
{
    if (NULL == array || 0 == length)
        return -1;

    /* scan every element until the value is found */
    for (int index = 0; index < length; index++) {
        if (value == array[index])
            return index;
    }
    return -1;
}
Analysis:
We do not know in advance where the value sits, so a single query needs at least 1 comparison and at most n comparisons; on average it takes about (1 + n) / 2, roughly half of n. The cost of this algorithm grows in proportion to n, which is written as O(n).
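For reference, a minimal usage sketch follows (the driver code and sample values are illustrative additions, not part of the original listing). It shows the best case (value near the front), the worst case (value at the very end, costing all n comparisons), and a miss:
#include <stdio.h>

int find(int array[], int length, int value);   /* the function listed above */

int main(void)
{
    int data[] = {7, 3, 9, 1, 5, 8, 2, 6, 4, 0};
    int n = sizeof(data) / sizeof(data[0]);

    printf("index of 7:  %d\n", find(data, n, 7));   /* first element: 1 comparison */
    printf("index of 0:  %d\n", find(data, n, 0));   /* last element: n comparisons */
    printf("index of 42: %d\n", find(data, n, 42));  /* not present: returns -1 */
    return 0;
}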
(2) The data above had no structure, which left it in a disorganized state. What would happen if the data were arranged in order? It is just like everyday life: if you never tidy up, finding things is troublesome and inefficient; but once everything has a fixed place and is classified and sorted, the result is different: you form a habit and find things very quickly. So how do we search an ordered array? Binary search is the method of choice.
int binary_sort(int array[], int length, int value)
{
    if (NULL == array || 0 == length)
        return -1;

    int start = 0;
    int end = length - 1;
    while (start <= end) {
        /* midpoint, written this way to avoid overflow of (start + end) */
        int middle = start + ((end - start) >> 1);
        if (value == array[middle])
            return middle;
        else if (value > array[middle])
            start = middle + 1;
        else
            end = middle - 1;
    }
    return -1;
}
Analysis:
The complexity of the plain search above is O(n). We can judge the complexity of this algorithm in the same way: how many comparisons does it need? At least 1, and at most about log2(n + 1). You can pick an example and work it out yourself: 7 elements need at most 3 comparisons, 15 elements need at most 4, and so on. Detailed proofs can be found in Introduction to Algorithms and The Art of Computer Programming. Clearly this search is far more efficient than the previous method.
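If you want to verify the bound yourself, the following small sketch (my own instrumented copy of the loop above, not part of the original post) counts comparisons over every possible target and reports the maximum for n = 7 and n = 15:
#include <stdio.h>

/* same loop as binary_sort, but returns the number of loop iterations */
static int binary_count(int array[], int length, int value)
{
    int start = 0, end = length - 1, steps = 0;
    while (start <= end) {
        int middle = start + ((end - start) >> 1);
        steps++;
        if (value == array[middle])
            return steps;
        else if (value > array[middle])
            start = middle + 1;
        else
            end = middle - 1;
    }
    return steps;
}

int main(void)
{
    int a7[]  = {1, 2, 3, 4, 5, 6, 7};
    int a15[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
    int max7 = 0, max15 = 0;

    for (int i = 0; i < 7; i++) {
        int s = binary_count(a7, 7, a7[i]);
        if (s > max7) max7 = s;
    }
    for (int i = 0; i < 15; i++) {
        int s = binary_count(a15, 15, a15[i]);
        if (s > max15) max15 = s;
    }
    printf("n = 7:  at most %d comparisons\n", max7);   /* prints 3 */
    printf("n = 15: at most %d comparisons\n", max15);  /* prints 4 */
    return 0;
}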
(3) The searches above rely on data stored in contiguous memory. What if the data is linked through pointers? What should we do then? We need to introduce the binary sorting tree. Its definition is simple: (1) every non-leaf node has at least one non-NULL child; (2) the left and right children of a leaf node are both NULL; (3) every node stores one value, and all values in its left subtree are smaller than the node's value, while all values in its right subtree are larger. Let's look at the node definition:
typedef struct _NODE
{
    int data;
    struct _NODE *left;
    struct _NODE *right;
} NODE;
With this node definition, the search itself is straightforward:
const NODE *find_data(const NODE *pNode, int data)
{
    if (NULL == pNode)
        return NULL;

    if (data == pNode->data)
        return pNode;
    else if (data < pNode->data)
        return find_data(pNode->left, data);   /* smaller values live in the left subtree */
    else
        return find_data(pNode->right, data);  /* larger values live in the right subtree */
}
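The search above assumes the binary sorting tree has already been built. As a complement, here is a minimal insertion sketch (my own addition; the function name insert_data is illustrative and not from the original post). It keeps the ordering property the search relies on and ignores duplicates, in line with the no-duplicates assumption made at the beginning:
#include <stdlib.h>

NODE *insert_data(NODE *pNode, int data)
{
    if (NULL == pNode) {
        /* reached an empty slot: create the new node here */
        NODE *pNew = (NODE *)malloc(sizeof(NODE));
        if (NULL == pNew)
            return NULL;                 /* allocation failed, nothing inserted */
        pNew->data = data;
        pNew->left = pNew->right = NULL;
        return pNew;
    }

    if (data < pNode->data)
        pNode->left = insert_data(pNode->left, data);
    else if (data > pNode->data)
        pNode->right = insert_data(pNode->right, data);
    /* data == pNode->data: already present, do nothing */

    return pNode;
}
Building a tree from an array is then just NODE *root = NULL; followed by root = insert_data(root, array[i]); for each element, after which find_data(root, value) works as shown above.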
(4) Looking back, both (2) and (3) rely on fully sorted data. Is there a search based on a compromise? Yes: the hash table. A hash table is defined as follows: (1) each value is assigned to a bucket by some clustering (hash) operation, and all the values in a bucket are linked into a list; (2) the head pointers of all the lists form a pointer array. Because it does not require complete sorting, this method works well for medium-sized data sets. The node is defined as follows:
typedef struct _LINK_NODE
{
    int data;
    struct _LINK_NODE *next;
} LINK_NODE;
How can we find the data in the hash table?
LINK_NODE *hash_find(LINK_NODE *array[], int mod, int data)
{
    int index = data % mod;              /* pick the bucket for this value */
    if (NULL == array[index])
        return NULL;

    LINK_NODE *pLinkNode = array[index];
    while (pLinkNode) {
        if (data == pLinkNode->data)
            return pLinkNode;
        pLinkNode = pLinkNode->next;
    }
    return pLinkNode;                    /* NULL: not found in this bucket */
}
Analysis:
Because a hash table does not need sorting, only a simple classification of the data, it is very convenient for searching. The lookup time depends on the modulus: the smaller the mod, the longer each chain and the closer a hash lookup gets to a plain linear search; the larger the mod, the shorter the chains and the closer the lookup gets to constant time.
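To make the example complete, here is a minimal insertion sketch for this hash table (my own addition; hash_insert is an illustrative name, not from the original post). Each value is pushed onto the head of the chain chosen by data % mod, and values that are already present are skipped:
#include <stdlib.h>

int hash_insert(LINK_NODE *array[], int mod, int data)
{
    int index = data % mod;

    /* skip values that are already present, matching the no-duplicates assumption */
    if (hash_find(array, mod, data))
        return 0;

    LINK_NODE *pNew = (LINK_NODE *)malloc(sizeof(LINK_NODE));
    if (NULL == pNew)
        return -1;                       /* allocation failed */

    pNew->data = data;
    pNew->next = array[index];           /* push onto the head of the chain */
    array[index] = pNew;
    return 0;
}
For instance, LINK_NODE *table[13] = {NULL}; followed by hash_insert(table, 13, value) for each value builds a table that hash_find can search; a prime modulus such as 13 tends to spread the values more evenly across the chains.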
[Notice: the next blog post will cover sorting]