Learning Data Structures from Scratch (4): Search Algorithms, Indexes, and Binary Sort Trees

Source: Internet
Author: User

Search algorithms

Basic concepts:

(1) Key: suppose we have a structure

struct Node // a node that stores data and pointers
{
    Data data; // data field, stores the payload (Data stands for any payload type)
    int key;   // assume the key is an int value that is unique throughout the table
    // pointer field(s): depending on the structure, a pointer to another node or an array subscript
};

The key value is the key (keyword). Each node has a different key value (which need not be an int). So when looking for data, we only need to compare a node's key value with the key value being searched for to tell whether it is the data we want.

Pros: ① the data fields can change without affecting lookups, since the key value usually does not change;

Cons: ① extra space is needed to store the key value.

(2) Time complexity of a lookup:

Performing a lookup (a lookup is also an algorithm) takes time, which can be estimated by how many statements are executed. How this time grows with the amount of data is the time complexity of the algorithm. When the amount of data is n, it is written T(n) = O(f(n)), where f(n) is a function of the data size n.

O(f(n)) describes the time complexity of the algorithm when the data size is n.

It expresses how the execution time grows as the problem size n increases. This notation uses a capital O to represent the complexity of an algorithm and is called big O notation.

With big O notation, we usually care about the average-case time and the worst-case time.

Several common cases are:

① if the search time is constant regardless of the amount of data, it is written O(1), constant time;

② if the growth is logarithmic (each time the amount of data doubles, the time increases by one step), it is written O(log n);

③ if the growth is linear, it is written O(n);

④ if the growth is quadratic (the time is proportional to the square of the amount of data), it is written O(n^2).
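
To make the four growth rates concrete, here is a minimal sketch that counts the basic steps of four typical loop shapes. The function names are made up for this illustration and do not come from the original article.

```cpp
#include <cstddef>

// Each function returns the number of basic steps performed for input size n.

// O(1): one step, whatever n is.
std::size_t constant_steps(std::size_t /*n*/) { return 1; }

// O(log n): the remaining range is halved on every step.
std::size_t logarithmic_steps(std::size_t n) {
    std::size_t steps = 0;
    while (n > 1) { n /= 2; ++steps; }
    return steps;
}

// O(n): one step per item.
std::size_t linear_steps(std::size_t n) {
    std::size_t steps = 0;
    for (std::size_t i = 0; i < n; ++i) ++steps;
    return steps;
}

// O(n^2): one step per ordered pair of items.
std::size_t quadratic_steps(std::size_t n) {
    std::size_t steps = 0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) ++steps;
    return steps;
}
```

For n = 1024 these return 1, 10, 1024 and 1048576 steps respectively. Doubling n to 2048 raises the logarithmic count only from 10 to 11.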

(3) Lookup table and lookup

Lookup table: a collection of data items of the same type (for example, the nodes of a tree are all of the same type, so the set of nodes in the tree is a lookup table).

Lookup: given a value (the key), locate an item in the lookup table (for example, obtain a pointer to the matching node in a tree).

(4) Hit

A hit means that the item being looked for has been found.

Sequential search (linear search):

(1) Applicable situation:

Most situations.

(2) Principle:

Start from the first item and compare each item in turn until an item that matches is found or the last item has been checked.

(3) Best use of the algorithm:

① the amount of data is small;

② there are no strict requirements on time.

(4) Time complexity of the algorithm:

O(n)
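
The principle above can be sketched as follows. This is a minimal illustration, assuming the table is a plain array. The `Record` type and the name `sequential_find` are hypothetical and not taken from the article.

```cpp
#include <cstddef>

struct Record {   // hypothetical table item: a key plus some payload
    int key;
    int value;
};

// Checks the items one by one, starting from the first.
// Returns a pointer to the first item whose key matches, or nullptr if none does.
const Record* sequential_find(const Record* table, std::size_t count, int key) {
    for (std::size_t i = 0; i < count; ++i) {
        if (table[i].key == key)
            return &table[i];   // hit
    }
    return nullptr;             // every item was checked without a hit
}
```

The table does not need to be sorted, which is why this method applies in most situations. In the worst case every one of the n items is compared once.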

Binary search:

(1) Applicable situation:

Precondition: the table is sorted by key, and the keys generally do not change (if a key changes, the table must be sorted again).

(2) Principle:

First take the middle node m of the lookup table and compare the target key with m.key. If the target key is larger than m.key, continue in the part to the right of m (the range is half as large as before). If it is smaller than m.key, continue in the part to the left of m (the range is again halved). If it equals m.key, it is a hit.

Code:

Node* find(int last, int key, Node* a) // for array storage; a points to an array of nodes, last is the last valid subscript
{
    int f = 0;    // left bound of the range (first subscript)
    int l = last; // right bound of the range (last subscript)
    while (f <= l) // continue while the left bound does not exceed the right bound
    {
        int m = (f + l) / 2;     // subscript of the middle node
        if (key < a[m].key)      // the target is smaller than the middle key
            l = m - 1;           // the new range ends just left of the middle
        else if (key > a[m].key) // the target is larger than the middle key
            f = m + 1;           // the new range starts just right of the middle
        else
            return &a[m];        // hit: return a pointer to the matching node
    }
    return nullptr; // not found: return a null pointer
}

(3) Best use of the algorithm:

Ordered tables and ordered binary trees, when there is a large amount of data.

(4) Time complexity of the algorithm:

O(log n)

Interpolation search:

(1) Principle:

An optimization of binary search: instead of always probing the middle, the probe position is estimated from where the target key lies between the keys at the lowest and highest subscripts. (I do not fully understand the details.)

(2) Applicable situation:

A table whose key values are fairly evenly distributed, for example 1, 2, 3, 4, 5, 6, ... It is not suitable for extreme distributions (e.g. 1 , ... )

(3) Algorithm:

Node* find(int last, int key, Node* a) // for array storage; a points to an array of nodes, last is the last valid subscript
{
    int f = 0;    // left bound of the range
    int l = last; // right bound of the range
    while (f <= l && key >= a[f].key && key <= a[l].key) // the target must lie inside the current range
    {
        if (a[l].key == a[f].key) // all keys in the range are equal to the target: avoid dividing by zero
            return &a[f];
        // **** the line that differs from binary search: estimate the position from the key value ****
        // multiply before dividing, otherwise integer division truncates the ratio to 0
        int m = f + static_cast<int>(static_cast<long long>(key - a[f].key) * (l - f) / (a[l].key - a[f].key));
        if (key < a[m].key)      // the target is smaller than the probed key
            l = m - 1;
        else if (key > a[m].key) // the target is larger than the probed key
            f = m + 1;
        else
            return &a[m];        // hit
    }
    return nullptr; // not found: return a null pointer
}

(4) Best applicable situation:

The key values are evenly distributed.
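
To show why the distribution of the keys matters, here is a small worked sketch of the position estimate on its own. The function name `interpolation_probe` and both sample tables are assumptions made for this illustration.

```cpp
// Estimated probe subscript for `key` inside the subscript range [f, l],
// given the keys stored at both ends of the range.
// Assumes low_key <= key <= high_key and low_key < high_key.
int interpolation_probe(int f, int l, int low_key, int high_key, int key) {
    // Multiply before dividing so that integer division does not truncate the ratio to 0.
    return f + static_cast<int>(
        static_cast<long long>(key - low_key) * (l - f) / (high_key - low_key));
}
```

With the evenly spaced keys 10, 20, ..., 100 at subscripts 0 to 9, searching for 70 gives 0 + (70 - 10) * 9 / (100 - 10) = 6, and subscript 6 holds exactly 70, so the first probe is a hit. With the skewed keys 1, 2, 3, 4, 1000 at subscripts 0 to 4, searching for 4 gives 0 + (4 - 1) * 4 / 999 = 0, although the target sits at subscript 3, so the estimate is of little help.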

Fibonacci search:

(1) Principle:

I do not understand it... to be honest, I did not read it patiently.

(2) Applicable situation:

When the value being searched for lies in the right half, it is more efficient than binary search;

but when the value lies at the far left, it is less efficient than binary search.
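
The article does not describe the principle, so the following is only a minimal sketch of one common array-based formulation of Fibonacci search, not the author's code. It assumes a sorted `std::vector<int>`, and the name `fibonacci_search` is hypothetical. The idea is the same as in binary search, except that the range is split at a point given by Fibonacci numbers instead of at the middle.

```cpp
#include <algorithm>
#include <vector>

// Returns the subscript of key in the sorted vector a, or -1 if it is absent.
int fibonacci_search(const std::vector<int>& a, int key) {
    const int n = static_cast<int>(a.size());
    int fib2 = 0;            // F(k-2)
    int fib1 = 1;            // F(k-1)
    int fib = fib1 + fib2;   // F(k): smallest Fibonacci number >= n after the loop
    while (fib < n) {
        fib2 = fib1;
        fib1 = fib;
        fib = fib1 + fib2;
    }
    int offset = -1;         // everything up to and including offset is already excluded
    while (fib > 1) {
        const int i = std::min(offset + fib2, n - 1);   // probe position
        if (a[i] < key) {          // target is to the right: step down one Fibonacci number
            fib = fib1;
            fib1 = fib2;
            fib2 = fib - fib1;
            offset = i;
        } else if (a[i] > key) {   // target is to the left: step down two Fibonacci numbers
            fib = fib2;
            fib1 = fib1 - fib2;
            fib2 = fib - fib1;
        } else {
            return i;              // hit
        }
    }
    // At most one candidate element is left to compare.
    if (fib1 == 1 && offset + 1 < n && a[offset + 1] == key)
        return offset + 1;
    return -1;
}
```

For the sorted keys 1, 3, 5, 7, 9, 11, a search for 9 first probes subscript 2, then subscript 4, where it hits.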

Index:

An index here is a special table, the index table, that stores each key value together with a pointer to the item that contains that key value. The index discussed here is a linear index.


Features of the index:

① It is ordered. The data items may be unordered, but the index is ordered. Therefore, whatever the order of the data items, once the index entry with the matching key value is found, its pointer leads directly to the data item.

② Because the index is ordered (this does not mean that every index table is completely ordered), binary search or another search method can be used to find the key value that meets the requirement.

③ It occupies very little space. An entry may need only an int value and a pointer, which is much smaller than a data item.

④ The efficiency is nevertheless high. Unordered data items can basically only be searched linearly (O(n)), whereas with an index binary search can be used (O(log n)), so the efficiency gain is large.

Dense index:

(1) Definition:

A linear index in which every item of the data set has its own index entry.

(2) Features:

① the index entries must be arranged in order (by key value);

② lookups are efficient.

(3) Disadvantages:

① when the amount of data grows very quickly, it is impossible (or difficult) to keep the index in order;

② when the amount of data is large, the index is difficult to read (every item has its own index entry, so the index table is also large).
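
A dense index over unordered data can be sketched as follows. This is a minimal illustration that stores a position in the data table instead of a raw pointer. The names `Item`, `IndexEntry`, `build_index` and `find_by_index` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct Item {              // hypothetical data item; the data table may be unordered
    int key;
    std::string payload;
};

struct IndexEntry {        // one entry per data item
    int key;
    std::size_t position;  // where the item sits in the data table
};

// Builds one index entry per item and sorts the entries by key.
std::vector<IndexEntry> build_index(const std::vector<Item>& items) {
    std::vector<IndexEntry> index;
    index.reserve(items.size());
    for (std::size_t i = 0; i < items.size(); ++i)
        index.push_back({items[i].key, i});
    std::sort(index.begin(), index.end(),
              [](const IndexEntry& x, const IndexEntry& y) { return x.key < y.key; });
    return index;
}

// Binary search on the ordered index, then one jump into the unordered data table.
const Item* find_by_index(const std::vector<Item>& items,
                          const std::vector<IndexEntry>& index, int key) {
    auto it = std::lower_bound(index.begin(), index.end(), key,
                               [](const IndexEntry& e, int k) { return e.key < k; });
    if (it != index.end() && it->key == key)
        return &items[it->position];
    return nullptr;
}
```

The data items stay where they are. Only the small index entries are sorted, which is what makes the O(log n) lookup possible.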

Block index:

(1) Definition:

The data set is divided into several blocks, and each block corresponds to one index entry.

(2) Features:

① the items within a block are unordered;

② the blocks are ordered relative to each other (hence the index is ordered).

(3) Principle:

① although a block is unordered inside, it satisfies a constraint: its key values lie within a certain range;

② each index entry has two values that record the minimum and the maximum key value within its block (so that the key of any item in another block is necessarily larger or smaller than the keys of all items in this block);

③ the index entry also records the number of items in the block;

④ the index entry holds a pointer to the head of the block (the end of the block need not be stored: with the item count, the search stops naturally after that many items).

(4) Advantages:

① the block is located quickly, and its items can then be searched one by one; since a block does not hold many items, this is not slow.
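
The two-stage lookup described above can be sketched as follows: a binary search over the ordered index entries, then a linear scan inside one block. This is a minimal illustration, assuming the blocks are stored back to back in one array. The names `Item`, `Block` and `block_find` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Item { int key; };   // hypothetical data item

struct Block {              // index entry for one block
    int min_key;            // smallest key in the block
    int max_key;            // largest key in the block
    std::size_t start;      // subscript of the first item of the block
    std::size_t count;      // number of items in the block
};

// blocks must be ordered: every key in one block is smaller than every key in the next.
const Item* block_find(const std::vector<Item>& items,
                       const std::vector<Block>& blocks, int key) {
    // Stage 1: binary search for the first block whose max_key is not below key.
    auto it = std::lower_bound(blocks.begin(), blocks.end(), key,
                               [](const Block& b, int k) { return b.max_key < k; });
    if (it == blocks.end() || key < it->min_key)
        return nullptr;     // key falls outside every block's range
    // Stage 2: linear search inside the unordered block.
    for (std::size_t i = it->start; i < it->start + it->count; ++i) {
        if (items[i].key == key)
            return &items[i];
    }
    return nullptr;
}
```

Storing both the minimum and the maximum key lets the lookup reject a key that falls in the gap between two blocks without scanning any items.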

Inverted index:

(1) Simple concept:

① Choose the keywords (a keyword is what we want to search for, such as a word) and build a keyword table.

② Each keyword entry has an array that records the numbers of the items that contain the keyword (for example, that the keyword occurs in the first item of the database);

③ To look up a keyword, first find its entry in the keyword table, then read the number table of that entry (which records which items of the database contain the keyword);

④ This gives a list of items, each of which contains the keyword we are looking for;

⑤ Display them, and the search is over.

(2) Advantages:

① It is well suited to searching for words: the principle is simple, the storage space is small, and the response is fast.

② The keyword table can be arranged alphabetically, and all words with the same initial letter can even be placed in the same block (block index), so the efficiency is very high;

③ In practice, the results need not be shown all at once, so a few items can be read at a time (for example entries 0# to 9# of the array first, then entries 10# to 19# the next time).
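
The steps above can be sketched with a sorted map from keyword to item numbers. This is a minimal illustration that treats every whitespace-separated word as a keyword. The names `InvertedIndex`, `build_inverted_index` and `lookup` are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Keyword table: each keyword maps to the numbers of the items that contain it.
// std::map keeps the keywords in sorted (alphabetical) order.
using InvertedIndex = std::map<std::string, std::vector<std::size_t>>;

InvertedIndex build_inverted_index(const std::vector<std::string>& documents) {
    InvertedIndex index;
    for (std::size_t id = 0; id < documents.size(); ++id) {
        std::istringstream words(documents[id]);
        std::string word;
        while (words >> word) {
            std::vector<std::size_t>& ids = index[word];
            if (ids.empty() || ids.back() != id)   // record each item number once per keyword
                ids.push_back(id);
        }
    }
    return index;
}

// Returns the numbers of all items that contain the keyword (empty if there are none).
std::vector<std::size_t> lookup(const InvertedIndex& index, const std::string& word) {
    auto it = index.find(word);
    if (it == index.end())
        return std::vector<std::size_t>();
    return it->second;
}
```

Because the result is an array of item numbers, a caller can display it a few entries at a time instead of all at once.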

Binary sort tree:

(1) Features

Searching is done on a binary tree.

Unlike binary search, which mainly works on arrays (with subscripts), a binary sort tree is not stored in an array but as linked nodes. Therefore the code cannot use something like middle = (first + last) / 2.

(2) Algorithm: (here the binary tree is rewritten with a different approach, and node deletion is added)

When reading, it is recommended that you draw a binary tree and then trace through the code; the understanding will be deeper.

Node:

struct tree
{
    data m;   // data field
    int key;  // key value
    tree* lchild = nullptr, *rchild = nullptr; // left and right child pointers
};

Find:

// t: pointer to the current node; key: the key value to find;
// p: parent of t (nullptr for the root);
// n: receives the matching node on success, or the last non-empty node on the search path on failure
bool Searchtree(tree* t, int key, tree* p, tree** n)
{
    if (t == nullptr) // empty pointer: the lookup failed
    {
        *n = p; // p is the last non-empty node on the search path
        return false;
    }
    else if (t->key == key) // the key value of the current node matches
    {
        *n = t; // report the current node
        return true;
    }
    else if (t->key > key) // the target is in the left subtree
        return Searchtree(t->lchild, key, t, n);
    else // the target is in the right subtree
        return Searchtree(t->rchild, key, t, n);
}

Effect: if the key is found, the function returns true and *n points to the matching node; otherwise it returns false and *n points to the last non-empty node on the search path (or is null if the tree is empty).

Insert:

// Search first: returns false if the key value already exists, true if it was inserted.
// t is the address of the root pointer, so that inserting into an empty tree can set the root.
bool Inserttree(tree** t, int key)
{
    tree* temp = nullptr;
    if (Searchtree(*t, key, nullptr, &temp)) // found: the key is a duplicate, insertion fails
        return false;
    // not found: temp is the last non-empty node on the path and becomes the parent
    tree* p = new tree;
    p->key = key;
    if (temp == nullptr)      // the tree was empty: the new node becomes the root
        *t = p;
    else if (key > temp->key) // larger than the parent: attach as the right child
        temp->rchild = p;
    else                      // smaller (it cannot be equal): attach as the left child
        temp->lchild = p;
    return true; // insertion succeeded
}

Effect: a key value is inserted into the binary tree (the data field is not touched). If the tree does not yet contain the key value, the insertion succeeds and true is returned; otherwise false is returned.
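
As a way to check the idea by running it, here is a separate, self-contained sketch (not the article's code) that inserts a few keys and then walks the tree in order. The names `BstNode`, `bst_insert`, `bst_in_order` and `bst_free` are hypothetical. It relies on the standard property that an in-order walk of a binary sort tree visits the keys in ascending order.

```cpp
#include <vector>

struct BstNode {               // hypothetical node type that holds only a key
    int key;
    BstNode* left = nullptr;
    BstNode* right = nullptr;
};

// Inserts key; returns false if the key is already present.
bool bst_insert(BstNode*& root, int key) {
    BstNode** link = &root;    // the pointer slot where a new node would be attached
    while (*link != nullptr) {
        if (key == (*link)->key)
            return false;      // duplicate key
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
    }
    BstNode* node = new BstNode;
    node->key = key;
    *link = node;
    return true;
}

// In-order walk: left subtree, node, right subtree.
void bst_in_order(const BstNode* node, std::vector<int>& out) {
    if (node == nullptr) return;
    bst_in_order(node->left, out);
    out.push_back(node->key);
    bst_in_order(node->right, out);
}

// Releases every node of the tree.
void bst_free(BstNode* node) {
    if (node == nullptr) return;
    bst_free(node->left);
    bst_free(node->right);
    delete node;
}
```

Inserting 50, 30, 70, 20, 40, 60, 80 in that order and then calling `bst_in_order` fills the vector with 20, 30, 40, 50, 60, 70, 80.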

Deleting a node:

There are four cases:

(1) The node is empty (nothing to delete);

(2) The left subtree of the node is empty: the pointer that points to the node is redirected to the node's right subtree;

(3) The right subtree of the node is empty: the pointer that points to the node is redirected to the node's left subtree;

(4) Both subtrees of node a are present: in the left subtree (whose root is b) find the rightmost node c (or, symmetrically, the leftmost node in the right subtree) and let it replace a. The pointer from a's parent to a is redirected to c. If c is not b, the right child pointer of c's parent is redirected to c's left child, and c's left child pointer is set to b. Finally, c's right child pointer is set to a's right child.

The idea is simple, but it is awkward to write.

I tried writing one, but I am not sure it is correct. I may verify it later if I get the chance; if anyone finds an error, please leave a message to let me know.

bool DeleteTree(tree** t, int key) // search first, then delete
{
    // link is the address of the pointer that points to the current node:
    // the root pointer at first, then a parent's left or right child pointer
    tree** link = t;
    while (*link != nullptr && (*link)->key != key) // not a null pointer and the key differs
    {
        if ((*link)->key > key) // the key is smaller: go to the left child
            link = &(*link)->lchild;
        else                    // the key is larger: go to the right child
            link = &(*link)->rchild;
    }
    if (*link == nullptr) // case (1): null pointer, the key was not found
        return false;

    tree* a = *link; // a is the node to delete; *link is the pointer that points to it
    if (a->lchild == nullptr)      // case (2): the left subtree is empty
        *link = a->rchild;
    else if (a->rchild == nullptr) // case (3): the right subtree is empty
        *link = a->lchild;
    else // case (4): both subtrees are present
    {
        tree* b = a->lchild; // b is the root of a's left subtree
        tree* parent = a;    // parent of c
        tree* c = b;         // c will be the rightmost node of the left subtree
        while (c->rchild != nullptr)
        {
            parent = c;
            c = c->rchild;
        }
        if (c != b) // if b and c are the same node, c simply keeps its own left subtree
        {
            parent->rchild = c->lchild; // c's parent takes over c's left subtree
            c->lchild = b;              // the left child of c becomes b
        }
        c->rchild = a->rchild; // the right child of c becomes the right child of a
        *link = c;             // the pointer that pointed to a now points to the replacement c
    }
    delete a;
    return true;
}

