Starting from Zero: Learning Data Structures (IV): Lookup Algorithms, Indexes, and Binary Sort Trees

Source: Internet
Author: User

Lookup algorithm

Basic concepts:

(1) Key: suppose there is a structure

struct node	// node, storing data and pointers
{
	Data data;	// data field, for storing the data
	int key;	// assuming the key is an int; it is unique within the entire table
	// pointer field(s), as needed: pointers to other nodes, or array subscripts
};

The key value is the key. Each node has a different key value (which does not have to be an int). So when we look up data, we compare each node's key value with the key value we are looking for to decide whether that node holds the data we want.

Advantage: ① changing the data field does not affect lookup, and key values usually do not change;

Disadvantage: ① extra space is needed to store the key values.

(2) Time complexity of lookup:

Lookup is itself an algorithm, so we can measure the time its code needs to execute (roughly, how many statements are executed). How this time grows with the amount of data is the time complexity of the algorithm. When the amount of data is n, it is written as T(n) = O(f(n)), where f(n) is some function of the amount of data n.

O(f(n)) means that, as the problem size n grows, the execution time of the algorithm grows at the same rate as f(n). Writing the complexity with an uppercase O() in this way is called Big O notation.

We usually care about the average-case time and the worst-case time.

Some common cases are:

① If the lookup time is constant no matter how much data there is, it is written as O(1), constant time;

② If growth is logarithmic (each time the amount of data doubles, the number of steps increases by one), it is written as O(log n), logarithmic time;

③ If growth is linear, it is written as O(n);

④ If growth follows a power of the amount of data (for example its square), it is written as O(n^2).
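To make the gap between O(n) and O(log n) concrete, here is a minimal C++ sketch that counts worst-case steps. The function names are made up for this illustration; it counts steps only and does not search real data.

```cpp
#include <cassert>

// Worst-case number of items a sequential lookup examines among n items.
long long linearWorstCaseSteps(long long n) {
    assert(n >= 0);
    return n;
}

// Worst-case number of probes for a lookup that halves the remaining
// range on every probe (floor(log2(n)) + 1 for n >= 1).
long long halvingWorstCaseSteps(long long n) {
    assert(n >= 0);
    long long steps = 0;
    while (n > 0) {
        n /= 2;     // each probe discards half of what is left
        ++steps;
    }
    return steps;
}
```

For 1,000,000 items the sequential count is 1,000,000, while the halving count is only 20.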

(3) Lookup table and lookup

Lookup table: a collection of data items of the same type (for example, the nodes of a tree are all of the same type, so the set of nodes in the tree is a lookup table).

Lookup: locating an item in a lookup table (such as a node in a tree, or a pointer to that node) based on a given value (the key).

(4) Hit

A hit means that the item being looked for has been found.

Sequential lookup (linear lookup):

(1) Application:

Most cases, because the table does not need to be ordered.

(2) Principle:

Start from the first item and compare each item in turn until one matches, or until the last item has been checked.

(3) Best suited to:

① small data sets;

② cases with no strict requirements on time.

(4) Time complexity of the algorithm:

O(n)
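The principle above can be sketched in C++ as follows. This is only an illustration: the Node layout (an int payload) and the name sequentialFind are assumptions, not part of the original article.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type for this sketch: a unique key plus a payload.
struct Node {
    int key;
    int data;
};

// Checks items one by one from the first to the last.
// Returns a pointer to the matching node, or nullptr when nothing matches.
const Node* sequentialFind(const Node* a, std::size_t count, int key) {
    assert(a != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        if (a[i].key == key)
            return &a[i];   // hit
    }
    return nullptr;         // reached the end without a hit
}
```

The table does not have to be sorted, which is why this approach works in most situations, at the cost of up to n comparisons.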

Binary search:

(1) Application:

Prerequisite: the keys are ordered, the order of the table is defined by the key, and the table generally does not change (otherwise it has to be re-sorted).

(2) Principle:

First take the middle node m of the table and compare the key being looked for with m.key. If it is larger than m.key, continue searching to the right of m (the range is halved). If it is smaller than m.key, continue searching to the left of m (the range is halved again). If it equals m.key, it is a hit.

For example, in code:

node* Find(int key, node* a, int last)	// for the array case: a points to an array of nodes sorted by key, last is the subscript of the final element
{
	int f = 0;	// left limit of the range
	int l = last;	// right limit of the range
	while (f <= l)	// as long as the left limit does not pass the right limit
	{
		int m = f + (l - f) / 2;	// subscript of the middle node
		if (key < a[m].key)	// key is smaller than the middle node's key
			l = m - 1;	// the new right limit is one below the middle
		else if (key > a[m].key)	// key is greater than the middle node's key
			f = m + 1;	// the new left limit is one above the middle
		else
			return &a[m];	// hit
	}
	return NULL;	// not found, return a null pointer
}

(3) Best suited to:

Ordered tables, ordered binary trees, and large amounts of data.

(4) Time complexity of the algorithm:

O(log n)

Interpolation lookup:

(1) Algorithm principle:

It is an optimization of binary search: instead of always probing the middle, it estimates the probe position from where the key falls between the key values at the lowest and highest subscripts of the current range.

(2) Application:

Tables whose key values are evenly spaced, for example 1, 2, 3, 4, 5, 6 ... It is not suitable for very uneven distributions (e.g. 1, 100, 500, 1000 ...).

(3) Algorithm:

node* Find(int key, node* a, int last)	// for the array case: a points to an array of nodes sorted by key, last is the subscript of the final element
{
	int f = 0;	// left limit of the range
	int l = last;	// right limit of the range
	while (f <= l && key >= a[f].key && key <= a[l].key)	// stop as soon as key cannot lie inside the range
	{
		if (a[f].key == a[l].key)	// one key value left; the loop condition guarantees it equals key
			return &a[f];	// hit (this also avoids dividing by zero below)
		// the modified line: estimate the position instead of taking the middle
		int m = f + (int)(((long long)key - a[f].key) * (l - f) / (a[l].key - a[f].key));
		if (key < a[m].key)	// key is smaller than the probed node's key
			l = m - 1;
		else if (key > a[m].key)	// key is greater than the probed node's key
			f = m + 1;
		else
			return &a[m];	// hit
	}
	return NULL;	// not found, return a null pointer
}

(4) Best suited to:

Tables whose key values are evenly distributed.
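Why the distribution matters can be seen by computing a single probe position. The sketch below assumes the usual interpolation formula, probe = f + (key - lowKey) * (l - f) / (highKey - lowKey); the function name and parameter names are made up for this illustration.

```cpp
#include <cassert>

// Estimated subscript for `key` inside the range [f, l], where lowKey and
// highKey are the key values stored at subscripts f and l.
// Requires lowKey < highKey and lowKey <= key <= highKey.
int interpolationProbe(int f, int l, int lowKey, int highKey, int key) {
    assert(lowKey < highKey && lowKey <= key && key <= highKey);
    long long offset = static_cast<long long>(key - lowKey) * (l - f)
                       / (highKey - lowKey);
    return f + static_cast<int>(offset);
}
```

For the evenly spaced keys 10, 20, ..., 100 (subscripts 0 to 9), looking for 70 gives 0 + 60 * 9 / 90 = 6, which is exactly where 70 is stored, so the first probe hits. For the uneven keys 1, 100, 500, 1000 (subscripts 0 to 3), looking for 100 gives 0 + 99 * 3 / 999 = 0, although 100 is stored at subscript 1, so the estimate misses and more probes are needed.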

Fibonacci lookup:

(1) Principle:

I did not have the patience to work through the details. In outline, it splits the ordered range at positions given by Fibonacci numbers instead of at the midpoint.

(2) Application:

When the value being looked for lies near the right end, it is more efficient than binary search;

but when the value lies near the far left, it is less efficient than binary search.
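Since the article gives no code for this method, here is a minimal C++ sketch of one common formulation of Fibonacci search. It is not taken from the original text; the function name, the use of std::vector<int>, and returning a subscript (or -1) are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Searches a sorted vector, choosing probe positions from Fibonacci numbers.
// Returns the subscript of key, or -1 if key is not present.
int fibonacciSearch(const std::vector<int>& a, int key) {
    assert(std::is_sorted(a.begin(), a.end()));
    const int n = static_cast<int>(a.size());
    int fib2 = 0;            // F(k-2)
    int fib1 = 1;            // F(k-1)
    int fib = fib1 + fib2;   // F(k): smallest Fibonacci number >= n
    while (fib < n) {
        fib2 = fib1;
        fib1 = fib;
        fib = fib1 + fib2;
    }
    int offset = -1;         // everything up to offset is already excluded
    while (fib > 1) {
        int i = std::min(offset + fib2, n - 1);   // probe position
        if (a[i] < key) {            // key lies to the right of i
            fib = fib1;
            fib1 = fib2;
            fib2 = fib - fib1;
            offset = i;
        } else if (a[i] > key) {     // key lies to the left of i
            fib = fib2;
            fib1 = fib1 - fib2;
            fib2 = fib - fib1;
        } else {
            return i;                // hit
        }
    }
    // At most one candidate is left, just after offset.
    if (fib1 == 1 && offset + 1 < n && a[offset + 1] == key)
        return offset + 1;
    return -1;
}
```

Only additions and subtractions are used to move the probe, which is the usual argument for this method over computing a midpoint.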

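Index:

An index is a dedicated index table that stores each key value together with a pointer to the item that contains that key value. The index discussed here is the linear index.

For example:

[Figure: an ordered index table whose entries point to the data items; the data items, in the right half, are unordered]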
Characteristics of the index:

① It is ordered. The data items may be unordered (as in the right half of the figure), but the index is ordered. So whether or not the data items are ordered, once the key value matches an index entry, the data item can be found through that entry's pointer.

② Because the index table is ordered (even when the data items are not), binary search, or another search method for ordered tables, can be used to find the required key value.

③ It takes little space. An entry may need only an int and a pointer, far less than a data item.

④ The gain in efficiency is large. Unordered data items can basically only be searched linearly; with the index, a linear search (O(n)) becomes a binary search (O(log n)), which is much more efficient.
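A linear index along these lines can be sketched in C++ as below. The types Record and IndexEntry and both function names are assumptions for illustration; the data items stay in their original, unordered positions, and only the index is sorted.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Record {              // a data item (hypothetical layout)
    int key;
    std::string payload;
};

struct IndexEntry {          // one index entry: key value + pointer to the item
    int key;
    const Record* item;
};

// Builds an index sorted by key. The pointers stay valid only while the
// records vector is not resized or destroyed.
std::vector<IndexEntry> buildIndex(const std::vector<Record>& records) {
    std::vector<IndexEntry> index;
    index.reserve(records.size());
    for (const Record& r : records)
        index.push_back({r.key, &r});
    std::sort(index.begin(), index.end(),
              [](const IndexEntry& x, const IndexEntry& y) { return x.key < y.key; });
    return index;
}

// Binary search over the index, then follow the pointer to the data item.
const Record* findByIndex(const std::vector<IndexEntry>& index, int key) {
    auto it = std::lower_bound(index.begin(), index.end(), key,
                               [](const IndexEntry& e, int k) { return e.key < k; });
    if (it != index.end() && it->key == key)
        return it->item;     // hit
    return nullptr;          // no entry with this key
}
```

Building the index costs one sort; after that, every lookup is O(log n) even though the records themselves were never reordered.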

Dense index:

(1) Definition:

In a linear index, every item in the data set has its own index entry.

(2) Features:

① The index entries must be kept in order (by key value);

② Lookup is efficient;

(3) Disadvantages:

① When the data grows extremely fast, it is difficult (or impossible) to keep the index in order;

② When the amount of data is large, the index is hard to read (every item has an index entry, so the index table is also very large).

Block index:

(1) Definition:

The data set is divided into several blocks, and each block corresponds to one index entry.

(2) Features:

① Items within a block are unordered;

② The blocks are ordered relative to each other (so the index is ordered);

(3) Principle:

① Although a block is unordered internally, it satisfies a constraint, such as all of its key values lying within a certain range;

② Each index entry stores two values, the minimum and the maximum key value within its block (so every key in any other block is either greater than all keys in this block, or smaller than all of them);

③ Each index entry also stores the number of items in its block;

④ Each index entry holds a pointer to the start of its block (the end of the block does not need to be stored: once the recorded number of items has been examined, the block has ended);

(4) Advantages:

① The block is located quickly, and the items inside it are then checked one by one; since a block does not hold many items, this is not slow.
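The two-stage lookup described above can be sketched in C++ as follows. The Block layout follows the fields listed in the principle (minimum key, maximum key, start, item count), but the names, the use of subscripts instead of pointers, and the -1 return value are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Block {               // one index entry per block (hypothetical layout)
    int minKey;              // smallest key inside the block
    int maxKey;              // largest key inside the block
    std::size_t start;       // subscript of the block's first item
    std::size_t count;       // number of items in the block
};

// keys: all items, unordered inside each block.
// blocks: index entries, ordered so each block's keys exceed the previous block's.
// Returns the subscript of key in keys, or -1 if it is not present.
int blockSearch(const std::vector<int>& keys,
                const std::vector<Block>& blocks, int key) {
    // Stage 1: binary search for the first block whose maxKey >= key.
    auto it = std::lower_bound(blocks.begin(), blocks.end(), key,
                               [](const Block& b, int k) { return b.maxKey < k; });
    if (it == blocks.end() || key < it->minKey)
        return -1;           // key falls outside every block's range
    assert(it->start + it->count <= keys.size());
    // Stage 2: sequential search inside that one block.
    for (std::size_t i = it->start; i < it->start + it->count; ++i) {
        if (keys[i] == key)
            return static_cast<int>(i);
    }
    return -1;
}
```

With b blocks of about n / b items each, a lookup costs roughly log b steps to pick the block plus at most n / b comparisons inside it.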

Inverted index:

(1) Basic idea:

① Choose the keywords (the things we search by, such as words) and build a keyword table.

② Each keyword entry has an array recording the numbers of the records that contain the keyword (for example, that it appears in the first record of the database);

③ To look up a keyword, first find its entry in the keyword table, then fetch that entry's list of record numbers (the records in the database that contain the keyword);

④ This yields a table in which every item contains the keyword we are looking for;

⑤ Display the results, and the search is complete.

(2) Advantages:

① It suits word searches: the principle is simple, the storage space is small, and the response is fast.

② The keyword table can be ordered alphabetically, and all words starting with the same letter can even be placed in the same block (a block index), so it is very efficient;

③ In practice the results do not all have to be displayed at once, so a few items can be read at a time (for example, items 0 to 9 of the array, then items 10 to 19 the next time).
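The steps above can be sketched in C++ with a std::map from keyword to record numbers. This is an illustration under simple assumptions: records are plain strings, keywords are whitespace-separated words, and the names buildInvertedIndex and lookupPage are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Keyword table: each keyword maps to the numbers of the records containing it.
// std::map keeps the keywords in alphabetical order.
using InvertedIndex = std::map<std::string, std::vector<int>>;

InvertedIndex buildInvertedIndex(const std::vector<std::string>& records) {
    InvertedIndex index;
    for (int id = 0; id < static_cast<int>(records.size()); ++id) {
        std::istringstream words(records[id]);
        std::string word;
        while (words >> word) {
            std::vector<int>& ids = index[word];
            if (ids.empty() || ids.back() != id)   // list each record once
                ids.push_back(id);
        }
    }
    return index;
}

// Returns up to `limit` record numbers for `word`, skipping the first `offset`,
// so results can be read a few at a time.
std::vector<int> lookupPage(const InvertedIndex& index, const std::string& word,
                            std::size_t offset, std::size_t limit) {
    auto it = index.find(word);
    if (it == index.end() || offset >= it->second.size())
        return {};
    const std::vector<int>& ids = it->second;
    std::size_t end = std::min(ids.size(), offset + limit);
    return std::vector<int>(ids.begin() + static_cast<std::ptrdiff_t>(offset),
                            ids.begin() + static_cast<std::ptrdiff_t>(end));
}
```

Calling lookupPage with offset 0 and limit 10, then offset 10 and limit 10, corresponds to reading items 0 to 9 first and items 10 to 19 next.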

Binary sort tree:

(1) Features

The search is carried out over a binary tree.

Unlike binary search, which works mainly on arrays (with subscripts), a binary sort tree is not an array but a linked structure. So the code cannot use something like middle = (first + last) / 2.

(2) Algorithm: (the binary tree here is written with a fresh approach, including adding and deleting nodes)

While reading, I suggest drawing a binary tree and stepping through the code on it; the understanding will be much deeper.

Node:

struct tree
{
	Data m;	// data field
	int key;
	tree *lchild = NULL, *rchild = NULL;
};

Find:

bool SearchTree(tree* T, int key, tree* p, tree** n)
// T: pointer to the current node; key: the key to find; p: parent of T (NULL for the root);
// n: receives the found node, or the last non-empty node on the search path if the search fails
{
	if (T == NULL)	// null pointer: the search failed
	{
		*n = p;	// p is the last non-empty node on the search path
		return false;
	}
	else if (T->key == key)	// the current node's key matches
	{
		*n = T;	// *n points to the current node
		return true;
	}
	else if (T->key > key)	// continue in the left subtree
		return SearchTree(T->lchild, key, T, n);	// arguments: left child, key, parent of the left child, n
	else	// continue in the right subtree
		return SearchTree(T->rchild, key, T, n);
}

Effect: if the key is found, returns true and stores a pointer to the node in *n; otherwise returns false and stores in *n the last non-empty node on the search path (NULL if the tree is empty).

Insert:

bool InsertTree(tree** T, int key)	// search first: a duplicate key fails and returns false, a new key is inserted and returns true
{
	tree* temp = NULL;
	if (!SearchTree(*T, key, NULL, &temp))	// the search failed, so the key is not a duplicate; temp is the last non-empty node on the path
	{
		tree* p = new tree;
		p->key = key;
		if (temp == NULL)	// the tree is empty: the new node becomes the root
			*T = p;
		else if (p->key > temp->key)	// the new key is larger than its parent's key
			temp->rchild = p;
		else	// otherwise it is smaller (it cannot be equal)
			temp->lchild = p;
		return true;	// insert succeeded
	}
	else
		return false;	// the key was found, insert failed
}

Effect: inserts a key into the binary tree (the data field is not touched). Returns true if the tree did not already contain the key, otherwise returns false.

Delete a node:

There are four cases:

(1) The node is empty (nothing to delete);

(2) The node's left subtree is empty: the pointer that pointed to the node is made to point to its right subtree;

(3) The node's right subtree is empty: the pointer that pointed to the node is made to point to its left subtree;

(4) Both subtrees of node A exist: in the left subtree (rooted at B), find the rightmost node C (or, alternatively, the leftmost node of the right subtree) and let it take A's place. The pointer in A's parent that pointed to A is made to point to C; C's former parent takes C's left subtree as its right child; C's left child pointer is made to point to B; and C's right child pointer is made to point to A's right child. (If C is B itself, C simply keeps its own left subtree.)

The idea is simple, but it is awkward to write.

I tried writing one, but I am not sure it is correct. I will verify it later when I get the chance; if anyone finds an error, please leave a comment to let me know.

bool DeleteTree(tree** T, int key)	// search first, then delete
{
	tree** link = T;	// address of the pointer that points to the current node (the root pointer, or a child pointer inside the parent)
	while (*link != NULL && (*link)->key != key)	// while the node exists and its key is different
	{
		if ((*link)->key > key)	// key is smaller: move to the left child
			link = &(*link)->lchild;
		else	// key is larger: move to the right child
			link = &(*link)->rchild;
	}
	if (*link == NULL)	// case (1): null pointer, the key was not found
		return false;
	tree* a = *link;	// a is the node to delete; *link is the pointer its parent uses to reach it
	if (a->lchild == NULL)	// case (2): the left subtree is empty
		*link = a->rchild;
	else if (a->rchild == NULL)	// case (3): the right subtree is empty
		*link = a->lchild;
	else	// case (4): both subtrees exist
	{
		tree* b = a->lchild;	// b is a's left child
		tree* cParent = a;	// parent of c, needed to fix its right pointer
		tree* c = b;	// c will become the rightmost node of the left subtree
		while (c->rchild != NULL)
		{
			cParent = c;
			c = c->rchild;
		}
		if (c != b)	// skip this when b and c are the same node
		{
			cParent->rchild = c->lchild;	// c's parent takes over c's left subtree
			c->lchild = b;	// c's left child is now b
		}
		c->rchild = a->rchild;	// c's right child is a's right child
		*link = c;	// a's parent now points to the replacement node
	}
	delete a;
	return true;
}

