Ternary Search Trees


We often need to store a collection of strings. A hash table offers fast lookup, but it cannot express relationships between the strings, such as ordering or shared prefixes. A binary search tree keeps the strings in order, but its query speed is not very good. A trie is fast to query but wastes a lot of space (a double-array implementation can reduce that cost). A ternary search tree combines the fast queries of a trie with the space efficiency of a binary search tree.

Example: looking up 12 words

With a binary search tree, where n is the number of words and len is the word length, a lookup costs O(log n * len), because each of the O(log n) steps may compare whole strings. The space is n * len.

With a trie, a lookup costs O(len). The space here is 18 * 26 (assuming only the 26 lowercase letters, so every node holds 26 child pointers), and more space is needed as the words get longer.

With a ternary search tree, the space complexity is the same as for the binary search tree, and a lookup costs approximately O(len), with a constant somewhat larger than that of the trie.
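To make the space comparison a little more concrete, here is a minimal sketch that builds a ternary search tree and counts its nodes. The 12 two-letter words are an assumption (they are taken to be the ones used in the referenced Dr. Dobb's article), and the helper names `count_nodes` and `example_node_count` are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tnode *Tptr;
typedef struct tnode {
    char s;                    /* split character */
    Tptr lokid, eqkid, hikid;  /* left, middle, right children */
} Tnode;

/* Insert string s below p and return the (possibly new) subtree root. */
Tptr insert(Tptr p, const char *s)
{
    if (p == NULL) {
        p = malloc(sizeof(Tnode));
        assert(p != NULL);     /* sketch only: abort if allocation fails */
        p->s = *s;
        p->lokid = p->eqkid = p->hikid = NULL;
    }
    if (*s < p->s)
        p->lokid = insert(p->lokid, s);
    else if (*s > p->s)
        p->hikid = insert(p->hikid, s);
    else if (*s != '\0')
        p->eqkid = insert(p->eqkid, s + 1);
    return p;
}

/* Count every node, including the '\0' terminator nodes. */
size_t count_nodes(Tptr p)
{
    size_t n;
    if (p == NULL)
        return 0;
    n = 1 + count_nodes(p->lokid) + count_nodes(p->hikid);
    if (p->s != '\0')
        n += count_nodes(p->eqkid);
    return n;
}

/* Build the tree for the assumed 12 words and return its node count. */
size_t example_node_count(void)
{
    static const char *words[] = { "as", "at", "be", "by", "he", "in",
                                   "is", "it", "of", "on", "or", "to" };
    Tptr root = NULL;
    size_t i;
    for (i = 0; i < sizeof words / sizeof words[0]; i++)
        root = insert(root, words[i]);
    return count_nodes(root);
}
```

Under these assumptions the tree has 30 nodes (this variant spends one extra node per word on the terminator), that is 90 child pointers, compared with the 18 * 26 = 468 slots quoted for the trie.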

Introduction

A ternary search tree combines the space efficiency of a binary search tree with the fast queries of a trie.
Each node of a ternary search tree has three children. During a lookup, compare the current character of the search string with the character in the node. If the search character is smaller, go to the left child. If it is larger, go to the right child. If the two are equal, go to the middle child and continue with the next character of the search string.
Using the example above, to find "ax": compare "a" with "i"; "a" < "i", so go to the left child of "i". Compare "a" with "b"; "a" < "b", so go to the left child of "b". Now "a" = "a", so go to the middle child of "a" and move on to the next character, "x". "x" > "s", so go to the right child of "s". "x" > "t", but "t" has no right child. The search ends here: the tree does not contain "ax".
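The walk described above can be replayed with a small sketch. It assumes the 12 words are those of the referenced Dr. Dobb's article and inserts them in an order chosen so that "i" becomes the root, "b" its left child and "a" the left child of "b"; `build_example` and `search_trace` are hypothetical helpers, not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tnode *Tptr;
typedef struct tnode {
    char s;
    Tptr lokid, eqkid, hikid;
} Tnode;

Tptr insert(Tptr p, const char *s)
{
    if (p == NULL) {
        p = malloc(sizeof(Tnode));
        assert(p != NULL);     /* sketch only: abort if allocation fails */
        p->s = *s;
        p->lokid = p->eqkid = p->hikid = NULL;
    }
    if (*s < p->s)
        p->lokid = insert(p->lokid, s);
    else if (*s > p->s)
        p->hikid = insert(p->hikid, s);
    else if (*s != '\0')
        p->eqkid = insert(p->eqkid, s + 1);
    return p;
}

/* Assumed word list; "in", "be", "as", "at" come first to fix the shape. */
Tptr build_example(void)
{
    static const char *words[] = { "in", "be", "as", "at", "by", "he",
                                   "is", "it", "of", "on", "or", "to" };
    Tptr root = NULL;
    size_t i;
    for (i = 0; i < sizeof words / sizeof words[0]; i++)
        root = insert(root, words[i]);
    return root;
}

/* Search for s and record the character of every visited node in trace
   ('$' stands for a terminator node). Returns 1 if s is stored, else 0.
   trace must be large enough for the longest search path plus one byte. */
int search_trace(Tptr p, const char *s, char *trace)
{
    int found = 0;
    while (p != NULL && !found) {
        *trace++ = p->s ? p->s : '$';
        if (*s < p->s)
            p = p->lokid;
        else if (*s > p->s)
            p = p->hikid;
        else if (*s == '\0')
            found = 1;          /* matched the terminator: s is stored */
        else {
            s++;                /* characters equal: next character */
            p = p->eqkid;
        }
    }
    *trace = '\0';
    return found;
}
```

With this insertion order, searching for "ax" visits "i", "b", "a", "s", "t" and fails, while searching for "as" visits "i", "b", "a", "s" and then the terminator node.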

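Construction

The implementation is in C.
Node definition:

    #include <stdio.h>   // printf
    #include <stdlib.h>  // malloc
    #include <string.h>  // strlen

    typedef struct tnode *Tptr;
    typedef struct tnode {
        char s;                    // the character stored in this node
        Tptr lokid, eqkid, hikid;  // left, middle and right children
    } Tnode;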

Searching comes first:

    int search(char *s)  // s is the string to find
    {
        Tptr p;
        p = t;  // t is the root of a ternary search tree that has already been built
        while (p) {
            if (*s < p->s) {          // *s is smaller than p->s: go to p->lokid
                p = p->lokid;
            } else if (*s > p->s) {   // *s is larger than p->s: go to p->hikid
                p = p->hikid;
            } else {
                if (*s == '\0') {     // reached the terminator: the search succeeds
                    return 1;
                }
                // *s == p->s: go to the middle child and advance s
                s++;
                p = p->eqkid;
            }
        }
        return 0;
    }

Insert a string:

    Tptr insert(Tptr p, char *s)
    {
        if (p == NULL) {
            p = (Tptr) malloc(sizeof(Tnode));
            p->s = *s;
            p->lokid = p->eqkid = p->hikid = NULL;
        }
        if (*s < p->s) {
            p->lokid = insert(p->lokid, s);
        } else if (*s > p->s) {
            p->hikid = insert(p->hikid, s);
        } else {
            if (*s != '\0') {
                p->eqkid = insert(p->eqkid, ++s);
            } else {
                // insertstr is the string being inserted; storing it in the
                // terminator node makes it easy to list all strings later
                p->eqkid = (Tptr) insertstr;
            }
        }
        return p;
    }

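As with a binary search tree, the order of insertion matters. A binary search tree degenerates into a linked list in the worst case, when the keys are inserted in sorted order. The worst case of a ternary search tree is much better than that of a binary search tree, because the chain at each character position can never be longer than the number of distinct characters.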
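The effect of insertion order can be measured with a small sketch. It inserts the same words once in sorted order and once median first (middle word, then recursively each half), and compares the longest root-to-leaf path. The 12 words are assumed to be those of the referenced article; `depth` and `insert_median_first` are hypothetical helpers.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tnode *Tptr;
typedef struct tnode {
    char s;
    Tptr lokid, eqkid, hikid;
} Tnode;

Tptr insert(Tptr p, const char *s)
{
    if (p == NULL) {
        p = malloc(sizeof(Tnode));
        assert(p != NULL);     /* sketch only: abort if allocation fails */
        p->s = *s;
        p->lokid = p->eqkid = p->hikid = NULL;
    }
    if (*s < p->s)
        p->lokid = insert(p->lokid, s);
    else if (*s > p->s)
        p->hikid = insert(p->hikid, s);
    else if (*s != '\0')
        p->eqkid = insert(p->eqkid, s + 1);
    return p;
}

/* Number of nodes on the longest path from p down to a leaf. */
int depth(Tptr p)
{
    int d, best;
    if (p == NULL)
        return 0;
    best = depth(p->lokid);
    d = depth(p->hikid);
    if (d > best)
        best = d;
    if (p->s != '\0') {        /* terminator nodes have no middle subtree */
        d = depth(p->eqkid);
        if (d > best)
            best = d;
    }
    return 1 + best;
}

/* Insert the sorted words w[lo..hi-1]: middle word first, then each half. */
Tptr insert_median_first(Tptr root, const char **w, int lo, int hi)
{
    int mid;
    if (lo >= hi)
        return root;
    mid = lo + (hi - lo) / 2;
    root = insert(root, w[mid]);
    root = insert_median_first(root, w, lo, mid);
    return insert_median_first(root, w, mid + 1, hi);
}
```

For the assumed word list, sorted insertion gives a longest path of 9 nodes and median-first insertion one of 6, so the degradation is real but bounded.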

A tree also needs a traversal operation.

    // Prints all strings in dictionary order.
    // Visits every node below p; when p is not the root, the strings found
    // are those that share the prefix leading to p.
    void traverse(Tptr p)
    {
        if (!p) return;
        traverse(p->lokid);
        if (p->s != '\0') {
            traverse(p->eqkid);
        } else {
            printf("%s\n", (char *) p->eqkid);
        }
        traverse(p->hikid);
    }
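As a quick check that the traversal really yields dictionary order, the following sketch collects the words into an array instead of printing them. The `word` parameter of `insert` and the `collect` function are assumptions made for this illustration (the original uses a global `insertstr` and `printf`).

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tnode *Tptr;
typedef struct tnode {
    char s;
    Tptr lokid, eqkid, hikid;
} Tnode;

/* Insert s; the terminator node's eqkid stores a pointer to the whole
   word (the cast mirrors the trick used by the insert code above). */
Tptr insert(Tptr p, const char *s, const char *word)
{
    if (p == NULL) {
        p = malloc(sizeof(Tnode));
        assert(p != NULL);     /* sketch only: abort if allocation fails */
        p->s = *s;
        p->lokid = p->eqkid = p->hikid = NULL;
    }
    if (*s < p->s)
        p->lokid = insert(p->lokid, s, word);
    else if (*s > p->s)
        p->hikid = insert(p->hikid, s, word);
    else if (*s != '\0')
        p->eqkid = insert(p->eqkid, s + 1, word);
    else
        p->eqkid = (Tptr) word;
    return p;
}

/* In-order walk: append each stored word to out[] (at most max entries). */
void collect(Tptr p, const char **out, int *n, int max)
{
    if (p == NULL)
        return;
    collect(p->lokid, out, n, max);
    if (p->s != '\0')
        collect(p->eqkid, out, n, max);
    else if (*n < max)
        out[(*n)++] = (const char *) p->eqkid;
    collect(p->hikid, out, n, max);
}
```

Inserting "to", "as", "in", "be" in that order and calling `collect` on the root fills the array with "as", "be", "in", "to".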

Applications

Here we introduce three applications: fuzzy (pattern) query, finding all strings with a common prefix, and near-neighbor query (strings whose Hamming distance from the query is within a given bound).
Fuzzy query
psearch1(root, ".a.a.a") should match strings such as "baxaca" and "cadaka", where "." stands for any single character.

    void psearch1(Tptr p, char *s)
    {
        if (p == NULL) {
            return;
        }
        if (*s == '.' || *s < p->s) {   // *s is '.' or *s < p->s: search the left subtree
            psearch1(p->lokid, s);
        }
        if (*s == '.' || *s > p->s) {   // likewise for the right subtree
            psearch1(p->hikid, s);
        }
        if (*s == '.' || *s == p->s) {  // *s is '.' or *s == p->s: go on to the next character
            if (*s && p->s && p->eqkid != NULL) {
                psearch1(p->eqkid, s + 1);
            }
        }
        if (*s == '\0' && p->s == '\0') {
            printf("%s\n", (char *) p->eqkid);
        }
    }
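The following sketch exercises the same matching logic on a few sample words and collects the matches into an array instead of printing them. The function name `psearch_collect`, the extra parameters, and the sample words other than "baxaca" are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tnode *Tptr;
typedef struct tnode {
    char s;
    Tptr lokid, eqkid, hikid;
} Tnode;

/* Insert s; the terminator node's eqkid stores a pointer to the whole word. */
Tptr insert(Tptr p, const char *s, const char *word)
{
    if (p == NULL) {
        p = malloc(sizeof(Tnode));
        assert(p != NULL);     /* sketch only: abort if allocation fails */
        p->s = *s;
        p->lokid = p->eqkid = p->hikid = NULL;
    }
    if (*s < p->s)
        p->lokid = insert(p->lokid, s, word);
    else if (*s > p->s)
        p->hikid = insert(p->hikid, s, word);
    else if (*s != '\0')
        p->eqkid = insert(p->eqkid, s + 1, word);
    else
        p->eqkid = (Tptr) word;
    return p;
}

/* Pattern match where '.' matches any single character.
   Matches are appended to out[] (at most max entries). */
void psearch_collect(Tptr p, const char *s, const char **out, int *n, int max)
{
    if (p == NULL)
        return;
    if (*s == '.' || *s < p->s)
        psearch_collect(p->lokid, s, out, n, max);
    if (*s == '.' || *s > p->s)
        psearch_collect(p->hikid, s, out, n, max);
    if (*s == '.' || *s == p->s) {
        /* never follow eqkid of a terminator node: it holds the word */
        if (*s && p->s && p->eqkid != NULL)
            psearch_collect(p->eqkid, s + 1, out, n, max);
    }
    if (*s == '\0' && p->s == '\0' && *n < max)
        out[(*n)++] = (const char *) p->eqkid;
}
```

With the words "baxaca", "cadakd", "banana" and "abcdef" stored, the pattern ".a.a.a" matches "baxaca" and "banana" only; "cadakd" ends in "d" and so fails at the last position. Note that matches are not reported in dictionary order, because the middle subtree is visited last.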

Near-neighbor query
This finds all strings within a given Hamming distance of the query. For example, "dobbd" and "hocbe" are both at Hamming distance 2 from "hobby".

    void nearsearch(Tptr p, char *s, int d)  // s is the string to look up, d is the Hamming distance allowed
    {
        if (p == NULL || d < 0)
            return;
        if (d > 0 || *s < p->s) {
            nearsearch(p->lokid, s, d);
        }
        if (d > 0 || *s > p->s) {
            nearsearch(p->hikid, s, d);
        }
        if (p->s == '\0') {
            if ((int) strlen(s) <= d) {
                printf("%s\n", (char *) p->eqkid);
            }
        } else {
            nearsearch(p->eqkid, *s ? s + 1 : s, (*s == p->s) ? d : d - 1);
        }
    }
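Here is a minimal sketch of the same near-neighbor search that collects its results, so the "hobby" example can be checked. The name `near_collect`, the extra output parameters and the word "zzzzz" are assumptions added for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct tnode *Tptr;
typedef struct tnode {
    char s;
    Tptr lokid, eqkid, hikid;
} Tnode;

/* Insert s; the terminator node's eqkid stores a pointer to the whole word. */
Tptr insert(Tptr p, const char *s, const char *word)
{
    if (p == NULL) {
        p = malloc(sizeof(Tnode));
        assert(p != NULL);     /* sketch only: abort if allocation fails */
        p->s = *s;
        p->lokid = p->eqkid = p->hikid = NULL;
    }
    if (*s < p->s)
        p->lokid = insert(p->lokid, s, word);
    else if (*s > p->s)
        p->hikid = insert(p->hikid, s, word);
    else if (*s != '\0')
        p->eqkid = insert(p->eqkid, s + 1, word);
    else
        p->eqkid = (Tptr) word;
    return p;
}

/* Collect every stored word that differs from s in at most d positions.
   d is the number of mismatches still allowed on the current path. */
void near_collect(Tptr p, const char *s, int d,
                  const char **out, int *n, int max)
{
    if (p == NULL || d < 0)
        return;
    if (d > 0 || *s < p->s)
        near_collect(p->lokid, s, d, out, n, max);
    if (d > 0 || *s > p->s)
        near_collect(p->hikid, s, d, out, n, max);
    if (p->s == '\0') {
        /* end of a stored word: leftover query characters count as mismatches */
        if ((int) strlen(s) <= d && *n < max)
            out[(*n)++] = (const char *) p->eqkid;
    } else {
        near_collect(p->eqkid, *s ? s + 1 : s,
                     (*s == p->s) ? d : d - 1, out, n, max);
    }
}
```

With "hobby", "dobbd", "hocbe" and "zzzzz" stored, a query for "hobby" with d = 2 returns the first three words, while d = 1 or d = 0 returns only "hobby" itself.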

Prefix query
When a user types "bin" into a search engine, the engine looks up every string that starts with "bin" and offers the results as suggestions. For example, "bing", "binha" and "binb" are all prefix matches.

    void presearch(Tptr p, char *s)  // s is the prefix to look for
    {
        if (p == NULL) return;
        if (*s < p->s) {
            presearch(p->lokid, s);
        } else if (*s > p->s) {
            presearch(p->hikid, s);
        } else {
            if (*(s + 1) == '\0') {
                // last prefix character matched: list every string below this node
                traverse(p->eqkid);
                return;
            } else {
                presearch(p->eqkid, s + 1);
            }
        }
    }
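The prefix lookup can be tried out with the sketch below, which follows the same two steps (walk down the prefix, then traverse the middle subtree) but collects the results. The names `collect` and `prefix_collect` and the words "bit" and "apple" are assumptions added for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tnode *Tptr;
typedef struct tnode {
    char s;
    Tptr lokid, eqkid, hikid;
} Tnode;

/* Insert s; the terminator node's eqkid stores a pointer to the whole word. */
Tptr insert(Tptr p, const char *s, const char *word)
{
    if (p == NULL) {
        p = malloc(sizeof(Tnode));
        assert(p != NULL);     /* sketch only: abort if allocation fails */
        p->s = *s;
        p->lokid = p->eqkid = p->hikid = NULL;
    }
    if (*s < p->s)
        p->lokid = insert(p->lokid, s, word);
    else if (*s > p->s)
        p->hikid = insert(p->hikid, s, word);
    else if (*s != '\0')
        p->eqkid = insert(p->eqkid, s + 1, word);
    else
        p->eqkid = (Tptr) word;
    return p;
}

/* In-order walk: append each stored word to out[] (at most max entries). */
void collect(Tptr p, const char **out, int *n, int max)
{
    if (p == NULL)
        return;
    collect(p->lokid, out, n, max);
    if (p->s != '\0')
        collect(p->eqkid, out, n, max);
    else if (*n < max)
        out[(*n)++] = (const char *) p->eqkid;
    collect(p->hikid, out, n, max);
}

/* Collect all words starting with the non-empty prefix s, in sorted order. */
void prefix_collect(Tptr p, const char *s, const char **out, int *n, int max)
{
    if (p == NULL)
        return;
    if (*s < p->s)
        prefix_collect(p->lokid, s, out, n, max);
    else if (*s > p->s)
        prefix_collect(p->hikid, s, out, n, max);
    else if (*(s + 1) == '\0')
        collect(p->eqkid, out, n, max);   /* whole prefix matched */
    else
        prefix_collect(p->eqkid, s + 1, out, n, max);
}
```

With "bing", "binha", "binb", "bit" and "apple" stored, the prefix "bin" yields "binb", "bing", "binha" in that order; "bit" is skipped because it branches off before the "n".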

Summary

1. A ternary search tree is highly efficient and easy to implement.
2. A ternary search tree is generally more efficient than hashing: with large data sets hash collisions become more likely, whereas the search cost of a ternary search tree grows only logarithmically with the number of strings.
3. A ternary search tree grows and shrinks easily, whereas resizing a hash table requires copying memory and rehashing.
4. A ternary search tree supports fuzzy matching, Hamming distance lookup, prefix lookup, and other operations.
5. A ternary search tree supports many other operations, such as printing all strings in dictionary order. A trie can do this too, but it takes more time.

Reference: http://drdobbs.com/database/184410528?pgno=1
