Trie tree and Its Application

Last Update:2013-11-27 Source: Internet

Author: User


Trie tree

--------------------------------------------------------------------------------

The Trie tree, also known as the word search tree or dictionary tree, is a tree structure. It is often described as a variant of the hash tree: a multi-way tree used for fast lookup. A typical application is counting and sorting a large number of strings (although it is not limited to strings), so search engine systems often use it for word frequency statistics over text.

The advantage of the Trie tree is that it minimizes unnecessary string comparisons: a lookup takes time proportional to the length of the key, regardless of how many strings are stored, so its query efficiency can be higher than that of a hash table. The core idea of the Trie is to trade space for time: the common prefixes of the strings are used to reduce query-time overhead. The Trie tree also has a disadvantage: it consumes a lot of memory, because every node reserves a slot for each possible next character.

Structural features of the Trie tree:
1. The root node contains no character.

2. Each node except the root node contains exactly one character.

3. The characters on the path from the root node to a given node, joined in order, form the string that corresponds to that node.
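The three properties above can be illustrated with a minimal sketch. The names Node, insert and countNodes are hypothetical and are not part of the implementation given later; the sketch assumes input made only of lowercase letters a-z. It counts the nodes in a trie so that shared prefixes show up as shared nodes.

```cpp
#include <cstddef>

// Hypothetical node type for illustration: one child slot per letter a-z.
// The character a node "contains" is implied by its slot in the parent.
struct Node {
    bool isEnd;            // true if a stored string ends at this node
    Node* child[26];
    Node() : isEnd(false) {
        for (int i = 0; i < 26; ++i) child[i] = NULL;
    }
    ~Node() {
        for (int i = 0; i < 26; ++i) delete child[i];
    }
};

// Insert a word; ASSUMES every character is in 'a'..'z'.
void insert(Node* root, const char* word) {
    Node* pos = root;
    for (; *word != '\0'; ++word) {
        int idx = *word - 'a';
        if (pos->child[idx] == NULL) pos->child[idx] = new Node();
        pos = pos->child[idx];
    }
    pos->isEnd = true;
}

// Count every node below `node`. The root holds no character, so it is excluded.
int countNodes(const Node* node) {
    int total = 0;
    for (int i = 0; i < 26; ++i) {
        if (node->child[i] != NULL) total += 1 + countNodes(node->child[i]);
    }
    return total;
}
```

Inserting "tea", "ten" and "to" into an empty root gives countNodes(&root) == 5 (t, e, a, n, o) rather than 8, the total number of characters, because the prefixes "t" and "te" are stored only once.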

[Figure: structure of the Trie tree (image not available)]

Implementation of the Trie tree

The following is a simple Trie tree implementation. It assumes that strings contain only the 26 letters of the English alphabet and treats uppercase and lowercase letters as the same.

--------------------------------------------------------------------------------

#include <stdlib.h>

class Trie{
public:
    Trie();
    ~Trie();

    // Returns 1 if the string was added, 0 if it was already present,
    // and -1 if it contains a character that is not a letter.
    int insert(const char* str);
    // Returns 1 if the string is present, 0 if not, -1 on an invalid character.
    int search(const char* str)const;
    // Returns 1 if the string was removed, 0 if it was not present,
    // and -1 on an invalid character.
    int remove(const char* str);

    static const int CharNum = 26;

protected:
    typedef struct s_Trie_Node{
        bool isExist;   // true if a stored string ends at this node
        struct s_Trie_Node* branch[Trie::CharNum];
        s_Trie_Node();
        ~s_Trie_Node();
    }Trie_Node;

    Trie_Node* root;

private:
    // The trie owns its nodes, so copying is disabled to avoid double deletion.
    Trie(const Trie&);
    Trie& operator=(const Trie&);
};

Trie::Trie():root(NULL){}

// Deleting the root recursively frees every node below it.
Trie::~Trie(){
    delete root;
}

Trie::s_Trie_Node::s_Trie_Node():isExist(false){
    for(int i = 0; i < Trie::CharNum; ++i){
        branch[i] = NULL;
    }
}

Trie::s_Trie_Node::~s_Trie_Node(){
    for(int i = 0; i < Trie::CharNum; ++i){
        delete branch[i];
    }
}

int Trie::insert(const char* str){
    if(root == NULL){
        root = new Trie_Node();
    }

    Trie_Node* pos = root;
    int char_pos;

    while(pos != NULL && *str != '\0'){
        // Map the letter to a branch index, ignoring case.
        if(*str >= 'a' && *str <= 'z'){
            char_pos = *str - 'a';
        } else if(*str >= 'A' && *str <= 'Z'){
            char_pos = *str - 'A';
        } else {
            return -1;
        }

        // Create the branch on demand.
        if(pos->branch[char_pos] == NULL){
            pos->branch[char_pos] = new Trie_Node();
        }

        pos = pos->branch[char_pos];
        str++;
    }

    if(pos->isExist){
        return 0;
    } else {
        pos->isExist = true;
        return 1;
    }
}

int Trie::search(const char* str)const{
    Trie_Node* pos = root;
    int char_pos;

    while(pos != NULL && *str != '\0'){
        if(*str >= 'a' && *str <= 'z'){
            char_pos = *str - 'a';
        } else if(*str >= 'A' && *str <= 'Z'){
            char_pos = *str - 'A';
        } else {
            return -1;
        }

        pos = pos->branch[char_pos];
        str++;
    }

    if(pos != NULL && pos->isExist){
        return 1;
    } else {
        return 0;
    }
}

int Trie::remove(const char* str){
    Trie_Node* pos = root;
    int char_pos;

    while(pos != NULL && *str != '\0'){
        if(*str >= 'a' && *str <= 'z'){
            char_pos = *str - 'a';
        } else if(*str >= 'A' && *str <= 'Z'){
            char_pos = *str - 'A';
        } else {
            return -1;
        }

        pos = pos->branch[char_pos];
        str++;
    }

    // Only the end-of-string flag is cleared; the nodes themselves are
    // kept and are released when the trie is destroyed.
    if(pos != NULL && pos->isExist){
        pos->isExist = false;
        return 1;
    } else {
        return 0;
    }
}

Trie tree Application

--------------------------------------------------------------------------------

1. Search a large number of strings
This need arises in many scenarios, such as a search engine's word frequency statistics over text, or its log statistics on how often users search for each keyword. The following are two typical problems:

1. Find the 10 most frequent URLs in a large number of log files.

The Trie tree is very well suited to this problem. For frequency statistics over a large number of log files, the trie is quite fast. By combining a min-heap with the trie, the result can be obtained with a single scan of the log files.
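One way to combine a counting trie with a min-heap might look like the sketch below. All names (CountNode, addOccurrence, collect, topK) are hypothetical. The sketch also swaps the 26-slot array for a std::map of children, on the assumption that URLs contain characters other than letters, and it takes the number of results as a parameter k instead of fixing it at 10.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical counting trie: `count` is how often the string ending here was seen.
struct CountNode {
    int count;
    std::map<char, CountNode*> child;   // map, since URLs use more than 26 characters
    CountNode() : count(0) {}
    ~CountNode() {
        for (std::map<char, CountNode*>::iterator it = child.begin(); it != child.end(); ++it)
            delete it->second;
    }
};

// Record one occurrence of `s` (for example, one log line's URL).
void addOccurrence(CountNode* root, const std::string& s) {
    CountNode* pos = root;
    for (std::size_t i = 0; i < s.size(); ++i) {
        CountNode*& next = pos->child[s[i]];   // operator[] creates a NULL entry if missing
        if (next == NULL) next = new CountNode();
        pos = next;
    }
    ++pos->count;
}

typedef std::pair<int, std::string> Entry;   // (count, string)
typedef std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > MinHeap;

// Depth-first walk that keeps at most k candidates in a min-heap.
void collect(const CountNode* node, std::string& prefix, std::size_t k, MinHeap& heap) {
    if (node->count > 0) {
        heap.push(Entry(node->count, prefix));
        if (heap.size() > k) heap.pop();   // drop the least frequent candidate
    }
    for (std::map<char, CountNode*>::const_iterator it = node->child.begin();
         it != node->child.end(); ++it) {
        prefix.push_back(it->first);
        collect(it->second, prefix, k, heap);
        prefix.erase(prefix.size() - 1);
    }
}

// Return the k most frequent strings, most frequent first.
std::vector<Entry> topK(const CountNode* root, std::size_t k) {
    MinHeap heap;
    std::string prefix;
    collect(root, prefix, k, heap);
    std::vector<Entry> result(heap.size());
    for (std::size_t i = result.size(); i > 0; --i) {
        result[i - 1] = heap.top();   // the heap yields the smallest first, so fill from the back
        heap.pop();
    }
    return result;
}
```

In this sketch the log data is read once, by calling addOccurrence for each URL, and the heap never holds more than k entries, so the selection step needs little memory beyond the trie itself.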

2. Implement input suggestions for a website

This problem requires that suggestions be displayed in real time as users type, which can be implemented easily with a trie: the text typed so far is a prefix, and every string stored below the node for that prefix is a candidate suggestion.
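A minimal sketch of prefix-based suggestions follows. The names SuggestNode, addWord, gather and suggest are hypothetical, the limit parameter is an added assumption to cap the number of results, and only lowercase letters a-z are handled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node type: one child slot per letter a-z.
struct SuggestNode {
    bool isEnd;   // true if a stored word ends at this node
    SuggestNode* child[26];
    SuggestNode() : isEnd(false) {
        for (int i = 0; i < 26; ++i) child[i] = NULL;
    }
    ~SuggestNode() {
        for (int i = 0; i < 26; ++i) delete child[i];
    }
};

// Store a word; returns false and stores nothing if a character is outside 'a'..'z'.
bool addWord(SuggestNode* root, const std::string& word) {
    for (std::size_t i = 0; i < word.size(); ++i) {
        if (word[i] < 'a' || word[i] > 'z') return false;
    }
    SuggestNode* pos = root;
    for (std::size_t i = 0; i < word.size(); ++i) {
        int idx = word[i] - 'a';
        if (pos->child[idx] == NULL) pos->child[idx] = new SuggestNode();
        pos = pos->child[idx];
    }
    pos->isEnd = true;
    return true;
}

// Collect stored words below `node` in alphabetical order, stopping at `limit` results.
void gather(const SuggestNode* node, std::string& current,
            std::size_t limit, std::vector<std::string>& out) {
    if (out.size() >= limit) return;
    if (node->isEnd) out.push_back(current);
    for (int i = 0; i < 26; ++i) {
        if (node->child[i] != NULL) {
            current.push_back(static_cast<char>('a' + i));
            gather(node->child[i], current, limit, out);
            current.erase(current.size() - 1);
        }
    }
}

// Return up to `limit` stored words that start with `prefix`.
std::vector<std::string> suggest(const SuggestNode* root, const std::string& prefix,
                                 std::size_t limit) {
    std::vector<std::string> out;
    const SuggestNode* pos = root;
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        if (prefix[i] < 'a' || prefix[i] > 'z') return out;   // unsupported character
        pos = pos->child[prefix[i] - 'a'];
        if (pos == NULL) return out;                          // no word has this prefix
    }
    std::string current = prefix;
    gather(pos, current, limit, out);
    return out;
}
```

With the words "tea", "team", "ten" and "to" stored, suggest(&root, "te", 10) returns "tea", "team", "ten" in that order. The work done per keystroke depends on the prefix length and the number of results returned, not on the total number of stored words.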

2. Sort strings
To sort a large set of strings, scan the strings once to build the trie, then traverse it in pre-order, visiting the children of each node in alphabetical order. The strings are output in sorted order.
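As a rough sketch of this idea, the code below builds a trie and reads the strings back in order. The names SortNode, insertWord, emitSorted and trieSort are hypothetical. It assumes lowercase a-z input and skips any word containing other characters; each node keeps a count so that duplicate strings are preserved.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node: `count` is how many copies of the string ending here were inserted.
struct SortNode {
    int count;
    SortNode* child[26];
    SortNode() : count(0) {
        for (int i = 0; i < 26; ++i) child[i] = NULL;
    }
    ~SortNode() {
        for (int i = 0; i < 26; ++i) delete child[i];
    }
};

// Insert one word; returns false and stores nothing if a character is outside 'a'..'z'.
bool insertWord(SortNode* root, const std::string& word) {
    for (std::size_t i = 0; i < word.size(); ++i) {
        if (word[i] < 'a' || word[i] > 'z') return false;
    }
    SortNode* pos = root;
    for (std::size_t i = 0; i < word.size(); ++i) {
        int idx = word[i] - 'a';
        if (pos->child[idx] == NULL) pos->child[idx] = new SortNode();
        pos = pos->child[idx];
    }
    ++pos->count;
    return true;
}

// Pre-order walk: a node is emitted before its children, and children are visited a..z,
// which yields lexicographic order.
void emitSorted(const SortNode* node, std::string& current, std::vector<std::string>& out) {
    for (int n = 0; n < node->count; ++n) out.push_back(current);
    for (int i = 0; i < 26; ++i) {
        if (node->child[i] != NULL) {
            current.push_back(static_cast<char>('a' + i));
            emitSorted(node->child[i], current, out);
            current.erase(current.size() - 1);
        }
    }
}

// Sort by building the trie in one pass over the input, then traversing it.
std::vector<std::string> trieSort(const std::vector<std::string>& words) {
    SortNode root;
    for (std::size_t i = 0; i < words.size(); ++i) insertWord(&root, words[i]);
    std::vector<std::string> out;
    std::string current;
    emitSorted(&root, current, out);
    return out;
}
```

For the input "to", "tea", "ten", "tea", "a", trieSort returns "a", "tea", "tea", "ten", "to". No string is ever compared with another string; the order comes entirely from the order in which child slots are visited.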

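3. Find the longest common prefix of a set of strings
This follows directly from the structure: starting at the root, follow the path for as long as each node has exactly one child and no string ends there. The characters on that path form the longest common prefix.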
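A minimal sketch of the longest common prefix computation is shown below. The names LcpNode, insertWord and longestCommonPrefix are hypothetical, and the sketch assumes lowercase a-z input; as a labeled simplification it returns an empty string if any word contains another character.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node type: one child slot per letter a-z.
struct LcpNode {
    bool isEnd;   // true if a stored word ends at this node
    LcpNode* child[26];
    LcpNode() : isEnd(false) {
        for (int i = 0; i < 26; ++i) child[i] = NULL;
    }
    ~LcpNode() {
        for (int i = 0; i < 26; ++i) delete child[i];
    }
};

// Insert one word; returns false and stores nothing if a character is outside 'a'..'z'.
bool insertWord(LcpNode* root, const std::string& word) {
    for (std::size_t i = 0; i < word.size(); ++i) {
        if (word[i] < 'a' || word[i] > 'z') return false;
    }
    LcpNode* pos = root;
    for (std::size_t i = 0; i < word.size(); ++i) {
        int idx = word[i] - 'a';
        if (pos->child[idx] == NULL) pos->child[idx] = new LcpNode();
        pos = pos->child[idx];
    }
    pos->isEnd = true;
    return true;
}

// Walk down from the root while the path is unambiguous and no word has ended.
std::string longestCommonPrefix(const std::vector<std::string>& words) {
    LcpNode root;
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (!insertWord(&root, words[i])) return std::string();   // unsupported input
    }
    std::string prefix;
    const LcpNode* pos = &root;
    while (!pos->isEnd) {
        int only = -1;
        int children = 0;
        for (int i = 0; i < 26; ++i) {
            if (pos->child[i] != NULL) { only = i; ++children; }
        }
        if (children != 1) break;   // the words diverge here (or the trie is empty)
        prefix.push_back(static_cast<char>('a' + only));
        pos = pos->child[only];
    }
    return prefix;
}
```

For "flower", "flow" and "flight" this returns "fl": the node for "fl" has two children, so the walk stops there. For "tea" and "team" it returns "tea", because a word ends at that node.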

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
