"Data Structures and Algorithms Every Day": the PHP trie data structure, usage scenarios and code examples

Source: Internet
Author: User
Tags: ord, tree, serialization

I. Introduction to the Trie

A trie, also known as a dictionary tree, word search tree, or prefix tree, is a multi-way tree structure for fast retrieval. For example, a trie over the English letters is a 26-way tree, and a trie over decimal digits is a 10-way tree.

The word "trie" comes from "retrieve". It is pronounced /triː/ ("tree"), and is also read as /traɪ/ ("try").

A trie uses the common prefixes of strings to save storage space. For example, a trie holding the 6 strings tea, ten, to, in, inn, and int needs only 10 nodes, counting the root.

In that trie, the strings in, inn, and int share the prefix "in", so only one copy of "in" is stored. Of course, if the system holds a large number of strings that share almost no common prefixes, the corresponding trie consumes a lot of memory, which is a disadvantage of the trie.
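The node count quoted above can be checked with a small sketch. It does not use the classes from the implementation section; the helper name countTrieNodes and the nested-array layout are assumptions made only for this illustration.

```php
<?php
// Hypothetical helper for illustration: build a nested-array trie
// (one array per node, keyed by character) and count its nodes.
function countTrieNodes(array $words): int
{
    $root = [];
    $count = 1; // the root node itself
    foreach ($words as $word) {
        $node = &$root;
        foreach (str_split($word) as $ch) {
            if (!isset($node[$ch])) {
                $node[$ch] = []; // a new node is needed only for an unseen prefix
                $count++;
            }
            $node = &$node[$ch];
        }
        unset($node); // drop the reference before the next word
    }
    return $count;
}

$words = ['tea', 'ten', 'to', 'in', 'inn', 'int'];
echo countTrieNodes($words), "\n";                  // 10 nodes in the trie
echo array_sum(array_map('strlen', $words)), "\n";  // 16 characters in total
```

The six words contain 16 characters in total, but the shared prefixes "t", "te", and "in" are stored only once, so the trie needs 9 character nodes plus the root.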

The basic properties of a trie can be summarized as:

(1) The root node contains no character; every node other than the root contains exactly one character.

(2) Concatenating the characters on the path from the root node to a given node yields the string that corresponds to that node.

(3) All child nodes of a given node contain different characters.
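Property (2) can be illustrated with a short sketch that walks a trie depth-first and rebuilds each stored word from the characters on its path. The function names trieInsert and trieWords and the array node layout are assumptions for this example only, not the classes used in the implementation section.

```php
<?php
// Hypothetical node layout for this sketch:
// ['end' => bool, 'children' => [character => node]].
function trieInsert(array &$root, string $word): void
{
    $node = &$root;
    foreach (str_split($word) as $ch) {
        if (!isset($node['children'][$ch])) {
            $node['children'][$ch] = ['end' => false, 'children' => []];
        }
        $node = &$node['children'][$ch];
    }
    $node['end'] = true; // the path from the root to this node spells a stored word
}

// Depth-first walk: $prefix is the concatenation of the characters
// on the path from the root to the current node.
function trieWords(array $node, string $prefix = ''): array
{
    $words = $node['end'] ? [$prefix] : [];
    foreach ($node['children'] as $ch => $child) {
        $words = array_merge($words, trieWords($child, $prefix . $ch));
    }
    return $words;
}

$root = ['end' => false, 'children' => []];
foreach (['tea', 'ten', 'to', 'in', 'inn', 'int'] as $w) {
    trieInsert($root, $w);
}
echo implode(',', trieWords($root)), "\n"; // tea,ten,to,in,inn,int
```

Because every child of a node carries a different character (property 3), each path spells a distinct string, so no word is listed twice.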

II. Advantages of the Trie

1. The time complexity of finding or matching a string depends only on the depth of the tree (that is, on the length of the string), not on the number of nodes.

2. A trie is therefore well suited to searching, matching, or filtering large amounts of data.
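A minimal sketch of the first point, assuming an array-based trie and the hypothetical helpers buildArrayTrie and searchWithSteps (neither is part of the implementation below): the lookup reports how many child links it followed, and that number stays the same when a thousand unrelated words are added.

```php
<?php
// Hypothetical array-based trie used only for this illustration.
function buildArrayTrie(array $words): array
{
    $root = ['end' => false, 'children' => []];
    foreach ($words as $word) {
        $node = &$root;
        foreach (str_split($word) as $ch) {
            if (!isset($node['children'][$ch])) {
                $node['children'][$ch] = ['end' => false, 'children' => []];
            }
            $node = &$node['children'][$ch];
        }
        $node['end'] = true;
        unset($node); // drop the reference before the next word
    }
    return $root;
}

// Returns [found, number of child links followed].
function searchWithSteps(array $root, string $word): array
{
    $node = $root;
    $steps = 0;
    foreach (str_split($word) as $ch) {
        if (!isset($node['children'][$ch])) {
            return [false, $steps]; // stop at the first missing character
        }
        $node = $node['children'][$ch];
        $steps++;
    }
    return [$node['end'], $steps];
}

$base = ['tea', 'ten', 'to', 'in', 'inn', 'int'];
$extra = [];
for ($i = 0; $i < 1000; $i++) {
    $extra[] = 'x' . $i; // 1000 unrelated words
}
$small = buildArrayTrie($base);
$large = buildArrayTrie(array_merge($base, $extra));

echo json_encode(searchWithSteps($small, 'int')), "\n"; // [true,3]
echo json_encode(searchWithSteps($large, 'int')), "\n"; // [true,3]
echo json_encode(searchWithSteps($large, 'inx')), "\n"; // [false,2]
```

The number of steps is bounded by the length of the searched word. A miss can end even earlier, at the first character that has no matching child.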

III. PHP Implementation

trie.php

<?php
/**
 * Created by PhpStorm.
 * User: JYSDHR
 * Date: 2017/7/4
 * Time: 9:57
 * Description: PHP implementation of the trie (dictionary tree) data structure
 */
include "trienode.php";

class Trie
{
    private $root;

    public function __construct()
    {
        $this->root = new TrieNode();
    }

    // Dump the whole tree for debugging
    public function foreach_trie()
    {
        echo "<pre>";
        print_r($this->root);
        echo "</pre>";
    }

    public function insert($str)
    {
        $this->__insert($this->root, $str);
    }

    public function search(string $str): bool
    {
        return $this->__search($this->root, $str);
    }

    private function __insert($node, $str)
    {
        if (strlen($str) == 0) {
            return;
        }
        // The first character decides which fork (child) to follow
        $k = ord(substr($str, 0, 1)) - ord('A');
        if (!isset($node->childs[$k])) {
            // If the fork does not exist, open a new one
            $node->childs[$k] = new TrieNode();
            // Record the character (as its offset from 'A')
            $node->childs[$k]->nodeChar = $k;
        }
        if (strlen($str) == 1) {
            // Mark the end of a word even if the node already existed,
            // e.g. when "in" is inserted after "inn"
            $node->childs[$k]->is_end = true;
        }
        $nextWord = substr($str, 1);
        $this->__insert($node->childs[$k], $nextWord);
    }

    /**
     * @Description: Find out whether str exists in the tree
     * @User: JYSDHR
     */
    private function __search($node, $str)
    {
        if (strlen($str) == 0) {
            return false;
        }
        // Take the first character of str
        $k = ord(substr($str, 0, 1)) - ord('A');
        if (isset($node->childs[$k])) {
            $nextWord = substr($str, 1);
            if (strlen($str) == 1) {
                // Matched the last character
                if ($node->childs[$k]->is_end) {
                    return true;
                }
            }
            return $this->__search($node->childs[$k], $nextWord);
        }
        return false;
    }
}

trienode.php

<?php
/* A single node of the trie */
class TrieNode
{
    public $nodeChar, $childs, $is_end;

    public function __construct()
    {
        $this->childs = array();
        $this->is_end = false;
    }
}

testtrie.php

<?php
// Test file demo.php
include "trie.php";

$str = file_get_contents('Bbe.txt'); // read the entire contents of the file into a string
$badword = explode(" ", $str);       // convert it to an array
$trie = new Trie();
foreach ($badword as $word) {
    $trie->insert($word);
}

// array_combine() creates a new array that takes its keys from one array and
// its values from another; both arrays must have the same number of elements.
// array_fill() returns an array with the given number of elements, all set to
// the given value, using numeric indexes that start at the given position.
// ($badword1 is not used by the timings below.)
$badword1 = array_combine($badword, array_fill(0, count($badword), '*'));

$test_str = 'Knowledgeasdad';

// 1) in_array
$start_time = microtime(true);
var_dump(in_array($test_str, $badword));
$end_time = microtime(true);
echo ($end_time - $start_time) . ' <br> ';

// 2) trie search
$start_time1 = microtime(true);
var_dump($trie->search($test_str));
$end_time1 = microtime(true);
echo ($end_time1 - $start_time1) . ' <br> ';

// 3) manual traversal
$start_time2 = microtime(true);
foreach ($badword as $value) {
    if ($value == $test_str) {
        echo '1';
        break;
    }
}
$end_time2 = microtime(true);
echo ($end_time2 - $start_time2) . ' <br> ';
?>

The example above looks up a word in the text of the Bible. in_array is a linear traversal: the later the matching word appears in the array, the longer it takes, and when the word does not exist at all the time becomes unacceptable.

The trie's performance is comparatively stable: whether or not the word exists, and whether it appears early or late, the running time is largely unaffected and depends only on the length of the word. (Best result, most stable.)

The hand-written foreach traversal performs basically the same as in_array.

IV. Summary

Therefore, in project development, when implementing features such as sensitive-word filtering, word-frequency counting, or dictionaries, consider using a trie: build the trie, store it in serialized form, and rebuild or update it whenever the word list changes. It is not that difficult, is it?
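One possible shape of the "build, serialize, reuse" idea is sketched below. The class SketchNode and the functions buildNodeTrie and maskWords are hypothetical names introduced for this example, and the word list is made up. The sketch works byte by byte, so multi-byte text would need extra handling.

```php
<?php
// Hypothetical node class for this sketch (not the article's TrieNode).
final class SketchNode
{
    public $end = false;
    public $children = [];
}

function buildNodeTrie(array $words): SketchNode
{
    $root = new SketchNode();
    foreach ($words as $word) {
        $node = $root;
        foreach (str_split($word) as $ch) {
            if (!isset($node->children[$ch])) {
                $node->children[$ch] = new SketchNode();
            }
            $node = $node->children[$ch];
        }
        $node->end = true;
    }
    return $root;
}

// Replace every stored word found in $text with '*' characters,
// taking the longest match that starts at each position.
function maskWords(SketchNode $root, string $text): string
{
    $n = strlen($text);
    $out = '';
    $i = 0;
    while ($i < $n) {
        $node = $root;
        $matchLen = 0;
        for ($j = $i; $j < $n && isset($node->children[$text[$j]]); $j++) {
            $node = $node->children[$text[$j]];
            if ($node->end) {
                $matchLen = $j - $i + 1;
            }
        }
        if ($matchLen > 0) {
            $out .= str_repeat('*', $matchLen);
            $i += $matchLen;
        } else {
            $out .= $text[$i];
            $i++;
        }
    }
    return $out;
}

// Build once, then serialize; the string could be written to a file or cache.
$stored = serialize(buildNodeTrie(['bad', 'badge']));

// Later: restore the trie without re-inserting every word.
$restored = unserialize($stored, ['allowed_classes' => [SketchNode::class]]);
echo maskWords($restored, 'a bad badge'), "\n"; // a *** *****
```

When the word list changes, the trie would be rebuilt (or updated) and serialized again. Whether that is cheaper than rebuilding on every request depends on the size of the word list and is worth measuring.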

