Automata Mind Map

Source: Internet
Author: User
Automata are used in many places: not only in compilers, but also in regular expressions, string search and storage, model checking for complex systems, theorem proving, and more. The simplest automaton is the DFA. Put simply, a DFA is a Map<int, Map<char, int>>, where the int is the state ID and the char is the transition label. If that is still not simple enough, the simplest form is int[][256], a matrix with 256 columns. Such a simple thing embodies profound ideas and has a very wide range of uses.

A few months ago, driven by project needs and my own interest, I implemented a series of automata algorithms and data structures (written in C++11). The results far exceeded what I imagined and expected when I started: for several of the algorithms, my implementation is far better than any existing implementation, not only much faster but also much smaller in memory usage.

Today I have a little free time, so I have organized a mind map; I will describe its items one by one later. The mind map is best viewed in a new window, since only a small part of it is visible here.

DFA

1 DFA minimization
  1.1 General DFA minimization
    1.1.1 Hopcroft algorithm
      1.1.1.1 partition refinement
      1.1.1.2 a mutation to the splitter generator, for performance
      1.1.1.3 waiting set strategy
        1.1.1.3.1 a stack is faster, a queue is slower
      1.1.1.4 intersections of partitions are all empty
        * an efficient implementation needs a doubly linked list
        * use the array index as the link
  1.2 path compression
    This is not DFA minimization, it is just an optimization.
    * a sequence of states which all have just 1 transition
    * the first state of the sequence cannot be a final state
    * the last state of the sequence cannot be a confluence state

2 ADFA (acyclic DFA)
  2.1 acyclic DFA minimization
    The general minimization algorithm works for this case, but there are dedicated algorithms which need less memory and are sometimes faster.
    2.1.1 sorted input
    2.1.2 random input
    2.1.3 MinADFA or DAWG
  2.2 does not support mapping: use it as a set (this is a dynamic set)
  2.3 identify each word, so that it can be used as a map
    2.3.1 the word ID is the dictionary ordinal of the word in the DAWG; the word <=> ID mapping is bidirectional
    2.3.2 cltk implementation
      2.3.2.1 save a number in the state
      2.3.2.2 the number is the total number of paths from the state to all reachable final states
      2.3.2.3 pros: updating the numbers when inserting words in random order can be efficient enough
      2.3.2.4 cons: mapping a word to its (dictionary) ordinal is not fast enough: O(|Σ| * strlen(word))
    2.3.3 febird implementation
      2.3.3.1 save a number in the transition
      2.3.3.2 Let S be the source of the transition and c be the label char of the transition. The number saved in the transition is the total number of paths from S whose label char is less than c.
        2.3.3.2.1 the unique single transition of a state is saved in the state itself, and its number is always 0, so it can be omitted
        2.3.3.2.2 in applications, about 50% of states have just a single transition
        2.3.3.2.3 for states which have multiple transitions, the number of the first transition is always 0; this 0 is not omitted, because omitting it would make the implementation code much more complex and nasty
      2.3.3.3 pros: mapping a word to its dictionary ordinal is fast: O(strlen(word))
      2.3.3.4 cons: updating the numbers when dynamically inserting words cannot be efficient

3 Aho-Corasick multiple pattern matching
  3.1 combine with the numbered ADFA
  3.2 storing the word set as adjacent differences uses less memory

4 febird implementation
  * most states have 1 single transition
  * few states have a few transitions
  * very few states have many transitions
  * almost all states have a small range of label chars
  4.1 state layout
      struct state32 { // there are other impls
          uint32_t spos;
          char_t   minch;
          char_t   maxch; // inclusive
          unsigned term_bit : 1;
          unsigned pzip_bit : 1;
      };
      // if minch == maxch, spos is the target state_id
      // else spos is the offset to the targets map in the mempool
      // the targets map consists of a bitmap and the target ids
  4.2 all states are saved in a vector
    * states can be deleted
    * deleted states are put into a freelist
    * when the state vector is full, double its capacity
  4.3 written in C++11
    * templatized, for performance & memory utilization
    * for conciseness, lambdas are used extensively
    * enum class is used, just for clarity

5 abstract concept: Map<state, Map<char, state>>
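
To make the "matrix with 256 columns" view of a DFA concrete, here is a minimal sketch of a dense transition table. It is not the implementation described in this post; the names DenseDfa, kNoState, add_state, add_transition, accepts and make_ab_star are hypothetical, and state 0 is assumed to be the start state.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical marker for "no transition on this byte".
constexpr int kNoState = -1;

// A DFA stored as one 256-entry row per state: next[state][byte] -> target.
struct DenseDfa {
    std::vector<std::array<int, 256>> next;
    std::vector<bool> is_final;

    // Append a state with no outgoing transitions and return its id.
    int add_state(bool final_state) {
        std::array<int, 256> row;
        row.fill(kNoState);
        next.push_back(row);
        is_final.push_back(final_state);
        return static_cast<int>(next.size()) - 1;
    }

    void add_transition(int from, unsigned char label, int to) {
        assert(from >= 0 && from < static_cast<int>(next.size()));
        assert(to >= 0 && to < static_cast<int>(next.size()));
        next[from][label] = to;
    }

    // Walk from state 0 (assumed start state); accept only if the walk
    // consumes the whole input and ends in a final state.
    bool accepts(const std::string& input) const {
        if (next.empty()) return false;
        int state = 0;
        for (unsigned char ch : input) {
            state = next[state][ch];
            if (state == kNoState) return false;
        }
        return is_final[state];
    }
};

// Example automaton for the language "a" followed by any number of "b".
DenseDfa make_ab_star() {
    DenseDfa dfa;
    int start = dfa.add_state(false);
    int seen_a = dfa.add_state(true);
    dfa.add_transition(start, 'a', seen_a);
    dfa.add_transition(seen_a, 'b', seen_a);
    return dfa;
}
```

With this table, make_ab_star().accepts("abbb") is true and accepts("aba") is false. Each state costs 256 ints here, which is the kind of overhead that a compact layout such as the state32 struct in item 4.1 is meant to avoid.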

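The transition numbering of item 2.3.3 can be sketched as follows. This is only an illustration under stated assumptions, not febird's code: the automaton is an unminimized trie (which is still an acyclic DFA), transitions live in a std::map (so each step costs O(log |Σ|) rather than O(1)), and the names NumberedAdfa, insert, renumber and ordinal are hypothetical. The sketch also adds 1 whenever the walk leaves a final state, because the shorter word ending there sorts before every longer word; that detail is my assumption about how the ordinal is completed, since the outline does not spell it out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

constexpr std::size_t kUnknown = static_cast<std::size_t>(-1);

struct NumberedAdfa {
    struct Edge { int target; std::size_t num; };
    struct State {
        bool is_final = false;
        std::map<unsigned char, Edge> edges;  // iterated in label order
    };
    std::vector<State> states;

    NumberedAdfa() : states(1) {}  // state 0 is the root

    // Plain trie insertion; numbers are stale until renumber() is called.
    void insert(const std::string& word) {
        int s = 0;
        for (unsigned char ch : word) {
            auto it = states[s].edges.find(ch);
            if (it != states[s].edges.end()) {
                s = it->second.target;
                continue;
            }
            states.push_back(State());
            int t = static_cast<int>(states.size()) - 1;
            states[s].edges[ch] = Edge{t, 0};
            s = t;
        }
        states[s].is_final = true;
    }

    // Recompute every transition number; returns the number of words stored.
    std::size_t renumber() {
        std::vector<std::size_t> paths(states.size(), kUnknown);
        return count_paths(0, paths);
    }

    // 0-based dictionary ordinal of word, or -1 if the word is not in the set.
    long ordinal(const std::string& word) const {
        int s = 0;
        std::size_t ord = 0;
        for (unsigned char ch : word) {
            auto it = states[s].edges.find(ch);
            if (it == states[s].edges.end()) return -1;
            if (states[s].is_final) ++ord;  // the word ending at s sorts first
            ord += it->second.num;          // paths from s with a smaller label
            s = it->second.target;
        }
        return states[s].is_final ? static_cast<long>(ord) : -1;
    }

private:
    // Number of accepting paths starting at s (memoized so shared states
    // in a DAG are counted once); fills in Edge::num along the way.
    std::size_t count_paths(int s, std::vector<std::size_t>& paths) {
        assert(s >= 0 && static_cast<std::size_t>(s) < states.size());
        if (paths[s] != kUnknown) return paths[s];
        std::size_t smaller = 0;
        for (auto& kv : states[s].edges) {
            kv.second.num = smaller;  // first transition always gets 0
            smaller += count_paths(kv.second.target, paths);
        }
        paths[s] = smaller + (states[s].is_final ? 1 : 0);
        return paths[s];
    }
};
```

For example, after inserting "ab", "a", "b", "abc" and "ba" in any order and calling renumber(), the dictionary order is a, ab, abc, b, ba, so ordinal("a") is 0 and ordinal("ba") is 4. Note that renumber() revisits the whole automaton, which matches the "cons" in item 2.3.3.4: keeping the numbers up to date under dynamic insertion is the expensive part.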