1. Hashing
A string hash function like the following is commonly used:
// Handwritten, not rigorously tested
#include <assert.h>
#include <stddef.h>

unsigned long hash(const char *str)
{
    assert(NULL != str);
    unsigned long hash_val = 0xdeedbeeful; // hash seed
    // Read the bytes as unsigned so that characters >= 0x80 are not negative
    const unsigned char *p = (const unsigned char *)str;
    while (*p != '\0') {
        hash_val = 37 * hash_val + *p;
        ++p;
    }
    return hash_val;
}
The Markov chain implementation in chapter 3 of The Practice of Programming uses an almost identical hash function. Its advantages are that it is fast and that it distributes English words well. However, it is so simple that it is easy to attack. There are two methods of attack: 1) construct an input sequence in which all strings are distinct but have the same hash value; 2) construct an input sequence in which all strings are distinct and the hash values are not necessarily equal, but all leave the same remainder modulo the number of buckets in the hash table (that is, hash_val % bucket_size is equal). Either way, every key lands in the same bucket and the hash table degenerates into a linked list, so each lookup takes time linear in the number of keys instead of roughly constant time. (http://www.cs.rice.edu/~scrosby/hash/)
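The first attack can be sketched for the multiplier-37 hash shown earlier. This is only an illustration under stated assumptions: the character set is ASCII, and the helper make_colliding_key is a hypothetical name that does not come from the cited paper. In ASCII, 37 * 'A' + 'z' and 37 * 'B' + 'U' are both 2527, so the two-character blocks "Az" and "BU" are interchangeable, and any key built from k such blocks has the same hash value as every other such key.

```c
#include <assert.h>
#include <stddef.h>

/* Same multiplicative hash as in the text (multiplier 37). */
unsigned long hash(const char *str)
{
    assert(str != NULL);
    unsigned long hash_val = 0xdeedbeeful; /* hash seed */
    const unsigned char *p = (const unsigned char *)str;
    while (*p != '\0') {
        hash_val = 37 * hash_val + *p;
        ++p;
    }
    return hash_val;
}

/*
 * Hypothetical helper: writes one of 2^blocks distinct keys into out,
 * which must have room for 2 * blocks + 1 bytes. Bit i of index picks
 * "Az" (0) or "BU" (1) for block i. Assumes ASCII, where
 * 37 * 'A' + 'z' == 37 * 'B' + 'U' == 2527.
 */
void make_colliding_key(unsigned long index, size_t blocks, char *out)
{
    assert(out != NULL);
    assert(blocks < 32); /* keep the shift below well defined */
    for (size_t i = 0; i < blocks; i++) {
        const char *block = ((index >> i) & 1ul) ? "BU" : "Az";
        out[2 * i] = block[0];
        out[2 * i + 1] = block[1];
    }
    out[2 * blocks] = '\0';
}
```

Calling make_colliding_key with index 0 through 2^k - 1 and the same k produces 2^k distinct keys of length 2k that all hash to one value, whatever the seed, because each block contributes the same amount to the running hash.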
One way to solve this problem is to use a much more complex hash function (such as MD5 or SHA-1), so that it is much harder for an attacker to construct a sequence of inputs with the same hash value.
2. Regular Expressions
There are generally three types of regular expression engine: DFA (deterministic finite automaton), traditional NFA (nondeterministic finite automaton), and POSIX NFA. In that order, each one is more powerful and slower than the one before. (http://msdn.microsoft.com/library/en-us/cpguide/html/cpconmatchingbehavior.asp) (http://www.oreilly.com/catalog/regex/chapter/ch04.html)
A DFA is the fastest and guarantees matching time linear in the length of the input. An NFA has to use backtracking in order to support backreferences. An NFA usually runs in linear time, but its worst case is exponential time!
For example, the expression (x+)+y matches xxx...xxxy but does not match xxx...xxxx. In some versions of Python, matching this expression against such a non-matching input takes exponential time. (http://mail.python.org/pipermail/python-dev/2003-May/035916.html)
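The blow-up can be reproduced with a toy backtracking matcher hard-wired to (x+)+y. This is a minimal sketch for illustration, not the code of Python or of any real engine; the names match_groups, match_x_plus_plus_y, and the steps counter are all hypothetical. Each call tries every possible length for one x+ group, greedy first, and recurses for the next group.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many group-length choices the matcher has tried. */
static unsigned long steps;

/* Backtracking match of (x+)+y anchored at the start of s. */
static int match_groups(const char *s)
{
    size_t run = 0;
    while (s[run] == 'x')
        run++;
    /* Try the longest x+ group first, then back off one x at a time. */
    for (size_t len = run; len >= 1; len--) {
        steps++;
        if (s[len] == 'y')
            return 1;              /* the group is followed by y: match */
        if (match_groups(s + len))
            return 1;              /* another x+ group led to a match */
    }
    return 0;                      /* every split failed */
}

/* Returns 1 on a match and stores the number of choices tried. */
int match_x_plus_plus_y(const char *s, unsigned long *out_steps)
{
    assert(s != NULL && out_steps != NULL);
    steps = 0;
    int matched = match_groups(s);
    *out_steps = steps;
    return matched;
}
```

On n x's followed by y, the greedy first choice succeeds after a single step. On n x's with no y, the matcher tries every way of splitting the run into groups, which is 2^n - 1 steps, so each additional x doubles the work.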
3. Quicksort
The average running time of quicksort is O(n log n), but it can degrade to O(n^2), as every data structures course teaches. The C library's qsort is not guaranteed to be a "quick" sort either (http://www.cs.dartmouth.edu/~doug/mdmspe.pdf). The C++ library's sort() could also degrade to a crawling O(n^2) sort before it was reworked. SGI STL sort() uses a new, hybrid sorting algorithm, introsort (introspective sort), that behaves almost exactly like median-of-3 quicksort for most inputs (and is just as fast) but which is capable of detecting when partitioning is tending toward quadratic behavior. By switching to heapsort in those situations, introsort achieves the same O(n log n) time bound as heapsort but is almost always faster than just using heapsort in the first place (http://www.cs.rpi.edu/~musser/gp/algorithms.html). For an analysis of this sorting algorithm, see (http://www.cs.rpi.edu/~musser/gp/introsort.ps).
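The idea behind introsort can be sketched in a few dozen lines of C. This is a simplified illustration for int arrays, not the SGI STL code: the depth limit of 2 * floor(log2 n) is one common choice and is assumed here, the partition is a plain Lomuto scheme, the final insertion-sort pass used by production versions is omitted, and all function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Restore the max-heap property below root in the heap a[0..n). */
static void sift_down(int *a, size_t root, size_t n)
{
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= n) return;
        if (child + 1 < n && a[child] < a[child + 1]) child++;
        if (a[root] >= a[child]) return;
        swap_int(&a[root], &a[child]);
        root = child;
    }
}

static void heapsort_ints(int *a, size_t n)
{
    if (n < 2) return;
    for (size_t i = n / 2; i-- > 0; ) sift_down(a, i, n);
    for (size_t end = n - 1; end > 0; end--) {
        swap_int(&a[0], &a[end]);   /* move the current maximum to its place */
        sift_down(a, 0, end);
    }
}

/* Index of the median of a[i], a[j], a[k]. */
static size_t median_index(const int *a, size_t i, size_t j, size_t k)
{
    if (a[i] < a[j]) {
        if (a[j] < a[k]) return j;
        return a[i] < a[k] ? k : i;
    }
    if (a[i] < a[k]) return i;
    return a[j] < a[k] ? k : j;
}

/* Lomuto partition around the median of 3; returns the pivot's final index. */
static size_t partition(int *a, size_t n)
{
    swap_int(&a[median_index(a, 0, n / 2, n - 1)], &a[n - 1]);
    int pivot = a[n - 1];
    size_t store = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (a[i] < pivot) {
            swap_int(&a[i], &a[store]);
            store++;
        }
    }
    swap_int(&a[store], &a[n - 1]);
    return store;
}

static void introsort_loop(int *a, size_t n, unsigned depth_limit)
{
    while (n > 1) {
        if (depth_limit == 0) {     /* partitioning is going badly: bail out */
            heapsort_ints(a, n);
            return;
        }
        depth_limit--;
        size_t p = partition(a, n);
        introsort_loop(a + p + 1, n - p - 1, depth_limit); /* right part */
        n = p;                                             /* loop on the left part */
    }
}

void introsort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    unsigned depth_limit = 0;
    for (size_t m = n; m > 1; m >>= 1)
        depth_limit += 2;           /* 2 * floor(log2 n) */
    introsort_loop(a, n, depth_limit);
}
```

Because every partitioning step uses up one unit of the depth budget, an input that keeps producing lopsided partitions (for this sketch, an array of equal values is enough) exhausts the budget after O(log n) levels and the remaining range is handed to heapsort, which keeps the total work at O(n log n).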