When it comes to wildcards, * and ? come to mind right away. Wildcards add a great deal of expressive power, and many Linux commands support them. What they implement is the glob-style pattern.
Even the Redis KEYS command supports glob patterns.
The glob I want to implement supports the following features:
- * matches zero or more arbitrary characters
- ? matches exactly one arbitrary character
- [characters] matches any one character listed in the square brackets; for example, [abc] matches a, b, or c
- [!characters] matches any one character that is not listed in the square brackets
- [character-character] matches any one character in the range between the two characters, as in [a-z] or [0-9]
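For comparison, Python's standard library already provides these rules in the fnmatch module. The short sketch below is not part of the implementation in this post; it only uses fnmatch.fnmatchcase to show each feature from the list on a sample string. The sample strings and the examples name are made up for illustration.

```python
from fnmatch import fnmatchcase

# (string, pattern) pairs, one per feature in the list above.
# These sample values are illustrative only.
examples = [
    ("hello.txt", "*.txt"),  # * matches zero or more characters
    ("a1", "a?"),            # ? matches exactly one character
    ("b", "[abc]"),          # any one character from the set
    ("d", "[!abc]"),         # any one character not in the set
    ("7", "[0-9]"),          # any one character in the range
    ("ab", "a?b"),           # ? must consume a character, so this fails
]

for s, pattern in examples:
    print(s, pattern, fnmatchcase(s, pattern))
```

Only the last pair fails to match, because ? cannot match an empty string.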
The implementation is actually quite simple: scan the string s and the pattern p from left to right, and if both reach the end together, the match succeeds.
The main difficulty is matching *. Because * can match zero or more characters, backtracking is needed. The approach here is to save the position of the most recent *; when a later character fails to match, jump back to that * and let it consume one more character before trying again.
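The backtracking idea can be sketched in isolation, leaving square brackets out. This is a minimal sketch that assumes the pattern contains only literals, ? and *; the names match_star, star and mark are chosen here for illustration.

```python
def match_star(s, p):
    """Match s against p, where p contains only literals, '?' and '*'."""
    si = pi = 0
    star = -1   # index of the most recent '*' in p, or -1 if none seen yet
    mark = 0    # index in s up to which that '*' has consumed characters
    while si < len(s):
        if pi < len(p) and p[pi] == '*':
            # Remember the '*' and first try letting it match nothing.
            star, mark = pi, si
            pi += 1
        elif pi < len(p) and p[pi] in ('?', s[si]):
            # '?' or an identical literal: consume one character of each.
            si += 1
            pi += 1
        elif star != -1:
            # Mismatch: return to the '*' and let it absorb one more character.
            mark += 1
            si, pi = mark, star + 1
        else:
            return False
    # Any '*' left at the end of the pattern can match the empty string.
    while pi < len(p) and p[pi] == '*':
        pi += 1
    return pi == len(p)


print(match_star("cxyzbazba", "c*ba"))
print(match_star("aab", "c*a*b"))
```

Only the most recent * needs to be remembered: once a later * is reached, everything before it has already matched, and the later * can absorb whatever an earlier one would have.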
As for expanding square brackets, the straightforward approach is to build an include set and an exclude set for each bracket expression.
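Expanding a single bracket expression can be sketched as follows. This is one possible shape, not necessarily the one used in the full listing: instead of two separate sets it returns one set plus a negated flag. The names parse_class and class_matches are hypothetical, and the sketch assumes the expression is well-formed, meaning it has a closing ].

```python
def parse_class(p, start):
    """Parse the bracket expression whose '[' is at p[start].

    Returns (negated, chars, next_index), where next_index is the index
    just past the closing ']'. Assumes a closing ']' exists.
    """
    i = start + 1
    negated = p[i] == '!'
    if negated:
        i += 1
    chars = set()
    while p[i] != ']':
        if p[i + 1] == '-' and p[i + 2] != ']':
            # A range such as a-z: add every character from a to z.
            chars.update(chr(c) for c in range(ord(p[i]), ord(p[i + 2]) + 1))
            i += 3
        else:
            chars.add(p[i])
            i += 1
    return negated, chars, i + 1


def class_matches(ch, negated, chars):
    """True if ch is accepted by the parsed bracket expression."""
    return (ch in chars) != negated


negated, chars, nxt = parse_class("ab[a-c]", 2)
print(negated, sorted(chars), nxt)
print(class_matches("d", negated, chars))
```

Expanding a range into a set up front costs some memory for wide ranges, but it turns the check during matching into a plain set lookup.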
The full code is below.
# coding=utf-8

def build_expand(p):
    # Expand every [...] expression in the pattern ahead of time.
    # Each dict is keyed by the index of the opening '['.
    # The pattern is assumed to be well-formed (every '[' has a ']').
    ptr2include = {}
    ptr2exclude = {}
    ptr2next = {}
    len_p = len(p)
    pptr = 0
    while pptr < len_p:
        if p[pptr] == '[':
            start = pptr
            pptr += 1
            include = set()
            exclude = set()
            # A leading '!' negates the whole expression.
            chars = include
            if p[pptr] == '!':
                chars = exclude
                pptr += 1
            while p[pptr] != ']':
                if p[pptr + 1] == '-' and p[pptr + 2] != ']':
                    # A range such as a-z: add every character in between.
                    chars.update({chr(x) for x in range(ord(p[pptr]), ord(p[pptr + 2]) + 1)})
                    pptr += 3
                else:
                    chars.add(p[pptr])
                    pptr += 1
            if include:
                ptr2include[start] = include
            if exclude:
                ptr2exclude[start] = exclude
            ptr2next[start] = pptr + 1
        else:
            pptr += 1
    return ptr2include, ptr2exclude, ptr2next


def isMatch(s, p):
    len_s = len(s)
    len_p = len(p)
    sptr = pptr = ss = 0
    star = None
    ptr2include, ptr2exclude, ptr2next = build_expand(p)
    while sptr < len_s:
        # '?' or a literal character ('*' and '[' are handled below).
        if pptr < len_p and p[pptr] not in '*[' and p[pptr] in ('?', s[sptr]):
            sptr += 1
            pptr += 1
            continue
        if pptr < len_p and p[pptr] == '[':
            if pptr in ptr2include and s[sptr] in ptr2include[pptr]:
                sptr += 1
                pptr = ptr2next[pptr]
                continue
            if pptr in ptr2exclude and s[sptr] not in ptr2exclude[pptr]:
                sptr += 1
                pptr = ptr2next[pptr]
                continue
        if pptr < len_p and p[pptr] == '*':
            # Remember where the '*' is and first let it match nothing.
            star = pptr
            pptr += 1
            ss = sptr
            continue
        if star is not None:
            # Mismatch: go back to the last '*' and let it take one more character.
            pptr = star + 1
            ss += 1
            sptr = ss
            continue
        return False
    # A trailing '*' can match the empty string.
    while pptr < len_p and p[pptr] == '*':
        pptr += 1
    return pptr == len_p


if __name__ == '__main__':
    params = [
        ("aa", "a"),
        ("aa", "aa"),
        ("aaa", "aa"),
        ("aa", "*"),
        ("aa", "a*"),
        ("ab", "?*"),
        ("aab", "c*a*b"),
        ("cab", "c*a*b"),
        ("cxyzbazba", "c*ba"),
        ('abc', 'ab[a-c]'),
        ('abd', 'ab[a-c]'),
        ('abe', 'ab[cde]'),
        ('abe', 'ab[!e]'),
        ('abe', 'ab[!c]'),
    ]
    for p in params:
        print(p, isMatch(*p))
Running it prints:
('aa', 'a') False
('aa', 'aa') True
('aaa', 'aa') False
('aa', '*') True
('aa', 'a*') True
('ab', '?*') True
('aab', 'c*a*b') False
('cab', 'c*a*b') True
('cxyzbazba', 'c*ba') True
('abc', 'ab[a-c]') True
('abd', 'ab[a-c]') False
('abe', 'ab[cde]') True
('abe', 'ab[!e]') False
('abe', 'ab[!c]') True
Implement Glob style pattern with Python