Let's put together a more interesting program. This time we will test whether a string fits a description that is encoded as a concise pattern.
In these patterns, some characters and character combinations have special meanings, including:
[]     range specification (for example, [a-z] means a letter in the range a through z)
\w     letter, digit, or underscore; equivalent to [0-9A-Za-z_]
\W     any character that \w does not match
\s     whitespace character; equivalent to [ \t\n\r\f]
\S     non-whitespace character
\d     digit; equivalent to [0-9]
\D     non-digit character
\b     backspace (0x08) (only inside a range specification)
\b     word boundary (outside a range specification)
\B     non-word boundary
*      zero or more repetitions of the preceding element
+      one or more repetitions of the preceding element
{m,n}  at least m and at most n repetitions of the preceding element
?      at most one repetition of the preceding element; equivalent to {0,1}
|      matches either the preceding or the following expression
()     grouping
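A few of the entries above can be checked directly in Ruby. The sample strings below are made up for illustration, and the sketch is only a quick sanity check rather than a complete tour of the table.

```ruby
# Sample strings are illustrative assumptions, not taken from the text.
first_digit = "ruby 1.8" =~ /\d/          # index of the first digit
first_space = "ruby 1.8" =~ /\s/          # index of the first whitespace character
non_word    = "ruby" =~ /\W/              # nil: the string has no non-word character
optional_u  = "colour".sub(/ou?r/, "or")  # ? makes the "u" optional
pieces      = "a,b;c".split(/,|;/)        # | accepts either separator
repeated    = "aaa" =~ /^a{2,3}$/         # {2,3}: two or three repetitions

p first_digit   # => 5
p first_space   # => 4
p non_word      # => nil
p optional_u    # => "color"
p pieces        # => ["a", "b", "c"]
p repeated      # => 0
```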
Patterns built from this strange vocabulary are called regular expressions. In Ruby, as in Perl, they are usually enclosed in forward slashes rather than double quotes. If you have never worked with regular expressions before, they probably look anything but regular, but it is worth spending some time getting familiar with them. Their expressive power will save you headaches (and many lines of code) whenever you need to match patterns, search, or otherwise manipulate strings.
For example, suppose we want to test whether a string fits this description: "starts with a lowercase f, followed immediately by an uppercase letter, and optionally by more characters after that, as long as none of them is a lowercase letter." If you are a seasoned C programmer, you have probably already written a dozen lines of code in your head, right? Admit it, you can hardly help yourself. In Ruby, you only need to test your string against the regular expression /^f[A-Z][^a-z]*$/.
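As a minimal sketch, the description above can be tested with the pattern /^f[A-Z][^a-z]*$/. The method name f_then_upper? and the sample strings are assumptions introduced here for illustration.

```ruby
# Hypothetical helper: true when s starts with a lowercase f, followed by one
# uppercase letter, followed only by characters that are not lowercase letters.
def f_then_upper?(s)
  !(s =~ /^f[A-Z][^a-z]*$/).nil?
end

p f_then_upper?("fX-99")  # => true
p f_then_upper?("fXy")    # => false (lowercase letter after the capital)
p f_then_upper?("Fx")     # => false (does not start with a lowercase f)
p f_then_upper?("f9")     # => false (no uppercase letter after the f)
```

Note that in Ruby ^ and $ anchor at line boundaries, so for multi-line input \A and \z are the anchors that pin a pattern to the whole string.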
What about "contains a hexadecimal number enclosed in angle brackets"? No problem.
ruby> def chab(s)   # "contains hex in angle brackets"
    |   (s =~ /<0(x|X)(\d|[a-f]|[A-F])+>/) != nil
    | end
   nil
ruby> chab "Not this one."
   false
ruby> chab "Maybe this? {0x35}"    # wrong kind of brackets
   false
ruby> chab "Or this? <0x38z7e>"    # bogus hex digit
   false
ruby> chab "Okay, this: <0xfc0004>."
   true
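The same test can be written a little more compactly. The following is an alternative sketch, not the guide's own definition: it replaces the alternations with character classes, and the name chab2 is made up here.

```ruby
# Hypothetical variant of chab: character classes instead of (x|X) and (\d|[a-f]|[A-F]).
def chab2(s)
  !(s =~ /<0[xX][0-9a-fA-F]+>/).nil?
end

p chab2("Not this one.")             # => false
p chab2("Maybe this? {0x35}")        # => false
p chab2("Or this? <0x38z7e>")        # => false
p chab2("Okay, this: <0xfc0004>.")   # => true
```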
Although regular expressions can look puzzling at first, you will soon appreciate how economically they let you express yourself.
Here is a small program to help you experiment with regular expressions. Save it as regx.rb and run it by typing "ruby regx.rb" at the command line.
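# Requires an ANSI terminal!

st = "\033[7m"   # switch to reverse video
en = "\033[m"    # reset display attributes

while true
  print "str> "
  STDOUT.flush
  str = gets
  break if not str
  str.chomp!
  print "pat> "
  STDOUT.flush
  re = gets
  break if not re
  re.chomp!
  # Compile the typed text into a Regexp; a plain String pattern
  # would be matched literally by gsub! in current Ruby versions.
  str.gsub! Regexp.new(re), "#{st}\\&#{en}"
  print str, "\n"
end
print "\n"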
This program asks for input twice: first a string, then a regular expression. The string is tested against the regular expression and then displayed with every match highlighted in reverse video. Don't worry about the details for now; an analysis of the code comes later.
str> foobar
pat> ^fo+
foobar
~~~
The matched portion, shown in red on the original page, appears in reverse video in the program's output. The "~~~" line beneath it marks the match for the benefit of those using text-based browsers.
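The highlighting step can also be tried without typing at a prompt. The sketch below assumes a hypothetical helper named highlight. It uses the block form of gsub, which is a different technique from the "\\&" replacement string in the program, to wrap each match in the same ANSI reverse-video codes.

```ruby
# Hypothetical helper: returns str with every match of pattern_text wrapped
# in ANSI reverse-video escape sequences.
def highlight(str, pattern_text)
  reverse_on  = "\033[7m"
  reverse_off = "\033[m"
  str.gsub(Regexp.new(pattern_text)) { |m| "#{reverse_on}#{m}#{reverse_off}" }
end

# p shows the escape sequences instead of sending them to the terminal.
p highlight("foobar", "^fo+")   # => "\e[7mfoo\e[mbar"
```

Calling puts instead of p displays the highlighted text on an ANSI terminal.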
Let's try a few more inputs:
str> abc012dbcd555
pat> \d
abc012dbcd555
   ~~~    ~~~
If that surprised you, refer to the table at the beginning of this page: \d has nothing to do with the letter d; it matches a single digit.
What if there is more than one way to match a pattern?
str> foozboozer
pat> f.*z
foozboozer
~~~~~~~~
foozbooz is matched, not just fooz, because * is greedy: it takes as many characters as it can while still letting the rest of the pattern match.
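The difference is easy to see by comparing the greedy pattern with a non-greedy one. The non-greedy quantifier *? is not covered in the table above; it is brought in here only as a point of comparison.

```ruby
s = "foozboozer"

greedy = s[/f.*z/]    # * takes as much as it can
lazy   = s[/f.*?z/]   # *? stops at the first "z" that lets the pattern match

p greedy   # => "foozbooz"
p lazy     # => "fooz"
```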
Here is a pattern that isolates a colon-delimited time field in a string.
str> Wed Feb 7 08:58:04 JST 1996
pat> [0-9]+:[0-9]+(:[0-9]+)?
Wed Feb 7 08:58:04 JST 1996
          ~~~~~~~~
"=~" is the matching operator for regular expressions. It returns the position in the string where a match was found, or nil if the pattern did not match.
ruby> "abcdef" =~ /d/
   3
ruby> "aaaaaa" =~ /d/
   nil
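Because =~ returns a position rather than true or false, it helps to remember that 0 counts as true in Ruby, so a match at the very start of a string still passes an if test. The sketch below uses a made-up helper, match_position, and reuses the time pattern from the earlier example.

```ruby
# Hypothetical helper: reports where re first matches in str.
def match_position(str, re)
  pos = str =~ re
  pos ? "matched at #{pos}" : "no match"
end

stamp   = "Wed Feb 7 08:58:04 JST 1996"
time_re = /[0-9]+:[0-9]+(:[0-9]+)?/

p match_position("abcdef", /d/)    # => "matched at 3"
p match_position("abcdef", /a/)    # => "matched at 0"
p match_position("aaaaaa", /d/)    # => "no match"
p match_position(stamp, time_re)   # => "matched at 10"
p stamp[time_re]                   # => "08:58:04"
```

The last line uses String#[] with a regular expression, which returns the matched text itself instead of its position.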