Tutorial on the basic usage of regular expressions in Ruby programs

Source: Internet
Author: User

Most of Ruby's built-in types will be familiar from other programming languages: strings, integers, floats, arrays, and so on. Built-in support for regular expressions, however, is typically found only in scripting languages such as Ruby, Perl, and awk. Regular expressions can look cryptic, but they are a powerful tool for processing text.

A regular expression is a way of specifying a pattern of characters to be matched in a string. In Ruby, you typically create one by writing the pattern between forward slashes: /pattern/.

And Ruby being Ruby, regular expressions are objects and can be manipulated like any other object.

For example, the following regular expression matches a string that contains Perl or Python.

/Perl|Python/

Between the forward slashes are the two strings we want to match, separated by a pipe character (|). The pipe means "either the thing on the left or the thing on the right", in this case either Perl or Python.

You can use parentheses within patterns, just as in arithmetic expressions, so this pattern can also be written as:

/P(erl|ython)/
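
As a quick check, the sketch below runs both spellings against a few sample strings to confirm they accept the same input. The constant names and the sample strings are made up for illustration.

```ruby
# Two spellings of the same alternation (names are illustrative only).
FLAT    = /Perl|Python/
GROUPED = /P(erl|ython)/

SAMPLES = ["I like Perl", "Python is fine", "Ruby only"]

SAMPLES.each do |text|
  # match? returns true or false without building a MatchData object
  puts "#{text.inspect}: flat=#{FLAT.match?(text)} grouped=#{GROUPED.match?(text)}"
end
```

Both patterns are ordinary Regexp objects, so they can be stored in constants, passed to methods, and compared like any other value.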

You can also specify repetition within patterns. For example, /ab+c/ matches a string containing an a followed by one or more b's, followed by a c. Replace the plus sign with an asterisk, and /ab*c/ matches one a, followed by zero or more b's, followed by a c.
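
The difference between the two quantifiers shows up only when there are no b's at all. A minimal sketch, with constant names chosen just for this example:

```ruby
# + requires at least one b; * also accepts none.
PLUS = /ab+c/
STAR = /ab*c/

%w[ac abc abbbc].each do |word|
  puts "#{word}: ab+c=#{PLUS.match?(word)} ab*c=#{STAR.match?(word)}"
end
# ac: ab+c=false ab*c=true
# abc: ab+c=true ab*c=true
# abbbc: ab+c=true ab*c=true
```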

You can also match one of a group of characters within a pattern. Common character classes include \s, which matches a whitespace character (space, tab, line break, and so on); \d, which matches any digit; and \w, which matches any typical word character. A period (.) matches (almost) any character.
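
To see what each class picks up, the following sketch scans a made-up sample string (the constant name and its contents are assumptions for illustration):

```ruby
CLASS_SAMPLE = "Room 42,\tfloor_3\n"

p CLASS_SAMPLE.scan(/\d/)   # ["4", "2", "3"]
p CLASS_SAMPLE.scan(/\s/)   # [" ", "\t", "\n"]
p CLASS_SAMPLE.scan(/\w+/)  # ["Room", "42", "floor_3"]
p "a\nb".scan(/./)          # ["a", "b"]  (the period skips the line break)
```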

Putting all of this together produces some useful regular expressions:

/\d\d:\d\d:\d\d/ # a time such as 12:34:56
/Perl.*Python/  # Perl, zero or more other chars, then Python
/Perl Python/  # Perl, a space, and Python
/Perl *Python/  # Perl, zero or more spaces, and Python
/Perl +Python/  # Perl, one or more spaces, and Python
/Perl\s+Python/ # Perl, one or more whitespace characters, then Python
/Ruby (Perl|Python)/ # Ruby, a space, and either Perl or Python
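
The spacing variants above are easy to confuse, so here is a small sketch that reports which made-up inputs each one accepts. The constant names and the input strings are assumptions for illustration.

```ruby
SPACING_PATTERNS = [/Perl Python/, /Perl *Python/, /Perl +Python/, /Perl\s+Python/]
SPACING_INPUTS   = ["PerlPython", "Perl Python", "Perl   Python", "Perl\tPython"]

SPACING_PATTERNS.each do |pattern|
  # Keep only the inputs in which this pattern finds a match
  hits = SPACING_INPUTS.select { |text| pattern.match?(text) }
  puts "#{pattern.inspect} matches #{hits.inspect}"
end
```

Only the \s+ version accepts the tab-separated input, and only the * version accepts the input with no space at all.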

Once you have created a pattern, it seems a shame not to use it. The match operator =~ matches a string against a regular expression. If the pattern is found in the string, =~ returns the position of the first match; otherwise it returns nil. This means you can use a regular expression as the condition of an if or while statement. For example, the following code fragment writes a message if a string contains the text Perl or Python.

puts "Scripting language mentioned: #{line}" if line =~ /Perl|Python/

You can also replace every occurrence of Perl or Python with Ruby by using the gsub method:

line.gsub(/Perl|Python/, 'Ruby')
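
Note that gsub returns a new string and leaves the receiver untouched, while sub replaces only the first occurrence. A minimal sketch, where replace_demo is a hypothetical helper written for this example:

```ruby
# Hypothetical helper: returns [first-only replacement, global replacement].
def replace_demo(line)
  first_only = line.sub(/Perl|Python/, 'Ruby')   # first occurrence only
  all        = line.gsub(/Perl|Python/, 'Ruby')  # every occurrence
  [first_only, all]
end

original = "Perl and Python are scripting languages"
first_only, all = replace_demo(original)
puts first_only  # Ruby and Python are scripting languages
puts all         # Ruby and Ruby are scripting languages
puts original    # Perl and Python are scripting languages
```

To change the string in place instead, Ruby provides sub! and gsub!.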

The following example, taken from iHower's Ruby on Rails practice Bible, uses a regular expression to capture the parts of a mobile phone number:

phone = "139-1234-5678"
if phone =~ /(\d{3})-(\d{4})-(\d{4})/
  start_with = $1  # "139"
  mid_num = $2     # "1234"
  end_as = $3      # "5678"
end
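
The same capture can be written with named groups, which avoids counting parentheses. This is an alternative sketch rather than the approach used above; the group names and the split_phone helper are hypothetical.

```ruby
# Named groups: (?<name>...) makes each part available by name.
PHONE = /(?<prefix>\d{3})-(?<middle>\d{4})-(?<last>\d{4})/

# Hypothetical helper: returns a hash of the parts, or nil if there is no match.
def split_phone(number)
  m = PHONE.match(number)
  return nil unless m
  { prefix: m[:prefix], middle: m[:middle], last: m[:last] }
end

parts = split_phone("139-1234-5678")
puts parts[:prefix]  # 139
puts parts[:middle]  # 1234
puts parts[:last]    # 5678
```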

General rules

  • /a/ matches the character a.
  • /\?/ matches the special character ?. Special characters include ^, $, ?, ., /, \, [, ], {, }, (, ), +, and *.
  • . matches any character; for example, /a./ matches ab and ac.
  • /[ab]c/ matches ac and bc. A hyphen inside the brackets specifies a range, for example /[a-z]/ or /[a-zA-Z0-9]/.
  • /[^a-zA-Z0-9]/ matches any character that is not in the listed ranges.
  • /[\d]/ matches any digit.
  • /[\w]/ matches any letter, digit, or _.
  • /[\s]/ matches a whitespace character, including spaces, tabs, and line breaks.
  • /[\D]/, /[\W]/, and /[\S]/ are the negations of the three classes above.
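
A few of these rules can be tried out with scan, which returns every match in a string. The sample strings and the GENERAL_RULES table are invented for this sketch.

```ruby
GENERAL_RULES = [
  ["what?",    /\?/],            # escaped special character
  ["ab ac ad", /a./],            # a followed by any character
  ["ac bc cc", /[ab]c/],         # a or b, then c
  ["a1-b2",    /[^a-zA-Z0-9]/],  # anything outside the listed ranges
  ["a1 b2",    /\D/]             # anything that is not a digit
]

GENERAL_RULES.each do |text, pattern|
  puts "#{pattern.inspect} on #{text.inspect}: #{text.scan(pattern).inspect}"
end
```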

Advanced rules

  • ? matches zero or one of the preceding element. /Mrs?\.?/ matches "Mr", "Mrs", "Mr.", and "Mrs.".
  • * matches zero or more of the preceding element. /Hello*/ matches "Hello" and finds a match in "HelloJack".
  • + matches one or more of the preceding element. /a.+c/ matches "abc", "abbdrec", and so on.
  • /\d{3}/ matches exactly three digits.
  • /\d{1,10}/ matches 1 to 10 digits.
  • /\d{3,}/ matches three or more digits.
  • /([A-Z]\d){5}/ matches an uppercase letter followed by a digit, repeated five times, for example A1B2C3D4E5.
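
The quantifier rules can be checked with a short sketch. The sample values are invented, and the \A and \z anchors are added here only to force the title pattern to cover the whole string.

```ruby
# \A and \z pin the pattern to the whole string so partial matches are excluded.
TITLE  = /\AMrs?\.?\z/
TITLES = %w[Mr Mrs Mr. Mrs. Ms Mrss]

p TITLES.select { |t| TITLE.match?(t) }  # ["Mr", "Mrs", "Mr.", "Mrs."]
p "7 42 12345".scan(/\d{3,}/)            # ["12345"]
p "12345678901234"[/\d{1,10}/]           # "1234567890"
p "A1B2C3D4E5"[/([A-Z]\d){5}/]           # "A1B2C3D4E5"
p "A1234"[/([A-Z]\d){5}/]                # nil
```

The last line shows that a group followed by {5} repeats the whole group, so one letter followed by four digits does not match.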

Regular Expression operation

Both String and Regexp support matching with the =~ operator and the match method:

puts "I can say my name" =~ /name/ #-> 13

a = /name/.match("I can say my name, my name I can say") #-> a is MatchData
puts a[0] #-> name

As this shows, when the pattern matches, =~ returns the position of the match, while match returns a MatchData object. When it does not match, both return nil. A MatchData object gives access to the text matched by each group (sub-pattern), as in the following example:

b1=/[A-Za-z]+,[A-Za-z]+,Mrs?\./.match("Jack,Wang,Mrs., nice person")
puts b1[0] #-> Jack,Wang,Mrs.

b2=/(([A-Za-z]+),([A-Za-z]+)),Mrs?\./.match("Jack,Wang,Mrs., nice person")
puts b2[0] #-> Jack,Wang,Mrs.
puts b2[1] #-> Jack,Wang
puts b2[2] #-> Jack
puts b2[3] #-> Wang

For a MatchData object m, m[0] returns the string matched by the whole expression. The groups are also available through captures, which starts at the first group, so m[n] is equivalent to m.captures[n - 1] for n >= 1.

Ruby also fills in some global variables automatically. They are named with numbers: $1, $2, and so on. $1 contains the substring matched by the first parenthesized group counting from the left, $2 the second, and so on. As the output above shows, groups are numbered by the position of their opening parenthesis, from left to right, so an outer group comes before the groups nested inside it.
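
The relationship between indexing, captures, and the numbered globals can be sketched as follows. The groups_for helper and the constant name are hypothetical and exist only for this example.

```ruby
NAME = /(([A-Za-z]+),([A-Za-z]+)),Mrs?\./

# Hypothetical helper: collects the three views of the same match.
def groups_for(text)
  m = NAME.match(text)
  return nil unless m
  # captures omits the whole match, so captures[0] is group 1
  { whole: m[0], captures: m.captures, globals: [$1, $2, $3] }
end

result = groups_for("Jack,Wang,Mrs., nice person")
puts result[:whole]        # Jack,Wang,Mrs.
p result[:captures]        # ["Jack,Wang", "Jack", "Wang"]
p result[:globals]         # ["Jack,Wang", "Jack", "Wang"]
```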

Greedy quantifiers and non-Greedy quantifiers

The quantifiers * (zero or more) and + (one or more) are greedy: they match as many characters as possible. Appending ? to a quantifier makes it non-greedy, so that it matches as few characters as possible.

Both patterns in the following code mean "one or more characters followed by an exclamation point":

teststr="abcd!efg!"
match=/.+!/.match(teststr)
puts match[0] #-> abcd!efg!

limitmatch=/.+?!/.match(teststr)
puts limitmatch[0] #-> abcd!

Anchors

An anchor does not match any characters itself; it states a condition that must hold at the current position for the match to continue:

  • ^ beginning of a line
  • $ end of a line
  • \A start of the string
  • \z end of the string
  • \Z end of the string (ignoring a final line break)
  • \b word boundary

c=/\b\w+\b/.match("!!Stephen**")
puts c[0] #-> Stephen
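
The line anchors and the string anchors behave differently as soon as a string contains a line break. A small sketch on a made-up two-line string:

```ruby
ANCHOR_TEXT = "first line\nsecond line\n"

p ANCHOR_TEXT.scan(/^\w+/)      # ["first", "second"]  (^ matches at every line start)
p ANCHOR_TEXT.scan(/\A\w+/)     # ["first"]            (\A matches only at the string start)
p ANCHOR_TEXT.match?(/line$/)   # true   ($ matches before a line break)
p ANCHOR_TEXT.match?(/line\z/)  # false  (the string ends with "\n", not "line")
p ANCHOR_TEXT.match?(/line\Z/)  # true   (\Z ignores one trailing line break)
```

For validating a whole string, \A and \z are usually the safer choice, because ^ and $ can be satisfied by any single line inside the input.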

Lookahead assertions

A lookahead assertion checks what comes next in the string without making it part of the match.

Positive lookahead assertions (?=)
Suppose we want to match a sequence of digits that ends with a dot, without including the dot in the match.

teststr="123 456 789. 012"
m=/\d+(?=\.)/.match(teststr)
puts m[0] #-> 789

Negative lookahead assertions (?!)
In the preceding example, if /\d+(?=\.)/ is changed to /\d+(?!\.)/, puts m[0] outputs 123.
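
The two forms can be compared side by side on the same test string. The sketch also uses scan to show a common surprise with negative lookahead: the quantifier may give back a digit so that the assertion succeeds.

```ruby
LOOKAHEAD_TEXT = "123 456 789. 012"

puts LOOKAHEAD_TEXT[/\d+(?=\.)/]  # 789  (digits followed by a dot)
puts LOOKAHEAD_TEXT[/\d+(?!\.)/]  # 123  (digits not followed by a dot)

# With scan, "789." yields "78": \d+ backtracks one digit so that the
# next character is "9" rather than ".".
p LOOKAHEAD_TEXT.scan(/\d+(?!\.)/)  # ["123", "456", "78", "012"]
```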

Modifiers

A modifier is placed after the closing forward slash of the regular expression.

1. i makes the regular expression case-insensitive.
For example, /abc/i matches Abc, abc, and ABC.

2. m lets the dot wildcard match any character, including line breaks. By default, the dot does not match a line break.
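
A minimal sketch of both modifiers, using constant names chosen for this example:

```ruby
CASE_BLIND = /abc/i  # i: ignore case
DOT_ALL    = /a.b/m  # m: the dot also matches a line break

p "ABC".match?(/abc/)      # false
p "ABC".match?(CASE_BLIND) # true
p "a\nb".match?(/a.b/)     # false
p "a\nb".match?(DOT_ALL)   # true
```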

Conversion between strings and regular expressions

Inserting a string into a regular expression

teststr="a.c"
re=/#{Regexp.escape(teststr)}/
puts re.match("a.c")[0] #-> a.c
test=re.match("abc")
p test #-> nil
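
The effect of Regexp.escape is easiest to see next to an unescaped pattern built from the same string. In this sketch, literal_pattern is a hypothetical helper name.

```ruby
# Hypothetical helper: builds a pattern that matches the text literally.
def literal_pattern(text)
  # Regexp.escape puts a backslash before characters that are special in a pattern
  /#{Regexp.escape(text)}/
end

escaped   = literal_pattern("a.c")
unescaped = Regexp.new("a.c")  # here the dot is still a wildcard

p escaped.source           # "a\\.c"
p escaped.match?("abc")    # false
p unescaped.match?("abc")  # true
```

Escaping matters most when the string comes from user input, since unescaped special characters would otherwise change the meaning of the pattern.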

Regular Expression to string

puts /abc/.inspect #-> /abc/
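
Besides inspect, a Regexp offers source and to_s, which give different string forms. A short sketch with an illustrative constant name:

```ruby
INSENSITIVE = /abc/i

puts INSENSITIVE.inspect  # /abc/i        (as written in source code)
puts INSENSITIVE.source   # abc           (the pattern text only)
puts INSENSITIVE.to_s     # (?i-mx:abc)   (pattern with its options embedded)
```

The to_s form is the one Ruby uses when a regular expression is interpolated into another one, which keeps the original modifiers attached.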

Common ways to use regular expressions:

  • In the conditions of if, while, and similar statements.
  • As arguments to gsub, grep, and similar methods.
  • With find_all, scan, and similar methods that collect matches.

For example, p "test 1 2 and test 3 4".scan(/\d/) outputs ["1", "2", "3", "4"]. (With puts instead of p, each element is printed on its own line.)
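
The collection-oriented methods can be sketched on a made-up word list. When the pattern passed to scan contains groups, each result is an array of the captured groups rather than the whole match.

```ruby
WORDS = %w[apple banana cherry avocado]

p WORDS.grep(/\Aa/)                 # ["apple", "avocado"]
p WORDS.find_all { |w| w =~ /an/ }  # ["banana"]

# scan with groups returns one array of captures per match
p "test 1 2 and test 3 4".scan(/(\w+) (\d) (\d)/)
# [["test", "1", "2"], ["test", "3", "4"]]
```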

