Matching any character with regular expressions: how the dot works

Source: Internet
Author: User
Tags: character set, regular expression

How do you write a regular expression that matches any character? Which construct do you need, and what should you watch out for when using it? This article walks through the details.

How regular expressions match any character:

Use the dot, ".", to match almost any character. In regular expressions, "." is one of the most commonly used symbols. Unfortunately, it is also one of the most commonly misused.

The "." matches a single character without caring what that character is. The only exception is the newline character. In the engines covered by this tutorial, the dot does not match a newline by default. So by default, "." is shorthand for the negated character set [^\n\r] (Windows) or [^\n] (Unix).
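The default behaviour is easy to check. The following is a minimal sketch in Python, chosen here only for illustration (the original text does not name a specific language at this point); the helper name `dot_matches` is hypothetical. Python's `re` module follows the Unix-style rule: the dot excludes only "\n", so a carriage return still matches.

```python
import re

def dot_matches(ch: str) -> bool:
    """Return True if a bare "." matches the single character ch."""
    return re.fullmatch(r".", ch) is not None

text = "first line\nsecond line"

# ".*" is greedy, but it still stops at the newline because
# the dot does not match "\n" by default.
greedy = re.match(r".*", text).group()

print(dot_matches("a"))    # an ordinary character
print(dot_matches("\n"))   # the newline exception
print(dot_matches("\r"))   # Python excludes only "\n", not "\r"
print(repr(greedy))
```

Other engines may also exclude "\r" or further line-break characters, so it is worth running a check like this against the engine you actually use.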

This exception exists for historical reasons. The first tools that used regular expressions were line-based: they read a file one line at a time and applied the regular expression to each line separately. In those tools, the string never contained a newline character, so the dot never had to match one.

Modern tools and languages can apply regular expressions to very large strings and even entire files. All of the regular expression implementations discussed in this tutorial provide an option that makes "." match every character, including newlines. In tools such as RegexBuddy, EditPad Pro, or PowerGREP, you simply select the "dot matches newline" option. In Perl, the mode in which "." also matches newlines is called "single-line mode". Unfortunately, this is a very confusing term, because there is also a so-called "multi-line mode". Multi-line mode affects only the anchors (where the start and end of a line match), whereas single-line mode affects only the ".".

Other languages and regular expression libraries have adopted Perl's terminology. When you use the regular expression classes in the .NET Framework, you can activate single-line mode with a statement similar to the following: Regex.Match("string", "regex", RegexOptions.Singleline)
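To make the difference between the two modes concrete, here is a small sketch in Python, used as an illustration rather than as one of the engines named above. In Python's `re` module, `re.DOTALL` plays the role of Perl's single-line mode and `re.MULTILINE` plays the role of multi-line mode; the sample string and variable names are assumptions made for this example.

```python
import re

text = "alpha\nbeta"

# Single-line mode (re.DOTALL): only the dot changes.
default_dot = re.match(r".*", text).group()
dotall_dot = re.match(r".*", text, re.DOTALL).group()

# Multi-line mode (re.MULTILINE): only the anchors change.
default_anchor = re.findall(r"^\w+$", text)
multiline_anchor = re.findall(r"^\w+$", text, re.MULTILINE)

# Multi-line mode on its own leaves the dot untouched.
multiline_dot = re.match(r".*", text, re.MULTILINE).group()

print(repr(default_dot))    # stops before the newline
print(repr(dotall_dot))     # spans the newline
print(default_anchor)       # no whole-string match
print(multiline_anchor)     # one match per line
print(repr(multiline_dot))  # same as the default dot
```

The two flags are independent and can be combined (`re.DOTALL | re.MULTILINE`) when a pattern needs both behaviours.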

Summary of matching arbitrary characters with regular expressions:

Use the dot "." sparingly

The dot is arguably the most powerful metacharacter. It lets you be lazy: with a dot, you can match almost any character. The problem is that it often matches characters that should not be matched.

A simple example illustrates this. Suppose we want to match a date in "mm/dd/yy" format, but we want to let the user choose the separator. One solution that quickly comes to mind is <<\d\d.\d\d.\d\d>>. It appears to work, since it matches the date "02/12/03". The problem is that "02512703" is also accepted as a valid date, because each dot happily matches a digit.

<<\d\d[-/.]\d\d[-/.]\d\d>> is a better solution. Remember that the dot is not a metacharacter inside a character set, so here it matches only a literal dot. This pattern is still far from perfect: it matches "99/99/99". <<[0-1]\d[-/.][0-3]\d[-/.]\d\d>> goes a step further, although it still matches "19/39/99". How precise your regular expression needs to be depends on what you are trying to do. If you are validating user input, it needs to be as strict as possible. If you are only parsing data from a known source that you trust to contain no invalid values, a reasonably specific expression that finds the text you are looking for is enough.
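The three date patterns discussed above can be compared side by side. This is a minimal sketch in Python, assuming the patterns are written without any spaces between their parts; the names `LOOSE`, `SEPARATORS`, `RANGED`, and `accepts` are hypothetical and exist only for this example.

```python
import re

# The dot version: any character is accepted as a separator.
LOOSE = re.compile(r"\d\d.\d\d.\d\d")
# The character-set version: only "-", "/" or "." as a separator.
SEPARATORS = re.compile(r"\d\d[-/.]\d\d[-/.]\d\d")
# The version that also restricts the first digit of month and day.
RANGED = re.compile(r"[0-1]\d[-/.][0-3]\d[-/.]\d\d")

def accepts(pattern: re.Pattern, candidate: str) -> bool:
    """Return True if the whole candidate string matches the pattern."""
    return pattern.fullmatch(candidate) is not None

samples = ["02/12/03", "02512703", "99/99/99", "19/39/99"]
for name, pattern in [("LOOSE", LOOSE),
                      ("SEPARATORS", SEPARATORS),
                      ("RANGED", RANGED)]:
    accepted = [s for s in samples if accepts(pattern, s)]
    print(name, accepted)
```

Each step rejects more invalid input, but even the strictest of the three still lets "19/39/99" through. Full validation is usually easier to do by parsing the captured numbers and checking them in code than by tightening the pattern further.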

That concludes this introduction to matching any character with regular expressions. Hopefully it helps you understand how the dot behaves and when to use it.
