JavaScript Tutorial: Getting Started with Regular Expressions

Source: Internet
Author: User

Article introduction: a 30-minute introductory tutorial on regular expressions.

Objective of this article

In 30 minutes you will understand what a regular expression is and know enough of the basics to use one in your own program or Web page.

How to use this tutorial

Don't be intimidated by the complex expressions below. As long as you follow me step by step, you will find that regular expressions are not as difficult as you think. Of course, if you finish this tutorial and find that you understood a lot but can remember almost none of it, that is perfectly normal: I think the chance that someone new to regular expressions remembers more than 80% of the syntax covered here after one reading is zero. This tutorial only aims to explain the basic principles; you will need more practice and more use to master regular expressions.

In addition to being an introductory tutorial, this article also tries to serve as a regular expression syntax reference that you can use in your daily work. Judging from the author's own experience, that goal has been met rather well: you see, I haven't managed to memorize everything myself, have I?

What exactly is a regular expression?

When writing a program or Web page that handles strings, you often need to find strings that match some complex rule. Regular expressions are the tool used to describe these rules. In other words, a regular expression is code that records a text rule.

You have probably used wildcards for file lookup under Windows/DOS, that is, * and ?. If you want to find all the Word documents in a directory, you search for *.doc. Here, * is interpreted as any string. Like wildcards, regular expressions are a tool for text matching. They describe your needs more precisely than wildcards can, at the price of being more complicated. For example, you can write a regular expression that finds every string that starts with 0, followed by 2 or 3 digits, then a hyphen "-", and finally 7 or 8 digits (like 010-12345678 or 0376-7654321).


The best way to learn regular expressions is to start with examples: understand them, then modify and experiment with them yourself. A number of simple examples are given below, each explained in detail.

Suppose you are looking for hi in an English novel; you can use the regular expression hi.

This is almost the simplest regular expression. It matches exactly this string: two characters, the first being h and the second i. Tools that handle regular expressions typically provide an option to ignore case; if it is selected, the expression matches any of the four variants hi, HI, Hi, and hI.

Unfortunately, many words contain the two consecutive characters hi, such as him, history, high, and so on. If you search with hi, the hi inside these words will also be found. To find the word hi precisely, you should use \bhi\b.
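The ignore-case option can be tried out with a short sketch. It assumes Python's standard `re` module rather than the tool the tutorial has in mind, and the sample text is made up for illustration.

```python
import re

text = "hi HI Hi hI"

# Without any option, only the exact lowercase form matches.
exact = re.findall(r"hi", text)

# re.IGNORECASE is Python's name for the ignore-case option.
any_case = re.findall(r"hi", text, flags=re.IGNORECASE)

print(exact)     # ['hi']
print(any_case)  # ['hi', 'HI', 'Hi', 'hI']
```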

\b is a special code defined by regular expressions (some people call it a metacharacter) that represents the beginning or end of a word, that is, a word boundary. Although English words are usually separated by spaces, punctuation marks, or newlines, \b does not match any of these separator characters; it matches only a position.
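To see the difference between hi and \bhi\b, here is a minimal sketch, again assuming Python's `re` module and an invented sample sentence. It records the position where each match starts.

```python
import re

text = "hi, this is his history; say hi to him"

# Plain "hi" also matches inside "this", "his", "history", and "him".
loose = [m.start() for m in re.finditer(r"hi", text)]

# \bhi\b matches only where "hi" stands alone as a word.
strict = [m.start() for m in re.finditer(r"\bhi\b", text)]

print(loose)   # [0, 5, 12, 16, 29, 35]
print(strict)  # [0, 29]
```

Note that \b consumes no characters: each strict match is still just the two letters h and i.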

If you are looking for hi followed, not far behind, by Lucy, you should use \bhi\b.*\bLucy\b.

Here, . is another metacharacter; it matches any character other than a line break. * is also a metacharacter, but it stands for a quantity rather than a character or a position: it specifies that the item just before the * may be repeated any number of times so that the whole expression can match. So .* together means any number of characters that are not line breaks. Now the meaning of \bhi\b.*\bLucy\b is obvious: first the word hi, then any number of arbitrary characters (but no line break), and finally the word Lucy.

By combining other metacharacters, we can construct more powerful regular expressions. For example:
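The behaviour of .* across a line break can be checked with a small sketch. It assumes Python's `re` module, spells the name as Lucy to match the prose, and uses made-up sample sentences.

```python
import re

# The word hi, then any run of non-newline characters, then the word Lucy.
pattern = re.compile(r"\bhi\b.*\bLucy\b")

same_line = "Oh hi there, my name is Lucy."
split_lines = "Oh hi there,\nmy name is Lucy."

m = pattern.search(same_line)
print(m.group(0))                   # hi there, my name is Lucy

# . does not match the newline, so the two words cannot be joined here.
print(pattern.search(split_lines))  # None
```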

0\d\d-\d\d\d\d\d\d\d\d matches a string that starts with 0, then two digits, then a hyphen "-", and finally 8 digits (that is, a Chinese phone number). Of course, this example only matches numbers with a 3-digit area code.

The \d here is a new metacharacter that matches a single digit (0, or 1, or 2, or ...). The - is not a metacharacter; it just matches itself, the hyphen (or minus sign, or dash, or whatever you call it).

To avoid so many annoying repetitions, we can also write this expression as 0\d{2}-\d{8}. Here the {2} ({8}) after \d means that the preceding \d must be repeated exactly 2 times (8 times) in a row.
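The long and short spellings should accept exactly the same strings. The sketch below checks that on a few samples, assuming Python's `re` module. `fullmatch` is used here only so that the whole sample has to fit the pattern; the variable names are illustrative.

```python
import re

long_form = re.compile(r"0\d\d-\d\d\d\d\d\d\d\d")
short_form = re.compile(r"0\d{2}-\d{8}")

samples = ["010-12345678", "0376-7654321", "010-1234567"]

# For each sample, record whether the long and the short form accept it.
results = [
    (s, bool(long_form.fullmatch(s)), bool(short_form.fullmatch(s)))
    for s in samples
]

for row in results:
    print(row)
# ('010-12345678', True, True)
# ('0376-7654321', False, False)   4-digit area code
# ('010-1234567', False, False)    only 7 digits after the hyphen
```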

Testing Regular Expressions

If you don't think regular expressions are hard to read or write, you're either a genius or not from Earth. The syntax of regular expressions is a headache even for people who use it often. Because they are hard to read and write, mistakes are easy to make, so it is necessary to find a tool for testing regular expressions.

Some details of regular expressions differ between environments. This tutorial describes the behavior of regular expressions in the Microsoft .NET Framework 4.0, so I recommend the .NET tool I wrote, Regex Tester. Please refer to the description on its page to install and run the software.

The following is a screenshot of Regex Tester running:

Now you know a few very useful metacharacters, such as \b, ., *, and \d. Regular expressions have more metacharacters: for example, \s matches any whitespace character, including spaces, tabs, line breaks, Chinese full-width spaces, and so on; \w matches letters, digits, underscores, or Chinese characters.

Here are some more examples:

\ba\w*\b matches words that begin with the letter a: first the start of a word (\b), then the letter a, then any number of letters or digits (\w*), and finally the end of the word (\b).

\d+ matches 1 or more consecutive digits. The + here is a metacharacter similar to *; the difference is that * matches any number of repetitions (possibly 0), while + matches 1 or more repetitions.
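As a rough check of what \s and \w cover, here is a sketch that assumes Python's `re` module. For ordinary (str) patterns it treats Unicode whitespace and Unicode letters much as the tutorial describes for .NET; other engines may be ASCII-only by default. The sample strings are invented.

```python
import re

# \u3000 is the full-width (ideographic) space used in Chinese text.
spaced = "a b\tc\nd\u3000e"
pieces = re.split(r"\s", spaced)
print(pieces)  # ['a', 'b', 'c', 'd', 'e']

# \w covers letters, digits, the underscore, and Chinese characters.
words = re.findall(r"\w+", "user_01 你好, world!")
print(words)   # ['user_01', '你好', 'world']
```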

\b\w{6}\b matches a word that is exactly 6 characters.
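The three expressions above can be run against one sample sentence. This is a minimal sketch assuming Python's `re` module; the sentence and variable names are made up for illustration.

```python
import re

text = "An apple and a banana cost 25 yuan at 9 am, almost a secret."

# Words starting with a lowercase a ("An" is skipped: matching is case-sensitive).
words_starting_with_a = re.findall(r"\ba\w*\b", text)

# Runs of one or more digits.
digit_runs = re.findall(r"\d+", text)

# Words of exactly six word characters.
six_char_words = re.findall(r"\b\w{6}\b", text)

print(words_starting_with_a)  # ['apple', 'and', 'a', 'at', 'am', 'almost', 'a']
print(digit_runs)             # ['25', '9']
print(six_char_words)         # ['banana', 'almost', 'secret']
```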

Table 1. Commonly used metacharacters
Code    Description
.       Matches any character except a line break
\w      Matches a letter, digit, underscore, or Chinese character
\s      Matches any whitespace character
\d      Matches a digit
\b      Matches the start or end of a word
^       Matches the start of the string
$       Matches the end of the string

The metacharacters ^ (the symbol on the same key as the number 6) and $ both match a position, somewhat like \b. ^ matches the beginning of the string being searched, and $ matches the end. These two codes are very useful for validating input. For example, if a website requires the QQ number you enter to be 5 to 12 digits long, it can use ^\d{5,12}$.

The {5,12} here is similar to the {2} described earlier, except that {2} means the preceding item must repeat exactly 2 times, no more and no fewer, while {5,12} means it must repeat at least 5 times and at most 12 times; otherwise there is no match.

Because ^ and $ are used, the entire input string must match \d{5,12}, which means the whole input has to be 5 to 12 digits. So if the QQ number entered matches this regular expression, it meets the requirement.

Similar to the ignore-case option, some regular expression tools also have an option for handling multiple lines. If this option is selected, ^ and $ match the beginning and end of each line instead.
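The QQ-number check might look like the sketch below in Python's `re` module (an assumption, since the tutorial targets .NET). The function name `is_valid_qq` and the unanchored comparison pattern are hypothetical and only there for illustration.

```python
import re

QQ_PATTERN = re.compile(r"^\d{5,12}$")
UNANCHORED = re.compile(r"\d{5,12}")  # same rule without ^ and $, for contrast


def is_valid_qq(value: str) -> bool:
    """Return True if the whole input is 5 to 12 digits."""
    return QQ_PATTERN.search(value) is not None


print(is_valid_qq("12345"))          # True
print(is_valid_qq("1234"))           # False, too short
print(is_valid_qq("1234567890123"))  # False, 13 digits
print(is_valid_qq("abc1234567xyz"))  # False, extra characters around the digits

# Without the anchors, digits found anywhere in the input are enough.
print(UNANCHORED.search("abc1234567xyz") is not None)  # True
```

One Python-specific caveat: `$` also matches just before a single trailing newline, so a stricter validator would use `fullmatch` or `\Z` instead.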
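The effect of the multi-line option can be sketched as follows, assuming Python's `re` module, where the option is called `re.MULTILINE`. The three-line sample text is made up.

```python
import re

text = "12345\nabc\n678901"

# Default: ^ and $ refer to the whole string, which is not all digits.
single = re.findall(r"^\d+$", text)

# Multi-line option: ^ and $ match at the start and end of every line.
multi = re.findall(r"^\d+$", text, flags=re.MULTILINE)

print(single)  # []
print(multi)   # ['12345', '678901']
```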
