HTML5 Knowledge Points: A Brief Analysis of Regular Expressions

Source: Internet
Author: User

Many people think of regular expressions only as a tool for form validation, which is not accurate. Regular expressions are used throughout software: in operating systems such as Linux and Unix, in development environments such as VBScript, Java, and PHP, and in many applications.

One The history of regular expressions

First, let us clear up a misunderstanding: some people think regular expressions were invented by JavaScript, which is of course not correct. In 1956, the mathematician Stephen Kleene, building on the earlier work of McCulloch and Pitts, published a paper titled "Representation of Events in Nerve Nets and Finite Automata", which introduced the concept of regular expressions. He used these expressions to describe what he called "the algebra of regular sets", hence the term "regular expression".

Later, this work was applied in Ken Thompson's early research on computational search algorithms. Thompson is one of the principal creators of Unix, and the first practical application of regular expressions was his implementation of the QED editor, a predecessor of the Unix editor ed. Since then, regular expressions have been an important part of text editors and search tools.

The history of regular expressions is not especially long, but they were quickly adopted by the major programming languages. This is mainly due to the following characteristics:

First, compared with hand-written validation code, a regular expression can complete the required checks more efficiently.

Second, regular expressions are also good at capturing parts of a string, such as extracting the domain name or other components from a URL.

Third, they are flexible and concise. A single regular expression can handle tasks ranging from complex form validation to many kinds of string processing.
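As a small illustration of the capturing ability mentioned above, here is a minimal Python sketch that pulls the host name out of a URL. The pattern and the function name extract_host are hypothetical, chosen only for this example; real URLs have more cases (user info, IPv6 literals) that this sketch does not handle.

```python
import re
from typing import Optional

# Illustrative pattern (an assumption, not from the article):
# "http://" or "https://", then capture everything up to the
# next "/", ":", "?" or "#".
URL_HOST = re.compile(r"^https?://([^/:?#]+)")

def extract_host(url: str) -> Optional[str]:
    """Return the host part of an http(s) URL, or None if it does not match."""
    match = URL_HOST.match(url)
    return match.group(1) if match else None

print(extract_host("https://www.example.com/path/index.html?x=1"))
```

The parentheses form a capture group, so match.group(1) returns only the host and not the scheme in front of it.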

Two Definition of regular expressions

Regular expressions describe a string-matching pattern. They can be used to check whether a string contains a given substring, to replace matched substrings, or to extract substrings that meet a certain condition.

Regular expressions are text patterns made up of ordinary characters (such as the letters a through z) and special characters (also called metacharacters). A regular expression acts as a template that is matched against the string being searched.
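The three uses named in the definition (checking, replacing, extracting) can be sketched with Python's standard re module. The sample text and variable names are made up for illustration.

```python
import re

text = "Order 12 shipped, order 345 pending."

# 1. Check whether the text contains a match (here: any run of digits).
has_number = re.search(r"[0-9]+", text) is not None

# 2. Replace every matched substring with "#".
masked = re.sub(r"[0-9]+", "#", text)

# 3. Extract all substrings that match the pattern.
numbers = re.findall(r"[0-9]+", text)

print(has_number, masked, numbers)
```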

2.1 Ordinary characters

Ordinary characters are all printable and non-printable characters that are not explicitly designated as metacharacters. This includes all uppercase and lowercase letters, all digits, all punctuation marks, and some symbols.

2.2 Special characters

A special character is a character with a special meaning, such as the * in "*.exe", which loosely stands for any string. To find a file with a literal * in its name, you need to escape the * by preceding it with a \, as in ls \*.exe.

The special characters in regular expressions are: $, (), *, +, ., [, ?, {, \, ^, |

$ matches the end position of the input string.

() mark the start and end positions of a subexpression.

* matches the preceding subexpression 0 or more times.

+ matches the preceding subexpression one or more times.

. matches any single character except the newline character \n.

[ marks the beginning of a bracket expression.

? matches the preceding subexpression zero or one time.

{ marks the beginning of a qualifier expression.

\ marks the next character as a special character, a literal character, a backreference, or an octal escape.

^ matches the start of the input string, unless it is used inside a bracket expression, where it negates the character set.

| indicates a choice between two alternatives.
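The behavior of several of these special characters can be checked with a short Python sketch. The sample strings and variable names are invented for illustration, and other regex engines may differ in details.

```python
import re

# "." matches any single character except a newline.
dot = re.findall(r"c.t", "cat cot c\nt")

# "^" and "$" anchor the match to the start and end of the string.
starts = re.search(r"^Hello", "Hello world") is not None
ends = re.search(r"world$", "Hello world") is not None

# "|" chooses between two alternatives.
either = re.findall(r"cat|dog", "cat, dog, bird")

# Inside brackets, a leading "^" negates the character set.
not_digits = re.findall(r"[^0-9]+", "a1bc22d")

# "()" groups a subexpression so that "+" applies to the whole group.
repeated = re.search(r"(ab)+", "xababy").group(0)

# "\" escapes a special character so that it is matched literally.
literal_star = re.findall(r"\*\.exe", "*.exe a.exe")

print(dot, starts, ends, either, not_digits, repeated, literal_star)
```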

2.3 Qualifiers

Qualifiers specify how many times a given component of a regular expression must appear for the match to succeed. There are six in total: *, +, ?, {n}, {n,}, and {n,m}.

* matches the preceding subexpression 0 or more times.

+ matches the preceding subexpression one or more times.

? matches the preceding subexpression zero or one time.

{n} matches exactly n times, where n is a non-negative integer.

{n,} matches at least n times, where n is a non-negative integer.

{n,m} matches at least n times and at most m times, where n and m are non-negative integers and n <= m.
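The six qualifiers can be compared side by side in a minimal Python sketch. The helper name matches is hypothetical; it uses re.fullmatch so that the whole string has to fit the pattern.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

results = {
    "ab* on 'a'": matches(r"ab*", "a"),                      # zero b's is allowed
    "ab+ on 'a'": matches(r"ab+", "a"),                      # at least one b required
    "colou?r on 'color'": matches(r"colou?r", "color"),      # the u is optional
    "[0-9]{4} on '202'": matches(r"[0-9]{4}", "202"),        # exactly four digits
    "[0-9]{2,} on '12345'": matches(r"[0-9]{2,}", "12345"),  # two or more digits
    "[0-9]{2,4} on '12345'": matches(r"[0-9]{2,4}", "12345"),  # two to four digits
}

for case, outcome in results.items():
    print(case, "->", outcome)
```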

2.4 Locators

Locators describe the boundaries of a string or a word. ^ and $ refer to the beginning and end of the string, \b matches a word boundary (the position just before or just after a word), and \B matches a non-word boundary. A qualifier cannot be applied to a locator.
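To make the difference between \b and \B concrete, the following Python sketch records where the pattern "cat" is found under different locators. The helper name start_positions and the sample text are assumptions made for this example.

```python
import re
from typing import List

def start_positions(pattern: str, text: str) -> List[int]:
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

text = "cat concat catalog"

whole_word = start_positions(r"\bcat\b", text)  # "cat" as a complete word
word_start = start_positions(r"\bcat", text)    # "cat" at the start of a word
inside = start_positions(r"\Bcat", text)        # "cat" not at the start of a word
begins = re.search(r"^cat", text) is not None   # text starts with "cat"
finishes = re.search(r"log$", text) is not None # text ends with "log"

print(whole_word, word_start, inside, begins, finishes)
```

Locators match positions rather than characters, which is why they consume no text and why repeating them with a qualifier is not meaningful.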

Three The application of regular expressions on the Web

Regular expressions are widely used in web systems: they can check data formats, replace text, extract content of interest, and so on.

Example: Verifying the validity of an e-mail address in a string

The format of an e-mail address is username@domain. For the user name, besides digits and letters, some systems allow "-", some allow ".", some allow both, and some allow other special characters, so this can only be decided case by case. Here we assume that, besides letters and digits, "." and "-" are allowed, that neither may appear at the beginning or end of the user name, and that "." and "-" cannot be adjacent. In a domain name, besides digits and letters, only "-" is allowed, and it must not appear at the beginning or end of a segment. The segments are joined with ".", and the last segment is longer than one character and contains only letters. Based on these rules, we can write an expression that determines whether a string is a valid e-mail address.

Do the rules sound troublesome and headache-inducing? Don't worry; the expression is explained piece by piece below:

^: matches the start of the string.

([a-z0-9A-Z]+[-.]?)+: one or more digits or letters, optionally followed by "-" or "."; this combination repeats one or more times.

[a-z0-9A-Z]: the user name ends with a digit or letter.

@: matches "@".

[a-z0-9A-Z]+: matches one or more digits or letters.

(-[a-z0-9A-Z]+)?: matches a "-" followed by one or more digits or letters, zero or one time.

\.: matches ".".

+: matches the content of the enclosing parentheses one or more times.

[a-zA-Z]{2,}: matches two or more letters.

$: matches the end of the string.

Combining the pieces above gives a pattern that matches a reasonably wide range of e-mail addresses. The regular expression is as follows:

^([a-z0-9A-Z]+[-.]?)+[a-z0-9A-Z]@([a-z0-9A-Z]+(-[a-z0-9A-Z]+)?\.)+[a-zA-Z]{2,}$
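The expression can be tried out with a minimal Python sketch. It assumes that the character classes are meant to be [a-z0-9A-Z] and [a-zA-Z], and that the separator class is meant to accept only "-" and "." (written here as [-.]). The function name is_valid_email and the sample addresses are hypothetical.

```python
import re

# Reconstruction of the article's pattern under the assumptions stated above.
EMAIL = re.compile(
    r"^([a-z0-9A-Z]+[-.]?)+"              # letter/digit runs, each optionally followed by "-" or "."
    r"[a-z0-9A-Z]"                        # user name must end with a letter or digit
    r"@"
    r"([a-z0-9A-Z]+(-[a-z0-9A-Z]+)?\.)+"  # domain segments, each followed by "."
    r"[a-zA-Z]{2,}$"                      # last segment: two or more letters
)

def is_valid_email(address: str) -> bool:
    """Check an address against the simplified rules described in the text."""
    # fullmatch is used because "$" in Python also matches before a trailing newline.
    return EMAIL.fullmatch(address) is not None

for sample in ["alice.smith@example.com", "alice..smith@example.com",
               "alice-@example.com", "bob@my-site.example.org"]:
    print(sample, "->", is_valid_email(sample))
```

Two caveats follow from the pattern itself: it requires at least two characters in the user name, and its nested repetition can backtrack heavily on long strings that do not match, so it is best applied to short, length-limited input.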

Complex? Viewed as a whole it is indeed quite complicated, but if you split it up by function, implement each part first, and then combine the parts, it is not so complicated after all.

Full-stack engineers need regular expressions for form validation and string processing, and architects need to understand them too, because frameworks rely on them. So regular expressions are certainly one of the languages we have to learn. What distinguishes the HTML5 regular expression project development course? Its main feature is an emphasis on practicality and efficiency. Learning regular expressions after mastering the front-end basics lets you think about and study them from a higher level.

How is the regular expression project development course taught? The main points are as follows:

First, step by step. The course starts from the basics, familiarizing students with common string operations and with how strings are validated. It then takes a real online project, analyzes how its validation works, and explains how regular expressions are used.

Second, project-driven. The project drives the learning of each topic: once the actual requirements are understood, regular expressions are used to meet them.

Third, a focus on hands-on practice. After the instructor's lesson, students build a new project themselves.

The course content has three parts: regular expression basics, advanced usage, and a range of common validations. The details are as follows:

Part I: The basics of regular expressions. This part mainly covers three topics: regular expression syntax, common symbols, and simple form validation.

Part II: Advanced usage and complex validation with regular expressions.

Part III: Work on a specific project and experience the benefits of regular expressions in practice.

Four Conclusion

Regular expression syntax is concise and powerful, especially for data validation. In everyday data processing and software development, regular expressions have become an indispensable tool. I believe that as the Web develops, regular expressions will become ever more powerful and easier to use.
