Analysis of Functional Vulnerabilities Caused by Regular Expression (Regex) Misuse

Source: Internet
Author: User
Tags: character set

Foreword

Regular expressions are popular in almost every programming language because of their powerful string matching. A regular expression describes, or matches, a set of strings that conform to a given syntactic rule. Many people first hear about regular expressions in passing and then search the web whenever they need one. Very few study them systematically, from definitions and principles through to usage. Beginners find them tedious: there are many metacharacters, and a long run of symbols is a headache to read, so they never get around to learning them properly. When a problem comes up, they go straight to a search engine: "email regular expression", "mobile phone number regular expression", "URL regular expression", and so on. An interesting thing then shows up: the email patterns all differ, and so do the URL patterns, yet every source recommends its own and claims it is correct. So which one is right?

Two conclusions follow from this variety. First, regular expressions are flexible: many different patterns can achieve the same result (all roads lead to Rome). Second, the results of a regular expression need to be verified, because complex patterns are prone to matching the wrong things. This article is not about flexibility. Instead, it looks at common misuses of regular expressions that create functional vulnerabilities, in the hope that we will all be more careful when using them. The examples below are ones I often see while auditing code at work. Additions from readers are welcome!

Missing "^$" anchors bug

<?php
// Check the user name: it may contain only letters and digits

$user = "Chengmo8";

// Buggy check: the pattern has no ^ and $ anchors
if (!preg_match("/[0-9a-zA-Z]+/", $user))
{
	exit("User name is wrong!");
}

This mistake is very common. Because the pattern has no anchors, the regex engine scans $user from left to right, and as soon as it finds any substring that satisfies the pattern it reports a match; preg_match returns 1 and the program carries on. Testing with the user names Chengmo8, chengmo8??!, #$CHENGM and China CADADF, every one of them matches, even though the check appears to restrict user names to letters and digits. Without anchors, the rule has become "the string contains at least one letter or digit", when what we need is "the string consists entirely of letters and digits". The regular expression should be ^[0-9a-zA-Z]+$. It looks simple, but do not forget ^ and $ when the whole string has to match: ^ matches the start of the input and $ matches the end of the input. (By default, $ in PHP also matches just before a trailing newline; add the D modifier if that matters.)

The same applies wherever input is validated: mobile phone numbers, email addresses, URLs, user names, passwords and so on. They all need the anchors!
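
The difference between the unanchored and the anchored check can be sketched as follows. This is a minimal illustration, not code from the original audit: the helper names is_valid_username and contains_alnum are hypothetical, and the last test input (a valid name followed by a newline) is added here to show why the D modifier is used.

```php
<?php
// Hypothetical helper: true only when the WHOLE string is ASCII letters/digits.
function is_valid_username(string $user): bool
{
    // ^ and $ anchor the match to the whole string; the D modifier stops
    // $ from also matching just before a trailing newline.
    return preg_match('/^[0-9a-zA-Z]+$/D', $user) === 1;
}

// Hypothetical helper wrapping the unanchored pattern, for comparison.
function contains_alnum(string $user): bool
{
    return preg_match('/[0-9a-zA-Z]+/', $user) === 1;
}

$names = ['Chengmo8', 'chengmo8??!', '#$CHENGM', "Chengmo8\n"];

foreach ($names as $name) {
    printf(
        "%-14s unanchored=%s anchored=%s\n",
        json_encode($name),
        var_export(contains_alnum($name), true),
        var_export(is_valid_username($name), true)
    );
}
```

With the unanchored pattern every input is accepted; with the anchored pattern only Chengmo8 passes.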

Bugs when using the bracket character class "[]"

Inside a bracket expression, the usual regex metacharacters (., *, ? and so on) become ordinary characters. Apart from the closing "]" itself, only three characters keep a special meaning there: "^", "-" and "\". The "^" character is special only when it is the first character after the opening bracket, where it negates the class: the class then matches any character that is not listed after it.

For example, [^0] matches any character that is not 0. But [0^] matches either 0 or ^: because ^ is no longer the first character after the opening bracket, it is an ordinary character. The "-" character denotes a range; for example, [0-9] matches any character from 0 to 9. "\" is the escape character: to match a literal "-" you can write [0\-9], and to match a literal "\" you can write [0\\9], which matches the three characters 0, 9 and \. In practice, many people get these special characters wrong when they use brackets.
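
The rules above can be checked with a small table of patterns. This is only a sketch: the subjects are chosen here for illustration, and each pattern is wrapped in ^...$ so that exactly one character is tested.

```php
<?php
// Patterns are single-quoted so PHP passes the backslashes to the regex engine.
// Each row: pattern, subject, whether a match is expected.
$cases = [
    ['/^[^0]$/',     '1',  true],   // [^0]: any single character except "0"
    ['/^[^0]$/',     '0',  false],
    ['/^[0^]$/',     '^',  true],   // [0^]: "^" is literal when it is not first
    ['/^[0-9]$/',    '5',  true],   // [0-9]: the range "0" through "9"
    ['/^[0\-9]$/',   '5',  false],  // [0\-9]: only "0", "-" or "9"
    ['/^[0\-9]$/',   '-',  true],
    ['/^[0\\\\9]$/', '\\', true],   // regex [0\\9]: "0", "\" or "9"
    ['/^[.*?]$/',    '*',  true],   // ".", "*" and "?" are literal inside []
    ['/^[.*?]$/',    'a',  false],  // so "." no longer means "any character"
];

foreach ($cases as [$pattern, $subject, $expected]) {
    $matched = preg_match($pattern, $subject) === 1;
    printf("%-14s %-3s %s\n", $pattern, $subject, $matched === $expected ? 'ok' : 'UNEXPECTED');
}
```

Every row should print ok. A row marked UNEXPECTED would mean the class behaves differently from the description.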

<?php
// Examples of bracket character classes

$code = "";

// The class contains the characters . * ?
preg_match("/[.*?]+/", $code);

// The class contains the 26 characters a to z
preg_match("/[a-z]+/", $code);

// The class runs from A to z in the ASCII table: 58 characters in total,
// not only letters
preg_match("/[A-z]+/", $code);

// The same A to z range, written with hexadecimal ASCII codes
preg_match("/[\x41-\x7a]+/", $code);

// Intended to match only the string "and"
preg_match("/[and]/", $code);
// Actually matches any string containing a, n or d, in any order
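
To see what an A to z range really covers, the accepted ASCII characters can be counted. This is a small sketch, and the helper name chars_matched is hypothetical.

```php
<?php
// Hypothetical helper: collect every ASCII character a one-character
// pattern accepts.
function chars_matched(string $pattern): string
{
    $out = '';
    for ($i = 0; $i < 128; $i++) {
        if (preg_match($pattern, chr($i)) === 1) {
            $out .= chr($i);
        }
    }
    return $out;
}

$wide   = chars_matched('/^[A-z]$/');    // range from "A" (0x41) to "z" (0x7a)
$strict = chars_matched('/^[A-Za-z]$/'); // letters only

echo strlen($wide), " vs ", strlen($strict), "\n"; // prints "58 vs 52"

// Characters accepted by [A-z] but not by [A-Za-z]: [ \ ] ^ _ `
echo implode(' ', array_diff(str_split($wide), str_split($strict))), "\n";
```

The six extra characters sit between Z and a in the ASCII table, which is why [A-Za-z] is the safer way to say "letters only".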

This is a frequent misunderstanding: the author only wants to match "and", but once the letters are placed inside "[]" they form a character set. Any single character in the set can match, and order is irrelevant. If you really need to match whole words, use alternation: "and|bnd" matches the string and or the string bnd. The "|" character is the OR operator for strings; the contiguous characters on each side of it are each matched as a whole.
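
A short comparison of the two forms, as a sketch under assumed inputs: the helper names matches_class and matches_word, and the test strings, are hypothetical and chosen only for illustration.

```php
<?php
// Character class: accepts any non-empty mix of the characters a, n and d.
function matches_class(string $code): bool
{
    return preg_match('/^[and]+$/', $code) === 1;
}

// Alternation: accepts exactly the word "and" or the word "bnd".
// The (?: ) group keeps ^ and $ applied to both alternatives.
function matches_word(string $code): bool
{
    return preg_match('/^(?:and|bnd)$/', $code) === 1;
}

foreach (['and', 'bnd', 'dna', 'n', 'band'] as $code) {
    printf(
        "%-5s class=%s alternation=%s\n",
        $code,
        var_export(matches_class($code), true),
        var_export(matches_word($code), true)
    );
}
```

Note that "|" has the lowest precedence in a pattern: without the group, /^and|bnd$/ would mean "starts with and" or "ends with bnd", which is again looser than intended.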

That is all for today: two common mistakes made with regular expressions. Comments and discussion are welcome!
