[Repost] 8 Regular Expressions You Should Know

Source: Internet
Author: User
Tags: closing tag, number sign, TLD

Regular expressions are a language of their own. When you learn a new programming language, they're this little sub-language that makes no sense at first glance. Many times you have to read another tutorial, article, or book just to understand the "simple" pattern described. Today, we'll review eight regular expressions that you should know for your next coding project.

Background Info on Regular Expressions

This is what Wikipedia has to say about them:

In computing, regular expressions provide a concise and flexible means for identifying strings of text of interest, such as particular characters, words, or patterns of characters. Regular expressions (abbreviated as regex or regexp, with plural forms regexes, regexps, or regexen) are written in a formal language that can be interpreted by a regular expression processor, a program that either serves as a parser generator or examines text and identifies parts that match the provided specification.

Now, that doesn't really tell me much about the actual patterns. The regexes I'll be going over today contain characters such as \w, \s, \1, and many others that represent something totally different from what they look like.

If you'd like to learn a little about regular expressions before you continue reading this article, I'd suggest watching the Regular Expressions for Dummies screencast series.

The eight regular expressions we'll be going over today will allow you to match a(n): username, password, email, hex value (like #fff or #000), slug, URL, IP address, and an HTML tag. As the list goes down, the regular expressions get more and more confusing. The pictures for each regex in the beginning are easy to follow, but the last four are more easily understood by reading the explanation.

The key thing to remember about regular expressions is that they are almost read forwards and backwards at the same time. This sentence will make more sense when we talk about matching HTML tags.

Note: The delimiters used in the regular expressions are forward slashes, "/". Each pattern begins and ends with a delimiter. If a forward slash appears in a regex, we must escape it with a backslash: "\/".

1. Matching a Username

Pattern:

/^[a-z0-9_-]{3,16}$/

Description:

We begin by telling the parser to find the beginning of the string (^), followed by any lowercase letter (a-z), number (0-9), an underscore, or a hyphen. Next, {3,16} makes sure that there are at least 3 of those characters, but no more than 16. Finally, we want the end of the string ($).

String that matches:

my-us3r_n4m3

String that doesn't match:

th1s1s-wayt00_l0ngt0beausername (too long)
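As a rough illustration of the username pattern in use, here is a minimal sketch in Python. Python's re module does not use "/" delimiters, so they are dropped; the function name is_valid_username is hypothetical and only for illustration.

```python
import re

# The article's pattern without the "/" delimiters (Python does not use them).
USERNAME_RE = re.compile(r"^[a-z0-9_-]{3,16}$")


def is_valid_username(name: str) -> bool:
    """Return True if name is 3 to 16 lowercase letters, digits, "_" or "-"."""
    return USERNAME_RE.match(name) is not None


print(is_valid_username("my-us3r_n4m3"))                     # True
print(is_valid_username("th1s1s-wayt00_l0ngt0beausername"))  # False (too long)
print(is_valid_username("My-us3r_n4m3"))                     # False (uppercase letter)
```

Note that the character class only lists lowercase letters, so an uppercase letter fails unless a case-insensitive flag is added.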

2. Matching a Password

Pattern:

/^[a-z0-9_-]{6,18}$/

Description:

Matching a password is very similar to matching a username. The only difference is that instead of 3 to 16 letters, numbers, underscores, or hyphens, we want 6 to 18 of them ({6,18}).

String that matches:

myp4ssw0rd

String that doesn't match:

mypa$$w0rd (contains a dollar sign)
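Since the username and password patterns differ only in their length bounds, one way to express that is a small factory that fills in the {min,max} quantifier. This is a sketch in Python; make_checker and the two checker names are hypothetical helpers, not part of the original article.

```python
import re


def make_checker(min_len: int, max_len: int):
    """Build a checker for lowercase letters, digits, "_" and "-" with length bounds."""
    pattern = re.compile(r"^[a-z0-9_-]{%d,%d}$" % (min_len, max_len))
    return lambda text: pattern.match(text) is not None


is_valid_username = make_checker(3, 16)  # same as ^[a-z0-9_-]{3,16}$
is_valid_password = make_checker(6, 18)  # same as ^[a-z0-9_-]{6,18}$

print(is_valid_password("myp4ssw0rd"))  # True
print(is_valid_password("mypa$$w0rd"))  # False (dollar signs are not in the class)
print(is_valid_password("abc"))         # False (shorter than 6)
```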

3. Matching a Hex Value

Pattern:

/^#?([a-f0-9]{6}|[a-f0-9]{3})$/

Description:

We begin by telling the parser to find the beginning of the string (^). Next, a number sign is optional because it is followed by a question mark. The question mark tells the parser that the preceding character, in this case a number sign, is optional, but to be "greedy" and capture it if it's there. Next, inside the first group of parentheses, we can have two different situations. The first is any lowercase letter between a and f or a number, six times. The vertical bar tells us that we can also have three lowercase letters between a and f or numbers instead. Finally, we want the end of the string ($).

The reason that I put the six character option first is so that the parser will capture a full hex value like #ffffff. If I had reversed it so that the three characters came first, the parser could pick up only #fff and not the other three f's. Strictly speaking, that only happens when the pattern is not anchored at the end; with ^ and $ in place, the parser backtracks and either order matches the full value.

String that matches:

#a3c113

String that doesn't match:

#4d82h4 (contains the letter h)
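The sketch below, in Python, applies the hex pattern and also illustrates the point about alternation order. The names parse_hex, six_first, and three_first are hypothetical; the last two are unanchored variants written only to show how the order of the alternatives affects what gets captured when there is no end anchor.

```python
import re

# The article's pattern without the "/" delimiters.
HEX_RE = re.compile(r"^#?([a-f0-9]{6}|[a-f0-9]{3})$")


def parse_hex(value: str):
    """Return the hex digits without the optional "#", or None if invalid."""
    m = HEX_RE.match(value)
    return m.group(1) if m else None


print(parse_hex("#a3c113"))  # a3c113
print(parse_hex("fff"))      # fff
print(parse_hex("#4d82h4"))  # None

# Without the "$" anchor, the first alternative that fits wins.
six_first = re.compile(r"#?([a-f0-9]{6}|[a-f0-9]{3})")
three_first = re.compile(r"#?([a-f0-9]{3}|[a-f0-9]{6})")
print(six_first.match("#ffffff").group(1))    # ffffff
print(three_first.match("#ffffff").group(1))  # fff
```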

4. Matching a Slug

Pattern:

/^[a-z0-9-]+$/

Description:

You'll be using this regex if you ever have to work with mod_rewrite and pretty URLs. We begin by telling the parser to find the beginning of the string (^), followed by one or more (the plus sign) letters, numbers, or hyphens. Finally, we want the end of the string ($).

String that matches:

my-title-here

String that doesn't match:

my_title_here (contains underscores)
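A short Python sketch of the slug check, filtering a list of candidate strings. The function name is_slug and the candidate list are hypothetical examples.

```python
import re

# The article's pattern without the "/" delimiters.
SLUG_RE = re.compile(r"^[a-z0-9-]+$")


def is_slug(text: str) -> bool:
    """Return True if text is one or more lowercase letters, digits, or hyphens."""
    return SLUG_RE.match(text) is not None


candidates = ["my-title-here", "my_title_here", "My-Title", ""]
print([c for c in candidates if is_slug(c)])  # ['my-title-here']
```

The empty string is rejected because the plus sign requires at least one character.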

5. Matching an Email

Pattern:

/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/

Description:

We begin by telling the parser to find the beginning of the string (^). Inside the first group, we match one or more lowercase letters, numbers, underscores, dots, or hyphens. I have escaped the dot because a non-escaped dot means any character. Directly after, there must be an at sign. Next is the domain name, which must be one or more lowercase letters, numbers, dots, or hyphens. Then another (escaped) dot, with the extension being two to six letters or dots. I have 2 to 6 because of the country specific TLDs (.ny.us or .co.uk). Finally, we want the end of the string ($).

String that matches:

[Email protected] doe.com

String that doesn't match:

[Email protected] (TLD is too long)
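To see what the three capture groups pick up, here is a minimal Python sketch. The function name split_email and the addresses are hypothetical examples chosen for illustration, not the ones from the original article.

```python
import re

# The article's pattern without the "/" delimiters.
EMAIL_RE = re.compile(r"^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$")


def split_email(address: str):
    """Return (local part, domain, extension) or None if the pattern fails."""
    m = EMAIL_RE.match(address)
    return m.groups() if m else None


print(split_email("user@example.com"))        # ('user', 'example', 'com')
print(split_email("jane.doe@example.co.uk"))  # ('jane.doe', 'example.co', 'uk')
print(split_email("user@example.something"))  # None (extension longer than 6)
```

In the second example the greedy domain group keeps "example.co" and leaves only "uk" for the extension, so how a multi-part TLD is split between the groups depends on backtracking rather than on the TLD itself.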

6. Matching a URL

Pattern:

/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/

Description:

This regex is almost like taking the ending part of the above regex and slapping it between "http://" and some file structure at the end. It sounds a lot simpler than it really is. To start off, we search for the beginning of the line with the caret.

The first capturing group is entirely optional. It allows the URL to begin with "http://", "https://", or neither of them. I have a question mark after the s to allow URLs that have either http or https. In order to make this entire group optional, I just added a question mark to the end of it.

Next is the domain name: one or more numbers, letters, dots, or hyphens, followed by another dot, then two to six letters or dots. The following section is the optional files and directories. Inside the group, we want to match any number of forward slashes, letters, numbers, underscores, spaces, dots, or hyphens. Then we say that this group can be matched as many times as we want. Pretty much, this allows multiple directories to be matched along with a file at the end. I have used the star instead of the question mark because the star says zero or more, not zero or one. If a question mark were used there, only one file/directory would be able to be matched.

Then a trailing slash is matched, but it is optional. Finally, we end with the end of the line.

String that matches:

http://net.tutsplus.com/about

String that doesn't match:

http://google.com/some/file!.html (contains an exclamation point)
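The following Python sketch runs the URL pattern and returns the first three capture groups (scheme, domain, extension). The function name parse_url is hypothetical; the escaped slashes are kept as in the article, although Python does not require them.

```python
import re

# The article's pattern without the "/" delimiters.
URL_RE = re.compile(r"^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$")


def parse_url(url: str):
    """Return (scheme, domain, extension) or None if the pattern fails.

    The scheme is None when the URL does not start with http:// or https://.
    """
    m = URL_RE.match(url)
    return (m.group(1), m.group(2), m.group(3)) if m else None


print(parse_url("http://net.tutsplus.com/about"))      # ('http://', 'net.tutsplus', 'com')
print(parse_url("example.com"))                        # (None, 'example', 'com')
print(parse_url("http://google.com/some/file!.html"))  # None
```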

7. Matching an IP Address

Pattern:

/^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/

Description:

Now, I'm not going to lie, I didn't write this regex; I got it from here. Now, that doesn't mean that I can't rip it apart character for character.

The first capture group really isn't a captured group because ?: was placed inside, which tells the parser to not capture this group (more on this in the last regex). We also want this non-captured group to be repeated three times: the {3} at the end of the group. This group contains another group, a subgroup, and a literal dot. The parser looks for a match with the subgroup, then a dot, to move on.

The subgroup is also another non-capture group. It's just a bunch of character sets (things inside brackets): the string "25" followed by a number between 0 and 5; or the string "2" and a number between 0 and 4 and any number; or an optional zero or one followed by two numbers, with the second being optional.

After we match three of those, it's on to the next non-capturing group. This one wants: the string "25" followed by a number between 0 and 5; or the string "2" with a number between 0 and 4 and another number at the end; or an optional zero or one followed by two numbers, with the second being optional.

We end this confusing regex with the end of the string.

String that matches:

73.60.124.136 (no, that's not my IP address :P)

String that doesn't match:

256.60.124.136 (the first group must be "25" and a number between zero and five)
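Here is a small Python sketch of the IP address pattern. It builds the pattern from the repeated octet sub-pattern so the structure is easier to see; the names OCTET, IP_RE, and is_ipv4 are hypothetical. Because every group uses ?:, the match object ends up with no captured groups.

```python
import re

# One octet, 0 to 255: "25" + 0-5, or "2" + 0-4 + any digit,
# or an optional 0/1 followed by one or two digits.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Three "octet plus dot" repetitions, then a final octet, as in the article's pattern.
IP_RE = re.compile(r"^(?:" + OCTET + r"\.){3}" + OCTET + r"$")


def is_ipv4(text: str) -> bool:
    """Return True if text looks like a dotted IPv4 address."""
    return IP_RE.match(text) is not None


print(is_ipv4("73.60.124.136"))   # True
print(is_ipv4("256.60.124.136"))  # False (256 is out of range)
print(is_ipv4("192.168.1"))       # False (only three octets)

# Non-capturing groups leave nothing to retrieve.
print(IP_RE.match("73.60.124.136").groups())  # ()
```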

8. Matching an HTML Tag

Pattern:

/^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$/

Description:

One of the more useful regexes on the list. It matches any HTML tag with the content inside. As usual, we begin with the start of the line.

First comes the tag's name. It must be one or more letters long. This is the first capture group; it comes in handy when we have to grab the closing tag. The next thing is the tag's attributes. As the pattern is written, this is any character but a less than sign (<), since the character class is [^<]; a greater than sign inside an attribute is therefore not actually excluded. Since this is optional, but I want to match more than one character, the star is used. The plus sign makes up the attribute and value, and the star says as many attributes as you want.

Next comes the third group, which is a non-capture group. Inside, it will contain either a greater than sign, some content, and a closing tag; or some spaces, a forward slash, and a greater than sign. The first option looks for a greater than sign followed by any number of characters, and the closing tag. \1 is used, which represents the content that was captured in the first capturing group. In this case, it is the tag's name. Now, if that couldn't be matched, we want to look for a self closing tag (like an img, br, or hr tag). This needs to have one or more spaces followed by "/>".

The regex is ended with the end of the line.

String that matches:

<a href="http://net.tutsplus.com/">Nettuts+</a>

String that doesn't match:

(attributes can't contain greater than signs)
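The backreference is easier to follow with a running example. This Python sketch returns the tag name (group 1) and the inner content (group 3); the function name parse_tag and the second and third test strings are hypothetical. Because of the nested quantifier in ([^<]+)*, this is only meant for short strings like these.

```python
import re

# The article's pattern without the "/" delimiters.
TAG_RE = re.compile(r"^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$")


def parse_tag(html: str):
    """Return (tag name, inner content) or None if the pattern fails.

    The content is None for a self closing tag such as "<br />".
    """
    m = TAG_RE.match(html)
    return (m.group(1), m.group(3)) if m else None


print(parse_tag('<a href="http://net.tutsplus.com/">Nettuts+</a>'))  # ('a', 'Nettuts+')
print(parse_tag("<br />"))             # ('br', None)
print(parse_tag('<a href="#">x</b>'))  # None (closing tag does not match \1)
```

The last call fails because \1 must repeat exactly what the first group captured, so an opening a tag cannot be closed by a b tag.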

Conclusion

I hope that you have grasped the ideas behind regular expressions a little bit better. Hopefully you'll be using these regexes in future projects! Many times you won't need to decipher a regex character by character, but sometimes if you do, it helps you learn. Just remember, don't be afraid of regular expressions; they might not seem like it, but they make your life a lot easier. Just try and pull out a tag's name from a string without regular expressions! ;)
