Chaos, imprinted: Q&A on learning regular expressions

Recently I had the honor of being a guest in themed Q&A sessions on two websites, Open Source China and 51CTO. During those sessions I collected some common questions about learning regular expressions, and I am using this space to answer them.

Regular expressions are hard to learn; there is no doubt about that. But I think the difficulty lies mainly in the syntax. Regular expressions have been around for decades, and their syntax was born in the 1970s. What was computing like then? To take a simple example, Unix directories were given names such as usr and dev, and those names have been handed down ever since. Many people now criticize them: usr is not spelled out as user, nor dev as device, so they are hard to learn and hard to remember. After years of rapid development, many of those old problems have been neatly wrapped up, and today's users are probably more used to clicking a "User directory" or "Drive" icon than to worrying about irregular abbreviations. Unfortunately, the syntax of regular expressions has barely changed, and even the features added later inherited the old style. Now that programming languages are increasingly human-friendly, regular expressions naturally look hard to understand. Today's developers might be more comfortable with something like Regex.CharRange('a', 'z') than with [a-z], and a construct such as (?=[a-z]) is even more baffling unless it is translated into something like Regex.CheckRight(Regex.CharRange('a', 'z')).

Seen from another angle, however, the two say the same thing and differ only in form: one is like a terse classical style, the other like plain vernacular. If we can build the translation from the terse form to the vernacular in our heads, regular expressions become much simpler, and writing one can even be described as snapping modules together. For example, an Alipay serial number is 18 or 26 digits long. Matched with a regular expression, that is ^([0-9]{18}|[0-9]{26})$ or ^[0-9]{18}([0-9]{8})?$. The logic is simple: ^ anchors the start, $ anchors the end, and [0-9] matches one digit character. ([0-9]{18}|[0-9]{26}) expresses two parallel alternatives, a digit string of length 18 or of length 26. [0-9]{18}([0-9]{8})? says that an 18-digit string must appear, optionally followed by an 8-digit string (for a total length of 26). Ordinary applications of regular expressions are as simple as that.
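The two serial-number patterns can be checked against each other with a few lines of code. The following is a minimal sketch in Python (the language and the helper name is_serial_number are assumptions made for illustration; the article does not name either):

```python
import re

# The two patterns from the text: explicit alternatives vs. an optional 8-digit tail.
ALTERNATION = re.compile(r"^([0-9]{18}|[0-9]{26})$")
OPTIONAL_TAIL = re.compile(r"^[0-9]{18}([0-9]{8})?$")


def is_serial_number(text: str) -> bool:
    """Hypothetical helper: True if text is exactly 18 or exactly 26 digits."""
    return ALTERNATION.match(text) is not None


samples = ["1" * 18, "1" * 26, "1" * 17, "1" * 20, "1" * 27, "x" + "1" * 17]
for sample in samples:
    # Both spellings of the rule should agree on every input.
    assert bool(ALTERNATION.match(sample)) == bool(OPTIONAL_TAIL.match(sample))
    print(len(sample), is_serial_number(sample))
```

One detail specific to Python: $ also matches just before a trailing newline, so re.fullmatch or \Z may be a better fit when the input has not been stripped.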

If you agree with the above, then the remaining problem in learning regular expressions is choosing the right method. When we learn a programming language, we all stress that reading books is not enough: you have to write programs, and the best approach is to run the book's examples yourself, because that is how you really learn. But many people do not regard regular expressions as a programming language, so they only skim when learning, or are even content to copy ready-made expressions from the web. Hence one of the common questions is "Is there a shortcut?" Unfortunately, the answer is no. Copying other people's code does not teach you to program, and copying ready-made expressions and leafing through a few documents will not teach you regular expressions either. The good news is that really learning regular expressions does not take long.

In my experience, what really needs to be done is to understand the common features: character groups, multiple-choice branches, match modes, and lookaround. Once you understand these, perhaps 80% of regular expression problems can be solved. But understanding them takes dedicated study: what problem does a character group solve, and how is it used? What problem does a multiple-choice branch solve, and how is it used? Take the time to study and think until all of these are clear, and only then look at how to build expressions for complex problems. If you can study 1-2 hours a day, you will see clear results within two weeks, and after a month you can reach a fairly high level. Also, in my experience, when learning a new programming language you should not only type in and run the book's examples but also modify the sample code yourself, see what happens, and think about why. If you do the same when learning regular expressions, you will get twice the result with half the effort.
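As a starting point for that kind of experimentation, here is a small sketch of the four features in Python's re module. The sample strings and variable names are invented for illustration, and other languages spell some of these features differently:

```python
import re

# Character group (character class): one character out of a set.
vowels = re.findall(r"[aeiou]", "regex")

# Multiple-choice branch (alternation): one of several sub-patterns.
pets = re.findall(r"cat|dog", "cat, dog, bird")

# Match mode: a flag that changes how the whole pattern is interpreted.
any_case = re.findall(r"regex", "Regex REGEX", flags=re.IGNORECASE)

# Lookaround: require something next to the match without consuming it.
# (?=...) looks ahead of the match, (?<=...) looks behind it.
usd_amounts = re.findall(r"[0-9]+(?= USD)", "10 USD, 20 EUR")
issue_ids = re.findall(r"(?<=#)[0-9]+", "issue #42")

print(vowels, pets, any_case, usd_amounts, issue_ids)
```

Changing one piece at a time, for example replacing (?= USD) with (?= EUR), and predicting the result before running it is the kind of exercise the paragraph above recommends.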

If you really understand these common features and have a clear idea of their value and use, another question answers itself: how do regular expressions differ across languages? Although the rules differ from language to language, the ideas behind them are the same; only the notation, or the way the concepts are realized, differs. Conveniently, a programming language's documentation will not explain in detail what a character group or a multiple-choice branch is, but it will tell you in detail how each is written in that language (you can search these documents for "character class" or "alternation"). So if the concepts are clear in your mind, then even when you are unsure how to write the final expression, a look at the documentation is enough. For example, the character group \s, which matches a whitespace character, must be written \\s in a Java string: traditionally \s was not a legal escape sequence in a Java string (since Java 15 it denotes a plain space), so the \ itself must be escaped with another \ for the regex engine to receive \s. In PHP you can write \s directly, because PHP leaves unrecognized escape sequences intact when it processes strings. Some Unix tools require [[:space:]], which is how the POSIX specification expresses the Perl-style \s. This looks troublesome, but it is only a matter of notation, because we know that what we need here is "the character group that matches whitespace".
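The same "one concept, several spellings" point can be seen inside a single language. The sketch below uses Python, which the article does not discuss, so treat it only as an analogy to the Java and PHP cases; the helper name split_on_whitespace is made up for the example:

```python
import re

# In an ordinary Python string literal the backslash is doubled, much as in Java,
# because \s is not a recognized string escape (recent Python versions warn about it).
escaped = "\\s+"

# A raw string literal keeps the backslash exactly as written.
raw = r"\s+"

# Both literals produce the same three characters: backslash, s, plus.
assert escaped == raw


def split_on_whitespace(text: str) -> list[str]:
    """Hypothetical helper: split text on runs of whitespace characters."""
    return re.split(raw, text.strip())


print(split_on_whitespace("a b\t c\nd"))
```

Whichever spelling the host language requires, the regex engine ends up with the same character group for whitespace.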

Having written all this, I can imagine some people saying: regular expressions are not refined enough to deserve so much effort. Perhaps this view is the root of the attitude "there is no need to study regular expressions seriously". Fortunately, the question is easy to answer, because many things work the same way. We do not ask everyone to become a writer, yet everyone will probably need to write a few serious pieces at some point, and "I am not a writer" is no excuse for being unable to write a serious piece when it is needed. To be able to write one when needed, you have to invest time in learning and practising writing. Learning regular expressions works the same way.

This argument persuades some people, but not everyone. And from what I have observed, those who are not persuaded do not seem to have put much effort into other "things" either, yet they are often tormented by regular expressions. By contrast, a truly professional programmer, as The Productive Programmer describes, is willing to spend 2 hours writing a regular expression in order to save endless time later. Of course, all of this presupposes the right attitude toward learning regular expressions, or toward learning any valuable skill. People who build software have mostly read Brooks's famous "No Silver Bullet", so let me borrow his words: in learning regular expressions, there is no silver bullet.

This is an original article by Yurii. When reproducing it, please credit the source: chaos, imprinted
