Some of the material in this course is simplified to match how beginners think and is not strictly rigorous. For example, many statements are wrong under multi-AppDomain conditions, but a rigorous treatment would leave everyone dizzy, so the course stays with the non-rigorous explanation.
Many interview questions come from this stage of the course.
.NET advanced technology is high-level content; decide how deep to study based on your own foundation.
Reference material: "C# Advanced Programming", "C# Illustrated Tutorial", "CLR via C#"
The prelude to Regular Expressions: Hell
Requirement 1: "192.168.10.5[port=8080]". This string indicates that port 8080 of the server with IP address 192.168.10.5 is open. Write a program that parses this string and prints "Port *** of the server with IP address *** is open".
Requirement 2: "192.168.10.5[port=21,type=ftp]". This string indicates that port 21 of the server with IP address 192.168.10.5 provides the FTP service. If the ",type=ftp" part is omitted, the service defaults to HTTP. Write a program that parses this string and prints "Port *** of the server with IP address *** provides the *** service".
Requirement 3: Determine whether a string is an email address. It must contain @ and ., must not start or end with @ or ., and @ must come before the last ".".
Requirement 4: Extract all the email addresses from a piece of text: "I have all 333M of photos; if you want them, send me an email: [email protected]. I also want them: [email protected], [email protected]. Well done, original poster: [email protected]."
Requirement 5: Extract all the pictures and hyperlinks in a web page.
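To show why this section is called "Hell", here is a minimal sketch of Requirement 1 done with plain string methods and no regular expressions. The class name ManualParse and the method name Describe are made up for illustration, and the sketch assumes the input is well formed (it does no validation).

```csharp
using System;

static class ManualParse
{
    // Hypothetical helper: parses "ip[port=NNNN]" by hand.
    // Assumes the input is well formed; malformed input will throw.
    public static string Describe(string input)
    {
        int open = input.IndexOf('[');          // end of the IP part
        int eq = input.IndexOf('=', open);      // just before the port number
        int close = input.IndexOf(']', eq);     // just after the port number
        string ip = input.Substring(0, open);
        string port = input.Substring(eq + 1, close - eq - 1);
        return $"Port {port} of the server with IP address {ip} is open";
    }

    static void Main()
    {
        // Prints: Port 8080 of the server with IP address 192.168.10.5 is open
        Console.WriteLine(Describe("192.168.10.5[port=8080]"));
    }
}
```

Every extra rule (an optional type, validation of the IP) adds more index arithmetic of this kind, which is the pain the rest of the lesson sets out to remove.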
Getting Started with regular expressions: Paradise
Regular expressions are a text-processing technique. They are language-independent and are implemented in almost every language; JavaScript uses them too.
A regular expression is a text pattern consisting of ordinary characters and special characters (called metacharacters). The pattern describes one or more strings to match when searching a body of text. A regular expression acts as a template that is matched against the string being searched.
Just like the wildcards "*.jpg" and "%ab%", a regular expression is a special string used to match other strings. Regular expressions are very complex, so do not try to master them all at once. Understand what they can do (string matching, string extraction, string replacement) and master the common usage; you can look up the rest when you need it.
Regular expressions are a selling point when looking for a job. They also come up later in the project, in sensitive-word filtering, validators, and so on.
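The three uses named above (matching, extraction, replacement) correspond to three static methods of the .NET Regex class. The following is a small sketch; the class and method names (ThreeUses, ContainsDigit, FirstNumber, MaskDigits) are invented for illustration, and the patterns used are explained in the metacharacter sections.

```csharp
using System;
using System.Text.RegularExpressions;

static class ThreeUses
{
    // Matching: does the text contain a digit anywhere?
    public static bool ContainsDigit(string text) => Regex.IsMatch(text, "[0-9]");

    // Extraction: return the first run of digits ("" when there is none).
    public static string FirstNumber(string text) => Regex.Match(text, "[0-9]+").Value;

    // Replacement: replace every digit with an asterisk.
    public static string MaskDigits(string text) => Regex.Replace(text, "[0-9]", "*");

    static void Main()
    {
        Console.WriteLine(ContainsDigit("room 101"));        // True
        Console.WriteLine(FirstNumber("port 8080 and 21"));  // 8080
        Console.WriteLine(MaskDigits("tel 555"));            // tel ***
    }
}
```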
Metacharacters 1
Metacharacters are the hurdle you must get over to learn regular expressions. Do not try to memorize them by rote.
.: Matches any single character except \n. For example, the regular expression "b.g" matches "big", "bug", and "b g", but does not match "buug"; "b..g" does match "buug".
[]: Matches any one of the characters inside the brackets. For example, "b[aui]g" matches "bug", "big", and "bag", but not "beg" or "baug". Inside the brackets you can use a hyphen "-" to specify a range of characters: "[0-9]" matches any digit, so "a[0-9]c" is equivalent to "a[0123456789]c" and matches strings such as "a0c", "a1c", and "a2c". You can also combine several ranges: "[A-Za-z]" matches any uppercase or lowercase letter, and "[A-Za-z0-9]" matches any uppercase or lowercase letter or digit.
(): The expression enclosed in () is defined as a "group", and the characters that match it are saved to a temporary area, which is very useful when extracting strings. Parentheses also make several characters behave as a single unit. So they have two roles: changing precedence and defining extraction groups.
|: A logical OR of two matching conditions. "z|food" matches "z" or "food". "(z|f)ood" matches "zood" or "food".
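The examples in this list can be checked with a few lines of C#. This is only a sketch: the helper names MatchesWhole and FirstGroup are invented here, and MatchesWhole wraps the pattern in ^( and )$ so that the whole input has to match (those two anchors are covered under Metacharacters 3).

```csharp
using System;
using System.Text.RegularExpressions;

static class Metacharacters1
{
    // True when the pattern matches the entire input, not just a part of it.
    public static bool MatchesWhole(string input, string pattern) =>
        Regex.IsMatch(input, "^(" + pattern + ")$");

    // Returns the text captured by the first pair of parentheses in the pattern.
    public static string FirstGroup(string input, string pattern) =>
        Regex.Match(input, pattern).Groups[1].Value;

    static void Main()
    {
        Console.WriteLine(MatchesWhole("bug", "b.g"));        // True
        Console.WriteLine(MatchesWhole("buug", "b.g"));       // False
        Console.WriteLine(MatchesWhole("buug", "b..g"));      // True
        Console.WriteLine(MatchesWhole("bag", "b[aui]g"));    // True
        Console.WriteLine(MatchesWhole("beg", "b[aui]g"));    // False
        Console.WriteLine(MatchesWhole("a7c", "a[0-9]c"));    // True
        Console.WriteLine(MatchesWhole("food", "z|food"));    // True
        Console.WriteLine(MatchesWhole("zood", "(z|f)ood"));  // True
        Console.WriteLine(FirstGroup("food", "(z|f)ood"));    // f
    }
}
```

The last line shows the "extraction" role of a group: the group remembers which alternative actually matched.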
Metacharacters 2
*: Matches the preceding subexpression zero or more times. It has nothing to do with the wildcard *. For example, the regular expression "zo*" matches "z", "zo", and "zoo", so ".*" matches any string (without line breaks). "z(b|c)*" → zb, zbc, zcb, zccc, zbbbccc. "z(ab)*" matches z, zab, zabab (the parentheses change precedence).
+: Matches the preceding subexpression one or more times; compare with * (zero or more). For example, the regular expression "9+" matches 9, 99, 999, and so on. "zo+" matches "zo" and "zoo" but not "z".
?: Matches the preceding subexpression zero or one time. For example, "do(es)?" matches "do" or "does". Typically used to match an optional part.
{n}: Matches exactly n times. "zo{2}" → zoo. For example, "e{2}" cannot match the "e" in "bed", but matches the two "e"s in "seed".
{n,}: Matches at least n times. For example, "e{2,}" cannot match the "e" in "bed", but matches all the "e"s in "seeeeeeeed".
{n,m}: Matches at least n times and at most m times. "e{1,3}" matches the first three "e"s in "seeeeeeeed".
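The quantifier examples above can be tried out as follows. This is a sketch with invented names (Quantifiers, FirstMatch); FirstMatch returns the first substring that the pattern matches, or an empty string when nothing matches. The long test word is built in code so that the number of "e"s (eight here, chosen for illustration) is explicit.

```csharp
using System;
using System.Text.RegularExpressions;

static class Quantifiers
{
    // Returns the first substring of input matched by pattern ("" if none).
    public static string FirstMatch(string input, string pattern) =>
        Regex.Match(input, pattern).Value;

    static void Main()
    {
        string longWord = "s" + new string('e', 8) + "d";  // 's', eight 'e', 'd'

        Console.WriteLine(FirstMatch("zoo", "zo*"));               // zoo
        Console.WriteLine(FirstMatch("z", "zo*"));                 // z
        Console.WriteLine(FirstMatch("z", "zo+") == "");           // True (no match)
        Console.WriteLine(FirstMatch("999", "9+"));                // 999
        Console.WriteLine(FirstMatch("does", "do(es)?"));          // does
        Console.WriteLine(FirstMatch("do", "do(es)?"));            // do
        Console.WriteLine(FirstMatch("zabab", "z(ab)*"));          // zabab
        Console.WriteLine(FirstMatch("bed", "e{2}") == "");        // True (no match)
        Console.WriteLine(FirstMatch("seed", "e{2}"));             // ee
        Console.WriteLine(FirstMatch(longWord, "e{2,}").Length);   // 8
        Console.WriteLine(FirstMatch(longWord, "e{1,3}"));         // eee
    }
}
```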
Metacharacters 3
^ (Shift+6): Matches the start of a line. For example, the regular expression "^regex" matches the beginning of the string "regex I will use", but does not match "I will use regex". ^ also has another meaning: negation (no need to understand this for now).
$: Matches the end of a line. For example, the regular expression "cloud$" matches the end of the string "Everything is a cloud", but does not match the string "floating clouds Ah".
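A short sketch of the two anchors, plus the "other meaning" of ^ inside square brackets. The names Anchors, StartsWithRegex, EndsWithCloud, and IsAllDigits are invented for illustration. Combining ^ and $ is a common way to require that the whole string fits a pattern.

```csharp
using System;
using System.Text.RegularExpressions;

static class Anchors
{
    // ^ pins the match to the start of the input.
    public static bool StartsWithRegex(string s) => Regex.IsMatch(s, "^regex");

    // $ pins the match to the end of the input.
    public static bool EndsWithCloud(string s) => Regex.IsMatch(s, "cloud$");

    // ^ and $ together: the whole input must be digits.
    // Note: in .NET, $ also matches just before a final "\n".
    public static bool IsAllDigits(string s) => Regex.IsMatch(s, "^[0-9]+$");

    static void Main()
    {
        Console.WriteLine(StartsWithRegex("regex I will use"));    // True
        Console.WriteLine(StartsWithRegex("I will use regex"));    // False
        Console.WriteLine(EndsWithCloud("Everything is a cloud")); // True
        Console.WriteLine(EndsWithCloud("floating clouds Ah"));    // False
        Console.WriteLine(IsAllDigits("12345"));                   // True
        Console.WriteLine(IsAllDigits("123a5"));                   // False

        // Inside brackets, a leading ^ negates: [^0-9] is "any non-digit".
        Console.WriteLine(Regex.Match("42a", "[^0-9]").Value);     // a
    }
}
```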
Shorthand expressions
Note that the shorthand expressions below are written without regard to C# escaping: the \ shown is the literal character \, not a C# string-level escape. In C# code you therefore need either an @ verbatim string or a doubled backslash. Keep C#-level escaping and regex-level escaping apart: the two simply happen to use the same escape character, \. Regex escaping is processed after C# escaping (the layers are peeled off one at a time). It may help to imagine that the C# escape character were %. To C#, @"\-" is just the ordinary string \-; only the regular expression engine sees a special meaning in it.
For example, the pattern \d is written in C# as "\\d" or @"\d".
\d: Represents a digit, equivalent to [0-9]
\D: Represents a non-digit, equivalent to [^0-9]
\s: Represents whitespace characters such as line breaks and tabs
\S: Represents non-whitespace characters
\w: Matches a letter, digit, underscore, or Chinese character, that is, any character that can form a word
\W: Non-\w, equivalent to [^\w]
d: digit; s: space; w: word. The uppercase form means "not".
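To tie the shorthand expressions and the two escaping layers together, here is a sketch that checks that "\\d" and @"\d" are the same C# string, tries each shorthand on one character, and then applies \d and \w to Requirement 2 from the prelude. The names ShorthandDemo, SamePattern, IsSingle, and Describe are invented for illustration, and the pattern is one possible solution, not the course's official answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class ShorthandDemo
{
    // Both literals produce the same two-character string: backslash, d.
    public static bool SamePattern() => "\\d" == @"\d";

    // True when the input is exactly one character of the given class.
    public static bool IsSingle(string input, string shorthand) =>
        Regex.IsMatch(input, "^" + shorthand + "$");

    // Requirement 2 sketch: the ",type=..." part is optional, default http.
    public static string Describe(string input)
    {
        // \[ and \] are escaped because square brackets are metacharacters.
        Match m = Regex.Match(input, @"^(.+)\[port=(\d+)(,type=(\w+))?\]$");
        if (!m.Success) return string.Empty;
        string ip = m.Groups[1].Value;
        string port = m.Groups[2].Value;
        // Group 4 only succeeds when the optional part was present.
        string service = m.Groups[4].Success ? m.Groups[4].Value : "http";
        return $"Port {port} of the server with IP address {ip} provides the {service} service";
    }

    static void Main()
    {
        Console.WriteLine(SamePattern());          // True
        Console.WriteLine(IsSingle("7", @"\d"));   // True
        Console.WriteLine(IsSingle("x", @"\D"));   // True
        Console.WriteLine(IsSingle("\t", @"\s"));  // True
        Console.WriteLine(IsSingle("_", @"\w"));   // True
        Console.WriteLine(IsSingle("-", @"\W"));   // True
        Console.WriteLine(Describe("192.168.10.5[port=21,type=ftp]"));
        Console.WriteLine(Describe("192.168.10.5[port=8080]"));
    }
}
```

Compared with parsing by index arithmetic, the optional part costs only one extra group and a ? here.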