Regular Expression Summary
1. Match an <a> tag and its URL:
Regex regA = new Regex(@"<a[\s]+[^<>]*href=(?:""|')([^<>""']+)(?:""|')[^<>]*>([^<>]+)</a>", RegexOptions.IgnoreCase);
Note: in the regular expression above,
The following part matches any attributes before and after the href attribute:
[^<>]*
The following part captures the URL between the quotation marks of the href attribute (inside a C# verbatim string, "" stands for a single " character):
([^<>""']+)
The following part captures the text between the opening and closing tags:
([^<>]+)
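As a rough illustration of how the two capture groups can be read back, here is a minimal sketch. The class name LinkExtractor, the method Extract, and the sample HTML are hypothetical and used only for illustration; the pattern is the same as regA.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Same pattern as regA above.
    private static readonly Regex RegA = new Regex(
        @"<a[\s]+[^<>]*href=(?:""|')([^<>""']+)(?:""|')[^<>]*>([^<>]+)</a>",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns (url, text) pairs for every <a> tag found.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var links = new List<KeyValuePair<string, string>>();
        foreach (Match m in RegA.Matches(html))
        {
            // Group 1 is the href value, group 2 is the text between the tags.
            links.Add(new KeyValuePair<string, string>(m.Groups[1].Value, m.Groups[2].Value));
        }
        return links;
    }

    public static void Main()
    {
        // Made-up sample input.
        string html = "<p><a class='nav' href=\"https://example.com/a\" target=\"_blank\">First</a>"
                    + " and <a href='/b'>Second</a></p>";
        foreach (var link in Extract(html))
        {
            Console.WriteLine(link.Key + " -> " + link.Value);
        }
        // prints:
        // https://example.com/a -> First
        // /b -> Second
    }
}
```

Because group 2 is ([^<>]+), a link whose text contains nested tags, such as <a href="x"><b>text</b></a>, is not matched by this pattern.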
2. Match an <img> tag and its URL:
Regex regImg = new Regex(@"<img[^<>]*src=(?:""|')([^<>""']+(?:\.jpg|\.jpeg|\.png|\.gif))(?:""|')[^<>]*>", RegexOptions.IgnoreCase);
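The sketch below shows one way such a pattern might be used to collect image URLs. It assumes the pattern opens with <img[^<>]* before src=; the class name ImageExtractor, the method ExtractSources, and the sample HTML are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageExtractor
{
    // Assumption: the pattern starts with <img[^<>]* before src=.
    private static readonly Regex RegImg = new Regex(
        @"<img[^<>]*src=(?:""|')([^<>""']+(?:\.jpg|\.jpeg|\.png|\.gif))(?:""|')[^<>]*>",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the src value of every matching <img> tag.
    public static List<string> ExtractSources(string html)
    {
        var sources = new List<string>();
        foreach (Match m in RegImg.Matches(html))
        {
            // Group 1 is the URL, including its file extension.
            sources.Add(m.Groups[1].Value);
        }
        return sources;
    }

    public static void Main()
    {
        // Made-up sample input: the .svg image is skipped because its
        // extension is not one of jpg, jpeg, png, or gif.
        string html = "<img alt=\"logo\" src=\"/img/logo.PNG\" width=\"40\">"
                    + "<img src='/img/chart.svg'>"
                    + "<img src='photo.jpeg'/>";
        foreach (string src in ExtractSources(html))
        {
            Console.WriteLine(src);
        }
        // prints:
        // /img/logo.PNG
        // photo.jpeg
    }
}
```

RegexOptions.IgnoreCase is what lets the upper-case .PNG extension match here.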
3. Match a tag together with the content inside it:
reg = new Regex(@"<dl class=""ksDl"">(?:(?!</dl>)[\s\S])*</dl>", RegexOptions.IgnoreCase);
Note: for the following HTML string, the expression above finds two matches:
<dl class="ksDl"><div>test</div></dl><dl class="ksDl"><div>test</div></dl>
If the regular expression is written like this:
reg = new Regex(@"<dl class=""ksDl"">[\s\S]*</dl>", RegexOptions.IgnoreCase);
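This version finds only one match, because the greedy [\s\S]* runs through to the last </dl>. Note the role of the following part, which consumes one character at a time only while the next characters are not </dl>, so the matched content can never contain </dl>: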
(?:(?!</dl>)[\s\S])*
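The difference between the two patterns can be checked with a small sketch like the one below. The class name DlMatcher and its two counting methods are hypothetical; the patterns and the sample HTML are the ones discussed above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DlMatcher
{
    // Pattern with the negative lookahead: the content cannot contain </dl>.
    private static readonly Regex Tempered = new Regex(
        @"<dl class=""ksDl"">(?:(?!</dl>)[\s\S])*</dl>", RegexOptions.IgnoreCase);

    // Plain greedy pattern: [\s\S]* extends to the last </dl> in the input.
    private static readonly Regex Greedy = new Regex(
        @"<dl class=""ksDl"">[\s\S]*</dl>", RegexOptions.IgnoreCase);

    public static int CountTempered(string html)
    {
        return Tempered.Matches(html).Count;
    }

    public static int CountGreedy(string html)
    {
        return Greedy.Matches(html).Count;
    }

    public static void Main()
    {
        string html = "<dl class=\"ksDl\"><div>test</div></dl>"
                    + "<dl class=\"ksDl\"><div>test</div></dl>";
        Console.WriteLine("tempered: " + CountTempered(html)); // prints "tempered: 2"
        Console.WriteLine("greedy: " + CountGreedy(html));     // prints "greedy: 1"
    }
}
```

For this particular input, a lazy quantifier ([\s\S]*?) would also yield two matches; the lookahead form simply makes the exclusion of </dl> explicit.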