Original address: http://blog.csdn.net/lhfly/article/details/7684319
Here are two C# methods that use regular expressions to extract either the text content or an attribute value of a tag in a page's HTML source:
1. Get the text inside a tag: for <a href="www.csdn.net" class="main">CSDN</a> the result is CSDN.
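// Requires: using System.Text.RegularExpressions;

/// <summary>
/// Gets the text content of the specified tag in a string.
/// </summary>
/// <param name="str">The string to search (HTML source)</param>
/// <param name="title">The tag name</param>
/// <returns>The text between the opening and closing tag</returns>
public static string GetTitleContent(string str, string title)
{
    // Match <title ...>text</title> and capture the text in the named group "text"
    string tmpStr = string.Format("<{0}[^>]*?>(?<text>[^<]*)</{1}>", title, title);
    Match titleMatch = Regex.Match(str, tmpStr, RegexOptions.IgnoreCase);
    // Group names are case-sensitive, so this must match the name used in the pattern
    string result = titleMatch.Groups["text"].Value;
    return result;
}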
2. Get an attribute of a tag: for <a href="www.csdn.net" class="main">CSDN</a>, requesting "href" gives the result www.csdn.net.
/// <summary>
/// Gets the value of the specified attribute of the specified tag in a string.
/// </summary>
/// <param name="str">The string to search (HTML source)</param>
/// <param name="title">The tag name</param>
/// <param name="attrib">The attribute name</param>
/// <returns>The attribute value</returns>
public static string GetTitleContent(string str, string title, string attrib)
{
    // Group 1 captures the optional opening quote; \1 requires the same closing quote.
    // The named group "url" captures the attribute value.
    string tmpStr = string.Format("<{0}[^>]*?{1}=(['\"]?)(?<url>[^'\"\\s>]+)\\1[^>]*>", title, attrib);
    Match titleMatch = Regex.Match(str, tmpStr, RegexOptions.IgnoreCase);
    // Group names are case-sensitive, so this must match the name used in the pattern
    string result = titleMatch.Groups["url"].Value;
    return result;
}
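One limitation of the attribute pattern is worth illustrating: because nothing separates the tag name from what follows it, asking for tag "a" can also match <abbr ...>, and asking for "href" can also match data-href. The sketch below is one possible stricter variant, not the original author's method. The class name StrictTagReader, the method name GetAttribute, and the group name "value" are hypothetical. It assumes the input is simple, well-formed markup; for arbitrary HTML a real parser is more reliable than a regular expression.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original post.
public static class StrictTagReader
{
    // Returns the value of attrib on the first <tag ...> element, or "" if none is found.
    public static string GetAttribute(string html, string tag, string attrib)
    {
        // \b after the tag name stops "a" from matching "abbr".
        // \s before the attribute name stops "href" from matching "data-href".
        // Regex.Escape guards against regex metacharacters in the arguments.
        string pattern = string.Format(
            "<{0}\\b[^>]*?\\s{1}\\s*=\\s*(['\"]?)(?<value>[^'\"\\s>]+)\\1[^>]*>",
            Regex.Escape(tag), Regex.Escape(attrib));

        Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups["value"].Value : string.Empty;
    }
}
```

For the input <abbr href="x">A</abbr><a data-href="y" href="www.csdn.net">CSDN</a>, calling StrictTagReader.GetAttribute(html, "a", "href") skips both the abbr element and the data-href attribute and returns www.csdn.net.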
Note: both methods return only the first match in the string. To read every matching value or attribute, call Regex.Matches instead of Regex.Match and iterate over the returned MatchCollection with foreach.
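Reading all matches could be sketched as follows. This is a minimal illustration that reuses the two patterns from the methods above. The class name TagReader and the method names GetAllTitleContent and GetAllAttributeValues are hypothetical and do not appear in the original post.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original post.
public static class TagReader
{
    // Returns the text content of every <title>...</title> element in str.
    public static List<string> GetAllTitleContent(string str, string title)
    {
        string pattern = string.Format("<{0}[^>]*?>(?<text>[^<]*)</{0}>", title);
        var results = new List<string>();
        // Regex.Matches returns a MatchCollection holding every non-overlapping match
        foreach (Match m in Regex.Matches(str, pattern, RegexOptions.IgnoreCase))
        {
            results.Add(m.Groups["text"].Value);
        }
        return results;
    }

    // Returns the value of attrib for every <title ...> element in str.
    public static List<string> GetAllAttributeValues(string str, string title, string attrib)
    {
        string pattern = string.Format(
            "<{0}[^>]*?{1}=(['\"]?)(?<url>[^'\"\\s>]+)\\1[^>]*>", title, attrib);
        var results = new List<string>();
        foreach (Match m in Regex.Matches(str, pattern, RegexOptions.IgnoreCase))
        {
            results.Add(m.Groups["url"].Value);
        }
        return results;
    }
}
```

Given a string containing <a href="www.csdn.net" class="main">CSDN</a> followed by <a href='blog.csdn.net'>Blog</a>, the first method returns the two texts CSDN and Blog, and the second, called with "href", returns www.csdn.net and blog.csdn.net.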