Regular Expressions and Crawling Web Page Images

Source: Internet
Author: User
Regular Expression

Namespace: using System.Text.RegularExpressions;

Common classes:

Regex

MatchCollection

Match

Group

GroupCollection

Common Methods:

Regex.IsMatch(); returns bool

Regex.Match(); returns Match

Regex.Matches(); returns MatchCollection

Regex.Replace(); returns string
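
To make the four return types concrete, here is a minimal sketch that runs each method against the same input. The sample text, the pattern, and the class name RegexMethodsDemo are made up for illustration and do not come from the original article.

```csharp
using System;
using System.Text.RegularExpressions;

class RegexMethodsDemo
{
    static void Main()
    {
        string input = "a1 b22 c333"; // hypothetical sample text
        string pattern = @"\d+";      // one or more digits

        bool any = Regex.IsMatch(input, pattern);            // is there at least one match?
        Match first = Regex.Match(input, pattern);           // the first match only
        MatchCollection all = Regex.Matches(input, pattern); // every match
        string masked = Regex.Replace(input, pattern, "#");  // each match replaced

        Console.WriteLine(any);         // True
        Console.WriteLine(first.Value); // 1
        Console.WriteLine(all.Count);   // 3
        Console.WriteLine(masked);      // a# b# c#
    }
}
```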

Capturing images with regular expressions:

Referenced namespaces: using System.Net;

using System.IO;

using System.Collections.Generic; (needed for List<string>)

Approach: 1. Download the full HTML of the web page. 2. Use a regular expression to match the image tags and extract the address of each image. 3. Download the images.
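
The matching step (step 2) can be tried offline before any network access is involved. The sketch below applies an img/src pattern to a hard-coded string. The sample markup, the helper name ExtractImageSources, and the RegexOptions.IgnoreCase flag are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ImageSrcDemo
{
    // Hypothetical helper: returns the src value of every <img> tag found in html.
    public static List<string> ExtractImageSources(string html)
    {
        var result = new List<string>();
        // Group 1 captures the text between the quotes of src="...".
        // IgnoreCase is an addition so that <IMG ...> is matched as well.
        foreach (Match m in Regex.Matches(html, @"<\s?img[^>]+src=""([^""]+)""", RegexOptions.IgnoreCase))
        {
            result.Add(m.Groups[1].Value.Trim());
        }
        return result;
    }

    static void Main()
    {
        // Made-up markup standing in for a downloaded page.
        string html = @"<p><img alt=""a"" src=""img/one.png""> <IMG src=""two.jpg"" /></p>";
        foreach (string src in ExtractImageSources(html))
        {
            Console.WriteLine(src); // img/one.png, then two.jpg
        }
    }
}
```

A pattern like this only handles double-quoted src attributes; for irregular real-world markup an HTML parser is usually the more robust choice.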

static void Main(string[] args)
{
    WebClient wc = new WebClient();
    string html = wc.DownloadString(@"--- page ---");

    // Match every <img> tag and capture its src value in group 1.
    // A page usually contains many images, so the results are stored in a list.
    MatchCollection mc = Regex.Matches(html, @"<\s?img[^>]+src=""([^""]+)""");
    List<string> pic = new List<string>();
    foreach (Match m in mc) // traverse the matches
    {
        if (m.Success) // only successful matches go into the list
        {
            pic.Add(m.Groups[1].Value.Trim()); // the value inside src="~~~", i.e. the image name
        }
    }

    string url = @"webpage address";
    for (int i = 0; i < pic.Count; i++)
    {
        // Prefix the image name with the page URL to get the complete image address.
        pic[i] = url + "/" + pic[i];
    }

    string address = "target location to download";
    if (!Directory.Exists(address)) // create the target folder if it does not exist yet
    {
        Directory.CreateDirectory(address);
    }

    for (int i = 0; i < pic.Count; i++)
    {
        // ".+/(.+)" is greedy, so group 1 holds everything after the last "/":
        // the image name, which is reused so the local file is named as it is online.
        string name = Regex.Match(pic[i], @".+/(.+)").Groups[1].Value;
        wc.DownloadFile(pic[i], Path.Combine(address, name)); // download
    }

    Console.ReadKey();
}
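
The listing above builds each image address by joining the page URL and the image name with a slash, which only works when every src value is a plain relative name. As a possible alternative (not the original author's method), the sketch below lets System.Uri resolve the address and takes the file name from the URL path. The helper names ResolveImage and LocalFileName are hypothetical, and example.com is a placeholder host.

```csharp
using System;
using System.IO;

static class ImageUrlDemo
{
    // Hypothetical helper: turns a src value into an absolute address,
    // whether it is relative ("img/a.png") or already absolute.
    public static Uri ResolveImage(string pageUrl, string src)
    {
        return new Uri(new Uri(pageUrl), src);
    }

    // Hypothetical helper: local file name taken from the last path segment,
    // so any query string is left out.
    public static string LocalFileName(Uri image)
    {
        return Path.GetFileName(image.AbsolutePath);
    }

    static void Main()
    {
        // Placeholder page address, not a page from the original article.
        Uri image = ResolveImage("http://example.com/gallery/index.html", "img/a.png");
        Console.WriteLine(image);                // http://example.com/gallery/img/a.png
        Console.WriteLine(LocalFileName(image)); // a.png
    }
}
```

The resolved address and file name could then be passed to DownloadFile in place of the concatenated string and the regex-based name.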
