Analysis of Data Collection Programs Using ASP.NET Techniques

Source: Internet
Author: User

ASP.NET tips: introduction to the data collection program

First, some concepts. A so-called data collection program is what is also known as a web page "thief" program, that is, a web scraper (don't scold me). Having written one, I am sharing it in the hope that we can study it together.

ASP.NET tips: data collection program, step 1: downloading the data

Some websites only show the data after you log in, so the program has to send the login username and password. I did manage to log in, but the server was not that naive: it redirected me and generated two sessions, and I did not know how to capture the second one. So I took a shortcut ^-^. I captured the session cookie with a packet-capture tool called Ethereal (since renamed Wireshark) and added it to the HTTP request headers with the following code.

 
 
    // sessionkey holds the captured Cookie header value; both values are read from text boxes.
    WebClient myWebClient = new WebClient();
    string sessionkey = textBox78.Text;
    string refererurl = textBox77.Text;
    myWebClient.Headers.Clear();
    myWebClient.Headers.Add("Cookie", sessionkey);
    myWebClient.Headers.Add("Referer", refererurl);
    myWebClient.Headers.Add("User-agent", "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031107 Debian/1.5-3");

With these headers, the server treats the request as if it came from the browser session that is already logged in, haha.
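Instead of copying the cookie out of a packet capture by hand, the session could probably be captured in code. The following is a minimal sketch, not the author's method: it assumes the site uses an ordinary form login, and the class name `CookieAwareWebClient`, the `LogIn` helper, and the form field names `username` and `password` are all made up for illustration. A `CookieContainer` attached to each request stores every `Set-Cookie` the server sends, including those issued during the redirect after login.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;

// WebClient subclass that remembers cookies across requests.
// The class name is hypothetical; it is not part of the framework.
public class CookieAwareWebClient : WebClient
{
    public CookieContainer Cookies { get; } = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest http)
        {
            http.CookieContainer = Cookies;  // store and resend session cookies
            http.AllowAutoRedirect = true;   // follow the redirect after login
        }
        return request;
    }
}

public static class LoginExample
{
    // loginUrl and the field names "username"/"password" are placeholders;
    // the real names depend on the target site's login form.
    public static CookieAwareWebClient LogIn(string loginUrl, string user, string password)
    {
        var client = new CookieAwareWebClient();
        var form = new NameValueCollection
        {
            { "username", user },
            { "password", password }
        };
        client.UploadValues(loginUrl, "POST", form); // POST the login form
        return client; // later DownloadData calls reuse the stored cookies
    }
}
```

After `LogIn` returns, calling `DownloadData` on the same client sends the stored cookies automatically, so no manual `Cookie` header is needed.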

ASP.NET tips: data collection program, step 2: downloading the page source

 
 
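    // remoteUri is the address of the page to fetch; download is a string declared elsewhere.
    byte[] myDataBuffer = myWebClient.DownloadData(remoteUri);
    // Encoding.Default is the system code page; it only gives correct text if the page uses the same encoding.
    download = Encoding.Default.GetString(myDataBuffer);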
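Decoding with `Encoding.Default` can garble pages whose encoding differs from the local code page. As a hedged sketch of one alternative, the helper below (the names `PageDecoder` and `Decode` are invented here) reads the `charset` parameter from the response's `Content-Type` header and falls back to UTF-8 when the header is missing or names an encoding the runtime does not know. The UTF-8 fallback is an assumption, not something the article specifies.

```csharp
using System;
using System.Text;

public static class PageDecoder
{
    // Decodes downloaded bytes using the charset named in a Content-Type
    // header value such as "text/html; charset=ISO-8859-1".
    public static string Decode(byte[] data, string contentType)
    {
        Encoding encoding = Encoding.UTF8; // assumed fallback
        if (!string.IsNullOrEmpty(contentType))
        {
            foreach (string part in contentType.Split(';'))
            {
                string p = part.Trim();
                if (p.StartsWith("charset=", StringComparison.OrdinalIgnoreCase))
                {
                    string name = p.Substring("charset=".Length).Trim('"', ' ');
                    try
                    {
                        encoding = Encoding.GetEncoding(name);
                    }
                    catch (ArgumentException)
                    {
                        // Unknown charset name: keep the fallback.
                    }
                    catch (NotSupportedException)
                    {
                        // Encoding not available on this platform: keep the fallback.
                    }
                }
            }
        }
        return encoding.GetString(data);
    }
}
```

With the client from step 1 this could be called as `PageDecoder.Decode(myDataBuffer, myWebClient.ResponseHeaders["Content-Type"])`.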

ASP.NET tips: data collection program, step 3: matching the data

I read the downloaded stream into a string and then use IndexOf to find the positions of two key fields. I know this is crude, but I find regular expressions hard to use (advice is welcome). After extracting the matched string, I use the following function to strip the HTML markup:

 
 
    // Requires: using System.Text.RegularExpressions;
    private string StripHTML(string strHtml)
    {
        // Patterns to remove or decode; aryReg[i] is replaced by aryRep[i].
        string[] aryReg = {
            @"<script[^>]*?>.*?</script>",
            @"<(\/\s*)?!?((\w+:)?\w+)(\w+(\s*=?\s*(([""'])(\\[""'tbnr]|[^\7])*?\7|\w+)|.{0})|\s)*?(\/\s*)?>",
            @"([\r\n])[\s]+",
            @"&(quot|#34);",
            @"&(amp|#38);",
            @"&(lt|#60);",
            @"&(gt|#62);",
            @"&(nbsp|#160);",
            @"&(iexcl|#161);",
            @"&(cent|#162);",
            @"&(pound|#163);",
            @"&(copy|#169);",
            @"&#(\d+);",
            @"-->",
            @"<!--.*\n"
        };

        string[] aryRep = {
            "",
            "",
            "",
            "\"",
            "&",
            "<",
            ">",
            " ",
            "\xa1", // chr(161)
            "\xa2", // chr(162)
            "\xa3", // chr(163)
            "\xa9", // chr(169)
            "",
            "\r\n",
            ""
        };

        string strOutput = strHtml;
        for (int i = 0; i < aryReg.Length; i++)
        {
            Regex regex = new Regex(aryReg[i], RegexOptions.IgnoreCase);
            strOutput = regex.Replace(strOutput, aryRep[i]);
        }

        // Strings are immutable, so the result of Replace must be assigned back.
        strOutput = strOutput.Replace("<", "");
        strOutput = strOutput.Replace(">", "");
        strOutput = strOutput.Replace("\r\n", "");

        return strOutput;
    }
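The IndexOf matching described in step 3 is not shown in the article, so here is a minimal sketch of what it might look like, together with a regular-expression version for comparison. The class and method names (`TextBetween`, `ByIndexOf`, `ByRegex`) and the marker strings in the usage note are assumptions for illustration. Both methods return the text between the first occurrence of a start marker and the next end marker, or `null` when either marker is missing.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TextBetween
{
    // IndexOf approach: locate both markers and cut out the text between them.
    public static string ByIndexOf(string html, string startMarker, string endMarker)
    {
        int start = html.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0) return null;
        start += startMarker.Length; // skip past the start marker itself
        int end = html.IndexOf(endMarker, start, StringComparison.Ordinal);
        if (end < 0) return null;
        return html.Substring(start, end - start);
    }

    // Regex approach: escape the markers so they match literally, and use a
    // lazy group so the match stops at the first end marker.
    public static string ByRegex(string html, string startMarker, string endMarker)
    {
        string pattern = Regex.Escape(startMarker) + "(.*?)" + Regex.Escape(endMarker);
        // Singleline lets "." match newlines, so the field may span several lines.
        Match m = Regex.Match(html, pattern, RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

For example, `TextBetween.ByIndexOf(download, "<td class=\"price\">", "</td>")` would pull out one table cell, whose contents could then be passed to StripHTML. The regex version does the same job in one call, while the IndexOf version has the advantage of being easy to step through in a debugger.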

After that, the data is stored in the database, which needs no explanation. However, when I wrote the data, an exception occurred saying that the field value was too long to be written to the database. I am using Access; I will try SQL Server instead.
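One possible cause of that exception is a target column of the Access Text type, which holds at most 255 characters; the article does not say which column type was used, so this is only a guess. Changing the column to Memo is the usual fix. Where the column cannot be changed, a small guard such as the following sketch (the names `FieldGuard`, `Fits`, and `Truncate` are invented here) can check or shorten the value before the insert.

```csharp
using System;

public static class FieldGuard
{
    // Maximum length of an Access Text field.
    public const int AccessTextLimit = 255;

    // True when the value can be stored without exceeding the column size.
    public static bool Fits(string value, int maxLength = AccessTextLimit)
    {
        return value == null || value.Length <= maxLength;
    }

    // Cuts the value down to the column size; shorter values pass through unchanged.
    public static string Truncate(string value, int maxLength = AccessTextLimit)
    {
        if (maxLength < 0) throw new ArgumentOutOfRangeException(nameof(maxLength));
        if (string.IsNullOrEmpty(value) || value.Length <= maxLength) return value;
        return value.Substring(0, maxLength);
    }
}
```

Truncation loses data, so it only makes sense for fields where a shortened value is acceptable; for full page text, a Memo column or a different database is the better choice.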

That concludes this introduction to the ASP.NET data collection program. I hope it helps you write your own data collection programs with ASP.NET.
