Crawler scraping: a solution for garbled POST data

Source: Internet
Author: User
I have recently been writing a crawler for the site http://zx.bjmemc.com.cn/. Possibly to prevent scraping, the site appears to encrypt the data it sends. Chrome's built-in network capture tool could not capture the data, so I used Fiddler instead.
Fiddler's capture looks like this:

As you can see, apart from the header information, the data that follows is displayed as garbled text. In this form, a program cannot reproduce the request to simulate the browser sending the data.
One solution is to obtain the hexadecimal encoding of the data. Switch Fiddler to HexView, as shown in the following figure:

The blue part is the header information and the black text is the transmitted data. You can also right-click and uncheck the show-headers option so that only the data is displayed.
In our program, we can convert the long string of hexadecimal characters into a byte array and send it to the web server.
The conversion program is as follows:
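To illustrate why the hex view is useful, here is a minimal sketch (not from the original post) showing that arbitrary bytes do not survive being treated as text, while their hexadecimal form preserves them exactly. The class name `HexDemo` and the sample bytes are hypothetical and are not data from the target site.

```csharp
using System;
using System.Text;

public static class HexDemo
{
    // Formats bytes the way a hex viewer shows them: two hex digits per byte.
    public static string ToHex(byte[] data)
    {
        var sb = new StringBuilder(data.Length * 2);
        foreach (byte b in data)
        {
            sb.Append(b.ToString("X2"));
        }
        return sb.ToString();
    }

    // Returns true if the bytes are unchanged after a round trip
    // through a UTF-8 string (bytes -> text -> bytes).
    public static bool SurvivesTextRoundTrip(byte[] data)
    {
        string text = Encoding.UTF8.GetString(data);
        byte[] back = Encoding.UTF8.GetBytes(text);
        return ToHex(back) == ToHex(data);
    }
}
```

For made-up bytes such as `{ 0x00, 0x9C, 0xFF, 0x41 }`, `HexDemo.SurvivesTextRoundTrip` returns false because 0x9C and 0xFF are not valid UTF-8, whereas `HexDemo.ToHex` still represents every byte. This is why copying the garbled text is unreliable and copying the hex is not.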

public static byte[] GetByteArray(string frame)
{
    // Every two hexadecimal characters encode one byte.
    byte[] buffer = new byte[frame.Length / 2];
    for (int i = 0; i < frame.Length / 2; i++)
    {
        int t = GetData(frame[2 * i]) * 16 + GetData(frame[2 * i + 1]);
        buffer[i] = (byte)t;
    }
    return buffer;
}

// Returns the numeric value (0-15) of one hexadecimal digit.
static int GetData(char p)
{
    if (p >= '0' && p <= '9')
    {
        return p - '0';
    }
    // Lowercase first so that both 'a'-'f' and 'A'-'F' are accepted.
    return char.ToLowerInvariant(p) - 'a' + 10;
}
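Once the hex string has been converted to bytes, the remaining step is to send them as the body of a POST request. The following is a minimal sketch of one way to do that with `HttpClient`, not the author's own code. The class name `RawPoster`, the endpoint URL, and the content type are assumptions; the real values should be copied from the request headers shown in Fiddler.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class RawPoster
{
    // Builds a POST request whose body is exactly the given bytes.
    // url and contentType are placeholders: take the real values
    // from the headers Fiddler captured.
    public static HttpRequestMessage BuildRequest(string url, byte[] body, string contentType)
    {
        var content = new ByteArrayContent(body);
        content.Headers.ContentType = MediaTypeHeaderValue.Parse(contentType);
        return new HttpRequestMessage(HttpMethod.Post, url) { Content = content };
    }

    // Sends the request and returns the raw response bytes,
    // throwing if the server answers with a non-success status code.
    public static async Task<byte[]> SendAsync(HttpClient client, HttpRequestMessage request)
    {
        using (HttpResponseMessage response = await client.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsByteArrayAsync();
        }
    }
}
```

The response is read as bytes rather than as a string because, if the request body is encoded, the response may be encoded in the same way.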
One problem remains: how to copy the hexadecimal string out of Fiddler. Copying it by hand is unrealistic, because the string is too long and it is easy to make mistakes. Notepad++ or UltraEdit can do the conversion; I tested it successfully with Notepad++:
1. In Fiddler, select the long run of hex characters you want to export, right-click, and choose Save selected bytes to save them to a file.

2. Open this file with Notepad++; it still shows as garbled text.
3. Download the hex display plugin: hexeditor_0_9_5_uni_dll.zip
4. Unzip it, place the HexEditor.dll file in the plugins directory under the Notepad++ installation directory, and restart Notepad++.
5. Open the file in Notepad++ again and choose Plugins -> HEX-Editor -> View in HEX. The result is shown below:

6. Select the hexadecimal string you want, copy it, and paste it into a new text file. The pasted string contains spaces, so remove the spaces and line breaks.
7. To remove the spaces quickly, select a space, press Ctrl+F, switch to the Replace tab, and click Replace All with the replacement left empty. Every space is replaced by an empty string, which deletes them all.
When reprinting, please credit: Kangrui Tribe » Crawler Crawl data, post data garbled solution
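The clean-up in the last two steps can also be done in code instead of in the editor. The sketch below, which is an alternative and not part of the original post, strips all whitespace from the pasted text and checks that what remains is a well-formed hex string before it is converted to bytes. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class HexCleaner
{
    // Removes every whitespace character (spaces, tabs, line breaks)
    // from text pasted out of the hex view.
    public static string StripWhitespace(string pasted)
    {
        var sb = new StringBuilder(pasted.Length);
        foreach (char c in pasted)
        {
            if (!char.IsWhiteSpace(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    // Returns true if s has an even length and contains only hex digits,
    // i.e. it can be split into whole bytes of two digits each.
    public static bool IsCleanHex(string s)
    {
        if (s.Length % 2 != 0)
        {
            return false;
        }
        foreach (char c in s)
        {
            if (!Uri.IsHexDigit(c))
            {
                return false;
            }
        }
        return true;
    }
}
```

Running the validity check first catches a stray character or a half-copied byte early, before the hex-to-byte conversion silently produces wrong data.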
