Happy 1024! Find: a friend finder built by scraping Century Jiayuan
October 24 (1024) is Programmers' Day ~ Happy holiday!
Don't work overtime tonight. I will give it to you later!
Don't shortchange yourself. Go home and have a good dinner tonight.
Body
I have always been interested in crawlers and data, and I have crawled a lot of data over the years. Naturally, C# is what I use ~~
Today I'm releasing a very early scraper ("thief") program of mine. The data comes from Century Jiayuan.
Demo: find.izk.cloud
No picture, no truth
Description
I found the search interface address directly on a Century Jiayuan page, constructed the relevant parameters, sent a POST request, and the data came back... that's it!!!
An API like this is publicly reachable, and there are no restrictions on it yet. It really could not be simpler!
Interface address: http://search.jiayuan.com/v2/search_v2.php
The one thing to watch is how the parameters are constructed. Here is a code snippet:
string postdata = string.Format("sex={4}&key=&stc=1:{0},2:{1}.{2},23:1&sn=default&sv=1&p={3}&f=select&listStyle=bigPhoto&pri_uid=0&jsversion=v5", area, ages, agee, pageindex, sex);
The arguments are the region (area), the age range (ages to agee), the page number (pageindex), and the gender (sex).
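To make the parameter layout easier to see, here is a minimal sketch that wraps the same format string in a helper. The class name SearchParams, the method name BuildPostData, the parameter types, and the sample values ("31" for the region, "f" for the gender) are assumptions made for illustration; the post does not say which values the site actually accepts.

```csharp
using System;

public static class SearchParams
{
    // Builds the form body in the same shape as the snippet above.
    // Parameter names and types are assumptions, not taken from the project.
    public static string BuildPostData(string area, int ageStart, int ageEnd, int pageIndex, string sex)
    {
        return string.Format(
            "sex={4}&key=&stc=1:{0},2:{1}.{2},23:1&sn=default&sv=1&p={3}" +
            "&f=select&listStyle=bigPhoto&pri_uid=0&jsversion=v5",
            area, ageStart, ageEnd, pageIndex, sex);
    }

    public static void Main()
    {
        // Hypothetical sample values: region code "31", ages 20 to 28, page 1.
        Console.WriteLine(BuildPostData("31", 20, 28, 1, "f"));
        // prints: sex=f&key=&stc=1:31,2:20.28,23:1&sn=default&sv=1&p=1&f=select&listStyle=bigPhoto&pri_uid=0&jsversion=v5
    }
}
```

Keeping the format string in one place means paging only changes the pageIndex argument while the rest of the query stays fixed.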
Because this was an early project, the HTTP requests initially went through an HttpHelper class written by someone else.
HttpHelper http = new HttpHelper();
HttpItem item = new HttpItem()
{
    URL = "http://search.jiayuan.com/v2/search_v2.php", // URL (required)
    Method = "Post", // Request method (optional; defaults to Get)
    Timeout = 100000, // Connection timeout in ms (optional; defaults to 100000)
    ReadWriteTimeout = 30000, // Timeout for writing POST data in ms (optional; defaults to 30000)
    IsToLower = false, // Whether to convert the returned HTML to lowercase (optional)
    Cookie = "",
    UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36", // Browser type, version, and operating system (optional; has a default)
    Accept = "text/html, application/xhtml+xml, */*", // Optional; has a default
    ContentType = "application/x-www-form-urlencoded; charset=UTF-8",
    Postdata = postdata,
};
HttpResult result = http.GetHtml(item);
string html = result.Html;
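The HttpHelper used above is a third-party class, so for readers who do not have it, here is a rough equivalent using the standard System.Net.Http.HttpClient. This is a sketch under assumptions, not the project's own code: the class name SearchClient and its methods are made up for illustration, the sample form body uses hypothetical values, and nothing here guarantees that the endpoint still answers requests like this.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class SearchClient
{
    private const string SearchUrl = "http://search.jiayuan.com/v2/search_v2.php";

    // Builds the POST request without sending it, so it can be inspected first.
    public static HttpRequestMessage BuildRequest(string postData)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, SearchUrl)
        {
            // Send the body as-is; StringContent does not re-encode ':' or ','.
            Content = new StringContent(postData, Encoding.UTF8, "application/x-www-form-urlencoded")
        };
        request.Headers.TryAddWithoutValidation("User-Agent",
            "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36");
        request.Headers.TryAddWithoutValidation("Accept", "text/html, application/xhtml+xml, */*");
        return request;
    }

    // Sends the request and returns the response body as a string.
    public static async Task<string> FetchAsync(HttpClient client, string postData)
    {
        using (var request = BuildRequest(postData))
        using (var response = await client.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }

    public static async Task Main()
    {
        // 100 seconds mirrors the 100000 ms Timeout in the HttpItem above.
        using (var client = new HttpClient { Timeout = TimeSpan.FromSeconds(100) })
        {
            // Hypothetical sample values for region, ages, page, and gender.
            string postData = "sex=f&key=&stc=1:31,2:20.28,23:1&sn=default&sv=1&p=1"
                + "&f=select&listStyle=bigPhoto&pri_uid=0&jsversion=v5";
            string html = await FetchAsync(client, postData);
            Console.WriteLine(html.Length);
        }
    }
}
```

Splitting BuildRequest from FetchAsync keeps the request construction checkable without touching the network.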
Of course, after crawling this much data, I now have an HttpHelper of my own as well ~~ I'll share it with you later.
Currently, the project is hosted on GitHub and can be used by anyone who needs it ~
Code address