Preface:
Before writing this article, I would first like to thank the seniors who gave me ideas and helped me.
The following four blog posts are the seniors' original work. They contain a lot of useful material that is well worth studying, and I hope they help everyone!
1. A hands-on guide to writing an automatic ACM problem-solving tool in C++ (rushing onto the HDU home page)
2. [C#] Counterattack: a home-made AC automaton that solves a thousand problems a day and conquers HDU OJ
3. Using POST in C# to build an automatic AC robot for the Hangzhou Dianzi University (HDU) OJ, with an AC rate of up to 50%~~
4. Continuing the Node crawler: a home-made automatic AC robot in about a hundred lines of code that solves a thousand problems a day and captures HDOJ
At first I tried their programs with a let's-give-it-a-go attitude, um ~ O(* ̄▽ ̄*)O.
1. The senior who wrote the AC machine in C++ compiled his program with VS2013. Forgive me for not having studied C++ well: I followed his tutorial several times without getting it to run, and in the end I had to give up. I have kept the source code, though.
2. I saw this program when I was just getting started with C# and forked it right away. I may not be able to read all of it, but it is bound to be useful! It simulates what a user does: simulated copying and simulated mouse clicks on Submit. The drawback is that it stops working if the program runs in the background ~ (╯-╰) ... That is not what you want ...
3. This blogger also wrote in C#. A 50% AC rate looks good "*/¥#", but the code is not open source, and it seems their own AC rate on the YTU OJ is only 56%.
4. This one uses Node to simulate what a user does, which is really a simulated login plus a simulated submission. From experience, the simulated submission POST will definitely carry a cookie. Where does the submitted code come from? It is simply crawled from a search engine. The whole idea is clear: simulated login (POST), crawling code from the search engine (GET), simulated submission (POST). Doesn't that look good? It does, but I set up the Node environment by following tutorials found through Baidu and still could not run it; I guess some part of the environment is missing. An AC machine written in just over 100 lines of code is not to be underestimated, but I cannot run it ... "laughing and crying" "laughing and crying"
Body
As for the programs found online, they are, after all, no match for one you write yourself. So, first, the result!
The one in 12th place is me!
Please ignore my accuracy; compared with what the seniors did, it is still a lot worse ... First, though, I owe an apology to the ACMers who work hard at solving problems: this is, after all, a very opportunistic approach, meant only for entertainment, so please forgive me!
Let's start a step-by-step Auto AC tour!
1. Use socket programming to simulate HTTP GET requests and send page requests to the server
// Assemble the raw HTTP request text: request line, headers, blank line, then the body
string reqinfo = "POST " + (string)othpath + " HTTP/1.1\r\nHost: " + (string)host
               + elseinfo + typee + conlen + (string)s
               + "\r\nCookie: " + cookie + "\r\nConnection: close\r\n\r\n" + rescode;
// Send it over the already connected socket (Winsock)
if (SOCKET_ERROR == send(sock, reqinfo.c_str(), reqinfo.size(), 0))
{
    cout << "send error! Error code: " << WSAGetLastError() << endl;
    closesocket(sock);
}
With socket programming we establish a connection to the server through functions such as bind(), connect(), send(), and recv(). Next, think about what happens when we click a button: what information do we need to pass to the server with send()? This is where the HTTP GET request comes in.
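To make that question concrete, here is a minimal sketch of the text a GET request carries. It only builds the request string, which could then be handed to send(). The helper name buildGetRequest and the host and path in the usage note are placeholders made up for illustration; they are not taken from the original program.

```cpp
#include <string>

// Hypothetical helper (not from the original program): assemble a minimal
// HTTP/1.1 GET request as plain text.
std::string buildGetRequest(const std::string& host, const std::string& path)
{
    std::string req;
    req += "GET " + path + " HTTP/1.1\r\n";  // request line: method, path, version
    req += "Host: " + host + "\r\n";         // Host header is mandatory in HTTP/1.1
    req += "Connection: close\r\n";          // ask the server to close after replying
    req += "\r\n";                           // blank line ends the header section
    return req;
}
```

For example, buildGetRequest("example.com", "/index.html") gives a request whose first line is "GET /index.html HTTP/1.1". The trailing blank line matters: it tells the server that the headers are complete.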
2. Get CSDN blog links with a search engine
First, a few pictures ~
Looking at the results from these different search engines, do you notice anything?
Yes, look at the lower left corner! Baidu and some of the other search engines encrypt their result links, and from the page source it is hard to extract the original address of the target page from such a link. So in the end I settled on the Bing search engine (Youdao search also works). The drawback is that these less familiar search engines do not index many of the pages we want. But it is enough ~
Does anyone want to ask why I don't use CSDN's built-in search? I can only say that the CSDN search engine is not as handy as the find function I wrote myself, and sometimes the site search fails to find an article that exists!
// Requires <regex>, <string>, <vector>; blogurl is a global list of strings
void Getcsdnurl(string &allhtml) // Extract the CSDN blog URLs from the web page
{
    blogurl.clear();
    smatch mat;
    // Capture every href that points at blog.csdn...
    regex pattern("href=\"(http://blog.csdn[^\\s\"]+)\"");
    string::const_iterator start = allhtml.begin();
    string::const_iterator end = allhtml.end();
    while (regex_search(start, end, mat, pattern))
    {
        string msg(mat[1].first, mat[1].second);
        blogurl.push_back(msg);
        start = mat[0].second; // continue searching after this match
    }
}
I used regular expressions before, when writing the PHP backend for a public platform account <public platform: imqxms>, and it turns out C++ has them too. So I simply took the seniors' source code for my own use! (I mention the account name so you can follow it: IMQXMS)
3. Extract the program code from the HTML source
This image shows part of the program code inside the web page's source!
The obvious features are <pre></pre> and our familiar #include <>. As we can see, characters such as '<' and '>' have been replaced with other character sequences; these are presumably the escape sequences of the web markup language!
void GetCode(string &allhtml) // Extract the code section
{
    codehtml = "";
    string::size_type pos = allhtml.find("#include");
    if (pos != string::npos)
    {
        for (string::size_type i = pos; i < allhtml.length(); i++)
        {
            // Stop at the closing tag that ends the code area;
            // compare() avoids reading past the end of the string
            if (allhtml.compare(i, 6, "</text") == 0) return;
            else if (allhtml.compare(i, 6, "</pre>") == 0) return;
            codehtml += allhtml[i];
        }
    }
    else
    {
        cout << "Not found the right code!" << endl;
        return;
    }
}
What we need is the original, complete code, so the next task is to convert all the HTML escape sequences in the program back into the corresponding characters of C++ or whatever language it is!
Also, let's talk about some encodings that matter for web forms.
One is URL encoding. If the data to be transferred contains certain symbols, such as &, /, and =, they may conflict with the delimiters used in the address, so these symbols need to be escaped.
URL encoding is what performs this escaping for the data sent with a web request.
The other issue is the page encoding, of which there are two kinds here: GB2312 and UTF-8.
GB2312 is a Chinese encoding, while UTF-8 is a more general encoding that covers many languages. The two encode ASCII letters identically.
For Chinese characters, however, GB2312 uses 2 bytes and UTF-8 uses 3 bytes, so the same content produces different URL-encoded results under the two encodings.
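Both points can be illustrated with a small percent-encoding sketch. The function name urlEncode is made up for this example, and the set of characters left unescaped follows the "unreserved" set of RFC 3986, which is an assumption on my part; the original program may have escaped a different set.

```cpp
#include <string>

// True for characters that RFC 3986 calls "unreserved" and that can be sent as-is.
static bool isUnreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || c == '-' || c == '_' || c == '.' || c == '~';
}

// Illustrative helper: percent-encode every byte that is not unreserved.
// It works on raw bytes, so the result depends on the encoding of the input.
std::string urlEncode(const std::string& bytes)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : bytes) {
        if (isUnreserved(c)) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];    // high nibble
            out += hex[c & 0x0F];  // low nibble
        }
    }
    return out;
}
```

With this sketch, "a&b=c/d" becomes "a%26b%3Dc%2Fd". The character 中 is the bytes E4 B8 AD in UTF-8 and D6 D0 in GB2312, so it encodes to "%E4%B8%AD" in one case and "%D6%D0" in the other, which is exactly why the page encoding has to match what the server expects.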
So we also need functions that convert between the two encodings!
// Requires <windows.h> and <cstring>. The caller must delete[] the returned buffer.
// CP_ACP is the system ANSI code page, which is GB2312/GBK only on a Chinese Windows system.
char* u2g(const char* utf8) // UTF-8 to GB2312
{
    int len = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
    wchar_t* wstr = new wchar_t[len + 1];
    memset(wstr, 0, (len + 1) * sizeof(wchar_t)); // size in bytes, not in characters
    MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wstr, len);
    len = WideCharToMultiByte(CP_ACP, 0, wstr, -1, NULL, 0, NULL, NULL);
    char* str = new char[len + 1];
    memset(str, 0, len + 1);
    WideCharToMultiByte(CP_ACP, 0, wstr, -1, str, len, NULL, NULL);
    if (wstr) delete[] wstr;
    return str;
}

char* g2u(const char* gb2312) // GB2312 to UTF-8
{
    int len = MultiByteToWideChar(CP_ACP, 0, gb2312, -1, NULL, 0);
    wchar_t* wstr = new wchar_t[len + 1];
    memset(wstr, 0, (len + 1) * sizeof(wchar_t));
    MultiByteToWideChar(CP_ACP, 0, gb2312, -1, wstr, len);
    len = WideCharToMultiByte(CP_UTF8, 0, wstr, -1, NULL, 0, NULL, NULL);
    char* str = new char[len + 1];
    memset(str, 0, len + 1);
    WideCharToMultiByte(CP_UTF8, 0, wstr, -1, str, len, NULL, NULL);
    if (wstr) delete[] wstr;
    return str;
}
For this part, searching Baidu (Bing) turned out to be the easy way.
string Htmltoc(string &codehtml) // Convert HTML escape sequences back into plain characters
{
    string ans;
    for (int i = 0; i < (int)codehtml.length(); i++)
    {
        // compare() checks the whole sequence without reading past the end of the string
        if (codehtml.compare(i, 4, "&lt;") == 0) { ans += '<'; i += 3; }          // &lt; becomes <
        else if (codehtml.compare(i, 4, "&gt;") == 0) { ans += '>'; i += 3; }     // &gt; becomes >
        else if (codehtml.compare(i, 2, "\\n") == 0) { ans += "\\n"; i += 1; }    // a backslash followed by n stays as those two characters
        else if (codehtml.compare(i, 5, "&amp;") == 0) { ans += '&'; i += 4; }    // &amp; becomes &
        else if (codehtml.compare(i, 6, "&quot;") == 0) { ans += '\"'; i += 5; }  // &quot; becomes "
        else if (codehtml.compare(i, 6, "&nbsp;") == 0) { ans += ' '; i += 5; }   // &nbsp; becomes a space
        else if (codehtml.compare(i, 5, "&#43;") == 0) { ans += '+'; i += 4; }    // &#43; becomes +
        else if (codehtml.compare(i, 5, "&#39;") == 0) { ans += '\''; i += 4; }   // &#39; becomes '
        else ans += codehtml[i];
    }
    return ans;
}
I wonder why some of the seniors' code does not have this part ... [pout]. There is actually a question here: the code above does the character conversion, but which is better, this approach or KMP pattern matching?
I'm not going to tell you that there was a question in last night's C language exam (replace every "don't" in a sentence with "do not") and that I used the method above!
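As a hedged aside on that question: for short, fixed patterns like these, a plain search-and-replace loop is usually enough in practice, while KMP mainly buys a guaranteed linear worst case. Below is a minimal sketch built on std::string::find; the names replaceAll and decodeEntities are made up for this example and are not the author's code.

```cpp
#include <string>
#include <utility>
#include <vector>

// Illustrative helper: replace every occurrence of `from` in `text` with `to`.
std::string replaceAll(std::string text, const std::string& from, const std::string& to)
{
    if (from.empty()) return text;  // nothing sensible to search for
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();           // skip past the inserted text
    }
    return text;
}

// Illustrative helper: table-driven variant of the entity conversion.
// "&amp;" is handled last so that "&amp;lt;" decodes to "&lt;" and not to "<".
std::string decodeEntities(std::string html)
{
    static const std::vector<std::pair<std::string, std::string>> table = {
        {"&lt;", "<"}, {"&gt;", ">"}, {"&quot;", "\""}, {"&nbsp;", " "},
        {"&#43;", "+"}, {"&#39;", "'"}, {"&amp;", "&"}
    };
    for (const auto& entry : table)
        html = replaceAll(html, entry.first, entry.second);
    return html;
}
```

The exam question is then a one-liner, replaceAll(sentence, "don't", "do not"), and adding another entity only means adding a row to the table.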
--------------------------- looks like I told you anyway, (╯-╰) ...
4. Use the cookies from the web page to simulate being online
Every account login has a unique cookie. A program that simulates mouse clicks essentially works through a browser, but my console program does not work that way, so the AC machine only runs while your browser is open and your account is logged in.
// Cookie values copied from the logged-in browser session
string cookie = "exesubmitlang=2; PHPSESSID=" + phpsessid + "; CNZZDATA1254072405=" + cnzzdata;
string reqinfo = "POST " + (string)othpath + " HTTP/1.1\r\nHost: " + (string)host
               + elseinfo + typee + conlen + (string)s
               + "\r\nCookie: " + cookie + "\r\nConnection: close\r\n\r\n" + rescode;
Speaking of cookies: if you want to see the cookie information in your browser, take Google Chrome as an example. Right-click the page and choose Inspect; there are surprises inside!
The account I use for solving problems is my alternate account.
5. Since we can extract a program's source code from a web page, we can also find the final result of our submission on the HDU OJ status page!
// Requires <cstdlib> (_itoa is Microsoft-specific) and <cstring>.
// The fixed offsets 17, 9, and 52 depend on the exact layout of the status page.
void GetResult(string &allhtml, int problemid) // Parse the result, space, and time out of state.php
{
    stateans = "", statesapce = "", statetime = "";
    char d[200];
    _itoa(problemid, d, 10);
    strcat(d, "</a>");
    int pos = allhtml.find((string)d); // locate the row by its problem ID link
    int mpos = pos;
    int tpos;
    if (mpos == string::npos) return;
    else
    {
        mpos += 17;
        while (true)
        {
            if (allhtml[mpos] == '<') { tpos = mpos; break; }
            statesapce += allhtml[mpos];
            mpos++;
        }
        cout << "Space: " << statesapce << endl;
    }
    tpos += 9;
    while (true)
    {
        if (allhtml[tpos] == '<') break;
        statetime += allhtml[tpos];
        tpos++;
    }
    cout << "Time consuming: " << statetime << endl;
    if (pos == string::npos) return;
    else
    {
        pos = pos - 52; // step back to the judge status cell
        int begin;
        while (true)
        {
            if (allhtml[pos] == '>') { begin = pos; break; }
            pos--;
        }
        for (int i = begin + 1; allhtml[i] != '<'; i++) stateans += allhtml[i];
    }
    cout << "Results: ---------------:::::: " << stateans << endl;
}
if (stateans == "Accepted") break; // If AC, break out of this loop and move on to the next problem!
6. Don't forget to add a sleep call at the end of your loop!
Otherwise the entire status page will be filled with your submissions.
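A minimal sketch of such a throttled loop is shown here, using the portable std::this_thread::sleep_for rather than the Windows Sleep() the original program presumably called. The name submitAllThrottled, the callback submitOne, and the pause length are all assumptions made for illustration.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Illustrative helper: try each problem ID in [firstId, lastId], pausing after
// every attempt so the status page is not flooded with submissions.
// submitOne is a caller-supplied function that returns true when the problem is accepted.
int submitAllThrottled(int firstId, int lastId,
                       const std::function<bool(int)>& submitOne,
                       std::chrono::milliseconds pause)
{
    int accepted = 0;
    for (int id = firstId; id <= lastId; ++id) {
        if (submitOne(id)) ++accepted;
        std::this_thread::sleep_for(pause);  // throttle before the next attempt
    }
    return accepted;
}
```

A call such as submitAllThrottled(1000, 1010, mySubmit, std::chrono::milliseconds(5000)) would leave at least five seconds between attempts, where mySubmit stands for whatever function performs the search, submit, and status check for a single problem.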
I apparently did not add a sleep, and then someone I did not know added me as a friend "pout"; I have no idea how he found me! He told me I was all over the HDU status page, so I took the chance to ask him a few questions.
Some people may ask: HDU has problems numbered 1000 to 5674, so why did you only AC a little over 2000? For this I can only blame my unoptimized algorithm; perhaps looking for source code at the home of ACM as well would help!
7. The executable is on my GitHub. Everyone is welcome to fork it, but do not solve problems too fast! You would end up with more than me. "pray" "pray" "pray"
As for the source code, I will release it later! I feel thoroughly wronged, but I'm not saying anything!
Postscript
Every ACMer should have their own blog, so here is mine!
Desk Painting Youth (www.myth1314.com)
I look forward to your messages and comments!
Some time ago someone said that my blog does not allow comments; I do not know whether that is true ... o(* ̄▽ ̄*)ブ
HDU automatic problem-solving machine Auto AC (easily reach the HDU home page)