We open the Google Play home page and click the "Login" button in the top right corner, which takes us to the login page.
Whenever I want a crawler to log in to a site, I first enter an account and password and click login once by hand, to see what data is POSTed on login. The method I find most convenient, and use most often, is: Mozilla Firefox - Web Developer Tools - Network
Firefox browser - Web Developer Tools - Network
Now that we know Google Play submits 14 parameters, we need to find out where the values for these 14 parameters come from and assemble them into a POST request. Analysis shows that most of the parameters are actually available in the page itself!
See the figure:
Firefox - Web Developer Tools - Inspector (Figure 1)
Apart from the bgresponse value, all the other values can be found in the page source code. I won't go into detail here; I'll assume that readers who have worked on Google login already know this. To achieve the login, the key is obtaining the bgresponse value.
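As a rough illustration of pulling the remaining values out of the page source, here is a minimal Python sketch that collects the hidden form inputs from the login page HTML and assembles a URL-encoded POST body. It assumes the values live in hidden `<input>` fields; the helper names and the `Email` and `Passwd` field names are hypothetical placeholders, so check them against what the Network panel actually shows.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_login_body(page_html, email, password, bgresponse):
    """Merge the hidden fields from the page with the values we supply."""
    parser = HiddenInputParser()
    parser.feed(page_html)
    fields = dict(parser.fields)
    # "Email" and "Passwd" are assumed field names, not taken from the post.
    fields.update({
        "Email": email,
        "Passwd": password,
        "bgresponse": bgresponse,
    })
    return urlencode(fields)
```

The returned string can be sent as the body of the POST request once the bgresponse value has been obtained.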
So what is bgresponse? It exists specifically to check whether the client is a bot, using Google's BotGuard technology. Even if this value is not sent correctly, Google will still let your login succeed. Want to know why? Because Google will track this account and the session! See the explanation on StackOverflow for reference.
StackOverflow - BotGuard (Figure 2)
The next question is how to get this value. In the page source code we can see a piece of JS that is called when we click the login button, and the bgresponse value is generated in this piece of JS!
Following this JS, we find that bgresponse actually comes from an initialization value (which we can think of as a key) and a JS algorithm. The bgresponse value is produced by this algorithm and key!
The JS method triggered by the login button (Figure 3)
The following shows the encryption algorithm and the initialization value (the key). Only part of it is shown; the code is too long to post here.
Key (Figure 4)
Algorithm (Figure 5)
Having seen this, getting the response is easy. We can slightly modify the code from Figure 3:

    function getBgValue() {
        var bg = '';
        try {
            document.bg.invoke(function (response) {
                bg = response;
            });
        } catch (err) {
            bg = '';
        }
        return bg;
    }

    var bg = getBgValue(); // The value of bgresponse is obtained here.
    console.log(bg);
    phantom.exit();
Finally, one question remains: how do we invoke the JS code from Python to get the response value? Here I recommend PhantomJS!
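One way to do this from Python is to run PhantomJS as an external command and read whatever the script prints. The sketch below is only a minimal example: the helper names are hypothetical, and the `./phantomjs` path and `google.js` script name are assumed to match your local setup.

```python
import subprocess


def run_and_capture(command, timeout=60):
    """Run an external command and return its standard output, stripped."""
    result = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str
        timeout=timeout,      # do not hang forever if the script never exits
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


def get_bgresponse(phantomjs_path="./phantomjs", script_path="google.js"):
    """Assumes the script prints the bgresponse value with console.log."""
    return run_and_capture([phantomjs_path, script_path])
```

Passing the command as a list avoids shell quoting problems, and `check=True` makes a failed PhantomJS run raise an exception instead of silently returning an empty string.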
For example, in a terminal, running ./phantomjs google.js executes the JS. So calling PhantomJS from Python is simply a matter of calling an external command! Thank you for reading, and you are welcome to comment! Your comments and reading are my greatest motivation!
Source code: Click to download
If you are interested in learning about web crawlers, you are welcome to add QQ: 335418265 and find like-minded people to study and work hard with. That is also the purpose of writing this article!
Web crawler login to the Google Play store