Browser Client Smart Automation: How do I get a dynamically generated URL from the JavaScript runtime on the page?

Source: Internet
Author: User

Requirement

"Page Smart stitching" refers to the use of heuristic query Dom tree, the "next page" link, take out its href attribute. Chromium's official plugin Dom distiller done similar work, primarily to turn the multi-page click process into a single-page ajax continuous reading experience.

The problem is that some websites now, in order to prevent browser clients from doing this, set the href attribute to "#" (or javascript:void(0)) and bind a JS handler to the link's onclick event that generates the next-page URL dynamically.

In this case, how can the client automatically obtain the next-page URL?
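Before either method below applies, the client first has to recognise that a link's href is only a placeholder. A minimal sketch of that check follows; the name isPlaceholderHref and the exact set of placeholder forms are assumptions, not an exhaustive list.

```javascript
// Returns true when an href carries no real navigation target,
// e.g. "#", "javascript:;" or "javascript:void(0)".
function isPlaceholderHref(href) {
  if (href == null) return true;
  const value = String(href).trim().toLowerCase();
  if (value === '' || value === '#') return true;
  if (!value.startsWith('javascript:')) return false;

  // Strip whitespace so "void (0)" and "void(0)" are treated alike.
  const script = value.slice('javascript:'.length).replace(/\s+/g, '');
  return script === '' || script === ';' || /^void\(\d*\);?$/.test(script);
}

console.log(isPlaceholderHref('#'));                  // true
console.log(isPlaceholderHref('Javascript:void ()')); // true
console.log(isPlaceholderHref('/list?page=2'));       // false
```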

Method 1: Use JS to implement a JavaScript source code interpreter

If you can get the JS source code of the onclick handler, you can run it in a JavaScript source interpreter written in JS. Such interpreters have already been implemented. What is needed is a fake global window object, an execution context, proxied access to the DOM tree, and finally interception of the window.open call or the location assignment statement.
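The interception part can be sketched as follows. Note the substitution: this is not a source-level interpreter. It reuses the host JS engine and only swaps in a fake global scope through a Proxy and a with block, which is enough to show how window.open calls and location assignments can be captured. The name captureNavigation and the small set of fake window members are assumptions for this example, and the construct is not a security boundary for untrusted code.

```javascript
// Sketch only: runs handler source against a fake window and returns the
// first URL the handler tries to navigate to, or null if it never does.
function captureNavigation(handlerSource, currentUrl) {
  let captured = null;
  const record = (url) => {
    // Keep the first attempt only, resolved against the current page URL.
    if (captured === null) captured = new URL(String(url), currentUrl).href;
  };

  // Fake location: href assignment, assign() and replace() are all recorded.
  const fakeLocation = {
    get href() { return currentUrl; },
    set href(url) { record(url); },
    assign: record,
    replace: record,
    toString: () => currentUrl,
  };

  // Fake window with only the members a simple pager handler might touch.
  const fakeWindow = {
    open(url) { record(url); return null; },
    get location() { return fakeLocation; },
    set location(url) { record(url); },
    encodeURIComponent, parseInt, Math, JSON,
  };
  fakeWindow.window = fakeWindow;
  fakeWindow.self = fakeWindow;

  // Claim every identifier so that lookups never reach the real globals.
  const scope = new Proxy(fakeWindow, { has: () => true });

  // new Function bodies are non-strict, so the with statement is allowed.
  const run = new Function('scope', 'with (scope) {\n' + handlerSource + '\n}');
  run.call(fakeWindow, scope);
  return captured;
}

const source = "var p = 3; window.location.href = '/list?page=' + p; return false;";
console.log(captureNavigation(source, 'https://example.com/list?page=2'));
```

A handler that reads the DOM would additionally need a fake document object, which this sketch leaves out.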

The key problem is that you cannot get the onclick handler's JS source code directly; you only get a JS function object. However, the function object's toString method (Function.prototype.toString) seems to be able to return the source code. (This still needs to be verified.)
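A quick way to check this is sketched below. For functions written in script, Function.prototype.toString returns the source text; for native and bound functions it returns only a "[native code]" placeholder, so the source cannot always be recovered. The helper name readHandlerSource and the sample handler are assumptions for this example.

```javascript
// Returns the handler's source text, or null when the engine exposes
// only a "[native code]" placeholder (native or bound functions).
function readHandlerSource(fn) {
  const text = Function.prototype.toString.call(fn);
  if (/\{\s*\[native code\]\s*\}\s*$/.test(text)) return null;
  return text;
}

// A stand-in for a handler taken from element.onclick.
const sampleHandler = function (event) {
  window.location.href = '/list?page=' + 3;
};

console.log(readHandlerSource(sampleHandler));            // the function's source text
console.log(readHandlerSource(sampleHandler.bind(null))); // null
```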

Method 2: Register an "expected request" data structure and have the network module intercept the request at the lower layer

This method requires modifying the browser kernel, but it may be easier to implement:

    1. When a link element with href="#" is parsed, send it a synthetic click event (one that does not come from real user interaction).
    2. At the same time, send an "expect" data structure over IPC to the network-layer net module: "Please capture the next main-document network request whose referer is the current URL, and send me that request's URL."
       False matches seem possible here; information theory / coding theory may offer relevant ideas.
    3. The link element's click handler runs normally and triggers the request for the new URL; because the expectation was registered beforehand, this request is captured.
    4. The browser UI main thread receives the new URL and carries out the next step of processing.
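The matching logic in steps 2 and 3 can be modelled outside the kernel as a small registry, as sketched below. This is a plain JS model, not Chromium code: the class ExpectationRegistry, its methods, and the request shape { url, referrer, isMainDocument } are all hypothetical. The one-shot matching and the expiry time are assumptions added here to reduce the chance of a false match.

```javascript
class ExpectationRegistry {
  constructor() {
    this.pending = [];
  }

  // Step 2: the UI side registers what it expects before the synthetic click.
  expect({ referrer, expiresAt, onCapture }) {
    this.pending.push({ referrer, expiresAt, onCapture });
  }

  // Step 3: the network layer calls this for every outgoing request.
  // Returns true when the request was captured and should not be sent.
  onRequest(request, now) {
    // Drop stale expectations so a later, unrelated navigation cannot match.
    this.pending = this.pending.filter((entry) => entry.expiresAt > now);

    const index = this.pending.findIndex(
      (entry) => request.isMainDocument && request.referrer === entry.referrer
    );
    if (index === -1) return false;

    // One-shot: remove the expectation, then hand the URL back (step 4).
    const [entry] = this.pending.splice(index, 1);
    entry.onCapture(request.url);
    return true;
  }
}

const registry = new ExpectationRegistry();
const page = 'https://example.com/list?page=1';
let nextUrl = null;
registry.expect({ referrer: page, expiresAt: 1000, onCapture: (url) => { nextUrl = url; } });

// A subresource request from the same page is ignored.
registry.onRequest({ url: 'https://example.com/ad.js', referrer: page, isMainDocument: false }, 10);
// The main-document request triggered by the click handler is captured.
registry.onRequest({ url: 'https://example.com/list?page=2', referrer: page, isMainDocument: true }, 20);
console.log(nextUrl);
```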

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
