Is there any way to capture data asynchronously loaded through ajax on a webpage?

Source: Internet
Author: User
I recently needed to capture some data from a website. When I inspected the site, I found that the data I want is loaded asynchronously through AJAX. Is there any way to capture it? I plan to use Node.js or PHP.

Reply content:

Open the browser's developer tools and look at the details of the AJAX request to see whether any verification is required (some websites are very lax, and their endpoints can be requested at will). If there is no authentication mechanism, you don't have to worry: the crawler can simply send the same request itself.
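As a rough illustration of requesting the endpoint directly, here is a minimal Node.js sketch. It assumes the endpoint returns JSON and needs no authentication. The function names, the example URL, and the query parameters are all hypothetical; the real values have to be copied from the request shown in the developer tools. It relies on the global fetch that ships with Node.js 18 and later.

```javascript
// Sketch: call the AJAX endpoint directly instead of rendering the page.
// The endpoint, parameters, and headers are assumptions for illustration;
// copy the real ones from the Network tab of the developer tools.

// Build the URL and request options for a GET request with query parameters.
function buildAjaxRequest(endpoint, params) {
  const url = new URL(endpoint);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return {
    url: url.toString(),
    options: {
      method: 'GET',
      headers: {
        // jQuery-style AJAX calls commonly send this header; some servers check it.
        'X-Requested-With': 'XMLHttpRequest',
        'Accept': 'application/json',
      },
    },
  };
}

// Send the request and parse the JSON body.
// fetchImpl defaults to the global fetch (Node.js 18+) and can be swapped out.
async function fetchAjaxData(endpoint, params, fetchImpl = globalThis.fetch) {
  const { url, options } = buildAjaxRequest(endpoint, params);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

A call might look like fetchAjaxData('https://example.com/api/items', { page: 1 }).then(console.log), where the URL and the page parameter stand in for whatever the real request uses.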

Search for "headless browser" or "front-end testing framework"; tools in those categories can do this.

In fact, there are many solutions, such as Selenium, PhantomJS, CasperJS, and QtWebKit.

We use CasperJS. After each AJAX request completes, we save the rendered page and put it into a queue. That way, the analysis program downstream only needs to parse HTML.
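The hand-off described above might be sketched as follows. This is not the poster's actual code: the queue is a plain in-memory array (a real setup might use files or a message broker), the rendering step is left out entirely, and SnapshotQueue, extractItems, drainQueue, and the "item" class name are all hypothetical. The regular expression only keeps the sketch dependency-free; a proper HTML parser would be a safer choice for real pages.

```javascript
// Sketch: rendered HTML snapshots go into a queue, and a separate
// analysis step only ever sees plain HTML. All names are hypothetical.

class SnapshotQueue {
  constructor() {
    this.items = [];
  }
  // Called by the rendering side once the page's AJAX requests have finished.
  enqueue(url, html) {
    this.items.push({ url, html, savedAt: Date.now() });
  }
  // Called by the analysis side; returns undefined when the queue is empty.
  dequeue() {
    return this.items.shift();
  }
  get size() {
    return this.items.length;
  }
}

// Illustrative extractor: collects the text of <li class="item"> elements.
function extractItems(html) {
  const pattern = /<li class="item">([^<]*)<\/li>/g;
  return Array.from(html.matchAll(pattern), (match) => match[1].trim());
}

// Analysis loop: consume every queued snapshot and extract its items.
function drainQueue(queue) {
  const results = [];
  while (queue.size > 0) {
    const snapshot = queue.dequeue();
    results.push({ url: snapshot.url, items: extractItems(snapshot.html) });
  }
  return results;
}
```

Keeping the two sides apart this way means the slow browser-driven step and the parsing step can run, fail, and be retried independently.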

When CasperJS and Node.js are used together, small problems may come up from time to time (I have not run into any myself, and it has been a good solution). If you would rather avoid the hassle, install SpookyJS through npm; it reportedly lets CasperJS be used as a Node module.

Of course, if the request is not complex and no verification is required, simply observe the request and send the same one yourself.