The spider can only capture the href of a link (<a href="Default.aspx">test</a>), and it is best not to include parameters in it (<a href="Default.aspx?Id=1">test</a>).
After several days of study, I have summarized the following points:

1. The spider can only capture the href of an <a> tag, for example <a href="Default.aspx">test</a>. It is best not to include parameters in the link, as in <a href="Default.aspx?Id=1">test</a>. If parameters cannot be avoided, use URL rewriting.
2. The spider does not execute JavaScript. In other words, if a link relies on onclick in an <a> tag, the spider will not follow it.
3. The spider can only capture pages returned for GET requests, not pages that require a POST request.
4. We want all front-end pages of the site to be captured by the spider, but not the back-end pages. The spider has to be told which pages are front-end and which are back-end, so we need to create a file named "robots.txt". (robots.txt is a convention, not a command.) It is the first file a search engine looks for when it visits a website.

The focus of this article is the following. In point 2, I said that the spider does not execute JavaScript. Does that mean that anything using AJAX will not be captured by the spider? The answer is no. Let's take a look at a demo that uses AJAX.
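Before turning to AJAX, points 1, 2, and 4 above can be illustrated with a small sketch. This is not the author's demo and not how any particular search engine is implemented; it is a simplified model, written in Python with only the standard library, of a spider that reads href values, ignores onclick, and consults robots.txt. The page markup, the Hidden.aspx target, the /admin/ back-end path, and the example.com host are all assumptions made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, as a simple spider might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # JavaScript is never executed here, so an onclick handler
        # contributes nothing; only a real href is recorded.
        if href and not href.lower().startswith("javascript:"):
            self.links.append(href)


def has_query(url):
    """Return True if the URL carries query-string parameters."""
    return bool(urlsplit(url).query)


# Hypothetical page: the third link depends on onclick and has no href.
PAGE = """
<a href="Default.aspx">test</a>
<a href="Default.aspx?Id=1">test</a>
<a onclick="location.href='Hidden.aspx'">test</a>
"""

# Hypothetical robots.txt: the /admin/ back-end path is an assumption.
ROBOTS = """\
User-agent: *
Disallow: /admin/
"""

collector = LinkCollector()
collector.feed(PAGE)

rules = RobotFileParser()
rules.parse(ROBOTS.splitlines())

for link in collector.links:
    note = "has parameters, consider URL rewriting" if has_query(link) else "clean"
    print(link, "->", note)

# example.com is a placeholder host used only for this sketch.
for url in ("http://example.com/Default.aspx",
            "http://example.com/admin/Login.aspx"):
    print(url, "allowed:", rules.can_fetch("*", url))
```

In this sketch the onclick-only link never reaches the list of collected links, and the back-end URL under /admin/ is reported as disallowed. Keep in mind that robots.txt only asks well-behaved spiders to stay away; it does not protect the back-end pages.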