PHP: Capture Program

Source: Internet
Author: User
A few days ago I took on a small project. The specific requirement: the customer specifies a target website's domain name, and the same content can then be accessed through their own domain, with the site structure identical to the target's. Obviously, this is a "thief" (content-scraping) program.

Implementation idea: for ordinary static URLs (such as /2014/06/19/index.html), on the first visit (for example www.xxx.com/2014/06/19/index.html) simply fetch the page www.sohu.com/2014/06/19/index.html, then create the corresponding folders and file under the root directory of the local site (2014 -> 06 -> 19 -> index.html).

But for dynamic URLs, for example /index.php?type=news, folder and file names cannot contain certain special characters, so those characters have to be replaced with something else.

However, the customer has now come up with an odd extra requirement: the captured site should not mirror the target's structure; ideally the URL structure should be customizable. For example,
www.sohu.com/2014/06/19/index.html
should come out as www.xxx.com/2014_06_19_index.html
instead of www.xxx.com/2014/06/19/index.html.

Is there a better way to implement this? Or is there a more powerful open-source program for it?
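
One way to read this requirement, as a rough sketch: keep two mapping functions, one that turns an upstream path into the customised flat URL used on your own domain, and one that turns a requested flat URL back into the upstream path so the page can be fetched and cached on first access. The host names, cache directory, and the simple slash/underscore rule below are assumptions for illustration; real paths that already contain underscores would need a safer encoding.

<?php
// Sketch of the URL-mapping idea, assuming a simple '/' <-> '_' rule.
const SOURCE_HOST = 'www.sohu.com';     // site being captured (assumption)
const CACHE_DIR   = __DIR__ . '/cache'; // local cache directory (assumption)

// Upstream path -> customised local URL, e.g. /2014/06/19/index.html -> /2014_06_19_index.html
function localUrl(string $remotePath): string
{
    return '/' . str_replace('/', '_', ltrim($remotePath, '/'));
}

// Customised local URL -> upstream path, e.g. /2014_06_19_index.html -> /2014/06/19/index.html
function remotePath(string $localUrl): string
{
    return '/' . str_replace('_', '/', ltrim($localUrl, '/'));
}

// Front controller: serve from cache, or fetch from the source site on first visit.
$request   = strtok($_SERVER['REQUEST_URI'], '?');  // strip any query string
$cacheFile = CACHE_DIR . '/' . basename($request);

if (!is_file($cacheFile)) {
    $html = file_get_contents('http://' . SOURCE_HOST . remotePath($request));
    if ($html === false) {
        http_response_code(502);
        exit('Could not fetch the source page');
    }
    if (!is_dir(CACHE_DIR)) {
        mkdir(CACHE_DIR, 0755, true);
    }
    file_put_contents($cacheFile, $html);
}

readfile($cacheFile);

With this layout the cached copies all sit flat in one directory, so the dynamic-URL problem (slashes, question marks and equals signs in file names) largely disappears as well.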

Reply content:

The file_get_contents() function can fetch a page's source code:
http://www.w3school.com.cn/php/func_filesystem_file_get_contents.asp

The strtok() function can help with handling the file name:
http://www.w3school.com.cn/php/func_string_strtok.asp
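
A minimal sketch of what this reply suggests, applied to the original static-URL approach from the question: fetch the page with file_get_contents() and use strtok() to walk the path segments while mirroring them as nested folders (2014 -> 06 -> 19 -> index.html). The host name and path are only examples.

<?php
// Fetch the remote page; file_get_contents() returns false on failure.
$path = '/2014/06/19/index.html';
$html = file_get_contents('http://www.sohu.com' . $path);
if ($html === false) {
    exit('Fetch failed');
}

// strtok() walks the path one segment at a time.
$segments = [];
for ($seg = strtok($path, '/'); $seg !== false; $seg = strtok('/')) {
    $segments[] = $seg;
}

// All segments except the last become directories; the last is the file name.
$file = array_pop($segments);
$dir  = __DIR__;
foreach ($segments as $seg) {
    $dir .= '/' . $seg;
    if (!is_dir($dir)) {
        mkdir($dir, 0755);
    }
}
file_put_contents($dir . '/' . $file, $html);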

Collect it with the "Locomotive" collector (火车头采集器); it's very powerful.

Write a simple router, then match each request to its file.
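
A tiny routing sketch of that suggestion: match the request URI against a table of patterns, dispatch to a handler, and fall back to a previously captured file. The patterns and handlers here are invented for illustration.

<?php
$routes = [
    '#^/(\d{4})_(\d{2})_(\d{2})_index\.html$#' => function ($m) {
        echo "archive page for {$m[1]}-{$m[2]}-{$m[3]}";
    },
    '#^/news_(\w+)\.html$#' => function ($m) {
        echo "news list of type {$m[1]}";
    },
];

$uri = strtok($_SERVER['REQUEST_URI'], '?');  // drop the query string

foreach ($routes as $pattern => $handler) {
    if (preg_match($pattern, $uri, $m)) {
        $handler($m);
        exit;
    }
}

// No route matched: try a previously captured file, otherwise 404.
$cached = __DIR__ . '/cache/' . basename($uri);
if (is_file($cached)) {
    readfile($cached);
} else {
    http_response_code(404);
    echo 'Not found';
}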

Reverse Proxy
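
A reverse proxy is normally set up at the web-server level (for example with nginx), but the idea can be sketched in PHP as well: forward the incoming request path to the source site, stream the response back, and rewrite links so visitors stay on your own domain. The host names and the naive link rewrite below are assumptions, not a production setup.

<?php
const UPSTREAM = 'http://www.sohu.com';  // source site (assumption)

$path = $_SERVER['REQUEST_URI'];

$ch = curl_init(UPSTREAM . $path);
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_TIMEOUT        => 10,
]);
$body = curl_exec($ch);
$type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
curl_close($ch);

if ($body === false) {
    http_response_code(502);
    exit('Upstream fetch failed');
}

// Naive link rewrite so followed links stay on our own domain.
$body = str_replace('www.sohu.com', $_SERVER['HTTP_HOST'], $body);

if ($type) {
    header('Content-Type: ' . $type);
}
echo $body;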

Crawl the page content with cURL, then use preg_match_all() with a regular expression to extract the specified content from the page.
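
A short sketch of that reply: fetch the page with cURL and pull out specific pieces with preg_match_all(), here the title and all link targets. The URL and the regular expressions are only examples; regexes are generally fragile for parsing HTML.

<?php
$ch = curl_init('http://www.sohu.com/2014/06/19/index.html');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$html = curl_exec($ch);
curl_close($ch);

if ($html === false) {
    exit('Fetch failed');
}

// Page title.
if (preg_match('#<title>(.*?)</title>#is', $html, $m)) {
    echo "Title: {$m[1]}\n";
}

// All link targets on the page.
preg_match_all('#<a[^>]+href=["\']([^"\']+)["\']#i', $html, $links);
print_r($links[1]);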

Wouldn't it work to collect the pages first and then reassign the directories afterwards?
