Shell commands curl and wget: a complete summary of fetching web pages through a proxy

Source: Internet
Author: User
Tags: auth

The Linux shell provides two very useful commands for fetching web pages: curl and wget.

Mimvp Proxy, which provides basic services for big-data analysis and research, has studied both commands in depth and summarized how to use them with proxies.

Proxy protocols supported by curl and wget

curl supports HTTP, HTTPS, SOCKS4, and SOCKS5 proxies.

wget supports HTTP and HTTPS proxies.
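The support list above can be encoded as a small lookup, which is handy when a script has to pick a tool for a given proxy type. This is only a sketch: `proxy_supported` is a hypothetical helper name, not part of curl or wget.

```shell
#!/bin/bash
# proxy_supported TOOL SCHEME
# Succeeds (exit status 0) if TOOL can use a proxy of type SCHEME,
# according to the support list above. Hypothetical helper.
proxy_supported() {
    local tool="$1" scheme="$2"
    case "$tool:$scheme" in
        curl:http|curl:https|curl:socks4|curl:socks5) return 0 ;;
        wget:http|wget:https) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: report which tool can go through a SOCKS5 proxy.
for tool in wget curl; do
    if proxy_supported "$tool" socks5; then
        echo "use $tool for socks5"
    fi
done
```

Keeping the table in one `case` statement means a new tool or protocol needs only one extra pattern.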

Shell examples for curl and wget

#!/bin/bash
#
# curl supports HTTP, HTTPS, SOCKS4, and SOCKS5 proxies
# wget supports HTTP and HTTPS proxies
#
# Mimvp Proxy example, 2015-11-09
# Mimvp Proxy reports that this example was tested on CentOS, Ubuntu, macOS, and other servers.
#
# HTTP proxy format:   http_proxy=http://ip:port
# HTTPS proxy format:  https_proxy=http://ip:port
#
# Placeholders to replace before running:
#   ip:port             address of the proxy
#   username:password   proxy credentials
#   HTTPS_URL           HTTPS page to fetch (set below)
#   30                  timeout in seconds for -m / -T (an assumed value)

HTTP_URL="http://proxy.mimvp.com/test_proxy2.php"
HTTPS_URL="https://example.com/"    # placeholder HTTPS page

#
# Proxy without authentication
#

# curl and wget, fetching an HTTP page
# {'http': 'http://ip:port'}
curl -m 30 --retry 3 -x http://ip:port "$HTTP_URL"                     # http_proxy
wget -T 30 --tries 3 -e "http_proxy=http://ip:port" "$HTTP_URL"        # http_proxy

# curl and wget, fetching an HTTPS page
# (note: wget is given the option that skips SSL certificate verification)
# {'https': 'http://ip:port'}
curl -m 30 --retry 3 -x http://ip:port "$HTTPS_URL"                                            # https_proxy
wget -T 30 --tries 3 -e "https_proxy=http://ip:port" --no-check-certificate "$HTTPS_URL"       # https_proxy

# curl supports SOCKS
# Both SOCKS4 and SOCKS5 proxies can fetch HTTP and HTTPS pages
# {'socks4': 'ip:port'}
curl -m 30 --retry 3 --socks4 ip:port "$HTTP_URL"
curl -m 30 --retry 3 --socks4 ip:port "$HTTPS_URL"
# {'socks5': 'ip:port'}
curl -m 30 --retry 3 --socks5 ip:port "$HTTP_URL"
curl -m 30 --retry 3 --socks5 ip:port "$HTTPS_URL"

# wget does not support SOCKS

#
# Proxy with authentication (the proxy requires a username and password)
#

# curl and wget, fetching HTTP and HTTPS pages
curl -m 30 --retry 3 -x http://username:password@ip:port "$HTTP_URL"                 # http
curl -m 30 --retry 3 -x http://username:password@ip:port "$HTTPS_URL"                # https

# -U (uppercase) passes proxy credentials; lowercase -u would authenticate to the target server instead
curl -m 30 --retry 3 -U username:password -x http://ip:port "$HTTP_URL"              # http
curl -m 30 --retry 3 -U username:password -x http://ip:port "$HTTPS_URL"             # https

curl -m 30 --retry 3 --proxy-user username:password -x http://ip:port "$HTTP_URL"    # http
curl -m 30 --retry 3 --proxy-user username:password -x http://ip:port "$HTTPS_URL"   # https

wget -T 30 --tries 3 -e "http_proxy=http://username:password@ip:port" "$HTTP_URL"
wget -T 30 --tries 3 -e "https_proxy=http://username:password@ip:port" "$HTTPS_URL"

wget -T 30 --tries 3 --proxy-user=username --proxy-password=password -e "http_proxy=http://ip:port" "$HTTP_URL"
wget -T 30 --tries 3 --proxy-user=username --proxy-password=password -e "https_proxy=http://ip:port" "$HTTPS_URL"

# curl supports SOCKS
curl -m 30 --retry 3 -U username:password --socks5 ip:port "$HTTP_URL"               # http
curl -m 30 --retry 3 -U username:password --socks5 ip:port "$HTTPS_URL"              # https

curl -m 30 --retry 3 --proxy-user username:password --socks5 ip:port "$HTTP_URL"     # http
curl -m 30 --retry 3 --proxy-user username:password --socks5 ip:port "$HTTPS_URL"    # https

# wget does not support SOCKS
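The curl invocations in this section differ only in the proxy type, the proxy address, and whether credentials are supplied. The following sketch factors that pattern into one function. `curl_proxy_args` is a hypothetical helper, and `192.0.2.10` is a placeholder address from the documentation range, not a real proxy. The commands are printed rather than executed, so nothing touches the network.

```shell
#!/bin/bash
# curl_proxy_args TYPE ADDRESS [USER:PASSWORD]
# Prints the curl proxy options for one proxy. TYPE is http, socks4, or socks5.
# Hypothetical helper; credentials containing spaces would need extra quoting.
curl_proxy_args() {
    local type="$1" addr="$2" auth="${3:-}"
    local opts
    case "$type" in
        http)   opts="-x http://$addr" ;;
        socks4) opts="--socks4 $addr" ;;
        socks5) opts="--socks5 $addr" ;;
        *) echo "unsupported proxy type: $type" >&2; return 1 ;;
    esac
    if [ -n "$auth" ]; then
        # --proxy-user is the long form of -U
        opts="--proxy-user $auth $opts"
    fi
    echo "$opts"
}

# Print the resulting command lines for inspection.
url="http://proxy.mimvp.com/test_proxy2.php"
echo "curl $(curl_proxy_args http 192.0.2.10:8080) $url"
echo "curl $(curl_proxy_args socks5 192.0.2.10:1080 username:password) $url"
```

Printing first and running second is a cheap way to check the options before pointing them at a live proxy.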

Setting a proxy in the wget configuration file

vim ~/.wgetrc

use_proxy = on
# wait sets the pause between retrievals, in seconds
wait =

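# The configuration file takes effect immediately; run wget directly.
# 30 is an assumed timeout in seconds; $HTTPS_URL stands for the HTTPS page to fetch.
wget -T 30 --tries 3 http://proxy.mimvp.com/test_proxy2.php
wget -T 30 --tries 3 "$HTTPS_URL"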
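A minimal sketch of a complete proxy configuration file follows. It writes the settings to a temporary file and selects that file through the `WGETRC` environment variable, which wget reads in place of `~/.wgetrc`. The proxy address `192.0.2.10:8080` is a placeholder from the documentation range, and the timeout of 30 seconds is an assumed value. The wget command is only printed, not run.

```shell
#!/bin/bash
# Write a standalone wgetrc to a temporary file instead of editing ~/.wgetrc.
# 192.0.2.10:8080 is a placeholder proxy address.
rc_file="$(mktemp)"
cat > "$rc_file" <<'EOF'
use_proxy = on
http_proxy = http://192.0.2.10:8080
https_proxy = http://192.0.2.10:8080
EOF

# Setting WGETRC for a single command makes wget load this file for that run only.
echo "WGETRC=$rc_file wget -T 30 --tries 3 http://proxy.mimvp.com/test_proxy2.php"
```

A per-run file like this keeps proxy settings out of the user's permanent configuration, which helps when different jobs need different proxies.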

Setting a temporary proxy for the current shell

# Proxy without authentication
export http_proxy=http://ip:port
export https_proxy=http://ip:port

# Proxy with authentication (the proxy requires a username and password)
export http_proxy=http://username:password@ip:port
export https_proxy=http://username:password@ip:port

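# Fetch pages directly; curl and wget read the proxy from the environment.
# 30 is an assumed timeout in seconds; $HTTPS_URL stands for the HTTPS page to fetch.
curl -m 30 --retry 3 http://proxy.mimvp.com/test_proxy2.php    # http_proxy
curl -m 30 --retry 3 "$HTTPS_URL"                              # https_proxy
wget -T 30 --tries 3 http://proxy.mimvp.com/test_proxy2.php    # http_proxy
wget -T 30 --tries 3 "$HTTPS_URL"                              # https_proxy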

# Remove the proxy settings
unset http_proxy
unset https_proxy
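When the proxy is needed for a single command only, the export and unset steps can be wrapped in a subshell so the surrounding shell is never modified. This is a sketch under that assumption: `with_proxy` is a hypothetical helper, and `192.0.2.10:8080` is a placeholder address. The demonstration uses `sh -c` to show what a child process sees, so it needs no network access.

```shell
#!/bin/bash
# with_proxy PROXY_URL COMMAND [ARGS...]
# Runs COMMAND with http_proxy and https_proxy set, inside a subshell,
# so the variables disappear again when the command finishes.
with_proxy() {
    (
        export http_proxy="$1" https_proxy="$1"
        shift
        "$@"
    )
}

# Demonstration: the child process sees the proxy, the calling shell is unchanged.
with_proxy http://192.0.2.10:8080 sh -c 'echo "child sees: $http_proxy"'
echo "caller sees: ${http_proxy:-<not set>}"
```

In practice the command would be a curl or wget call, for example `with_proxy http://ip:port curl -m 30 --retry 3 http://proxy.mimvp.com/test_proxy2.php`.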

Setting a system-wide proxy in the shell

# Edit one of the following files, save it, and restart the server
sudo vim /etc/profile        # applies to all users
vim ~/.bashrc                # applies to the current user only
vim ~/.bash_profile          # applies to the current user only

# At the end of the file, add the following
# Proxy without authentication
export http_proxy=http://ip:port
export https_proxy=http://ip:port

# Proxy with authentication (the proxy requires a username and password)
export http_proxy=http://username:password@ip:port
export https_proxy=http://username:password@ip:port

# Run source to apply the configuration file to the current shell session
source ~/.bash_profile

# To make the setting take effect permanently for the whole machine, reboot the server
sudo reboot
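Appending the export lines by hand is easy to repeat by accident, which leaves duplicate entries in the profile. The sketch below appends them only when no `http_proxy` export is present yet. `add_proxy_exports` is a hypothetical helper, `192.0.2.10:8080` is a placeholder address, and the demonstration writes to a scratch file rather than a real profile.

```shell
#!/bin/bash
# add_proxy_exports FILE PROXY_URL
# Appends the two proxy exports to FILE unless an http_proxy export already exists.
add_proxy_exports() {
    local file="$1" proxy="$2"
    if grep -q '^export http_proxy=' "$file" 2>/dev/null; then
        echo "proxy already configured in $file"
        return 0
    fi
    printf 'export http_proxy=%s\nexport https_proxy=%s\n' "$proxy" "$proxy" >> "$file"
}

# Try it on a scratch file; the second call leaves the file unchanged.
profile="$(mktemp)"
add_proxy_exports "$profile" http://192.0.2.10:8080
add_proxy_exports "$profile" http://192.0.2.10:8080
cat "$profile"
```

To use it for real, pass `~/.bash_profile` as the file and then run `source ~/.bash_profile`.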

Mimvp Proxy examples

Mimvp Proxy focuses on providing big-data research services to enterprises in China. Its technical team comes from Baidu, Xiaomi, Alibaba, and Sinovation Ventures, and it offers Chinese enterprises big-data collection, data modeling and analysis, and export and presentation of results.

The Mimvp Proxy examples cover more than 10 programming and scripting languages, including Python, Java, PHP, C#, Go, Perl, Ruby, Shell, Node.js, PhantomJS, Groovy, Delphi, and Easy Language. Through a large number of working examples, they show the correct way to use proxy IPs for web crawling, data collection, automated testing, and similar tasks.

Mimvp Proxy examples official website:
