urllib2 provides a wide range of methods for working with URL-based resources. You can use handlers to implement various functions; for example, automatic redirects and cookie parsing and acquisition are implemented based on HTTP status codes (redirects based on HTTP status codes are also implemented in urllib.FancyURLopener).
The step-by-step code is as follows:
import urllib2 as ul2
import cookielib as cl
import urllib as ul

cj = cl.CookieJar()
opener = ul2.build_opener(ul2.HTTPCookieProcessor(cj))
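As a quick illustration of the automatic redirect handling mentioned above (a sketch; the URL is only a placeholder), the opener follows redirects transparently, and you can compare the URL you asked for with the one you ended up at:

response = opener.open("http://example.com/some-redirecting-page")
print response.geturl()   # the final URL after any redirects were followed
print response.getcode()  # the HTTP status code of the final response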
Writing Python crawlers with urllib2
The usage details of urllib2 are sorted out below.
1. Proxy Settings
By default, urllib2 uses the environment variable http_proxy to set the HTTP proxy.
If you want to explicitly control the proxy in the program without being affected by environment variables, you can use ProxyHandler.
Create a new test14 to implement a simple proxy demo:

import urllib2

enable_proxy = True
proxy_handler = urllib2.ProxyHandler({"http": "http://some-proxy.com:8080"})
null_proxy_handler = urllib2.ProxyHandler({})
opener = urllib2.build_opener(proxy_handler if enable_proxy else null_proxy_handler)
urllib2.install_opener(opener)
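Note that urllib2.install_opener() in the demo above sets the opener globally, so every later urllib2.urlopen() call will also go through the proxy. If you would rather apply the proxy only to selected requests, skip install_opener and use the opener object directly (example.com is a placeholder here):

opener = urllib2.build_opener(proxy_handler)
response = opener.open("http://www.example.com/")  # only requests made through this opener use the proxy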
Reprinted from Tao Road | Usage details of the Python standard library urllib2. There are many useful tool classes in the Python standard library, but the standard library documentation is not always clear about usage details; urllib2, the HTTP client library, is one example. Here is a summary of some urllib2 usage details.
1. Proxy settings
2. Timeout settings
3. Adding a specific Header to an HTTP Request
4. Redirect
5. Cookie
6. Using HTTP's PUT and DELETE methods
7. Getting the HTTP return code
8. Debug Log
Taking fetching the contents of the first page as an example, here is the use of cookies in detail. The following is the example given in the documentation; we will then change this example to achieve the functionality we want.
import cookielib, urllib2

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
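To change the documentation example into the functionality we want, one small modification (a sketch, assuming we simply want to see which cookies the site set) is to print the contents of the CookieJar after the request:

r = opener.open("http://example.com/")
for cookie in cj:
    print "%s = %s" % (cookie.name, cookie.value)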
# coding: utf-8
import urllib2, urllib
import cookielib

url = r'http://www.renren.com/ajaxL
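The snippet is cut off here, but the usual shape of such a cookie-based login (a sketch with hypothetical form-field names and a placeholder login URL, not renren's actual interface; it continues from the imports above) is to POST the credentials through a cookie-aware opener, then request the protected page with the same opener:

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

login_url = 'http://example.com/login'  # hypothetical login URL
post_data = urllib.urlencode({'email': 'user@example.com', 'password': 'secret'})  # hypothetical fields
opener.open(login_url, post_data)                    # the session cookie lands in cj
page = opener.open('http://example.com/protected')   # sent along with the cookie
print page.read()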
Recently, an IoT project included a module for air-conditioning control and TV control, so window.open and window.showModalDialog were used. I think they will not be used frequently in the future, so I am sorting this out for later reference so it is not forgotten, and sharing it with everyone!
Here are the operations and code that I think are useful.
Setvalue("iframe1", "iframediv");
window.parent.opener = null; // if this sentence is
Log on to the website using Python
For most forums, we need to log in first in order to capture the posts for analysis; otherwise, we cannot view them.
This is because HTTP is a stateless protocol. How does the server know whether the user requesting the connection has logged on? There are two methods:
Explicitly carry the session ID in the URI;
Use cookies: after you log in to a website, a cookie is stored locally, and as you continue to browse the site the browser sends that cookie with each request, so the server can recognize the session.
window.open is usually used to open a new window.
To obtain a reference to the parent window from the child, you can use window.opener.
However, if showModalDialog is used, window.opener does not work.
If you need it, you have to adjust both the call that opens the dialog and the code inside the dialog.
For the opening call, pass self as the second parameter. The example is as follows:

var rc = window.showModalDialog(strUrl, self, sFeatures);

Then, inside the dialog, the window object that was passed in can be retrieved through window.dialogArguments.
Using a proxy to access a server in Python involves 3 main steps:
1. Create a proxy handler, ProxyHandler: proxy_support = urllib.request.ProxyHandler(...). ProxyHandler is a class whose argument is a dictionary: {'type': 'proxy IP:port'}. What is a handler? A handler is also known as a processor; each handler knows how to open URLs through a specific protocol, or how to handle various aspects of opening a URL, such as HTTP redirects or HTTP cookies.
2. Customize and create an opener: opener = urllib.request.build_opener(proxy_support)
3. Install the opener: urllib.request.install_opener(opener), after which urlopen() goes through the proxy.
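Putting the three steps together (a sketch for Python 3's urllib.request; the proxy address is a placeholder):

import urllib.request

# 1. create the proxy handler
proxy_support = urllib.request.ProxyHandler({'http': 'http://127.0.0.1:8080'})
# 2. build an opener that uses it
opener = urllib.request.build_opener(proxy_support)
# 3. install it as the global opener, so urlopen() goes through the proxy
urllib.request.install_opener(opener)
print(urllib.request.urlopen('http://example.com/').read())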
When writing a Python crawler, do we consider using cookies in addition to handling crawler exceptions? And when using cookies, have you ever wondered why they are needed? Let's take a look. Cookies are data (usually encrypted) that some websites store on the user's local terminal in order to identify the user and track the session. For example, some websites require you to log in before you can access a page; before you log in, you may not be allowed to crawl that page's content.
Writing crawlers in Python with urllib2
The first thing you need to know is the parent class of all handlers, the BaseHandler class; it has many subclasses (several of these can be combined into one opener, as the sketch after this list shows):
HTTPDefaultErrorHandler: handles HTTP response errors; errors raise an HTTPError exception
HTTPRedirectHandler: handles redirects
HTTPCookieProcessor: handles cookies
ProxyHandler: sets the proxy; the default proxy is empty
HTTPPasswordMgr: manages passwords, maintaining a table of usernames and passwords
HTTPBasicAuthHandler: handles authentication; if a link requires HTTP Basic authentication when opened, this handler can supply the credentials
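Several of these handlers can be passed to build_opener together; a minimal sketch (the proxy address is a placeholder) chaining a cookie processor with a proxy handler:

import urllib2, cookielib

cj = cookielib.CookieJar()
opener = urllib2.build_opener(
    urllib2.HTTPCookieProcessor(cj),
    urllib2.ProxyHandler({'http': 'http://127.0.0.1:8080'}),
)
response = opener.open('http://example.com/')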
The client then requests the domain again, with the correct account and password contained in the header. This is "Basic Authentication". To simplify this process, we can create an instance of HTTPBasicAuthHandler and an opener that uses this handler. HTTPBasicAuthHandler uses a mapping object called a password manager to map the realm and URL to a username and password. If you know what the realm is (from the authentication header sent by the server), you can use an HTTPPasswordMgr; if not, HTTPPasswordMgrWithDefaultRealm is more convenient.
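A minimal sketch of that process (the URL and credentials are placeholders): when the realm is not known in advance, HTTPPasswordMgrWithDefaultRealm accepts None for it:

import urllib2

password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
# None means: use these credentials for whatever realm this URL reports
password_mgr.add_password(None, 'http://example.com/protected/', 'user', 'secret')
auth_handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(auth_handler)
response = opener.open('http://example.com/protected/')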
urllib2 does not seem to support sessions, though. Some basic usage:
(1) The simplest page access:
res = urllib2.urlopen(url)
print res.read()
(2) Adding data, for GET or POST:
data = {"name": "Hank", "passwd": "HJZ"}
urllib2.urlopen(url, urllib.urlencode(data))
(3) Adding HTTP headers (urlopen itself takes no header argument, so wrap the request in a Request object):
header = {"User-Agent": "Mozilla-Firefox5.0"}
req = urllib2.Request(url, urllib.urlencode(data), header)
urllib2.urlopen(req)
Use opener and handler: opener = urllib2.build_opener(handler), then urllib2.install_opener(opener) to make it the default.
That is, download the kernel directly to the specified address space and run it directly; that is the method I used above! After the modification, run make and make image, download it, and run; that situation no longer appeared, but it stops here with no further output:

run tbdm9000
I/O: 0x20000300, ID: 0x90000a46
dm9000: running in 16-bit mode
MAC: 08:00:3e:26:0a:6b
could not establish link
operating @ 100M full duplex mode
Using dm9000 device
TFTP from server 192.168.1.2; our IP address is 192.168.1.244
Filename 'uImage-s3
This example describes how PHP can use curl to spoof the IP source. It can forge the IP source, the domain name, and user information; it is shared here for everyone's reference. The implementation is as follows:
Define the spoofed browser information, HTTP_USER_AGENT. The code is as follows:

$binfo = array('Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; InfoPath.2; AskTbPTV/5.17.0.25