= url + '?' + data
response = urllib2.urlopen(url2)
the_page = response.read()
print the_page
The following example shows how to use cookies by simulating a login to Renren and then displaying the home page content. We start from the example in the documentation and adapt it to do what we want.
import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_ope
://www.pixiv.net/login.php?return_to=0", "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0"}
Because I want to crawl images from multiple pages, I log in with cookies here. The cookie may change between runs, however, so the script has to log in again each time:
cookie = http.cookiejar.MozillaCookieJar(".cookie")  # the cookie file is overwritten on every update
handler = urllib.request.HTTPCookieProcessor(cookie)
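How a MozillaCookieJar keeps cookies in a file between runs can be sketched as follows. This is only an illustration under stated assumptions: the helper names make_cookie and save_and_reload, the cookie name "session", and the domain example.com are hypothetical, and the cookie is built by hand where a real script would receive it from the site's login response.

```python
import http.cookiejar
import os
import tempfile
import time


def make_cookie(name, value, domain):
    """Build a cookie by hand (hypothetical data); normally a Set-Cookie header does this."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=int(time.time()) + 3600,
        discard=False, comment=None, comment_url=None, rest={},
    )


def save_and_reload(path):
    """Write a jar to `path` in Mozilla cookies.txt format, then read it back."""
    jar = http.cookiejar.MozillaCookieJar(path)
    jar.set_cookie(make_cookie("session", "abc123", "example.com"))
    # save() overwrites the file each time it is called.
    jar.save(ignore_discard=True, ignore_expires=True)

    reloaded = http.cookiejar.MozillaCookieJar(path)
    reloaded.load(ignore_discard=True, ignore_expires=True)
    return {cookie.name: cookie.value for cookie in reloaded}
```

Calling save_and_reload(os.path.join(tempfile.mkdtemp(), ".cookie")) returns the cookies read back from disk. Passing a reloaded jar to urllib.request.HTTPCookieProcessor would let an opener reuse them, provided the site still accepts the stored session.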
Two important concepts in urllib2: Openers and Handlers
Before starting, let's first explain two methods in urllib2: info and geturl.
The response object (or HTTPError instance) returned by urlopen has two useful methods: info() and geturl().
1. geturl():
This returns the real URL that was fetched, which is useful because urlopen (or the opener object in use) may have followed a redirect. The obtained URL may be different
This article first introduces two methods of urllib2 and then explains in detail its two important concepts: Openers and Handlers. Before starting, the following two methods in urllib2 are described: info and geturl.
import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")

# coding: utf-8
import urllib2, urllib
import cookielib
url
Before we start, let's explain two methods in urllib2: info and geturl.
The response object (or HTTPError instance) returned by urlopen has two useful methods, info() and geturl().
1. geturl():
This returns the actual URL obtained, which is useful because urlopen (or the opener object in use) may have followed redirects. The URL you get may be different from the requested URL. Take a hyperlink on Renren as an example. Let's build a urllib2_test10.py to compar
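A comparison of the requested URL with the URL actually fetched can be sketched as follows. This is not the article's urllib2_test10.py. It is written for Python 3, where urllib2 became urllib.request, and it uses a small local test server instead of Renren so that it runs without network access. RedirectingHandler, fetch, demo, and the /old and /new paths are all hypothetical names chosen for illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectingHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /old redirects to /new."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"hello"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(url):
    """Return (real URL, Content-Type header) for `url`."""
    # An empty ProxyHandler keeps the local request away from any configured proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    response = opener.open(url, timeout=5)
    try:
        return response.geturl(), response.info().get("Content-Type")
    finally:
        response.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        requested = "http://127.0.0.1:%d/old" % server.server_port
        real_url, content_type = fetch(requested)
        return requested, real_url, content_type
    finally:
        server.shutdown()
        server.server_close()
```

Here demo() returns the requested URL ending in /old, the real URL ending in /new, and the Content-Type taken from info(). In Python 3 the same values are also available as the response attributes url and headers.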
Excelize is a library written in Go (Golang) for working with Office Excel documents, based on the Microsoft Office Open XML standard. You can use it to read and write XLSX files. Compared with other open-source libraries, Excelize supports writing documents that contain images and tables, supports inserting images into Excel, and does not lose chart styles after saving.
New members of the technical group keep asking questions about urllib, urllib2, and cookielib, so I am summarizing the answers here to avoid answering the same questions over and over again.
This is a tutorial; if you already know urllib2 and cookielib, you can skip this article.
First, start with a piece of code:
# cookie
import urllib2
import cookielib
cookie = cookielib.CookieJar()
opener = url
Cookies are used for server sessions, user login, and state management. This article mainly introduces how to handle cookies in Python. If you are interested, you can refer to the previous article to learn about crawler exception handling. Next, let's take a look at how to use cookies.
Python crawler cookie usage
In the previous article, we learned about crawler exception handling, so now let's take a look at how to use cookies.
Why use cookies?
A cookie is data (usually encrypted) that some websites store on the user's local machine in order to identify users and track sessions.
For example, some websites require you to log in before you can access a page; until you log in, you are not allowed to fetch the content of that page. Then we can use the urllib2 library to save th
ways to access web pages using Python: urllib, urllib2, httplib
urllib is relatively simple and has relatively limited functionality. httplib is simple and powerful, but does not seem to support sessions.
(1) Simplest page access
res = urllib2.urlopen(url)
print res.read()
(2) Add data for GET or POST
data = {"name": "hank", "passwd": "hjz"}
urllib2.urlopen(url, urllib.urlencode(data))
(3) Add an HTTP header
header = {"User-Agent": "Mozilla-Firefox5.0"}
urllib2.urlopen(url, urllib.urlencode(data),
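The three cases above can be sketched with a Request object, which is where form data and headers are attached. This sketch is written for Python 3, where urllib2.Request and urllib.urlencode live in urllib.request and urllib.parse. The helper names build_request and build_get_url and the example.com URL are assumptions made for illustration.

```python
import urllib.parse
import urllib.request


def build_request(url, data=None, headers=None):
    """Create a Request; supplying form data turns it into a POST."""
    body = urllib.parse.urlencode(data).encode("utf-8") if data else None
    return urllib.request.Request(url, data=body, headers=headers or {})


def build_get_url(url, params):
    """For a GET, the encoded parameters go into the query string instead."""
    return url + "?" + urllib.parse.urlencode(params)


if __name__ == "__main__":
    req = build_request(
        "http://example.com/login",  # hypothetical URL
        data={"name": "hank", "passwd": "hjz"},
        headers={"User-Agent": "Mozilla-Firefox5.0"},
    )
    print(req.get_method(), req.data, req.header_items())
```

The resulting object is passed to urllib.request.urlopen(req) or to an opener's open() method. No request is sent until then, so the Request can be inspected first.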
There are many practical tool classes in the Python standard library. Here we summarize the usage details of urllib2: proxy settings, timeout settings, adding a specific header to an HTTP request, cookies, and using the HTTP PUT and DELETE methods.
Proxy settings
By default, urllib2 uses the environment variable http_proxy to set the HTTP proxy. If you want to control the proxy explicitly in the program, without being affected by environment variables, you can use the following method:
The code is as follows:
Import
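Since the original listing is cut off here, the following is only a sketch of explicit proxy control, written for Python 3 (urllib.request in place of urllib2). The function name make_opener and the proxy address 127.0.0.1:8087 are hypothetical.

```python
import urllib.request


def make_opener(proxy_url=None):
    """Build an opener whose proxy use does not depend on environment variables."""
    if proxy_url:
        # Route http:// requests through the given proxy.
        proxy_handler = urllib.request.ProxyHandler({"http": proxy_url})
    else:
        # An empty mapping disables proxies, even if http_proxy is set.
        proxy_handler = urllib.request.ProxyHandler({})
    return urllib.request.build_opener(proxy_handler)


if __name__ == "__main__":
    opener = make_opener("http://127.0.0.1:8087")  # hypothetical proxy address
    for handler in opener.handlers:
        if isinstance(handler, urllib.request.ProxyHandler):
            print(handler.proxies)
```

Calling opener.open(url) uses the chosen proxy for that opener only. urllib.request.install_opener(opener) would instead change what urlopen does for the whole program, which can surprise other code in the same process.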
To look up grades, you need to log in; the site then shows the grade for each course, but it shows only the grades and not the grade point, that is, the weighted average. Let's start with our school's website:
http://jwxt.sdu.edu.cn:7777/zhxt_bks/zhxt_bks.html
We first prepare the POST data, then prepare a cookie for recei
average value of the output score, that is, the grade point.
#---------------------------------------
import urllib
import urllib2
import cookielib
cookie = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
# Data to be POSTed
postdata = urllib.urlencode({'Stuid': '123', 'Pwd': '123'})
# Build a custom request
req = urllib2.Request(url = 'http://jwxt.sdu.edu.cn:
source code to find out where the POST data is actually sent:
This is the address to which the POST data is submitted.
In the address bar, the complete address should be as follows:
http://jwxt.sdu.edu.cn:7777/pls/wwwbks/bks_login2.login
(Getting it is simple: in Firefox, click the link to view its address.)
5. A first try
The next task is to use Python to simulate sending the POST data and obtain the returned cookie value.
For more information about Cookie operations, see this b
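The step described above, sending the POST data and capturing the returned cookie, can be sketched as follows. It is written for Python 3 (http.cookiejar and urllib.request in place of cookielib and urllib2) and runs against a small local stand-in server, not the real school system. LoginHandler, login_and_fetch, demo, the /login and /grades paths, the field names stuid and pwd, and the cookie name session are all hypothetical.

```python
import http.cookiejar
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class LoginHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a login site."""

    def _reply(self, status, body, cookie=None):
        self.send_response(status)
        if cookie:
            self.send_header("Set-Cookie", cookie)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        form = urllib.parse.parse_qs(self.rfile.read(length).decode("utf-8"))
        if form.get("stuid") == ["123"] and form.get("pwd") == ["123"]:
            self._reply(200, b"logged in", cookie="session=ok; Path=/")
        else:
            self._reply(403, b"bad credentials")

    def do_GET(self):
        logged_in = "session=ok" in self.headers.get("Cookie", "")
        self._reply(200, b"grades page" if logged_in else b"please log in")

    def log_message(self, *args):
        pass  # keep the demo quiet


def login_and_fetch(base_url):
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),          # bypass proxies for the local server
        urllib.request.HTTPCookieProcessor(jar),  # store and resend cookies
    )
    postdata = urllib.parse.urlencode({"stuid": "123", "pwd": "123"}).encode("utf-8")
    opener.open(base_url + "/login", postdata, timeout=5).close()
    # The same opener now sends the session cookie automatically.
    with opener.open(base_url + "/grades", timeout=5) as response:
        page = response.read()
    return [cookie.name for cookie in jar], page


def demo():
    server = HTTPServer(("127.0.0.1", 0), LoginHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return login_and_fetch("http://127.0.0.1:%d" % server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

The key design point is that both requests go through the same opener, so the cookie set by the login response is attached to the second request without any manual header handling.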
Summary of the usage details of the Python standard library urllib2
There are many practical tool classes in the Python standard library, but for some of them, such as the HTTP client library urllib2, the standard library documentation does not describe the details of their use clearly. Here we summarize the usage details of urllib2.
1. Proxy settings
2. Timeout settings
3. Add a specific header to the HTTP request
4. Redirect
5. Cookie
6. Use the PUT and DELETE methods of HTTP
7. Get the HTTP return co
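Three of the items above (timeout, PUT and DELETE, and the HTTP return code) can be sketched together. This is an illustration for Python 3 only: urllib.request.Request accepts a method argument there, whereas with urllib2 the usual workaround was to override the request's get_method. The helper names make_request and status_of are hypothetical.

```python
import urllib.error
import urllib.request


def make_request(url, method, body=None):
    """Build a request that uses an explicit HTTP method such as PUT or DELETE."""
    return urllib.request.Request(url, data=body, method=method)


def status_of(opener, request, timeout=5):
    """Return the HTTP status code, whether or not the request succeeded."""
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.getcode()
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError, which still carries the code.
        return err.code
```

A call such as status_of(urllib.request.build_opener(), make_request(url, "DELETE")) would send the request with a five-second timeout. The timeout here is passed per call and does not change the global socket default.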