Written up front
I'm just a rookie, so please go easy on me and don't flame.
There's no real backstory, but let me force one in anyway.
Now, on to the actual topic.
The computer lab is open 24/7, and I'd been thinking about using it to get something done.
Then, while browsing Pixiv, it suddenly hit me that my three-digit collection of bookmarked illustrations could be crawled down.
I wrote it over two homework-free evenings. PS:
Pixiv reports frequent abnormal logins to your email, so be careful or your inbox will get flooded with 999+ login notifications.
This is what the browser's F12 developer tools show during login; to keep the requests from being wiped out after the redirect, we need to check "Preserve log".
You can see that besides the account name and password, the login request carries something that looks like a verification token: post_key.
Digging through the page source turns up:
<input type="hidden" name="post_key" value="cb02a4460fd7a41fb46d0129e4ca5ece">
Requests to Pixiv also need to carry a Referer header. As I understand it, it tells the server which URL you jumped from.
This header really matters; without it you can't get at the original images or the bookmarks list.
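Putting these pieces together, here is a minimal sketch of the login step. Hedged heavily: the two URLs and the pixiv_id/password field names are my assumptions based on the F12 capture described above; only post_key is confirmed by the page source shown.

# -*- coding: utf-8 -*-
import re
import requests

html = requests.Session()  # a session keeps the login cookies for later requests
headers = {'User-Agent': 'Mozilla/5.0'}

login_page_url = 'https://accounts.pixiv.net/login'      # assumed login page
login_post_url = 'https://accounts.pixiv.net/api/login'  # assumed form target

# Fetch the login page and pull the hidden post_key out of its HTML
login_page = html.get(login_page_url, headers=headers, timeout=3).content
post_key = re.findall(r'name="post_key" value="(.+?)"', login_page)[0]

data = {
    'pixiv_id': 'your_account',   # assumed field name for the account
    'password': 'your_password',  # assumed field name for the password
    'post_key': post_key,
}
headers['Referer'] = login_page_url  # Pixiv rejects requests without a Referer
html.post(login_post_url, headers=headers, data=data, timeout=3)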
First, the bookmarks URL.
To make this crawler actually crawl, we look on every page for the tag whose class is next:

<span class="next"><a href="rest=show&p=2" rel="next" class="_button" title="Next page">

That kind of thing; the <a> inside it points to the next page.
If no such tag can be found, we've hit the last page of images.
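As a minimal sketch of that pagination loop (favurl, prefavurl, and the getpage helper are names taken from the full code at the end of this post):

import re

favurl = prefavurl                 # first page of the bookmarks list
while 1:
    favpage = getpage(html, favurl, headers)  # fetch the current page
    # ...queue up this page's downloads here...
    nxt = re.findall(r'class="next"><a href="(.+?)"', favpage)
    if not nxt:
        break                      # no "next" tag: last page reached
    # the href comes back HTML-escaped, so turn &amp; back into &
    favurl = prefavurl + re.sub('&amp;', '&', nxt[0])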
Each illustration's page carries image addresses at several resolutions. Taking the illustration with ID 59345668 as an example, the original image sits on one particular line of the page source.
That's the one we want.
Then it's just a regex match. One caveat: manga (multi-page) posts don't have such a line.
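The HTML line itself didn't survive the formatting of this post, so the pattern below is only a guess at its shape: on illustration pages of that era the original image URL showed up in a data-src attribute containing img-original. Treat both the attribute name and the URL shape as assumptions and adjust to whatever the real page source shows.

import re

# Assumed shape of the original-image line; not confirmed by this post.
reg = re.compile(r'data-src="([^"]*img-original[^"]*)"')
hit = re.findall(reg, page)        # page holds the illustration page's HTML
if hit:
    original_url = hit[0]
else:
    print 'no original-image line; probably a manga (multi-page) post'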
There are two ways to save an image.
1. urllib.urlretrieve(url, path, rollfunc)  # url is the image address, path is the local file path, and rollfunc is a progress callback you dream up yourself (a sketch follows below this list)
That's what I used back when crawling Baidu Tieba.
2. with open(path, 'wb') as f:
       f.write(requests.get(url).content)  # url and path as above; 'w' means write, 'b' means binary mode
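For the record, urlretrieve calls that third argument after each block with the number of blocks fetched so far, the block size, and the total size, so a minimal rollfunc (my own sketch, not from the original post) might look like:

import urllib

def rollfunc(blocknum, blocksize, totalsize):
    # Called by urlretrieve after each block is fetched.
    if totalsize > 0:
        percent = min(100.0 * blocknum * blocksize / totalsize, 100.0)
        print 'downloaded %.1f%%' % percent

# placeholder URL and path, just to show the call
urllib.urlretrieve('http://example.com/pic.jpg', 'pic.jpg', rollfunc)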
Only the second way gets used here though, because... here it comes: multithreading.
The collection is big, right? Exciting, right?
At this order of magnitude we can obviously run the downloads in parallel, and the speedup is just as obvious.
To keep things simple I skipped the low-level thread module, because threading is nicer.
Look at the code:
import threading

def cal(a, b):
    print a + b

jobs = []
for p in xrange(m):  # m is however many threads you want to spawn
    jobs.append(threading.Thread(target=cal, args=(1, 2)))
for job in jobs:
    job.start()
for job in jobs:
    job.join()
Here we first define a list to keep track of all the threads, and then wrap each task up as a Thread.
target is the function to be called (obviously), args is its arguments (equally obviously), and args must be a tuple.
start() and join() are launch and wait respectively, which makes sure every thread finishes.
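One pitfall with args needing to be a tuple: a single argument still needs the trailing comma, otherwise the parentheses are read as plain grouping (work here is just a stand-in function):

import threading

def work(n):
    print n

threading.Thread(target=work, args=(42))   # wrong: (42) is just the int 42
threading.Thread(target=work, args=(42,))  # right: a one-element tuple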
Then there's the overall flow:
Log in -> (1) fetch a page of the bookmarks list and put each image's download task into a thread -> run them -> find the next-page hyperlink -> back to (1)

Strings
In the old days with Pascal and C++ I'd just write stra = strb + strc directly; doing the same in Python turns out to be inefficient.
The reason is that strings are immutable in Python, so every + operation has to allocate and build a brand-new string.
Another round of searching the web turned up the fix:
stra = ''.join([strb, strc])
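To see the difference yourself, a throwaway micro-benchmark (mine, not from the original post) that builds one string out of many pieces both ways:

import timeit

setup = "parts = ['x'] * 10000"
plus = timeit.timeit("s = ''\nfor p in parts: s += p", setup=setup, number=100)
join = timeit.timeit("s = ''.join(parts)", setup=setup, number=100)
print 'plus: %.3fs  join: %.3fs' % (plus, join)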
Code
This is the first time I've written anything this long; it took a few evenings of writing on and off. Still, there's a real sense of accomplishment (even if the code is ugly).
# -*- coding: utf-8 -*-
import threading
import requests
import os
import re

def getpage(html, url, headers):
    # Fetch a page, retrying forever on network errors.
    while 1:
        try:
            page = html.get(url, headers=headers, timeout=3).content
            break
        except Exception, e:
            print e
    return page

def logIn(html, url, headers, data):
    # Post the login form, retrying on failure.
    while 1:
        try:
            html.post(url, headers=headers, data=data, timeout=2)
            return
        except Exception, e:
            print e

def get(html, url, headers, index, filepath):
    # Download one illustration page and save its original image.
    # (Note: every thread shares this headers dict, so the Referer they set
    # races; copying headers per thread would be safer.)
    headers['Referer'] = url
    page = ''
    while 1:
        try:
            page = html.get(url, headers=headers, timeout=3).content
            break
        except Exception, e:
            print e
    # (The regex that pulled the original-image URL out of page, and the
    # open(..., 'wb')/write that saved it under filepath with index in the
    # file name, were eaten by the formatting of this post; it used saving
    # method 2 from above.)

# --- main flow ---
# (The setup lines defining html as a requests.Session, headers, the login
# form dict data, loginurl, loginpage, preurl, prefavurl and filepath were
# also lost; the login sketch earlier shows the idea.)
reg = re.compile(r'name="post_key" value="(.+?)"')  # reconstructed from the hidden input shown above
postkey = re.findall(reg, loginpage)[0]
data['post_key'] = postkey
logIn(html, loginurl, headers, data)

favurl = prefavurl
index = 0
while 1:
    headers['Referer'] = favurl
    favpage = getpage(html, favurl, headers)
    reg = re.compile(r'data-type="illust" data-id="(\d+)" data-tags="')
    lis = re.findall(reg, favpage)
    jobs = []
    for p in lis:
        url = ''.join([preurl, p])  # illustration page URL built from its ID
        index += 1
        jobs.append(threading.Thread(target=get, args=(html, url, headers, index, filepath)))
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
    # Follow the class="next" link; stop when there is none.
    reg = re.compile(r'class="next"><a href="(.+?)"')
    bacurl = re.findall(reg, favpage)
    if bacurl:
        p = re.sub('&amp;', '&', bacurl[0])
        favurl = prefavurl + p
        print favurl
    else:
        break