Use Python to break 12306's last line of defense

Source: Internet
Author: User
Tags: ticket, ssl, certificate

Hello everyone, I'm Hadron. It has been a long time since I brought you a new technical article. Recently several students asked me whether automatic ticket grabbing on 12306 is possible, so over the past two days I built a 12306 automatic ticket-grabbing project in Python. Here I'll walk you through the steps for tackling 12306. There's a bonus at the end of the article!

12306 does not provide an official interface for ticket grabbing (and never will), so our only option is to capture the packets 12306 exchanges during the ticket purchase process and simulate the browser's behavior to automate it. Put plainly, this is a web crawler. Now for the main part. Fasten your seat belts!

Before grabbing a ticket we first need to check availability, so we start with a normal query: open the 12306 ticket query page at https://kyfw.12306.cn/otn/leftTicket/init, enter the origin and destination, and search.

How do we get the trains and related information shown on this page? Beginners will first think of looking in the page source, but the source does not contain it: the data is loaded dynamically by an asynchronous Ajax request made from JavaScript. So we have to capture packets and watch the data exchanged between the browser and the server. I use Google Chrome, where the shortcut for opening the developer tools is F12.

Make sure the option marked with the red box is selected. Every exchange between the browser and the server will then appear in the list below it. Now click the Query button again.

The list now shows two requests, which means the browser sends two requests to the server after we click Query. Next we examine the return values to work out which request actually fetches the train data, so that we can simulate it in Python.

First request:

It is obvious that the value returned for the first request does not have the train information we need.

Second request:

The second request returns a lot of data. We can't see the train information directly yet, but one thing stands out: it contains a list with 6 elements, and our search from Changsha to Chengdu also returned 6 trains, so the two must be related. Let's fetch this data with Python before analyzing it further:

    # -*- coding: utf-8 -*-
    import urllib2
    import ssl

    # Turn off HTTPS certificate verification for this process
    ssl._create_default_https_context = ssl._create_unverified_context

    def getList():
        req = urllib2.Request('https://kyfw.12306.cn/otn/leftTicket/query?leftTicketDTO.train_date=2017-07-10&leftTicketDTO.from_station=CDW&leftTicketDTO.to_station=CSQ&purpose_codes=ADULT')
        req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36')
        html = urllib2.urlopen(req).read()
        return html

    print getList()

First define a function to get the train list information:

The request URL is taken from the captured packet data: https://kyfw.12306.cn/otn/leftTicket/query?leftTicketDTO.train_date=2017-07-10&leftTicketDTO.from_station=CDW&leftTicketDTO.to_station=CSQ&purpose_codes=ADULT

To keep 12306 from detecting our requests, we can simply add a header so that they look like browser requests:

    req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36')
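For readers on Python 3, where urllib2 has become urllib.request, the same header can be attached when the request object is built. This is a minimal sketch that only constructs the request and does not contact 12306; the helper name build_request is an assumption made for illustration.

```python
# Python 3 sketch: build (but do not send) a request with a browser-like User-Agent.
import urllib.request

USER_AGENT = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36')

def build_request(url):
    # build_request is a hypothetical helper name, not part of the article's code.
    return urllib.request.Request(url, headers={'User-Agent': USER_AGENT})

req = build_request('https://kyfw.12306.cn/otn/leftTicket/init')
# urllib stores header names capitalized, so the key becomes 'User-agent'.
print(req.get_header('User-agent'))
```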

Also note this line:

    ssl._create_default_https_context = ssl._create_unverified_context

12306 uses HTTPS, but its SSL certificate is not trusted by browsers, and by default Python refuses to request a site whose certificate is untrusted. This line of code turns off certificate verification.
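The line above changes the default for every HTTPS request in the process. A narrower option, sketched here for Python 3, is to build one unverified context and pass it only to the calls that need it. Disabling verification removes protection against man-in-the-middle attacks, so treat this as an illustration of the mechanism; the name make_unverified_context is an assumption.

```python
import ssl

def make_unverified_context():
    # make_unverified_context is a hypothetical helper name.
    ctx = ssl.create_default_context()
    # check_hostname must be turned off before verify_mode can be set to CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx

ctx = make_unverified_context()
print(ctx.verify_mode == ssl.CERT_NONE)
# Usage (not executed here): urllib.request.urlopen(req, context=ctx)
```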

So let's see if we can get the information we want:

It turns out that the request works. Next, we extract the list containing the 6 entries.

The returned data is in JSON format. Python has no native JSON type, so to Python the response is just a string. To work with it conveniently, we can use Python's json package to convert the JSON string into a dict, and then pull the list out by its dict keys and return it.

    # -*- coding: utf-8 -*-
    import urllib2
    import ssl
    import json

    # Turn off HTTPS certificate verification for this process
    ssl._create_default_https_context = ssl._create_unverified_context

    def getList():
        req = urllib2.Request('https://kyfw.12306.cn/otn/leftTicket/query?leftTicketDTO.train_date=2017-07-10&leftTicketDTO.from_station=CDW&leftTicketDTO.to_station=CSQ&purpose_codes=ADULT')
        req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36')
        html = urllib2.urlopen(req).read()
        parsed = json.loads(html)            # JSON string -> dict
        result = parsed['data']['result']    # the list of train entries
        return result
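To see the parsing step in isolation, here is a small offline sketch (Python 3) that runs json.loads on a hand-written string shaped like the response described above. The sample payload is invented for illustration; only the data and result keys come from the article's code, and extract_result is a hypothetical helper name.

```python
import json

# Invented sample with the same nesting as the response: data -> result -> list of strings.
sample = '{"data": {"result": ["a|b|c", "d|e|f"]}}'

def extract_result(text):
    parsed = json.loads(text)          # JSON string -> dict
    return parsed['data']['result']    # list of '|'-separated rows

rows = extract_result(sample)
print(len(rows), rows[0])
```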

The function now returns a list. Let's loop over it with for and see what each entry contains:

    for i in getList():
        print i

Here is what the first entry looks like when printed:

| Booking | 76000g131805| g1318| icw| izq| icw| cwq| 07:54| 18:54| 11:00| n| uhesfcaidex22z0zwfqttduzxjfuwpdia148i6tnk5spiqfp| 20170710| 3| w2| 21> 16| 0|0| | | | | | | | | | | none | none | No | | o0m090| OM9

If we look a little more closely, we find that the entry contains values such as g1318, 07:54 and 18:54, which is exactly the train information we want, even though it looks messy. Every entry has one feature in common: its fields are separated by the | symbol, so we can split on it and see what turns up.

    for i in getList():
        for n in i.split('|'):
            print n
        break

Now all the values are printed. To make it clear what each position holds, we can add an index to each value. For example, if a train has 3 hard-seat tickets left and 8 soft-sleeper tickets left, we can look for the index whose value is 3 and the index whose value is 8, and in this way work out which index corresponds to which seat type or other parameter.

    c = 0
    for i in getList():
        for n in i.split('|'):
            print '[%s] %s' % (c, n)
            c += 1
        c = 0
        break

    # index 3 = train number
    # index 8 = departure time
    # index 9 = arrival time
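The index-counting loop can also be written with enumerate, and once the positions are known the fields can be pulled into a dict. The sketch below (Python 3) uses a shortened version of the sample entry, limited to its first eleven fields with the codes upper-cased; only the positions 3, 8 and 9 come from the article, and parse_row is a hypothetical helper.

```python
# Shortened sample entry: only the first eleven '|'-separated fields are kept.
row = '|Booking|76000G131805|G1318|ICW|IZQ|ICW|CWQ|07:54|18:54|11:00'

def parse_row(line):
    # parse_row is a hypothetical helper name.
    fields = line.split('|')
    return {
        'train_no': fields[3],   # index 3 = train number
        'depart': fields[8],     # index 8 = departure time
        'arrive': fields[9],     # index 9 = arrival time
    }

# Print every field with its index, as the article's counter loop does.
for index, value in enumerate(row.split('|')):
    print('[%s] %s' % (index, value))

info = parse_row(row)
print(info)
```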

You may have noticed a problem by now: this function can only fetch data for trains from Changsha to Chengdu, and other people will not necessarily want that route. So we need to find out where the departure-station and arrival-station values in the request URL come from.

https://kyfw.12306.cn/otn/leftTicket/query?leftTicketDTO.train_date=2017-07-10&leftTicketDTO.from_station=CDW&leftTicketDTO.to_station=CSQ&purpose_codes=ADULT

The parameters to find the departure and arrival stations are:

    leftTicketDTO.from_station=CDW
    leftTicketDTO.to_station=CSQ

Searching and analyzing turned up no obvious pattern in these two values, which suggests they were obtained in an earlier request. They are not in the page source either, so once again the only way to find them is packet capture.

During packet capture, one response turned out to contain the city codes. Its URL is:

https://kyfw.12306.cn/otn/resources/js/framework/station_name.js?station_version=1.9018

So we copy the city data out of this response, create a new file named cons.py, and save the data there in a variable named station_names, which the code below relies on.

Then we can look up the station codes in this data from the departure and arrival cities the user enters. The code is as follows:

    import cons  # cons.py holds the station_names string copied from station_name.js

    station = {}
    for i in cons.station_names.split('@'):
        if i:
            tmp = i.split('|')
            station[tmp[1]] = tmp[2]   # city name -> station code
    # print station

    train_date = raw_input('Please enter the departure date: ')
    from_station = station[raw_input('Please enter the departure city: ')]
    to_station = station[raw_input('Please enter the arrival city: ')]
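As an offline illustration of the lookup table, the sketch below (Python 3) applies the same splitting logic to a short hand-written string. The sample entries only imitate the layout the article's loop relies on (name in position 1, code in position 2); they are not copied from station_name.js, and the other fields are placeholders.

```python
# Hand-written sample in the '@'-separated, '|'-delimited layout the article's loop expects.
sample_station_names = '@x1|Chengdu|CDW|placeholder@x2|Changsha|CSQ|placeholder'

def build_station_map(raw):
    # build_station_map is a hypothetical helper name.
    stations = {}
    for entry in raw.split('@'):
        if entry:                          # skip the empty piece before the first '@'
            parts = entry.split('|')
            stations[parts[1]] = parts[2]  # city name -> station code
    return stations

station = build_station_map(sample_station_names)
print(station['Chengdu'], station['Changsha'])
```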

With the date and the two cities entered, we can fetch the corresponding train information.
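Putting the pieces together, the query URL can be assembled from the three inputs. A minimal Python 3 sketch using urllib.parse.urlencode follows; the parameter names follow the captured request, upper-case station codes and the value ADULT are assumed, and build_query_url is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = 'https://kyfw.12306.cn/otn/leftTicket/query'

def build_query_url(train_date, from_station, to_station):
    # A list of pairs keeps the parameters in the same order as the captured request.
    params = [
        ('leftTicketDTO.train_date', train_date),
        ('leftTicketDTO.from_station', from_station),
        ('leftTicketDTO.to_station', to_station),
        ('purpose_codes', 'ADULT'),   # assumed casing
    ]
    return BASE_URL + '?' + urlencode(params)

print(build_query_url('2017-07-10', 'CDW', 'CSQ'))
```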

Adding a few simple checks then lets us tell whether a given train still has tickets for the chosen date and route.
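One way to express such a check is a small predicate over a seat field. The sketch below (Python 3) assumes a field is either empty, a count such as 3, or a marker word; the marker set and the helper name has_ticket are assumptions for illustration, since the article does not list the possible values or say which index holds which seat type.

```python
# Assumed markers meaning "tickets available"; adjust to whatever the real response uses.
AVAILABLE_MARKERS = {'有', 'yes'}

def has_ticket(field):
    # has_ticket is a hypothetical helper name.
    value = field.strip()
    if value.isdigit():
        return int(value) > 0      # a numeric count of remaining tickets
    return value in AVAILABLE_MARKERS

for sample in ['3', '0', '', 'none', '有']:
    print(repr(sample), has_ticket(sample))
```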

Combined with the login and purchase steps, the program can check for tickets automatically: if there are none, it keeps refreshing; once tickets appear, it logs in and places the order automatically, then notifies the buyer's mobile phone by SMS or phone call.
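The refresh-until-available loop could be sketched as below (Python 3). The check and the sleep function are passed in so the loop can be tried without touching the network; poll_until, the interval, and the attempt limit are all illustrative assumptions, and login, ordering, and notification are left out.

```python
import time

def poll_until(check, interval=5.0, max_attempts=10, sleep=time.sleep):
    """Call check() until it returns True or max_attempts is reached.

    Returns the 1-based attempt number that succeeded, or None if none did.
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt
        if attempt < max_attempts:
            sleep(interval)  # wait before refreshing again
    return None

# Demo with a fake check that succeeds on the third call and a no-op sleep.
answers = iter([False, False, True])
print(poll_until(lambda: next(answers), sleep=lambda seconds: None))
```

Keeping a pause between attempts and a cap on the number of attempts avoids hammering the server in a tight loop.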
