2017 has flown by. In the blink of an eye the 2018 Spring Festival is less than two months away, and people living away from home have put buying tickets home on the agenda. Recently, China Traffic News released the "2018 Spring Festival Big Data" report, which shows that passenger trips during the 2018 Spring Festival travel season are expected to exceed 3 billion, an increase over the previous year.
...
These staggering numbers suggest that buying outbound and return tickets for this year's Spring Festival will be harder than ever ...
Buying a ticket is even harder than Li Bai's journey along the "Shu Road"!
Today is January 3, 2018, the first day on which train tickets for the first day of the Spring Festival travel season (February 1) can be bought. When you want to check ticket information, do you still have to open the unresponsive 12306 official website or the app with its N-second ads, and does that annoy you?
Why not write a command-line train ticket viewer in Python instead? Then you can get the ticket information you want by typing a single command!
Effect
Interface design
Let's first give this small application a name. Since it queries ticket information, let's call it tickets. We want users to get the information they need just by entering the departure station, arrival station and date, so tickets should be used like this:
$ tickets from to date
In addition, there are several types of trains: high-speed rail, bullet train, express, fast and direct. We would like options for querying only one or several specific types, so we should have these options:
-g high-speed rail
-d bullet train
-t express
-k fast
-z direct
These options can be combined, so ultimately our interface should look like this:
$ tickets [-gdtkz] from to date
With the interface settled, what remains is to implement it.
Code implementation
A good practice when writing Python programs is to use the virtualenv tool to build a virtual environment. Our program is developed in Python 3. Create a folder named tickets in your working directory, enter it, then create a virtual environment and activate it:
$ virtualenv -p /usr/bin/python3 venv
$ . venv/bin/activate
Install the libraries used in this experiment (inside the activated virtual environment, so sudo is not needed):
$ pip install requests prettytable docopt
requests needs little introduction: it is the go-to library for accessing HTTP resources from Python.
docopt is a command-line argument parsing tool for Python 3.
prettytable is a formatting tool that prints data in tables, the way the MySQL client does.
1 Parsing parameters
Python has many tools for parsing command-line arguments, such as argparse, docopt and click. Here we use docopt, which is easy to use.
docopt parses arguments according to the format we define in the docstring, for example in tickets.py:
# coding: utf-8
"""Train tickets query via command-line.

Usage:
    tickets [-gdtkz] <from> <to> <date>

Options:
    -h,--help   Show the help menu
    -g          High-speed rail
    -d          Bullet train
    -t          Express
    -k          Fast
    -z          Direct

Example:
    tickets shanghai beijing 2017-12-05
"""
from docopt import docopt


def cli():
    """Command-line interface"""
    # docopt builds the parser from the module docstring above
    arguments = docopt(__doc__)
    print(arguments)


if __name__ == '__main__':
    cli()
Let's run the program:
$ python3 tickets.py shanghai beijing 2017-12-05
We get the following result:
{'-d': False, '-g': False, '-k': False, '-t': False, '-z': False, '<date>': '2017-12-05', '<from>': 'shanghai', '<to>': 'beijing'}
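For comparison, here is a rough sketch of the same interface written with the standard library's argparse, which is one of the alternatives to docopt mentioned earlier. It is not what this article uses; the names build_parser, from_station and to_station are chosen here purely for illustration.

```python
import argparse


def build_parser():
    """Build an argparse parser that mirrors `tickets [-gdtkz] from to date`."""
    parser = argparse.ArgumentParser(
        prog='tickets',
        description='Train tickets query via command-line.',
    )
    # One boolean flag per train type; short flags can be combined, e.g. -gd
    for flag, label in [('-g', 'High-speed rail'), ('-d', 'Bullet train'),
                        ('-t', 'Express'), ('-k', 'Fast'), ('-z', 'Direct')]:
        parser.add_argument(flag, action='store_true', help=label)
    # 'from' is a Python keyword, so the value is stored under another name
    parser.add_argument('from_station', metavar='from')
    parser.add_argument('to_station', metavar='to')
    parser.add_argument('date')
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is repeatable
args = build_parser().parse_args(['-gd', 'shanghai', 'beijing', '2017-12-05'])
print(vars(args))
```

The trade-off is mostly one of style: docopt derives the parser from the help text, while argparse derives the help text from the parser.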
2 Getting Data
The arguments are parsed; next comes fetching the data, which is the most important part. First open 12306 and go to the remaining-ticket query page. If you use Chrome, press F12 to open the developer tools and select the Network tab. In the query form, enter Shanghai to Beijing with the date 2017-12-05 and click the query button. In the developer tools we can see that the query system actually requested this URL:
https://kyfw.12306.cn/otn/lcxxcx/query?purpose_codes=adult&querydate=2017-12-05&from_station=SHH&to_station=BJP
And the data is returned in JSON format!
The rest sounds simple: build the request URL and parse the returned JSON. But from_station and to_station in the URL are neither Chinese characters nor pinyin. They are station codes, while we want to type Chinese characters or pinyin. How do we get the codes? Let's look at the page source for clues.
Sure enough, the page links to https://kyfw.12306.cn/otn/resources/js/framework/station_name.js?station_version=1.8955, which appears to contain the Chinese name, pinyin, abbreviation and code of every station. But the information is all packed together, and we only want each station's pinyin and its uppercase letter code. What to do?
Regular expressions are the answer. We write a small script that extracts the information we want, in parse_station.py:
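# coding: utf-8
import re
import requests
from pprint import pprint

url = 'https://kyfw.12306.cn/otn/resources/js/framework/station_name.js?station_version=1.8955'
# verify=False skips certificate validation
response = requests.get(url, verify=False)
# Each match is a (CODE, pinyin) pair; re.findall needs the body text, not the response object
stations = re.findall(r'([A-Z]+)\|([a-z]+)', response.text)
stations = dict(stations)
# Swap keys and values so the dictionary maps pinyin -> code
stations = dict(zip(stations.values(), stations.keys()))
pprint(stations, indent=4)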
Note that converting the regular-expression matches straight into a dictionary makes the uppercase codes the keys, which is obviously not what we want, so we swap the keys and values.
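To see what the regular expression captures without hitting the network, here is a small sketch that runs it on a hand-written sample. The sample string is an assumption made for illustration; it only imitates the "abbreviation|Chinese name|CODE|pinyin|..." layout that station_name.js appears to use.

```python
import re

# Illustrative sample only (assumed layout, not fetched from 12306)
sample = ("var station_names ='@bjb|北京北|VAP|beijingbei|bjb|0"
          "@bjd|北京东|BOP|beijingdong|bjd|1"
          "@sha|上海|SHH|shanghai|sh|2';")

# Each match is a (CODE, pinyin) tuple: uppercase letters, a pipe, lowercase letters
pairs = re.findall(r'([A-Z]+)\|([a-z]+)', sample)

# Build the pinyin -> code mapping directly, instead of building code -> pinyin and swapping
stations = {pinyin: code for code, pinyin in pairs}
print(stations)
```

One caveat of keying by pinyin: if two stations share the same pinyin, the later entry silently overwrites the earlier one.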
Running this script prints all stations and their uppercase codes as a dictionary, and we redirect the output to stations.py:
$ python3 parse_station.py > stations.py
We give this dictionary a name, stations (by prefixing the contents of stations.py with "stations = "). Then, given a station's pinyin name, we can look up its letter code directly in the dictionary:
...
from stations import stations


def cli():
    arguments = docopt(__doc__)
    # Translate station names into 12306 station codes
    from_station = stations.get(arguments['<from>'])
    to_station = stations.get(arguments['<to>'])
    date = arguments['<date>']
    # Build the URL
    url = 'https://kyfw.12306.cn/otn/lcxxcx/query?purpose_codes=adult&querydate={}&from_station={}&to_station={}'.format(
        date, from_station, to_station
    )
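Since stations.get returns None for an unknown name, a mistyped station would silently end up in the URL as the text "None". The sketch below is one possible way to guard against that and against malformed dates. The helper build_query_url and the tiny demo_stations dictionary are hypothetical; the parameter names are copied from the URL shown above.

```python
from datetime import datetime
from urllib.parse import urlencode

BASE_URL = 'https://kyfw.12306.cn/otn/lcxxcx/query'


def build_query_url(stations, from_name, to_name, date):
    """Return the query URL, raising ValueError on unknown stations or bad dates."""
    codes = []
    for name in (from_name, to_name):
        code = stations.get(name)
        if code is None:
            raise ValueError('unknown station: {}'.format(name))
        codes.append(code)
    # strptime raises ValueError if the date is not in YYYY-MM-DD form
    datetime.strptime(date, '%Y-%m-%d')
    params = {
        'purpose_codes': 'adult',
        'querydate': date,
        'from_station': codes[0],
        'to_station': codes[1],
    }
    return BASE_URL + '?' + urlencode(params)


# Hypothetical two-entry mapping standing in for the generated stations.py
demo_stations = {'shanghai': 'SHH', 'beijing': 'BJP'}
url = build_query_url(demo_stations, 'shanghai', 'beijing', '2017-12-05')
print(url)
```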
Everything is ready, so let's request this URL to get the data! Here we use the requests library, which provides a very easy-to-use interface:
...
import requests


def cli():
    ...
    # verify=False skips certificate validation
    r = requests.get(url, verify=False)
    print(r.json())
From the result we can see that the ticket information needs to be extracted further:
def cli():
    ...
    r = requests.get(url, verify=False)
    # The list of trains sits under data -> datas
    rows = r.json()['data']['datas']
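Indexing with ['data']['datas'] raises a KeyError or TypeError if the server answers with an error payload. A more forgiving extraction might look like the sketch below; extract_rows is a hypothetical helper, and the sample payload is invented here only to mirror the two keys used above.

```python
import json


def extract_rows(payload):
    """Return the list under data -> datas, or an empty list if it is missing."""
    data = payload.get('data') or {}
    return data.get('datas') or []


# Invented sample payload, shaped after the keys the article reads
sample = json.loads('{"data": {"datas": [{"station_train_code": "G2"}]}}')
print(extract_rows(sample))
print(extract_rows({'data': None}))
```

With this, an empty result simply prints an empty table instead of ending in a traceback.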
3 Parsing data
We encapsulate a simple class to parse the data:
from prettytable import PrettyTable


class TrainCollection(object):

    # Columns: train, departure/arrival station, departure/arrival time,
    # duration, first-class seat, second-class seat, soft sleeper,
    # hard sleeper, hard seat
    header = 'train station time duration first second softsleep hardsleep hardsit'.split()

    def __init__(self, rows):
        self.rows = rows

    def _get_duration(self, row):
        """Get the train's running time."""
        duration = row.get('lishi').replace(':', 'h') + 'm'
        if duration.startswith('00'):
            # Under an hour: drop the leading '00h'
            return duration[3:]
        if duration.startswith('0'):
            # Drop the leading zero of the hour
            return duration[1:]
        return duration

    @property
    def trains(self):
        for row in self.rows:
            train = [
                # Train number
                row['station_train_code'],
                # Departure and arrival stations
                '\n'.join([row['from_station_name'], row['to_station_name']]),
                # Departure and arrival times
                '\n'.join([row['start_time'], row['arrive_time']]),
                # Duration
                self._get_duration(row),
                # First-class seat
                row['zy_num'],
                # Second-class seat
                row['ze_num'],
                # Soft sleeper
                row['rw_num'],
                # Hard sleeper
                row['yw_num'],
                # Hard seat
                row['yz_num']
            ]
            yield train

    def pretty_print(self):
        """
        The data has been fetched; what remains is to extract the information
        we want and display it. The `prettytable` library lets us format and
        display data the way a MySQL client does.
        """
        pt = PrettyTable()
        # Set the title of each column
        pt.field_names = self.header
        for train in self.trains:
            pt.add_row(train)
        print(pt)
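The duration formatting is easy to get subtly wrong, so it can help to check it in isolation. The sketch below assumes the lishi field always has the form 'HH:MM'; format_duration is a hypothetical standalone version of the method. It drops exactly the three characters of a leading '00h', so that the minutes stay intact.

```python
def format_duration(lishi):
    """Turn 'HH:MM' into a compact form such as '2h35m' or '45m'."""
    duration = lishi.replace(':', 'h') + 'm'
    if duration.startswith('00'):
        # '00h45m' -> '45m'
        return duration[3:]
    if duration.startswith('0'):
        # '02h35m' -> '2h35m'
        return duration[1:]
    return duration


for value in ('00:45', '02:35', '12:05'):
    print(value, '->', format_duration(value))
```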
4 Display Results
Finally, we summarize the above process and output the results to the screen:
...

class TrainCollection:
    ...


def cli():
    arguments = docopt(__doc__)
    from_station = stations.get(arguments['<from>'])
    to_station = stations.get(arguments['<to>'])
    date = arguments['<date>']
    # Build the URL
    url = 'https://kyfw.12306.cn/otn/lcxxcx/query?purpose_codes=adult&querydate={}&from_station={}&to_station={}'.format(
        date, from_station, to_station
    )
    r = requests.get(url, verify=False)
    rows = r.json()['data']['datas']
    trains = TrainCollection(rows)
    trains.pretty_print()


if __name__ == '__main__':
    cli()
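The program above parses the -g, -d, -t, -k and -z options but never uses them. One possible way to apply them is sketched below. It assumes that a train's type is the first letter of its station_train_code (G, D, T, K or Z); filter_rows is a hypothetical helper, and the sample rows are made up.

```python
TRAIN_FLAGS = ('-g', '-d', '-t', '-k', '-z')


def filter_rows(rows, arguments):
    """Keep only rows whose train code starts with a requested type letter."""
    wanted = {flag[1].upper() for flag in TRAIN_FLAGS if arguments.get(flag)}
    if not wanted:
        # No type option given: show every train
        return list(rows)
    return [row for row in rows if row['station_train_code'][:1] in wanted]


# Made-up rows and a docopt-style arguments dictionary
rows = [{'station_train_code': 'G2'},
        {'station_train_code': 'D312'},
        {'station_train_code': '1462'}]
arguments = {'-g': True, '-d': False, '-t': False, '-k': False, '-z': False}
print(filter_rows(rows, arguments))
```

In cli(), the filtered list would then be passed to TrainCollection in place of the raw rows.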
5 Finishing touches
At this point the main body of the program is complete, but the output is plain black and white, which is rather dull. Let's add some color:
def colored(color, text):
    # ANSI escape codes for terminal colors
    table = {
        'red': '\033[91m',
        'green': '\033[92m',
        # No color (reset)
        'nc': '\033[0m'
    }
    cv = table.get(color)
    nc = table.get('nc')
    return ''.join([cv, text, nc])
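ANSI escape codes show up as stray characters when the output is piped into a file, and an unknown color name would make the join fail on None. The variant below is only a sketch of how both cases could be handled; the enabled parameter is an addition made here and is not part of the article's function.

```python
import sys

COLORS = {'red': '\033[91m', 'green': '\033[92m'}
RESET = '\033[0m'


def colored(color, text, enabled=True):
    """Wrap text in an ANSI color code; return it unchanged if coloring is off or the color is unknown."""
    code = COLORS.get(color)
    if not enabled or code is None:
        return text
    return ''.join([code, text, RESET])


# repr() makes the escape sequences visible
print(repr(colored('green', 'shanghai')))
# Only colorize when writing to an interactive terminal
print(colored('red', 'beijing', enabled=sys.stdout.isatty()))
```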
Modify the program so that the departure station and departure time are shown in green, and the arrival station and arrival time in red:
...
'\n'.join([colored('green', row['from_station_name']),
           colored('red', row['to_station_name'])]),
'\n'.join([colored('green', row['start_time']),
           colored('red', row['arrive_time'])]),
...
That's it. Give it a try yourself!
The Spring Festival is approaching, and Xiao An hopes you manage to grab your tickets and get home happily for the New Year.
Spring Festival train tickets go on sale today: Python helps you grab tickets one step faster